References |
Type |
Details |
[7] Luong, H. T., & Vu, H. Q. (2016, December). A non-expert Kaldi recipe for Vietnamese speech recognition system. In Proceedings of the Third International Workshop on Worldwide Language Service Infrastructure and Second Workshop on Open Infrastructures and Analysis Frameworks for Human Language Technologies (WLSI/OIAF4HLT2016) (pp. 51-55) |
Book, journal |
Title: |
Proceedings of the Third International Workshop on Worldwide Language Service Infrastructure and Second Workshop on Open Infrastructures and Analysis Frameworks for Human Language Technologies (WLSI/OIAF4HLT2016) |
Author: |
Luong, H. T., & Vu, H. Q. |
Year: |
2016 |
|
[10] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770-778) |
Book, journal |
Title: |
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition |
Author: |
He, K., Zhang, X., Ren, S., & Sun, J. |
Year: |
2016 |
|
[11] Smith, L. N., & Topin, N. (2019, May). Super-convergence: Very fast training of neural networks using large learning rates. In Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications (Vol. 11006, p. 1100612). International Society for Optics and Photonics |
Book, journal |
Title: |
Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications |
Author: |
Smith, L. N., & Topin, N. |
Year: |
2019 |
|
[14] Fayek, H. (2016). Speech processing for machine learning: Filter banks, mel-frequency cepstral coefficients (MFCCs) and what's in-between. URL: https://haythamfayek.com/2016/04/21/speech-processing-for-machine-learning.html |
Book, journal |
Title: |
Speech processing for machine learning: Filter banks, mel-frequency cepstral coefficients (MFCCs) and what's in-between |
Author: |
Fayek, H. |
Year: |
2016 |
|
[15] Choné, A. (2018). Computing MFCCs voice recognition features on ARM systems. [Online]. Available: https://medium.com/linagoralabs/computing-mfccs-voice-recognition-features-on-arm-systems-dae45f016eb6 |
Book, journal |
Title: |
Computing MFCCs voice recognition features on ARM systems |
Author: |
Choné, A. |
Year: |
2018 |
|
[16] Wikipedia. (2016). Probabilistic parameters of a hidden Markov model. [Online]. Available: https://en.wikipedia.org/wiki/Hidden_Markov_model |
Book, journal |
Title: |
Probabilistic parameters of a hidden Markov model |
Author: |
Wikipedia |
Year: |
2016 |
|
[8] VinBigData. (2020). The speech corpus for the automatic speech recognition task in VLSP-2020. [Online]. Available: https://slp.vinbigdata.org |
Link |
|
[12] Vietnamese NLP Research Group (UnderTheSea). (2021). Word Tokenize. [Online]. Available: http://undertheseanlp.com |
Link |
|
[17] Kiyoshi Kawaguchi. (2000). Artificial Neural Networks. [Online]. Available: http://osp.mans.edu.eg/rehan/ann4.htm |
Link |
|
[18] Colah. (2015). Understanding LSTM Networks. [Online]. Available: https://colah.github.io/posts/2015-08-Understanding-LSTMs |
Link |
|
[19] Facebook Open Source. (2020). Transfer Function Layers. [Online]. Available: https://nn.readthedocs.io/en/rtd/transfer |
Link |
|
[20] Nvidia Inc. (2018). DeepSpeech2. OpenSeq2Seq. [Online]. Available: https://nvidia.github.io/OpenSeq2Seq/html/speech-recognition/deepspeech2.html |
Link |
|
[1] Hannun, A., Case, C., Casper, J., Catanzaro, B., Diamos, G., Elsen, E., ... & Ng, A |
Other |
|
[3] Hannun, A. Y., Maas, A. L., Jurafsky, D., & Ng, A. Y. (2014). First-pass large vocabulary continuous speech recognition using bi-directional recurrent DNNs. arXiv preprint arXiv:1408.2873 |
Other |
|
[4] Pratap, V., Hannun, A., et al. (2018). wav2letter++: The fastest open-source speech recognition system. arXiv preprint arXiv:1812.07625 |
Other |
|
[5] Park, D. S., Chan, W., Zhang, Y., Chiu, C. C., Zoph, B., Cubuk, E. D., & Le, Q. V |
Other |
|
[6] Schneider, S., Baevski, A., Collobert, R., & Auli, M. (2019). wav2vec: Unsupervised pre-training for speech recognition. arXiv preprint arXiv:1904.05862 |
Other |
|
[9] Tran, Duc Chung. (2020). FPT Open Speech Dataset (FOSD) – Vietnamese. Mendeley Data, V4, doi: 10.17632/k9sxg2twv4.4 |
Other |
|
[13] Acree, B., Hansen, E., Jansa, J., & Shoub, K. (2016). Comparing and evaluating cosine similarity scores, weighted cosine similarity scores and substring matching. Working Paper |
Other |
|