References
[1] K. Audhkhasi, B. Ramabhadran, G. Saon, M. Picheny, and D. Nahamoo, “Direct Acoustics-to-Word Models for English Conversational Speech Recognition,” arXiv:1703.07754 [cs, stat], Mar. 2017, Accessed: Oct. 14, 2020. [Online]. Available: http://arxiv.org/abs/1703.07754
|
[2] J. Wang et al., “Deep Extractor Network for Target Speaker Recovery From Single Channel Speech Mixtures,” arXiv:1807.08974 [cs, eess], Jul. 2018, Accessed: Oct. 14, 2020. [Online]. Available: http://arxiv.org/abs/1807.08974
|
[3] S. Araki et al., “Online meeting recognition in noisy environments with time-frequency mask based MVDR beamforming,” in 2017 Hands-free Speech Communications and Microphone Arrays (HSCMA), Mar. 2017, pp. 16–20, doi: 10.1109/HSCMA.2017.7895553
|
[4] N. Tomashenko, “Speaker adaptation of deep neural network acoustic models using Gaussian mixture model framework in automatic speech recognition systems,” 2017
|
[7] P. Ghahremani, B. BabaAli, D. Povey, K. Riedhammer, J. Trmal, and S. Khudanpur, “A pitch extraction algorithm tuned for automatic speech recognition,” in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2014, pp. 2494–2498, doi: 10.1109/ICASSP.2014.6854049
|
[8] B. S. Lee and D. P. W. Ellis, “Noise Robust Pitch Tracking by Subband Autocorrelation Classification,” p. 4
|
[10] M. Wu, D. Wang, and G. J. Brown, “A multipitch tracking algorithm for noisy speech,” IEEE Trans. Speech Audio Process., vol. 11, no. 3, pp. 229–241, May 2003, doi: 10.1109/TSA.2003.811539
|
[12] H. Bourlard and C. J. Wellekens, “Links Between Markov Models and Multilayer Perceptrons,” in Advances in Neural Information Processing Systems 1, D. S. Touretzky, Ed. Morgan-Kaufmann, 1989, pp. 502–510
|
[13] N. Morgan and H. Bourlard, “Continuous speech recognition using multilayer perceptrons with hidden Markov models,” in International Conference
|
[14] H. Bourlard, N. Morgan, C. Wooters, and S. Renals, “CDNN: a context dependent neural network for continuous speech recognition,” in [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing, Mar. 1992, vol. 2, pp. 349–352, doi: 10.1109/ICASSP.1992.226048
|
[15] G. E. Dahl, D. Yu, L. Deng, and A. Acero, “Large vocabulary continuous speech recognition with context-dependent DBN-HMMs,” in 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2011, pp. 4688–4691, doi: 10.1109/ICASSP.2011.5947401
|
[16] G. Hinton et al., “Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups,” IEEE Signal Process. Mag., vol. 29, no. 6, pp. 82–97, Nov. 2012, doi: 10.1109/MSP.2012.2205597
|
[17] D. Yu, L. Deng, and G. E. Dahl, “Roles of Pre-Training and Fine-Tuning in Context-Dependent DBN-HMMs for Real-World Speech Recognition,” Dec
|
[18] A. Waibel, T. Hanazawa, G. Hinton, K. Shikano, and K. J. Lang, “Phoneme recognition using time-delay neural networks,” IEEE Trans. Acoust. Speech Signal Process., vol. 37, no. 3, pp. 328–339, Mar. 1989, doi: 10.1109/29.21701
|
[19] L. Bottou, F. Fogelman Soulié, P. Blanchet, and J. S. Liénard, “Speaker-independent isolated digit recognition: Multilayer perceptrons vs. Dynamic time warping,” Neural Netw., vol. 3, no. 4, pp. 453–465, Jan. 1990, doi: 10.1016/0893-6080(90)90028-J
|
[20] I. Guyon, P. Albrecht, Y. Le Cun, J. Denker, and W. Hubbard, “Design of a neural network character recognizer for a touch terminal,” Pattern Recognit., vol. 24, no. 2, pp. 105–119, Jan. 1991, doi: 10.1016/0031-3203(91)90081-F
|
[21] D. Povey et al., “Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI,” 2016, doi: 10.21437/Interspeech.2016-595
|
[23] K. Li, H. Xu, Y. Wang, D. Povey, and S. Khudanpur, “Recurrent Neural Network Language Model Adaptation for Conversational Speech Recognition,”
|
[24] T. Hori, J. Cho, and S. Watanabe, “End-to-end Speech Recognition With Word-Based RNN Language Models,” in 2018 IEEE Spoken Language Technology Workshop (SLT), Dec. 2018, pp. 389–396, doi
|
[25] S. Katz, “Estimation of probabilities from sparse data for the language model component of a speech recognizer,” IEEE Trans. Acoust. Speech Signal Process., vol. 35, no. 3, pp. 400–401, Mar. 1987, doi: 10.1109/TASSP.1987.1165125
|