University of Wollongong
Research Online
University of Wollongong Thesis Collection 2017+
University of Wollongong Thesis Collections
2021

Domestic Multi-channel Sound Detection and Classification for the Monitoring of Dementia Residents’ Safety and Well-being using Neural Networks
Abigail Copiaco

University of Wollongong Copyright Warning
You may print or download ONE copy of this document for the purpose of your own research or study. The University does not authorise you to copy, communicate or otherwise make available electronically to any other person any copyright material contained on this site. You are reminded of the following: This work is copyright. Apart from any use permitted under the Copyright Act 1968, no part of this work may be reproduced by any process, nor may any other exclusive right be exercised, without the permission of the author. Copyright owners are entitled to take legal action against persons who infringe their copyright. A reproduction of material that is protected by copyright may be a copyright infringement. A court may impose penalties and award damages in relation to offences and infringements relating to copyright material. Higher penalties may apply, and higher damages may be awarded, for offences and infringements involving the conversion of material into digital or electronic form.

Unless otherwise indicated, the views expressed in this thesis are those of the author and do not necessarily represent the views of the University of Wollongong.

Research Online is the open access institutional repository for the University of Wollongong. For further information contact the UOW Library: research-pubs@uow.edu.au

Supervisor: Prof Christian Ritz
Co-supervisors: Dr Nidhal Abdulaziz and Dr Stefano Fasciani

This thesis is
presented as part of the requirement for the conferral of the degree:
Doctor of Philosophy (PhD)

University of Wollongong
School of Electrical, Computer, and Telecommunications Engineering
Faculty of Informatics

August 2021

Abstract

Recent studies conducted by the World Health Organization reveal that approximately 50 million people are affected by dementia. Such individuals require special care that translates to high social costs. The last decade has seen the introduction of dementia assistive technologies that aim to improve the quality of life of residents and to facilitate the work of caregivers. Given both the importance of easing the burden of coping with dementia and the evident popularity of assistive technology and smart home devices, the main focus of this work is to further improve home organization and management for individuals living with dementia and their caregivers through the use of technology and artificial intelligence. In particular, we aim to develop an effective but non-invasive environment monitoring solution. This thesis proposes a novel strategy to detect, classify, and estimate the location of household-related acoustic scenes and events, enabling a less intrusive monitoring system for the assistance and supervision of dementia residents. The proposed approach is based on the classification of multi-channel acoustical data acquired from omnidirectional microphone arrays (nodes), each of which consists of four linearly arranged microphones, placed at four corner locations across each room. The development of a customized synthetic database that reflects real-life recordings relevant to dementia healthcare is also explored, in order to improve and assess the overall robustness of the system. A combination of spectro-temporal acoustic features extracted from the raw digitized acoustic data will be used for detection and classification purposes. Alongside this, spectral-based phase information is utilized in order to estimate the sound
node location. In particular, this work will explore and conduct a detailed study of the performance of different types and topologies of Convolutional Neural Networks, developing an accurate and compact neural network with a series architecture that is suitable for devices with limited computational resources. Considering that other state-of-the-art compact networks present complex directed acyclic graphs, a series architecture offers an advantage in customizability. The effectiveness of the neural network classification techniques is measured through a set of quantitative performance parameters that will also account for dementia-specific issues. Top performing classifiers and data from multiple microphone arrays will then be subjected to fine-tuning methods in order to maximize the recognition accuracy and overall efficiency of the designed system. The optimum methodology developed has improved the performance of the AlexNet network while decreasing its network size by over 95%. Finally, the implementation of the detection and classification algorithm includes an easy-to-use interface enabling caregivers to customize the system for individual resident needs, which is developed based on a design thinking research approach.

Acknowledgments

I would like to thank my supervisors: Prof Christian Ritz, Dr Nidhal Abdulaziz, and Dr Stefano Fasciani, for their continuous support and guidance throughout the course of my PhD studies. It was truly an honor to work with you. Thank you for always believing in me.

My sincerest gratitude is also given to my family. Thank you for being the source of my strength, motivation, and encouragement. Yours were the silent applause every time I was able to accomplish something. Further, your valuable gentle tap on my shoulders every time I felt discouraged did wonders for me.

To my best friend, thank you for always being there for me, for cheering me up during difficult times, and for lending me your research server every time I needed to run tedious
training tasks for my PhD. None of these would have been possible without your valuable support, and I will forever be grateful.

Similarly, my appreciation goes to the University of Wollongong in Australia for the International Postgraduate Tuition Awards (IPTA) granted to me. I am truly honored, and at the same time humbled, by the opportunity to be associated with a novel project under a prestigious university. I am also thankful to the University of Wollongong in Dubai for the support.

Above all, I would like to bring back all the glory and thanks to God. For I know that I was not able to complete this thesis through my own strength and wisdom, but by His grace and blessings.

© Copyright by Abigail Copiaco, 2021. All Rights Reserved.

Certification

I, Abigail Copiaco, declare that this thesis, submitted in fulfilment of the requirements for the conferral of the degree PhD by Research from the University of Wollongong, is wholly my own work unless otherwise referenced or acknowledged. This document has not been submitted for qualifications at any other academic institution.

Abigail Copiaco
20th August 2021

Thesis Publications

Journal Articles:
[1] A. Copiaco, C. Ritz, N. Abdulaziz, and S. Fasciani, “A Study of Features and Deep Neural Network Architectures and Hyper-parameters for Domestic Audio Classification”, Applied Sciences 2021, 11, 4880. https://doi.org/10.3390/app11114880

Conference Proceedings:
[1] A. Copiaco, C. Ritz, S. Fasciani, and N. Abdulaziz, “Development of a Synthetic Database for Compact Neural Network Classification of Acoustic Scenes in Dementia Care Environments”, APSIPA, accepted for publication, 2021.
[2] A. Copiaco, C. Ritz, S. Fasciani, and N. Abdulaziz, “Identifying Sound Source Node Location using Neural Networks trained with Phasograms”, 20th IEEE International Symposium on Signal Processing and Information Technology (ISSPIT) 2020, Louisville, Kentucky, USA, Dec. 9-11, 2020, pp. 1-7.
[3] A. Copiaco, C. Ritz, S. Fasciani, and N. Abdulaziz, “An
Application for Dementia Patients Monitoring with an Integrated Environmental Sound Levels Assessment Tool”, 3rd International Conference on Signal Processing and Information Security (ICSPIS), Dubai, United Arab Emirates (UAE), Nov. 25-26, 2020, pp. 1-4.
[4] A. Copiaco, C. Ritz, N. Abdulaziz, and S. Fasciani, “Identifying Optimal Features for Multi-channel Acoustic Scene Classification”, 2nd International Conference on Signal Processing and Information Security (ICSPIS), Dubai, United Arab Emirates (UAE), 2019, pp. 1-4.
[5] A. Copiaco, C. Ritz, S. Fasciani, and N. Abdulaziz, “Scalogram Neural Network Activations with Machine Learning for Domestic Multi-channel Audio Classification”, 19th IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), Ajman, United Arab Emirates, 2019, pp. 1-6.

Technical Reports and Pre-prints:
[1] A. Copiaco, C. Ritz, S. Fasciani, and N. Abdulaziz, “DASEE: A Synthetic Database of Domestic Acoustic Scenes and Events in Dementia Patients’ Environment”, arXiv:2104.13423v2 [eess.AS], Apr. 2021.
[2] A. Copiaco, C. Ritz, S. Fasciani, and N. Abdulaziz, “Sound Event Detection and Classification using CWT Scalograms and Deep Learning”, Detection and Classification of Acoustic Scenes and Events (DCASE) 2020, Task Challenge, Technical Report, 2020.
[3] A. Copiaco, C. Ritz, S. Fasciani, and N. Abdulaziz, “Detecting and Classifying Separated Sound Events using Wavelet-based Scalograms and Deep Learning”, Detection and Classification of Acoustic Scenes and Events (DCASE) 2020, Task Challenge, Technical Report, 2020.

Awards and Distinctions

Best Paper Award, for paper entitled “An Application for Dementia Patients Monitoring with an Integrated Environmental Sound Levels Assessment Tool”, presented at the 3rd International Conference on Signal Processing and Information Security (ICSPIS) 2020.

Best Paper Award, for paper entitled “Identifying Optimal Features for Multi-channel Acoustic Scene Classification”, presented at the 2nd International Conference
on Signal Processing and Information Security (ICSPIS) 2019.

Artificial Intelligence Practitioner – Instructor Certificate, issued by IBM in April 2021
Related Certificates:
- Enterprise Design Thinking, Team Essentials for AI Certificate, March 2021
- Artificial Intelligence Analyst, Explorer Award, July 2020
- Artificial Intelligence Analyst, Mastery Award, August 2020

Enterprise Design Thinking Practitioner – Instructor Certificate, issued by IBM in February 2021
Related Certificates:
- Enterprise Design Thinking, Practitioner Badge, January 2021
- Enterprise Design Thinking, Co-creator Badge, January 2021

List of Names or Abbreviations

AARP  The American Association of Retired Persons
ANN  Artificial Neural Networks
ASN  Acoustic Sensor Network
AT  Assistive Technology
CNN  Convolutional Neural Network
CSV  Comma Separated Value
CWT  Continuous Wavelet Transform
DAG  Directed Acyclic Graph
DASEE  Domestic Acoustic Sounds and Events in the Environment database
DCASE  Detection and Classification of Acoustic Scenes and Events
DCT  Discrete Cosine Transform
DCNN  Deep Convolutional Neural Network
DEMAND  Diverse Environments Multi-channel Acoustic Noise Database
DFT  Discrete Fourier Transform
DNN  Deep Neural Network
DOA  Direction of Arrival
DWT  Discrete Wavelet Transform
eLU  Exponential Linear Unit
ESPRIT  Estimation of Signal Parameters via Rotational Invariance Techniques
FIR  Finite Impulse Response
FFT  Fast Fourier Transform
GLCM  Gray-level Co-occurrence Matrix
GMM  Gaussian Mixture Model
GRNN  Gated Recurrent Neural Network
GUI  Graphical User Interface
k-NN  k-nearest Neighbor
LMS  Least Mean Square
LPCC  Linear Predictive Cepstral Coefficients
LSTM  Long-short Term Memory Recurrent Neural Network
LUFS  Loudness Units relative to Full Scale
MCI  Mild Cognitive Impairment
MFCC  Mel Frequency Cepstral Coefficients
MMSE  Minimum Mean Squared Error
MUSIC  Multiple Signal Classification
NATSEM  The National Centre for Social and Economic Modelling
PNCC  Power Normalized Cepstral
Coefficients
RASTA-PLP  Relative Spectral Perceptual Linear Prediction
ReLU  Rectified Linear Unit
RIR  Room Impulse Response
RLS  Recursive Least Squares
RNN  Recurrent Neural Network
SGDM  Stochastic Gradient Descent with Momentum
SINS  Sound INterfacing through the Swarm database
SNR  Signal-to-Noise Ratio
SPCC  Subspace Projection Cepstral Coefficients
STFT  Short Time Fourier Transform
SVM  Support Vector Machines
TinyEARS  Tiny Energy Accounting and Reporting System
WHO  World Health Organization
VDT  Virtual Dementia Tour
ZCR  Zero Crossing Rate

Table of Contents

Abstract
Acknowledgments
Certification
Thesis Publications
Awards and Distinctions
List of Names or Abbreviations
Table of Contents
List of Tables, Figures and Illustrations
Introduction
  1.1 Overview
  1.2 Dementia
    1.2.1 Signs and Symptoms
    1.2.2 Influence of Age and Gender
    1.2.3 Statistical Evidence
  1.3 Assistive Technology
    1.3.1 Continual Influence of Smart Home Devices
    1.3.2 Ethical Concerns and Considerations
  1.4 Existing Assistive Technology Related to Dementia
    1.4.1 Summary of the Limitations of Existing AT Devices for Dementia Care
    1.4.2 Recommendations and Compliance to Ethical Requirements
    1.4.3 Identification of Domestic Hazards for Dementia Monitoring Systems
    1.4.4 Users of the Monitoring System
  1.5 Objectives and Contributions
    1.5.1 Objectives
    1.5.2 Contributions
  1.6 Thesis Scope
  1.7 Thesis Structure
    1.7.1 Publications
    1.7.2 Thesis Structure and Research Output Alignment
Review of Approaches to Classifying and Localizing Sound Sources
  2.1 Introduction
    2.1.1 System Framework
  2.2 Acoustic Data
    2.2.1 Single-channel Audio Classification
    2.2.2 Multi-channel Audio Classification
    2.2.3 Factors affecting Real-life Audio Recordings
  2.3 Feature Engineering for Audio Signal Classification
Organization, “Comnoise-4,” WHO, 2020 [260] Mathworks, “splMeter,” Mathworks Documentation, 2018 [261] F Lin and e al., “Hearing Loss and Cognitive Decline among Older Adults,” JAMA International Med., vol 173, no 4, 2013 [262] K Holmberg, S Bowman, T Bowman, F Didegah and T Kortelainen, “What Is Societal Impact and Where Do Altmetrics Fit into the Equation?,” Journal of Altmetrics, vol 2, no 1, p 6, 2019 [263] Creative Tim, “Optuna,” Optuna, 2017 [Online] Available: https://optuna.org/ [Accessed 2021] [264] S Bhuiya, F Islam and M Matin, “Analysis of Direction of Arrival Techniques Using Uniform Linear Array,” International Journal of Computer Theory and Engineering, vol 4, 2012 168 Stt.010.Mssv.BKD002ac.email.ninhd 77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77t@edu.gmail.com.vn.bkc19134.hmu.edu.vn.Stt.010.Mssv.BKD002ac.email.ninhddtt@edu.gmail.com.vn.bkc19134.hmu.edu.vn C.33.44.55.54.78.65.5.43.22.2.4 22.Tai lieu Luan 66.55.77.99 van Luan an.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.C.33.44.55.54.78.655.43.22.2.4.55.22 Do an.Tai lieu Luan van Luan an Do an.Tai lieu Luan van Luan an Do an [265] H Hwang, Z Aliyazicioglu, M Grice and A Yakovlev, “Direction of Arrival Estimation using a Root-MUSIC Algorithm,” in International Multi-conference of Engineers and Computer Scientists, Hong Kong, 2008 [266] M R a G Jiji, “Detection of Alzheimer's disease through automated hippocampal segmentation,” in Automation, Computing, Communication, Control and Compressed Sensing (iMac4s), 2013 International Multi-Conference on, Kottayam, 2013 [267] S L a M Boukadoum, “Automatic detection of Alzheimer disease in brain magnetic resonance images using fractal features,” in 
Neural Engineering (NER), 2013 6th Internaional IEEE/EMBS Conference on,, San Diego, CA, 2013 [268] Z P H L a A L Huang, “Automated Diagnosis of Alzheimer's Disease with Degenerate SVM-based Adaboost,” in Intelligent Human-Machine Systems and Cybernetics (IHMSC), 2013 5th International Conference on, Hangzhou, 2013 [269] S Q Y K W F a E H Y Zhong, “Automatic skull stripping in brain MRI based on local moment of intertia structure tensor,” in Information and Automation (ICIA), 2012 International Conference on, Shenyang, 2012 [270] B F a A Dale, “Measuring the thickness of the human cerebral cortex from magnetic resonance images,” Proceedings of the National Academy of Sciences, vol 97, no 20, pp 11050-11055, 2000 [271] A P J R a J G I Krashenyi, “Fuzzy computer-aided diagnosis of Alzheimer's disease using MRI and PET statistical features,” in 2016 IEEE 36th International Conference on Electronics and Nanotechnology (ELNANO), Kiev, 2016 [272] Q Z e al., “Regional MRI measures and neuropsychological test for multi-dimensional analysis in Alzheimer's disease,” in Neural Engineering (NER), 2013 6th International IEEE/EMBS Conference on, , San Diego, CA, 2013 [273] Z T L A A G A T a P T J.H Morra, “Comparison of Adaboost and Support Vector Machines for Detecting Alzheimer's Disease Through Automated Hippocampal Segmentation,” IEEE Transactions on Medical Imaging, vol 29, no 1, pp 30-43, January 2010 [274] Alzheimer's Association, 2016 [Online] Available: www.alz.org/dementia/types-of-dementia.asp [275] S C e al., “Building a surface atlas of hippocampal subfields from MRI scans using FreeSurfer, FIRST, and SPHARM,” in 2014 IEEE 57th International Midwest Symposium on Circuits and Systems (MWSCAS), College Station, TX, 2014 [276] S K e al., “Automatic classification of MR scans in Alzheimer's disease,” in Brain 131, 2008 [277] M Guiberteau, 2015 [Online] Available: RadiologyInfo.org [278] M J e al., “Structural abnormalities in cortical volume, thickness, and surface area 
in 22q11.2 microdeletion syndrome,” Elsevier Journals, vol NeuroImage, no Clinical 3, pp 405-415, 2013 [279] F S e al., “A hybrid approach to the skull stripping problem in MRI,” Elsevier Journals, vol NeuroImage, no 22, pp 1060-1075, 2004 [280] J R e al., “Early Alzheimer's disease diagnosis using partial least squares and random forests,” in 2010 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, Rotterdam, 2010 [281] R A a B Barkana, “A comparative study of brain volume changes in Alzheimer's disease using MRI scans,” in Systems, Applications, and Technology Conference (LISAT), 2015 IEEE Long Island, Farmingdale, NY, 2015 [282] R M a L Machado, “A Fuzzy Poisson Naive Bayes classifier for epidemiological purposes,” in 2015 7th International Joint Conference on Computational Intelligence (IJCCI), Lisbon, Portugal, 2015 [283] J B.-P C B A M A a G C O Ben Ahmed, “Early Alzheimer disease detection with bag-of-visual-words and hybrid fusion on structural MRI,” in Content-Based Multimedia Indexing (CBMI), 2013 11th International Workshop on, Veszprem, 2013 169 Stt.010.Mssv.BKD002ac.email.ninhd 77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77t@edu.gmail.com.vn.bkc19134.hmu.edu.vn.Stt.010.Mssv.BKD002ac.email.ninhddtt@edu.gmail.com.vn.bkc19134.hmu.edu.vn C.33.44.55.54.78.65.5.43.22.2.4 22.Tai lieu Luan 66.55.77.99 van Luan an.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.C.33.44.55.54.78.655.43.22.2.4.55.22 Do an.Tai lieu Luan van Luan an Do an.Tai lieu Luan van Luan an Do an [284] L W a E S X II, “AdaBoost with SVM-based component classifiers,” Engineering Applications of Artificial Intelligence, vol 21, no 5, pp 785-795, 2008 [285] F d l T M.H 
Nguyen, “Optimal feature selection for support vector machines,” Elsevier: Pattern Recognition, vol 43, pp 584-591, February 2010 [286] A Rodgers, Alzheimer's Disease: Unraveling the Mystery, U D o H a H Services, Ed., NIH Publication Number: 08-3782, 2008 [287] Khan Academy Medicine, September 2015 [Online] Available: www.youtube.com/watch?v=LieVEfl4luw&list=PLbKSbFnKYVY3_PviSE5ANWtMPXAXpMj5-&index=6 [288] D B V F D V M F a L B V Nicolas, “Relationships between hippocampal atrophy, white matter disruption, and gray matter hypometabolism in Alzheimer's disease,” Neuroscience, 2008 [289] W G R K T Kapur, “Segmentation of Brain Tissue from MR Images,” AITR-1566, June 1995 [290] B G K Leibe, “Chapter Indexing and Visual Vocabulary,” in Visual Recognition, Texas, University of Texas, 2009, pp 62-69 [291] M Daliri, “Automated Diagnosis of Alzheimer Disease using Scale Invariant feature transform in magnetic resonance images,” Journal of medical systems, pp 995-1000, 2011 [292] M C R C B D H K M N B D S L L G E E O C T I G Chetelat, “Multidimensional classification of hippocampal shape features discriminates Alzheimer's disease and mild cognitive impairment from normal aging,” Neuroimage, 2009 [293] R Poldrack, “Chapter 4: Spatial Normalization,” in Handbook of fMRI Data Analysis, Cambridge, Cambridge University Press, 2011, pp 53-69 [294] H Deng, “A Brief Introduction to Adaboost,” 2007 [295] A C L Breiman, January 2001 [Online] Available: www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#workings [296] D Bhalla, November 2014 [Online] Available: www.listendata.com/2014/11/random-forest-with-r.html [297] Scikit Learn Developers, 2016 [Online] Available: scikit-learn.org/stable/modules/naive_bayes.html [298] H Zhang, “The Optimality of Naive Bayes,” Proc FLAIRS, 2004 [299] I Mihaescu, “Naive-Bayes Classification Algorithm,” Romania [300] J P J.F Norfray, “Alzheimer's disease: neuropathologic findings and recent advances in imaging,” American Journal of 
Roentgenology, vol 182, no 1, pp 3-13, 2003 [301] C Hill, July 2016 [Online] Available: www.verywell.com/neuropsychological-testing-alzheimers-disease-98062 [302] M M K G S R.-T D C J G J.L Cummings, “The Neuropsychiatric Inventory: Comprehensive Assessment of psychopathology in dementia,” Neurology, vol 44, no 12, pp 2308-2314, 1994 [303] e a S Ueckert, “Improved Utilization of ADAS-Cog Assessment Data Through Item Response Theory Based Pharmacometric Modeling,” Pharm Res., vol 31, no 8, pp 2152-2165, 2014 [304] M W L Kurlowicz, “The Mini Mental State Examination (MMSE),” New York, 1999 [305] D Ray, “Edge Detection in Digital Image Processing,” Washington, 2013 [306] G Bebbis, “Edge Detection,” 2003 [307] A E.-Z W.K Al-Jibory, “Edge Detection for diagnosis early Alzheimer's disease by using Weibull distribution,” in 2013 5th International Conference on Microelectronics (ICM), Beirut, 2013 [308] McGill University, “Chapter 5: Edge Detection,” 2005 170 Stt.010.Mssv.BKD002ac.email.ninhd 77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77t@edu.gmail.com.vn.bkc19134.hmu.edu.vn.Stt.010.Mssv.BKD002ac.email.ninhddtt@edu.gmail.com.vn.bkc19134.hmu.edu.vn C.33.44.55.54.78.65.5.43.22.2.4 22.Tai lieu Luan 66.55.77.99 van Luan an.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.C.33.44.55.54.78.655.43.22.2.4.55.22 Do an.Tai lieu Luan van Luan an Do an.Tai lieu Luan van Luan an Do an [309] Alzheimer's Disease Neuroimaging Initiative, 2016 [Online] Available: adni.loni.usc.edu [310] Digital Imaging and Communications in Medicine, [Online] Available: dicom.nema.org/Dicom/about-DICOM.html [311] A L C R J A M Brett, “Spatial Normalization of Brain Images with Focal Lesions Using Cost 
Function Masking,” in MRC Cognition and Brain Sciences Unit, NeuroImage 14, Cambridge, 2000 [312] K F J Ashburner, “Chapter 3: Spatial Normalization using Basis Functions,” in Human Brain Function: 2nd Edition, London, pp 1-26 [313] R Bemis, “MRI Brain Segmentation : Mathworks File Exchange,” 2004 [314] C Kurniawan, “How to Read Multiple DICOM Images: Mathworks, MATLAB Answers,” 2011 [315] Alzheimer's Association, “Tests for Alzheimer's Disease and Dementia,” 2016 [316] S W K K T G J D A K R Perneczky, “Mapping Scores onto Stages: Mini Mental State Examination and Clinical Dementia Rating,” Am J Geriatr Psychiatry, vol 14, no 2, pp 139-144, February 2006 [317] Y Z a L Wu, “An MR Brain Images Classifier via Principal Component Analysis and Kernel Support Vector Machine,” Progress in Electromagnetics Research, vol 130, pp 369-388, June 2012 [318] B Manu, “Brain Tumor Segmentation and Classification : Mathworks File Exchange Website.,” 2015 [319] Image Analyst, “Histogram of ROI of an Image : Mathworks, MATLAB Answers, Website.,” 2013 [320] I Analyst, “Specific Area Inside of an Image : Mathworks, MATLAB Newsgroup Website,” 2009 [321] B Manu, “Plant Leaf Disease Detection and Classification : Mathworks File Exchange Website.,” 2015 [322] D Hull, “GUIDE Basics Tutorial,” 2005 [323] D Hull, “GUIDE Advanced Techniques,” 2005 [324] M A N S E.M Tagluk, “Classification of sleep apnea by using wavelet transform and artificial neural networks,” Expert Systems with Applications, vol 37, no 2, pp 1600-1607, 2010 [325] McConnell Brain Imaging Centre, 2016 [Online] Available: https://www.mcgill.ca/bic/resources/brainatlases/human [326] D J a J Martin, “Chapter Logistic Regression,” in Speech and Language Processing, 2016 [327] S P R S M Dudik, “Maximum Entropy Density Estimation with Generalized Regularization and an application to species distribution modeling,” Journal of Machine Learning Research, vol 8, no 6, 2007 [328] M J A.Y Ng, “On Discriminative vs Generative 
Classifiers: A comparison of logistic regression and naive bayes,” NIPS 14, pp 841-848, 2002 [329] R Tibshirani, “Regression Shrinkage and Selection via the lasso,” Journal of the Royal Statistical Society Series B (Methodological), vol 58, no 1, pp 267-288, 1996 [330] V Fang, “Advantages and Disadvantages of Logistic Regression,” 2011 [331] K P K Thangadurai, “Computer Visionimage Enhancement for Plant Leaves Disease Detection,” in World Congress on Computing and Communication Technologies, 2014 [332] D T R Krutsch, “Histogram Equalization,” Microcontroller Solutions Group, Freescale Semiconsudctor Document No.: AN4318, June 2011 [333] Y S K Vij, “Enhancement of Images using Histogram Processing Techniques,” Comp Tech Appl., vol [334] K V T Kumar, “A Theory Based on Conversion of RGB image to Gray Image,” International Journal of Computer Applications, vol 7, no 2, pp 0975-8887, September 2010 [335] Mathworks, [Online] Available: www.mathworks.com/products/statistics/classification-learner.html 171 Stt.010.Mssv.BKD002ac.email.ninhd 77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77t@edu.gmail.com.vn.bkc19134.hmu.edu.vn.Stt.010.Mssv.BKD002ac.email.ninhddtt@edu.gmail.com.vn.bkc19134.hmu.edu.vn C.33.44.55.54.78.65.5.43.22.2.4 22.Tai lieu Luan 66.55.77.99 van Luan an.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.C.33.44.55.54.78.655.43.22.2.4.55.22 Do an.Tai lieu Luan van Luan an Do an.Tai lieu Luan van Luan an Do an [336] Mathworks, “Train SVM Classifiers using a Gaussian Kernel,” 2017 [337] A Maurya, “Intuition behind Gaussian Kernel in the SVM,” 2016 [338] S David, “Handwritten Number Classification using Logistic Regression,” 2015 [339] Mathworks, “Documentation: 
svmclassify,” 2013 [340] Mathworks, “Create Apps with Graphical User Interfaces in MATLAB” [341] Government of Dubai, Health Facility Guidelines: Planning, Design, Construction, and Commissioning, 2012 [342] V Tiwari, “MFCC and its applications in speaker recognition,” International Journal of Emerging Technologies, vol 1, no 1, pp 19-22, 2010 [343] A Graves, A Mohamed and G Hinton, “Speech recognition with deep recurrent neural networks,” in IEEE International Conference on Acoustics, Speech, and Signal processing, 2013 [344] R Pascanu, C Gulcehre, K Cho and Y Bengio, “How to construct Deep Recurrent Neural Networks,” 2013 [345] M Liang and X Hu, “Recurrent convolutional neural network for object recognition,” in IEEE Conference on Computer Vision and Pattern Recognition, 2015 [346] P Pinheiro and R Collobert, “Recurrent Convolutional Neural Networks for Scene Labeling,” in International Conference on Machine Learning, 2014 [347] X Zhang, J Yu, G Feng and D Xu, “Blind Direction of Arrival Estimation of Coherent Sources using Multiinvariance Property,” Progress In Electromagnetics Research, PIER, vol 88, pp 181-195, 2008 [348] D Stewart, H Wang, J Shen and P Miller, “Investigations into the robustness of audio-visual gender classification to background noise and illumination effects,” in Digital Image Computing: Techniques and Applications, IEEE Computer Society, 2009 [349] IBM, Enterprise Design Thinking Course, 2021 172 Stt.010.Mssv.BKD002ac.email.ninhd 77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77t@edu.gmail.com.vn.bkc19134.hmu.edu.vn.Stt.010.Mssv.BKD002ac.email.ninhddtt@edu.gmail.com.vn.bkc19134.hmu.edu.vn C.33.44.55.54.78.65.5.43.22.2.4 22.Tai lieu Luan 66.55.77.99 van Luan 
Appendices

Appendix

In this section, we provide the code used to generate the DASEE dataset. The synthetic dataset is made publicly available through Kaggle so that it can be used for further research.

Room Impulse Response Generation Code

The following code was used to generate the room impulse responses for four linearly arranged microphones, as used for the DASEE synthetic database. The example shown is for the first node. Similar code is used for the other three nodes, adjusted according to their placements and the relevant coefficients.

Code: Room Impulse Response Generation Code

% This code is used to develop the DASEE domestic audio dataset by Abigail Copiaco,
% Prof Christian Ritz, Dr Nidhal Abdulaziz, and Dr Stefano Fasciani, 2021
%
% Variables List:
% Input/s:
%   location = room location (bedroom, living, kitchen, bath, halfbath, dressing)
%   s = sound source position [x y z] (m)
% Output/s:
%   h1 = room impulse response for the first mic of the linear mic array
%   h2 = room impulse response for the second mic of the linear mic array
%   h3 = room impulse response for the third mic of the linear mic array
%   h4 = room impulse response for the fourth mic of the linear mic array
%   name = room location name
%   node = node number (1)
%
% Completed and tested on MATLAB R2020a
% ==================================================================================
% Acknowledgement:
% This code adds on and uses the 'rird_generator' function from the following source:
% Room Impulse Response for Directional source generator (RIRDgen)
% by Sina Hafezi, Alastair H Moore, and Patrick A Naylor, 2015
% ==================================================================================
%% NODE
function [h1, h2, h3, h4, name, node] = RIRD_Node1(location, s)
c = 343;       % Sound velocity (m/s)
fs = 16000;    % Sample frequency (samples/s)
node = 1;

%% Bedroom
if strcmp(location, 'bedroom')
    % Receiver position [x y z] (m) for a microphone array of first node
    r1 = [3.45, 0.05, 2.8];
    r2 = [3.5, 0.05, 2.8];
    r3 = [3.55, 0.05, 2.8];
    r4 = [3.6, 0.05, 2.8];
    L = [3.6576 4.2418 3];    % Room dimensions [x y z] (m) - Bedroom
    name = string('bedroom');
    betha = [0.568, 0.572, 0.7; 0.576, 0.568, 0.488];    % Walls reflection coefficients

%% Living/Dining Room
elseif strcmp(location, 'living')
    % Receiver position [x y z] (m) for a microphone array of first node
    r1 = [6.58, 0.05, 2.8];
    r2 = [6.63, 0.05, 2.8];
    r3 = [6.68, 0.05, 2.8];
    r4 = [6.73, 0.05, 2.8];
    L = [6.7818 5.207 3];    % Room dimensions [x y z] (m) - Living/Dining Room
    name = string('living');
    betha = [0.568, 0.572, 0.7; 0.572, 0.568, 0.488];    % Walls reflection coefficients

%% Kitchen
elseif strcmp(location, 'kitchen')
    % Receiver position [x y z] (m) for a microphone array of first node
    r1 = [3.05, 0.05, 2.8];
    r2 = [3.1, 0.05, 2.8];
    r3 = [3.15, 0.05, 2.8];
    r4 = [3.2, 0.05, 2.8];
    L = [3.2512 3.0226 3];    % Room dimensions [x y z] (m) - Kitchen
    name = string('kitchen');
    betha = [0.594, 0.594, 0.8; 0.594, 0.594, 0.515];    % Walls reflection coefficients

%% Bath
elseif strcmp(location, 'bath')
    % Receiver position [x y z] (m) for a microphone array of first node
    r1 = [2.54, 0.05, 2.8];
    r2 = [2.59, 0.05, 2.8];
    r3 = [2.64, 0.05, 2.8];
    r4 = [2.69, 0.05, 2.8];
    L = [2.7432 1.6 3];    % Room dimensions [x y z] (m) - Bathroom
    name = string('bath');
    betha = [0.626, 0.62, 0.8; 0.7, 0.7, 0.541];

%% Half bath
elseif strcmp(location, 'halfbath')
    % Receiver position [x y z] (m) for a microphone array of first node
    r1 = [1.47, 0.05, 2.8];
    r2 = [1.52, 0.05, 2.8];
    r3 = [1.57, 0.05, 2.8];
    r4 = [1.62, 0.05, 2.8];
    L = [1.6764 2.0828 3];    % Room dimensions [x y z] (m) - Halfbath
    name = string('halfbath');
    betha = [0.626, 0.626, 0.8; 0.7, 0.7, 0.541];

%% Dressing Room
elseif strcmp(location, 'dressing')
    % Receiver position [x y z] (m) for a microphone array of first node
    r1 = [1.88, 0.05, 2.8];
    r2 = [1.93, 0.05, 2.8];
    r3 = [1.98, 0.05, 2.8];
    r4 = [2.03, 0.05, 2.8];
    L = [2.0828 2.1336 3];    % Room dimensions [x y z] (m) - Dressing Room
    name = string('dressing');
    betha = [0.578, 0.578, 0.7; 0.565, 0.565, 0.488];

else
    % Stop early instead of failing later on undefined receiver positions
    error('RIRD_Node1:unknownLocation', 'Unknown room location: %s', char(location));
end

%% This part of the code is derived from the acknowledged source
n = 1000;        % Number of samples
max_ref = -1;    % Maximum reflection order (-1 is all possible reflection order)

r1_s = r1 - s;   % Receiver location w/ resp to source
% Source orientation (radian) [azimuth elevation]
source_orient1 = [atan(r1_s(2)/r1_s(1)), (pi/2) - acos(r1_s(3)/norm(r1_s))];

r2_s = r2 - s;   % Receiver location w/ resp to source
% Source orientation (radian) [azimuth elevation]
source_orient2 = [atan(r2_s(2)/r2_s(1)), (pi/2) - acos(r2_s(3)/norm(r2_s))];

r3_s = r3 - s;   % Receiver location w/ resp to source
% Source orientation (radian) [azimuth elevation]
source_orient3 = [atan(r3_s(2)/r3_s(1)), (pi/2) - acos(r3_s(3)/norm(r3_s))];

r4_s = r4 - s;   % Receiver location w/ resp to source
% Source orientation (radian) [azimuth elevation]
source_orient4 = [atan(r4_s(2)/r4_s(1)), (pi/2) - acos(r4_s(3)/norm(r4_s))];

% Customized pattern
azimuth_samples = [0, 2*pi - 0.0001];
elevation_samples = [-pi/2, pi/2];
frequency_samples = [0, 20000];
% Sampled omnidirectional pattern (unit gain everywhere)
gain = ones(length(azimuth_samples), length(elevation_samples), length(frequency_samples));
source_type = {azimuth_samples, elevation_samples, frequency_samples, gain};
interp_method = 'linear';    % interpolation method used in directivity pattern in case of customised pattern

%% Generating the room impulse responses
h1 = rird_generator(c, fs, r1, s, L, betha, n, source_orient1, max_ref, source_type, interp_method);
h2 = rird_generator(c, fs, r2, s, L, betha, n, source_orient2, max_ref, source_type, interp_method);
h3 = rird_generator(c, fs, r3, s, L, betha, n, source_orient3, max_ref, source_type, interp_method);
h4 = rird_generator(c, fs, r4, s, L, betha, n, source_orient4, max_ref, source_type, interp_method);
end

Code for Sound Convolution with the Room Impulse Response

The following code was
utilized for convolving the generated room impulse responses with the audio signal, on a four-channel basis. This code is used to generate the data files of the DASEE domestic audio dataset.

Code: Four-channel Sound Convolution with Room Impulse Response

%% =================================================================================
% Description:
% This code convolves the room impulse response with the audio signals, on a four-channel
% basis. This code is used to generate the data files of the DASEE domestic audio dataset
%
% This code is used to develop the DASEE domestic audio dataset by Abigail Copiaco,
% Prof Christian Ritz, Dr Nidhal Abdulaziz, and Dr Stefano Fasciani, 2021
%
% Variables List:
% Input/s:
%   wav_file = audio file (.wav format)
%   h1 = room impulse response from first mic
%   h2 = room impulse response from second mic
%   h3 = room impulse response from third mic
%   h4 = room impulse response from fourth mic
%   labels = to assign to a specific label / folder
%   ii = iteration number (will be included in the file name, remove if required)
%   name = audio file name
%   node = node number (between 1-4)
% Output/s: convolved audio signal saved at specified location (.wav)
%
% Completed and tested on MATLAB R2020a
% =================================================================================
function convolve_fourchannel(wav_file, h1, h2, h3, h4, labels, ii, name, node)
%% Read the audio
[x, Fs] = audioread(wav_file);
% Resample the audio if the sampling rate is not 16 kHz
if Fs ~= 16000
    x = resample(x, 16000, Fs);
    Fs = 16000;
end
% Average the input channels into a single-channel signal
x_new = mean(x, 2);

%% Renaming the impulse responses (forced to column vectors)
IR_1 = h1(:);
IR_2 = h2(:);
IR_3 = h3(:);
IR_4 = h4(:);

%% Convolve
convolved_1 = conv(x_new, IR_1);
convolved_2 = conv(x_new, IR_2);
convolved_3 = conv(x_new, IR_3);
convolved_4 = conv(x_new, IR_4);

%% Four channels for one node
convolved = [convolved_1, convolved_2, convolved_3, convolved_4];

%% Save
imageRoot = 'D:\Datasets\source_files_new\';    % change to the output root folder
imgLoc = fullfile(imageRoot, labels);
imFileName1 = strcat(name, '_', labels, '_', string(ii), '_', string(node), '.wav');
audiowrite(string(fullfile(imgLoc, imFileName1)), convolved, Fs);

%% Uncomment to play the audio (per channel)
% conplayer = audioplayer(convolved_1, Fs);
% play(conplayer);
end

Code for adding background noises

Relevant background noises were added to the audio signals in order to reflect real-life recordings. To do this, the following code was used.

Code: Adding Background Noises at a Specified SNR Level

% Description:
% This code adds on to the 'v_addnoise.m' function, for adding noise to a clean audio
% signal at a specified SNR level (dB), for a four-channel audio. This code is
% used to generate the noisy data files of the DASEE domestic audio dataset
%
% This code is used to develop the DASEE domestic audio dataset by Abigail Copiaco,
% Prof Christian Ritz, Dr Nidhal Abdulaziz, and Dr Stefano Fasciani, 2021
% Variables List:
% Input/s:
%   original = original signal (.wav format)
%   noise = noise signal (.wav format)
%   labels = label of the original signal (to be used to categorize the generated noisy audio file)
%   noise_labels = label of the noise signal (to be used to concatenate with the audio name,
%                  as the filename of the generated noisy audio file)
%   name = audio file name
%   db = desired SNR level (dB) at which the noise will be added to the signal
% Output/s: noisy audio signal saved at specified location (.wav)
% Note:
%   If importing noise as a matrix, comment out the audioread call for the noise
%   and pass noise as a matrix
%
% Completed and tested on MATLAB R2020a
% ==================================================================================
% Acknowledgement:
% This code uses the 'v_addnoise.m' function, downloadable from the following link:
% http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/mdoc/v_mfiles/v_addnoise.html
% Copyright (C) Mike Brookes 2014
% Version: $Id: v_addnoise.m 10461 2018-03-29 13:30:51Z dmb $
%
% VOICEBOX is a MATLAB toolbox for speech processing
% Home page: http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This program is free software; you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation; either version 2 of the License, or
% (at your option) any later version.
%
% This program is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
% GNU General Public License for more details.
%
% You can obtain a copy of the GNU General Public License from
% http://www.gnu.org/copyleft/gpl.html or by writing to
% Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA
%% =================================================================================
function audio_mix_noise(original, noise, labels, noise_labels, name, db)
[s, fsx] = audioread(original);    % reads the original signal
[nb, fsa] = audioread(noise);      % reads the noise signal
% comment out the audio read if you are passing data in matrix format

% use v_addnoise to mix the noise (averaged to one channel) with the clean signal
[z, ~, fso] = v_addnoise(s, fsx, db, '', mean(nb, 2), fsa);

% Multi-channel (4-channel): split the output into four equal-length channels
L = length(z)/4;
z1 = z(1:L);
z2 = z(L+1:(L*2));
z3 = z((L*2)+1:(L*3));
z4 = z((L*3)+1:(L*4));
new = [z1, z2, z3, z4];

% Save
imageRoot = 'E:\Other_SNRs\Audio';    % change to the output path
imgLoc = fullfile(imageRoot, labels);
imFileName = strcat(name, '_', string(noise_labels), '_', string(db), '.wav');
audiowrite(string(fullfile(imgLoc, imFileName)), new, fso);
end
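The noise-addition step relies on the VOICEBOX function v_addnoise. To make the underlying idea easier to follow without that toolbox, the short script below is a minimal sketch of mixing noise into a four-channel signal at a target SNR using a plain mean-power ratio. It is not the code used to build DASEE, and v_addnoise may measure signal and noise levels differently. All variable names and the sine/white-noise test signals are assumptions made for illustration, and adding the same noise segment to every channel is a simplification.

```matlab
% Illustrative sketch only (not the DASEE code): mix noise into a
% four-channel signal at a target SNR using a plain mean-power ratio.
% All variable names and test signals below are hypothetical.
fs     = 16000;                          % sampling rate used in this appendix (Hz)
snr_db = 10;                             % desired SNR (dB)
t      = (0:fs-1)' / fs;                 % one second of sample times
clean  = repmat(sin(2*pi*440*t), 1, 4);  % stand-in four-channel clean signal
noise  = randn(fs, 1);                   % stand-in mono noise of the same length

% Mean power of the clean signal (all channels pooled) and of the noise
p_clean = mean(clean(:).^2);
p_noise = mean(noise.^2);

% Gain g chosen so that p_clean / (g^2 * p_noise) = 10^(snr_db/10)
g = sqrt(p_clean / (p_noise * 10^(snr_db/10)));

% Add the same scaled noise to every channel (a simplification)
noisy = clean + g * repmat(noise, 1, 4);

% Measure the achieved SNR from the added-noise residual
achieved_db = 10*log10(p_clean / mean((noisy(:) - clean(:)).^2));
fprintf('Target SNR: %.2f dB, achieved SNR: %.2f dB\n', snr_db, achieved_db);
```

In practice the stand-in signals would be replaced by a convolved four-channel recording and a noise recording read with audioread, and the noise would need to be looped or trimmed to the length of the clean signal before mixing.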