VNU Master's thesis: Real Time Multimodal Baby Monitoring System (Hệ thống giám sát em bé đa phương thức thời gian thực)

VIETNAM NATIONAL UNIVERSITY, HANOI
INSTITUT FRANCOPHONE INTERNATIONAL

Abdoul Djalil OUSSEINI HAMZA

Real Time Multimodal Baby Monitoring System
Hệ thống giám sát em bé đa phương thức thời gian thực

Specialty: Intelligent Systems and Multimedia
Code: 8480201.02 (pilot programme)

MASTER'S THESIS IN COMPUTER SCIENCE

Supervisor: Dr NGUYEN Trong Phuc - Ifi-solution

HANOI - 2020

DECLARATION OF HONOUR

I declare on my honour that this thesis is my own work and that the data and results presented in it are accurate and have never been published elsewhere. The sources of the information cited in this thesis have been clearly indicated.

Student's signature
Abdoul Djalil OUSSEINI HAMZA

Acknowledgements

I first thank God Almighty for giving me parents who showed me the way to school and thanks to whom I am where I am today.

The completion of this thesis was made possible by the help of several people to whom I would like to express my gratitude. I would first like to thank my internship supervisor, Dr NGUYEN Trong Phuc, project manager at Ifi-Solution and lecturer-researcher at the University of Transport and Communications in Hanoi, Vietnam. Dr NGUYEN Trong Phuc's office door was always open whenever I ran into a problem or had a question about my research. He consistently allowed this document to be my own work, but steered me in the right direction whenever he thought I needed it.

I would also like to thank Mr Hoan Dinh Van, manager at Ifi-Solution, who took part in carrying out and validating this project. This work could not have been accomplished without their effort and passionate contributions.

I would like to thank the head of our Master's programme, Dr Ho Tuong Vinh, as well as all the teaching and administrative staff of the Institut Francophone International, Vietnam National University, Hanoi. I am wholeheartedly grateful to them for ensuring and improving the quality of our training.

Finally, I wish to express my deep gratitude to my parents, my family and Ms Võ Thu Trang for their unfailing support and constant encouragement throughout my years in the Master's programme. Nor do I forget my friends, who have always been there for me. Your unconditional support and encouragement were a great help. Thank you.

Abdoul Djalil OUSSEINI HAMZA

Abstract

Few studies have addressed the detection of babies' movements in their cradles, and the few that have dealt with this problem relied on conventional machine learning algorithms, such as SVMs, as classifiers. In this work, we propose a new approach for detecting infants' movements and cries based on recent convolutional neural network (CNN) architectures. The first part presents the host organisation, describing the missions and departments that make up the company. The second part covers the state of the art, reviewing related work and establishing a comparative study. The third part deals with the proposed solutions and our contributions. The fourth part covers the experiments and results, where we report all the experiments carried out for the project. Finally, the last part gives the conclusion and the prospects for future work in the field.

Keywords: baby monitoring, object detection, region proposal, convolutional neural network, baby cry

Table of contents

List of tables
List of figures

1 GENERAL INTRODUCTION
  1.1 Presentation of the host organisation
    1.1.1 Ifi-solution
  1.2 Context, objectives and problem statement
    1.2.1 Context
    1.2.2 Objectives
    1.2.3 Problem statement
2 STATE OF THE ART
  2.1 Classical techniques
    2.1.1 Frame Differencing
    2.1.2 Optical flow
    2.1.3 Background Subtraction
  2.2 Neural-network-based techniques
    2.2.1 Convolutional Neural Networks (CNNs/ConvNets)
      2.2.1.1 CNN architecture
      2.2.1.2 How a ConvNet works
      2.2.1.3 Designing ConvNets
      2.2.1.4 Other ConvNet architectures
    2.2.2 Some object detection algorithms
      2.2.2.1 Fast R-CNN
      2.2.2.2 Faster R-CNN
      2.2.2.3 SSD (Single Shot Detector)
  2.3 Comparison of the methods used
    2.3.1 Faster R-CNN
      2.3.1.1 Presentation and architecture
      2.3.1.2 Model details
    2.3.2 SSD (Single Shot Detector)
      2.3.2.1 Presentation and architecture
      2.3.2.2 Model details
    2.3.3 Comparison
      2.3.3.1 Comparison of feature extractors
      2.3.3.2 Combination of models
3 PROPOSED SOLUTIONS & CONTRIBUTIONS
  3.1 Why the algorithms of the Tensorflow detection model zoo?
  3.2 Why is the AUDIO aspect not covered in our work?
  3.3 Course of our work
  3.4 General architecture of the solution
    3.4.1 Transfer Learning
      3.4.1.1 General concepts
    3.4.2 Fine-tuning with our dataset: BbsD
    3.4.3 Architecture of the solution
    3.4.4 Contribution
    3.4.5 Details of the contributions
      3.4.5.1 The different classes of Faster-RCNN
      3.4.5.2 Changes and modifications made to the Faster-RCNN classes
    3.4.6 Key takeaways from our contributions
    3.4.7 Influential parameters
4 EXPERIMENTS & RESULTS
  4.1 Problems encountered
    4.1.1 Data acquisition conditions
    4.1.2 Dataset: BbsD
    4.1.3 Hardware performance
  4.2 Experiments
    4.2.1 Data pre-processing
    4.2.2 Optimisation algorithm and loss function
    4.2.3 Evaluation metric
    4.2.4 Integration pipeline for the Raspberry
  4.3 Results
    4.3.1 Analyses
    4.3.2 Graph results
    4.3.3 Output
5 CONCLUSION & PERSPECTIVES
  5.1 General conclusion
  5.2 Perspectives
A Algorithms
  A.1 Implementation of the Faster R-CNN model
  A.2 prepare batch
  A.3 generate data
  A.4 start train

CHAPTER 4. EXPERIMENTS & RESULTS

Figure 4.8 – Posture: Stand

The figures show very high precision overall. Fig. 4.7 shows a precision of 99% for the Stand ("standing") posture, whereas the precision values in Fig. 4.8 are moderate, at 80-89%. However, the algorithm gets confused when detecting postures: it cannot tell exactly whether the baby is standing or lying down. This error arises because the baby's posture is not very clear in the image. We can therefore assume that, despite the algorithm's confusion in identifying the best posture, its reliability is not called into question.

Figure 4.9 – Posture: Sleep
Figure 4.10 – Posture: Sleep

The Sleep ("lying down") postures in Fig. 4.9 and Fig. 4.10 were detected well, with very good precision. However, the algorithm is again confused in Fig. 4.10; this time the confusion does not concern the posture but the shape of the baby's body.

Figure 4.11 – Posture: Sit
Figure 4.12 – Posture: Sit

In Fig. 4.11 and Fig. 4.12 the detection is correct. Precision was high in both cases, reaching 99.99% for this class (Sit = "sitting"). Everything suggests that the program detects this class better than the others. This could be explained by the fact that there are more training images in this class than in the others.

CHAPTER 5. CONCLUSION & PERSPECTIVES

5.1 General conclusion

This chapter presents the conclusions drawn from the work as a whole and from the results of Chapter 4, together with the limitations of the framework and possible future work. This work is a first attempt at solving a baby-monitoring problem; we do not have enough data to build a solution at very large scale. The resources and all the necessary requirements were redefined after the experiments.

To carry out this work, we first conducted a thorough review of the state of the art in object detection algorithms, from the oldest to the most recent ones based on neural networks. We then carried out a comparative study of the best-performing methods in the state of the art. On the basis of this study, we chose a reasoned approach that seemed adequate, with respect to existing work, for solving our problem. Finally, we proposed a solution, which we implemented in order to achieve the expected results.

The results of this work will be used to gain an overview and to explore the possibilities of reaching our objectives. Moreover, relative to the state of the art, the results of our experiments are fairly promising given that they were obtained with limited resources and data; we can expect better performance later on. Although a respectable test accuracy of 93% was achieved, the collected dataset was too limited and was not truly representative of the real-world domain. It should also be noted that no validation was carried out during this work, because there is no reference dataset available online against which we could test our algorithms and make a comparison.

5.2 Perspectives

Our work is part of a feasibility project intended to assess whether the approach can be applied in the real world. Going forward, we plan to work on creating a reference database for tasks related to baby monitoring. Since it would be difficult to collect a fully representative dataset to train our models, it would be worthwhile to study and test further the domain adaptation techniques described in Csurka's article [17].

Bibliography

[1] M. S. K. Mishra, F. Jtmcoe, and K. Bhagat, “A survey on human motion detection and surveillance,” International Journal of Advanced Research in Electronics and Communication Engineering (IJARECE), vol. 4, 2015.
[2] H. S. A. Karaki, S. A. Alomari, and M. H. Refai, “A comprehensive survey of the vehicle motion detection and tracking methods for aerial surveillance videos,” IJCSNS, vol. 19, no. 1, p. 93, 2019.
[3] N. Dastanova, S. Duisenbay, O. Krestinskaya, and A. P. James, “Bit-plane extracted moving-object detection using memristive crossbar-cam arrays for edge computing image devices,” IEEE Access, vol. 6, pp. 18954–18966, 2018.
[4] I. Kartika and S. S. Mohamed, “Frame differencing with post-processing techniques for moving object detection in outdoor environment,” in 2011 IEEE 7th International Colloquium on Signal Processing and its Applications, pp. 172–176, IEEE, 2011.
[5] Y. Lin, M. Fang, and D. Shihong, “An object reconstruction algorithm for moving vehicle detection based on three-frame differencing,” in 2015 IEEE 12th Intl Conf on Ubiquitous Intelligence and Computing and 2015 IEEE 12th Intl Conf on Autonomic and Trusted Computing and 2015 IEEE 15th Intl Conf on Scalable Computing and Communications and Its Associated Workshops (UIC-ATC-ScalCom), pp. 1864–1868, IEEE, 2015.
[6] A. K. Chauhan and P. Krishan, “Moving object tracking using gaussian
mixture model and optical flow,” International Journal of Advanced Research in Computer Science and Software Engineering, vol. 3, no. 4, 2013.
[7] A. Bruhn, J. Weickert, and C. Schnörr, “Lucas/Kanade meets Horn/Schunck: Combining local and global optic flow methods,” International Journal of Computer Vision, vol. 61, no. 3, pp. 211–231, 2005.
[8] D. Raju and P. Joseph, “Motion detection and optical flow,” International Journal of Computer Science and Information Technologies, vol. 5, no. 4, pp. 5716–5719, 2014.
[9] C. Sukanya, R. Gokul, and V. Paul, “A survey on object recognition methods,” International Journal of Science, Engineering and Computer Technology, vol. 6, no. 1, p. 48, 2016.
[10] R. K. Rout, A survey on object detection and tracking algorithms. PhD thesis, 2013.
[11] M. Sankari and C. Meena, “Estimation of dynamic background and object detection in noisy visual surveillance,” International Journal, vol. 2, 2011.
[12] H. S. Parekh, D. G. Thakore, and U. K. Jaliya, “A survey on object detection and tracking methods,” International Journal of Innovative Research in Computer and Communication Engineering, vol. 2, no. 2, pp. 2970–2979, 2014.
[13] A. D. Alzughaibi, H. A. Hakami, and Z. Chaczko, “Review of human motion detection based on background subtraction techniques,” International Journal of Computer Applications, vol. 122, no. 13, 2015.
[14] R. S. Rakibe and B. D. Patil, “Background subtraction algorithm based human motion detection,” International Journal of Scientific and Research Publications, vol. 3, no. 5, pp. 2250–3153, 2013.
[15] Y. Du, W. Wang, and L. Wang, “Hierarchical recurrent neural network for skeleton based action recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1110–1118, 2015.
[16] L. Torrey and J. Shavlik, “Transfer learning,” in Handbook of Research on Machine Learning Applications and Trends: Algorithms, Methods, and Techniques, pp. 242–264, IGI Global, 2010.
[17] G.
Csurka, “Domain adaptation for visual applications: A comprehensive survey,” arXiv preprint arXiv:1702.05374, 2017.
[18] W.-T. Lee and H.-T. Chen, “Histogram-based interest point detectors,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1590–1596, IEEE, 2009.
[19] S. Ji, W. Xu, M. Yang, and K. Yu, “3D convolutional neural networks for human action recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2012.
[20] P. Wang, W. Li, P. Ogunbona, J. Wan, and S. Escalera, “RGB-D-based human motion recognition with deep learning: A survey,” Computer Vision and Image Understanding, vol. 171, pp. 118–139, 2018.
[21] G. Johansson, “Visual perception of biological motion and a model for its analysis,” Perception & Psychophysics, vol. 14, no. 2, pp. 201–211, 1973.

Appendix A
Algorithms

A.1 Implementation of the Faster R-CNN model

Source: https://github.com/dongjk/faster_rcnn_keras/blob/master/RCNN.py

    # Imports assumed for standalone Keras 2.x with the TensorFlow backend;
    # the original listing omits them. `smoothL1` is a loss helper defined
    # elsewhere in the project.
    import keras.backend as K
    from keras.layers import BatchNormalization, Dense, Flatten, Input, Layer
    from keras.models import Model


    class RoIPooling(Layer):
        """Crops each RoI from the feature map and resizes it to a fixed size."""

        def __init__(self, size=(7, 7)):
            self.size = size
            super(RoIPooling, self).__init__()

        def build(self, input_shape):
            self.shape = input_shape
            super(RoIPooling, self).build(input_shape)

        def call(self, inputs, **kwargs):
            # inputs = [feature_map, rois, box indices]
            ind = K.reshape(inputs[2], (-1,))
            x = K.tf.image.crop_and_resize(inputs[0], inputs[1], ind, self.size)
            return x

        def compute_output_shape(self, input_shape):
            a = input_shape[1][0]
            b = self.size[0]
            c = self.size[1]
            d = input_shape[0][3]
            return (a, b, c, d)


    BATCH = 256

    feature_map = Input(batch_shape=(None, None, None, 1536))
    rois = Input(batch_shape=(None, 4))
    ind = Input(batch_shape=(None, 1), dtype='int32')

    p1 = RoIPooling()([feature_map, rois, ind])

    flat1 = Flatten()(p1)

    fc1 = Dense(
        units=1024,
        activation="relu",
        name="fc2"
    )(flat1)
    fc1 = BatchNormalization()(fc1)

    # Box regression head: 4 deltas for each of the 200 classes.
    output_deltas = Dense(
        units=4 * 200,
        activation="linear",
        kernel_initializer="uniform",
        name="deltas2"
    )(fc1)

    # Classification head: one score for each of the 200 classes.
    output_scores = Dense(
        units=1 * 200,
        activation="softmax",
        kernel_initializer="uniform",
        name="scores2"
    )(fc1)

    model = Model(inputs=[feature_map, rois, ind],
                  outputs=[output_scores, output_deltas])
    model.summary()
    model.compile(optimizer='rmsprop',
                  loss={'deltas2': smoothL1,
                        'scores2': 'categorical_crossentropy'})

Listing A.1 – RoI Pooling layer

A.2 prepare batch

    # Imports assumed (Keras 2.x); the original listing omits them. The helpers
    # generate_anchors, bbox_transform, bbox_transform_inv, clip_boxes,
    # filter_boxes, py_cpu_nms, bbox_overlaps, loss_cls and smoothL1 come from
    # the project's utility code, which is not shown here.
    import numpy as np
    import numpy.random as npr
    from keras.applications.inception_resnet_v2 import InceptionResNetV2
    from keras.models import load_model
    from keras.preprocessing.image import img_to_array, load_img

    FG_FRAC = .25
    FG_THRESH = .5
    BG_THRESH_HI = .5
    BG_THRESH_LO = .1

    # load an example to void graph problem
    # TODO fix this
    pretrained_model = InceptionResNetV2(include_top=False)
    # The image file name is incomplete in the source listing.
    img = load_img("/ILSVRC_train_0.JPEG")
    x = img_to_array(img)
    x = np.expand_dims(x, axis=0)
    not_used = pretrained_model.predict(x)

    rpn_model = load_model('weights.hdf5',
                           custom_objects={'loss_cls': loss_cls,
                                           'smoothL1': smoothL1})
    not_used = rpn_model.predict(np.load('n02676566_6914')['fc'])


    def produce_batch(filepath, gt_boxes, h_w, category):
        img = load_img(filepath)
        # `scale` is not defined in this listing; it is expected to be a
        # module-level (height, width) pair of scale factors.
        img_width = np.shape(img)[1] * scale[1]
        img_height = np.shape(img)[0] * scale[0]
        img = img.resize((int(img_width), int(img_height)))
        # feed image to pretrained model and get feature map
        img = img_to_array(img)
        img = np.expand_dims(img, axis=0)
        feature_map = pretrained_model.predict(img)
        height = np.shape(feature_map)[1]
        width = np.shape(feature_map)[2]
        num_feature_map = width * height
        # calculate output w, h stride
        w_stride = h_w[1] / width
        h_stride = h_w[0] / height
        # generate base anchors according to output stride
        # base anchors are 9 anchors wrt a tile (0, 0, w_stride-1, h_stride-1)
        base_anchors = generate_anchors(w_stride, h_stride)
        # slice tiles according to image size and stride
        # each 1x1x1536 feature map cell maps to a tile
        shift_x = np.arange(0, width) * w_stride
        shift_y = np.arange(0, height) * h_stride
        shift_x, shift_y = np.meshgrid(shift_x, shift_y)
        shifts = np.vstack((shift_x.ravel(), shift_y.ravel(),
                            shift_x.ravel(), shift_y.ravel())).transpose()
        # apply base anchors to all tiles, to have num_feature_map*9 anchors
        all_anchors = (base_anchors.reshape((1, 9, 4)) +
                       shifts.reshape((1, num_feature_map, 4)).transpose((1, 0, 2)))
        total_anchors = num_feature_map * 9
        all_anchors = all_anchors.reshape((total_anchors, 4))
        # feed feature map to pretrained RPN model, get proposal labels and bboxes
        res = rpn_model.predict(feature_map)
        scores = res[0]
        scores = scores.reshape(-1, 1)
        deltas = res[1]
        deltas = np.reshape(deltas, (-1, 4))
        # proposals transform to bbox values (x1, y1, x2, y2)
        proposals = bbox_transform_inv(all_anchors, deltas)
        proposals = clip_boxes(proposals, (h_w[0], h_w[1]))
        # remove small boxes, here threshold is 40 pixels
        keep = filter_boxes(proposals, 40)
        proposals = proposals[keep, :]
        scores = scores[keep]

        # sort scores and only keep top 6000
        pre_nms_topN = 6000
        order = scores.ravel().argsort()[::-1]
        if pre_nms_topN > 0:
            order = order[:pre_nms_topN]
        proposals = proposals[order, :]
        scores = scores[order]
        # apply NMS to the top 6000, and then keep top 300
        post_nms_topN = 300
        keep = py_cpu_nms(np.hstack((proposals, scores)), 0.7)
        if post_nms_topN > 0:
            keep = keep[:post_nms_topN]
        proposals = proposals[keep, :]
        scores = scores[keep]
        # add gt_boxes to proposals
        proposals = np.vstack((proposals, gt_boxes))
        # calculate overlaps of proposals and gt_boxes
        overlaps = bbox_overlaps(proposals, gt_boxes)
        gt_assignment = overlaps.argmax(axis=1)
        max_overlaps = overlaps.max(axis=1)
        # labels = gt_labels[gt_assignment] #?

        # sub sample
        fg_inds = np.where(max_overlaps >= FG_THRESH)[0]
        fg_rois_per_this_image = min(int(BATCH * FG_FRAC), fg_inds.size)
        # Sample foreground regions without replacement
        if fg_inds.size > 0:
            fg_inds = npr.choice(fg_inds, size=fg_rois_per_this_image,
                                 replace=False)
        bg_inds = np.where((max_overlaps < BG_THRESH_HI) &
                           (max_overlaps >= BG_THRESH_LO))[0]
        bg_rois_per_this_image = BATCH - fg_rois_per_this_image
        bg_rois_per_this_image = min(bg_rois_per_this_image, bg_inds.size)
        # Sample background regions without replacement
        if bg_inds.size > 0:
            bg_inds = npr.choice(bg_inds, size=bg_rois_per_this_image,
                                 replace=False)
        # The indices that we're selecting (both fg and bg)
        keep_inds = np.append(fg_inds, bg_inds)
        # Select sampled values from various arrays:
        # labels = labels[keep_inds]
        rois = proposals[keep_inds]
        gt_rois = gt_boxes[gt_assignment[keep_inds]]
        targets = bbox_transform(rois, gt_rois)
        # input rois
        rois_num = targets.shape[0]
        batch_box = np.zeros((rois_num, 200, 4))
        for i in range(rois_num):
            batch_box[i, category] = targets[i]
        batch_box = np.reshape(batch_box, (rois_num, -1))
        # get gt category (one-hot over the 200 classes)
        batch_categories = np.zeros((rois_num, 200, 1))
        for i in range(rois_num):
            batch_categories[i, category] = 1
        batch_categories = np.reshape(batch_categories, (rois_num, -1))
        return rois, batch_box, batch_categories

Listing A.2 – Batch preparation

A.3 generate data

    import glob
    import os
    from multiprocessing import Process, Queue

    ILSVRC_dataset_path = '/home/jk/faster_rcnn/'
    img_path = ILSVRC_dataset_path + 'Data/DET/train/'
    anno_path = ILSVRC_dataset_path + 'Annotations/DET/train/'


    def worker(path):
        print('worker start ' + path)
        batch_rois = []
        batch_featuremap_inds = []
        batch_categories = []
        batch_bboxes = []
        fc_index = 0
        dataset = {}
        # '/ImageSets/DET/train_*'
        for fname in glob.glob(ILSVRC_dataset_path + path):
            print(fname)
            with open(fname, 'r') as f:
                basename = os.path.basename(fname)
                # file names look like train_<category>.<ext>
                category = int(basename.split('_')[1].split('.')[0])
                content = []
                for line in f:
                    if 'extra' not in line:
                        content.append(line)
                dataset[category] = content
        print(len(dataset))
        from random import randint
        while 1:
            try:
                category = randint(1, 200)
                content = dataset[category]
                # randint is inclusive at both ends, hence the -1
                n = randint(0, len(content) - 1)
                line = content[n]
                _, gt_boxes, h_w = parse_label(anno_path + line.split()[0] + '.xml')
                if len(gt_boxes) == 0:
                    continue
                rois, bboxes, categories = produce_batch(
                    img_path + line.split()[0] + '.JPEG', gt_boxes, h_w, category)
            except Exception:
                # print('parse label or produce batch failed: for: ' + line.split()[0])
                # traceback.print_exc()
                continue
            if len(rois)