Although the model proposed in this thesis shows that the approach is feasible, its accuracy remains low. Future work should apply regularization methods, improve accuracy by using auxiliary loss functions and collecting more training data, and extend the set of abnormal behaviors to be recognized.
With practical application as the goal, we will optimize the model, reducing both its execution time and its size so that it can be deployed on end devices in the future.