Methods for recognizing human motions and actions in video sequences


Denys Soldatov

Abstract

The article formulates the problem of recognizing object motions in video sequences, outlines the stages of its solution, and analyzes the main methods used at each stage. The key difficulties that arise when solving the problem are examined, and ways of comparing different methods are presented. Existing approaches to motion recognition in video sequences are analyzed, and the particular features, strengths, weaknesses, and limitations of different feature-extraction and classification methods are identified. Methods are then selected for further research and improvement.
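The pipeline the article surveys (extract motion features from a clip, then classify them) can be illustrated with the minimal sketch below. It is not taken from the article: the use of Farnebäck dense optical flow, the histogram size, the linear SVM, and all file and label names are illustrative assumptions.

```python
# Illustrative only: a generic clip-level action-recognition pipeline of the kind
# the surveyed methods build on (motion features + a discriminative classifier).
# Paths and labels are hypothetical placeholders.
import cv2
import numpy as np
from sklearn.svm import LinearSVC

def clip_descriptor(video_path, bins=8):
    """Describe a clip by a magnitude-weighted histogram of optical-flow directions."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    if not ok:
        raise IOError("cannot read " + video_path)
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    hist = np.zeros(bins)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Dense optical flow (Farnebäck) between consecutive frames.
        flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])  # ang in [0, 2*pi)
        h, _ = np.histogram(ang, bins=bins, range=(0, 2 * np.pi), weights=mag)
        hist += h
        prev_gray = gray
    cap.release()
    return hist / (np.linalg.norm(hist) + 1e-9)  # L2-normalized clip descriptor

# Hypothetical labelled clips (e.g. from an action dataset such as KTH).
train = [("walking_01.avi", "walking"), ("handwaving_01.avi", "handwaving")]
X = np.array([clip_descriptor(path) for path, _ in train])
y = [label for _, label in train]

clf = LinearSVC().fit(X, y)                      # linear SVM classifier
print(clf.predict([clip_descriptor("test_clip.avi")]))
```

In practice the methods discussed in the article replace this single global histogram with far richer descriptors (e.g. HOG/HOF computed along dense trajectories) and stronger classifiers, but the overall feature-then-classify structure is the same.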

Article Details

How to Cite
Soldatov, D. (2019). Methods for recognizing human motions and actions in video sequences. Електронна та Акустична Інженерія, 2(3), 27–33. https://doi.org/10.20535/2617-0965.2019.2.3.164709
Section
Electronic Systems and Signals

References

Y. Du, F. Chen, and W. Xu, “Human interaction representation and recognition through motion decomposition,” IEEE Signal Processing Letters, vol. 14, no. 12, pp. 952–955, 2007. DOI: 10.1109/LSP.2007.908035

C. Schüldt, I. Laptev, and B. Caputo, “Recognizing human actions: A local SVM approach,” in Proceedings - International Conference on Pattern Recognition, 2004, vol. 3, pp. 32–36. DOI: 10.1109/ICPR.2004.1334462

L. Gorelick, M. Blank, E. Shechtman, M. Irani, and R. Basri, “Actions as space-time shapes,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 12, pp. 2247–2253, 2007. DOI: 10.1109/TPAMI.2007.70711

Y. Ke, R. Sukthankar, and M. Hebert, “Event detection in crowded videos,” in Proceedings of the IEEE International Conference on Computer Vision, 2007, pp. 1–8. DOI: 10.1109/ICCV.2007.4409011

J. Yuan, Z. Liu, and Y. Wu, “Discriminative subvolume search for efficient action detection,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2009, pp. 2442–2449. DOI: 10.1109/CVPRW.2009.5206671

I. Laptev and P. Pérez, “Retrieving actions in movies,” in Proceedings of the IEEE International Conference on Computer Vision, 2007, pp. 1–8. DOI: 10.1109/ICCV.2007.4409105

M. D. Rodriguez, J. Ahmed, and M. Shah, “Action MACH: A spatio-temporal maximum average correlation height filter for action recognition,” in 26th IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR, 2008. DOI: 10.1109/CVPR.2008.4587727

I. Laptev, M. Marszałek, C. Schmid, and B. Rozenfeld, “Learning realistic human actions from movies,” in 26th IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2008. DOI: 10.1109/CVPR.2008.4587756

M. Marszałek, I. Laptev, and C. Schmid, “Actions in context,” in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2009, pp. 2929–2936. DOI: 10.1109/CVPRW.2009.5206557

K. Soomro, A. R. Zamir, and M. Shah, “UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild,” 2012. URL: http://arxiv.org/abs/1212.0402

C. Snoek, B. Ghanem, J. C. Niebles, F. C. Heilbron, W. Barrios, V. Escorcia, and P. Mettes, “ActivityNet: A Large-Scale Activity Recognition Challenge.”

S. M. Kang and R. P. Wildes, “Review of Action Recognition and Detection Methods,” 2016. URL: http://arxiv.org/abs/1610.06906

J. Wang, P. Liu, M. F. H. She, A. Kouzani, and S. Nahavandi, “Supervised learning probabilistic Latent Semantic Analysis for human motion analysis,” Neurocomputing, vol. 100, pp. 134–143, 2013. DOI: 10.1016/j.neucom.2011.10.033

A. Kläser, M. Marszałek, and C. Schmid, “A Spatio-Temporal Descriptor Based on 3D-Gradients,” in British Machine Vision Conference, 2008, pp. 99.1–99.10. DOI: 10.5244/c.22.99

P. Dollár, V. Rabaud, G. Cottrell, and S. Belongie, “Behavior recognition via sparse spatio-temporal features,” in Proceedings - 2nd Joint IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, VS-PETS, 2005, pp. 65–72. DOI: 10.1109/VSPETS.2005.1570899

D. Lowe, “Distinctive Image Features from Scale-Invariant Keypoints,” Int. J. Comput. Vis., vol. 60, no. 2, pp. 91–110, 2004. DOI: 10.1023/B:VISI.0000029664.99615.94

K. Mikolajczyk and C. Schmid, “A performance evaluation of local descriptors,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 10, pp. 1615–1630, 2005. DOI: 10.1109/TPAMI.2005.188

L. Yeffet and L. Wolf, “Local trinary patterns for human action recognition,” in Proceedings of the IEEE International Conference on Computer Vision, 2009, pp. 492–497. DOI: 10.1109/ICCV.2009.5459201

E. Shechtman and M. Irani, “Space-time behavior-based correlation - OR - How to tell if two underlying motion fields are similar without computing them?,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 11, pp. 2045–2056, 2007. DOI: 10.1109/TPAMI.2007.1119

H. Ning, T. X. Han, D. B. Walther, M. Liu, and T. S. Huang, “Hierarchical space-time model enabling efficient search for human actions,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 6, pp. 808–820, 2009. DOI: 10.1109/TCSVT.2009.2017399

O. Chomat and J. L. Crowley, “Probabilistic recognition of activity using local appearance,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1999, pp. 104–109. DOI: 10.1109/cvpr.1999.784616

J. M. Gryn, R. P. Wildes, and J. K. Tsotsos, “Detecting motion patterns via direction maps with application to surveillance,” Comput. Vis. Image Underst., vol. 113, no. 2, pp. 291–307, 2009. DOI: 10.1016/j.cviu.2008.10.006

A. A. Efros, A. C. Berg, G. Mori, and J. Malik, “Recognizing action at a distance,” in IEEE International Conference on Computer Vision, 2003, pp. 726–733. DOI: 10.1109/iccv.2003.1238420

A. Fathi and G. Mori, “Action recognition by learning mid-level motion features,” in 26th IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2008. DOI: 10.1109/CVPR.2008.4587735

N. Dalal, B. Triggs, and C. Schmid, “Human detection using oriented histograms of flow and appearance,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 3952 LNCS, pp. 428–441, 2006. DOI: 10.1007/11744047_33

H. Wang, A. Kläser, C. Schmid, and C. L. Liu, “Action recognition by dense trajectories,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2011, pp. 3169–3176. DOI: 10.1109/CVPR.2011.5995407

H. Wang and C. Schmid, “Action recognition with improved trajectories,” Proc. IEEE Int. Conf. Comput. Vis., pp. 3551–3558, 2013. DOI: 10.1109/ICCV.2013.441

H. Bay, T. Tuytelaars, and L. Van Gool, “SURF: Speeded up robust features,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2006, vol. 3951 LNCS, pp. 404–417. DOI: 10.1007/11744023_32

J. Shi and C. Tomasi, “Good features to track,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition CVPR-94, 1994, pp. 593–600. DOI: 10.1109/CVPR.1994.323794

H. Bilen, B. Fernando, E. Gavves, A. Vedaldi, and S. Gould, “Dynamic Image Networks for Action Recognition,” in IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 3034–3042. DOI: 10.1109/cvpr.2016.331

X. Wang, A. Farhadi, and A. Gupta, “Actions ~ Transformations,” in IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2658–2667. URL: http://arxiv.org/abs/1512.00795

Z. Shou, D. Wang, and S.-F. Chang, “Temporal Action Localization in Untrimmed Videos via Multi-stage CNNs,” in IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 1049–1058. URL: http://arxiv.org/abs/1601.02129

J. Yue-Hei Ng et al., “Beyond short snippets: Deep networks for video classification,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2015, pp. 4694–4702. DOI: 10.1109/CVPR.2015.7299101

S. Yeung, O. Russakovsky, G. Mori, and L. Fei-Fei, “End-to-end Learning of Action Detection from Frame Glimpses in Videos,” in IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2678–2687. DOI: 10.1109/CVPR.2016.293

A. Basharat, A. Gritai, and M. Shah, “Learning object motion patterns for anomaly detection and improved object detection,” in 26th IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2008, pp. 1–8. DOI: 10.1109/CVPR.2008.4587510

C. Fanti, L. Zelnik-Manor, and P. Perona, “Hybrid models for human motion recognition,” in Proceedings - 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005, 2005, vol. I, pp. 1166–1173. DOI: 10.1109/CVPR.2005.179

S. Yeung, O. Russakovsky, G. Mori, and L. Fei-Fei, “End-to-end Learning of Action Detection from Frame Glimpses in Videos,” 2015. URL: http://arxiv.org/abs/1511.06984

S. Yan, Y. Xiong, and D. Lin, “Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition,” 2018. URL: http://arxiv.org/abs/1801.07455