Human activities include group activities, individual actions, and interactions between objects and people. Recognizing and categorizing these activities is a vital task in computer vision, with applications in surveillance, health care, military operations, and patient monitoring. This work aims to develop a model that can detect and recognize such behavior. First, videos were gathered to obtain a broad sample of human activities and interactions. The video frames were then converted into images, and each image was pre-processed. Characteristic features are extracted from the video frames by capturing spatial and temporal details: spatio-temporal interest points (STIPs) are computed using three descriptors, Harris STIP, Gabor STIP, and HoG3D STIP. The extracted features are passed to a heatmap-generation stage that retains confident key features related to human actions. A Support Vector Machine (SVM) classifier then analyzes these key features to label and classify human actions. The system was evaluated using standard classifier performance metrics, such as accuracy, sensitivity, and specificity. A classification accuracy of approximately 98.60% indicates the overall reliability of the proposed system in recognizing human actions.
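As a rough illustration of the interest-point stage, the sketch below computes the classical 2D Harris corner response on a single grayscale frame; the Harris STIP used in this pipeline extends the same structure-tensor idea with a temporal gradient across frames. This is a minimal sketch under assumed conventions (frames as NumPy arrays, a box filter standing in for Gaussian windowing), not the authors' implementation; the function name `harris_response` is illustrative.

```python
import numpy as np

def harris_response(img, k=0.04, radius=1):
    """Harris corner response R = det(M) - k * trace(M)^2 for one frame.

    img    : 2D grayscale frame as a NumPy array
    k      : Harris sensitivity constant (commonly 0.04-0.06)
    radius : half-width of the box window approximating Gaussian smoothing
    """
    # Image gradients (np.gradient returns d/drow, d/dcol).
    Iy, Ix = np.gradient(img.astype(float))
    Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy

    def box(a, r):
        # Simple box smoothing over a (2r+1)x(2r+1) window.
        out = np.zeros_like(a)
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                out += np.roll(np.roll(a, dy, axis=0), dx, axis=1)
        return out / (2 * r + 1) ** 2

    # Windowed structure-tensor entries.
    Sxx, Syy, Sxy = box(Ixx, radius), box(Iyy, radius), box(Ixy, radius)

    det = Sxx * Syy - Sxy ** 2
    tr = Sxx + Syy
    return det - k * tr ** 2
```

Pixels where the response is large in both spatial directions (corners) score high; interest points are then typically taken as local maxima of this response above a threshold. The full spatio-temporal version adds the temporal derivative to the structure tensor so that points must also vary across frames, i.e., correspond to motion.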
Keywords
Spatial-Temporal Interest Point, Action Recognition, Multitask Human Action Recognition, Histogram of Oriented Gradients, Harris STIP and Gabor STIP.
CRediT Author Statement
The authors confirm contribution to the paper as follows:
Conceptualization: Pavankumar Naik and Srinivasa Rao Kunte R;
Methodology: Srinivasa Rao Kunte R;
Software: Pavankumar Naik;
Data Curation: Srinivasa Rao Kunte R;
Writing – Original Draft Preparation: Pavankumar Naik and Srinivasa Rao Kunte R;
Visualization: Srinivasa Rao Kunte R;
Investigation: Pavankumar Naik;
Supervision: Pavankumar Naik and Srinivasa Rao Kunte R;
Validation: Pavankumar Naik;
Writing – Reviewing and Editing: Pavankumar Naik and Srinivasa Rao Kunte R.
All authors reviewed the results and approved the final version of the manuscript.
Acknowledgements
The authors thank Dr. Srinivasa Rao Kunte R for his support in the completion of this research.
Funding
No funding was received to assist with the preparation of this manuscript.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Availability of data and materials
Data sharing is not applicable to this article as no new data were created or analysed in this study.
Author information
Contributions
All authors contributed equally to the paper, and all authors have read and agreed to the published version of the manuscript.
Corresponding author
Pavankumar Naik
Department of Computer Science and Engineering, Institute of Engineering and Technology, Srinivas University, Mangalore, Karnataka, India.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits redistribution of the material for non-commercial purposes, provided the original work is not modified in any way, i.e., no derivatives of the original work are permitted. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-nd/4.0/
Cite this article
Pavankumar Naik and Srinivasa Rao Kunte R, “Detection and Recognition of Multitask Human Action Identification from Preloaded Videos Using CCTV Stationary Cameras”, Journal of Machine and Computing, vol.5, no.3, pp. 1518-1531, July 2025, doi: 10.53759/7669/jmc202505120.