Classification of Student Actions Through 2D Pose-Based CNN-LSTM Networks

Authors

  • Iskandarova Sayyora Nurmamatovna, Tashkent University of Information Technologies named after Muhammad al-Khwarizmi
  • Jurayev Xudoyshukur Utkir ugli, Tashkent University of Information Technologies named after Muhammad al-Khwarizmi

Keywords

CNN-LSTM, 2D Pose Estimation, Student Action Classification

Abstract

Automated analysis of student behavior in classrooms offers educators a reliable way to enhance engagement and assess participation. This study introduces a 2D pose-based CNN-LSTM model that classifies student actions (hand raising, writing, and reading) from video data using the EduNet dataset. Video frames were processed with MediaPipe to extract pose landmarks, focusing on upper-body features. The proposed architecture combines Convolutional Neural Networks (CNN) for spatial analysis with Long Short-Term Memory (LSTM) units for temporal sequence modeling. Despite the limited dataset size, the model achieved a validation accuracy of 98.83%. These findings indicate that pose-based approaches provide precise, efficient alternatives to traditional behavior analysis. Future work, such as expanding the dataset and modeling multi-person scenarios, is recommended to improve applicability in diverse classroom environments.
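The abstract outlines a two-stage pipeline: MediaPipe extracts upper-body pose landmarks from each frame, and a CNN-LSTM network classifies the resulting landmark sequences into three actions. Below is a minimal sketch of how such a pipeline could be assembled in Python with MediaPipe and Keras; the sequence length, landmark subset, layer sizes, and names such as extract_upper_body_landmarks and build_cnn_lstm are illustrative assumptions, not the authors' implementation.

    # Minimal sketch (not the authors' code): per-frame upper-body pose features
    # extracted with MediaPipe, then classified with a CNN-LSTM in Keras.
    import cv2
    import numpy as np
    import mediapipe as mp
    from tensorflow import keras
    from tensorflow.keras import layers

    SEQ_LEN = 30          # assumed number of frames per clip
    NUM_LANDMARKS = 25    # MediaPipe Pose landmarks 0-24 (face, shoulders, arms, hands, hips)
    NUM_CLASSES = 3       # hand raising, writing, reading

    mp_pose = mp.solutions.pose

    def extract_upper_body_landmarks(video_path):
        """Return an array of shape (frames, NUM_LANDMARKS, 2) with normalized (x, y)."""
        frames = []
        cap = cv2.VideoCapture(video_path)
        with mp_pose.Pose(static_image_mode=False) as pose:
            while True:
                ok, frame = cap.read()
                if not ok:
                    break
                results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
                if results.pose_landmarks is None:
                    continue  # skip frames where no person is detected
                lm = results.pose_landmarks.landmark[:NUM_LANDMARKS]
                frames.append([(p.x, p.y) for p in lm])
        cap.release()
        return np.asarray(frames, dtype=np.float32)

    def build_cnn_lstm():
        """CNN over the landmark axis of each frame, LSTM over the frame sequence."""
        return keras.Sequential([
            layers.Input(shape=(SEQ_LEN, NUM_LANDMARKS, 2)),
            layers.TimeDistributed(layers.Conv1D(64, 3, activation="relu")),
            layers.TimeDistributed(layers.MaxPooling1D(2)),
            layers.TimeDistributed(layers.Flatten()),
            layers.LSTM(64),
            layers.Dense(NUM_CLASSES, activation="softmax"),
        ])

    model = build_cnn_lstm()
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()

In this sketch, the CNN operates on the landmark dimension within each frame (spatial analysis) and the LSTM aggregates across frames (temporal modeling), mirroring the division of roles described in the abstract.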

Published

2025-05-13

How to Cite

Classification of Student Actions Through 2D Pose-Based CNN-LSTM Networks. (2025). American Journal of Engineering, Mechanics and Architecture (2993-2637), 3(5), 91-96. https://www.grnjournal.us.e-scholar.org/index.php/AJEMA/article/view/7617