Raza, Muhammad Amjad; Mehmood, Nasir; Siddiqui, Hafeez Ur Rehman; Saleem, Adil Ali; Álvarez, Roberto Marcelo (roberto.alvarez@uneatlantico.es); Miró Vera, Yini Airet (yini.miro@uneatlantico.es); Díez, Isabel de la Torre (2026) Human Activity Recognition in Domestic Settings Based on Optical Techniques and Ensemble Models. Sensors, 26 (5). p. 1516. ISSN 1424-8220
Text: sensors-26-01516-v2.pdf (4 MB). Available under License Creative Commons Attribution.
Abstract
Human activity recognition (HAR) is essential in many applications, such as smart homes, assisted living, healthcare monitoring, rehabilitation, physiotherapy, and geriatric care. Conventional HAR methods rely on wearable sensors, e.g., accelerometers and gyroscopes, but these are limited by sensitivity to placement, user inconvenience, and potential health risks with long-term use. Vision-based optical camera systems offer a non-intrusive alternative; however, they are susceptible to lighting variations, occlusions, and privacy concerns. This paper presents an optical method for recognizing human domestic activities based on pose estimation and deep learning ensemble models. In the proposed methodology, skeletal keypoint features are extracted from video data using PoseNet to generate a privacy-preserving representation that captures key motion dynamics while remaining insensitive to changes in appearance. A total of 2734 activity samples covering nine daily domestic activities were collected from 30 subjects (15 male and 15 female). Six deep learning architectures were evaluated: the Transformer, Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Multilayer Perceptron (MLP), One-Dimensional Convolutional Neural Network (1D CNN), and a hybrid Convolutional Neural Network–Long Short-Term Memory (CNN–LSTM). On the hold-out test set, the CNN–LSTM architecture achieves an accuracy of 98.78% within our experimental setting. Leave-One-Subject-Out cross-validation further confirms robust generalization to unseen individuals, with CNN–LSTM achieving a mean accuracy of 97.21% ± 1.84% across 30 subjects. These results demonstrate that vision-based pose estimation combined with deep learning is a practical, accurate, and non-intrusive approach to HAR in smart healthcare and home automation systems.
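To illustrate the kind of preprocessing the abstract describes, the sketch below (not the authors' exact pipeline) shows how per-frame PoseNet keypoints — 17 (x, y) body landmarks in the COCO ordering, which is an assumption here — could be normalized into a position- and scale-invariant feature sequence suitable for a sequential model such as a CNN–LSTM. The function names and window length are illustrative choices, not taken from the paper.

```python
import numpy as np

def normalize_keypoints(kps):
    """Center keypoints on the hip midpoint and scale by torso length,
    making features invariant to where the subject stands and how far
    the camera is. kps: (17, 2) array of (x, y) PoseNet keypoints
    (COCO index order assumed: 5/6 = shoulders, 11/12 = hips)."""
    hip_mid = (kps[11] + kps[12]) / 2.0
    shoulder_mid = (kps[5] + kps[6]) / 2.0
    torso_len = np.linalg.norm(shoulder_mid - hip_mid)
    torso_len = torso_len if torso_len > 1e-6 else 1.0  # guard degenerate poses
    return (kps - hip_mid) / torso_len

def make_sequence(frames, window=30):
    """Flatten the last `window` frames of normalized keypoints into a
    (window, 34) array — one row per frame, 17 keypoints x 2 coords."""
    feats = [normalize_keypoints(f).ravel() for f in frames[-window:]]
    return np.stack(feats)

# Usage with synthetic keypoints standing in for PoseNet output:
rng = np.random.default_rng(0)
frames = [rng.uniform(0, 480, size=(17, 2)) for _ in range(40)]
seq = make_sequence(frames)
print(seq.shape)  # (30, 34)
```

Because the representation keeps only joint geometry, no pixel data leaves the pose estimator, which is the privacy-preserving property the abstract highlights.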
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | deep learning; human activity recognition; LSTM; PoseNet; skeleton-based recognition; smart home; Transformer |
| Subjects: | Subjects > Engineering |
| Divisions: | Europe University of Atlantic > Research > Scientific Production; Fundación Universitaria Internacional de Colombia > Research > Scientific Production; Ibero-american International University > Research > Scientific Production; Universidad Internacional do Cuanza > Research > Scientific Production; University of La Romana > Research > Scientific Production |
| Depositing User: | Sr Bibliotecario |
| Date Deposited: | 23 Mar 2026 09:13 |
| Last Modified: | 23 Mar 2026 09:13 |
| URI: | http://repositorio.funiber.org/id/eprint/27968 |