Self-Supervised Learning for Action Recognition: Trends, Models, and Applications


Mouwiya S. A. Al-Qaisieh, Mas Rina Mustaffa

Abstract

Recent advances in self-supervised learning (SSL) have reshaped the landscape of human action recognition by reducing the dependence on large-scale annotated datasets. This survey provides a comprehensive overview of state-of-the-art SSL techniques for understanding human actions in video. We categorize methods into three primary paradigms: contrastive learning, masked video modeling, and multimodal or sensor-based approaches. Within each category, we discuss key innovations, including motion-guided contrastive sampling, transformer-based masked autoencoders, and cross-modal alignment strategies that leverage audio, skeleton, or wearable-sensor signals. Models such as VideoMAE, ST-MAE, XDC, and Actionlet-Contrastive represent significant milestones in capturing both spatial and temporal cues without supervision. Beyond model design, we identify major challenges facing current SSL systems, including generalization across domains, modeling of long-horizon activities, and real-time deployment constraints, and we highlight underexplored areas such as explainability and unified evaluation protocols. To guide future work, we present a structured taxonomy, a comparative table of representative models, and a discussion of promising research directions, including multimodal fusion, modality-agnostic learning, and hardware-aware training. This survey aims to equip researchers with a clear understanding of the evolving trends, persistent gaps, and opportunities ahead in self-supervised action recognition.
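
To make the contrastive paradigm named above concrete, the minimal sketch below implements an InfoNCE-style objective over two augmented views of the same batch of video clips, the core loss underlying most contrastive SSL methods the survey covers. The function name, tensor shapes, and temperature value are illustrative assumptions, not taken from any specific model mentioned in the abstract.

    import torch
    import torch.nn.functional as F

    def info_nce(z1, z2, temperature=0.07):
        # z1, z2: (batch, dim) embeddings of two augmented views of the
        # same batch of clips; matching rows are positive pairs and all
        # other rows act as in-batch negatives. (Illustrative sketch.)
        z1 = F.normalize(z1, dim=1)
        z2 = F.normalize(z2, dim=1)
        logits = z1 @ z2.t() / temperature   # (batch, batch) cosine similarities
        targets = torch.arange(z1.size(0))   # positives lie on the diagonal
        return F.cross_entropy(logits, targets)

    # Toy usage: random tensors stand in for features from a video encoder.
    z_a, z_b = torch.randn(8, 128), torch.randn(8, 128)
    loss = info_nce(z_a, z_b)

Minimizing this loss pulls the two views of each clip together in embedding space while pushing different clips apart; motion-guided sampling variants discussed in the survey change how the positive views are chosen rather than the loss itself.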
