Special Sessions

Special Session 1

Video Semantics, Scene Understanding, and Reasoning

Motivation and Need for the Special Session

As video data continues to grow rapidly across domains such as surveillance, autonomous systems, healthcare, entertainment, and human–computer interaction, there is an increasing need for video understanding systems that move beyond simple object and action recognition toward deeper semantic comprehension. Real-world activities are inherently complex, involving multiple entities, their interactions, temporal dependencies, and contextual relationships that evolve over space and time. Traditional data-driven approaches often struggle to provide interpretable reasoning and robust generalization in such dynamic environments. Recent advances in knowledge graphs, scene graphs, spatio-temporal graph representations, neuro-symbolic learning, and large language models (LLMs) offer promising directions for capturing structured semantics and enabling higher-level reasoning over video content. By integrating graph-based representations with multimodal language-vision frameworks, researchers can bridge the gap between low-level visual observations and rich semantic understanding, supporting explainable decision-making, contextual reasoning, semantic querying, and knowledge-driven video analytics. This special session aims to foster research that advances the development of intelligent, interpretable, and semantically grounded video understanding systems capable of reasoning about complex activities in real-world scenarios.

Topics of Interest (Areas of Concern)

Submissions are encouraged on, but not limited to, the following topics:

Semantic video representation and understanding of activities, events, entities, and their interactions.
Knowledge graph, scene graph, and spatio-temporal graph-based approaches for structured video modelling and reasoning.
Relational, compositional, and temporal activity understanding for complex event recognition and long-term reasoning.
Graph neural networks, graph transformers, and neuro-symbolic learning for video semantics and explainable AI.
Large Language Models (LLMs), Vision-Language Models (VLMs), and multimodal foundation models for video understanding and reasoning.
Multimodal fusion and cross-modal learning integrating visual, textual, audio, and knowledge representations.
Video captioning, question answering, semantic retrieval, querying, and summarization using language-guided frameworks.
Knowledge-enhanced, retrieval-augmented, and agentic AI approaches for semantic grounding and decision-making.
Few-shot, zero-shot, open-vocabulary learning, and generalization techniques for robust video understanding.

Session Organizers

Dr. Ashish Singh Patel
Assistant Professor
Department of Computer Science & Engineering
National Institute of Technology Mizoram
Chaltlang, Aizawl, Mizoram, India – 796012
Email: ashish.cse@nitmz.ac.in

Submission Details

Paper Submission Link: https://cmt3.research.microsoft.com/AICTA2026
Select Track as: SS1: Video Semantics, Scene Understanding, and Reasoning
Last Date of Paper Submission: 30th June, 2026
Decision Notification: 31st July, 2026

Special Session 2

Lightweight Machine and Deep Learning Models for eHealthcare Applications

Motivation and Need for the Special Session

The rapid integration of the Internet of Medical Things (IoMT), wearable sensors, and smartphones into ambient-assisted living has fundamentally transformed remote patient monitoring and eHealthcare. While conventional deep learning models achieve remarkable accuracy in clinical diagnostics, their high computational overhead, significant memory footprint, and heavy power consumption severely limit their deployment on edge devices. For continuous, real-time healthcare monitoring, edge deployment is critical. Split-second operations, including detecting a life-threatening fall, enabling seamless communication through real-time Human Sign Language (HSL) translation, or rapidly processing complex physiological signals (such as EMG or ECG) to predict anomalies, cannot afford the latency and privacy risks introduced by cloud computing. This special session addresses the pressing need to develop computationally efficient, lightweight machine learning and deep learning architectures tailored to resource-constrained environments. By focusing on optimised algorithms, this track aims to bridge the gap between advanced artificial intelligence and practical, privacy-preserving, and battery-efficient healthcare solutions at the edge.

Topics of Interest (Areas of Concern)

Submissions are encouraged on, but not limited to, the following topics:

Lightweight Deep Learning Models for Human Activity Recognition (HAR) in Healthcare Environments.
Edge-AI Systems for Human Sign Language (HSL) Recognition and Translation for the Hearing Impaired.
Real-time Fall Detection and Prediction Systems for Emergency Assistance.
Spatio-temporal Attention Mechanisms for Multimodal Sensor Fusion on Mobile Devices.
Efficient Processing of Physiological Signals (e.g., EMG, ECG, EEG) for eHealth Applications on Resource-Constrained Hardware.
Explainable AI (XAI) and Interpretable Models for Edge-based Clinical Diagnostics.
Model Compression Techniques (Pruning, Quantization, Knowledge Distillation) for Medical AI.
Federated Learning for Privacy-Preserving Collaborative Healthcare Monitoring.
Energy-efficient Architectures for Continuous Ambient Assisted Living (AAL).

Session Organizers

Dr. Nurul Amin Choudhury
Assistant Professor
Department of Computer Science & Engineering
National Institute of Technology (NIT) Meghalaya
East Khasi Hills, Meghalaya, India – 793108
Email: nurul.choudhury@nitm.ac.in

Dr. Soumen Das
Assistant Professor
Department of Computer Science & Engineering
National Institute of Technology (NIT) Meghalaya
East Khasi Hills, Meghalaya, India – 793108
Email: soumen.das@nitm.ac.in

Submission Details

Paper Submission Link: https://cmt3.research.microsoft.com/AICTA2026
Select Track as: SS2: Lightweight Machine and Deep Learning Models for eHealthcare Applications
Last Date of Paper Submission: 30th June, 2026
Decision Notification: 31st July, 2026

Contact Us

Department of Computer Science and Engineering

National Institute of Technology Silchar

Fakirtilla, Silchar - 788010, India

Email: aicta2026@nits.ac.in

Organizer

NIT Silchar
