Research
Research Vision
I aim to advance the frontier of safe, interpretable, and adaptive AI for cyber-physical systems operating under uncertainty and dynamic constraints. My research sits at the intersection of machine learning, optimization, and control theory, with a particular focus on:
- Physics-informed Deep Reinforcement Learning (DRL)
- Probabilistic & Bayesian Modeling
- Large Language Models (LLMs) for autonomous reasoning
- Vision-based simulation environments
By tightly integrating domain knowledge into learning frameworks, I design agents capable of robust decision-making in real-world, high-stakes environments such as smart grids, robotics, and intelligent infrastructure.
Research Highlights
- Proposed the first Physics-Informed LSTM-PPO agent for volt-var control on 8500-node networks.
- Achieved a 98% reduction in voltage violations and 3× faster convergence with federated DRL.
- Developed one-shot transfer learning for control agents on complex topologies.
- Integrated LLM-guided planning into multi-building simulations via CityLearn.
- Built resilient DRL systems that withstand adversarial and distributional attacks.
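To make the voltage-violation objective above concrete, here is a minimal sketch of a physics-informed reward term for volt-var control. The band limits and penalty weight are illustrative assumptions, not values from the papers listed later.

```python
import numpy as np

# Illustrative ANSI-style voltage band in per-unit; the exact band and
# weight used in a real controller are design choices, assumed here.
V_MIN, V_MAX = 0.95, 1.05

def voltage_violation_penalty(voltages_pu, weight=10.0):
    """Quadratic penalty on per-unit bus voltages outside the allowed band."""
    v = np.asarray(voltages_pu, dtype=float)
    # Distance outside the band (zero for voltages inside it).
    excess = np.maximum(v - V_MAX, 0.0) + np.maximum(V_MIN - v, 0.0)
    return -weight * float(np.sum(excess ** 2))

# Voltages inside the band incur no penalty; violations are penalized quadratically.
print(voltage_violation_penalty([1.00, 1.02, 0.97]) == 0.0)  # True
print(voltage_violation_penalty([1.10, 0.90, 1.00]) < 0)     # True
```

A shaped reward of this form can be added to a task reward so the agent is steered away from physically inadmissible operating points during training.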
Research Focus Areas

Safe & Trustworthy RL
Objective: Develop control agents that guarantee system safety, stability, and robust learning in dynamic, uncertain, and partially observable environments.
Core focus areas:
- Constrained policy optimization and reward shaping
- Physics-based priors in DRL
- Adversarial resilience and anomaly detection
- Epistemic and aleatoric uncertainty quantification
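The epistemic/aleatoric split in the last bullet can be sketched with a deep ensemble: by the law of total variance, the mean of the members' predicted variances estimates aleatoric uncertainty, while the variance of the members' predicted means estimates epistemic uncertainty. The function and numbers below are illustrative, not from a specific model.

```python
import numpy as np

def decompose_uncertainty(means, variances):
    """Split ensemble predictive uncertainty for one input point.

    means:     per-member predicted means, shape (n_members,)
    variances: per-member predicted variances, shape (n_members,)
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    aleatoric = variances.mean()  # noise the data itself carries
    epistemic = means.var()       # disagreement between ensemble members
    return aleatoric, epistemic

# Three ensemble members that agree on noise level but disagree on the mean:
alea, epi = decompose_uncertainty(means=[1.0, 1.2, 0.8], variances=[0.1, 0.1, 0.1])
```

A policy can then act conservatively when the epistemic term is large (the model is unsure) while treating the aleatoric term as irreducible noise.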
Transfer & Meta-Adaptation
Objective: Enable rapid generalization across distribution shifts in topology, weather, or load profiles.
Core focus areas:
- Transferable actor-critic architectures
- Simulation-to-real (Sim2Real) adaptation
- Meta-RL for sample efficiency
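A common Sim2Real ingredient is domain randomization: each training episode samples a perturbed environment configuration so the policy does not overfit one simulator instance. The parameter names and ranges below are assumptions chosen for illustration.

```python
import random

def sample_env_config(rng):
    """Sample one randomized grid-simulation configuration (illustrative)."""
    return {
        "load_scale": rng.uniform(0.7, 1.3),       # scale all bus loads
        "impedance_scale": rng.uniform(0.9, 1.1),  # perturb line impedances
        "cloud_cover": rng.uniform(0.0, 1.0),      # affects PV generation
    }

rng = random.Random(0)
configs = [sample_env_config(rng) for _ in range(3)]
# Each training episode would then run against its own randomized configuration,
# widening the distribution the policy must handle before deployment.
```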
Vision-Simulation Integration
Objective: Bridge the gap between perception and control by combining synthetic sensors, simulated environments, and end-to-end learning pipelines.
Core focus areas:
- Perception-action loops with CARLA and AirSim
- Multi-modal representation fusion (image + state)
- Autonomous control with embedded perception
- End-to-end control pipelines
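The image + state fusion bullet can be sketched as late fusion: project each modality into a shared width and concatenate before the policy head. All shapes and the random projections below are illustrative stand-ins for learned encoders.

```python
import numpy as np

def fuse(image_embedding, state, w_img, w_state):
    """Concatenate projected image and state features into one joint vector."""
    z_img = np.tanh(image_embedding @ w_img)  # project image features
    z_state = np.tanh(state @ w_state)        # project state features
    return np.concatenate([z_img, z_state])   # joint multi-modal representation

rng = np.random.default_rng(0)
img = rng.normal(size=128)            # e.g. CNN embedding of a camera frame
state = rng.normal(size=8)            # e.g. voltages, loads, setpoints
w_img = rng.normal(size=(128, 32))    # stand-in for a learned projection
w_state = rng.normal(size=(8, 32))

joint = fuse(img, state, w_img, w_state)
print(joint.shape)  # (64,)
```

The policy network then consumes `joint`, so gradients from the control objective shape both the visual and state pathways end to end.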
LLM-Augmented Decision Systems
Objective: Empower agents to interpret language-based inputs and coordinate intelligently in multi-agent and human-AI settings.
Core focus areas:
- LLMs for summarizing states and guiding actions
- Translating natural language into policy primitives
- Facilitating human-AI collaboration
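To make the "natural language into policy primitives" interface concrete, here is a toy sketch. A deployed system would delegate this mapping to an LLM; a keyword table stands in here, and the primitive names are hypothetical.

```python
# Hypothetical control primitives a language command can resolve to.
PRIMITIVES = {
    "reduce voltage": ("set_tap", -1),
    "raise voltage": ("set_tap", +1),
    "absorb reactive power": ("set_var", -0.5),
    "inject reactive power": ("set_var", +0.5),
}

def command_to_primitive(text):
    """Map a free-form command to a (primitive, argument) pair, or a no-op."""
    text = text.lower()
    for phrase, primitive in PRIMITIVES.items():
        if phrase in text:
            return primitive
    return ("no_op", 0)

print(command_to_primitive("Please reduce voltage at feeder 3"))  # ('set_tap', -1)
```

The value of the LLM version is handling phrasings outside the table; the fixed output vocabulary of primitives is what keeps the resulting actions verifiable.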
Application Domains

| Domain | Description |
|---|---|
| Smart Energy Systems | Volt-VAR control, DER coordination, and federated DRL for power grid stability |
| Autonomous Systems | Safe navigation, adaptive planning, and control in simulation and real-world environments |
| Secure AI for Infrastructure | Resilience against cyber-attacks and adversarial scenarios in safety-critical systems |
Publications
- Kundan Kumar, Gelli Ravikumar. "Physics-based Deep Reinforcement Learning for Grid-Resilient Volt-VAR Control." IEEE Transactions on Smart Grid, 2025 (under review).
- Kundan Kumar, Gelli Ravikumar. "Transfer Learning Enhanced Deep Reinforcement Learning for Volt-Var Control in Smart Grids." IEEE PES Grid Edge Technologies, 2025.
- Kundan Kumar, Aditya Akilesh Mantha, Gelli Ravikumar. "Bayesian Optimization for DRL in Robust Volt-Var Control." IEEE PES General Meeting, 2024.
- Kundan Kumar, Gelli Ravikumar. "Volt-VAR Control and Attack Resiliency using Deep RL." IEEE ISGT, 2024.
- JK Francis, C Kumar, J Herrera-Gerena, Kundan Kumar, MJ Darr. "Sensor Data Regression using Deep Learning & Patterns." IEEE ICMLA, 2022.
- Kin Gwn Lore, Nicholas Sweet, Kundan Kumar, et al. "Deep Value of Information Estimators for Human-Machine Collaboration." ACM/IEEE ICCPS, 2016.
Ongoing Projects
- Federated DRL for Cyber-Resilient Volt-VAR Optimization: decentralized, communication-efficient control using LSTM-enhanced PPO agents across distributed DERs.
- One-Shot Policy Transfer with Physics Priors: train agents on small topologies and adapt them to IEEE 123-bus and 8500-node networks in a few iterations.
- LLM-Guided Autonomous Planning for Smart Buildings: convert user prompts into interpretable control policies using LLMs (OpenAI, Claude) in CityLearn environments.