Kundan Kumar

Research

Research Vision

I aim to advance the frontier of safe, interpretable, and adaptive AI for cyber-physical systems operating under uncertainty and dynamic constraints. My research sits at the intersection of machine learning, optimization, and control theory, with a particular focus on:

  • Physics-informed deep reinforcement learning (DRL)
  • Probabilistic and Bayesian modeling
  • Large language models (LLMs) for autonomous reasoning
  • Vision-based simulation environments

By tightly integrating domain knowledge into learning frameworks, I design agents capable of robust decision-making in real-world, high-stakes environments such as smart grids, robotics, and intelligent infrastructure.


Research Highlights

  • Proposed the first physics-informed LSTM-PPO agent for Volt-VAR control on 8500-node networks.
  • Achieved a 98% reduction in voltage violations and 3× faster convergence in federated DRL.
  • Developed one-shot transfer learning for control agents in complex topologies.
  • Integrated LLM-guided planning into multi-building simulations via CityLearn.
  • Built resilient DRL systems that withstand adversarial and distributional attacks.

Research Focus Areas

  • Safe & Trustworthy Reinforcement Learning
  • Transfer Learning & Meta-Adaptation
  • Vision-Simulation Integration
  • LLM-Augmented Decision Systems

Safe & Trustworthy Reinforcement Learning

Objective

Develop control agents that guarantee system safety, stability, and robust learning in dynamic, uncertain, and partially observable environments.

Core Focus Areas

  • Constrained policy optimization and reward shaping
  • Physics-based priors in DRL
  • Adversarial resilience and anomaly detection
  • Epistemic and aleatoric uncertainty quantification

[Figure: Safe RL diagram]
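To make the reward-shaping idea above concrete, here is a minimal sketch of a Lagrangian-style penalty on voltage-limit violations. The `shaped_reward` helper, the per-unit bounds, and the multiplier `lam` are illustrative assumptions, not the agents described in this section.

```python
import numpy as np

# Hypothetical per-unit voltage bounds; real limits depend on the feeder.
V_MIN, V_MAX = 0.95, 1.05

def shaped_reward(base_reward, bus_voltages, lam=10.0):
    """Lagrangian-style reward shaping: subtract a penalty proportional
    to the total constraint violation summed over all buses."""
    v = np.asarray(bus_voltages, dtype=float)
    violation = np.maximum(0.0, v - V_MAX) + np.maximum(0.0, V_MIN - v)
    return base_reward - lam * violation.sum()

# A voltage profile with one bus 0.02 p.u. above the upper limit.
r = shaped_reward(1.0, [1.00, 1.07, 0.98], lam=10.0)
```

Tuning `lam` (or adapting it online, as constrained policy optimization does) trades off task reward against constraint satisfaction.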


Transfer Learning & Meta-Adaptation

Objective

Enable rapid generalization across distribution shifts in topology, weather, or load profiles.

Core Focus Areas

  • Transferable actor-critic architectures
  • Simulation-to-real (Sim2Real) adaptation
  • Meta-RL for sample efficiency

[Figure: Transfer learning diagram]
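The transferable-architecture idea can be pictured as warm-starting a target policy from source weights, copying only the parameters whose shapes match and leaving the topology-specific output head to be retrained. The two-layer dict-of-arrays policy below is a hypothetical simplification of an actor network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical policy weights trained on a small source topology.
source = {
    "hidden": rng.normal(size=(16, 32)),   # shared feature layer
    "out": rng.normal(size=(32, 4)),       # 4 source-grid actions
}

# Target topology: same feature layer, different action dimension.
target = {
    "hidden": np.zeros((16, 32)),
    "out": np.zeros((32, 6)),              # 6 target-grid actions
}

def warm_start(target, source):
    """Copy every parameter whose shape matches; leave the rest
    (here, the output head) to be trained on the new topology."""
    copied = []
    for name, w in source.items():
        if name in target and target[name].shape == w.shape:
            target[name] = w.copy()
            copied.append(name)
    return copied

copied = warm_start(target, source)  # only the shared layer transfers
```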


Vision-Simulation Integration

Objective

Bridge the gap between perception and control by combining synthetic sensors, simulated environments, and end-to-end learning pipelines.

Core Focus Areas

  • Perception-action loops with CARLA and AirSim
  • Multi-modal representation fusion (image + state)
  • Autonomous control with embedded perception
  • End-to-end control pipelines

[Figure: Vision-simulation diagram]
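A minimal sketch of image + state fusion, with a toy mean-pooling encoder standing in for the learned CNN an end-to-end pipeline would use; the array shapes and state values are illustrative only.

```python
import numpy as np

def fuse(image, state):
    """Toy multi-modal fusion: pool a camera frame into per-channel
    features, then concatenate them with the low-dimensional state
    vector. A learned encoder would replace the mean-pooling here."""
    img_feat = image.mean(axis=(0, 1))     # (C,) per-channel average
    return np.concatenate([img_feat, state])

frame = np.zeros((64, 64, 3))              # dummy RGB observation
state = np.array([0.98, 1.02, 0.5])        # dummy sensor readings
z = fuse(frame, state)                     # fused feature vector
```

A policy network would then map the fused vector `z` to actions, so gradients flow through both modalities during training.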


LLM-Augmented Decision Systems

Objective

Empower agents to interpret language-based inputs and coordinate intelligently in multi-agent and human-AI settings.

Core Focus Areas

  • LLMs for summarizing states and guiding actions
  • Translating natural language into policy primitives
  • Facilitating human-AI collaboration

[Figure: LLM-guided control diagram]
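As a toy illustration of mapping language to policy primitives, the sketch below uses a keyword matcher where a real system would query an LLM; the primitive names and trigger phrases are hypothetical, and the safe `NO_OP` fallback reflects the safety-first framing above.

```python
# Hypothetical mapping from instruction phrases to discrete primitives.
PRIMITIVES = {
    "reduce voltage": "LOWER_TAP",
    "raise voltage": "RAISE_TAP",
    "shed load": "CURTAIL_LOAD",
}

def to_primitive(instruction: str) -> str:
    """Translate a free-form instruction into a policy primitive,
    falling back to a safe no-op when nothing matches."""
    text = instruction.lower()
    for phrase, action in PRIMITIVES.items():
        if phrase in text:
            return action
    return "NO_OP"

a = to_primitive("Please reduce voltage at feeder 7")  # -> "LOWER_TAP"
```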






Application Domains

  • Smart Energy Systems: Volt-VAR control, DER coordination, and federated DRL for power grid stability
  • Autonomous Systems: safe navigation, adaptive planning, and control in simulation and real-world environments
  • Secure AI for Infrastructure: resilience against cyber-attacks and adversarial scenarios in safety-critical systems


Publications

Journal Papers

  1. Kundan Kumar, Gelli Ravikumar
     Physics-based Deep Reinforcement Learning for Grid-Resilient Volt-VAR Control (Under Review)
     IEEE Transactions on Smart Grid, 2025

Conference Papers

  1. Kundan Kumar, Gelli Ravikumar
     Transfer Learning Enhanced Deep Reinforcement Learning for Volt-VAR Control in Smart Grids
     IEEE PES Grid Edge Technologies, 2025

  2. Kundan Kumar, Aditya Akilesh Mantha, Gelli Ravikumar
     Bayesian Optimization for DRL in Robust Volt-VAR Control
     IEEE PES General Meeting, 2024

  3. Kundan Kumar, Gelli Ravikumar
     Volt-VAR Control and Attack Resiliency using Deep RL
     IEEE ISGT, 2024

  4. JK Francis, C Kumar, J Herrera-Gerena, Kundan Kumar, MJ Darr
     Sensor Data Regression using Deep Learning & Patterns
     IEEE ICMLA, 2022

  5. Kin Gwn Lore, Nicholas Sweet, Kundan Kumar, et al.
     Deep Value of Information Estimators for Human-Machine Collaboration
     ACM/IEEE ICCPS, 2016


Ongoing Projects

  • Federated DRL for Cyber-Resilient Volt-VAR Optimization
    Decentralized, communication-efficient control using LSTM-enhanced PPO agents across distributed DERs.

  • One-Shot Policy Transfer with Physics Priors
    Train agents on small topologies and adapt them to IEEE 123-bus and 8500-node networks in a few iterations.

  • LLM-Guided Autonomous Planning for Smart Buildings
    Convert user prompts into interpretable control policies using LLMs (OpenAI, Claude) in CityLearn environments.

© 2025 Kundan Kumar ∙ Made with Quarto
