Recent highlights

(For a full list see Google Scholar)

A Human-on-the-Loop Optimization Autoformalism Approach for Sustainability

We introduce optimization autoformalism, which transforms natural language task descriptions directly into optimization instances. This allows LLMs to address intricate planning problems that lie beyond the reach of traditional prompt-based methods. We focus on optimization problems that require frequent adjustment and are tailored to individual users, rather than on generic solutions. Our study spans tasks in the energy sector, from electric vehicle charging and HVAC management to evaluating the benefits of rooftop solar PV or heat pumps. This research marks a move towards context-driven optimization with LLMs, aiming to make optimization more accessible and democratic.

Ming Jin, Bilgehan Sel, Fnu Hardeep, Wotao Yin
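To make the idea of an "optimization instance" concrete, here is a minimal sketch for the electric vehicle charging task mentioned in the abstract above. It assumes the natural language request has already been turned into a structured specification; that translation step (the LLM part) is omitted. The field names, the numbers, and the greedy solver are hypothetical illustrations and are not taken from the paper.

```python
# Hypothetical instance that a request such as "charge my car with 10 kWh
# before hour 3 as cheaply as possible" might be formalized into.
spec = {
    "price_per_kwh": [0.30, 0.10, 0.20, 0.40],  # assumed hourly prices
    "energy_needed_kwh": 10.0,
    "max_rate_kw": 7.0,      # at most 7 kWh can be delivered per hour
    "deadline_hour": 3,      # only hours 0, 1, 2 may be used
}

def solve_ev_charging(spec):
    """Minimize sum(price[h] * x[h]) subject to sum(x) = energy needed
    and 0 <= x[h] <= max rate. For this linear program, filling the
    cheapest hours first is optimal."""
    prices = spec["price_per_kwh"][: spec["deadline_hour"]]
    remaining = spec["energy_needed_kwh"]
    schedule = [0.0] * len(prices)
    for hour in sorted(range(len(prices)), key=lambda h: prices[h]):
        if remaining <= 0:
            break
        charge = min(spec["max_rate_kw"], remaining)
        schedule[hour] = charge
        remaining -= charge
    if remaining > 1e-9:
        raise ValueError("infeasible: not enough hours before the deadline")
    cost = sum(p * x for p, x in zip(prices, schedule))
    return schedule, cost

schedule, cost = solve_ev_charging(spec)
```

Changing a user preference (a later deadline, a different energy target) only changes the specification, not the solver, which is the kind of frequent, user-specific adjustment the abstract refers to.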


A CMDP-within-Online Framework for Meta-Safe Reinforcement Learning

We propose the first theoretical framework to handle the nonconvex and stochastic nature of within-task CMDPs (safe RL) while exploiting inter-task dependency and intra-task geometries for meta-safe RL (Meta-SRL). We obtain task-averaged regret guarantees for reward maximization and constraint violations using gradient-based meta-learning, and show that the task-averaged optimality gap and constraint satisfaction improve with task similarity. Our meta-algorithm performs inexact online learning on upper bounds of the intra-task optimality gap and constraint violations, which are estimated by off-policy stationary distribution corrections. Furthermore, we enable the learning rates to be adapted for every task and extend our approach to settings with dynamically changing task environments.

Vanshaj Khattar, Yuhao Ding, Bilgehan Sel, Javad Lavaei, Ming Jin

ICLR (2023) (spotlight presentation) (openreview | pdf)
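For orientation, the quantities named in the Meta-SRL abstract above can be written in one common notation. The symbols below (a sequence of $T$ tasks, reward value $J_{t,0}$, constraint values $J_{t,i}$ with thresholds $d_{t,i}$, learned policy $\pi_t$, and per-task optimal policy $\pi_t^{*}$) are assumptions for illustration and may differ from the paper's own definitions.

```latex
% Within-task CMDP for task t (standard constrained form)
\max_{\pi} \; J_{t,0}(\pi)
\quad \text{s.t.} \quad J_{t,i}(\pi) \le d_{t,i}, \qquad i = 1, \dots, m

% Task-averaged optimality gap and task-averaged violation of constraint i
\frac{1}{T} \sum_{t=1}^{T} \bigl[ J_{t,0}(\pi_t^{*}) - J_{t,0}(\pi_t) \bigr],
\qquad
\frac{1}{T} \sum_{t=1}^{T} \bigl[ J_{t,i}(\pi_t) - d_{t,i} \bigr]_{+}
```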

On Solution Functions of Optimization: Universal Approximation and Covering Number Bounds

This is the first study on the expressibility and learnability of solution functions of convex optimization and their multi-layer architectural extension. Some interesting results include: 1) the class of solution functions of linear programming (LP) and quadratic programming (QP) is a universal approximant, 2) compositionality in the form of a deep architecture can achieve a substantial reduction in rate-distortion, and 3) the statistical bounds on empirical covering numbers for LP/QP, as well as for a generic (possibly nonconvex) optimization problem, can be characterized by tame geometry.

Ming Jin, Vanshaj Khattar, Harshal Kaushik, Bilgehan Sel, Ruoxi Jia

AAAI (2023) (oral presentation) (arXiv | pdf)
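As a small illustration of what a "solution function" is, the sketch below treats the minimizer of a one-dimensional QP as a function of its parameter theta. For the problem min over x >= 0 of 0.5 * (x - theta)^2, the minimizer equals max(0, theta), the familiar ReLU. The projected gradient solver, the function name, and the step settings are assumptions chosen for illustration; this is not the construction used in the paper.

```python
def qp_solution(theta, steps=100, lr=0.5):
    """Approximate argmin_{x >= 0} 0.5 * (x - theta)**2 by projected
    gradient descent: take a gradient step, then project onto x >= 0."""
    x = 0.0
    for _ in range(steps):
        gradient = x - theta
        x = max(0.0, x - lr * gradient)
    return x

# The map theta -> qp_solution(theta) is the solution function of this QP.
thetas = [-2.0, -0.5, 0.0, 0.5, 2.0]
solutions = [qp_solution(t) for t in thetas]
relu_values = [max(0.0, t) for t in thetas]
```

Comparing `solutions` with `relu_values` shows the two agree up to solver tolerance, which gives some intuition for why families of LP/QP solution functions can be expressive.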

Winning the CityLearn Challenge: Adaptive Optimization with Evolutionary Search under Trajectory-based Guidance

The CityLearn Challenge is an international competition for reinforcement learning (RL) solutions to address grand challenges in power and energy systems. In this paper, we present our winning solution, which uses the solution function of an optimization model as the policy to compute actions for sequential decision-making, while adapting the parameters of the optimization model from online observations. Algorithmically, this is achieved by an evolutionary algorithm under a novel trajectory-based guidance scheme. We also establish a global convergence property.

Vanshaj Khattar, Ming Jin

AAAI (2023) AI for Social Impact Track (arXiv | pdf)
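The general pattern described in the CityLearn abstract above, an optimization problem used as a policy whose parameters are tuned by evolutionary search, can be sketched on a toy problem. Everything below is an assumption for illustration: the one-dimensional policy, the cost function, and the plain elitist (1+lambda) evolution strategy. In particular, it does not implement the paper's trajectory-based guidance scheme or the CityLearn environment.

```python
import random

def policy(theta, obs):
    """Solution function of: min_a (a - theta * obs)**2  s.t. -1 <= a <= 1.
    For this one-dimensional problem the minimizer is a clipped value."""
    return max(-1.0, min(1.0, theta * obs))

def episode_cost(theta, observations):
    # Hypothetical cost: squared distance of each action from 0.5 * obs.
    return sum((policy(theta, o) - 0.5 * o) ** 2 for o in observations)

def evolve(observations, generations=50, population=8, sigma=0.3, seed=0):
    """Elitist (1+lambda) evolution strategy over the policy parameter."""
    rng = random.Random(seed)
    theta = 0.0
    best = episode_cost(theta, observations)
    for _ in range(generations):
        candidates = [theta + rng.gauss(0.0, sigma) for _ in range(population)]
        challenger = min(candidates, key=lambda c: episode_cost(c, observations))
        challenger_cost = episode_cost(challenger, observations)
        if challenger_cost < best:  # keep the parent unless a child improves
            theta, best = challenger, challenger_cost
    return theta, best

observations = [-1.0, -0.5, 0.5, 1.0]
initial_cost = episode_cost(0.0, observations)
theta, cost = evolve(observations)
```

Because the search is elitist, the cost never increases across generations, and on this toy problem the parameter drifts towards 0.5, the value that makes the optimization-based policy match the target actions.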


Publication List