Volume 71 | Issue 5 | Year 2025 | Article Id. IJMTT-V71I5P102 | DOI : https://doi.org/10.14445/22315373/IJMTT-V71I5P102
| Received | Revised | Accepted | Published |
|---|---|---|---|
| 19 Mar 2025 | 24 Apr 2025 | 13 May 2025 | 26 May 2025 |
The Hamilton-Jacobi-Bellman (HJB) equation is fundamental to optimal control theory, characterizing optimality through the principle of dynamic programming. We apply the HJB framework to a system of multiple agents, each seeking to optimize its own objective while interacting with the other agents in a shared environment. For both the cooperative and noncooperative cases, we formulate the coupled HJB equations governing the system. To address key challenges such as the curse of dimensionality and the need for decentralized solutions, we present approximation techniques and a learning-based approach. We also study conditions under which Nash equilibria can be obtained from the HJB framework in differential games. The theoretical findings are validated by simulation results, which demonstrate the applicability of the proposed methods to robotic coordination and autonomous vehicle systems.
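For context, in the noncooperative (nonzero-sum differential game) setting described above, each agent's value function satisfies one equation of a coupled HJB system. The following is the standard infinite-horizon statement for control-affine dynamics; the notation ($f$, $g_i$, $Q_i$, $R_{ij}$, $V_i$) is illustrative and not taken from the article itself:

```latex
% Dynamics: \dot{x} = f(x) + \sum_{j=1}^{N} g_j(x)\, u_j
% Cost of agent i: J_i = \int_0^{\infty} \big( Q_i(x) + \sum_{j=1}^{N} u_j^{\top} R_{ij} u_j \big)\, dt
% Each value function V_i satisfies the coupled HJB equation
0 = \min_{u_i} \Big[\, Q_i(x) + \sum_{j=1}^{N} u_j^{\top} R_{ij} u_j
    + \nabla V_i(x)^{\top} \Big( f(x) + \sum_{j=1}^{N} g_j(x)\, u_j \Big) \Big],
\qquad V_i(0) = 0,
% with the minimizing (Nash) feedback policy
u_i^{*}(x) = -\tfrac{1}{2}\, R_{ii}^{-1} g_i(x)^{\top} \nabla V_i(x).
```

Each equation for $V_i$ depends on the policies $u_j^{*}$ of all other agents, which is what makes the system coupled; a simultaneous solution of all $N$ equations yields a Nash equilibrium.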
Hamilton-Jacobi-Bellman (HJB) Equation, Optimal Control, Multi-Agent System, Dynamic Programming, Cost Functional.
Patel Nirmal Rajnikant, Ritu Khanna, "The Hamilton-Jacobi-Bellman Equation for Optimal Control in Multi-Agent Systems," International Journal of Mathematics Trends and Technology (IJMTT), vol. 71, no. 5, pp. 9-17, 2025. Crossref, https://doi.org/10.14445/22315373/IJMTT-V71I5P102