FUN2MODEL Case Studies

FUN2MODEL Research Themes:

Multi-agent Coordination and Collaboration

Modelling multi-agent systems as concurrent stochastic games. Automated verification and strategy synthesis.

As computing systems increasingly involve concurrently acting autonomous agents, game-theoretic approaches are becoming widespread in computer science as a faithful modelling abstraction. These techniques can be used to reason about the competitive or collaborative behaviour of multiple rational agents or entities with distinct goals or objectives. We have developed a comprehensive set of techniques for verification and strategy synthesis for concurrent stochastic multi-agent games, covering finite and infinite horizon and a large class of probabilistic and reward objectives, as supported by the PRISM-games 3.0 software release.

Nash and correlated equilibria. Equilibria verification and synthesis.

Much of the work concerns zero-sum objectives, where a coalition of agents aims to maximise its expected reward while the other agents aim to minimise this value. However, these are too limiting, as in many cases the agents have distinct, but not directly opposing, goals, which cannot be modelled in a zero-sum fashion. We thus introduce equilibria, defined by a separate, independent objective for each agent. These are particularly attractive since they ensure stability against deviations by individual agents, improving the overall system outcomes. We consider Nash and correlated equilibria, and develop algorithms for their (approximate/exact) verification and synthesis for concurrent stochastic games. In a Nash equilibrium, no agent has an incentive to deviate unilaterally from their strategy. Correlated equilibria, in which agents are able to coordinate through public signals, are easier to compute than Nash equilibria and can yield better outcomes.

The image shows possible equilibria for a parking game. We consider a cost measure based on the distance each car needs to travel before parking and on whether there is a crash. Although both strategies result in equilibria, given that the cars are able to park without crashing, there is an advantage in selecting the one shown on the right, as the sum of distances travelled by all cars is smaller.
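The unilateral-deviation condition behind a Nash equilibrium can be illustrated on a one-shot game. The following is a minimal sketch, not the PRISM-games algorithm: it enumerates pure action profiles only, and the two-car parking payoffs (reward is minus the distance travelled, or -10 on a crash) are hypothetical numbers chosen for illustration.

```python
from itertools import product


def pure_nash_equilibria(payoffs, num_actions):
    """Return every pure action profile from which no single agent
    can increase its own reward by changing only its own action.

    payoffs maps an action profile (tuple) to a tuple of rewards,
    one per agent; num_actions gives the action count of each agent.
    """
    equilibria = []
    for profile in product(*(range(n) for n in num_actions)):
        stable = True
        for agent, n in enumerate(num_actions):
            current = payoffs[profile][agent]
            for alternative in range(n):
                deviated = profile[:agent] + (alternative,) + profile[agent + 1:]
                if payoffs[deviated][agent] > current:
                    stable = False  # this agent profits from deviating
        if stable:
            equilibria.append(profile)
    return equilibria


# Hypothetical game: each car picks parking spot 0 or 1.
# Picking the same spot is a crash (-10 each); otherwise the
# reward is minus the (made-up) distance to the chosen spot.
parking_payoffs = {
    (0, 0): (-10, -10),
    (0, 1): (-1, -2),
    (1, 0): (-2, -4),
    (1, 1): (-10, -10),
}

equilibria = pure_nash_equilibria(parking_payoffs, (2, 2))
print(equilibria)
```

Both crash-free profiles are stable in this toy game, but their total rewards differ, which is why a further selection criterion among equilibria is useful.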

Social fairness and social welfare. Two optimality criteria.

To select amongst the multiple equilibria, we consider two main types of optimality criteria, social welfare and social fairness. Social welfare maximises the sum of the agents’ rewards, whereas social cost minimises this value. On the other hand, social fairness, which minimises the differences between the objectives of individual players, is a novel optimality criterion inspired by economics that is distinct from the use of fairness in verification. Both optimality criteria can be employed for Nash and correlated equilibria.
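The two selection criteria can be sketched as follows, given a list of candidate equilibria that have already been computed. This is an illustration only: the function names and the candidate values are hypothetical, and measuring fairness as the gap between the largest and smallest agent value is one assumed reading of "minimises the differences between the objectives".

```python
def social_welfare_optimal(candidates):
    """Pick the candidate whose agent values have the largest sum."""
    return max(candidates, key=sum)


def social_fair_optimal(candidates):
    """Pick the candidate with the smallest gap between the
    best-off and worst-off agent (assumed fairness measure)."""
    return min(candidates, key=lambda values: max(values) - min(values))


# Hypothetical expected rewards (agent 1, agent 2) of three equilibria.
candidates = [(5.0, 1.0), (3.0, 2.5), (2.0, 2.0)]

print(social_welfare_optimal(candidates))  # largest total reward
print(social_fair_optimal(candidates))     # most even split
```

In this example the two criteria pick different equilibria: the welfare-optimal one favours the first agent, while the fair one gives both agents the same value at the price of a lower total.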

The image shows optimal values for agents trying to send packets using the slotted Aloha protocol. In this setting, if k users try to send a packet in the same time slot, the probability of each of them being successful is q/k, where q is a value between 0 and 1. If unsuccessful, an agent needs to wait a number of slots before resending, set according to Aloha’s exponential backoff scheme. In the figure, SW_i corresponds to the optimal value (expected time to send their packet) for agent i, for both SWNE (social welfare Nash equilibria) and SWCE (social welfare correlated equilibria), for the cases of two, three and four users. We see that for these types of equilibria the values are different for each user, being lower for the agent that sends first and higher for the one that sends last. On the other hand, for SFNE (social fair Nash equilibria) and SFCE (social fair correlated equilibria) the values for all agents coincide, while for the latter the overall sum is less than 2% smaller when compared to the social-welfare optimal variants.
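The q/k success probability can be made concrete with a few lines of arithmetic. The sketch below is a deliberate simplification: it ignores the exponential backoff scheme and assumes that the same k users contend in every slot, so the waiting time is geometric. The value q = 9/10 and the function names are hypothetical.

```python
from fractions import Fraction


def success_probability(q, k):
    """Probability that one given user succeeds when k users send
    in the same slot, following the q/k rule described above."""
    if k < 1:
        raise ValueError("at least one user must be sending")
    return q / k


def expected_slots_without_backoff(q, k):
    """Mean number of slots until a given user succeeds, ASSUMING
    the same k users contend in every slot and there is no backoff
    (mean of a geometric distribution with parameter q/k)."""
    return 1 / success_probability(q, k)


q = Fraction(9, 10)  # hypothetical channel quality
for k in (1, 2, 3, 4):
    print(k, success_probability(q, k), expected_slots_without_backoff(q, k))
```

Even in this simplified setting the expected delay grows linearly with the number of contending users, which is the pressure that makes agents prefer to stagger their transmissions.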

Neuro-symbolic concurrent stochastic games. Agents observe the environment using neural perception mechanisms.

Neuro-symbolic approaches to artificial intelligence, which combine neural networks with classical symbolic techniques, are growing in prominence, necessitating formal approaches to reason about their correctness. We have developed a novel modelling formalism called neuro-symbolic concurrent stochastic games (NS-CSGs), which comprise two probabilistic finite-state agents interacting in a shared continuous-state environment. Each agent observes the environment using a neural perception mechanism, which converts inputs such as images into symbolic percepts, and makes decisions symbolically. We focus on the class of NS-CSGs with Borel state spaces and prove the existence and measurability of the value function for zero-sum discounted cumulative rewards under piecewise-constant restrictions on the components of this class of models. To compute values and synthesise strategies, we developed practical value iteration (VI) and policy iteration (PI) algorithms to solve this new subclass of continuous-state CSGs by relying on a finite decomposition of the environment induced by the neural perception mechanisms, together with finite representations of value functions and strategies.

The image shows the value function and the computed strategy synthesised in the discounted, infinite-horizon setting for a dynamic car parking example modelled as a fully-observable zero-sum game.
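For intuition, the idea of value iteration for zero-sum discounted concurrent games can be shown on a small finite-state game. The sketch below is the classical iteration for finite games, not the NS-CSG algorithm: there is no neural perception and no continuous state space, each agent is assumed to have exactly two actions so that every one-step matrix game has a closed-form value, and the example game is hypothetical.

```python
def matrix_game_value(m):
    """Value of a 2x2 zero-sum matrix game (row agent maximises)."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        return maximin  # saddle point in pure strategies
    # No saddle point: both agents mix, and the denominator is non-zero.
    return (a * d - b * c) / (a + d - b - c)


def value_iteration(rewards, trans, gamma, tol=1e-9, max_iters=10_000):
    """Iterate V(s) <- val[ r(s,i,j) + gamma * sum_t P(t|s,i,j) V(t) ].

    rewards[s][i][j] is the reward to the maximising agent and
    trans[s][i][j] maps successor states to probabilities.
    """
    values = [0.0] * len(rewards)
    for _ in range(max_iters):
        new_values = []
        for s in range(len(rewards)):
            q = [[rewards[s][i][j]
                  + gamma * sum(p * values[t] for t, p in trans[s][i][j].items())
                  for j in range(2)]
                 for i in range(2)]
            new_values.append(matrix_game_value(q))
        diff = max(abs(x - y) for x, y in zip(new_values, values))
        values = new_values
        if diff < tol:
            break
    return values


# Hypothetical game: state 0 is contested, state 1 is absorbing with
# zero reward. Every joint action in state 0 stays there w.p. 0.5.
step = {0: 0.5, 1: 0.5}
rewards = [[[2.0, 1.0], [0.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]]]
trans = [[[step, step], [step, step]],
         [[{1: 1.0}, {1: 1.0}], [{1: 1.0}, {1: 1.0}]]]

values = value_iteration(rewards, trans, gamma=0.5)
print(values)
```

The one-step game in state 0 has value 1.5 under mixed strategies, so the fixed point satisfies V(0) = 1.5 + 0.5 * 0.5 * V(0), giving V(0) = 2. In the continuous-state setting described above, the list of per-state values is replaced by a finite representation over regions of the environment.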

Neuro-symbolic concurrent stochastic games. Equilibria strategies ensure stability.

We considered (fully-observable) neuro-symbolic concurrent stochastic games with neural perception mechanisms, in which agents operate concurrently in a shared environment that they observe through neural networks. We have studied the (undiscounted, finite-horizon) equilibria synthesis problem, which we have applied to the VCAS autonomous aircraft controller (also studied in the Robustness Guarantees for Bayesian Neural Networks theme), implemented as a ReLU network.

The figure plots the altitude h for equilibria and zero-sum strategies when maximising h for a given instant k. It can be seen that, with respect to the safety criterion of avoiding a near mid-air collision, equilibria strategies allow the two aircraft to reach a safe configuration within a shorter horizon, which would be missed by a zero-sum analysis.

Partially observable stochastic games with neural perception mechanisms. Strategy synthesis.

Stochastic games are an established model for multi-agent sequential decision making under uncertainty. In practical applications, though, agents often have only partial observability of their environment. Furthermore, agents increasingly perceive their environment using data-driven approaches such as neural networks trained on continuous data. We propose neuro-symbolic partially-observable stochastic games (NS-POSGs), a variant of continuous-space concurrent stochastic games that explicitly incorporates neural perception mechanisms. We focus on a one-sided setting with a partially-informed agent using discrete, data-driven observations and another, fully-informed agent. We present a new method, called one-sided NS-HSVI, for approximate solution of one-sided NS-POSGs, which exploits the piecewise constant structure of the model. Using neural network pre-image analysis to construct finite polyhedral representations and particle-based representations for beliefs, we implement our approach and illustrate its practical applicability to the analysis of pedestrian-vehicle and pursuit-evasion scenarios.

The image shows a pedestrian-vehicle example to analyse decision making for an autonomous vehicle using an intention estimation model for a pedestrian at a crossing. The scenario is modelled as a one-sided game, where the first, partially-informed agent represents the vehicle. It observes the environment (comprising the successive pedestrian locations) using a neural network perception mechanism to predict the pedestrian’s intention. The perception function takes two successive (relative) locations of the pedestrian (the top-left coordinates (x1, y1) and (x2, y2) of two fixed size bounding boxes around the pedestrian) and classifies its intention as: unlikely, likely or very likely to cross. We train a feed-forward neural network classifier with ReLU activation functions over the PIE dataset. The second agent, the pedestrian, is fully informed, providing a worst-case analysis of the vehicle decisions, and can decide to cross or return to the roadside. The goal of the vehicle is to minimise the likelihood of a collision with the pedestrian, which is achieved by associating a negative reward with this event.

Left: Positions of two agents. Middle: Sample images from the PIE dataset. Right: Slices of learnt perception function, where (x1,y1), (x2,y2) are two successive (relative) positions of the pedestrian.
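The shape of such a perception function can be sketched as a tiny feed-forward ReLU network that maps two successive locations to one of the three intention labels. The weights below are hand-picked and hypothetical, not trained on the PIE dataset, and the layer sizes are an assumption made to keep the example short.

```python
LABELS = ("unlikely", "likely", "very likely")

# Hypothetical, untrained weights: 4 inputs -> 3 hidden units -> 3 outputs.
W1 = [[-1.0, 0.0, 1.0, 0.0],   # x2 - x1
      [1.0, 0.0, -1.0, 0.0],   # x1 - x2
      [0.0, -1.0, 0.0, 1.0]]   # y2 - y1
B1 = [0.0, 0.0, 0.0]
W2 = [[0.0, 1.0, 0.0],
      [0.5, 0.0, 0.2],
      [1.0, 0.0, 0.0]]
B2 = [0.1, 0.0, -0.5]


def affine(weights, bias, vector):
    """Compute weights * vector + bias for one dense layer."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def relu(vector):
    """Apply the ReLU activation element-wise."""
    return [max(0.0, x) for x in vector]


def perceive(x1, y1, x2, y2):
    """Map two successive (relative) pedestrian locations to a label
    by taking the output unit with the largest score."""
    hidden = relu(affine(W1, B1, [x1, y1, x2, y2]))
    scores = affine(W2, B2, hidden)
    best = max(range(len(LABELS)), key=lambda i: scores[i])
    return LABELS[best]


print(perceive(0.0, 0.0, 2.0, 0.0))
print(perceive(0.0, 0.0, 0.5, 0.0))
print(perceive(2.0, 0.0, 0.0, 0.0))
```

Because the output is a label rather than a score, the function is constant on each region of the input space where the same output unit wins. This is the kind of piecewise-constant structure that the solution method described above exploits.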

To know more about these models and analysis techniques, follow the links below.

Software: PRISM-games

19 publications:

2025

[JKP+25] Wojciech Jamroga, Marta Kwiatkowska, Wojciech Penczek, Laure Petrucci, Teofil Sidoruk. Probabilistic Timed ATL. In Proc. 24th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2025). 2025. [pdf] [bib]

2024

[YSN+24b] Rui Yan, Gabriel Santos, Gethin Norman, David Parker, Marta Kwiatkowska. Partially Observable Stochastic Games with Neural Perception Mechanisms. In Proc. 26th International Symposium on Formal Methods (FM'24), Springer. To appear. 2024. [pdf] [bib]
[ABB+24] Roman Andriushchenko, Alexander Bork, Carlos E. Budde, Milan Češka, Kush Grover, Ernst Moritz Hahn, Arnd Hartmanns, Bryant Israelsen, Nils Jansen, Joshua Jeppson, Sebastian Junges, Maximilian A. Köhl, Bettina Könighofer, Jan Křetínský, Tobias Meggendorfer, David Parker, Stefan Pranger, Tim Quatmann, Enno Ruijters, Landon Taylor, Matthias Volk, Maximilian Weininger and Zhen Zhang. Tools at the Frontiers of Quantitative Verification. In Proc. TOOLympics III, volume 14550 of LNCS, pages 90-146, Springer. 2024. [pdf] [bib]
[Kwi24] Marta Kwiatkowska. Strategy Synthesis for Partially Observable Stochastic Games with Neural Perception Mechanisms. In 32nd EACSL Annual Conference on Computer Science Logic (CSL 2024). 2024. [pdf] [bib] https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CSL.2024.5
[YSN+24] Rui Yan, Gabriel Santos, Gethin Norman, David Parker and Marta Kwiatkowska. HSVI-based Online Minimax Strategies for Partially Observable Stochastic Games with Neural Perception Mechanisms. In Proc. Learning for Dynamics and Control Conference (L4DC'24), volume 242 of Proceedings of Machine Learning Research, pages 80-91. 2024. [pdf] [bib]
[YSN+24c] Rui Yan, Gabriel Santos, Gethin Norman, David Parker and Marta Kwiatkowska. Strategy Synthesis for Zero-sum Neuro-symbolic Concurrent Stochastic Games. Information and Computation. To appear. 2024. [pdf] [bib]
[FLHP24] Fatma Faruq, Bruno Lacerda, Nick Hawes and David Parker. A Framework for Simultaneous Task Allocation and Planning under Uncertainty. ACM Transactions on Autonomous and Adaptive Systems. 2024. [pdf] [bib]
[YDZ+24] Rui Yan, Xiaoming Duan, Rui Zou, Xin He, Zongying Shi, Francesco Bullo. Multiplayer Homicidal Chauffeur Reach-Avoid Games: A Pursuit Enclosure Function Approach. Automatica. Paper accepted April 2024. 2024. [pdf] [bib] https://arxiv.org/abs/2311.02389

2023

[YSN+23] Rui Yan, Gabriel Santos, Gethin Norman, David Parker, Marta Kwiatkowska. Point-based Value Iteration for Neuro-Symbolic POMDPs. Technical report arXiv:2306.17639, arXiv. Paper under submission. 2023. [pdf] [bib] https://arxiv.org/abs/2306.17639
[Par23] David Parker. Multi-Agent Verification and Control with Probabilistic Model Checking. In Proc. 20th International Conference on Quantitative Evaluation of SysTems (QEST'23), Springer. 2023. [pdf] [bib]
[YGJ+23] Pian Yu, Yulong Gao, Frank J. Jiang, Karl H. Johansson and Dimos V. Dimarogonas. Online control synthesis for uncertain systems under signal temporal logic specifications. International Journal of Robotics Research, SAGE Publications. 2023. [pdf] [bib]
[YZD+23] Rui Yan, Weixian Zhang, Ruiliang Deng, Xiaoming Duan, Zongying Shi, Yisheng Zhong. Evaluation and learning in two-player symmetric games via best and better response. Information Sciences, 647, pages 119459, Elsevier. 2023. [pdf] [bib] https://doi.org/10.1016/j.ins.2023.119459

2022

[KNPS22b] Marta Kwiatkowska, Gethin Norman, David Parker and Gabriel Santos. Symbolic Verification and Strategy Synthesis for Turn-based Stochastic Games. In Principles of Systems Design: Essays Dedicated to Thomas A. Henzinger on the Occasion of His 60th Birthday, volume 13660 of LNCS, Springer. 2022. [pdf] [bib]
[KNP+22] Marta Kwiatkowska, Gethin Norman, David Parker, Gabriel Santos and Rui Yan. Probabilistic Model Checking for Strategic Equilibria-based Decision Making: Advances and Challenges. In Proc. 47th International Symposium on Mathematical Foundations of Computer Science (MFCS'22). August 2022. [pdf] [bib]
[YSD+22] Rui Yan, Gabriel Santos, Xiaoming Duan, David Parker and Marta Kwiatkowska. Finite-horizon Equilibria for Neuro-symbolic Concurrent Stochastic Games. In Proc. 38th Conference on Uncertainty in Artificial Intelligence (UAI'22), AUAI Press. August 2022. [pdf] [bib]
[KNPS22] Marta Kwiatkowska, Gethin Norman, David Parker and Gabriel Santos. Correlated Equilibria and Fairness in Concurrent Stochastic Games. In Proc. 28th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS'22), volume 13244 of LNCS, pages 60–78, Springer. April 2022. [pdf] [bib]

2021

[San21] Gabriel H.R. Santos. Automatic Verification and Strategy Synthesis for Zero-sum and Equilibria Properties of Concurrent Stochastic Games. Ph.D. thesis, Department of Computer Science, University of Oxford. March 2021. [pdf] [bib]
[KNPS21] Marta Kwiatkowska, Gethin Norman, David Parker and Gabriel Santos. Automated Verification of Concurrent Stochastic Systems. Formal Methods in System Design, Springer. January 2021. [pdf] [bib]

2020

[KNPS20] Marta Kwiatkowska, Gethin Norman, David Parker and Gabriel Santos. PRISM-games 3.0: Stochastic Game Verification with Concurrency, Equilibria and Time. In 32nd International Conference on Computer Aided Verification (CAV'20), Springer. July 2020. [pdf] [bib]
