Multi-agentic foundation models are important for #robotics and #automation in negotiated and adversarial settings such as #traffic and #warfare.
But how to implement them? I have previously drafted a data-centric architecture for decomposing agentic representations for #UniversalEmbodiment in a GitHub repository.
But LLMs have already internalized multi-agentic representations, so why can't we utilize them directly? For example, in text you can easily ask an LLM to describe all the people or agents present in a scene and their intents.
We can, and we certainly must, utilize these! But these representations aren't grounded in sensory or control signals.
What we need to do is craft training data for robotic foundation models that involves scenarios with multiple agents present.
First, start acausally from what ultimately happened: how was the scenario negotiated between the participants, who drove first, what attack and evasive patterns were used?
Since we then know what happened, we can go back in time and ask the foundation model to identify all the participants in the feed and to fill in their intentions using information from the ultimate outcome.
The foundation model can then utilize all the language-space knowledge it has about multi-agent environments, while also anchoring it to the visual and control signals present in the training data.
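The hindsight labeling step above could be sketched roughly like this in Python. This is only an illustration under assumptions: `Episode`, `hindsight_prompts`, the frame identifiers, and the prompt wording are all hypothetical, and the actual call to a foundation model is left out.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    frames: list[str]  # hypothetical identifiers of recorded sensor frames
    outcome: str       # what ultimately happened, known only in hindsight


def hindsight_prompts(episode: Episode) -> list[dict]:
    """Build one annotation request per frame, conditioned on the known outcome."""
    requests = []
    for t, frame in enumerate(episode.frames):
        # Assumed prompt wording; the outcome is given up front so the model
        # can complete intents that were ambiguous at time t.
        prompt = (
            f"Outcome of the scenario: {episode.outcome}\n"
            f"At frame index {t}, list every participant visible in the feed "
            "and the intent each one had at this moment."
        )
        # Keeping the frame next to the prompt ties the language-space answer
        # to the visual signal it describes.
        requests.append({"t": t, "frame": frame, "prompt": prompt})
    return requests
```

Each request pairs a past frame with the known outcome, so the answers the model gives can be stored as intent labels next to the sensor data for that moment.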
This allows the model not only to answer questions about what each participant intends to do, but also to anchor this to multi-modal sensory information and to project embodiment-related control intents onto all the participants in the scenario, not only ego.
Ego becomes just a special case in robotic control: the model should learn to generalize and project control intents onto all agents present in the data.
Ultimately this allows the foundation model to learn from perceived and projected experiences of others, to learn to imitate or not imitate what it has seen other agents do.
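One way to picture "ego as a special case" at the data level is to emit the same kind of training sample for every agent in a scene. The sketch below is hypothetical: `AgentTrack`, `to_training_samples`, the field names, and the "recorded" versus "projected" tag are assumptions made for illustration, not an existing format.

```python
from dataclasses import dataclass


@dataclass
class AgentTrack:
    agent_id: str
    intent: str            # hindsight-completed intent label
    controls: list[float]  # hypothetical control signal, e.g. [steer, throttle]
    is_ego: bool = False


def to_training_samples(observation_id: str, tracks: list[AgentTrack]) -> list[dict]:
    """Emit one sample per agent; ego goes through the same path as everyone else."""
    samples = []
    for track in tracks:
        samples.append({
            "observation": observation_id,
            "query_agent": track.agent_id,
            "target_intent": track.intent,
            "target_controls": track.controls,
            # Assumption: ego controls come from logs, while the controls of
            # other agents are estimates projected onto them.
            "controls_source": "recorded" if track.is_ego else "projected",
        })
    return samples
```

Because every agent yields a sample with the same fields, nothing in the data format singles out ego except the provenance tag.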
It's all about crafting data, not really about sophisticated model architectures.
#RoboticFoundationModels #FoundationModels #PhysicalAI #AI #AGI