Sequence models such as LLMs are powerful because they exhibit in-context learning and other in-context cognitive capabilities.
These capabilities were never intended or engineered in; they emerged largely by accident.
What we see in these models is that they can learn in a generalizable way, near-optimally, from a single example given in context. This is far better than any of our classically engineered learning algorithms can do.
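To make this concrete, here is a minimal sketch of one-shot in-context learning: a single worked example is placed in the context and the model is asked to generalize to a new input. The `llm` function is a placeholder for whatever completion call you actually use, and the sentiment task is just an illustration.

```python
def llm(prompt: str) -> str:
    """Placeholder for an actual LLM completion call."""
    raise NotImplementedError("plug in your model or API client here")

def one_shot_prompt(example_input: str, example_output: str, query: str) -> str:
    # The single in-context example defines the task; no gradient update happens.
    return (
        "Task: map each input to an output, following the example.\n\n"
        f"Input: {example_input}\n"
        f"Output: {example_output}\n\n"
        f"Input: {query}\n"
        "Output:"
    )

prompt = one_shot_prompt(
    example_input="The movie was a waste of time.",
    example_output="negative",
    query="I could not stop smiling the whole way through.",
)
# answer = llm(prompt)  # the model infers the input->output mapping from one example
```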
In addition, they have world models baked into their causal processing of the context: they can describe the world state after a sequence of events. More than that, they have agentic world models: they can describe what each agent featured in the context intends to do next.
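A hedged sketch of how you might probe this: list a sequence of events in the context, then ask for the resulting state and each agent's likely next move. The scenario and prompt wording are made up for illustration; `llm` is the placeholder from the sketch above.

```python
events = [
    "Alice puts her key in the drawer.",
    "While Alice is out, Bob moves the key to his pocket.",
    "Alice comes back.",
]

prompt = (
    "Events so far:\n"
    + "\n".join(f"- {e}" for e in events)
    + "\n\nDescribe the current state of the world, and what Alice will do next"
      " and why (keep track of what she does and does not know)."
)
# answer = llm(prompt)  # e.g. "Alice will look in the drawer, because she ..."
```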
These sequence models are also, by accident, excellent integration components. The context does not have to be written generatively by the model alone; it can be written by other entities as well. Parts of it can come from one or more users, from tools such as web search or a Python interpreter, from other agents, from perception, and so on.
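As a sketch of what that can look like, here is a toy shared context assembled from several sources. The `ContextEntry` type, the `render` helper, and the source labels are purely illustrative, not taken from any particular framework.

```python
from dataclasses import dataclass

@dataclass
class ContextEntry:
    source: str   # e.g. "user", "tool:web_search", "agent:planner", "model"
    content: str

def render(context: list[ContextEntry]) -> str:
    # Flatten the heterogeneous entries into a single prompt string for the model.
    return "\n".join(f"[{entry.source}] {entry.content}" for entry in context)

context = [
    ContextEntry("user", "What's the weather in Oslo tomorrow?"),
    ContextEntry("tool:web_search", "Forecast for Oslo: 3 °C, light snow."),
    ContextEntry("agent:planner", "Summarize the forecast for the user."),
]
prompt = render(context)
# reply = llm(prompt)                           # the model reads the shared context ...
# context.append(ContextEntry("model", reply))  # ... and writes back into it
```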
All in all, LLMs are not just singular atomic entities; they are very powerful building blocks of scalable cognitive architectures. And that is what agentic systems in principle are: LLMs integrated together with a wide range of other systems, using the context as the interface.
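Here is a toy agent loop under the same assumptions: the LLM is one building block, tools are others, and the growing context is the only interface between them. The `TOOL:` convention and the example tools are invented for illustration, and `llm` is again the placeholder from the first sketch.

```python
def run_agent(task: str, tools: dict, max_steps: int = 5) -> str:
    context = [f"[user] {task}"]
    for _ in range(max_steps):
        reply = llm("\n".join(context))       # the model reads the whole context
        context.append(f"[model] {reply}")
        if reply.startswith("TOOL:"):         # e.g. "TOOL: search weather in Oslo"
            name, _, arg = reply[len("TOOL:"):].strip().partition(" ")
            result = tools[name](arg)         # tool output is written back into the context
            context.append(f"[tool:{name}] {result}")
        else:
            return reply                      # no tool request: treat as the final answer
    return context[-1]

# tools = {"search": lambda query: "Forecast for Oslo: 3 °C, light snow."}
# print(run_agent("What's the weather in Oslo tomorrow?", tools))
```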

