In a world increasingly shaped by artificial intelligence, a new frontier is emerging – one that promises to propel AI from the digital to the physical world. Known as physical AI or embodied AI, this convergence of robotics and artificial intelligence is poised to redefine how machines interact with humans and their environments.
“Physical AI is another way to talk about the intersection between AI on one side and robotics on the other,” Pascal Brier says. “To some extent, this is the next big thing after spending a year on LLMs, another year on agentic. Now everybody’s talking about physical AI.”
Investing in AI and robotics
He revealed that Capgemini has been investing in both AI and robotics for a considerable time. “Three years ago, we made a public commitment to invest €2 billion in AI, including this intersection between AI and robotics. We’ve already delivered more than 1,200 projects on AI, including generative AI.”
Capgemini has more than 30,000 people fully dedicated to AI, and over 20,000 experts in engineering, industrialisation and robotics. A network of dedicated labs, including the AI Futures Lab, explores the evolution of AI beyond current paradigms.
“We’ve invested in a new technology called Liquid Networks, which might one day be an evolution of the transformer model from Google,” he explains. “We’ve been investing in a spin-off from MIT called Liquid AI, which is trying to develop that technology, offering the same capabilities as gen AI but with only a fraction of the energy required to run models.”
Pascal describes physical AI as “the intersection of systems that think and machines that act”. The goal is to build systems that operate and act in the physical world, not only in the digital one.
From machine learning to ChatGPT
While AI and robotics are not new, the difference now is the maturity of both technologies. “AI has gone through several revolutions – from the early 1950s to the 80s with machine learning, the 90s with neural networks, the 2000s with deep learning, and then the transformer model in 2017,” he explains. “The most important date being November 30, 2022, when ChatGPT came out.”
Robotics, too, has evolved: “From the early robots in the 60s to programmable robots in the 80s, mobile and sensor technologies in the 2000s, cobots in the 2010s, and today, physical AI. Now we think those two technologies are mature enough to be combined into something that makes sense and creates machines that can sense and act in the real world.”
Tim Ensor echoes his colleague’s sentiment. “2025 feels like the same kind of moment we had when we were suddenly blown away with what AI could do with creating images, and then with creating text. Now we’re seeing the same in physical robots.”
He cites several examples from this year alone: “DeepMind’s two robot arms folding an origami fox – an incredibly challenging fine manipulation task. In Beijing, 21 humanoids participated in a half marathon, and six of them finished. Unitree, a Chinese robot manufacturer, staged a boxing match between two robots that could react to each other’s movements and get back up after being knocked down. And Boston Dynamics’ latest robot can do a cartwheel and breakdance.”
Defining physical AI or embodied AI
Expanding on the definition of physical AI, or embodied AI as some call it, he describes it as enabling AI to understand and interact with the physical world. “We’ve moved from perception AI, where cameras and machine vision can classify what they see, to generative AI, where models can co-create with us. But generative AI is still limited to the digital domain – words, documents, software, data.
“What becomes exciting with physical AI is when we start putting that capability into the real world. We need machines to understand that when you drop an object, it falls to the ground – not just to the bottom of a video frame. That an object continues to exist when it’s behind another object. We need AI that starts to understand these concepts.”
This, he argues, is how AI begins to develop a form of common sense – something it has long lacked. This important step will allow AI to interact on our terms, in our world. The factors driving this shift include compute power, data availability and algorithmic advances.
“We continue to see exponential increases in compute power. Next year’s NVIDIA Rubin GPU will be 900 times more powerful than the Hopper generation from just a couple of years ago – and at only 3% of the total cost of ownership.”
Rise of edge computing
He also notes the rise of edge computing. “We’re getting to the point where we can deliver two petaflops of compute on GPU platforms using only 100 watts. That’s enough to run high-capacity AI on battery-powered machines in real time.”
On the algorithmic front, Tim highlights the importance of “world models” that allow AI to reason about physical space: “Wayve, the London-based autonomous driving company, is doing some of the most advanced work in this area. They’ve invested heavily in enabling their vehicles to operate effectively in complex environments.
“In visual reasoning and reinforcement learning, we’re seeing large language models trained in multiple modalities – language, image, video – performing well on tasks like question answering and reasoning. And reinforcement learning, where agents learn by interacting with environments, is starting to deliver real benefits.”
Simulators, he adds, are becoming essential. “It’s almost impossible to collect the amount of physical data needed to train these systems. So high-fidelity simulators are critical.”
Omniscia surgical imaging platform
Tim cites some of CC’s work in this area, including Omniscia, a surgical imaging platform that delivers four-times super-resolution and real-time distortion correction in just 12 milliseconds. “That’s faster than human perception,” he says. “It means a robot can respond to physical events – like not falling over – in real time.”
He also refers to our work on autonomous collaborative drone fleets, which applied hierarchical reinforcement learning. “We trained a multi-agent system to allow drones to collaborate and avoid collisions. But when we moved from simulation to the real world, we had to account for things like wind disturbance. It taught us a huge amount.”
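To make the sim-to-real point concrete, here is a minimal, illustrative sketch in Python – our own simplification, not a description of the actual drone system. It shows one common idea: randomising a wind disturbance inside the simulator at every step, so that behaviour tuned against it is less brittle when it meets real gusts. The dynamics, the wind_scale parameter and the placeholder damping policy are all assumptions made for illustration.

```python
import numpy as np

# Illustrative sketch only -- not the system described above. It randomises a
# wind disturbance inside a toy multi-drone simulator so that behaviour tuned
# against it transfers more gracefully to real-world conditions.

rng = np.random.default_rng(0)

def simulate_step(positions, velocities, actions, dt=0.1, wind_scale=0.5):
    """Advance an (n_drones, 2) fleet one timestep under a random wind gust."""
    wind = rng.uniform(-wind_scale, wind_scale, size=2)   # shared gust this step
    velocities = velocities + (actions + wind) * dt
    positions = positions + velocities * dt
    return positions, velocities

def separation_penalty(positions, min_dist=1.0):
    """Collision-avoidance signal: penalise drone pairs closer than min_dist."""
    diffs = positions[:, None, :] - positions[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    np.fill_diagonal(dists, np.inf)                        # ignore self-distances
    return float(np.sum(np.maximum(0.0, min_dist - dists)))

# Minimal usage: three drones spread out and try to hold position against gusts.
pos = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
vel = np.zeros((3, 2))
for _ in range(200):
    actions = -0.5 * vel                                   # placeholder policy
    pos, vel = simulate_step(pos, vel, actions)
print("final separation penalty:", separation_penalty(pos))
```

In a real pipeline the placeholder policy would be replaced by the learned multi-agent controller, and disturbances like wind would be randomised during training rather than only at evaluation time.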
The physical AI revolution is also being enabled by advancements in physical technologies: “The electric vehicle industry has driven down battery costs by 80% in the last decade. We’ve seen major advances in electric motors and sensors, like solid-state LiDARs that are compact and efficient.”
Tim Ensor’s most passionate argument for why we will see humanoid robots as an essential part of the physical AI revolution is simply that the built environment is made for humans. So if we want robots to operate in our world, they need to look and move like us. Although robots have long been used for dull, dirty, and dangerous jobs, we’ve only scratched the surface: “To go further, we need robots that are flexible – able to do multiple tasks and adapt to different environments.
“Take warehouse logistics. One of the last jobs in many warehouses is taking packages off the conveyor and loading them into lorries. It’s freezing in winter, boiling in summer, and back-breaking work. It’s hard to retain staff. While there are robot start-ups trying to solve this problem, it seems perfectly suited to humanoids.”
Applying humanoid robotics
Looking ahead, Tim points to broader applications. “There are shortages of pharmacists, food manufacturing workers – many roles where humanoid robots could help. The workplace is built for humans, and these technologies can support them.”
Experts predict a global population of 13 million humanoids within a decade – larger than the population of most European countries. The market could be worth £5 trillion, exceeding the combined sales of the world’s top 20 vehicle manufacturers. Yet Europe is lagging. “Of the leading companies in humanoid robotics, only one or two are European. The US and China are dominating. If Europe doesn’t act, it will be left behind,” warns Tim.
“I think there’s a need for national strategies to include robotics. The UK’s AI strategy is excellent, but it barely mentions robots. We need to evolve our thinking to include the impact of robotics on society.
“2025 is a turning point. What we do now will shape the future of robotics – and society – for decades to come.”