February 10, 2026 · News

Head-to-head: Manifold vs Physical Intelligence on identical hardware

Niva ran a direct comparison between Manifold and Physical Intelligence's π0.5, the leading vision-language-action (VLA) model for robotics, backed by more than a billion dollars in funding. The task: contact-rich manipulation of a fragile object, an egg pick-and-place, on identical UR5e hardware with the same sensors. π0.5 was trained on roughly 10,000 hours of video data across eight robot platforms. Manifold had no training, no calibration, and no prior exposure to the object or task.

The results

Manifold: 98% success rate, zero-shot.
Physical Intelligence π0.5: 14% success rate, with training.
Both ran on consumer-grade hardware, same robot, same sensors, same task.

Why the refresh rate matters

Manifold ran at 60 Hz. Physical Intelligence’s π0.5 ran at 4.4 Hz. In robotics control, 20 Hz is the minimum refresh rate considered viable for real-time operation. Manifold sits comfortably above it; π0.5 sits well below.

The number measures more than frequency. Roughly every 17 milliseconds (one cycle at 60 Hz), Manifold recomputes its complete world state from available sensor data: forces, positions, material conditions, and their interactions. Each cycle is a deterministic physics computation, not a frame of video or a statistical inference. That is what lets the system react to contact events as they happen: the moment a finger meets the shell, before the shell breaks.
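To make the rates above concrete, the short sketch below converts each refresh rate into the time between control updates. It is plain arithmetic (period = 1 / rate); the function name and the labels in the dictionary are illustrative choices, not part of either system.

```python
def control_period_ms(rate_hz: float) -> float:
    """Return the time between control updates, in milliseconds."""
    if rate_hz <= 0:
        raise ValueError("rate_hz must be positive")
    return 1000.0 / rate_hz


# Rates quoted in the article; the labels are illustrative.
RATES_HZ = {
    "Manifold": 60.0,
    "pi0.5": 4.4,
    "20 Hz threshold": 20.0,
}

for name, rate in RATES_HZ.items():
    print(f"{name}: {control_period_ms(rate):.1f} ms per update")
```

At 4.4 Hz, about 227 ms pass between updates, roughly 13.6 times the 16.7 ms period at 60 Hz. Any contact event that begins just after an update goes unobserved for that long.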

Why physics-native beats pattern-matching on contact-rich tasks

VLA models learn from video. They are pattern matchers over recorded behavior. They cannot reason about contact mechanics (how forces transfer when objects touch), fracture physics (when and how a material yields), or the consequences of an action that has not yet appeared in their training distribution. An egg is a worst-case object for that approach: low force tolerance, brittle failure, no margin for over-grip.

Manifold reasons about those quantities directly. Contact, friction, and material yield are physics, not a learned correlation. The platform handles the egg the same way it handles any object whose behavior is governed by physical law: by running constitutive physics continuously, with deterministic solvers producing every state transition, and refining its parameters from live sensor input.
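As a rough illustration of why an egg leaves so little margin, the sketch below computes a grip-force window from textbook Coulomb friction: the grip must be firm enough that friction carries the object's weight, yet stay below the force at which the shell fails. This is not Manifold's solver, which the article does not describe. The two-finger pinch model, the function names, and every numeric value (mass, friction coefficient, crush force) are hypothetical placeholders, not measurements.

```python
G = 9.81  # gravitational acceleration, m/s^2


def min_grip_force(mass_kg: float, friction_coeff: float, contacts: int = 2) -> float:
    """Smallest normal force per finger (N) at which Coulomb friction
    across `contacts` fingers supports the object's weight."""
    return mass_kg * G / (contacts * friction_coeff)


def grip_window(mass_kg: float, friction_coeff: float, crush_force_n: float):
    """Return (min_force, max_force) in newtons, or None if no safe grip exists."""
    low = min_grip_force(mass_kg, friction_coeff)
    if low >= crush_force_n:
        return None  # holding the object would require more force than it tolerates
    return (low, crush_force_n)


# Hypothetical values for illustration only.
EGG_MASS_KG = 0.06
FRICTION_COEFF = 0.5
CRUSH_FORCE_N = 20.0

print(grip_window(EGG_MASS_KG, FRICTION_COEFF, CRUSH_FORCE_N))
```

With these placeholder numbers the lower bound is about 0.59 N. A slippery surface (a lower friction coefficient) raises that bound, and the window closes entirely once it reaches the crush force.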

One platform, generalized

Manifold wasn't designed for egg pick-and-place. It was built around constitutive physics that runs continuously and a deterministic world model that is indifferent to the specific object, task, or robot. Sensor data refines the world model; the underlying physics doesn't change. The result is attribution: predictions trace to physical causes, not to learned correlations. The Physical Intelligence comparison is one instance, but the implication is broader: generalized, runtime, physics-native world models are capable and extensible in ways that transformer-based VLA models can never be, no matter the volume of training data.