Bex S.A. Infraestrutura Brasileira de Mobilidade Autônoma S.A.

The schools of vehicular autonomy

There is a fundamental debate in the world of autonomous driving that defines how each company builds its vehicles. The question is simple: how should a car see the world?

The answer to that question divides the industry into three distinct schools. Each one carries different assumptions about cost, scalability, and the role of artificial intelligence. To understand these schools is to understand the future of mobility, and to understand why Bex made the choice it did.


The sensor-heavy school

Waymo. Cruise. Aurora. Motional.

The first generation of autonomous vehicles was built by stacking sensors. A Waymo car carries cameras, radars, LiDARs, and ultrasonic sensors, each covering a different band of the spectrum and each generating a different layer of data about the environment.

The logic is redundancy: if the camera fails, the LiDAR compensates. If the LiDAR cannot see a transparent object, the radar detects it. The fusion of multiple sensors creates a rich and reliable representation of the surrounding world.
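That fallback logic can be made concrete with a minimal sketch. Everything below is illustrative: the Detection type, the confidence threshold, and the sensor names are hypothetical simplifications, not any company's actual fusion pipeline.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """A hypothetical, simplified obstacle detection from one sensor."""
    sensor: str        # "camera", "lidar", or "radar"
    distance_m: float  # estimated distance to the obstacle
    confidence: float  # 0.0 to 1.0

def fuse(detections: list[Detection]) -> float | None:
    """Redundancy in miniature: trust any sensor that still reports
    with reasonable confidence, and take the most conservative
    (closest) distance estimate among them."""
    usable = [d for d in detections if d.confidence >= 0.5]
    if not usable:
        return None  # no sensor can see the obstacle
    return min(d.distance_m for d in usable)

# A camera blinded by direct sunlight is simply outvoted by LiDAR and radar.
readings = [
    Detection("camera", distance_m=0.0, confidence=0.1),   # degraded
    Detection("lidar", distance_m=42.3, confidence=0.95),
    Detection("radar", distance_m=41.8, confidence=0.90),
]
print(fuse(readings))  # -> 41.8
```

Taking the closest distance among trusted sensors is a deliberately conservative choice; real fusion stacks weigh sensors probabilistically, but the redundancy principle is the same.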

The problem is cost. A single high-resolution LiDAR sensor can cost more than a popular Brazilian car. Multiply that by four or five units per vehicle, add the radars and the computation required to fuse everything in real time, and the cost per vehicle becomes prohibitive for operation at scale.
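To make the order of magnitude concrete, a back-of-the-envelope comparison. The figures below are round placeholder numbers chosen for illustration, not quotes from any supplier:

```python
# Illustrative, hypothetical prices in US dollars -- not real quotes.
LIDAR_UNIT = 10_000   # one high-resolution LiDAR
RADAR_UNIT = 150      # one automotive radar
CAMERA_UNIT = 50      # one automotive camera module
FUSION_COMPUTE = 5_000  # real-time sensor-fusion computer

sensor_heavy = 5 * LIDAR_UNIT + 6 * RADAR_UNIT + 8 * CAMERA_UNIT + FUSION_COMPUTE
vision_only = 8 * CAMERA_UNIT + 1_000  # cameras plus a modest computer

print(f"sensor-heavy: ~${sensor_heavy:,} per vehicle")  # ~$56,300
print(f"vision-only:  ~${vision_only:,} per vehicle")   # ~$1,400
```

Even if every number above is off by a factor of two, the gap between the two architectures remains more than an order of magnitude.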

Waymo operates limited fleets in cities mapped centimeter by centimeter. Within those bounds it works extraordinarily well. But scaling to millions of vehicles, in thousands of cities, with precarious road infrastructure? That is where the sensor-heavy model runs into a structural limit.


The vision-only school

Tesla FSD. Comma.ai OpenPilot.

The second school makes a radical bet: cameras are enough. If a human drives using only two eyes, a computer with eight cameras and sufficient artificial intelligence should be able to do the same.

Tesla is the most well-known case. It removed radar from its vehicles in 2021 and ultrasonic sensors in 2022, betting exclusively on cameras and neural networks. Elon Musk’s argument: LiDAR is a crutch. The real solution is teaching the software to see like a human, and that requires data, not expensive sensors.

Comma.ai, with OpenPilot, follows the same philosophy with an open-source approach. A single camera and a simple device installed in the vehicle, running models that learn from millions of kilometers driven by real drivers.

The advantage is economic, and it scales. Cameras cost pennies compared to LiDAR. Any vehicle with cameras can, in theory, run the software. And every kilometer driven improves the model, creating a virtuous cycle of data and intelligence.

The disadvantage is that pure vision depends entirely on the quality of the model. In adverse conditions — heavy rain, fog, direct sunlight on the lens — the camera struggles. And without sensor redundancy, the system needs to be extraordinarily reliable in processing what the camera captures.


The hybrid school

Mobileye. Huawei ADS. Xpeng XNGP. BYD.

The third school seeks a pragmatic middle ground. Cameras as the primary sensor, because they are cheap and scalable, with radar as backup for critical situations. Little or no LiDAR in the production vehicle, but an architecture that does not depend exclusively on vision.
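The arbitration between primary and backup sensors can be sketched in a few lines. The function and the confidence threshold below are hypothetical, meant only to show how a camera-primary, radar-backup policy differs from full sensor fusion:

```python
def hybrid_range(camera_range_m, camera_conf, radar_range_m):
    """Hypothetical camera-primary, radar-backup arbitration.
    The camera's richer estimate wins while it is confident;
    radar takes over when vision degrades (rain, glare, fog)."""
    if camera_range_m is not None and camera_conf >= 0.7:
        return camera_range_m   # primary sensor path
    return radar_range_m        # safety-net path

print(hybrid_range(35.2, 0.9, 36.0))  # clear day  -> 35.2 (camera)
print(hybrid_range(None, 0.0, 36.0))  # heavy rain -> 36.0 (radar)
```

Unlike the fusion sketch earlier, the backup sensor is consulted only when the primary one fails, which is what keeps the bill of materials low.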

Mobileye, an Intel subsidiary, is the reference for this approach. Its computer vision chips already equip more than 150 million vehicles worldwide. The EyeQ system processes video in real time at low power, and radar complements detection at long range and in low-visibility conditions.

Huawei ADS and Xpeng XNGP follow similar paths in China: computer vision as the backbone, radar as the safety net. They can operate in cities without prior mapping — something the sensor-heavy school still does not do well.

The hybrid model is the fastest-growing globally, because it balances cost, safety, and scalability. It is not the most elegant solution, but it is the most pragmatic one for bringing autonomy to millions of vehicles in the coming years.


The trend is clear

The world is converging toward vision. Even companies born in the sensor-heavy school are reducing their dependence on LiDAR. Aurora, which develops autonomous trucks, is already working with leaner sensor configurations. Mobileye plans to remove LiDAR from its next-generation systems.

The reason is mathematical: cameras get better and cheaper every year. AI models get more capable with each generation. LiDAR gets cheaper too, but it will never cost pennies like a camera. And the volume of data that cameras generate — combined with the advancement of computer vision models — is closing the performance gap.


Where Bex stands

Bex is vision-first by conviction and by necessity.

By conviction, because we believe computer vision is the scalable path to autonomy. If the goal is to put intelligence in millions of Brazilian vehicles, the sensor needs to be accessible. Cameras are.

By necessity, because the Brazilian reality demands scale before sophistication. Brazil operates at L0 — with zero autonomy infrastructure. We do not have the luxury of starting with limited fleets of expensive vehicles in mapped cities. We need a broad data collection network, today, with hardware that any driver can install.

That is exactly what Bex Cam does. A camera — on a phone or on a dedicated device — collecting video and telemetry from Brazilian traffic. Every kilometer driven feeds the dataset that will train Bex Pilot, the Bex autonomous driving stack.
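To make that pipeline tangible, here is a hypothetical sketch of what a single telemetry frame could look like. Bex Cam's actual schema is not described in this document, so every field name below is an assumption for illustration only:

```python
from dataclasses import dataclass, asdict
import json
import time

@dataclass
class TelemetryFrame:
    """Hypothetical shape of one Bex Cam record; the real schema
    is not public, so all fields here are illustrative."""
    timestamp: float      # Unix time of the video frame
    lat: float            # GPS latitude
    lon: float            # GPS longitude
    speed_kmh: float      # vehicle speed from GPS or OBD
    heading_deg: float    # compass heading
    video_chunk_id: str   # key of the associated video segment

frame = TelemetryFrame(
    timestamp=time.time(),
    lat=-23.5505, lon=-46.6333,  # São Paulo, for illustration
    speed_kmh=42.0, heading_deg=187.5,
    video_chunk_id="chunk-000123",
)
print(json.dumps(asdict(frame)))  # one JSON line per frame, easy to batch-upload
```

The point of a flat, per-frame record like this is operational: it can be produced by a phone as easily as by dedicated hardware, which is what a broad collection network requires.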

The school we chose is not the most conservative. But it is the only one that scales for a continental country with 50 million vehicles and zero autonomy infrastructure.

We start with the camera. The rest comes after.