Harish Venkataraman on why sensing is becoming the first layer of AI
At Image Sensors Europe 2026, much of the conversation centred on integration, system design, and the limits of pixel scaling. But Harish Venkataraman, Director of Camera and Sensing Architecture and Systems at Meta, approached that shift from a different angle. Not by asking how sensors improve, but by asking what role they play in a world increasingly defined by AI. Because if AI is the defining technology of this decade, then sensing is where it begins.

One of the simplest, and most important, points in Venkataraman’s talk is that most AI systems still begin with visual data: images, video, depth. That hasn’t changed. What has changed is what happens next. Sensors are no longer capturing images for a human viewer; they are feeding systems that need to interpret, react, and act, often in real time and under strict constraints. That shifts the definition of performance. It is no longer just about clarity or resolution, but about relevance, timing, and efficiency: what information is captured, how quickly it can be processed, and whether it can support decision-making in the moment.
“It’s not a question anymore of if… it’s about how these sensors become more powerful.”
“More powerful” here doesn’t simply mean more pixels. It means becoming a more effective input to AI.
A system shaped by constraints
That shift introduces a tension that runs through much of the current innovation in imaging. On one side, traditional expectations remain. High resolution, high dynamic range, low noise. These are still essential, particularly for photography and video. On the other, AI-driven systems introduce a different set of pressures. They need to be always on, respond instantly, and operate within tight power and form factor limits.
“There are two competing requirements… high-quality imaging that never goes away, and low-power sensing that enables this future world.”
This tension becomes especially visible in devices like smart glasses and AI wearables, where sensing is continuous rather than occasional. The camera is no longer something that is activated; it is persistent. And that persistence makes power one of the defining constraints. You can’t simply increase performance without consequence. Instead, the system has to adapt.
That is why the architecture itself is starting to change. Rather than relying on a fixed pipeline, sensing and processing are becoming more distributed. Some decisions happen close to the sensor, where latency matters. Others are deferred to edge compute or the cloud, depending on what the application can tolerate. In that sense, the image sensor is no longer the start of a linear process. It is part of a dynamic system.
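To make that more concrete, here is a minimal sketch of how such a pipeline might route work between near-sensor logic, on-device compute, and the cloud based on latency and power budgets. The tier names, thresholds, and fields are entirely assumed for illustration; nothing here comes from the talk.

```python
# Illustrative sketch of tiered routing in a distributed sensing pipeline.
# All tiers, thresholds, and field names are assumptions, not a real API.

from dataclasses import dataclass
from enum import Enum, auto


class Tier(Enum):
    NEAR_SENSOR = auto()   # wake-up / region-of-interest logic on or beside the sensor
    EDGE = auto()          # on-device NPU or application processor
    CLOUD = auto()         # deferred, higher-capacity processing


@dataclass
class Request:
    latency_budget_ms: float   # how long the application can wait for a result
    battery_fraction: float    # remaining battery, 0.0 to 1.0
    needs_large_model: bool    # whether the task exceeds on-device model capacity


def route(req: Request) -> Tier:
    """Pick a processing tier from latency and power constraints (illustrative only)."""
    if req.latency_budget_ms < 10:
        # Hard real-time reactions stay close to the sensor.
        return Tier.NEAR_SENSOR
    if req.needs_large_model and req.latency_budget_ms > 200:
        # Latency-tolerant, heavyweight tasks can be deferred off-device.
        return Tier.CLOUD
    if req.battery_fraction < 0.15:
        # When power is scarce, avoid heavy local compute: offload if latency
        # allows, otherwise fall back to minimal near-sensor processing.
        return Tier.CLOUD if req.latency_budget_ms > 200 else Tier.NEAR_SENSOR
    return Tier.EDGE


if __name__ == "__main__":
    print(route(Request(latency_budget_ms=5, battery_fraction=0.8, needs_large_model=False)))
    print(route(Request(latency_budget_ms=500, battery_fraction=0.8, needs_large_model=True)))
```

The specific numbers matter far less than the shape of the decision: latency-critical reactions stay close to the sensor, while heavyweight, latency-tolerant work is deferred further away.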
From camera to perception layer
What emerges from this is a broader redefinition of the camera’s role. It is no longer just a device for capturing images, but part of a perception layer that combines sensing, compute, and models to interpret the world.
“The hardware does just enough, and works paired with machine learning to get the whole picture.”
This is also where depth sensing becomes more central. For many tasks, 2D imaging is sufficient, but as soon as systems need to understand space – to align digital content, track movement, or interact with objects – depth becomes difficult to avoid. It adds a dimension of understanding that cannot be reliably reconstructed from RGB alone. Increasingly, it is not used in isolation. Systems are combining RGB, depth, and other inputs such as IMUs to build a more complete and stable model of the environment.
This is particularly relevant in robotics, where sensing directly drives action. Vision is not just used for recognition, but for navigation, manipulation, and interaction with the physical world. In that context, the sensor is not just feeding AI: it is shaping what AI is capable of doing.
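As a rough illustration of the fusion pattern described above, the sketch below uses a simple complementary-filter structure: high-rate IMU integration carries a pose estimate between camera frames, and a lower-rate RGB-D observation pulls the estimate back as drift accumulates. The types, values, and blend factor are assumptions made for illustration, not details from the talk.

```python
# Illustrative complementary-filter sketch of multi-sensor fusion: fast IMU
# updates carry the state between frames, slower RGB-D observations correct
# the accumulated drift. Values and the blend factor are assumptions.

from dataclasses import dataclass


@dataclass
class Pose2D:
    x: float
    y: float
    heading: float  # radians


def predict_from_imu(pose: Pose2D, dx: float, dy: float, dtheta: float) -> Pose2D:
    """High-rate prediction: integrate IMU-derived motion since the last update."""
    return Pose2D(pose.x + dx, pose.y + dy, pose.heading + dtheta)


def correct_from_depth(pose: Pose2D, observed: Pose2D, alpha: float = 0.2) -> Pose2D:
    """Low-rate correction: blend in a pose estimated from RGB-D tracking.

    alpha sets how strongly the camera observation pulls the state toward
    its estimate; IMU integration alone drifts over time.
    """
    return Pose2D(
        (1 - alpha) * pose.x + alpha * observed.x,
        (1 - alpha) * pose.y + alpha * observed.y,
        (1 - alpha) * pose.heading + alpha * observed.heading,
    )


if __name__ == "__main__":
    pose = Pose2D(0.0, 0.0, 0.0)
    # Several IMU steps between camera frames...
    for _ in range(5):
        pose = predict_from_imu(pose, dx=0.02, dy=0.0, dtheta=0.01)
    # ...then one depth-based correction when an RGB-D frame arrives.
    pose = correct_from_depth(pose, observed=Pose2D(0.09, 0.01, 0.04))
    print(pose)
```

Real systems use far more sophisticated estimators, but the division of labour, fast but drift-prone inertial updates corrected by slower, grounded visual ones, is the point the sketch tries to capture.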
The shift behind the shift
Taken together, these changes point to something deeper than a new set of specifications. The image sensor is no longer a standalone component, optimised in isolation. Its value is increasingly defined by how well it integrates into a larger system: one that includes compute, software, power constraints, and real-world use cases. And that system is being built for AI.
The question is no longer how good an image looks. It is how useful that image is. Not how many pixels are captured, but what those pixels enable.