Appu Shaji, CEO of Mobius Labs, looks at the forces shaping the metaverse and the role of computer vision in the future of these immersive platforms.
Although recent announcements make it sound as if the metaverse is something new, it has existed in bits and pieces for a number of years. Virtual reality goggles have been with us since at least the early 1990s. A decade later, Second Life enabled players to participate as avatars in what was then a cutting-edge virtual world. More recently, we’ve seen the hype around web 3.0 and decentralised finance as a replacement for traditional banking and business relationships.
None of these things is enough on its own to make the metaverse a reality, and not all of them will live up to their promise. But when you combine the different elements and throw AI into the mix, you have the essential ingredients for platforms that are truly immersive, capable of massive scale and commercially viable.
There is another good reason why everyone from big tech to startups has been stampeding to announce a metaverse strategy. We are seeing the beginnings of social media and subscriber fatigue affecting Wall Street favourites such as Netflix and Peloton. Facebook recently announced its first-ever quarterly fall in daily users. No wonder everyone is looking for the next big thing to keep investors happy and subscriber numbers on an upward trend.
Another useful comparison is with the smartphone revolution. Although the launch of the iPhone in 2007 marked a critical moment in the advance of mobile technology, it took another four or five years for networking technology (4G), developers (the App Store) and mass consumer adoption to catch up. I think we are in a similar position today: roughly five to ten years away from something genuinely transformative, built on the convergence of technologies from an array of innovators.
Place your meta-bets
At this stage, we don’t know who the Instagrams, Airbnbs or Ubers of this emerging marketplace will be. But we can predict some of the qualities that will separate the winners from the losers.
While some big tech names have posted disappointing results in the latest quarter, media-driven channels such as TikTok and YouTube, which cater largely to a younger demographic, are killing it. They have found new ways to grow and engage with their audiences, much as Facebook did in 2007. The metaverse, with its promise of 3D levels of immersion and engagement, feels like a logical step forward, although progress will be extremely volatile.
It is also safe to predict that brands will spend vast sums of money to promote themselves in the metaverse. Nike has set up Nikeland[1] and Adidas has bought virtual real estate in Sandbox. These are risky bets, but contemporary consumer brands seeking first-mover advantage can afford to fail fast in return for a commercial head start.
What about the technologies that will drive the metaverse? Headsets have been with us for several years, at least in the gaming sector. Augmented reality suffered a false start with Google Glass. Since then, companies such as Snap[2] have launched a series of devices, but none of them has set the world on fire. We are still waiting for the iPhone of smart glasses, although I have no doubt that it will emerge in the next five years.
AI, and computer vision in particular, will play a central role in the commercialisation of the metaverse in the coming years. The building of virtual and augmented worlds is generating huge volumes of images, graphics, animations and more. Computer vision is what makes this visual universe discoverable and commercially viable.
The very first search engines relied on teams of employees to index websites. But it took Google to realise the potential of automated indexing, which went on to fuel the exponential rise of the e-commerce economy. Similarly, the future of the metaverse depends on being able to automate the creation and indexing of visual content on a massive scale. Computer vision has already shown that it can do just this in existing media organisations.
There’s more to this than pure object recognition. The latest computer vision tools can recognise expressions, emotions and moods, and can apply these insights not just to static images but to video as well. This enables media organisations to index content against a plethora of search terms, making it more discoverable to customers. The same will be true of the metaverse.
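To make that idea a little more concrete, here is a minimal sketch of how automated mood tagging might look, assuming a general-purpose CLIP-style model from the Hugging Face transformers library rather than any specific vendor's technology; the label list and file name are purely illustrative.

```python
# Minimal sketch: zero-shot mood tagging of a video frame with a CLIP-style model.
# Assumes the Hugging Face `transformers` package; labels and file path are
# illustrative stand-ins, not a production taxonomy or pipeline.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

labels = ["joyful", "melancholic", "tense", "serene", "energetic"]
image = Image.open("frame_0001.jpg")  # e.g. a frame sampled from a video

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image  # image-text similarity scores
probs = logits.softmax(dim=-1)[0]

# Keep the strongest tags so the asset becomes searchable by mood.
tags = [label for label, p in zip(labels, probs.tolist()) if p > 0.2]
print(tags)
```

Run over every frame or asset in a catalogue, this kind of tagging is what turns a pile of visual content into something that can be searched and monetised.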
The (visual) search for opportunities
Another early opportunity for computer vision is advertising. Right now, it’s easy to imagine walking down a virtual street in the metaverse, much as you would down Piccadilly Circus in London or through Mitte in Berlin, with eye-catching advertising projected into your field of view. But this is too simplistic and based on old-fashioned models. Sure, there’ll be virtual billboards, but somebody will find a way of connecting advertisers with customers in ways that we haven’t yet discovered. The only thing that is certain is that computer vision will be at the heart of it.
Beyond advertising, the commercial opportunities are boundless. Imagine you’re a world-famous gallery trying to attract visitors to a virtual version of your museum. Suddenly, you’re not only competing against every other gallery in your city, but you’re also up against every gallery on the planet. What if you could promote your collections to audiences who are not only connoisseurs but whose taste matches the artists and artworks on display in your virtual gallery? Visual search, powered by computer vision, makes this possible.
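As a rough illustration of how such taste matching could work under the hood, the sketch below ranks a gallery's artworks by their similarity to images a visitor has already engaged with. It assumes the images have been embedded by some image encoder (such as the CLIP model above); the vectors here are random stand-ins rather than real data.

```python
# Minimal sketch of taste-based visual search: rank artworks by similarity
# to a visitor's liked images. Embeddings are hypothetical placeholders.
import numpy as np

def normalise(x: np.ndarray) -> np.ndarray:
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Hypothetical precomputed embeddings: 1,000 artworks and 20 liked images,
# each represented as a 512-dimensional vector.
artwork_embeddings = normalise(np.random.randn(1000, 512))
visitor_likes = normalise(np.random.randn(20, 512))

# A simple "taste profile": the mean of the visitor's liked-image embeddings.
taste_vector = normalise(visitor_likes.mean(axis=0, keepdims=True))

# Cosine similarity between the taste profile and every artwork,
# then surface the ten closest matches for the virtual exhibition.
scores = (artwork_embeddings @ taste_vector.T).ravel()
top_matches = np.argsort(scores)[::-1][:10]
print(top_matches)
```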
Real estate is another huge opportunity for growth. Right now, we’re on the cusp of making virtual property tours a reality. But in the future, you will also be able to visit the neighbourhood, check out the nearest shops, cafes, and gyms. We are also starting to explore the potential for workplace training. Massively expensive simulators used to be the preserve of airlines and trainee pilots. Smart glasses will enable organisations to train and onboard employees from factory tool operators to data scientists at a fraction of the cost.
Of course, all these predictions come with caveats. We are still coming to terms with the ways that tech companies track end users and harvest their data. The metaverse will be subject to similar, if not greater, debate. Eye tracking and facial recognition are just two capabilities whose role must be considered carefully in order to protect the integrity of the digital universe.
But I am optimistic that we will learn the lessons of web 2.0 and that over the coming decade the metaverse will come into being. AI and computer vision will be the enabling technologies, along with a new generation of smart glasses and elements of decentralisation. Rather like a virtual reality roller coaster ride, the next few years will have many ups and downs. But one thing is for sure – it will be daunting and thrilling in equal measure.
Appu Shaji is the CEO of Mobius Labs, a Berlin-based startup giving machines a human-like power of perception. You can find more information about the company on its website.