Social VR for every web browser

Here are Michela's most exciting take-aways from her time in Hubs by Mozilla at IEEEVR 2020.

IEEE VR 2020

Every day last week I woke up extra early, took my coffee into my Sydney home studio and spent a few hours, working in VR, at a conference based in Atlanta.

The Institute of Electrical and Electronics Engineers (IEEE), pronounced I-triple-E, presents conferences on Virtual Reality that are billed as “the premier international venue for presentation of VR research”. Based on last week's experience, it's not hard to see why. Due to Covid19 they moved their 2020 conference entirely into VR - at two weeks' notice. This in itself was an achievement.

It was all thanks to Hubs by Mozilla*, an experimental social platform that supports most, if not all, VR headsets, although a headset is not actually required. You can use Hubs from most web browsers. Hubs has been around for a few years but you could say it has now come of age.

The announcement that the IEEE Conference on Virtual Reality and 3D User Interfaces (IEEE VR) would be held solely in VR made me sit up and realise it was high time I revisited WebVR. This would be an historic event - not the first conference ever in VR - but certainly the largest. I registered as a volunteer to help monitor Hubs. It was a great experience and an opportunity to see world-class innovators at play. I sincerely hope real-life conferences can resume before too long but when needs must... the organisers' switch to VR was done extremely well. 

After a quick trip to Rhiannan Berry’s Avatar Customizer I was good to go. I opted for a more cartoony avatar but volunteers also had a “changing room” with a large number of variations of pre-rigged avatars to inhabit. Because Hubs is an open source platform, 3D artists can create their own avatars using tools like Blender. Hubs avatars can animate in response to their inhabitant's voice. My (avatar’s) head simply and cheerily varied in size as I spoke but the models available in the avatar gallery provided simple mouth movements.

Hubs at IEEEVR

It didn’t take me long to realise that Hubs by Mozilla is simply amazing. In another post I will share some tips and tricks for newcomers but the top line is simple - it works pretty much anywhere. I tested Hubs on multiple web browsers, on both PC and Mac, on iPhones, iPads, on Android devices including an old Samsung G7 phone with the original GearVR, and of course on several VR headsets. There are a few caveats, which I’ll get to later in this article.

After checking in for my first shift, via an Oculus Quest, I scanned the list of VR rooms and checked out all three parallel streams of the conference. The first hub I landed in had about 15 avatars positioned in front of a giant video wall displaying the presentation (a Twitch stream of a Zoom meeting showing slides and webcam). The majority of the avatars did not have hands, which indicated that they were connected from a desktop browser and not using VR equipment. The desktop view of Hubs can be a very practical choice because it allows people to connect from much older devices and to use text chat. You cannot type in Hubs using VR controllers.

VR controllers do however allow an extra level of body language. Presenters using VR controllers were able to point at their displays and use other hand gestures in a naturalistic way. I waved at my fellow volunteers in the room (using handheld Oculus Touch Controllers) before switching to another room. Another group of avatars faced another Twitch video stream showing a panel discussion taking place in VR. The stream showed a different Hubs room with the panellist avatars. Avatars watching a video stream of other avatars. Things can get very meta here! 

Why VR?

You may be wondering what is the point of going into VR if all you are going to do is look at video. Just like at a real world conference or cinema complex, where multiple rooms can show the same content, conference goers filled up different rooms to watch the same proceedings. Unlike the real world, however, there was technically no limit on the number of virtual rooms that could be created to meet demand. The only constraint was the conference budget paying for Amazon cloud resources.

All conference proceedings were recorded in Zoom and re-streamed via Twitch. You were never forced to go into VR to access any content. Why did people bother? For the social experience of participating at a conference in realtime. People moved through VR rooms for the immersive interactive experience and for the random encounters that you might experience while navigating the halls of a conference venue in real life (IRL). I spent a week moving between over 100 multi-user VR rooms, chatting with people I knew IRL as well as making new connections. 

For some people who haven’t experienced real VR, there is understandable skepticism. It sounds like an abstract novelty. Which it can be. But the value of immersion and bringing body language into digital online communications might be easier to appreciate under pandemic lockdown conditions. I think of VR as an optional presentation layer for those privileged enough to have it. This week was a great illustration of why there are many times when web video is not good enough.

We have, above, two images representing VR in 2020. A marketing image on the left, and on the right, how VR actually looked at a major academic conference. While there is nothing behind the glossy image, the image on the right is a screenshot I took while watching a team, in realtime, give their award-winning demo. Since all conference content was readily available for free download, the VR spaces were more about discovery and connections. 

Mozilla has built an open VR chat platform, focused on privacy and security, that can run on multiple devices. What do I mean by “run”? Well for starters, if you don’t have a VR headset then you’re not having an immersive experience but that’s not the point with Hubs. You are still in the conversation, hearing audio and reading text. You join “rooms” created by instancing a “scene” and these rooms are designed to be accessible from a desktop web browser. Which means that you can use a mouse and keyboard or mobile device to navigate as well as going immersive. 

Because Hubs is designed as a ‘cheap and cheerful’ VR chat platform, there is a limit of around 25 people per room. This isn’t a hard limit but a recommendation based on years of testing. For powerful desktop machines, you can apparently get away with around 40 connections but once a number of less powerful devices are connected (e.g. Oculus Quest) then these quickly become a bottleneck. Everyone in the room can hear everyone else’s audio and see their avatar movements. There is a limit to how much communications traffic can be handled via WebRTC. This is where streaming video comes in.
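To get a feel for why room size matters, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that each person in a room receives one audio/avatar stream from every other person; the real Hubs networking may batch or route traffic differently. The function names are hypothetical, and the room count treats the conference's reported 1,926 active participants as if they were all online at once, which is an upper bound rather than something that was measured.

```python
import math

ROOM_GUIDELINE = 25          # recommended people per room (a guideline, not a hard cap)
ACTIVE_PARTICIPANTS = 1926   # active participants reported for the conference


def rooms_needed(people: int, per_room: int = ROOM_GUIDELINE) -> int:
    """Smallest number of rooms that keeps every room at or under the guideline."""
    return math.ceil(people / per_room)


def streams_delivered(people_in_room: int) -> int:
    """Assumption: everyone receives one stream from each of the others."""
    return people_in_room * (people_in_room - 1)


print(rooms_needed(ACTIVE_PARTICIPANTS))   # 78
print(streams_delivered(25))               # 600
print(streams_delivered(40))               # 1560
```

Under that assumption, going from 25 to 40 people in a room multiplies the number of streams being delivered by about 2.6, which is one way to see why a single video stream on a wall is a cheaper way to reach a large audience.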

IEEEVR 2020 hosted over 1,960 registered delegates, of whom 1,926 were active participants. There were presenters from 15 countries and observers from many more. All without travel expenses, large carbon footprints or exposure to the Covid19 virus. The humble web page has moved on since I got started on the web in 1993.

For each VR demo accepted into the conference, the project team hosted a room and gave presentations during designated time slots. The one pictured above demonstrated an AR interface for law enforcement traffic stops. Outside of presentation times, authors and presenters hung out next to their displayed work, answering questions and giving impromptu talks to interested attendees. 

The above VR film presentation was in a space that was part production gallery and part production still. The orange-patterned sphere in the background was an actual 3D still, which you could enter, of a 2D animated VR film called Swing.

Without taking away from the achievements on display at IEEEVR 2020, to be fair, the experience at time of writing is still a bit clunky. There was no way to get a heads-up display of web information inside a Hubs room so the only way to navigate between rooms was to keep returning to the conference home page to select a new room from a list. Rooms could contain links to other rooms, like portals, but each link could only ever point to another specific room and if that room was full you couldn’t get in. As a result, what felt like the obvious user experience (flitting between VR rooms like surfing the web) was not as seamless as it will one day be. It is a known architectural issue that Mozilla is certainly aware of and working on, but despite this hassle it is certainly still possible to navigate. There are numerous VR chat room systems that do provide this capability but again, these rooms are not just web pages. With Hubs, there’s no special software to download and it works cross-platform.

Virtual Production

By the last day of the conference I was keen to host my own event. I registered a community of practice “Birds of a Feather (BOF)” session on the topic of Virtual Production and advertised it on the conference Slack. BOFs are a tradition at tech conferences for informal meetups and exchange on specific topics. We had around 15 active participants and several lurkers who connected with me afterwards. If more than 25 people had shown up we had parallel rooms and a video stream on stand-by.

Using the ability to drag and drop PDFs into the room, I was able to give a speed talk with slides, invite another team to present and then moderate a Q&A. One of the cool hidden features of Hubs - the magnifying glass icon on objects - is being able to remote control slides from anywhere in the room. I discovered this feature while chatting with folks in VR the day before. 

I had the equivalent of a hallway conversation, which led to a group of avatars wandering through various rooms together and eventually saw me pulling out some of my work to show as a PDF. After lugging heavy laptops and tablets around the world for decades, it is hard to over-emphasise how simple by comparison it was to put together a gathering spanning New York, Johannesburg and Sydney in a VR chat room.

95/ What's great about @MozillaHubs is that you can paste in a PDF & share it to the room. @michela gave an impromptu presentation of a node-based live editor. Vimeo links didn't resolve, but this real-time live presentation shows the 3DUI innovations

— Kent Bye VoicesOfVR (@kentbye) March 28, 2020

After my "hallway demo" I was able to refine the presentation the next day (switching from Vimeo to YouTube as Vimeo links don't currently work in Hubs), testing playback in my own purpose-built theatre and hosting the BOF.

Inspired by all this, last week we opened Mod’s first Hubs room - accessible via our Discord server - and we will be hosting 11am Sydney time drop-ins for anyone who wants to come by, chat and explore.

State of the Web

I began my career building the open Web and while I’ll always be a “webdev” of sorts, a good chunk of my time these days revolves around experiences created with proprietary game engines. IEEE VR 2020 was a timely reminder of how and why open source exists and why, at a time when the world is changing fast, accessibility sometimes matters more than aesthetics.

In the last few years I’ve been focused on real-time virtual production, popularised by big budget shows like The Lion King and The Mandalorian, and I’d lost touch with recent progress in the world of WebAR and WebVR. When the first mainstream WebXR software stack appeared in 2016 (three.js + A-Frame) I took the time to do several Hello, World examples but, to be honest, I lost interest.

As a technologist I’m a huge fan of open source technology and culture but as a director it’s hard not to be drawn to platforms offering the most cinematic visuals. A backwards-compatible web browser can’t compete with the features offered by the latest commercial game engines but what the web offers is an open communications platform. That’s what makes Hubs, built with A-Frame, so interesting. It’s taken a pandemic to remind me why open source media platforms deserve their place in the spotlight.

The web technology world has moved on since I last looked at “WebVR”, a term no longer in favor. In Oct 2019 the working draft of the WebXR Device API was released. It makes a lot of sense to change the name to XR. The lines between AR, VR and MR are fluid at the concept stage. Choice of medium often has as much as anything to do with available resources. As scope changes, often the medium does, too. Supporting an “immersive XR device” is now the focus. Expect to see more and more “web layers” appearing for other apps, including in VR. As developers and manufacturers support the underlying OpenXR standard, which standardises communication between applications such as web browsers and XR hardware, the power and potential of WebXR will continue to grow.

What is perhaps one of the most powerful features of Hubs is that rooms can be created in another part of the system, a web-based 3D editor called Spoke. Spoke will look familiar to anyone who has ever done any 3D modelling but it is more of a layout tool. Using a large library of stock objects, as well as any GLB files, images and links you provide, it is pretty straightforward to create your own custom 3D environment. The key to this is the ability to remix (i.e. duplicate and tweak) an existing public scene. For my first effort at creating my own room I remixed (modded) a scene created by Mozilla, who have made a large library of Hubs stock objects available on Sketchfab under a Creative Commons Attribution License.

Backwards compatibility is key

When you run a studio business like Mod, the ready availability of advanced technology means that you effectively operate in a bubble of privilege. It is too easy to make the dangerous assumption that the resources we have, and the digital literacy required to use them, are more widely available than they are. Which is why, when evaluating remote collaboration platforms for any event, let alone an international conference, there are some fundamental blockers. How high should the bar be for entry? How expensive a device do you need? What level of support is feasible? What is it all going to cost the participant?

With all these considerations, my golden rule has been to try and support as broad a range of platforms as possible. There are always compromises and unfortunately in the competitive world of media and entertainment production, the flashiest graphics often attract the most attention.

The visual design possible in a web browser is relatively modest compared to what a game engine offers, but the web browser experience is far more accessible, due to its conformance to open standards. In the short term, projects like our A Clever Label documentary, which relies on the grunt of the UE4 game engine, are not viable to port entirely to WebXR but we can certainly provide something for the web-only audience.

Data portability is key

Efforts in building virtual spaces, objects and avatars could do worse than follow the Mozilla model. The platform itself is open source, and the content is created using open formats (predominantly glTF).

In evaluating Hubs I exported content from one of my UE4 projects as FBX, converted the FBX to glTF in Blender, packaged the glTF as a GLB binary and then simply dragged and dropped it into my browser. A microphone and stand prop, to facilitate Q&As, popped into the world within a few seconds as an object that can be immediately picked up and manipulated by anyone in the room, with or without virtual hands.
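For the curious, "packaging as a GLB" just means wrapping the glTF JSON (plus any binary mesh or texture data) in a small binary container. The Python sketch below is not what Blender does internally and not the workflow I used; it is a minimal illustration of the GLB 2.0 container layout using only the standard library. The function names and the one-node scene are made up for the example.

```python
import json
import struct

GLB_MAGIC = 0x46546C67    # ASCII "glTF", read as a little-endian uint32
JSON_CHUNK = 0x4E4F534A   # ASCII "JSON"
BIN_CHUNK = 0x004E4942    # ASCII "BIN" followed by a zero byte


def _pad(data: bytes, fill: bytes) -> bytes:
    """Pad a chunk to a 4-byte boundary, as the GLB layout requires."""
    return data + fill * (-len(data) % 4)


def pack_glb(gltf: dict, binary: bytes = b"") -> bytes:
    """Wrap a glTF JSON document (and optional buffer data) in a GLB container."""
    json_bytes = _pad(json.dumps(gltf, separators=(",", ":")).encode("utf-8"), b" ")
    chunks = struct.pack("<II", len(json_bytes), JSON_CHUNK) + json_bytes
    if binary:
        bin_bytes = _pad(binary, b"\x00")
        chunks += struct.pack("<II", len(bin_bytes), BIN_CHUNK) + bin_bytes
    # 12-byte header: magic, container version, total file length
    header = struct.pack("<III", GLB_MAGIC, 2, 12 + len(chunks))
    return header + chunks


def read_glb_json(glb: bytes) -> dict:
    """Check the GLB header and return the embedded glTF JSON."""
    magic, version, total = struct.unpack_from("<III", glb, 0)
    if magic != GLB_MAGIC or version != 2 or total != len(glb):
        raise ValueError("not a valid GLB 2.0 file")
    length, kind = struct.unpack_from("<II", glb, 12)
    if kind != JSON_CHUNK:
        raise ValueError("first chunk must be JSON")
    return json.loads(glb[20:20 + length].decode("utf-8"))


# Hypothetical minimal scene: a single empty node named after the prop.
scene = {
    "asset": {"version": "2.0"},
    "scene": 0,
    "scenes": [{"nodes": [0]}],
    "nodes": [{"name": "microphone_stand"}],
}
glb = pack_glb(scene)
print(glb[:4])                                  # b'glTF'
print(read_glb_json(glb)["nodes"][0]["name"])   # microphone_stand
```

Because everything ends up in one self-describing file, a GLB is convenient to drag and drop: there are no separate texture or buffer files to lose along the way.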

Choose your hubs carefully

What I was reminded of by last week’s experience was that when choosing an online platform, especially under crisis conditions, there are so many considerations to make. A key one is thinking how any experience is going to be practically maintained. Web apps like Hubs may not have star appeal but they are often the only option available, supporting an incredibly broad ecosystem of old devices - not just the latest cool product.

Here at Mod we have built our own web framework, Rack&Pin, that allows us, at a pinch, to operate and maintain our experiences independent of any 3rd party cloud services. For us to integrate something like Hubs is a no-brainer. There is a free cloud option but you can always download the open source code and tinker with it. The organisation behind it, Mozilla, has built its reputation as a trusted developer since the early 2000s. We will be monitoring the takeup of Hubs with great interest and exploring how our game engine titles can integrate with it. There are no perfect choices when selecting your communications hubs but it doesn’t hurt to have a reputable vendor like Mozilla building the backend.

It is early days but forward-thinking engineering and design patterns enabled this milestone. We will always have the likes of Facebook, which are fish traps designed to herd us into proprietary spaces in order to leverage our data without our fully informed consent or, for many users worldwide, even awareness. Due to Covid19 the world has changed this year and it is a good time to think about how we might want to live in it going forward. It's nice to think of a world where I'll be able to grab a mic and sing with friends online without having to hand over my personal data and be subjected to constant advertising.

For a guided tour or just to chat, drop into our Open House, Monday - Friday, 11-11.30am Sydney time. Hope to see you there.

*Disclaimer: Neither Mod nor Michela has any commercial interest in Mozilla or its products.

