Recently, Microsoft guru Alex St. John posted an acute rant on video conferencing. He asks how we live in a world where we drive Teslas and VR is more than a possibility, yet video conferencing is stuck in the dark ages. What gives?
As the founder of a video conferencing startup, I wholeheartedly agree with St. John. Some of the brightest minds are creating crazy technology around us – plant-based meat, cancer-detecting therapies, reusable rockets. Meanwhile, the state of video conferencing is abysmal.
Four and a half years ago, my co-founder and I decided to enter the world of video conferencing anyway.
Here are four lessons I’ve learned along the way on why video conferencing sucks and what we can do to make it the new tech hero.
The tyranny (and necessity) of the 10 percent
Prevailing wisdom in developing a new product is that a company should focus on making 90 percent of use cases awesome and ignore the 10 percent that doesn’t add to the value proposition.
The trouble is that in an established market, you often can’t ignore that 10 percent. Say you want to build a new car. You’ll have to include features from existing cars even if only 10 percent of people use them, like cruise control. Predecessor technologies create a very high floor for “minimum features required.” You’ll include cruise control in your new car because everyone else does, even if your team of designers and engineers shows you data on how small a part of the population actually uses it.
The same goes for video conferencing. For example, if 90 percent of calls are between groups of five people or fewer, you may feel you only need to support something like 8-way calls. That seems like a fair bar, but there is still that pesky 10 percent that needs more. So rolling out 25- or 50-way call support may be necessary even though few calls of that size take place.
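To make the arithmetic concrete, here is a minimal sketch in Python. The call-size counts are invented for illustration and are not real usage data, and `coverage` is a hypothetical helper that reports the share of calls fitting under a given participant cap.

```python
from collections import Counter

# Hypothetical histogram: participants per call -> number of calls.
# These numbers are made up for illustration; they are not real usage data.
CALL_SIZES = Counter({2: 550, 3: 200, 4: 100, 5: 50, 8: 40, 12: 30, 25: 20, 50: 10})


def coverage(call_sizes, max_participants):
    """Return the fraction of calls with at most max_participants people."""
    total = sum(call_sizes.values())
    supported = sum(n for size, n in call_sizes.items() if size <= max_participants)
    return supported / total


# Show how much of the (invented) call volume each cap would support.
for cap in (5, 8, 25, 50):
    print(f"{cap:>2}-way cap covers {coverage(CALL_SIZES, cap):.0%} of calls")
```

With this made-up distribution, an 8-way cap still leaves about 6 percent of calls unsupported, and that small group is what forces the higher limit.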
Take responsibility for issues that aren’t necessarily your own
Video conferencing sits on top of a deep technology stack to deliver its functionality. And it’s among the most demanding of applications at the top of that stack, in terms of its bandwidth, latency and jitter requirements.
A simplified summary of this technology stack looks a little like this: applications, real-time audio/video codecs, call state and signaling, operating systems, hardware endpoints, cameras, microphones, Bluetooth headsets, headphones, Wi-Fi, routers, firewalls, proxies, hops, internet service providers, data centers, cloud service layers, and the bridging between legacy phone networks and an IP network that wasn’t designed for real-time applications in the first place.
You get the picture—it’s complex.
To build a successful, scalable service, we have to invest in infrastructure to address and solve the plethora of problems customers run into, even if those problems are in parts of the stack we don’t control. Why? Because all the user really knows is that her video conferencing call just failed. We have never received a support request where an end user tells us their QoS policy on their firewall is degrading their video conferencing experience.
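One way to picture that kind of infrastructure is a small diagnostic that turns raw call measurements into a plain-language guess about which layer to look at first. The sketch below is only an illustration: the `CallStats` fields, the `likely_cause` function, and every threshold are assumptions made up for this example, not values from any real product or standard.

```python
from dataclasses import dataclass


@dataclass
class CallStats:
    """Measurements a client might collect during a call (hypothetical fields)."""
    packet_loss_pct: float    # share of media packets lost, in percent
    jitter_ms: float          # variation in packet arrival time
    round_trip_ms: float      # network round-trip time
    media_port_blocked: bool  # True if media could not flow directly


def likely_cause(stats: CallStats) -> str:
    """Guess which layer of the stack to investigate first.

    Thresholds are illustrative assumptions, not industry standards.
    """
    if stats.media_port_blocked:
        return "firewall or proxy: media traffic is being blocked or rerouted"
    if stats.packet_loss_pct > 5.0:
        return "local network: heavy packet loss, often Wi-Fi or a congested router"
    if stats.jitter_ms > 30.0:
        return "network path: unstable delivery, possibly traffic shaping or a QoS policy"
    if stats.round_trip_ms > 300.0:
        return "long route: high latency between the user and the data center"
    return "no obvious network cause: check the device, camera, and microphone"


# Example: low loss and latency but high jitter points at the network path.
print(likely_cause(CallStats(packet_loss_pct=0.5, jitter_ms=45.0,
                             round_trip_ms=80.0, media_port_blocked=False)))
```

The user never has to know what jitter is; the service takes the measurements and does the translating.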
Lack of useful open-source Lego blocks
Open source projects have dramatically improved the speed and quality of software development, and they benefit many types of applications.
Until recently, there were few open source projects that could be used to build a video conferencing system. So each company had to build and manage its own proprietary technology for each layer of the stack. Thankfully, six years ago WebRTC, a project spearheaded by Google, emerged as an incredibly powerful standard. WebRTC is now fully endorsed and adopted by Microsoft, Apple, Cisco and others across the industry. This is huge.
This standard, for the first time, offers a fully open-sourced, community-driven collection of protocols and APIs that anyone can use to build a communications application. Further, these protocols and APIs will be built into nearly every one of the billions of browser instances around the world. One side effect here is that real-time communications applications will no longer require the download of a clunky app to work.
WebRTC will simplify the video conferencing experience and drive greater robustness and reliability by commoditizing many parts of the system for those who adopt it.
And then there’s adoption
When the technology sucks, it’s easy to explain why people don’t use video conferencing. However, suppose you were to get the product and technology 100 percent right. Would everyone start “magically” using it? Sometimes yes, but often no.
Video requires people to overcome personal anxieties about being on camera. As a result, a video tool spreads less easily than other tools. Think about your friends who put tape over their camera, or people who join every video call with their camera turned off by default. This is, even today, surprisingly normal. Adopting video requires people to take some naturally unnatural steps compared to what they are used to doing. It may be better, but it’s different.
There is good news, though. Many in the industry are working hard to make sure we live in a world where video is reliable and ubiquitous, where you can feel comfortable on camera because the screen won’t freeze on an unflattering face, and where hardware isn’t a problem because it’s simple to set up and use.
It won’t be long before we’re no longer hunching over laptops, huddling around phones or taking conference calls with audio so bad you just grit your teeth, nod your head, and agree, never having heard what someone said in the first place. We’ll look back and laugh at how miserable video conferencing was five years ago.
With venture capital entering the space, we’re seeing some of these challenges falling away. The conditions for change in this space have finally emerged. This time it really does feel different.