Even if you think you don’t know what WebRTC is, chances are you’re already well-acquainted with it. Why? Because many of the web’s everyday communication features rely on it. This article describes a common challenge developers encounter when employing WebRTC and how to solve it, with links to further information.
WebRTC is a realtime communication standard baked right into the web browser. It enables developers to build applications that encompass things like voice or video calling, as well as sending arbitrary data. Google Stadia uses WebRTC to control cloud games, for example. If you've ever had a voice or video call using Facebook Messenger or Google Duo/Meet/Hangouts then you’ve experienced WebRTC.
I'm not here to discuss what WebRTC is - you can find more information about that on my blog, BlogGeek.me. Rather, I want to talk about what it lacks and how to solve that: signaling. By signaling I mean the ability to find the person you want to communicate with and negotiate the terms of the communication (is this a video session? Voice only? Which codecs will be used? etc.). WebRTC does a good job of connecting a session and making sure audio and video are as crisp as your network allows. But for that to happen, your application first needs a signaling channel and protocol in place.
Since WebRTC lacks signaling, this is a part developers need to figure out on their own. The messages WebRTC wants you to send on its behalf are a set of SDP blobs: WebRTC handles creating and parsing these SDPs, but not sending and receiving them.
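To make that hand-off concrete, here is a minimal sketch of the browser side of the exchange using the standard WebRTC APIs. The sendSignal() function is a deliberate placeholder for whatever signaling channel you end up choosing - WebRTC itself gives you nothing to put behind it.

```typescript
// A minimal sketch of the browser side of the handshake (standard WebRTC APIs).
// sendSignal() is a placeholder for whatever signaling channel you choose;
// WebRTC itself does not provide one.
declare function sendSignal(message: object): void;

const pc = new RTCPeerConnection();

// ICE candidates also travel over your signaling channel.
pc.onicecandidate = (event) => {
  if (event.candidate) {
    sendSignal({ type: 'candidate', candidate: event.candidate.toJSON() });
  }
};

async function startCall(localStream: MediaStream): Promise<void> {
  // Add local media so the generated SDP describes it.
  localStream.getTracks().forEach((track) => pc.addTrack(track, localStream));

  // WebRTC creates and parses the SDP offer...
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);

  // ...but delivering it to the other peer is entirely up to your application.
  sendSignal({ type: 'offer', sdp: pc.localDescription });
}

// When the remote answer arrives back over your signaling channel:
async function handleAnswer(sdp: RTCSessionDescriptionInit): Promise<void> {
  await pc.setRemoteDescription(sdp);
}
```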
You as a developer need to decide how to send them. Some use XMPP as their protocol of choice for these messages. Others resort to MQTT. Others still use SIP (which is quite common in VoIP). For the most part, though, I’d say that developers tend to invent their own proprietary protocol here and run it over a WebSocket or a Comet-style transport such as XHR.
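A homegrown protocol of that kind can be as small as a handful of JSON message types relayed over a WebSocket. The sketch below shows one hypothetical shape for it - the endpoint URL and the { type, ... } message format are made up for illustration - and supplies the sendSignal() placeholder from the previous snippet.

```typescript
// Hypothetical homegrown signaling over a plain WebSocket. The endpoint and
// message shape are illustrative only; any scheme both peers agree on works.
declare const pc: RTCPeerConnection; // the peer connection from the sketch above

const ws = new WebSocket('wss://example.com/signaling');

function sendSignal(message: object): void {
  ws.send(JSON.stringify(message));
}

ws.onmessage = async (event: MessageEvent) => {
  const msg = JSON.parse(event.data);
  switch (msg.type) {
    case 'offer': {
      // The callee sets the remote offer, answers, and sends the answer back.
      await pc.setRemoteDescription(msg.sdp);
      const answer = await pc.createAnswer();
      await pc.setLocalDescription(answer);
      sendSignal({ type: 'answer', sdp: pc.localDescription });
      break;
    }
    case 'answer':
      // The caller applies the callee's answer.
      await pc.setRemoteDescription(msg.sdp);
      break;
    case 'candidate':
      // Trickled ICE candidates from the remote peer.
      await pc.addIceCandidate(msg.candidate);
      break;
  }
};
```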
Some developers end up opting not to run their own signaling service at all, but rather to “rent” one from services like Ably Realtime.
Why would someone prefer a third-party managed service for WebRTC signaling over the route of self-development? For the same reasons you host machines on AWS instead of building your own datacenter - the vendor takes care of:
- Uptime, monitoring, security, updating and dealing with the nuances of supporting multiple browsers, operating systems, and SDKs.
- Scaling the service to meet your growing demands. This is doubly important with WebRTC, where all of these messages are “stateful” - something that makes scaling even harder.
- Letting you focus on what’s important to you: the messages and state machines that drive your application, not infrastructure and operations.
Ably has put together a series of tutorials on how to create WebRTC apps using Ably as the underlying signaling service. You can also try the latest one, showing how to implement a WebRTC signaling mechanism with FSharp, Fable, and Ably.
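To give a flavour of what that looks like, here is a rough sketch of the same offer/answer exchange running over Ably’s pub/sub channels instead of a self-hosted WebSocket server. The channel and event names ('webrtc-signaling', 'offer', 'answer', 'candidate') are arbitrary choices for this example, not anything the SDK prescribes, and the exact publish/subscribe call style may vary between SDK versions - the tutorials linked above are the authoritative reference.

```typescript
// Rough sketch: WebRTC signaling over an Ably channel instead of your own server.
import * as Ably from 'ably';

// echoMessages: false keeps your own published messages from being echoed back.
const ably = new Ably.Realtime({ key: 'YOUR_ABLY_API_KEY', echoMessages: false });
const channel = ably.channels.get('webrtc-signaling');

// Publish SDP blobs and ICE candidates for the remote peer to pick up.
function sendSignal(name: string, data: object): void {
  channel.publish(name, data);
}

// Receive the remote peer's messages the same way.
channel.subscribe('offer', (message) => {
  // hand message.data (the SDP offer) to pc.setRemoteDescription(), create an
  // answer, then sendSignal('answer', ...) it back
});
channel.subscribe('candidate', (message) => {
  // hand message.data to pc.addIceCandidate()
});
```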
The way I see it, there are three main ways to develop a WebRTC-based communication application these days:
- DIY - using open source projects from GitHub and self-managing your servers
- Semi-managed - using a vendor to manage your signaling and another vendor to manage your NAT traversal
- Fully managed - using a vendor that provides it all
Why the middle ground of semi-managed? Because it carries less vendor lock-in and gives you more flexibility in mixing and matching the components you need. I’d especially suggest it to those who are considering the DIY route, because it will make their lives easier by reducing the non-functional development work required, while still letting them retain the bulk of their IP.
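In practice, plugging a separate NAT traversal vendor into that mix is mostly a matter of configuration: you point your peer connection at their STUN/TURN servers while the signaling traffic flows through whichever service you chose for that. The server URLs and credentials below are placeholders, not real endpoints.

```typescript
// Semi-managed in practice: signaling from one vendor, NAT traversal from
// another. The server URLs and credentials are placeholders for illustration.
const pc = new RTCPeerConnection({
  iceServers: [
    { urls: 'stun:stun.example.com:3478' },
    {
      urls: 'turn:turn.example.com:3478',
      username: 'TURN_USERNAME',
      credential: 'TURN_CREDENTIAL',
    },
  ],
});
```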
What’s your preferred signaling method for WebRTC? More information about WebRTC signaling servers is available on BlogGeek.me. Jump into Ably’s WebRTC signaling solutions by browsing the Ably docs or experimenting with a free account. If you have a particularly good solution to this problem that you feel would benefit this article, get in touch with the Ably blog editors.
Guest Blog by Tsahi Levent-Levi, Author of BlogGeek.me as well as CEO & Co-founder at testRTC. He also has online courses (free and paid) at webrtccourse.com
Tsahi has been working in the software communications space as an engineer, manager, marketer and CTO for the last two decades. In his various roles he meets and helps vendors with their communication projects, especially when these relate to WebRTC, CPaaS and AI.