Multi-party calling architectures are a common topic here at webrtcHacks, largely because group calling is widely needed but difficult to implement and understand. Most would agree that Scalable Video Coding (SVC) is the most advanced, but also the most complex, multi-party calling architecture.

To help explain how it works we have brought in not one, but two WebRTC video architecture experts. Sergio Garcia Murillo is a longtime media server developer and founder of Medooze. Most recently, and most relevant for this post, he has been working on an open source SFU that leverages VP9 and SVC (the first open source project to do this that I am aware of). In addition, frequent webrtcHacks guest author and renowned video expert Gustavo Garcia Bernardo joins him.

Slack is an über popular and fast-growing communications tool that has a ton of integrations with various WebRTC services. Slack acquired a WebRTC company a year ago and launched its own audio conferencing service earlier this year, which we analyzed here and here. Earlier this week they launched video. Does this work the same? Are there any tricks we can learn from their implementation? Longtime WebRTC expert and webrtcHacks guest author Gustavo Garcia takes a deeper dive into Slack's new video conferencing feature below to see what's going on under the hood.

WebRTC and its peer-to-peer capabilities are great for one-to-one communications. However, when I discuss use cases and services that go beyond one-to-one with customers, namely one-to-many or many-to-many, the question arises: “OK, but what architecture shall I use for this?” Some service providers want to reuse the multicast support they have in their networks (we are having fun doing some experiments with this), some are exploring simulcast-based solutions, others are considering centralised solutions like MCUs/mixers, and a bunch of them are simply willing to place the burden on the endpoint by using some variation of a mesh-based topology. The folks at TokBox (a Telefónica Digital company) have great experience with multiparty conferencing solutions. I thought it would be great to have my friend Gustavo Garcia Bernardo (Cloud Architect at TokBox) share his take on the topic here.

At TokBox, Gustavo is responsible for architecture, design, and development of cloud components. This includes Mantis, the cloud-scaling infrastructure for the OpenTok platform, which uses WebRTC. Before joining TokBox, Gustavo spent more than 10 years building VoIP products at Telefónica and driving early adoption of WebRTC in telco products. In fact, I've known Gustavo for 8 years now, and the first time I met him was while preparing a proposal for a European Commission-funded research project on P2PSIP. Since then we've been collaborating in the IETF, doing some work in the context of P2PSIP, ALTO and SIP-related activities. A couple of years ago, while I was working with Acme Packet (now Oracle), we worked together designing and launching Telefónica Digital's TuMe and TuGo. Lately we have both shifted our focus towards WebRTC.

ML Kit smile detection in a WebRTC app

Now that it is getting relatively easy to set up video calls (most of the time), we can move on to doing fun things with the video stream. With new advancements in Machine Learning (ML) and a growing number of APIs and libraries out there, computer vision is also getting easier to do. Google’s ML Kit is a recent example of a new machine-learning-based library that gives quick access to computer vision outputs.

To show how to use Google's new ML Kit to detect user smiles on a live WebRTC stream, I would like to welcome back past webrtcHacks author and WebRTC video master Gustavo Garcia Bernardo of Houseparty. Joining him, I would also like to welcome mobile WebRTC expert Roberto Perez of TokBox. They give some background on doing facial detection, show some code samples, and, more importantly, share their learnings for optimum configuration of smile detection inside a Real Time Communications (RTC) app.

I am a big fan of Chrome’s webrtc-internals tool. It is one of the most useful debugging tools for WebRTC, and when it was added to Chrome back in 2012 it made my life a lot easier. I even recently wrote a lengthy series of blog posts together with Tsahi Levent-Levi describing how to use it to debug issues.

Firefox has a similar about:webrtc page which shows the local and remote SDP for each page as well as a very useful grid of ICE candidates. But unlike Chrome it does not show the exact order of API calls or nice graphs obtained from the getStats API. I miss both features dearly. Edge and Safari don't support similar debugging helpers currently either.

Dealing with multi-party video infrastructure can be pretty daunting. The good news is that the technology, products, and standards to enable economical multiparty video in WebRTC have matured quite a bit in the past few years. One of the key underlying technologies enabling some of this change is called simulcast. Simulcast has been an occasional sub-topic here at webrtcHacks in the past and it is time we gave it more dedicated attention.

To do that we asked Oscar Divorra Escoda, TokBox's Senior Media Scientist and Media Cloud Engineering Lead, to walk us through it. TokBox was one of the first to market with an SFU and Oscar shares some of his learnings below.