Chrome recently added the option of adding redundancy to audio streams using the RED format as defined in RFC 2198, and Fippo wrote about the process and implementation in a previous article. You should catch-up on that post, but to summarize quickly RED works by adding redundant payloads with different timestamps in the same packet. If you lose a packet in a lossy network then chances are another successfully received packet will have the missing data resulting in better audio quality.

That was in a simplified one-to-one scenario, but audio quality issues often have the most impact on larger multi-party calls. As a follow-up to Fippo’s post, Jitsi Architect and Improving Scale and Media Quality with Cascading SFUs author Boris Grozev walks us through his design and tests for adding audio redundancy to a more complex environment with many peers routing media through a Selective Forwarding Unit (SFU). ...  Continue reading

Back in April 2020 a Citizenlab reported on Zoom’s rather weak encryption and stated that Zoom uses the SILK codec for audio. Sadly, the article did not contain the raw data to validate that and let me look at it further. Thankfully Natalie Silvanovich from Googles Project Zero helped me out using the Frida tracing tool and provided a short dump of some raw SILK frames. Analysis of this inspired me to take a look at how WebRTC handles audio. In terms of perception, audio quality is much more critical for the perceived quality of a call as we tend to notice even small glitches. Mere ten seconds of this audio analysis were enough to set me off on quite an adventure investigating possible improvements to the audio quality provided by WebRTC. ...  Continue reading

Software as a Service, Infrastructure as a Service, Platform as a Service, Communications Platform as a Service, Video Conferencing as a Service, but what about Gaming as a Service? There have been a few attempts at Cloud Gaming, most notably Google’s recently launched Stadia. Stadia is no stranger to WebRTC, but can others leverage WebRTC in the same way?

Thanh Nguyen set out to see if this was possible with his open source project, CloudRetro. CloudRetro is based on the popular go-based WebRTC library, pion (thanks to Sean of Pion for helping review here). In this post, Thanh gives an architectural review of how he build the project along with some of the benefits and challanges he ran into along the way. ...  Continue reading

As you may have heard, Whatsapp discovered a security issue in their client which was actively exploited in the wild. The exploit did not require the target to pick up the call which is really scary.
Since there are not many facts to go on, lets do some tea reading…

The security advisory issued by Facebook says

A buffer overflow vulnerability in WhatsApp VOIP stack allowed remote code execution via specially crafted series of SRTCP packets sent to a target phone number.

This is not much detail, investigations are probably still ongoing. I would very much like to hear a post-mortem how WhatsApp detected the abuse. ...  Continue reading

A while ago we looked at how Zoom was avoiding WebRTC by using WebAssembly to ship their own audio and video codecs instead of using the ones built into the browser’s WebRTC.  I found an interesting branch in Google’s main (and sadly mostly abandoned) WebRTC sample application apprtc this past January. The branch is named wartc… a name which is going to stick as warts!

The repo contains a number of experiments related to compiling the library as WebAssembly and evaluating the performance. From the rapid timeline, this looks to have been a hackathon project. ...  Continue reading

QUIC-based DataChannels are being considered as an alternative to the current SCTP-based transport. The WebRTC folks at Google are experimenting  with it:

Let’s test this out. We’ll do a simple single-page example similar to the WebRTC datachannel sample that transfers text. It offers a complete working example without involving signaling servers and also allows comparing the approach to WebRTC DataChannels more easily. ...  Continue reading

It has been a few years since the WebRTC codec wars ended in a detente. H.264 has been around for more than 15 years so it is easy to gloss over the the many intricacies that make it work.

Reknown hackathon star, live-coder, and |pipe| CTO Tim Panton was working on a drone project where he needed a light-weight H.264 stack for WebRTC, so he decided to build one. This is certainly not an exercise I would recommend for most, but Tim shows it can be an enlightening experience if not an easy one. In this post, Tim walks us through his step-by-step discovery as he just tries to get video to work. Check it out for an enjoyable alternative to reading through RFCs specs for an intro on H.264! ...  Continue reading

Deploying media servers for WebRTC has two major challenges, scaling beyond a single server as well as optimizing the media latency for all users in the conference. While simple sharding approaches like “send all users in conference X to server Y” are easy to scale horizontally, they are far from optimal in terms of the media latency which is a key factor in the user experience. Distributing a conference to a network of servers located close to the users and interconnected with each other on a reliable backbone promises a solution to both problems at the same time. Boris Grozev from the Jitsi team describes the cascading SFU problem in-depth and shows their approach as well as some of the challenges they ran into. ...  Continue reading

If you plan to have multiple participants in your WebRTC calls then you will probably end up using a Selective Forwarding Unit (SFU).  Capacity planning for SFU’s can be difficult – there are estimates to be made for where they should be placed, how much bandwidth they will consume, and what kind of servers you need.

To help network architects and WebRTC engineers make some of these decisions, webrtcHacks contributor Dr. Alex Gouaillard and his team at CoSMo Software put together a load test suite to measure load vs. video quality. They published their results for all of the major open source WebRTC SFU’s. This suite based is the Karoshi Interoperability Testing Engine (KITE) Google funded and uses on to show interoperability status. The CoSMo team also developed a machine learning based video quality assessment framework optimized for real time communications scenarios. ...  Continue reading

If you’re new to WebRTC, Jitsi was the first open source Selective Forwarding Unit (SFU) and continues to be one of the most popular WebRTC platforms. They were in the news last week because their parent group inside Atlassian was sold off to Slack but the team clarified this does not have any impact on the Jitsi team. Helping to show they are still chugging along, they released a new feature they wanted to talk about – off-stage layer suspension. This is a technique for minimizing bandwidth and CPU consumption when using simulcast. Simulcast is a common technique used in multi-party video scenarios. See Oscar Divorra’s post on this topic and that Fippo post just last week for more on that. Even if you are not implementing a  simulcast, this is a good post for understanding how to control bandwidth and to see some follow-along reverse-engineering on how Google does things in its Hangouts upgrade called Meet. ...  Continue reading