Echo cancellation is a cornerstone of the audio experience in WebRTC. Google has invested quite a bit in this area, first with the delay-agnostic echo cancellation in 2015 and now with a new echo cancellation system called AEC3. Debugging issues related to AEC3 is one of the hardest areas. Al Brooks from NewVoiceMedia ran into a case of seriously degraded audio reported from his customers’ contact center agents. After a lengthy investigation it turned out to be caused by a Chrome experiment that enabled the new AEC3 for a percentage of users in Chrome stable.
Al takes us through a recap of how he analyzed the problem and narrowed it down enough to file a bug with the WebRTC team at Google. ...  Continue reading

It has been more than a year since Apple first added WebRTC support to Safari. My original post reviewing the implementation continues to be popular here, but it does not reflect some of the updates since the first limited release. More importantly, given its differences and limitations, many questions still remained on how to best develop WebRTC applications for Safari.

I ran into Chad Phillips at Cluecon (again) this year and we ended up talking about his arduous experience making WebRTC work on Safari. He had a great, recent list of tips and tricks so I asked him to share it here. ...  Continue reading

WebRTC isn’t the only cool media API on the Web Platform. The Web Virtual Reality (WebVR) spec was introduced a few years ago to bring support for virtual reality devices in a web browser. It has since been migrated to the newer WebXR Device API Specification.

I was at ClueCon earlier this summer where Dan Jenkins gave a talk showing that it is relatively easy to add a WebRTC video conference streams into a virtual reality environment using WebVR using FreeSWITCH. FreeSWITCH is one of the more popular open source telephony platforms and has had WebRTC for a few years. WebRTC; WebVR; Open Source – obviously this was good webrtcHacks material. ...  Continue reading

Simulcast is one of the more interesting aspects of WebRTC for multiparty conferencing. In a nutshell, it means sending three different resolution (spatial scalability) and different frame rates (temporal scalability) at the same time. Oscar Divorra’s post contains the full details.

Usually, one needs a SFU to take advantage of simulcast. But there is a hack to make the effect visible between two browsers — or inside a single page. This is very helpful for single-page tests or fiddling with simulcast features, particular the ability to enable only certain spatial layers or to control the target bitrate of a particular stream. ...  Continue reading

ML Kit smile detection in a WebRTC app

Now that it is getting relatively easy to setup video calls (most of the time), we can move on to doing fun things with the video stream. With new advancements in Machine Learning (ML) and a growing number of API’s and libraries out there, computer vision is also getting  easier to do. Google’s ML Kit is a recent example of a new machine learning based library that makes gives quick access to computer vision outputs.

To show how to use Google’s new ML Kit to detect user smiles on a live WebRTC stream, I would like to welcome back past webrtcHacks author and WebRTC video master  Gustavo Garcia Bernardo of Houseparty. Joining him I would like to also welcome mobile WebRTC expert, Roberto Perez of TokBox.  They give some background on doing facial detection, show some code samples, but more importantly share their learnings for optimum configuration of smile detection inside a Real Time Communications (RTC) app. ...  Continue reading

In part 1 of this set, I showed how one can use UV4L with the AIY Vision Kit send the camera stream and any of the default annotations to any point on the Web with WebRTC. In this post I will build on this by showing how to send image inference data over a WebRTC dataChannel and render annotations in the browser. To do this we will use a basic Python server,  tweak some of the Vision Kit samples, and leverage the dataChannel features of UV4L.

To fully follow along you will need to have a Vision Kit and should have completed all the instructions in part 1. If you don’t have a Vision Kit, you still may get some value out of seeing how UV4L’s dataChannels can be used for easily sending data from a Raspberry Pi to your browser application. ...  Continue reading

A couple years ago I did a TADHack  where I envisioned a cheap, low-powered camera that could run complex computer vision and stream remotely when needed. After considering what it would take to build something like this myself, I waited patiently for this tech to come. Today with Google’s new AIY Vision kit, we are pretty much there.

The AIY Vision Kit is a $45 add-on board that attaches to a Raspberry Pi Zero with a Pi 2 camera. The board includes a Vision Processing Unit (VPU) chip that runs Tensor Flow image processing graphs super efficiently. The kit comes with a bunch of examples out of the box, but to actually see what the camera see’s you need to plug the HDMI into a monitor. That’s not very useful when you want to put your battery powered kit in a remote location. And while it is nice that the rig does not require any Internet connectivity, that misses out on a lot of the fun applications. So, let’s add some WebRTC to the AIY Vision Kit to let it stream over the web. ...  Continue reading

TensorFlow is one of the most popular Machine Learning frameworks out there – probably THE most popular one. One of the great things about TensorFlow is that many libraries are actively maintained and updated. One of my favorites is the TensorFlow Object Detection API.   The Tensorflow Object Detection API classifies and provides the location of multiple objects in an image. It comes pre-trained on nearly 1000 object classes with a wide variety of pre-trained models that let you trade off speed vs. accuracy. ...  Continue reading

Decoding video when there is packet loss is not an easy task.  Recent Chrome versions have been plagued by video corruption issues related to a new video jitter buffer introduced in Chrome 58. These issues are hard to debug since they occur only when certain packets are lost. To combat these issues, has a pretty powerful tool to reproduce and analyze them called video_replay. When I saw another video corruption issue filed by Stian Selnes I told him about that tool. With an easy reproduction of the stream, the WebRTC video team at Google made short work of the bug. Unfortunately this process is not too well documented, so we asked Stian to walk us through the process of capturing the necessary data and using the video_replay tool. Stian, who works at Pexip, has been dealing with real-time communication for more than 10 years. He has  experience in large parts of the media stack with a special interest in video codecs and other types of signal processing, network protocols and error resilience. ...  Continue reading

I am a big fan of Chrome’s webrtc-internals tool. It is one of the most useful debugging tools for WebRTC and when it was added to Chrome back in 2012 it made my life a lot easier. I even wrote a lengthy series of blog post together with Tsahi Levent-Levi describing how to use it to debug issues recently.

Firefox has a similar about:webrtc page which shows the local and remote SDP for each page as well as a very useful grid of ICE candidates. But unlike Chrome it does not show the exact order of API calls or nice graphs obtained from the getStats API. I miss both features dearly. Edge and Safari don’t support similar debugging helpers currently either.  ...  Continue reading