Thanks to work initiated by Google Project Zero, fuzzing has become a popular topic within WebRTC since late last year. It was clear WebRTC was lacking in this area. However, the community has shown its strength by giving this topic an immense amount of focus and resolving many issues. In a previous post, we showed how to break the Janus Server RTCP parser. The Meetecho team behind Janus did not take that lightly. They got to the bottom of what turned out to be quite a big project. In this post Alessandro Toppi of Meetecho will walk us through how they fixed this problem and built an automated process to help make sure it doesn’t happen again.
webrtcH4cKS: ~ First steps with QUIC DataChannels
QUIC-based DataChannels are being considered as an alternative to the current SCTP-based transport. The WebRTC folks at Google are experimenting with it:
Looking for feedback: QUIC based RTCQuicTransport and RTCIceTransport API's are available as origin trial in Chrome 73 for experimentation. https://t.co/KVVEVmggms
— WebRTC project (@webrtc) February 1, 2019
Let’s test this out. We’ll do a simple single-page example similar to the WebRTC datachannel sample that transfers text. It offers a complete working example without involving signaling servers and also allows comparing the approach to WebRTC DataChannels more easily.
webrtcH4cKS: ~ Let’s get better at fuzzing in 2019 – here’s how
Fuzzing is a Quality Assurance and security testing technique that feeds unexpected, often random, data into a program’s inputs to try to break it. Natalie Silvanovich from Google’s Project Zero team has had quite some fun fuzzing various RTP implementations recently.
She found vulnerabilities in:
- WebRTC – mostly issues in the RTP payload
- FaceTime – a few out-of-bounds, stack corruption, and heap corruption issues
- WhatsApp and what didn’t work
In a nutshell, she found a bunch of vulnerabilities just by throwing unexpected input at parsers. The range of applications which were vulnerable to this shows that the WebRTC/VoIP community does not yet have a process for doing this work ourselves. Meanwhile, the WebRTC folks at Google will have to improve their processes as well.
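To make "throwing unexpected input at parsers" concrete, here is a minimal mutation-fuzzing sketch in Python. It is not the tooling Project Zero or any WebRTC project uses. The toy `parse_rtcp_header` function, the `mutate` strategy, and the `SEED` packet are all made up for illustration; only the 4-byte RTCP common header layout comes from RFC 3550. The idea is to take one valid packet, corrupt it at random, and record any input that makes the parser fail in a way other than a clean rejection.

```python
import random
import struct


def parse_rtcp_header(packet: bytes) -> dict:
    """Toy parser for the 4-byte RTCP common header (layout per RFC 3550)."""
    if len(packet) < 4:
        raise ValueError("packet shorter than the RTCP header")
    first, packet_type, length_words = struct.unpack("!BBH", packet[:4])
    if first >> 6 != 2:
        raise ValueError("unsupported RTCP version")
    # The length field counts 32-bit words minus one, header included.
    total_len = (length_words + 1) * 4
    if total_len > len(packet):
        raise ValueError("length field exceeds packet size")
    return {"count": first & 0x1F, "type": packet_type,
            "payload": packet[4:total_len]}


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Overwrite a few random bytes, then sometimes truncate the packet."""
    data = bytearray(seed)
    for _ in range(rng.randint(1, 4)):
        data[rng.randrange(len(data))] = rng.randrange(256)
    if rng.random() < 0.3:
        del data[rng.randrange(len(data)):]
    return bytes(data)


def fuzz(seed: bytes, iterations: int = 10000, rng_seed: int = 0) -> list:
    """Return (input, exception) pairs for every unexpected failure."""
    rng = random.Random(rng_seed)  # fixed seed keeps runs reproducible
    crashes = []
    for _ in range(iterations):
        sample = mutate(seed, rng)
        try:
            parse_rtcp_header(sample)
        except ValueError:
            pass  # a clean rejection is the desired behavior
        except Exception as exc:  # anything else is a parser bug
            crashes.append((sample, exc))
    return crashes


# Hypothetical seed: an empty receiver report (PT 201) carrying only an SSRC.
SEED = struct.pack("!BBHI", 0x80, 201, 1, 0x12345678)

if __name__ == "__main__":
    print(len(fuzz(SEED)), "unexpected failures")
```

Real-world fuzzers for C and C++ parsers are usually coverage-guided and run with memory sanitizers enabled, since the interesting bugs there are memory corruption rather than Python exceptions. The loop above only shows the basic shape of the technique.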
webrtcH4cKS: ~ Troubleshooting Unwitting Browser Experiments (Al Brooks)
Echo cancellation is a cornerstone of the audio experience in WebRTC. Google has invested quite a bit in this area, first with the delay-agnostic echo cancellation in 2015 and now with a new echo cancellation system called AEC3. Issues related to AEC3 are among the hardest to debug. Al Brooks from NewVoiceMedia ran into a case of seriously degraded audio reported by his customers’ contact center agents. After a lengthy investigation it turned out to be caused by a Chrome experiment that enabled the new AEC3 for a percentage of users in Chrome stable.
Al takes us through a recap of how he analyzed the problem and narrowed it down enough to file a bug with the WebRTC team at Google.
webrtcH4cKS: ~ Guide to WebRTC with Safari in the Wild (Chad Phillips)
It has been more than a year since Apple first added WebRTC support to Safari. My original post reviewing the implementation continues to be popular here, but it does not reflect some of the updates since the first limited release. More importantly, given its differences and limitations, many questions still remained on how to best develop WebRTC applications for Safari.
I ran into Chad Phillips at ClueCon (again) this year and we ended up talking about his arduous experience making WebRTC work on Safari. He had a great, recent list of tips and tricks so I asked him to share it here.
webrtcH4cKS: ~ VR Video Calling with WebRTC and WebVR (Dan Jenkins)
WebRTC isn’t the only cool media API on the Web Platform. The Web Virtual Reality (WebVR) spec was introduced a few years ago to bring support for virtual reality devices in a web browser. It has since been migrated to the newer WebXR Device API Specification.
I was at ClueCon earlier this summer where Dan Jenkins gave a talk showing that it is relatively easy to add WebRTC video conference streams to a virtual reality environment using WebVR and FreeSWITCH. FreeSWITCH is one of the more popular open source telephony platforms and has had WebRTC for a few years. WebRTC; WebVR; Open Source – obviously this was good webrtcHacks material.
webrtcH4cKS: ~ A playground for Simulcast without an SFU
Simulcast is one of the more interesting aspects of WebRTC for multiparty conferencing. In a nutshell, it means sending the same video at three different resolutions (spatial scalability) and different frame rates (temporal scalability) at the same time. Oscar Divorra’s post contains the full details.
Usually, one needs an SFU to take advantage of simulcast. But there is a hack to make the effect visible between two browsers, or even inside a single page. This is very helpful for single-page tests or fiddling with simulcast features, particularly the ability to enable only certain spatial layers or to control the target bitrate of a particular stream.
webrtcH4cKS: ~ Smile, You’re on WebRTC – Using ML Kit for Smile Detection
Now that it is getting relatively easy to set up video calls (most of the time), we can move on to doing fun things with the video stream. With new advancements in Machine Learning (ML) and a growing number of APIs and libraries out there, computer vision is also getting easier to do. Google’s ML Kit is a recent example of a new machine-learning-based library that gives quick access to computer vision outputs.
To show how to use Google’s new ML Kit to detect user smiles on a live WebRTC stream, I would like to welcome back past webrtcHacks author and WebRTC video master Gustavo Garcia Bernardo of Houseparty. Joining him I would like to also welcome mobile WebRTC expert, Roberto Perez of TokBox. They give some background on doing facial detection, show some code samples, but more importantly share their learnings for optimum configuration of smile detection inside a Real Time Communications (RTC) app.
webrtcH4cKS: ~ Part 2: Building a AIY Vision Kit Web Server with UV4L
In part 1 of this set, I showed how one can use UV4L with the AIY Vision Kit to send the camera stream and any of the default annotations to any point on the Web with WebRTC. In this post I will build on this by showing how to send image inference data over a WebRTC dataChannel and render annotations in the browser. To do this we will use a basic Python server, tweak some of the Vision Kit samples, and leverage the dataChannel features of UV4L.
To fully follow along you will need to have a Vision Kit and should have completed all the instructions in part 1. If you don’t have a Vision Kit, you still may get some value out of seeing how UV4L’s dataChannels can be used for easily sending data from a Raspberry Pi to your browser application.
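As a rough idea of what "sending inference data" can look like, here is a small Python sketch that packs detection results into a compact JSON text message, the kind of payload a dataChannel can carry. It is not the code from the post and does not use any UV4L or AIY API. The `Detection` fields, the `normalize` and `encode_message` helpers, and the message layout are all assumptions made for illustration. Bounding boxes are scaled to the 0..1 range so the browser can draw them on a video element of any size.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Detection:
    """One inference result; field names are hypothetical."""
    label: str
    score: float
    x: float       # left edge, as a fraction of frame width
    y: float       # top edge, as a fraction of frame height
    width: float
    height: float


def normalize(label, score, box, frame_width, frame_height):
    """Convert a pixel box (left, top, width, height) to 0..1 coordinates."""
    left, top, box_width, box_height = box
    return Detection(
        label=label,
        score=round(score, 3),
        x=left / frame_width,
        y=top / frame_height,
        width=box_width / frame_width,
        height=box_height / frame_height,
    )


def encode_message(detections, frame_id):
    """Serialize one frame's detections as a compact JSON string."""
    payload = {
        "frame": frame_id,
        "detections": [asdict(d) for d in detections],
    }
    # Compact separators keep each message small.
    return json.dumps(payload, separators=(",", ":"))


if __name__ == "__main__":
    # Hypothetical face box on a 1640x1232 frame.
    face = normalize("face", 0.92, (410, 308, 820, 616), 1640, 1232)
    print(encode_message([face], frame_id=1))
```

On the browser side, the matching step would be a `JSON.parse` in the dataChannel message handler, followed by multiplying the fractions by the video element's displayed width and height before drawing.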
A couple of years ago I did a TADHack where I envisioned a cheap, low-powered camera that could run complex computer vision and stream remotely when needed. After considering what it would take to build something like this myself, I waited patiently for this tech to come. Today with Google’s new AIY Vision Kit, we are pretty much there.
The AIY Vision Kit is a $45 add-on board that attaches to a Raspberry Pi Zero with a Pi 2 camera. The board includes a Vision Processing Unit (VPU) chip that runs TensorFlow image processing graphs super efficiently. The kit comes with a bunch of examples out of the box, but to actually see what the camera sees you need to plug the HDMI into a monitor. That’s not very useful when you want to put your battery-powered kit in a remote location. And while it is nice that the rig does not require any Internet connectivity, that misses out on a lot of the fun applications. So, let’s add some WebRTC to the AIY Vision Kit to let it stream over the web.