Archives

All posts by Chad Hart

Local Jitsi recording hack with getDisplayMedia audio capture and MediaRecorder

I wanted to add local recording to my own Jitsi Meet instance. The feature wasn’t built in the way I wanted, so I set out on a hack to build something simple. That led me down the road to discovering that:

  1. getDisplayMedia for screen capture has many quirks,
  2. MediaRecorder for media recording has some of its own unexpected limitations, and
  3. Adding your own HTML/JavaScript to Jitsi Meet is pretty simple.

Read on for plenty of details and some reference code. My result is located in this repo.
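
As a taste of what the post covers, here is a minimal sketch of the two APIs in the list above wired together – capture the screen (with audio where the browser supports it), record it, and save the result locally. This is an illustrative sketch, not the repo’s actual code; audio capture support and MIME types vary by browser.

```javascript
// Minimal sketch (not the repo's code): capture the screen plus audio
// where supported, record it with MediaRecorder, and save it locally.
async function startRecording() {
  const stream = await navigator.mediaDevices.getDisplayMedia({
    video: true,
    audio: true  // audio capture support varies by browser and capture surface
  });

  const chunks = [];
  const recorder = new MediaRecorder(stream, { mimeType: 'video/webm' });
  recorder.ondataavailable = (e) => { if (e.data.size > 0) chunks.push(e.data); };
  recorder.onstop = () => {
    // Save the recording locally via a temporary anchor element
    const blob = new Blob(chunks, { type: 'video/webm' });
    const a = document.createElement('a');
    a.href = URL.createObjectURL(blob);
    a.download = 'recording.webm';
    a.click();
  };

  recorder.start(1000); // emit a data chunk every second
  return recorder;      // call recorder.stop() to finish and save
}
```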

The Problem

I built a Jitsi Meet server a few months ago with the intention of updating my Build your own phone company with WebRTC and a weekend post. There are a billion posts/videos on how to set up Jitsi Meet, and I don’t have anything new or interesting to add to the technosphere there. However, one feature I really wanted to implement is recording. I often do demos and want to record the session for others and for future reference. On the surface, my requirements here are simple – record my audio plus the audio and video of the Jitsi Meet session on demand and save the file locally. This sounds like a simple feature to add, but… Continue reading
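
For the “my audio plus the meeting’s audio and video” requirement above, one plausible approach – an assumption on my part, not necessarily the post’s exact code – is to mix the microphone into the captured stream with the Web Audio API before recording:

```javascript
// Assumed approach: mix the local microphone with the captured display
// audio using the Web Audio API, then record the combined stream.
async function getMixedStream() {
  const display = await navigator.mediaDevices.getDisplayMedia({ video: true, audio: true });
  const mic = await navigator.mediaDevices.getUserMedia({ audio: true });

  const ctx = new AudioContext();
  const dest = ctx.createMediaStreamDestination();
  if (display.getAudioTracks().length > 0) {
    ctx.createMediaStreamSource(display).connect(dest); // meeting audio, if captured
  }
  ctx.createMediaStreamSource(mic).connect(dest);       // my microphone

  // Recorded stream = display video track + the mixed audio track
  return new MediaStream([
    display.getVideoTracks()[0],
    ...dest.stream.getAudioTracks()
  ]);
}
```

The stream returned here could then be fed straight into the MediaRecorder sketch shown earlier.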

WebRTC has made getting and sending real-time video streams (mostly) easy. The next step is doing something with them, and machine learning lets us have some fun with those streams. Last month I showed how to run Computer Vision (CV) locally in the browser. As I mentioned there, local is nice, but sometimes more performance is needed, so you need to run your Machine Learning inference on a remote server. In this post I’ll review how to run OpenCV models server-side with hardware acceleration on Intel chipsets using Intel’s open source Open WebRTC Toolkit (OWT). ...  Continue reading

Don’t touch your face! To prevent the spread of disease, health bodies recommend not touching your face with unwashed hands. This is easier said than done if you are sitting in front of a computer for hours. I wondered: is this a problem that can be solved with a browser?

We have a number of computer vision + WebRTC experiments here. Experimenting with running computer vision locally in the browser using TensorFlow.js has been on my bucket list, and this seemed like a good opportunity. A quick search revealed somebody had already thought of this two weeks ago. That site used a model that requires some user training – which is interesting but can make it flaky. It also wasn’t open source for others to expand on, so I did some social distancing via coding isolation over the weekend to see what was possible. ...  Continue reading
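
The excerpt doesn’t say which models the final demo uses, but one plausible browser-only sketch uses two off-the-shelf TensorFlow.js models – blazeface for a face bounding box and handpose for a hand bounding box – and warns when the two overlap:

```javascript
// Plausible sketch only -- the post's demo may use different models.
// Uses @tensorflow-models/blazeface and @tensorflow-models/handpose.
function overlaps(a, b) {
  return a.left < b.right && b.left < a.right &&
         a.top < b.bottom && b.top < a.bottom;
}

const toBox = (topLeft, bottomRight) => ({
  left: topLeft[0], top: topLeft[1],
  right: bottomRight[0], bottom: bottomRight[1]
});

async function watchForFaceTouching(video) {
  const faceModel = await blazeface.load();
  const handModel = await handpose.load();

  async function checkFrame() {
    const faces = await faceModel.estimateFaces(video, false);
    const hands = await handModel.estimateHands(video);
    for (const f of faces) {
      for (const h of hands) {
        if (overlaps(toBox(f.topLeft, f.bottomRight),
                     toBox(h.boundingBox.topLeft, h.boundingBox.bottomRight))) {
          console.warn("Don't touch your face!");
        }
      }
    }
    requestAnimationFrame(checkFrame); // keep checking every rendered frame
  }
  checkFrame();
}
```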

webrtcH4cKS: ~ Perfect Negotiation

Series preface: We generally lean toward long posts here at webrtcHacks, but not all interesting topics warrant a lot of new text. Sometimes briefer is better. So to better address the many topics that fit into this category, we are starting a new Minimum Duration series. Here is our first post in the series, covering Perfect Negotiation.

What is Perfect Negotiation and why do we need it?

Long ago the WebRTC specification designers settled on leaving the signaling communication mechanism between two WebRTC peers up to the application. This means your code needs to handle passing Session Description Protocol (SDP) back and forth and giving that to the RTCPeerConnection API. Today WebRTC implementations also almost universally use Trickle ICE, a form of Interactive Connectivity Establishment (ICE) that passes potential network paths between those peers asynchronously so a connection can be established as soon as possible. The asynchronous but time-sensitive nature of all this means it is possible for glare conditions to occur – situations where both sides are making updates at the same time, causing their state machines to get out of sync. Differences in how developers implement their code and variances between browsers make this worse. ...  Continue reading
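
Perfect Negotiation resolves glare by assigning one peer a “polite” role that backs off on collisions. A condensed version of the pattern from the WebRTC spec looks like this, where `signaler` is a placeholder for whatever signaling channel your application uses:

```javascript
// Condensed Perfect Negotiation pattern. `signaler` stands in for your
// application's signaling channel; one peer must be polite, the other not.
const config = { /* ICE servers, etc. */ };
const polite = true; // assigned by the application, e.g. true for the callee
const pc = new RTCPeerConnection(config);
let makingOffer = false;
let ignoreOffer = false;

pc.onnegotiationneeded = async () => {
  try {
    makingOffer = true;
    await pc.setLocalDescription(); // implicit offer creation
    signaler.send({ description: pc.localDescription });
  } catch (err) {
    console.error(err);
  } finally {
    makingOffer = false;
  }
};

pc.onicecandidate = ({ candidate }) => signaler.send({ candidate });

signaler.onmessage = async ({ description, candidate }) => {
  if (description) {
    // Glare: an offer arrived while we were making or holding our own
    const offerCollision = description.type === 'offer' &&
      (makingOffer || pc.signalingState !== 'stable');
    ignoreOffer = !polite && offerCollision; // impolite peer ignores it
    if (ignoreOffer) return;

    await pc.setRemoteDescription(description); // polite peer rolls back implicitly
    if (description.type === 'offer') {
      await pc.setLocalDescription(); // implicit answer creation
      signaler.send({ description: pc.localDescription });
    }
  } else if (candidate) {
    try {
      await pc.addIceCandidate(candidate);
    } catch (err) {
      if (!ignoreOffer) throw err; // candidates from an ignored offer may fail safely
    }
  }
};
```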

WebRTC has a new browser – kind of. Yesterday Microsoft’s “new” Edge browser based on Chromium – commonly referred to as Edgium – went GA. This certainly will make life easier for WebRTC developers since the previous Edge had many differences from other implementations. The big question is: how different is Edgium from Chrome for WebRTC usage?

The short answer is there is no real difference, but you can read below for background details on the tests I ran. If you’re new to WebRTC, the rundown may give you some ideas for testing your own product. ...  Continue reading

In part 1 of this set, I showed how one can use UV4L with the AIY Vision Kit to send the camera stream and any of the default annotations to any point on the Web with WebRTC. In this post I will build on this by showing how to send image inference data over a WebRTC dataChannel and render annotations in the browser. To do this we will use a basic Python server, tweak some of the Vision Kit samples, and leverage the dataChannel features of UV4L.

To fully follow along you will need a Vision Kit and should have completed all the instructions in part 1. If you don’t have a Vision Kit, you may still get some value out of seeing how UV4L’s dataChannels can be used for easily sending data from a Raspberry Pi to your browser application. ...  Continue reading
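
As a preview of the browser side, a sketch like the following renders detections arriving over a dataChannel as boxes on a canvas overlaying the video. The JSON shape here is hypothetical – match it to whatever your Pi-side code actually sends:

```javascript
// Browser-side sketch: draw inference results arriving over a WebRTC
// dataChannel onto a canvas overlay. The JSON field names (label, score,
// x, y, w, h) are hypothetical placeholders.
function attachAnnotationRenderer(dataChannel, canvas) {
  const ctx = canvas.getContext('2d');
  dataChannel.onmessage = (event) => {
    const detections = JSON.parse(event.data); // e.g. [{label, score, x, y, w, h}]
    ctx.clearRect(0, 0, canvas.width, canvas.height);
    ctx.strokeStyle = 'lime';
    ctx.fillStyle = 'lime';
    ctx.font = '16px sans-serif';
    for (const d of detections) {
      ctx.strokeRect(d.x, d.y, d.w, d.h);
      ctx.fillText(`${d.label} ${(d.score * 100).toFixed(0)}%`, d.x, d.y - 4);
    }
  };
}
```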

A couple of years ago I did a TADHack where I envisioned a cheap, low-powered camera that could run complex computer vision and stream remotely when needed. After considering what it would take to build something like this myself, I waited patiently for the tech to arrive. Today, with Google’s new AIY Vision Kit, we are pretty much there.

The AIY Vision Kit is a $45 add-on board that attaches to a Raspberry Pi Zero with a Pi 2 camera. The board includes a Vision Processing Unit (VPU) chip that runs TensorFlow image-processing graphs super efficiently. The kit comes with a bunch of examples out of the box, but to actually see what the camera sees you need to plug the HDMI into a monitor. That’s not very useful when you want to put your battery-powered kit in a remote location. And while it is nice that the rig does not require any Internet connectivity, that misses out on a lot of the fun applications. So, let’s add some WebRTC to the AIY Vision Kit to let it stream over the web. ...  Continue reading

TensorFlow is one of the most popular Machine Learning frameworks out there – probably THE most popular one. One of the great things about TensorFlow is that many libraries are actively maintained and updated. One of my favorites is the TensorFlow Object Detection API. The TensorFlow Object Detection API classifies and provides the location of multiple objects in an image. It comes with a wide variety of models, pre-trained on nearly 1000 object classes, that let you trade off speed vs. accuracy. ...  Continue reading
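
The Object Detection API itself is Python, but for a browser-flavored taste of the same idea (my analogue, not the API covered in the post), the TensorFlow.js port of the pre-trained COCO-SSD model returns classes, scores, and box locations in one call:

```javascript
// Browser-side analogue using @tensorflow-models/coco-ssd -- not the
// (Python) TensorFlow Object Detection API the post covers.
async function detectObjects(videoElement) {
  const model = await cocoSsd.load();
  const predictions = await model.detect(videoElement);
  for (const p of predictions) {
    const [x, y, width, height] = p.bbox; // box location in pixels
    console.log(`${p.class} (${(p.score * 100).toFixed(1)}%) at`,
                x, y, width, height);
  }
}
```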

Decoding video when there is packet loss is not an easy task. Recent Chrome versions have been plagued by video corruption issues related to a new video jitter buffer introduced in Chrome 58. These issues are hard to debug since they occur only when certain packets are lost. To combat these issues, webrtc.org has a pretty powerful tool for reproducing and analyzing them called video_replay. When I saw another video corruption issue filed by Stian Selnes I told him about that tool. With an easy reproduction of the stream, the WebRTC video team at Google made short work of the bug. Unfortunately this process is not well documented, so we asked Stian to walk us through the process of capturing the necessary data and using the video_replay tool. Stian, who works at Pexip, has been dealing with real-time communication for more than 10 years. He has experience in large parts of the media stack, with a special interest in video codecs, other types of signal processing, network protocols, and error resilience. ...  Continue reading