Technology

One of WebRTC’s great features is its mandated strong encryption. Encryption mechanisms are built in, meaning developers don’t (often) need to deal with the details. However, these easy, built-in encryption mechanisms make two assumptions: 1) media is communicated peer-to-peer, and 2) you have a secure signaling channel set up. Most group-calling services make use of a media server, like a Selective Forwarding Unit (SFU), that terminates and re-encrypts the media, breaking end-to-end encryption (e2ee). As we have covered here before, WebRTC e2ee is still possible with new APIs like Insertable Streams. That addresses the first assumption, but what about the second? How does one set up secure signaling for e2ee? ...  Continue reading

Introduction to Capture Handle – a new Chrome Origin Trial that lets a WebRTC screen-sharing application communicate with the tab it is capturing. Example use cases discussed include detecting self-capture, improving the use of collaboration apps that are being screen shared, and optimizing stream parameters of the captured content.

Pion seemingly came out of nowhere to become one of the biggest and most active WebRTC communities. Pion is a Go-based set of WebRTC projects. Golang is an interesting language, but it is not among the most popular programming languages out there, so what is so special about Pion? Why are there so many developers involved in this project? 

To learn more about this project and how it came to be among the most active WebRTC organizations, I interviewed its founder – Sean Dubois. We discuss Sean’s background and how he got started in RTC – see the interview for the details. I really wanted to understand why he decided to build a new WebRTC project and why he continues to spend so much of his free time on it. ...  Continue reading

Chrome recently added the option of adding redundancy to audio streams using the RED format as defined in RFC 2198, and Fippo wrote about the process and implementation in a previous article. You should catch up on that post, but to summarize quickly: RED works by adding redundant payloads with different timestamps in the same packet. If you lose a packet in a lossy network, then chances are another successfully received packet will have the missing data, resulting in better audio quality.
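
For reference, a minimal sketch of how a client might opt in to RED today, assuming a browser that exposes audio/red in its sender capabilities (the peer connection setup is elided):

    // Move audio/red to the front of the codec list so it is negotiated as the
    // primary audio payload. Assumes browser support for RED.
    function preferRed(pc: RTCPeerConnection, track: MediaStreamTrack): void {
      const transceiver = pc.addTransceiver(track);
      const capabilities = RTCRtpSender.getCapabilities('audio');
      if (!capabilities) return;
      const red = capabilities.codecs.filter((c) => c.mimeType.toLowerCase() === 'audio/red');
      const rest = capabilities.codecs.filter((c) => c.mimeType.toLowerCase() !== 'audio/red');
      transceiver.setCodecPreferences([...red, ...rest]);
    }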

That was in a simplified one-to-one scenario, but audio quality issues often have the most impact on larger multi-party calls. As a follow-up to Fippo’s post, Jitsi Architect and Improving Scale and Media Quality with Cascading SFUs author Boris Grozev walks us through his design and tests for adding audio redundancy to a more complex environment with many peers routing media through a Selective Forwarding Unit (SFU).

{“editor”, “chad hart“}

Fippo covered how to add redundancy packets in standard peer-to-peer calls without any middle boxes like a Selective Forwarding Unit (SFU). What happens when you stick an SFU in the middle? There are a couple more things to consider.

  • How do we handle conferences where clients have different RED capabilities? It may be the case that only a subset of the participants in a conference support RED. In fact this will often be the case today since RED is a relatively new addition to WebRTC/Chromium/Chrome.
  • Which streams should have redundancy? Should we add redundancy for all audio streams at the cost of additional overhead, or just the currently active speaker (or 2-3 speakers)?
  • Which legs should have redundancy? In multi-SFU cascading scenarios, do we need to add redundancy for the SFU-SFU streams?
  •  ...  Continue reading

Back in April 2020, a Citizen Lab report covered Zoom’s rather weak encryption and stated that Zoom uses the SILK codec for audio. Sadly, the article did not contain the raw data to validate that claim and let me look at it further. Thankfully, Natalie Silvanovich from Google’s Project Zero helped me out using the Frida tracing tool and provided a short dump of some raw SILK frames. Analyzing them inspired me to take a look at how WebRTC handles audio. Audio is much more critical to the perceived quality of a call, as we tend to notice even small glitches. A mere ten seconds of this audio analysis were enough to set me off on quite an adventure investigating possible improvements to the audio quality provided by WebRTC.

     


    Software as a Service, Infrastructure as a Service, Platform as a Service, Communications Platform as a Service, Video Conferencing as a Service, but what about Gaming as a Service? There have been a few attempts at Cloud Gaming, most notably Google’s recently launched Stadia. Stadia is no stranger to WebRTC, but can others leverage WebRTC in the same way?

Thanh Nguyen set out to see if this was possible with his open source project, CloudRetro. CloudRetro is based on the popular Go-based WebRTC library, Pion (thanks to Sean of Pion for helping review here). In this post, Thanh gives an architectural review of how he built the project along with some of the benefits and challenges he ran into along the way.

    {“editor”, “chad hart“}

    Introduction

Last year, Google announced Stadia, and it blew my mind. The idea is so unique and innovative that I kept questioning how it was even possible with the current state of technology. The motivation to demystify its technology spurred me to create an open-source version of Cloud Gaming. The result is fantastic. I would like to share my one-year adventure working on this project in the article below.

    TLDR: the short slide version with highlights

    Why Cloud Gaming is the future

I believe Cloud Gaming will soon become the new generation, not only of games but of other fields of computer science as well. Cloud Gaming is the pinnacle of the client/server model. It maximizes backend control and minimizes frontend work by putting game logic on a remote server and streaming images/audio to the client. The server handles the heavy processing, so the client is no longer limited by hardware constraints.

Looking at Google Stadia, it essentially allows you to play AAA games (i.e. high-end blockbuster games) on an interface like YouTube. The same methodology can be applied to other heavy offline applications like operating systems or 2D/3D graphic design tools, so that we can run them consistently on low-spec devices across many platforms.

The future of this technology: imagine running Microsoft Windows 10 in a Chrome browser?

    Cloud Gaming is technically challenging

Gaming is one of the rare applications that require continuous, fast user reactions. If a page we click takes 2 seconds to respond once in a while, it is still acceptable. Live broadcast video streams typically run many seconds behind, yet still offer acceptable usability. However, if a game is frequently delayed by 500ms, it becomes unplayable. The target is to achieve extremely low latency, ensuring the gap between game input and media output is as small as possible. Therefore, the traditional video streaming approach is not applicable here.

    Cloud Gaming common pattern

    The Open Source CloudRetro Project

I decided to create a POC of cloud gaming so that I could verify whether it was possible with these tight network constraints. I picked Golang for my POC because it is the language I am most familiar with, and it turned out to work well for many other reasons. Go is simple, with fast development speed. Go channels are excellent when dealing with concurrency and stream manipulation.

The project is CloudRetro.io: an open-source, web-based cloud gaming service for retro games. The goal is to bring the most comfortable gaming experience possible and introduce network gameplay, like online multiplayer, to traditional retro games.

    You can reference the entire project repo here: https://github.com/giongto35/cloud-game

    CloudRetro Functionality

CloudRetro uses retro games to demonstrate the power of cloud gaming. It enables many unique gaming experiences.

• Portable gaming experience
  • Instant play when the page is opened; no download, no install
  • Runs in the browser on desktop and mobile, so you don’t need any software to launch it
• Gaming sessions can be shared across multiple devices and stored in cloud storage for next time
• Games are both streamed and playable, and multiple users can join the same game:
  • Crowdplay, like TwitchPlaysPokemon but more real-time and seamless
  • Online multiplayer for offline games without any network setup – Samurai Shodown is now playable with 2 players over the network in CloudRetro

      Infrastructure

      Requirement and Tech Stack

      Below is the list of requirements I set before starting the project.

      1. Singleplayer:

This requirement may sound irrelevant and straightforward, but it’s one of my key findings, and it is what makes cloud gaming stand apart from traditional streaming services. If we focus on single-player, we can get rid of a centralized server or CDN because we don’t need to distribute the stream session to massive numbers of users. Rather than uploading streams to an ingest server or passing packets to a centralized WebSocket server, the service streams to the user directly over a WebRTC peer connection.

      2. Low Latency media stream

When I researched Stadia, some articles mentioned its use of WebRTC. I found that WebRTC is a remarkable technology that fits this cloud gaming use case nicely. WebRTC is a project that provides web browsers and mobile applications with real-time communication via simple APIs. It enables peer communication, is optimized for media, and has built-in standard codecs like VP8 and H264.

I prioritized delivering the smoothest experience to users over keeping high-quality graphics. Some loss is acceptable in the algorithm. On Google Stadia, there is an additional step to reduce image size on the server, and image frames are rescaled to higher quality before being rendered for peers.

      3. Distributed infrastructure with geographic routing.

No matter how optimized the compression algorithm and the code are, the network is still the crucial factor contributing the most to latency. The architecture needs a mechanism to pair the user with the closest server to reduce round-trip time (RTT). The architecture contains a single coordinator and multiple streaming servers distributed around the world: US West, US East, Europe, Singapore, and China. All streaming servers are fully isolated. The system can adjust its distribution when a server joins or leaves the network. Hence, under high traffic, adding more servers allows horizontal scaling.

      4. Browser Compatible

Cloud Gaming shines brightest when it demands the least from users. This means being able to run in a browser. Browsers help bring the most comfortable gaming experience to users by removing software and hardware installs. Browsers also provide cross-platform flexibility across mobile and desktop. Fortunately, WebRTC has excellent support across different browsers.

      5. Clear separation of Game interface and service

I see the cloud gaming service as a platform: one should be able to plug any game into it. Currently, I integrated LibRetro with the cloud gaming service because LibRetro offers a beautiful gaming emulator interface for retro games on systems like the SNES, GBA, and PS.

      6. Room based mechanism for multiplayer, crowd play and deep-link to the game

CloudRetro enables many novel styles of gameplay, like CrowdPlay and Online MultiPlayer, for retro games. If multiple users open the same deep-link on different machines, they will see the same running game as a video stream and even be able to join the game as any player.

Moreover, game states are stored in cloud storage. This lets users continue their game at any time on any device.

      7. Horizontal scaling

As with every SaaS nowadays, the service must be designed to be horizontally scalable. The coordinator-worker design enables adding more workers to serve more traffic.

      8. Cloud Agnostic

The CloudRetro infrastructure is hosted on various cloud providers (Digital Ocean, Alibaba, a custom provider) to target different regions. The infrastructure is Dockerized, and network settings are configured through bash scripts to avoid dependency on any one cloud provider. Combining this with WebRTC’s NAT traversal, we gain the flexibility to deploy CloudRetro on any cloud platform, and even on users’ own machines.

      Architectural design

Worker (or streaming server, as referred to above): spawns games, runs the encoding pipeline, and streams the encoded media to users. Worker instances are distributed around the world, and each worker can handle multiple user sessions concurrently.

      Coordinator: in charge of pairing the new user with the most appropriate worker for streaming. The coordinator interacts with workers over WebSocket.

      Game state storage: central remote storage for all game states. This storage enables some essential functionalities such as remote saving/loading.

      User flow

When a new user opens CloudRetro (steps 1 and 2 in the image below), the coordinator is asked for the frontend page along with the list of available workers. After that, at step 3, the client measures its latency to all candidates using an HTTP ping request. The list of latencies is then sent back to the coordinator so that it can determine the most suitable worker to serve the user. At step 4, the game is spawned and a WebRTC stream connection is established between the user and the designated worker.
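
A sketch of what such a latency probe might look like in the browser, timing a small HTTP request to each candidate worker; the /echo endpoint name is an assumption, not CloudRetro’s actual API:

    // Illustrative sketch: measure the HTTP round-trip time to each candidate
    // worker so the coordinator can pick the closest one. Endpoint path assumed.
    async function measureLatencies(workerUrls: string[]): Promise<Map<string, number>> {
      const latencies = new Map<string, number>();
      await Promise.all(
        workerUrls.map(async (url) => {
          const start = performance.now();
          try {
            await fetch(`${url}/echo`, { cache: 'no-store' });
            latencies.set(url, performance.now() - start);
          } catch {
            latencies.set(url, Number.POSITIVE_INFINITY); // unreachable worker
          }
        }),
      );
      return latencies;
    }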

      Inside the worker

Inside a worker, the game and streaming pipelines are kept isolated and exchange information via an interface. Currently, this communication is done by in-memory transmission over Golang channels in the same process. Further segregation is the next goal, i.e., running the game independently in a different process.

      The main pieces are:

      • WebRTC: Client-facing component where user input comes in and the encoded media from the server goes out.
• Game Emulator: The game component. Thanks to the Libretro library, the system is capable of running a game inside the same process and internally hooking the media and input flows. In-game frames are captured and sent to the encoder.
• Image/Audio Encoder: The encoding pipeline, which accepts media frames, encodes them in the background, and outputs encoded images/audio.

      Implementation

CloudRetro relies on WebRTC as its backbone, so before going into the details of my implementation in Golang, the first section is dedicated to introducing WebRTC, an awesome technology that greatly helped me achieve sub-second latency streaming.

      WebRTC

      WebRTC is designed to enable high-quality peer-to-peer connections on native mobile and browsers with simple APIs.

      NAT Traversal

WebRTC is renowned for its NAT traversal functionality. WebRTC is designed for peer-to-peer communication. It aims to find the most suitable direct route through NAT gateways and firewalls for peer communication via a process named ICE. As part of this process, the WebRTC APIs find your public IP address using STUN servers and fall back to a relay server (TURN) when direct communication cannot be established.

However, CloudRetro doesn’t fully utilize this capability. Its peer connections are not between users and users but between users and cloud servers. The server side of the model has fewer restrictions on direct communication than a typical user device. This allows doing things like pre-opening inbound ports or using public IP addresses directly, as the server is not behind NAT.

I previously had ambitions of developing the project into a game distribution platform for cloud gaming. The idea was to let game creators contribute games and streaming resources, with users paired directly with the game creators’ providers. In this decentralized manner, CloudRetro is just a medium connecting third-party streaming resources with users, making it more scalable since the burden of hosting no longer falls on CloudRetro. WebRTC NAT traversal will play an important role here, as it eases peer connection initialization for third-party streaming resources, making it effortless for creators to join the network.

      Video Compression

Video compression is an indispensable part of the pipeline that greatly contributes to a smooth streaming experience. Even though it is not compulsory to know all of VP8/H264’s video coding details, understanding the concepts helps to demystify streaming speed parameters, debug unexpected behavior, and tune the latency.

Video compression for a streaming service is challenging because the algorithm needs to ensure that the total of encoding time + network transmission + decoding time is as small as possible. In addition, the encoding process needs to be serial and continuous. Some traditional encoding trade-offs are not applicable, such as trading long encoding times for smaller file sizes and decoding times, or compressing without order.

The idea of video compression is to omit non-essential bits of information while keeping an understandable level of fidelity for users. In addition to encoding individual static image frames, the algorithm infers the current frame from previous and future frames, so only the difference is sent. As you can see in the Pacman example below, only the differential dots are transferred.

      Audio Compression

Similarly, the audio compression algorithm omits data that cannot be perceived by humans. The best-performing audio codec currently is Opus. Opus is designed to transmit audio over a datagram protocol such as RTP (Real-time Transport Protocol). It produces lower latency than MP3 or AAC, with higher quality. The delay is usually around 5 to 66.5 ms.

      Pion, WebRTC in Golang

Pion is an open-source project that brings WebRTC to Golang. Rather than simply wrapping the native C++ WebRTC libraries, Pion is a native Golang implementation, which allows for better performance, better Golang integration, and control over the constituent WebRTC protocols.

The library also provides sub-second latency streaming with many great built-ins. It has its own implementations of STUN, DTLS, SCTP, etc., and some experiments with QUIC and WebAssembly. The open-source library itself is a really good learning resource, with great documentation, network protocol implementations, and cool examples.

      The Pion community, led by a very passionate creator, is lively and has many quality discussions about WebRTC. If you are interested in this technology, please join http://pion.ly/slack – you will learn many new things.

Writing CloudRetro in Golang

      Go Channel In Action

Thanks to Go’s beautiful channel design, event streaming and concurrency problems are greatly simplified. As in the diagram, multiple components run in parallel in different goroutines. Each component manages its own state and communicates over channels. Golang’s select statement enforces that one atomic event is processed per game tick. This means locking is unnecessary with this design. For example, when a user saves, a complete snapshot of the game state is required. This state needs to remain uninterrupted by incoming input until the save is complete. During each game tick, the backend can only process either a save operation or an input operation, so it is concurrency-safe.

      Fan-in / Fan-out

This Golang pattern perfectly matches my use case for CrowdPlay and multiple players. Following this pattern, all user inputs in the same room are fanned in to a central input channel. Game media is then fanned out to all users in the same room. Hence, we achieve game state sharing between multiple gaming sessions from different users.

      Synchronization between different sessions

      Downsides of Golang

Golang isn’t perfect. Channels are slow. Compared to locks, a Go channel is just a simpler way to handle concurrency and streaming events, but channels do not give the best performance; there is complex locking logic under the hood. Hence, I made some adjustments to the implementation, reapplying locks and atomic values in place of channels to optimize performance.

In addition, the Golang garbage collector is uncontrollable, so sometimes there are suspiciously long pauses. This greatly hurts the real-time performance of this streaming application.

      CGO

The project uses some existing Golang open-source VP8/H264 libraries for media compression and Libretro for game emulators. All of these libraries are just wrappers of C libraries in Go using CGO. There are some drawbacks, which you can read about in this blog post by Dave. The issues I’m facing are:

• Crashes in CGO are not caught, even with Golang’s recover
• Performance bottlenecks are hard to pin down because granular issues cannot be detected under CGO

      Conclusion ...  Continue reading

As you may have heard, WhatsApp discovered a security issue in their client which was actively exploited in the wild. The exploit did not require the target to pick up the call, which is really scary.
Since there are not many facts to go on, let’s do some tea-leaf reading…

The security advisory issued by Facebook says:

    A buffer overflow vulnerability in WhatsApp VOIP stack allowed remote code execution via specially crafted series of SRTCP packets sent to a target phone number.

This is not much detail; investigations are probably still ongoing. I would very much like to hear a post-mortem on how WhatsApp detected the abuse.

    We know there is an issue with SRTCP, the control protocol used with media transmission. This can mean two things:

    1. there is an issue with how RTCP packets are decrypted, i.e. at the SRTCP layer
    2. there is an issue in the RTCP parser

SRTCP is quite straightforward, so a bug in the RTCP parser is more likely. As I said last year, I was surprised that Natalie Silvanovich’s fuzzer (named “Fred” because why not?) did not find any issues in the webrtc.org RTCP parser.

We actually have a few hard facts, provided by the binary diff from Checkpoint Research, wherein they analyzed how the patched version differs.

    They found two interesting things:

    • there is an additional size check in the RTCP module, ensuring less than 1480 bytes
    • RTCP is processed before the call is answered

Let’s talk about RTCP

RTCP, the RTP Control Protocol, is a rather complicated protocol described in RFC 3550. It provides feedback about how the RTP media stream is doing, such as packet loss. A UDP datagram can multiplex multiple individual RTCP packets into what is called a compound packet. When an RTCP compound packet is encrypted using SRTCP, all of the packets are encrypted together with a single authentication tag that is usually 12 bytes long.
To make demuxing compound packets possible, each individual RTCP packet specifies its length in a 16-bit field. For example, a sender report packet starts like this:

    The length field is defined as

    length: 16 bits
    The length of this RTCP packet in 32-bit words minus one,
    including the header and any padding. (The offset of one makes
    zero a valid length and avoids a possible infinite loop in
    scanning a compound RTCP packet, while counting 32-bit words
    avoids a validity check for a multiple of 4.)

which is… rather complicated. In particular, this definition means that the RTCP parser MUST validate the length field against the length of the datagram and the remaining bytes in the packet. Some RTCP packets even have additional internal length fields.
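
To make that concrete, here is a minimal sketch (in TypeScript, and not any particular library’s code) of walking a compound RTCP packet while validating each length field against the remaining bytes:

    // Walk a compound RTCP packet, validating each individual packet's length
    // field against the bytes that actually remain in the datagram.
    function walkCompoundRtcp(buffer: Uint8Array): void {
      let offset = 0;
      while (offset + 4 <= buffer.length) {
        const packetType = buffer[offset + 1]; // e.g. 200 for a sender report
        // 16-bit length in 32-bit words minus one, so bytes = (length + 1) * 4.
        const lengthWords = (buffer[offset + 2] << 8) | buffer[offset + 3];
        const lengthBytes = (lengthWords + 1) * 4;
        if (offset + lengthBytes > buffer.length) {
          throw new Error('RTCP length field exceeds the remaining bytes');
        }
        console.log(`RTCP packet type=${packetType}, ${lengthBytes} bytes`);
        offset += lengthBytes;
      }
    }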

For the first packet in a compound packet, length validation is usually done by the library implementing SRTCP, like libSRTP. Mind you, WhatsApp probably uses PJSIP and PJMEDIA, or at least they did back in 2015 when I took a look.

The length check for the second packet needs to be done by the RTCP library. I would not be surprised if this is where things went south. Been there, done that. And it remains a bit unclear whether the length field is validated against the remaining bytes. 1480 seems like a very odd number to check for though. At first I thought this made sense since it was 1492 minus the 12 bytes for the SRTCP tag, but the maximum payload size of UDP over Ethernet turned out to be 1472 bytes (a 1500-byte MTU minus 20 bytes of IP and 8 bytes of UDP headers), not 1492. So now I end up being confused again…

    Don’t process data from strangers

There is another issue here. As the New York Times article said, it looks like the victims received calls they never answered.

    Checkpoint’s analysis ...  Continue reading

    A while ago we looked at how Zoom was avoiding WebRTC by using WebAssembly to ship their own audio and video codecs instead of using the ones built into the browser’s WebRTC.  I found an interesting branch in Google’s main (and sadly mostly abandoned) WebRTC sample application apprtc this past January. The branch is named wartc… a name which is going to stick as warts!

    The repo contains a number of experiments related to compiling the webrtc.org library as WebAssembly and evaluating the performance. From the rapid timeline, this looks to have been a hackathon project.

    Project Architecture

    The project includes a few pieces:

    • encoding individual images using libwebp as a baseline
    • using libvpx for encoding video streams
    • using WebRTC for audio and video

    Transferring the resulting data is done using the WebRTC DataChannel. The high-level architecture is shown below:

I decided to apply the WebAssembly (aka wasm) techniques to the WebRTC sample pages, since apprtc is a bit cumbersome to set up. The sample pages are an easier framework to work with and don’t need a signaling server. Actually, you do not need any server of your own, since you can simply run them from GitHub Pages.

    You can find the source code here.

    Basic techniques

    Before we get to WebAssembly, first we need to walk through the common techniques to capture and render RTC audio and video. We will use WebAudio to grab and play raw audio samples and the HTML5 canvas element to grab frames from a video and render images.

    Let’s look at each.

    WebAudio

    Grab audio from a stream using WebAudio

To grab audio from a MediaStream, one can use a WebAudio MediaStreamSource and a ScriptProcessor node. After connecting these, the ScriptProcessorNode’s onaudioprocess callback fires with an object containing the raw samples, which can then be sent over the DataChannel.
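
A minimal sketch of that capture path (ScriptProcessorNode is deprecated now, but it is what this era of code used):

    // Capture raw PCM from a MediaStream and ship it over a DataChannel.
    function captureAudio(stream: MediaStream, channel: RTCDataChannel): void {
      const ctx = new AudioContext();
      const source = ctx.createMediaStreamSource(stream);
      const processor = ctx.createScriptProcessor(4096, 1, 1); // buffer size, in/out channels
      processor.onaudioprocess = (event: AudioProcessingEvent) => {
        const samples = event.inputBuffer.getChannelData(0); // Float32Array of raw samples
        if (channel.readyState === 'open') {
          channel.send(samples.slice(0)); // copy; the underlying buffer is reused
        }
      };
      source.connect(processor);
      processor.connect(ctx.destination); // keeps the node processing in some browsers
    }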

    Render audio using WebAudio

Rendering audio is a bit more complicated. The solution that seems to have come up during the hackathon is quite a good hack. It creates an AudioContext with a square-wave oscillator as input, connects it to a ScriptProcessorNode, and pulls data from a buffer (which is fed by the data channel) at a high frequency. The ScriptProcessorNode is then connected to the AudioContext’s destination node, which plays things out without needing an element, similar to what we have seen Zoom do.
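
A sketch of that playout hack, assuming the sender uses the same 4096-sample buffer size:

    // A square-wave oscillator drives a ScriptProcessorNode whose callback
    // overwrites the output with samples queued from the DataChannel.
    function playAudio(channel: RTCDataChannel): void {
      const ctx = new AudioContext();
      const queue: Float32Array[] = [];
      channel.binaryType = 'arraybuffer';
      channel.onmessage = (e: MessageEvent) => queue.push(new Float32Array(e.data));

      const oscillator = ctx.createOscillator();
      oscillator.type = 'square'; // only drives the callback; its signal is discarded
      const processor = ctx.createScriptProcessor(4096, 1, 1); // matches the sender's buffer size
      processor.onaudioprocess = (event: AudioProcessingEvent) => {
        const output = event.outputBuffer.getChannelData(0);
        const next = queue.shift();
        output.set(next ?? new Float32Array(output.length)); // play silence on underrun
      };
      oscillator.connect(processor);
      processor.connect(ctx.destination); // plays out without an <audio> element
      oscillator.start();
    }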

    Try these here. Make sure to open about:webrtc or chrome://webrtc-internals to see the WebRTC connection in action.

    Canvas

    Grab images from a canvas element

Grabbing image data from a canvas is quite simple. After creating a canvas element with the desired width and height, we draw the video element to the canvas context and then call getImageData to get the data, which we can then process further.
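
A minimal sketch, assuming a video element that is already playing the stream:

    // Grab one RGBA frame from a <video> element via an offscreen canvas.
    function grabFrame(video: HTMLVideoElement, width: number, height: number): ImageData {
      const canvas = document.createElement('canvas');
      canvas.width = width;
      canvas.height = height;
      const context = canvas.getContext('2d')!;
      context.drawImage(video, 0, 0, width, height);
      return context.getImageData(0, 0, width, height); // raw RGBA pixel data
    }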

    Draw images to a canvas element

Drawing to a canvas is equally simple: create a frame of the right size and then set the frame’s data to the data we want to draw. When this is done at a high enough frequency, the result looks quite smooth. To optimize the process, this can be coordinated with the rendering using the requestAnimationFrame method.
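
A sketch of the rendering side, here coordinated with requestAnimationFrame as suggested above (the getPixels callback stands in for whatever feeds decoded frames):

    // Continuously draw received RGBA pixel data into a canvas.
    function startRenderer(canvas: HTMLCanvasElement, getPixels: () => Uint8ClampedArray): void {
      const context = canvas.getContext('2d')!;
      const frame = context.createImageData(canvas.width, canvas.height);
      const draw = () => {
        frame.data.set(getPixels()); // assumes the buffer matches the canvas dimensions
        context.putImageData(frame, 0, 0);
        requestAnimationFrame(draw);
      };
      requestAnimationFrame(draw);
    }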

Try these here.

    Encoding images using libwebp

Encoding images using libwebp is a baseline experiment. Each frame is encoded as an individual image, with no reference to other frames like in a video codec. If this example did not deliver acceptable visual quality, it would not make sense to expand the experiment to a more advanced stage.
The code is a very simple extension of the basic code that grabs and renders frames. The only difference is a synchronous call to the webp encoder.
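
Something along these lines, where encodeWebp stands in for the project’s actual wasm binding (a hypothetical name):

    // Hypothetical binding to the libwebp wasm module; encodeWebp is a stand-in
    // name. Each frame is encoded synchronously as a standalone image.
    declare function encodeWebp(
      rgba: Uint8ClampedArray, width: number, height: number, quality: number): Uint8Array;

    function encodeAndSend(frame: ImageData, channel: RTCDataChannel): void {
      const webp = encodeWebp(frame.data, frame.width, frame.height, 75);
      channel.send(webp); // small enough at 320×240 to skip fragmentation
    }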

The DataChannel limits the transmitted object size to 65 KB. Anything larger needs to be fragmented, which means more code. In our sample we use a 320×240 resolution. At this low resolution, we come in below 65 KB and do not need to fragment and reassemble the data.

    We show a side-by-side example of this and WebRTC here. The visual quality is comparable but the webp stream seems to be slightly delayed. This is probably only visible in a side-by-side comparison.

In terms of bitrate, this can easily go up to 1.5mbps for a 320×240@30fps stream (and using a decent CPU). WebRTC clearly delivers the same quality at a much lower bitrate. You will also notice that the page gets throttled when in the background, and setInterval is no longer executed at a high frequency.

    Encoding a video stream using libvpx

The WebP example encodes a series of individual images. But we are really encoding a series of images that follow each other in time and therefore repeat a lot of content that is easily compressed. Using libvpx, an actual video encoding/decoding library that contains the VP8 and VP9 codecs, has some benefits as we will see.

Once again, the basic techniques for grabbing and rendering the frames remain the same. The encoder and decoder run in a WebWorker this time, which means we need to use postMessage to send and receive frames.
Sending is done with a postMessage call:
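
A sketch, assuming a worker script name and a message shape of { type, width, height, data } (the repo’s actual format may differ):

    // Hand a raw frame to the encoder worker, transferring the pixel buffer
    // instead of copying it. Worker script name and message shape are assumed.
    const vpxenc = new Worker('vpx-encoder-worker.js');
    function sendFrame(frame: ImageData): void {
      vpxenc.postMessage(
        { type: 'encode', width: frame.width, height: frame.height, data: frame.data.buffer },
        [frame.data.buffer],
      );
    }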

    and the encoder will send a message containing the encoded frame:
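
Something like the following, with sendOverDataChannel handling the fragmentation shown next:

    // Receive the encoded frame back from the worker; the { frame } payload
    // shape is an assumption. channel is the DataChannel from earlier examples.
    declare const channel: RTCDataChannel;
    let frameId = 0;
    vpxenc.onmessage = (event: MessageEvent) => {
      const encoded = new Uint8Array(event.data.frame); // compressed VP8 frame
      sendOverDataChannel(channel, frameId++, encoded); // fragmented below if needed
    };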

Our data is bigger than 65 KB this time, so we need to fragment it before sending it over the DataChannel:
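
A sketch of one way to do this, using a small assumed 3-byte header of frame id, chunk index, and chunk count (not the repo’s actual protocol):

    // Split an encoded frame into chunks under the DataChannel message limit.
    const CHUNK_SIZE = 16 * 1024; // stay comfortably below the ~64 KB limit

    function sendOverDataChannel(channel: RTCDataChannel, frameId: number, data: Uint8Array): void {
      const count = Math.ceil(data.length / CHUNK_SIZE);
      for (let i = 0; i < count; i++) {
        const chunk = data.subarray(i * CHUNK_SIZE, (i + 1) * CHUNK_SIZE);
        const packet = new Uint8Array(3 + chunk.length);
        packet[0] = frameId & 0xff; // frame ids wrap at 256
        packet[1] = i;
        packet[2] = count;
        packet.set(chunk, 3);
        channel.send(packet);
      }
    }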

    and reassemble on the other side.
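
A matching reassembly sketch, using the same assumed 3-byte header:

    // Buffer chunks per frame id until all of them have arrived.
    const pending = new Map<number, { chunks: Uint8Array[]; received: number }>();

    function onChunk(packet: Uint8Array, onFrame: (frame: Uint8Array) => void): void {
      const [id, index, count] = [packet[0], packet[1], packet[2]];
      const entry = pending.get(id) ?? { chunks: new Array<Uint8Array>(count), received: 0 };
      pending.set(id, entry);
      entry.chunks[index] = packet.subarray(3);
      if (++entry.received === count) {
        pending.delete(id);
        const total = entry.chunks.reduce((sum, c) => sum + c.length, 0);
        const frame = new Uint8Array(total);
        let offset = 0;
        for (const c of entry.chunks) {
          frame.set(c, offset);
          offset += c.length;
        }
        onFrame(frame); // hand the reassembled frame to the decoder
      }
    }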

Comparing this with WebRTC, we get pretty close already; it is hard to spot a visual difference. The bitrate is much lower as well, only using about 500kbps compared to the 1.5mbps of the webp sample.

Note that if you want to tinker with libvpx, it is probably better to use Surma’s webm-wasm, which also gives you the source code used to build the WebAssembly module.

    Easy simulation of packet loss

    There is an interesting experiment one can do here: drop packets.

The easiest way to do so is to introduce some code that stops vpxenc._onmessage from sending some data:
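
For example, a monkey-patch along these lines (the 5% loss rate is arbitrary):

    // Randomly discard encoded frames before they reach the DataChannel to
    // simulate packet loss. vpxenc is the encoder worker from above.
    declare const vpxenc: Worker;
    const LOSS_RATE = 0.05;
    const originalOnMessage = vpxenc.onmessage!;
    vpxenc.onmessage = (event: MessageEvent) => {
      if (Math.random() < LOSS_RATE) return; // drop: pretend this data was lost
      originalOnMessage.call(vpxenc, event);
    };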

The decoder will still attempt to decode the stream, but there are quite severe artifacts until the next keyframe arrives. WebRTC (which is based on RTP) has built-in mechanisms to recover from packet loss, by either requesting the sender to resend a lost packet (using an RTCP NACK) or to send a new keyframe (using an RTCP PLI). This could be added to the protocol that runs on top of the DataChannel of course, or the WebAssembly could simply emit and process RTP and RTCP packets like Zoom does.

    Compiling WebRTC to WebAssembly

    Find the sample here.

Last but not least, the hackathon team managed to compile webrtc.org as WebAssembly, which is no small feat. Note that synchronization between audio and video is a very hard problem that we have ignored so far. WebRTC does this for us magically.

We are focusing on audio this time, as this was the only thing we got to work. The structure is a bit different from the previous examples; this time we are using a Transport object that is defined in WebAssembly.

The sendPacket function is called with each RTP packet, which is already guaranteed to be smaller than 65 KB, so we do not need to split it up ourselves.
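
A hypothetical sketch of how such a Transport might be backed by the DataChannel; the interface shape is a guess based on the description, not the actual wasm API:

    // Hypothetical Transport shape: the wasm module calls sendPacket for every
    // outgoing RTP packet, and we forward it over the DataChannel unmodified.
    interface Transport {
      sendPacket(packet: Uint8Array): void;
    }

    function makeTransport(channel: RTCDataChannel): Transport {
      return {
        sendPacket(packet: Uint8Array): void {
          channel.send(packet); // RTP packets are already < 65 KB, no fragmentation needed
        },
      };
    }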

This turned out to not work very well; the audio was severely distorted. One of the problems is confusion between the platform/WebAudio sampling rate of 44.1kHz and the Opus sampling rate of 48kHz. This is fairly easy to fix, though, by replacing a couple of references to 480 with 441.

In addition, the encoder seems to lock up at times, for reasons which are not really possible to debug without access to the source code used to build it. Simply recording the packets and playing them out in the decoder worked better. Clearly this needs a bit more work.

    Summary

The hackathon results are very interesting. They show what is possible today, even without WebRTC NV’s lower-level APIs, and give us a much better idea of what problems still need to be solved there. In particular, the current ways of accessing raw data and feeding it back into the engine shown above are cumbersome and inefficient at present. It would of course be great if the Google folks were a bit more open about these results, which are quite interesting (nudge 🙂 ), but… the repository is public at least.

Zoom’s web client still achieves a better result…

{“author”: “Philipp Hancke“}

...  Continue reading


QUIC-based DataChannels are being considered as an alternative to the current SCTP-based transport. The WebRTC folks at Google are experimenting with it:

Let’s test this out. We’ll do a simple single-page example, similar to the WebRTC datachannel sample, that transfers text ...  Continue reading