Editor note: see the updated version of this post here.
As described in previous posts, WebRTC does not specify a particular signalling model other than the generic need to exchange Session Description Protocol (SDP) media descriptions in the offer/answer fashion.
During the last few months, my friend Antón Román (CTO of Quobis) and I spent a lot of time with our team figuring out how to manipulate and adapt the SDPs generated by web browsers to make them compatible with the different server/gateway technologies we’re working with.
WebRTC makes use of new mechanisms as well as existing ones that have seen little deployment in real networks to date. As a result, SDPs generated by web browsers are more complex and contain a number of new attributes that are unfamiliar in SIP or IMS networks. In the following post, Antón analyses the anatomy of a WebRTC SDP, giving a detailed description of what all those lines do.
Anatomy of a WebRTC SDP (by Antón Román)
If you are reading this blog you likely know that Session Description Protocol (SDP) plays a central role in the setup of WebRTC sessions. SDP is the protocol used to exchange media information between SIP endpoints, and it has also been chosen by IETF and W3C to exchange media information in WebRTC. A WebRTC peer uses SDP to inform the other end about which transport protocols, ports, codecs and other parameters to use in a media session.
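To make this concrete, here is a minimal Python sketch of how the media kind, port, transport protocol and codecs can be read out of an SDP. The SDP fragment and the `parse_media_sections` helper are hypothetical and only for illustration; they are not taken from the Chromium capture discussed later, and a real WebRTC SDP carries many more attributes.

```python
# Hypothetical SDP fragment for illustration only (addresses and ports are made up).
SAMPLE_SDP = "\r\n".join([
    "v=0",
    "o=- 123456 2 IN IP4 127.0.0.1",
    "s=-",
    "t=0 0",
    "m=audio 49170 RTP/SAVPF 111 0",
    "c=IN IP4 192.0.2.10",
    "a=rtpmap:111 opus/48000/2",
    "a=rtpmap:0 PCMU/8000",
    "m=video 49172 RTP/SAVPF 100",
    "a=rtpmap:100 VP8/90000",
]) + "\r\n"


def parse_media_sections(sdp):
    """Return one dict per m= section with its kind, port, transport and codecs."""
    sections = []
    for line in sdp.splitlines():
        if line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...
            # (the optional "<port>/<number of ports>" form is not handled here)
            kind, port, proto, *payloads = line[2:].split()
            sections.append({
                "kind": kind,
                "port": int(port),
                "proto": proto,
                "payload_types": payloads,
                "codecs": {},
            })
        elif line.startswith("a=rtpmap:") and sections:
            # a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
            payload_type, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            sections[-1]["codecs"][payload_type] = encoding
    return sections


for section in parse_media_sections(SAMPLE_SDP):
    print(section["kind"], section["port"], section["proto"], section["codecs"])
```

Each `m=` line opens a media section, and the `a=rtpmap` attributes that follow map the payload type numbers listed on that line to codec names and clock rates.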
SDP use in WebRTC has been criticized by many since the beginning. These critiques have a sound rationale, and I strongly recommend the Iñaki Baz interview about Object RTC (ORTC), the proposed alternative to SDP for WebRTC, to learn more about these arguments. SDP may not be the most flexible or scalable way to negotiate WebRTC sessions, but it has nonetheless been adopted by the current standard (namely WebRTC 1.0) and all the existing implementations are based on it. Therefore we will not deal with the SDP debate (and what could happen in WebRTC 2.0) in this post, but will instead focus on reviewing the information included in the SDP at the time of this writing and how this information is used to create a WebRTC multimedia session.
SDP is defined in RFC 4566, but this specification only defines the main headers; how the offer/answer model works is specified separately in RFC 3264. There are many other RFCs that cover how to add different media capabilities to SDP. This draft reviews all the RFCs involved in the SDPs for WebRTC; I strongly encourage you to read it if you need a deeper understanding of any of the SDP elements.
Let’s take a look at a real SDP message that was created just prior to being sent to the other peer by Chromium version 32.0.1700.19 to start a video/audio session. Click on the SDP image below to go to our interactive SDP anatomy page for a line-by-line description:
I would like to end this post with a brief comment about SDP incompatibilities among Chrome versions. Finding that your WebRTC app does not work with the latest version of Chrome is not an uncommon issue. New features are added, the implementation is improved and bugs are fixed, and these changes may have an impact on the SDP. SDP problems are expected to become less common as the implementations mature and become more homogeneous across browser vendors, but for the time being this is something that must be taken into consideration.
Aswath Rao says
Do you know whether the current implementations support detecting peer reflexive addresses and using them as ICE candidates? Thanks.
Anton Roman says
Honestly, I haven’t found prflx candidates in the tests I’ve done so far. Both Chrome and Firefox have code to deal with peer reflexive candidates, so I understand they are using them in their ICE implementations. In google-ice it seems that prflx candidates are not supported (at least that’s what a comment in the Chromium code says). Maybe someone can give us more info about this.
Just for the information of other readers, a peer reflexive candidate is discovered when the reply to a STUN connectivity check (a Binding Request) sent to a Server Reflexive candidate of the other peer has a MAPPED-ADDRESS (IP and/or port) different from any existing local candidate.
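As a rough illustration of that rule, the following Python sketch compares the mapped address reported in a connectivity check response with the local candidates already known. The `Candidate` type and the `check_peer_reflexive` function are hypothetical names used only for this example; they do not come from any browser's ICE implementation, and real agents also track priorities, foundations and base addresses.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    ip: str
    port: int
    typ: str  # "host", "srflx", "prflx" or "relay"


def check_peer_reflexive(local_candidates: List[Candidate],
                         mapped_ip: str,
                         mapped_port: int) -> Optional[Candidate]:
    """Return a new prflx candidate if the mapped address from a connectivity
    check response matches no known local candidate, otherwise None."""
    known = {(c.ip, c.port) for c in local_candidates}
    if (mapped_ip, mapped_port) in known:
        return None  # address already known, nothing new was discovered
    return Candidate(mapped_ip, mapped_port, "prflx")


# Made-up addresses: one host candidate and one server reflexive candidate.
local = [
    Candidate("192.168.1.5", 50000, "host"),
    Candidate("203.0.113.7", 61000, "srflx"),
]
print(check_peer_reflexive(local, "203.0.113.7", 61000))  # known address
print(check_peer_reflexive(local, "203.0.113.7", 62001))  # different port mapping
```

The second call models a NAT that allocates a different port for traffic towards the peer than it did towards the STUN server, which is the typical case where a peer reflexive candidate shows up.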
Aswath Rao says
Let me elaborate why I asked that question and get your feedback. I am trying to streamline ICE procedure by taking advantage of trickle ICE.
The app has a media relay server (not TURN per se, but a “twice NAT”) and it inserts a port at this RS in both the O/A messages, which do not contain any ICE candidates, including local addresses. The two endpoints will then do connectivity checks on them, which will succeed, and the media will flow. In that process, they will discover each other’s “server reflexive” addresses as peer reflexive addresses. If both addresses are the same, then they will exchange their local addresses as well. Now they will do a full connectivity test and, if one of the checks succeeds, media will flow over this new connection, freeing up the relay server.
The advantages are:
1. Media can start to flow immediately, but the expensive relay server is freed up as soon as it can be.
2. No time is lost on STUN queries.
3. The local address is not revealed if it is not of use, eliminating a potential threat vector.
So, what do you think?
I created an SDP parser so that I can manipulate SDP in the browser: https://github.com/beradrian/sdpparser.