17 comments on “Signalling Options for WebRTC Applications”

  1. Nice article, guys. It brought to mind this sentence from Rosenberg: “the need for having inter-provider standards is gone”

    In my opinion the main decision to make is whether you should use a standard protocol (SIP or XMPP) or create your own ad-hoc protocol. If you have a running infrastructure based on SIP or XMPP, the best choice is probably to keep using them from browsers too, to reduce the complexity and maintenance cost of protocol translators. But if you are creating a new service/infrastructure, it could be a good choice to create your own simple, ad-hoc, JSON-based protocol for your specific use case.
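
As a rough illustration of what a simple ad-hoc JSON-based protocol might look like, here is a minimal sketch. Everything in it is an assumption made for the example: the message types (`offer`, `answer`, `candidate`, `bye`), the field names (`type`, `session`, `payload`) and the helper functions are hypothetical and do not come from any standard or from the article. The SDP string is a placeholder.

```python
import json

# Hypothetical message types for an ad-hoc signalling protocol (not a standard).
MESSAGE_TYPES = {"offer", "answer", "candidate", "bye"}


def encode_message(msg_type, session_id, payload):
    """Serialize one signalling message to a JSON string."""
    if msg_type not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type: {msg_type}")
    return json.dumps({"type": msg_type, "session": session_id, "payload": payload})


def decode_message(raw):
    """Parse a JSON signalling message and check its basic shape."""
    msg = json.loads(raw)
    if not isinstance(msg, dict):
        raise ValueError("message must be a JSON object")
    missing = {"type", "session", "payload"} - msg.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if msg["type"] not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type: {msg['type']}")
    return msg


# Usage: round-trip an offer carrying a placeholder SDP body.
offer = encode_message("offer", "call-1", {"sdp": "v=0 ..."})
decoded = decode_message(offer)
```

A real service would also need authentication, routing between peers and error reporting, which is where much of the cost of a home-grown protocol tends to show up.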

  2. Congratulations for the article.

    About Trickle ICE for SIP, there are at least two efforts in the IETF:


    Once this subject is solved, SIP will be more suitable for WebRTC, but it will still not be the best possible protocol (nor is it the most optimal or reliable RTC protocol over UDP or TCP). It is 100% feasible for a developer to design a custom and minimal signaling protocol, perfectly optimized for a specific website and service. That has never been the goal of SIP.

    • Thing is that adding Trickle ICE to SIP is quite tricky. One reason is that, unless you can make the (very strong) assumption that the other endpoint supports it, you can do trickle only on the receiving side. Another reason is that it basically mandates either PRACK, or some retransmission-based hack. Not going to happen in the next couple of years.

      • Most of the scenarios where I’m seeing SIP over WebSockets being used are basically WebRTC-to-SIP ones. In that case, the interworking function is usually anchoring media (note that standard ICE, DTLS-SRTP, etc. are not supported by most of the SIP equipment deployed out there) and would likewise be terminating Trickle ICE. I don’t see this as a blocking issue.

  3. Nice discussion, Enrico. Signalling is usually an underestimated portion of WebRTC. After having tested Quobis’ WebRTC Client, both in a pure web fashion and with several gateway vendors interfacing towards “legacy” networks, under different network scenarios and conditions, I’d like to share the following considerations, focusing mainly on the WebRTC-to-SIP interworking scenario:

    – Using one protocol or another really depends on the scenario you are dealing with. One needs to consider whether interworking towards existing networks/domains is required, what protocols are being used there, etc. In some integrations we even had to use different protocols for different services, namely one signalling protocol for audio/video sessions, a second for IM/Presence and a third for some private service. Because of all this, we decided to implement an abstraction layer in our client architecture so we could support multiple signalling protocols (popular choices seem to be SIPoWS, different flavors of JSONoWS and REST APIs) and add new ones without requiring an application re-design. This way we decouple the client and WebRTC core from the signalling libraries and hence reduce the cost of integrations in different networks and with different gateway vendors.

    – Using SIPoWS when one wants to interconnect towards “legacy” SIP networks can make things easier. In this case, following a standards-based approach, the signalling gateway simply needs to perform transport layer interworking. Beyond that, at the application layer, we’ve seen customers willing to include specific SIP headers in the client itself and have them traverse the gateway function transparently. This is really easy using SIPoWS. We’ve been involved in a couple of cases where JSON was used, and for each new tiny header/parameter/value we wanted to include, the gateway vendor had to provide a new workspace image, which impacted the trial progress (they mentioned that in the future this could be doable via config, though).

    – Some operators see “non-standard signalling” as a potential lock-in with their gateway vendor. Yes, there are standard APIs but those are rarely implemented. If in the future an operator decides to switch vendors or simply add a new one (e.g. for a new service), using standard signalling really makes things easier.

    – Yes, WebSocket is a “new” technology but it’s rapidly gaining popularity and implementation support. From what we have experienced in the field, having tested in both enterprise and carrier networks, there are usually no issues traversing proxies or other HTTP entities using WSS. In any case, customers do prefer to use encrypted signalling in most of these scenarios. When it comes to timeouts/disconnections, yes, we experienced some of these in the past, but they were easily fixed via configuration. In any case, we expect WebSocket-related issues to disappear at the same pace as implementations mature. Note that WebSocket isn’t used only for WebRTC but also for many other HTTP services, and in fact will play an important role in HTTP-based communications.

    – When it comes to Trickle ICE and SIP, yes, it’s far from trivial, but in a controlled scenario (like the interworking one) it is perfectly doable.

    Any feedback is welcome 🙂

    • Sure, when one can assume that the endpoint is always talking to a media-anchoring interworking function, one can also make quite a lot of assumptions about the protocol and protocol extensions supported. And I agree that in that case there is actually a need for a standardised interface, to indeed avoid vendor lock-in. Whether such an interface is better achieved at the protocol level or at the JavaScript level (e.g. AT&T’s orca.js) is going to be a matter of discussion for a while, I believe.

    • >>Because of all this, we decided to implement an abstraction layer in our client architecture so we could support multiple signalling protocols (popular choices seem to be SIPoWS, different flavors of JSONoWS and REST APIs) and add new ones without requiring an application re-design<<

      My team has done exactly the same and abstracted signalling with a modular "drop in signalling stack of your choice" approach.

      I think this will become a common model for a while.
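
The abstraction-layer or “drop-in signalling stack” idea described in the comments above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the architecture of any product mentioned in the thread: the names `SignallingStack`, `JsonStack`, `SipLikeStack` and `Client` are hypothetical, the JSON field names are invented, and the SIP-like encoder deliberately omits mandatory SIP headers, so its output is not a valid SIP request.

```python
from abc import ABC, abstractmethod
import json


class SignallingStack(ABC):
    """Hypothetical interface the application codes against."""

    @abstractmethod
    def encode_offer(self, callee: str, sdp: str) -> str:
        """Return the wire representation of a session offer."""


class JsonStack(SignallingStack):
    """Ad-hoc JSON encoding; the field names are invented for this sketch."""

    def encode_offer(self, callee, sdp):
        return json.dumps({"type": "offer", "to": callee, "sdp": sdp})


class SipLikeStack(SignallingStack):
    """Simplified SIP-style encoding; Via, From, To, Call-ID, CSeq are omitted."""

    def encode_offer(self, callee, sdp):
        return (
            f"INVITE sip:{callee} SIP/2.0\r\n"
            "Content-Type: application/sdp\r\n"
            f"Content-Length: {len(sdp.encode('utf-8'))}\r\n"
            "\r\n"
            f"{sdp}"
        )


class Client:
    """Application logic that never touches the wire format directly."""

    def __init__(self, stack, send):
        self._stack = stack
        self._send = send  # any callable that accepts the encoded string

    def call(self, callee, sdp):
        self._send(self._stack.encode_offer(callee, sdp))


# Usage: the same client code drives two different signalling stacks.
sent = []
Client(JsonStack(), sent.append).call("alice@example.com", "v=0")
Client(SipLikeStack(), sent.append).call("alice@example.com", "v=0")
```

The point of the design is that `Client` stays unchanged when a new stack is added; only a new `SignallingStack` subclass is written.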

    • The overhead of mapping Trickle ICE to SIP is quite large, as each request requires a response (as opposed to some WebSocket-based proprietary option, which could send unidirectional messages). Microsoft tried to add ‘BENOTIFY’ in the past, but that never got very far, as far as I know.

      This raises interesting questions about the scalability of WebRTC applications. It can be made to work, but does it work for millions of clients connected to the cloud? For example, Chrome by default enables TCP keep-alive on the WebSocket connection, sending a 60-byte packet every 45 seconds. For a handful of clients that is not an issue, but in a scenario with 1 BHCA (busy-hour call attempt) this could represent 50% of the total signalling traffic.
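
To make the arithmetic behind that estimate explicit, here is a small sketch that takes the comment’s figures at face value (60 bytes every 45 seconds, one call attempt per busy hour). The per-call signalling size of 4,800 bytes is an assumption chosen for the example; the comment does not state it, and the 50% share only follows if a call’s signalling is about that size. The function name `keepalive_share` is hypothetical.

```python
def keepalive_share(packet_bytes, interval_s, calls_per_hour, bytes_per_call):
    """Fraction of hourly signalling bytes that is keep-alive traffic."""
    keepalive_bytes = (3600 / interval_s) * packet_bytes
    call_bytes = calls_per_hour * bytes_per_call
    return keepalive_bytes / (keepalive_bytes + call_bytes)


# Figures from the comment: one 60-byte packet every 45 seconds.
hourly_keepalive = (3600 / 45) * 60  # 80 packets/hour * 60 bytes = 4800 bytes

# Assumed value, not from the comment: 4800 bytes of signalling per call.
share = keepalive_share(packet_bytes=60, interval_s=45,
                        calls_per_hour=1, bytes_per_call=4800)  # 0.5
```

With a lighter signalling exchange per call the keep-alive share would be higher, and with a heavier one it would be lower.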

  4. Nice article, Enrico. It does a good job of summarizing why signaling in the WebRTC realm is still heavy lifting and is likely to evolve. I think your concluding statement said it well: “… At the end of the day, most of the signalling and gateway servers will come with client-side toolkits that will mask the underlying protocols…”. The attached presentation provided helpful background. As for UUI in SIP, we will hopefully be done soon. 🙂

  5. Excellent and detailed article – this “WebRTC signaling” question is clearly a hot topic! My view is that the essential value of the “no signalling defined” position is that it opens up all kinds of innovation for alternative connectivity approaches that may or may not bear much resemblance to heavier classical or SIP-style signaling. While developers can choose to use SIP or XMPP, they can also choose not to. I was struck by PubNub’s demo at WebRTC Expo Atlanta, which showed how a global real-time publish/subscribe network, which may be in use for all sorts of other application purposes, could easily absorb the needs of WebRTC signaling. You might see PubNub as “reliable global WebSockets on steroids”. PubNub also reports on their VoIP customer RebTel (with 13 million users) essentially replacing their internal use of SIP with PubNub publish/subscribe. So alternative approaches are not necessarily “heavy lifting” (James), plus, btw, things like UUI information transfer are trivial in many of these alternatives :).

    I am struck by the fact that Twitter, Facebook, Instagram, LinkedIn, Pinterest, Chatter, Jive, Box and many others are all doing global “signaling” of some sort at some level of “real time” and yet their internal global connectivity architectures look nothing like SIP. This is partly because they also have all sorts of other streaming, searching, liking, following and other semantics that traditional signaling has never dealt with, and run at a scale that would completely boggle an average enterprise SIP framework (and I am sure no Web developer wants to start casually adding IMS :).

    For reference, other recent WebRTC signaling discussions include Tsahi Levent-Levi’s blog and my article at WebRTC World.

  6. Thanks for such a good article and discussion.

    Using SIPoWS to access a Telco Gateway can raise some security concerns, as such Gateways could be rather transparent (even a good B2BUA can be quite transparent); the Telco network behind the Gateway is often a very sensitive asset, and the JavaScript running in the Browser offers fewer guarantees about its innocuity than previous SIP endpoints.

    For Web apps that do not need to access legacy networks, PubNub and the likes do indeed seem to offer far more agility than plain SIP. By the way, a benchmark of such “Full Web” frameworks would be a great topic for a future article!

    Another powerful aspect of WebRTC signalling that comes to mind is the possibility of splitting the Identity Provider functions from the signalling plane. One could wonder how much of the WebRTC Identity Provider is going to be implemented by Browser makers and by Players with relevant Identities.

  7. Thanks for sharing, John. It’s great to see not only proprietary implementations of RFC 7118 but also open-source code including sipml5, QoffeeSIP, sip-js, JsSIP and yours. In fact, I believe your implementation is a fork of JsSIP, right? Thanks again!
