This is the next decode and analysis in Philipp Hancke’s Blackbox Exploration series conducted by &yet in collaboration with Google. Please see our previous posts covering WhatsApp and Facebook Messenger for more details on these services and this series. {“editor”: “chad hart“}
FaceTime is Apple’s answer to video chat, coming preinstalled on all modern iPhones and iPads. It allows audio and video calls over WiFi and, since 2011, 3G too. Since Apple does not talk much about WebRTC (or anything else), maybe we can find out if they are using WebRTC behind the scenes?
As part of the series of deconstructions, the full analysis (another sixteen pages) is available for download here, including the Wireshark dumps.
If you prefer watching videos, check out the recording of this talk I did at Twilio’s Signal conference where I touch on this analysis and the others in this series.
In a nutshell, FaceTime
- is quite impressive in terms of quality,
- requires an open port (16402) in your firewall as documented here,
- supports iOS and MacOS devices only,
- supports simultaneous ring on multiple devices,
- is separate from the messaging application, unlike WhatsApp and Facebook Messenger,
- announces itself by sending metrics over an unencrypted HTTP connection (Dear Apple, have you heard about pervasive monitoring?),
- presumably still uses SDES (no signs of DTLS handshakes, but I have not seen a=crypto lines in the SDP either).
Since privacy is important, it is sad to see a complete lack of encryption in the HTTP metrics call like this one:
Details
FaceTime has been analyzed before: first when it was introduced back in 2010, and more recently in 2013. While the general architecture is still the same, FaceTime has evolved over the years, for example by adding new codecs such as H.265 for calls over cellular data.
What else has changed? And how many of those changes can we observe? Is there anything those changes tell us about potential compatibility with WebRTC?
Still using SDES
It is sad that Apple continues to use SDES almost two years after the IETF decided, at its Berlin meeting, that WebRTC MUST NOT support SDES. The consensus on this topic during the meeting was unanimous. For more background information, see either Victor’s article on why SDES should not be used or dive into Eric Rescorla’s presentation from that meeting comparing the security properties of both systems.
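For readers who want to run the same check on their own captures: in SDP, SDES keys travel in a=crypto lines (RFC 4568), while DTLS-SRTP announces a certificate hash in a=fingerprint lines (RFC 5763). The Python sketch below only illustrates that distinction. The function name and the sample SDP bodies are made up for this example and are not taken from a FaceTime capture. As noted above, the FaceTime SDP showed neither kind of line, so a check like this would come back empty there and the SDES conclusion remains an inference.

```python
from __future__ import annotations


def keying_methods(sdp: str) -> list[str]:
    """Return the SRTP keying mechanisms advertised in an SDP body.

    "SDES" is reported for a=crypto lines, "DTLS-SRTP" for a=fingerprint lines.
    An empty list means the SDP itself gives no hint either way.
    """
    methods = set()
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("a=crypto:"):
            methods.add("SDES")
        elif line.startswith("a=fingerprint:"):
            methods.add("DTLS-SRTP")
    return sorted(methods)


# Illustrative SDP fragments (hypothetical, key and hash are placeholders).
SAMPLE_SDES = "\r\n".join([
    "v=0",
    "m=audio 49170 RTP/SAVP 0",
    "a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:PLACEHOLDER_BASE64_KEY",
])
SAMPLE_DTLS = "\r\n".join([
    "v=0",
    "m=audio 9 UDP/TLS/RTP/SAVPF 111",
    "a=fingerprint:sha-256 PLACEHOLDER:HASH",
])
SAMPLE_NEITHER = "\r\n".join(["v=0", "m=audio 49170 RTP/AVP 0"])

if __name__ == "__main__":
    for name, sdp in [("sdes", SAMPLE_SDES), ("dtls", SAMPLE_DTLS), ("neither", SAMPLE_NEITHER)]:
        print(name, keying_methods(sdp))
```

An empty result is not proof of anything on its own, which is why the absence of DTLS handshakes on the wire matters as a second signal.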
NAT traversal
Like WebRTC, FaceTime is using the ICE protocol to work around NATs and provide a seamless user experience. However, Apple is still asking users to open a certain number of ports to make things work. Yes, in 2015.
Their interpretation of ICE is slightly different from the standard. In a way similar to WhatsApp, it has a strong preference for using TURN servers to provide a faster call setup. Most likely, SDES is used for encryption.
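One way to picture what a "strong preference for TURN" means is through the candidate priority formula in the ICE specification (RFC 8445, section 5.1.2.1), where the recommended type preferences put relayed candidates last. The Python sketch below computes priorities with the recommended values and with a relay-first table. The relay-first numbers are purely hypothetical, chosen for illustration; how FaceTime actually implements its preference is not something this analysis establishes.

```python
# Type preferences recommended by RFC 8445 (higher is preferred).
RECOMMENDED_TYPE_PREF = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}

# Hypothetical relay-first table, NOT taken from FaceTime or any real client.
RELAY_FIRST_TYPE_PREF = {"relay": 126, "host": 110, "prflx": 105, "srflx": 100}


def candidate_priority(type_pref: int, local_pref: int = 65535, component_id: int = 1) -> int:
    """ICE priority: 2^24 * type pref + 2^8 * local pref + (256 - component ID)."""
    return (type_pref << 24) + (local_pref << 8) + (256 - component_id)


def rank(type_prefs: dict) -> list:
    """Order candidate types from most to least preferred."""
    return sorted(
        type_prefs,
        key=lambda kind: candidate_priority(type_prefs[kind]),
        reverse=True,
    )


if __name__ == "__main__":
    print("standard:   ", rank(RECOMMENDED_TYPE_PREF))
    print("relay-first:", rank(RELAY_FIRST_TYPE_PREF))
```

With the recommended values a host candidate for component 1 gets the familiar priority 2130706431 and relay candidates are tried last. Flipping the table makes the relay pair win early, which trades a direct path for a faster call setup.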
Video
For video, both the H.264 and the H.265 codecs are supported, but only H.264 was observed when making a call on WiFi. The likely reason is that H.265, while saving bandwidth, is more computationally expensive. One of the nice features is that the optimal image size to display on the remote device is negotiated by both clients.
Audio
For audio, the AAC-ELD codec from Fraunhofer is used as outlined on the Fraunhofer website.
In nonscientific testing between two updated iPhone 6 devices, the codec played out static noise during periods of packet loss on WiFi.
Signaling
The signaling is pretty interesting, using XMPP to establish a peer-to-peer connection and then using SIP to negotiate the video call over that peer-to-peer connection (without encrypting the SIP negotiation).
This is a rather complicated and awkward construct that I have seen in the past when people tried to avoid making changes to their existing SIP stack. Does that mean Apple will take a long time to make the library used by FaceTime generally usable for the variety of use cases arising in the context of WebRTC? That is hard to predict, but this seems overly complex.
Quality of Experience
FaceTime offers impressive quality and user experience. Hardware and software are perfectly attuned to achieve this, as is the networking stack, as you can see in the full story.
{“author”: “Philipp Hancke“}
Chad Hart says
Privacy was a big topic at Apple’s WWDC. It will be interesting to see if some of the privacy gaps Fippo identified here get addressed as part of this initiative.
Henri Machalani says
Here’s a relevant analysis of the Facetime protocol I did a while back for those interested in packet level details: https://drive.google.com/file/d/0B94-8EUppQnYMDk0OTU2OGMtNWRiYy00NjllLWE1YTAtOGViZDcwNzAzYjA4/edit?lipi=urn:li:page:d_flagship3_profile_view_base;l1cn4Ix3TD6bVHAH2HKRkg%3D%3D
Chad Hart says
Good stuff – thanks for sharing!