If you are going to run WebRTC in a general production environment where you want everyone to be able to connect, you are going to need a TURN server. Every WebRTC cloud service has one built-in, but if you are building your own WebRTC infrastructure you’ll need to either use one of the few CPaaS TURN services or run your own. While there are other TURN server projects, coturn has been around the longest and is the default TURN server option for WebRTC and beyond.
We last covered coturn in a Q&A with its original founder back in 2014. Earlier this year it looked like the project may have stalled with many wondering “is this project dead”, but in the last few months it has made a dramatic recovery. That recovery is in large part thanks to Gustavo Garcia and Pavel Punsky, who volunteered to coordinate the project, which helped to rally the community. Gustavo is a long-time WebRTC contributor and regular guest at webrtcHacks. You might have even caught him live during our first quiz/break at the Kranky Geek WebRTC Show in November. Pavel has been working in communications for more than a decade and is currently Director of Engineering at Epic Games.
In this interview, we discuss:
- Some Background on coturn
- “Is the project dead?” The community responds
- Recent updates
- Coturn’s roadmap
- Scaling coturn
- Changing STUN and TURN standards
- Coturn still needs the community’s help
{“editor”, “chad hart“}
Background
If you are not familiar, coturn is the most popular STUN and TURN server used in WebRTC. If you are new to WebRTC, you may ask: what is a STUN and TURN server? There are a lot of resources out there on that, but in short, these are mechanisms that help WebRTC work around the fundamental IP networking problems caused by firewalls and NAT devices.
Coturn’s predecessor project was rfc5766-turn-server, started by Oleg Moskalenko (mom040267) more than 10 years ago (our interview on that). That project was split with a newer version released as coturn (another interview on that). We discovered coturn was being used by AWS in one of the first big WebRTC deployments. Today, with 1800+ stars on GitHub, it is the default TURN server source used by almost everyone. I asked Gustavo and Pavel if they could share any other insights on usage.
webrtcHacks: Who are some of the most significant coturn users?
Pavel: I’m not familiar with a lot of users personally but it seems that the vast majority of “legacy” and existing VoIP/WebRTC systems use coturn. Non-legacy (some more modern applications like Cloudflare TURN and Subspace (RIP) TURN) I’m not sure about that – most probably something else. I can share that Houseparty used coturn and Epic Games today uses coturn for multiple applications.
Gustavo: The big players don’t use coturn (Microsoft/Google) but most of the rest do. I would say that the most significant coturn users are Vonage and Twilio and because of that all the apps powered by those platforms. In terms of specific end-user apps companies like Hopin or Epic Games also use coturn for their services as Pavel said.
It seems Gustavo and Pavel aren’t tracking usage, or even looking at it, but that is something I have been doing. Using the methodology described in my Post-Peak WebRTC Developer Trends: An Open Source Analysis, I looked at how many repos in my dataset other than the core github.com/coturn/coturn made mention of “coturn”. There was a huge pandemic peak – like all WebRTC repos – but you can also see a clear and steadily climbing trend:
My dataset has 835 repos that mention “coturn” since 2019. For reference, mediasoup had 592. coturn is more frequently used as a container: Docker Hub shows 3.2M pulls as of today.
“Is the project dead?” The community responds
GitHub user korayvt asked a good question at the end of May last year – “is this project dead?” Oleg had long since moved on from the project. Mészáros Mihály (misi) helped to maintain the project with many updates through the Autumn of 2021. But, as happens with nearly any project – particularly volunteer ones – he also moved on and there was very little activity in the project for some time.
This triggered some concern. How could a project that is used by so many not have anyone to maintain it?
It was after this that Gustavo and Pavel stepped in.
webrtcHacks: Why did you get involved with the project?
Pavel: While running a big coturn deployment at Epic Games I was facing a few issues over time: one was monitoring ability and another one was stability. So the Prometheus interface was contributed at some point (at which I jumped immediately) but it also created instability (memory leak) that caused me to dive into the code for the first time. This was a trigger for me to start looking into coturn more closely. Not to mention that Gustavo volunteered to be a maintainer – that was a sign to me to join as well.
Gustavo: For me it was this ticket: https://github.com/coturn/coturn/issues/915. I thought it was not acceptable to have a server used by all of us and not have some basic maintenance.
Pavel: Same for me. I do not remember how I got to that ticket (either I found it myself or Philipp pointed it out) but I thought the same – we all use this server and I should do something. So I saw an opportunity to help even though at that point my knowledge of coturn was very minimal.
webrtcHacks: Are you the new project “leads” like Oleg and Mihály were before?
Pavel: This is my first big involvement with an open-source project in this capacity so it is hard to say. I do think there should be someone (a single person or a group of people) that maintains code at a minimal capacity at any time. Making sure PRs are reviewed, questions answered, and versions updated and published – even if not doing everything by yourself. More like an administrative job. Being able to actually write code and contribute is even better. So far the way we work with Gustavo is that we outlined a high-level plan for a year or so and try to follow it.
Gustavo: Yes, unfortunately, Oleg and Misi don’t have the capacity to contribute anymore and we are taking on those responsibilities.
One difference is that the project is more stable now so while Oleg and Misi were adding many features we are more in maintenance mode for now so the role is not exactly the same.
webrtcHacks: coturn has a history of having an owner. Do you think projects like this need a formal “owner”?
Pavel: That depends on how we define owner. It definitely needs a group of people that will keep it up-to-date and relevant. Fix CVEs, update libraries, publish versions to new OS versions, facilitate answers to questions – basically support the community. I have seen a spike in interactions the last few months since Gustavo and myself started responding to issues (would be nice to get some graphs here to confirm if that is correct) – this is a positive feedback flywheel.
Gustavo: 💯.
Ok, so it seems Gustavo and Pavel are the new owners. I was curious if this is a lot of effort. Lower effort is obviously more sustainable.
webrtcHacks: How much effort are you putting into coturn?
Gustavo: In terms of effort I don’t think it is much for now. In my case, it is a couple of hours every week to check if there are new issues and review PRs.
For me the important thing at this point is to make sure we are responsive to issues and contributions and not that much about adding features ourselves.
Pavel has been doing some more work recently and probably is spending more time.
Pavel: I’m spending 1-2 hours a week responding to issues. I had a few weeks when I spent 3-4 hours a week writing some code. So I would say for coturn – even 1-2 hours a week make a huge difference. That is a difference between a project that is seen as “dead” vs being at a “steady state”.
Gustavo: There is another person – Kai Ren (tyranon) – that is also one of the core contributors and responsible for all the Docker infrastructure and releases of coturn.
tyranon has been active with Docker-related updates since the first part of 2021. Unfortunately, I was not able to get a hold of Kai for more insights, but his work can be seen on GitHub’s releases as well as your favorite container registry: Docker Hub, GitHub Container Registry, and Quay.io.
In addition to their own contributions, Gustavo and Pavel seem to have rallied the community. I first showed that coturn was driving a recent trend of “turn server” keyword usage in my last WebRTC Open Source Analysis. I refreshed this for all of 2022 as shown below:
The core coturn repo has had 1769 users with some activity since 2019. 67% of those users appeared since May 1, 2022, demonstrating the rally effect.
webrtcHacks: The project has had a lot of activity beyond just your own commits since you got involved. What’s behind all that activity?
Gustavo: I think a big part of it is that issues and PRs had been piling up for more than a year and our main job has been unblocking them. So the spike in activity is just the effect of suddenly finishing/moving all the pending questions and work of the previous 18 months.
Pavel: This is a positive feedback flywheel – maintainers put in some work, people respond and appreciate it, provide feedback, and ask questions. This mutual feedback is very helpful for the community and maintainers. Personally, I have an interest in coturn being super stable and efficient – I maintain a big global deployment of coturn servers. I bet that’s not only me… But it is one thing when questions or PRs go unanswered for months, and another when they are answered in days – that gives a boost in confidence to coturn users (and also maintainers – I really enjoy when people get their problems solved).
Recent Activity
webrtcHacks: So what have you been working on? Are you working against a plan?
Pavel: The first step was to try and catch up with the current state of things. [The plan for] year 2022 was defined as “Housekeeping and promotion” – I think we did OK with housekeeping, less so with promotion… We got down from ~350 open issues to ~200, all old PRs have been reviewed and merged, security issues addressed, etc – so we got rid of a huge backlog that was dragging the project down.
Promotion kind of worked on its own I think – also thanks to Tsahi and others (Chad) who mentioned the project in their blog posts or webinars.
My personal vision – within the Housekeeping part – is to bring coturn up to speed with modern methodologies/approaches first. Multiple contributors stepped in and contributed to that effort even though it was not openly communicated. I suppose many other developers feel the same because they struggle to integrate coturn into their modern development process.
Lately, we had updates to CMake build system, support for Windows has been added, OpenSSL support has been revamped, basic protocol fuzzing added, etc.
Gustavo: So during 2022 I think we have mostly worked on these topics:
- Address open issues and pending contributions queued for the last months/years
- Agree on some coding style, reformat the code, and clean up
- Define a PR template for contributions and include more CI tests for different OS versions.
- Define new versioning and release process and generate a new version 4.6.0 after almost two years.
- Improve Prometheus support (bug fixing and new metrics).
- Improve Windows support (still WiP)
- Address security issues
The list of changes in the 4.6.0 release is shown below for reference.
coturn 4.6.0 release (regressions & typo fixes removed) | |
---|---|
* fix small issues reported by cppcheck | * add new prom allocations metric |
* fix long log line printing | * don’t link in libintl |
* Print turnserver version with --version | * fix access to freed memory |
* do not write outside of a buffer in admin interface | * configurable prom username labels |
* fix uclient certificate loading bug | * configurable prometheus listener port |
* fix duplicate TCP flag in run_tests.sh script | * fix build mariadb connector |
* fix turn session leak | * fix sqlite3_shutdown and sqlite3_config race |
* Document dependency of new-log-timestamp-format on new-log-timestamp | * prom server better |
* Enable compilation of coturn on Solaris 11.4 | * Define OPENSSL_VERSION_1_1_1 on systems where it doesn’t (yet) exist |
* First step to re-enable compilation with OpenSSL 1.0.x | * Add hash algorithm for hmackey value to redis userdb schema docs – replace keep-address-family with allocation-default-address-family (keep-address-family deprecated and will be removed!!) |
* Fix cmake build on macOS | * Restore no_stdout_log behavior |
* Disable SSL renegotiation | * Support older mysql client version in configure |
* Fix user quota release #786 | * Add to support cmake |
* add more info to redis allocation status | * Packaging scripts can miss out on these errors (exit code) |
* update turnserver.conf comment | * Readme.turnserver: how to run server as a daemon |
* fix performance regression | * SSL reload has hidden bugs which cause crashes – Try to mitigate STUN amplification attack |
* add syslog facility config | * Add new option --no-rfc5780 to force disable RFC5780 |
* add support for dual-stack prom listener | * Add new option --no-stun-backward-compatibility Disable handling old STUN Binding requests and disable MAPPED-ADDRESS attribute in binding response (use only the XOR-MAPPED-ADDRESS) |
* fix build with libressl 3.4.0+ | * Add new option --response-origin-only-with-rfc5780 : Add RESPONSE_ORIGIN attribute only if rfc5780 is enabled |
* add ci tests workflow | * Don’t send SOFTWARE attribute if --no-software-attribute set on (BREAKING CHANGE) |
* show error on invalid config |
This included several security-related fixes.
webrtcHacks: what were some of the security issues you addressed?
Pavel:
- Multiple memory issues (use after free, memory overwrite)
- Multiple memory leaks (allows for DoS)
- Crashes that can be easily triggered externally (resulting in DoS)
- STUN buffer that may include data from other allocations
coturn has had some Common Vulnerabilities and Exposures (CVE) entries, as shown in its labels, including some specific to the coturn project. Its extensive usage by many services makes it more of a target than most WebRTC projects.
webrtcHacks: Do you do anything different for critical security and maintenance issues?
Pavel: There is no precisely defined and spelled out policy. The last issue that was raised (that had a smell of security and widespread impact) was discovered and immediately addressed by a contributor. We released a new version ASAP but did not notify anywhere. We should have done that – this is something we should work on (establishing communication practices) in 2023.
That issue was a memory corruption-related issue this past December.
Coturn roadmap
webrtcHacks: speaking of 2023, do you have a roadmap for the project? What are you working on and what do you plan to work on?
Pavel: The promotion and communication part is something we need to work on more. It is on our roadmap. Lately, we released 4.6.1 to address that security/stability issue [mentioned earlier]. The whole outward communication part needs more attention – and we plan to do that once we are done with more pressing topics.
webrtcHacks: do you have a formal roadmap?
Pavel: We maintain a document with a high-level list of topics that need to be addressed, along with some priority within that list. But those are not features, more like an organizational plan. Feature-wise there are quite a few incoming requests in the form of issues where coturn users ask to implement certain features.
webrtcHacks: is that document something you share publicly?
Gustavo: It is a private doc that we never shared but it is a list of high-level topics.
For 2022 we wanted to focus on housekeeping and communication.
For 2023 we want to focus on documentation, best practices, and simplification.
webrtcHacks: What are some of the common user requests?
Pavel: Here are some of the top questions I see on coturn support in no specific order:
- I do not see Prometheus metrics
- Can I run coturn behind a load balancer?
- I have one turn server and it is overloaded – how to scale?
- I cannot connect, help!
- Lots of questions that can be traced back to a lack of understanding of how TURN integrates into a WebRTC system in terms of security
Gustavo: I would say that most of the comments/issues are more about usage and deployment than feature requests and that’s why I think it is important to focus in 2023 on documentation and best practices. People still struggle to figure out the best way to scale TURN servers or implement authentication.
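Since authentication comes up so often, here is a rough sketch of one common approach: time-limited credentials derived from a secret shared between the application server and coturn (coturn's `use-auth-secret` and `static-auth-secret` options). The function names, the host `turn.example.com`, and the port choices are illustrative assumptions, not something Gustavo or Pavel prescribed.

```python
import base64
import hashlib
import hmac
import time
from typing import Dict, List, Optional


def make_turn_credentials(
    shared_secret: str,
    user_id: str,
    ttl_seconds: int = 3600,
    now: Optional[float] = None,
) -> Dict[str, str]:
    """Derive short-lived TURN credentials from a shared secret.

    The username is "<expiry unix time>:<user id>" and the password is
    base64(HMAC-SHA1(shared_secret, username)).
    """
    issued_at = time.time() if now is None else now
    expiry = int(issued_at) + ttl_seconds
    username = f"{expiry}:{user_id}"
    digest = hmac.new(
        shared_secret.encode("utf-8"),
        username.encode("utf-8"),
        hashlib.sha1,
    ).digest()
    return {
        "username": username,
        "credential": base64.b64encode(digest).decode("ascii"),
    }


def build_ice_servers(shared_secret: str, user_id: str, turn_host: str) -> List[dict]:
    """Build an iceServers list a signaling server could hand to a browser.

    turn_host and the ports used here are placeholders for illustration.
    """
    creds = make_turn_credentials(shared_secret, user_id)
    return [
        {"urls": [f"stun:{turn_host}:3478"]},
        {
            "urls": [
                f"turn:{turn_host}:3478?transport=udp",
                f"turns:{turn_host}:5349?transport=tcp",
            ],
            **creds,
        },
    ]


if __name__ == "__main__":
    print(build_ice_servers("replace-with-your-secret", "alice", "turn.example.com"))
```

The point of the design is that the long-lived secret never reaches the browser: the client only sees a username and password that stop working once the embedded expiry time passes.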
Scaling coturn
webrtcHacks: On the topic of scale, how do coturn users handle scalability across servers for capacity/redundancy and regionally for lower latency? Is there a standard approach to that?
Gustavo: There is some documentation around that here, but that doc hasn’t been updated in a while. That’s one of the things we would like to spend time on in 2023 – improving the documentation and providing best practices, especially around three topics:
- deployment,
- authentication, and
- scaling.
That is where we see more questions and confusion coming from people in the issue tracker.
Regarding scaling across servers the most common approach is to use DNS with some latency-based rules to direct the users to the closest region and round-robin inside each of those regions to distribute the traffic across multiple instances. That approach should be good for most of the users and would be the general recommendation, but there are other solutions based on load balancers or anycast that make sense for some specific scenarios.
Pavel: To provide more context to what Gustavo said – the difficulty with TURN is that it works best as a global system. This means multiple deployments in multiple locations with some global allocation/balancing mechanism – this is not possible to achieve today with a single command line. Some people ask for a helm chart like “other apps” – which is a valid request but not possible to accomplish because it requires some global setup that is outside of the control of coturn (like DNS global latency-based setup that is different for each provider).
There is no single recipe – and that is why this topic is not covered well.
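To make the "closest region, then round-robin" idea concrete, here is a small sketch of the selection logic only. In a real deployment this decision is made by the DNS provider's latency-based routing, not by application code; the class name, region names, and hostnames are hypothetical.

```python
import itertools
from typing import Dict, Iterator, List


class TurnDirectory:
    """Models latency-based region choice plus round-robin inside a region."""

    def __init__(self, regions: Dict[str, List[str]]) -> None:
        # regions maps a region name to the TURN hosts deployed there.
        self._regions = regions
        self._cursors: Dict[str, Iterator[str]] = {
            name: itertools.cycle(hosts) for name, hosts in regions.items()
        }

    def pick(self, latencies_ms: Dict[str, float]) -> str:
        """Return the next host in the region with the lowest latency."""
        known = {
            region: ms
            for region, ms in latencies_ms.items()
            if region in self._regions
        }
        if not known:
            raise ValueError("no latency measurement for any known region")
        closest = min(known, key=known.get)
        return next(self._cursors[closest])


if __name__ == "__main__":
    directory = TurnDirectory(
        {
            "us-east": ["turn-use1-a.example.com", "turn-use1-b.example.com"],
            "eu-west": ["turn-euw1-a.example.com"],
        }
    )
    # A client that is close to us-east alternates between its two hosts.
    for _ in range(3):
        print(directory.pick({"us-east": 20.0, "eu-west": 95.0}))
```

This also shows why there is no single recipe: the latency table comes from infrastructure outside coturn's control, and each provider exposes it differently.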
STUN and TURN standards have changed
webrtcHacks: RFC5766 – the original IETF TURN specification – was obsoleted by RFC8656 – what were the changes? Are those items coturn needs to address?
Pavel: TURN protocol has been evolving over time since its inception – IPv6, TLS, DTLS, to name a few, were not in the original specification. Multiple security concerns, configurations, and best practices have evolved. This resulted in additional RFCs that need to be considered on top of the original RFC5766 which makes it hard. I see RFC8656 as a modern snapshot of what TURN is in 2020 – how the TURN RFC would have looked if it was written in 2020 (yeah, it sounds dumb). And it is exactly that – you can look at RFC8656 and ignore previous ones – this is the RFC you should be concerned about. To my knowledge, there is nothing that contradicts or breaks compatibility with older RFCs. But I’m not 100% sure about that.
Gustavo: Note: Not only TURN, but STUN [RFC5389] was also updated by RFC8489.
The changes are not big and it was mostly an update to address the needs of WebRTC and the deployment experiences. The WG created to make this update was called “TURN Revised and Modernized (TRAM)“.
It includes things like security upgrades, new attributes, DTLS over UDP, and support for dual allocation. Most (but not all) of them are supported by coturn even if the README doesn’t explicitly state that RFC8656 is implemented. Maybe we should try to review that.
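For readers who have never looked at what these RFCs put on the wire, the sketch below builds a STUN Binding request header and round-trips an IPv4 XOR-MAPPED-ADDRESS attribute using the fixed magic cookie. It is a simplified illustration with hypothetical helper names; it handles IPv4 only and is not how coturn itself is implemented.

```python
import os
import socket
import struct
from typing import Optional, Tuple

MAGIC_COOKIE = 0x2112A442      # fixed value carried in every STUN header
BINDING_REQUEST = 0x0001       # STUN message type for a Binding request
XOR_MAPPED_ADDRESS = 0x0020    # attribute type for XOR-MAPPED-ADDRESS
FAMILY_IPV4 = 0x01


def build_binding_request(transaction_id: Optional[bytes] = None) -> bytes:
    """Return a 20-byte STUN Binding request with no attributes."""
    tid = os.urandom(12) if transaction_id is None else transaction_id
    if len(tid) != 12:
        raise ValueError("transaction id must be 12 bytes")
    # type (2 bytes), message length (2 bytes), magic cookie (4 bytes)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid


def encode_xor_mapped_ipv4(ip: str, port: int) -> bytes:
    """Encode an IPv4 address and port as an XOR-MAPPED-ADDRESS attribute."""
    xport = port ^ (MAGIC_COOKIE >> 16)
    xaddr = struct.unpack("!I", socket.inet_aton(ip))[0] ^ MAGIC_COOKIE
    value = struct.pack("!BBHI", 0, FAMILY_IPV4, xport, xaddr)
    return struct.pack("!HH", XOR_MAPPED_ADDRESS, len(value)) + value


def decode_xor_mapped_ipv4(attr: bytes) -> Tuple[str, int]:
    """Decode an IPv4 XOR-MAPPED-ADDRESS attribute into (ip, port)."""
    attr_type, length = struct.unpack("!HH", attr[:4])
    if attr_type != XOR_MAPPED_ADDRESS or length != 8:
        raise ValueError("not an IPv4 XOR-MAPPED-ADDRESS attribute")
    _reserved, family, xport, xaddr = struct.unpack("!BBHI", attr[4:12])
    if family != FAMILY_IPV4:
        raise ValueError("only IPv4 is handled in this sketch")
    port = xport ^ (MAGIC_COOKIE >> 16)
    ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
    return ip, port


if __name__ == "__main__":
    request = build_binding_request()
    print(len(request), request.hex())
    attribute = encode_xor_mapped_ipv4("203.0.113.7", 54321)
    print(decode_xor_mapped_ipv4(attribute))
```

The XOR step exists because some NAT devices rewrite any bytes in the payload that look like the public address, which is why newer clients rely on XOR-MAPPED-ADDRESS rather than the older MAPPED-ADDRESS attribute.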
Fippo pointed out from the IETF’s TRAM page:
TRAM is no longer making forward progress on the remaining charter item. Further discussion of existing TRAM RFCs and any other work in this area will occur in TSVWG.
The Transport Area Working Group (tsvwg) looks to be mostly working on SCTP, RSVP, and DiffServ, which are protocols often related to real-time transmission.
[“typical deployment” TURN server diagram from RFC8489]
Coturn still needs the community’s help
coturn now has some project management. Since the project has always had a lot of contributors, I was curious how they planned to govern the project moving forward. There is always a trade-off between adding enough structure to make it easy for a regular group of contributors who know the procedures, but not so much that it acts as a barrier to would-be contributors.
webrtcHacks: So how do you two operate the project now? How is coturn governed?
Gustavo: At a project/roadmap level Pavel and I talk about the status of the project and where it could make sense to put more effort and then we try to spend the time we can in those directions.
Also, we have defined a new versioning and release process that should be more clear and provide consistent releases here: https://groups.google.com/g/turn-server-project-rfc5766-turn-server/c/VdHx2VjR0vE
In terms of code and contributions when in doubt about some specific contribution or change we can discuss it offline or publicly in the GitHub issues tracker.
webrtcHacks: Is there anything specific you would like the community to help with on the project in 2023?
Gustavo: I don’t have anything in mind in terms of features apart from finishing the work that is already being done to improve Windows support. Maybe Pavel has other things in mind.
Where it would be great to have more help is supporting other people with their issues, and also with testing: providing data from their deployments or extending our existing simple testing infrastructure.
Pavel: I hope to get some real-world deployment information contributions – in terms of documentation, Helm charts, blog posts, etc. A lot of questions in the form of “how do I do X” can be answered that way. We (maintainers) are not always on the edge of what people do. An example of that could be “how do I set up coturn behind a load balancer in Kubernetes?”
webrtcHacks: Do you have any contribution guidelines? Where should someone that wants to contribute start?
Pavel: Not yet. We should definitely add guidelines for contributors.
webrtcHacks: Until you have that, should a would-be contributor just submit a PR? Contact you?
Pavel: Yes, submitting a PR works – we are actively monitoring new PRs and trying to be very responsive. We had multiple first-time contributors over the last half a year.
Gustavo got back to me a few days later with an update on providing contributions:
Gustavo: We added a contributing.md after getting this question from you 😀 https://github.com/coturn/coturn/blob/master/CONTRIBUTING.md
That was fast! You can ask questions on the coturn user group here and file issues and contribute code in the GitHub repo: https://github.com/coturn
{
“Q&A”:{
“interviewer”: “chad hart“,
“interviewee”: [”Gustavo Garcia”, “Pavel Punsky“ ]
}
}