Roc 0.1 released: real-time streaming over the network

What is Roc?

I’m happy to announce the first release of Roc Toolkit, version 0.1.0!

Roc provides a C API for real-time audio transport. The user just writes a stream to the one end and reads it from another end. Roc, in turn, performs encoding and decoding, and, more importantly, deals with the problems related to the real-time delivery: counters network jitter and packet reordering, maintains constant latency, restores lost packets, compensates the clocks difference.

Roc also provides command-line tools that can be used to read an audio stream from a file or audio device, transmit it over the network, and write to a file or audio device on another computer. These tools can be used with bare ALSA, PulseAudio, CoreAudio, and some other audio backends.

Finally, Roc provides proof-of-concept PulseAudio modules allowing PulseAudio to improve the network service quality by using Roc as a transport. There is a thread in the PulseAudio mailing list discussing the submission of these modules to PulseAudio upstream.

Why Roc?

Real-time streaming requires to address at least these problems:

  • maintaining constant latency;
  • coping with network jitter and packet reordering;
  • coping with packet losses if the network is unreliable;
  • converting between the sender and receiver clock domains.

If you’re using TCP-based protocols, e.g. HTTP, the latency suffers. If you’re using UDP-based protocols without loss recovery, e.g. bare RTP, the service quality suffers if the network is unreliable. And in both cases, you still need to convert between the sender and receiver clock domains by yourself.

Roc addresses all these problems by using a jitter buffer to counter network jitter and packet reordering, Forward Erasure Correction codes to recover lost packets, and a frequency estimation algorithm with a resampler to compensate the clocks difference between the sender and receiver.

Roc does all these transparently to the user and provides a simple high-level stream-oriented API on top. Using Roc is both easier than using HTTP or bare RTP and provides better latency and service quality on unreliable networks.

Internally Roc uses RTP and FECFRAME, an RTP extension for Forward Erasure Correction. These protocols are open standards, so Roc can interoperate with other software implementing them.

Roc is conceived to support multiple audio encodings, FEC schemes, and network protocols and let the user select what is most appropriate for the specific use-case. The 0.1 version implements two FEC schemes (Reed-Solomon and LDPC-Staircase) and a single audio encoding (PCM 16-bit stereo), however, we will add more in upcoming releases.

Learn more

My previous article explains in more detail how does Roc address the problems listed above, and how is it compared with other software, notably built-in PulseAudio transports.

The “overview” page in the documentation provides further details on the real-time streaming challenges and summarizes the project scope, goals, and possible usage scenarios.

If you want to learn more about the implementation, see the “internals” section in the documentation.

Features and plans

The 0.1 release implements basic real-time transport features:

  • streaming CD-quality audio using RTP (PCM 16-bit stereo);
  • maintaining pre-configured target latency;
  • restoring lost packets using FECFRAME with Reed-Solomon and LDPC-Staircase FEC schemes;
  • converting between the sender and receiver clock domains using resampler;
  • converting between the network and input/output sample rates;
  • configurable resampler profiles for different CPU and quality requirements;
  • mixing simultaneous streams from multiple senders on the receiver;
  • binding receiver to multiple ports with different protocols;
  • interleaving packets to increase the chances of successful restoring;
  • detecting and restarting broken streams.

See the “features” page in the documentation for the full list.

There are a lot of plans for further improvements. They can be divided into several groups:

  • Transport. Support more encodings and FEC codes, improve the service quality, improve the latency. Dynamically adapt to the network conditions. Support encryption and multicast.
  • Control protocols. Implement session negotiation, service discovery, and remote control. Provide high-level protocol-independent API for those features.
  • Tools and integrations. Finish PulseAudio modules and try to submit them to upstream, implement more tools, implement bindings for various programming languages.
  • Portability. Ports for non-Linux *nix operating systems, Android, and Windows.
  • Research. Learn to measure the full network latency, test Roc on different network types and conditions, determine the minimum possible latency that we can handle on different channels.

See our roadmap and project board for further details.

The short-term plans for 0.2 and 0.3 are to implement session negotiation and service discovery and to add support for more audio encodings (more sample rates, more channel sets, lossy compression using Opus).

I’ll post updates to my twitter and to our mailing list.

How to use

Consult the following documentation sections:

  • Building — building and installing;
  • Running — instructions for command-line tools and PulseAudio modules;
  • Manuals — reference for command-line tools;
  • API — reference and examples for the C library.

Quickstart:

If you want to run Roc on a single-board computer, see also our “tested boards” and “cross-compiling” pages.

For example, you can run a receiver (roc-recv tool or module-roc-sink-input PulseAudio module) on a Raspberry Pi with speakers and a sender (roc-send tool or module-roc-sink PulseAudio module) on a desktop or a laptop. You can run multiple senders connected to different receivers and you can connect multiple senders to a single receiver. Command-line tools and PulseAudio modules are interoperable.

Currently, Roc supports GNU/Linux (command-line tools and PulseAudio modules) and macOS (command-line tools). See details in the “supported platforms” page.

How to help

The project is still at an early stage of development.

If you would like to help, you can:

  • Help with testing. Build and run Roc command-line tools or PulseAudio modules and submit bug reports or suggest improvements.
  • Help with research. Discover how does Roc perform on your network and share the results. What latency does it handle well in your case, how much packets are lost and recovered, and so on.
  • Help with porting. We’d be happy to meet developers who can help with maintaining ports for non-Linux *nix systems, macOS, Android, Windows, probably something else.
  • Contribute patches. Roc has quite a long and, I think, a pretty interesting list of tasks in the roadmap. Also, Roc tries to meet high coding standards: code review, continuous integration, tests, documentation.

The project internals are documented here and here. The developer’s information can be found here.

You can reach us via our mailing list or GitHub. See details in README.

About authors

Three ex-colleagues, currently:

About name

Roc is a giant chthonic bird. It can carry a lot, even an elephant.

About license

Roc is published under MPL-2.0, except PulseAudio modules, which are LGPL-2.1.

If Roc is built with OpenFEC support enabled, it must be distributed under a lincense compatible with CeCILL, a GPL-like and GPL-compatible license used for OpenFEC.