Audio stream via RTP
Overview
Audio is encoded with Opus, encrypted with 128-bit AES in CBC mode, and sent via RTP.
Every 4 RTP packets with audio content will be followed by 2 extra RTP packets with FEC parity information.
PING
The port exchanged during the RTSP phase is not where the actual audio stream will take place!
Moonlight will send a PING packet to the exchanged port; this tells us which random port the connecting client was able to open in order to communicate with Wolf.
The client port from which Wolf receives the PING is also the one the actual RTP stream will be sent to.
This is necessary because clients will usually be behind NAT.
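The idea can be sketched in a few lines of Python, assuming plain UDP. The function name `wait_for_ping` is hypothetical and the PING payload is not inspected here, so this is an illustration of the mechanism rather than Wolf's actual code.

```python
import socket
from typing import Tuple


def wait_for_ping(sock: socket.socket, timeout: float = 5.0) -> Tuple[str, int]:
    """Block until one datagram arrives and return the sender's (ip, port).

    The address reported by recvfrom() is the client's endpoint as seen from
    the server side, which is where the RTP stream should later be sent.
    The payload itself is ignored in this sketch.
    """
    sock.settimeout(timeout)
    _payload, addr = sock.recvfrom(1024)
    return addr[0], addr[1]
```

Once the address is known, later packets would go out through the same socket, for example with `sock.sendto(rtp_packet, (ip, port))`, so that they reuse the path the client opened with its PING.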
RTP packets
The first 12 bytes of each packet are defined as follows:
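Assuming these 12 bytes follow the standard RTP fixed header from RFC 3550 (an assumption, since this page does not name the layout explicitly), they could be unpacked as in the sketch below. The names `RtpHeader` and `parse_rtp_header` are illustrative only.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the 12-byte fixed RTP header (RFC 3550 layout, network byte order)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Two flag bytes, 16-bit sequence number, 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )
```

Printing the parsed header for a few consecutive packets is a quick way to check that sequence numbers and timestamps advance as expected.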
FEC packets also add the following 12 bytes after the RTP header and before the actual FEC information:
Opus encoder
All parameters needed to encode properly with Opus are exchanged during the RTSP phase.
To debug what is exchanged, it may be useful to check RFC 6716, section 3.1, which defines the Opus packet format; from the spec:
A well-formed Opus packet MUST contain at least one byte [R1]. This byte forms a table-of-contents (TOC) header that signals which of the various modes and configurations a given packet uses. It is composed of a configuration number, "config", a stereo flag, "s", and a frame count code, "c".
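When debugging, the TOC byte can be split into its three fields as in this sketch. The bit layout follows RFC 6716 section 3.1 (5 bits of config, 1 stereo bit, 2 bits of frame count code); the names `OpusToc` and `parse_opus_toc` are made up for illustration, and the input is assumed to be an Opus packet that has already been decrypted.

```python
from typing import NamedTuple


class OpusToc(NamedTuple):
    config: int            # 0..31, selects mode, bandwidth and frame size
    stereo: bool           # the "s" flag
    frame_count_code: int  # the "c" field, 0..3


def parse_opus_toc(packet: bytes) -> OpusToc:
    """Decode the first byte of an Opus packet into (config, s, c)."""
    if not packet:
        raise ValueError("a well-formed Opus packet contains at least one byte")
    toc = packet[0]
    return OpusToc(
        config=toc >> 3,
        stereo=bool(toc & 0x04),
        frame_count_code=toc & 0x03,
    )
```

For example, a first byte of `0xFC` decodes to config 31, stereo set, and frame count code 0.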
AES encryption
Encoded audio packets are AES-encrypted before being sent over the wire; the key and IV are exchanged via HTTPS when calling the launch endpoint.
The same key and IV are also used to encrypt the Control stream.
Forward Error Correction (FEC)
The payload is encoded with Reed-Solomon so that the receiving end can check it for transmission errors (and possibly fix them).
This operation will create extra parity blocks that will be sent in separate RTP packets.
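The sketch below illustrates what the parity packets buy, under two assumptions: that each block consists of the 4 audio packets and 2 parity packets mentioned in the overview, and that the code is used for erasure recovery, where Reed-Solomon can rebuild a block from any 4 of its 6 shards. The shard numbering (0 to 3 for data, 4 and 5 for parity) and the function name `block_status` are hypothetical; no actual decoding is performed here.

```python
DATA_SHARDS = 4    # audio packets per block (from the overview)
PARITY_SHARDS = 2  # FEC parity packets per block (from the overview)


def block_status(received_indices) -> str:
    """Classify a block given the shard indices (0..5) that arrived.

    Returns "complete" if all data shards arrived, "recoverable" if enough
    shards of any kind arrived to rebuild the missing data, and
    "unrecoverable" otherwise.
    """
    total = DATA_SHARDS + PARITY_SHARDS
    received = {i for i in received_indices if 0 <= i < total}
    if all(i in received for i in range(DATA_SHARDS)):
        return "complete"
    if len(received) >= DATA_SHARDS:
        return "recoverable"
    return "unrecoverable"
```

Under these assumptions a block survives the loss of any two packets, while losing three or more leaves at least one audio packet that cannot be rebuilt.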