rtpinsertsound

rtpinsertsound inserts audio into a specified RTP stream by spoofing packets. It supports mixing WAV files or tcpdump captures into live audio streams.

Description

rtpinsertsound is a tool designed to insert audio into a targeted live RTP audio stream. Created in August-September 2006, it was initially tested on Red Hat Fedora Core 4 but is expected to work on other Linux distributions. The tool allows users to inject custom audio, such as a stapler sound, into ongoing RTP streams for testing or demonstration purposes.

Use cases include network audio manipulation, VoIP stream testing, and RTP protocol analysis. It requires a mono 8000 Hz WAV file or a tcpdump file with G.711 u-law RTP/UDP/IP/ETHERNET packets. Users must specify the audio file path and can target specific interfaces, IPs, and ports.
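The input requirements above can be checked before running the tool. The following is a minimal sketch using Python's standard `wave` module; the function name `wav_problems` is hypothetical, and the check only mirrors the format stated on this page (mono, 8000 Hz, 8- or 16-bit PCM). It is not the tool's own validation logic.

```python
import wave


def wav_problems(path):
    """Return a list of reasons the file does not match the documented WAV format.

    Hypothetical pre-flight helper: an empty list means the file is mono,
    8000 Hz, with 8- or 16-bit PCM samples.
    """
    problems = []
    # wave.open raises wave.Error for files it cannot parse as PCM WAV.
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append(f"expected mono, found {wav.getnchannels()} channels")
        if wav.getframerate() != 8000:
            problems.append(f"expected 8000 Hz, found {wav.getframerate()} Hz")
        if wav.getsampwidth() not in (1, 2):
            problems.append(
                f"expected 8- or 16-bit samples, found {8 * wav.getsampwidth()}-bit"
            )
    return problems
```

Calling `wav_problems("/path/to/audio.wav")` and getting an empty list back suggests the file is in the expected format; any returned strings describe what would need resampling or downmixing first.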

Take care on hosts with multiple active network interfaces: Linux routing may send spoofed packets out through an interface other than the one specified, and if those packets return via the specified interface they can cause a loop.

How It Works

rtpinsertsound captures legitimate RTP packets from the target stream using libpcap with an 'ip' filter, then builds spoofed packets from them. For each spoofed packet it adds the spoof factor to the RTP sequence number, adds the payload length multiplied by the spoof factor to the RTP timestamp, and adds the spoof factor to the IP ID. The jitter factor controls when a spoofed packet is transmitted relative to the next legitimate packet: output is delayed so that it follows the stream's timing (e.g., G.711 at 20 ms intervals). Audio from a WAV file (PCM, mono, 8- or 16-bit, 8000 Hz) or a tcpdump capture (G.711 u-law RTP) is mixed into the stream, and the resulting packets are sent via raw sockets on the specified interface.
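The header arithmetic described above can be illustrated with a short sketch. The function name `spoof_fields` is hypothetical, and the wraparound masking is an assumption based on the field widths (16-bit RTP sequence number, 32-bit RTP timestamp, 16-bit IP ID); the tool's own C code may handle overflow differently.

```python
def spoof_fields(seq, timestamp, ip_id, payload_len, spoof_factor=2):
    """Derive spoofed header values from a captured legitimate packet.

    Illustrative only: applies the three adjustments described for the
    spoof factor and wraps each result to its field width (assumption).
    """
    return {
        # a) sequence number advances by the spoof factor (16-bit field)
        "seq": (seq + spoof_factor) & 0xFFFF,
        # b) timestamp advances by payload length * spoof factor (32-bit field)
        "timestamp": (timestamp + payload_len * spoof_factor) & 0xFFFFFFFF,
        # c) IP ID advances by the spoof factor (16-bit field)
        "ip_id": (ip_id + spoof_factor) & 0xFFFF,
    }


# G.711 at 20 ms carries 160 one-byte samples per packet.
fields = spoof_fields(seq=1000, timestamp=160000, ip_id=500, payload_len=160)
print(fields)  # {'seq': 1002, 'timestamp': 160320, 'ip_id': 502}
```

With the default spoof factor of 2, the spoofed packet is numbered two ahead of the packet it was derived from, so it appears to the receiver as a later packet in the stream.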

Installation

sudo apt install rtpinsertsound

Flags

-a  source RTP IPv4 addr
-A  source RTP port
-b  destination RTP IPv4 addr
-B  destination RTP port
-f  spoof factor - amount by which to: a) increment the RTP hdr sequence number, b) multiply the RTP payload length and add to RTP hdr timestamp, c) increment the IP hdr ID number [ range: +/- 1000, default: 2 ]
-i  interface (e.g. eth0)
-j  jitter factor - determines transmission timing of spoofed packet relative to next legitimate packet [ range: 0 - 80, default: 80 ]
-p  seconds to pause between setup and injection
-v  verbose output mode
-h  help - print this usage
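The effect of -j on timing can be sketched numerically. This page does not give the exact formula, so the sketch rests on a stated assumption: the jitter factor is a percentage of the codec's packet interval, and the spoofed packet is held back for the remaining share of that interval. The function name `spoof_delay_usec` and the 20,000 microsecond default (G.711 at 20 ms) are illustrative.

```python
def spoof_delay_usec(jitter_factor=80, interval_usec=20_000):
    """Estimate how long a spoofed packet is held before transmission.

    Assumption (not confirmed by this page): the jitter factor is a
    percentage of the codec interval, and the delay is the remaining
    (100 - jitter_factor) percent of that interval.
    """
    if not 0 <= jitter_factor <= 80:
        raise ValueError("jitter factor must be in the documented range 0 - 80")
    return interval_usec * (100 - jitter_factor) // 100


print(spoof_delay_usec(80))  # 4000
print(spoof_delay_usec(10))  # 18000
```

Under this reading, the default of 80 sends the spoofed packet soon after the legitimate packet that triggered it, while smaller values push it closer to the expected arrival of the next legitimate packet.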

Examples

Insert the stapler.wav audio file with verbose output
rtpinsertsound /usr/share/rtpinsertsound/stapler.wav -v
Insert WAV audio into RTP stream on specified interface eth0
rtpinsertsound /path/to/audio.wav -i eth0
Use tcpdump file as audio source, specifying source RTP IP and port
rtpinsertsound /path/to/tcpdump.pcap -a 192.168.1.100 -A 5004
Target destination RTP IP and port on interface eth1
rtpinsertsound audio.wav -b 10.0.0.1 -B 5004 -i eth1
Apply spoof factor 5 and jitter factor 50 with verbose output
rtpinsertsound sound.wav -f 5 -j 50 -v
Pause 10 seconds between setup and injection
rtpinsertsound file.wav -p 10
Display help usage information
rtpinsertsound /usr/share/rtpinsertsound/stapler.wav -h
Updated 2026-04-16. Source: kali.org