If you’ve ever tried to pull a video feed from a camera, stream a live concert, or just convince a $29 “smart” webcam to behave, you’ve probably discovered the uncomfortable truth of modern media tech:
Video streaming is held together by a family of legacy protocols that barely tolerate each other — yet somehow run the entire planet’s surveillance, livestreaming, baby monitors, and half of YouTube.
And like any dysfunctional family, each protocol has:
- its quirks,
- its emotional baggage from the 90s,
- its own opinion on how video should work,
- and yet… you can’t kick any of them out of the house.
Let’s meet the cast.
RTP: The blue-collar workhorse nobody thanks
RTP is the guy who shows up to work every day at 5:30 a.m., never complains, and carries 90% of the load while everyone else takes credit.
Born in the mid-90s, RTP (Real-time Transport Protocol) was built for a world of dial-up modems and corporate telepresence systems that cost more than a car. And yet — improbably — it still powers:
- IP camera video
- VoIP calls
- WebRTC audio/video
- real-time feeds for everything from drones to traffic cameras
RTP doesn’t negotiate, doesn’t argue, and barely cares what codec you stuffed inside it (it just carries a payload type number). It stamps each chunk of audio/video with a sequence number and a timestamp, drops it into UDP, and prays it survives the trip.
It’s the original truck driver of the internet. No glamour. Lots of fumes. Absolutely essential.
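For the curious, here is roughly what that packing looks like. This is a minimal Python sketch, assuming the 12-byte fixed header from RFC 3550 with no CSRC list and no header extension. The function names are made up for illustration, and payload type 96 is just a commonly seen dynamic value, not something any particular camera is guaranteed to use.

```python
import struct

RTP_VERSION = 2  # the only version in real-world use


def build_rtp_packet(payload: bytes, seq: int, timestamp: int, ssrc: int,
                     payload_type: int = 96, marker: bool = False) -> bytes:
    """Prepend a 12-byte fixed RTP header to a chunk of media."""
    # Byte 0: version (2 bits), padding, extension, CSRC count (all zero here).
    byte0 = RTP_VERSION << 6
    # Byte 1: marker bit plus 7-bit payload type.
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(
        "!BBHII",                 # network byte order: 1 + 1 + 2 + 4 + 4 bytes
        byte0,
        byte1,
        seq & 0xFFFF,             # 16-bit sequence number, wraps around
        timestamp & 0xFFFFFFFF,   # 32-bit media timestamp
        ssrc & 0xFFFFFFFF,        # 32-bit stream identifier
    )
    return header + payload


def parse_rtp_header(packet: bytes) -> dict:
    """Read the fixed header back out of a packet."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }
```

The receiver uses the sequence number to notice lost or reordered packets and the timestamp to schedule playback. Everything else (what the bytes mean, how to decode them) is somebody else's problem.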
RTSP: The remote control your DVR from 2003 would be proud of
RTSP (Real Time Streaming Protocol) is technically a “control protocol,” but spiritually it’s the nerdy AV kid in school who insists on using proper terminology for “play” and “pause.”
Instead of “start,” “hold on,” and “stop,” you send:
- PLAY
- PAUSE
- TEARDOWN (which feels way more dramatic than “stop”)
RTSP doesn’t deliver video — that’s RTP’s job. RTSP just yells instructions.
It’s basically:
“Hey RTP, buddy, could you hand the client video on port 5004? Thanks.”
But here’s the plot twist: RTSP evolved into the lingua franca of surveillance cameras. Want a live feed from any semi-serious IP camera? It almost certainly looks something like:
rtsp://username:password@camera-ip/Streaming/Channels/101
The rtsp:// part is a miracle of consistency in an industry otherwise convinced that every manufacturer must reinvent everything. The path after the host is where they get their revenge: the one above is Hikvision-style, and other vendors use their own.
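Since those URLs carry credentials in plain sight, it is worth pulling them apart before logging them anywhere. A small sketch using Python's standard `urllib.parse`, assuming the default RTSP port of 554 when none is given; the function names and the sample address are made up for illustration.

```python
from urllib.parse import urlsplit

DEFAULT_RTSP_PORT = 554


def describe_rtsp_url(url: str) -> dict:
    """Split an RTSP URL into the pieces a client actually needs."""
    parts = urlsplit(url)
    return {
        "host": parts.hostname,
        "port": parts.port or DEFAULT_RTSP_PORT,
        "username": parts.username,
        "password": parts.password,
        "path": parts.path,
    }


def redact_credentials(url: str) -> str:
    """Return the same URL without user:password, safe to put in a log line."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port:
        host += f":{parts.port}"
    return parts._replace(netloc=host).geturl()
```

For example, `redact_credentials("rtsp://admin:secret@192.168.1.64/Streaming/Channels/101")` gives back `rtsp://192.168.1.64/Streaming/Channels/101`.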
RTMP: The undead Flash protocol that refuses to die
RTMP (Real-Time Messaging Protocol) should be extinct. It was designed for Flash — you know, that browser plugin we ceremonially buried in 2020.
And yet somehow…
RTMP is still the default way to push a stream out of encoders and into platforms:
- OBS
- Streamlabs
- Twitch
- YouTube Live
- Facebook Live
It’s the cockroach of streaming protocols — embarrassingly hard to kill and inexplicably effective.
Is it pretty? No.
Is it modern? No.
Is it insanely reliable for pumping video into a media server? Absolutely.
RTMP is the ’99 Honda Civic of livestreaming: old, ugly, unkillable.
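In practice, "pumping video into a media server" usually means handing an encoder an ingest URL plus a stream key. Here is a hedged Python sketch that only assembles an ffmpeg command line for that job; it does not run anything. The ingest URL and stream key are placeholders, the function name is invented, and the codec choices (H.264 video, AAC audio, FLV container) are a common combination rather than a requirement of any specific platform.

```python
import shlex


def rtmp_push_command(source: str, ingest_url: str, stream_key: str) -> list[str]:
    """Build an ffmpeg argument list that pushes `source` to an RTMP ingest point."""
    target = f"{ingest_url.rstrip('/')}/{stream_key}"
    return [
        "ffmpeg",
        "-re",              # read input at its native rate, like a live source
        "-i", source,
        "-c:v", "libx264",  # H.264 video
        "-c:a", "aac",      # AAC audio
        "-f", "flv",        # RTMP carries FLV-packaged media
        target,
    ]


if __name__ == "__main__":
    # Placeholder values; substitute whatever your platform's dashboard gives you.
    cmd = rtmp_push_command("input.mp4", "rtmp://ingest.example.com/live", "my-stream-key")
    print(shlex.join(cmd))
```

Keeping the command as a list rather than one long string avoids shell-quoting surprises if it is later passed to `subprocess.run`.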
ONVIF: The committee that forces security cameras to behave (mostly)
ONVIF is not actually a streaming protocol — it’s more like a United Nations peacekeeping force for IP cameras.
Before ONVIF, every camera manufacturer had its own API, its own device discovery, its own settings layout, and its own personality disorder.
ONVIF stepped in and said:
“Everyone. Sit. We’re doing SOAP over HTTP now. And you’re going to like it.”
The industry grumbled… and complied.
Thanks to ONVIF, you can:
- auto-discover cameras,
- get their RTSP URLs,
- configure streams,
- control PTZ,
- subscribe to motion alarms,
- authenticate users.
Yes, it uses XML. Yes, it uses SOAP. Yes, it feels like configuring a printer in 2008.
But it works — and in the CCTV world, that’s practically a miracle.
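To give a taste of the XML, here is a sketch of the auto-discovery step in Python. It assumes cameras answer WS-Discovery probes on the standard multicast address 239.255.255.250, UDP port 3702, and that they advertise the `NetworkVideoTransmitter` type. The function names are invented, and the code only collects raw replies; extracting the service address from each reply is left out.

```python
import socket
import uuid

WS_DISCOVERY_ADDR = ("239.255.255.250", 3702)

PROBE_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<e:Envelope xmlns:e="http://www.w3.org/2003/05/soap-envelope"
            xmlns:w="http://schemas.xmlsoap.org/ws/2004/08/addressing"
            xmlns:d="http://schemas.xmlsoap.org/ws/2005/04/discovery"
            xmlns:dn="http://www.onvif.org/ver10/network/wsdl">
  <e:Header>
    <w:MessageID>uuid:{message_id}</w:MessageID>
    <w:To>urn:schemas-xmlsoap-org:ws:2005:04:discovery</w:To>
    <w:Action>http://schemas.xmlsoap.org/ws/2005/04/discovery/Probe</w:Action>
  </e:Header>
  <e:Body>
    <d:Probe>
      <d:Types>dn:NetworkVideoTransmitter</d:Types>
    </d:Probe>
  </e:Body>
</e:Envelope>"""


def build_probe() -> bytes:
    """Return a WS-Discovery Probe asking for ONVIF video devices."""
    return PROBE_TEMPLATE.format(message_id=uuid.uuid4()).encode("utf-8")


def discover(timeout: float = 2.0) -> list[tuple[str, bytes]]:
    """Send one probe and collect (sender_ip, raw_xml_reply) until the timeout."""
    replies = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_probe(), WS_DISCOVERY_ADDR)
        try:
            while True:
                data, (ip, _port) = sock.recvfrom(65535)
                replies.append((ip, data))
        except socket.timeout:
            pass  # nobody else answered in time
    return replies
```

Each reply is another SOAP envelope containing the device's service URL, which is where the rest of the ONVIF conversation (stream URLs, PTZ, events) takes place.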
Why the world still needs all four (and then some)
You might wonder, as any sane engineer would:
Why not replace all these dinosaurs with one elegant modern protocol?
Because real-time video is a chaotic hellscape of conflicting priorities:
Low latency
Security operators want <1 second.
Streamers want “fast enough not to spoil the meme.”
CDNs want 10 seconds because caching is cheaper.
Firewalls
They hate UDP.
They hate RTP.
They tolerate HTTP.
They pretend to like WebRTC but only out of politeness.
Scale
Viewing 12 camera feeds? → RTSP.
Streaming to a million phones? → HLS/DASH.
Video calling? → WebRTC.
Moving broadcast feeds between cities? → SRT/RIST.
There is no “one-size-fits-all.”
This is engineering, not wish-fulfillment.
The rest of the streaming circus
While RTP/RTSP/RTMP/ONVIF do their thing, the modern web brought its own stars:
HLS
Created by Apple. Uses HTTP. Works everywhere.
Has enough latency to make you yell “GOAL!” 20 seconds before your neighbor hears it.
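Where does that delay come from? A back-of-the-envelope sketch in Python, assuming the player starts about three segments behind the live edge (the guidance in RFC 8216) and that segments are 6 seconds long, a common choice. The function name and the 2-second allowance for encoding and CDN hops are illustrative guesses, not measurements.

```python
def hls_latency_floor(target_duration_s: float,
                      segments_behind_live: int = 3,
                      encode_and_cdn_s: float = 0.0) -> float:
    """Rough lower bound on glass-to-glass delay for classic (non-low-latency) HLS."""
    return target_duration_s * segments_behind_live + encode_and_cdn_s


if __name__ == "__main__":
    # 6 s segments, 3 segments of buffer, plus an assumed 2 s of encode/CDN overhead.
    print(hls_latency_floor(6, encode_and_cdn_s=2))
```

Shorter segments shrink the number, but they also mean more requests and less caching benefit, which is the trade the CDN crowd is making.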
MPEG-DASH
The standardized cousin of HLS. Great for massive audiences.
Terrible if you want to see anything in real time.
WebRTC
High-tech wizardry. Ultra-low latency. Encrypted everything.
Designed for video calls, but increasingly used for live surveillance previews.
SRT & RIST
For broadcasters who need to move high-quality, low-jitter video over the internet without praying to the network gods.
MPEG-TS over UDP
Still running half the world’s IPTV networks. If apocalypse comes, MPEG-TS will survive along with cockroaches and RTMP.
The bottom line
Real-time media is chaotic. Messy. Unpredictable. And built on a stack of old and new tools that work only because the industry collectively decided not to break them all at once.
- RTP moves the bits.
- RTSP tells RTP what to do.
- RTMP gets your livestream into the cloud whether you like it or not.
- ONVIF prevents IP cameras from behaving like wild animals.
They may be old. But together, they keep the world’s video feeds running — from your doorbell cam to a stadium livestream watched by 12 million people.
And until someone invents a protocol that teleports photons directly into your eyeballs, this gloriously dysfunctional family isn’t going anywhere.