Live Streaming infra and protocols
March 24, 2025 · 3 min read
The process of live streaming from alicloud
Live streaming is everywhere on the internet: video conferences, sports events, online courses, and more. You may even have started a live stream yourself. We are all used to just clicking a button on the screen to go live, but do you know what happens behind the scenes?
Introduction #
Before we start, let's go over some common terms in live streaming.

- The blue line is the push-pull logic of the live stream.
- The yellow line is the business logic.
Push streaming (End) #
Push streaming is the process of collecting audio and video data and sending it to the streaming media server. This step places high demands on the network, so you should push the stream from a stable network environment. The common protocol here is RTMP. The data is sent in encoded form: video encoded as e.g. H.264 or H.265, and audio encoded as e.g. AAC. (Decoding is done on the playback end.)
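As a rough illustration of the push step, the sketch below builds an ffmpeg command line that encodes a source as H.264 video and AAC audio and pushes it over RTMP. It assumes ffmpeg is installed with the libx264 encoder; the input file name and the ingest URL are hypothetical placeholders, not values from any real service.

```python
def build_push_command(source: str, push_url: str) -> list[str]:
    """Return an ffmpeg argument list that pushes `source` to an RTMP ingest URL."""
    return [
        "ffmpeg",
        "-re",              # read the input at its native rate, like a live source
        "-i", source,
        "-c:v", "libx264",  # encode video as H.264
        "-c:a", "aac",      # encode audio as AAC
        "-f", "flv",        # package the encoded data as FLV for RTMP
        push_url,
    ]


# Hypothetical input file and ingest URL, for illustration only.
cmd = build_push_command("demo.mp4", "rtmp://push.example.com/live/room1")
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd)` to start the push once a real ingest URL is available.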
Normally, a CDN server sits between the origin server and the viewers. It caches the live stream and pulls it from the origin server, so that a large number of viewer requests is served by the CDN instead of hitting the origin directly.
Origin server #
The origin consists of the streaming media server and the business server. The streaming media server handles the live stream data, while the business server handles business logic such as danmaku (bullet comments), chat, or even payment for sales.
Pull streaming (End) #
The pull streaming end is also called the playback end. Pull streaming is the process of receiving the live stream data from the server: devices such as smartphones, tablets, or computers pull it from the CDN server. The common protocols are RTMP, HLS, and HDL (HTTP-FLV).
The common protocols #
RTMP #
Real-Time Messaging Protocol (RTMP) is a TCP-based protocol from Adobe, so the data is transferred over a TCP connection. The most common media format is FLV. The default port is 1935.
It has some advantages:
- RTMP is TCP-based, so it can transfer data over a long-lived TCP connection. On the downside, port 1935 may be blocked by firewalls.
- It does not generate files on the streaming media server, so latency can be lower.
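To make the addressing concrete, here is a minimal sketch that splits an RTMP URL into the pieces a client needs before opening the TCP connection, falling back to port 1935 when none is given. The `rtmp://host/app/stream` layout is the common convention; the host and stream names below are hypothetical.

```python
from urllib.parse import urlsplit

DEFAULT_RTMP_PORT = 1935


def parse_rtmp_url(url: str) -> dict:
    """Split rtmp://host[:port]/app/stream into its parts."""
    parts = urlsplit(url)
    if parts.scheme != "rtmp":
        raise ValueError(f"not an RTMP URL: {url}")
    # The first path component is the application, the rest is the stream name.
    app, _, stream = parts.path.lstrip("/").partition("/")
    return {
        "host": parts.hostname,
        "port": parts.port or DEFAULT_RTMP_PORT,  # fall back to the default port
        "app": app,
        "stream": stream,
    }


print(parse_rtmp_url("rtmp://push.example.com/live/room1"))
```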
HLS #
HTTP Live Streaming (HLS) is a protocol from Apple. It has great compatibility across platforms, especially on mobile. It transfers the video as ts segments. The streaming media server uses an m3u8 index file to manage the segments and keep them in order. The media server buffers the video data into ts files of around 10 seconds each; when the client requests video data, the server sends the newest ts file. Because HLS is based on HTTP, every request for video data is a new HTTP request, which adds latency. Together with the segment duration, this means HLS cannot meet low-latency requirements.
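The index file is plain text, so its shape is easy to sketch. The example below generates a minimal live m3u8 playlist for a sliding window of 10-second ts segments and estimates the delay introduced by segmenting. The segment file names are hypothetical, and the assumption that a player buffers three segments before starting is only illustrative; real players and servers vary.

```python
def build_live_playlist(first_sequence: int, segment_count: int,
                        segment_seconds: int = 10) -> str:
    """Build a minimal live media playlist for consecutive ts segments."""
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{segment_seconds}",
        f"#EXT-X-MEDIA-SEQUENCE:{first_sequence}",  # number of the first listed segment
    ]
    for seq in range(first_sequence, first_sequence + segment_count):
        lines.append(f"#EXTINF:{segment_seconds:.1f},")
        lines.append(f"segment{seq}.ts")  # hypothetical file naming
    return "\n".join(lines) + "\n"


def segment_delay_seconds(buffered_segments: int, segment_seconds: int = 10) -> int:
    """Delay caused by waiting for whole segments before playback starts."""
    return buffered_segments * segment_seconds


playlist = build_live_playlist(first_sequence=7, segment_count=3)
print(playlist)
print(segment_delay_seconds(buffered_segments=3))  # assumed buffer of 3 segments
```

A live playlist has no end marker; the server keeps rewriting it as new segments arrive, and the client re-fetches it to discover them.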
It still has some advantages:
- HLS is based on HTTP, so it passes through firewalls easily.
- The video is delivered as separate ts files, so the client can switch to a different bit rate between segments.
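Bit rate switching can be sketched as a simple selection rule: pick the highest-bandwidth variant that fits within the measured throughput. The variant list, the playlist paths, and the 0.8 safety factor below are all hypothetical; real players use more elaborate heuristics.

```python
# Hypothetical variants: (bandwidth in bits per second, playlist path).
VARIANTS = [
    (800_000, "low/index.m3u8"),
    (2_000_000, "mid/index.m3u8"),
    (5_000_000, "high/index.m3u8"),
]


def choose_variant(measured_bps: float, variants=VARIANTS, safety: float = 0.8) -> str:
    """Return the playlist of the best variant that fits the measured throughput."""
    budget = measured_bps * safety  # leave headroom for throughput fluctuations
    affordable = [v for v in variants if v[0] <= budget]
    if not affordable:
        return min(variants)[1]  # nothing fits: fall back to the lowest bit rate
    return max(affordable)[1]


print(choose_variant(3_000_000))
print(choose_variant(500_000))
```

Because each segment is an independent file, the player can apply this rule again before every segment request.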
HTTP-FLV / HDL #
HTTP-FLV is a lightweight protocol for live streaming. It transfers FLV video over HTTP, typically on port 80. Its advantage is that the header information is simple, whereas RTMP has a complex handshake, so it can be a little faster than RTMP.
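To show how simple the header is, the sketch below parses the 9-byte header that begins an FLV stream: the `FLV` signature, a version byte, a flags byte marking audio and video presence, and the header size. The sample bytes are constructed by hand for illustration rather than captured from a real stream.

```python
import struct


def parse_flv_header(data: bytes) -> dict:
    """Parse the 9-byte header at the start of an FLV stream."""
    if len(data) < 9:
        raise ValueError("FLV header needs 9 bytes")
    # 3-byte signature, 1-byte version, 1-byte flags, 4-byte big-endian header size.
    signature, version, flags, header_size = struct.unpack(">3sBBI", data[:9])
    if signature != b"FLV":
        raise ValueError("not an FLV stream")
    return {
        "version": version,
        "has_audio": bool(flags & 0x04),
        "has_video": bool(flags & 0x01),
        "header_size": header_size,
    }


# Hand-built sample: version 1, audio and video present, header size 9.
sample = b"FLV" + bytes([0x01, 0x05]) + (9).to_bytes(4, "big")
print(parse_flv_header(sample))
```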
MPEG-DASH #
MPEG-DASH (Dynamic Adaptive Streaming over HTTP, from the Moving Picture Experts Group) is the first international standard for adaptive streaming over HTTP. Its distinguishing value is that it is codec-agnostic, so it can carry almost any video, including H.264, H.265, VP8/9, and AV1.
Websocket #
Normally, the websocket
protocol is not used in the streaming media, it is often used in the bussiness end for communication and interation. The interation between the host and the audience is jit. So the http cannot takes the role due to the fact that it is stateless and connectionless. Generally, the websocket is used to achieve the goal.
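A WebSocket connection starts as an ordinary HTTP request that is upgraded to a persistent, two-way connection. As a small sketch of that upgrade step, the function below computes the `Sec-WebSocket-Accept` value a server returns for a client's `Sec-WebSocket-Key`, as defined in RFC 6455. The sample key is the one used in the RFC's own example; a real business server would normally rely on a WebSocket library instead of doing this by hand.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept header value for a client key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))
# prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

After this exchange the connection stays open, so chat messages or danmaku can be pushed to every viewer without a new request each time.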
Practical situation #
In practice, a typical live streaming system works as follows:
- The host pushes the stream via RTMP.
- The client pulls the stream via HTTP-FLV or HLS.
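The setup above can be sketched as one stream name mapped to a push address and two playback addresses. The domains and the URL layout (`.flv` and `.m3u8` suffixes on the same path) are assumptions for illustration; the actual scheme depends on the streaming provider.

```python
def stream_urls(app: str, stream: str,
                push_domain: str = "push.example.com",
                pull_domain: str = "pull.example.com") -> dict:
    """Derive the push URL and the playback URLs for one live stream."""
    path = f"{app}/{stream}"
    return {
        "push_rtmp": f"rtmp://{push_domain}/{path}",           # used by the host
        "pull_http_flv": f"http://{pull_domain}/{path}.flv",   # lower-latency playback
        "pull_hls": f"http://{pull_domain}/{path}.m3u8",       # widely compatible playback
    }


urls = stream_urls("live", "room1")
for name, url in urls.items():
    print(name, url)
```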