Support docker and k8s in native #1595
A daemon is a background service that forks twice so that its parent process becomes PID 1, which distinguishes it from an ordinary background process. In Docker, Docker itself takes on the daemon role, so SRS needs its daemon setting to default to off there, meaning it always starts in the foreground. This is a change SRS needs to make to become cloud-native. For more details, refer to #1594.
|
Machines in Docker all have internal network addresses behind NAT and no longer have external IP addresses. Therefore, when an origin cluster redirects a request to the origin server that holds the stream, it may return an internal IP address. Please refer to #1501 for more details.
For scale beyond 100k streams, you need to build a custom solution yourself. Some suggested approaches are available; please refer to #1607 (comment) for more information.
|
SRS3 already supports Docker. The official Docker image is based on CentOS7, which is relatively mature and stable in the server field. Deployment images for SRS3 can be found at SRS3 Docker, development images at Development, and there are also special images such as SRT. For more details, please refer to srs-docker.
|
In cloud services, an SLB is usually placed in front of SRS to provide the service. An SLB is similar to an nginx reverse proxy, with high bandwidth and throughput. More SRS instances can be added behind it without changing the IP address exposed to external clients. In k8s, SRS runs in pods, and new pods can be started at any time; the SLB sits in front of the pods and provides the external service. An SLB generally performs health checks over TCP or HTTP, which SRS needs to support. Please refer to #1598 for more information.
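A TCP health check of the kind an SLB performs can be approximated as follows. This is a minimal sketch, not how any particular SLB implements it: the function name `tcp_health_check` and the timeout are illustrative assumptions, and the probe only verifies that the port accepts a connection.

```python
import socket


def tcp_health_check(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established.

    Hypothetical helper: it mimics the simplest form of a load-balancer
    TCP probe, which treats a successful connect as "healthy".
    """
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when the backend cannot be reached within the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `tcp_health_check("127.0.0.1", 1935)` would probe a local RTMP listener, assuming SRS is listening on port 1935.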
|
Suppose SRS shares a volume with another container such as Nginx, for example SRS generates HLS while Nginx distributes it. If the volume is a k8s emptyDir, the HTTP directory starts out empty, without crossdomain.xml or index.html, which is not very user-friendly. SRS can write some default files into it, but this is turned off by default. In k8s, it can be enabled by referring to #1603.
|
K8s storage issues fall into two types: configuration, and stream segments such as DVR or HLS. Configuration can be handled with a ConfigMap, which can be mounted as a shared volume in the container's file system so that it appears to SRS as a configuration file. For more details, please refer to 1, 2, 3. How to notify SRS to reload after the configuration file is updated requires further investigation.
For segment storage, when there is a single origin server, SRS and Nginx can be deployed in the same Pod and share a directory using Volume: emptyDir. SRS writes HLS files to this directory, while Nginx reads the HLS files and distributes them externally. In an Origin Cluster, however, the origin servers are separate Deployments and Services, so they cannot be deployed in the same Pod. Sharing storage across Pods requires a cross-Pod shared storage solution such as NAS.
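The single-origin layout, with SRS and Nginx in one Pod sharing an emptyDir, can be sketched as a manifest built in Python. This is only an illustration: the Pod name, volume name, image tags, and mount paths are assumptions and need to be adjusted to the actual images; only the `volumes`, `emptyDir`, and `volumeMounts` fields are standard Pod spec fields.

```python
import json

SHARED_VOLUME = "hls-cache"  # hypothetical volume name


def build_pod_manifest() -> dict:
    """Build a Pod manifest where SRS and Nginx mount the same emptyDir."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "srs-origin"},  # hypothetical Pod name
        "spec": {
            # emptyDir lives as long as the Pod and is visible to all of
            # its containers, so SRS can write HLS and Nginx can read it.
            "volumes": [{"name": SHARED_VOLUME, "emptyDir": {}}],
            "containers": [
                {
                    "name": "srs",
                    "image": "ossrs/srs:3",  # assumed image tag
                    "volumeMounts": [{
                        "name": SHARED_VOLUME,
                        # assumed HTTP directory inside the SRS image
                        "mountPath": "/usr/local/srs/objs/nginx/html",
                    }],
                },
                {
                    "name": "nginx",
                    "image": "nginx",
                    "volumeMounts": [{
                        "name": SHARED_VOLUME,
                        # assumed document root of the Nginx image
                        "mountPath": "/usr/share/nginx/html",
                    }],
                },
            ],
        },
    }


if __name__ == "__main__":
    # Emit JSON, which can be saved to a file and applied as a manifest.
    print(json.dumps(build_pod_manifest(), indent=2))
```

In an Origin Cluster the same idea does not work, because the containers live in different Pods and an emptyDir is scoped to a single Pod.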
|
In an origin server cluster, each origin server must serve the Edge, which means each origin server must be reachable and therefore needs its own service address. Please refer to Origin Server Cluster K8s Deployment Method for more details. This requires configuring
|
K8S updates and rollbacks can be achieved through Rolling Update. Version information is recorded each time. For a K8S gray release, the general approach is to build a new version of the image and deploy the new version of the application. The old and new applications carry the same labels, so the SLB or Service distributes traffic evenly across the old and new versions. Then, by scaling down the old version and scaling up the new version, traffic is gradually shifted to the new version. There are several issues here:
Regarding the second point above, how to perform a smooth upgrade, research shows that Kubernetes (K8S) has relevant mechanisms:
For the Wiki, please refer to: https://github.com/ossrs/srs/wiki/v4_CN_K8s#srs-cluster-canary-release
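The way traffic shifts during a gray release can be illustrated with a small calculation. This sketch assumes, as the description above does, that the SLB or Service spreads traffic evenly over all pods with the shared labels; the function names are hypothetical.

```python
from typing import List, Tuple


def new_version_share(old_replicas: int, new_replicas: int) -> float:
    """Fraction of traffic reaching the new version under even distribution."""
    total = old_replicas + new_replicas
    if total <= 0:
        raise ValueError("at least one replica is required")
    return new_replicas / total


def canary_steps(total_replicas: int) -> List[Tuple[int, int, float]]:
    """List (old, new, share) while replacing one pod at a time.

    The total replica count is held constant: each step scales the old
    version down by one and the new version up by one.
    """
    return [
        (total_replicas - n, n, new_version_share(total_replicas - n, n))
        for n in range(total_replicas + 1)
    ]
```

With three old pods and one new pod, `new_version_share(3, 1)` gives 0.25, so roughly a quarter of new connections land on the new version.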
|
SRS3 supports Gracefully Quit, which is triggered by sending the SIGQUIT signal to SRS. However, in K8S, SIGTERM is also sent to SRS after preStop. By default, SIGTERM triggers a Fast Quit, so if SRS receives SIGTERM during a Gracefully Quit, it still exits quickly. Therefore, in the Docker environment, SRS needs a configuration option that controls how SIGTERM is handled (ignore it, or treat it as Gracefully Quit). Since this is only needed during smooth upgrades, it is not enabled by default and is better specified through configuration. Reference: #1579 (comment) For example, the configuration is as follows:
So after we delete the Deployment, preStop is called first: it sends SIGQUIT to SRS to initiate Gracefully Quit and then waits for 60s. At 30 seconds, K8S sees that preStop has not finished yet, so it sends SIGTERM to SRS, waits another 2 seconds, and then sends SIGKILL to force SRS to exit. The log is as follows:
Wiki reference: https://github.com/ossrs/srs/wiki/v4_CN_K8s#srs-cluster-canary-release
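The signal handling and the termination timeline described in this comment can be modeled roughly as follows. This is a sketch under stated assumptions, not SRS code: the function names and the `treat_sigterm_as_graceful` flag are hypothetical stand-ins for the configuration option, and the timeline only covers the case above where preStop outlasts the 30-second grace period.

```python
import signal
from typing import List, Tuple


def resolve_quit_mode(signum: int, treat_sigterm_as_graceful: bool) -> str:
    """Map a signal to "graceful" or "fast" quit.

    SIGQUIT always means Gracefully Quit; SIGTERM is a Fast Quit unless
    the (hypothetical) flag asks for it to be treated as graceful.
    """
    if signum == signal.SIGQUIT:
        return "graceful"
    if signum == signal.SIGTERM:
        return "graceful" if treat_sigterm_as_graceful else "fast"
    raise ValueError(f"unhandled signal: {signum}")


def termination_timeline(
    grace_period_s: float = 30.0,  # figure taken from the comment above
    kill_delay_s: float = 2.0,     # figure taken from the comment above
) -> List[Tuple[float, str]]:
    """Return (seconds since deletion, signal name) for the described case."""
    return [
        (0.0, "SIGQUIT"),                           # sent by preStop
        (grace_period_s, "SIGTERM"),                # preStop still running
        (grace_period_s + kill_delay_s, "SIGKILL"), # forced exit
    ]
```

With the flag off, a SIGTERM arriving at 30 seconds turns the ongoing Gracefully Quit into a Fast Quit; with it on, SRS keeps draining until SIGKILL arrives at 32 seconds.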
|
When K8S starts deleting a Pod, the Pod is also removed from the Service. However, the two happen concurrently rather than one after the other, so the Pod may not yet have been removed from the Service when we receive the SIGQUIT signal. Please refer to Termination of Pods.
In that procedure, Step 5 is performed simultaneously with Step 3. This means that when we receive SIGQUIT, we cannot stop the listeners immediately. We need to wait for a short period, for example around 2.3 seconds, to make sure the Service has safely removed this Pod. Only then can we stop listening without causing any issues. A configuration option should be added for this.
By calculation, the minimum time required for SRS to perform Gracefully Quit is 5.5 seconds:
Wait for a certain period of time and then Gracefully Quit:
If you don't need to wait that long, you can configure a shorter duration; a few hundred milliseconds may be enough.
For the Wiki, please refer to: https://github.com/ossrs/srs/wiki/v4_CN_K8s#srs-cluster-canary-release
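The timing figures in this comment can be checked with simple arithmetic. The sketch below uses only the two numbers stated above, the roughly 2.3-second wait before closing listeners and the 5.5-second minimum; the remainder is just their difference, and the names and the deadline parameter are hypothetical.

```python
START_WAIT_MS = 2300  # wait before closing listeners (from the text)
MIN_TOTAL_MS = 5500   # stated minimum for a Gracefully Quit (from the text)


def remaining_budget_ms(total_ms: int = MIN_TOTAL_MS,
                        start_wait_ms: int = START_WAIT_MS) -> int:
    """Time left for the rest of the quit after the initial wait."""
    return total_ms - start_wait_ms


def fits_within(quit_ms: int, deadline_ms: int) -> bool:
    """True if a quit lasting quit_ms finishes before the given deadline."""
    return quit_ms <= deadline_ms
```

For instance, `fits_within(MIN_TOTAL_MS, 30_000)` checks the minimum quit time against an assumed 30-second termination grace period.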
|
The SRS configuration is stored in a ConfigMap. After the configuration changes, SRS needs to reload it, which raises the question of how K8S notifies the affected SRS instances. For more details, please refer to #1635.
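One possible building block is detecting that the mounted configuration file has changed. The sketch below only covers detection by content hash; how SRS is then told to reload is the open question tracked in #1635 and is deliberately left out. The function names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Tuple


def file_digest(path: str) -> str:
    """Return the SHA-256 hex digest of the file's current content."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def needs_reload(path: str, last_digest: str) -> Tuple[bool, str]:
    """Compare the file against the last seen digest.

    Returns (changed, current_digest) so the caller can store the new
    digest and decide what to do when the content has changed.
    """
    current = file_digest(path)
    return current != last_digest, current
```

A sidecar or wrapper process could call `needs_reload` periodically on the mounted file and act only when the first element of the result is `True`.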
|
K8S resources include CPU and Memory, and newer versions also support extended resources. How resources are defined and consumed is the basis for accurately evaluating load levels and for scaling up or down. This may require more work on the SRS side; please refer to Resource.
|
SRS needs some modifications to support Docker and K8S better, especially in the Origin Cluster and Edge Cluster modes. With these, users can quickly build a streaming media origin and distribution cluster with SRS, and scaling out, scaling in, monitoring, and operation also become convenient.
The important direction of SRS4 is to be cloud-native, creating a streaming media cluster on the cloud.
Please refer to the K8S Wiki at: https://github.com/ossrs/srs/wiki/v4_CN_K8s