This document describes how to build a technical stack for multimedia research and development.
- Basic knowledge of computer science and mathematical tools, especially computer networks, operating systems, convex optimization, and machine learning.
- The video streaming algorithms listed in the Repository of papers, such as Buffer-Based, Rate-Based, Pensieve, Comyco, and PiTree.
- Video coding (H.264, H.265, SVC) and related tools (FFmpeg). Video coding researchers aim to design better compression methods that reduce video size while maintaining perceived quality, whereas video transmission researchers aim to design better rate adaptation schemes that follow the network conditions, using existing coding tools. The two areas are closely connected, but their research focuses differ.
- Image processing (super-resolution, denoising, and deblurring). Some advanced tools, such as StyleGAN, make video more interesting.
- Mobile/edge/cloud computing. A video system sometimes needs to allocate resources and group mobile users. Optimization techniques such as bandits, knapsack formulations, greedy methods, auctions, game theory, and DRL are often used to solve these system problems.
- Transmission protocols, such as RTMP, RTSP, HLS, and DASH.
- Advanced tools. DASH is often used for on-demand video in experiments, and WebRTC is often used for video telephony. Other useful tools include QUIC and VLC player.
- The ability to formulate an optimization problem for video streaming or video analysis.
- An awareness of the latest research directions, such as AI and 6-DoF video (point clouds).
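
To make the rate adaptation item above more concrete, here is a minimal Python sketch of the buffer-based idea: choose the next chunk's bitrate from the playback buffer level alone. It is a simplified illustration, not the published algorithm (for example, it has no hysteresis between consecutive decisions). The bitrate ladder and the `reservoir_s` and `cushion_s` values are assumptions chosen for the example.

```python
from bisect import bisect_right

# Hypothetical bitrate ladder in kbps (illustrative values only).
BITRATES_KBPS = [300, 750, 1200, 1850, 2850, 4300]


def buffer_based_bitrate(buffer_s, reservoir_s=5.0, cushion_s=10.0,
                         bitrates=BITRATES_KBPS):
    """Pick a bitrate (kbps) using only the current buffer level in seconds.

    Below the reservoir: lowest bitrate, to avoid rebuffering.
    Above reservoir + cushion: highest bitrate.
    In between: map the buffer level linearly to a target rate, then
    take the highest ladder entry that does not exceed that target.
    """
    lowest, highest = bitrates[0], bitrates[-1]
    if buffer_s <= reservoir_s:
        return lowest
    if buffer_s >= reservoir_s + cushion_s:
        return highest
    # Linear map from buffer occupancy to a target rate.
    fraction = (buffer_s - reservoir_s) / cushion_s
    target = lowest + (highest - lowest) * fraction
    # Index of the highest ladder entry <= target.
    index = bisect_right(bitrates, target) - 1
    return bitrates[max(index, 0)]


if __name__ == "__main__":
    for level in (2.0, 10.0, 20.0):
        print(level, "s buffer ->", buffer_based_bitrate(level), "kbps")
```

Rate-based schemes would instead choose from a throughput estimate, and learning-based schemes such as Pensieve replace this hand-written mapping with a learned policy.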
Additional skills:
- Deep learning (PyTorch/TensorFlow). Nowadays, many SOTA algorithms, such as those for DRL, super-resolution, and video coding, use deep-learning-based methods.
- Android development, if you want to develop your mobile video application and system.
- Video quality assessment, that is, how to evaluate a video's perceived quality. Basic metrics: PSNR, SSIM, VMAF.
- VR focuses on how to render frames in time. VR development tool: Unity. AR is related to video analysis. Tools: ARKit and ARCore.
- 360-degree video: viewport prediction, projection, gaze collection, stitching, etc.
- Using C/C++ to develop a video system.
- Many more: Nginx, graphics.
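
As a small illustration of the video quality assessment item above, PSNR can be computed from the mean squared error between a reference frame and a distorted frame: PSNR = 10 * log10(MAX^2 / MSE). The sketch below uses plain Python on flat lists of pixel values. The function name `psnr` and the default 8-bit peak value of 255 are assumptions made for the example; in practice you would more likely use an existing tool or library.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Return PSNR in dB between two equally sized sequences of pixel values."""
    if len(reference) != len(distorted) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        # Identical frames: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ref = [100, 100, 100, 100]
    dist = [101, 99, 101, 99]
    print(f"PSNR: {psnr(ref, dist):.2f} dB")
```

PSNR is easy to compute but correlates only loosely with perceived quality, which is why SSIM and VMAF are listed alongside it.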
It may seem really hard to master the field of multimedia. However, you only need to be an expert in one area and learn the basics of the others. Multimedia applications have penetrated every aspect of our lives, such as video entertainment, video conferencing, and video analysis with computer vision algorithms. If you are a multimedia expert, you can create many amazing and valuable applications. Last but not least, interest is the most important thing.