Recording

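Recording the screen, camera, microphone, and speakers is a common requirement. Compared with the transcoding example, this example differs mainly in the following:

Opening and reading from input devices
Using swscale to scale decoded pictures and convert their pixel format
Handling the timestamps of decoded and encoded frames correctly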

Windows

# List all devices
ffmpeg -hide_banner -f dshow -list_devices true -i dummy

# Record the camera
ffmpeg -f dshow -i video="HD WebCam" -c:v libx264 camera.mp4
# Record the screen
ffmpeg -f gdigrab -framerate 25 -offset_x 100 -offset_y 200 -video_size 720x360 -i desktop -c:v libx264 screen.mp4

Linux

# Record the camera
ffmpeg -f v4l2 -i /dev/video0 -c:v libx264 camera.mkv
# Record the screen
ffmpeg -framerate 25 -video_size 720x360 -f x11grab -i :0.0+100,200 -c:v libx264 screen.mp4

# Record the microphone (use either the pulse or the alsa input)
ffmpeg -f pulse -i default -y a.wav
ffmpeg -f alsa -i default -y a.wav

# Record the speakers
ffmpeg -hide_banner -sources pulse # list the pulse audio sources with ffmpeg
# Auto-detected sources for pulse:
# * alsa_input.usb-046d_081b_D16189C0-02.mono-fallback [Webcam C310 Mono]
#   alsa_output.pci-0000_00_1f.3.iec958-stereo.monitor [Monitor of Built-in Audio Digital Stereo (IEC958)]
#   alsa_output.pci-0000_01_00.1.hdmi-stereo.monitor [Monitor of TU104 HD Audio Controller Digital Stereo (HDMI)]

# pactl list short sources
# 1	alsa_input.usb-046d_081b_D16189C0-02.mono-fallback	module-alsa-card.c	s16le 1ch 48000Hz	SUSPENDED
# 2	alsa_output.pci-0000_00_1f.3.iec958-stereo.monitor	module-alsa-card.c	s16le 2ch 44100Hz	SUSPENDED
# 3	alsa_output.pci-0000_01_00.1.hdmi-stereo.monitor	module-alsa-card.c	s16le 2ch 44100Hz	SUSPENDED

ffmpeg -hide_banner -f pulse -i alsa_output.pci-0000_01_00.1.hdmi-stereo.monitor -y a.wav

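Video recording

Opening the input device

Compared with opening a video file, accessing other input devices needs a little more preparation. First, register the devices: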

avdevice_register_all();

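When opening the device, you need to specify the input device format:

// The input format differs between Windows and Linux; it is the value passed to -f on the ffmpeg command line (see the commands at the top of this section)
if (avformat_open_input(&decoder_fmt_ctx, input, av_find_input_format(input_format), nullptr) < 0) {
    // the device could not be opened; handle the error here
}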
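
As a rough sketch of how `input_format` could be chosen per platform, the helper below maps a capture kind to the `-f` values used in the commands at the top of this section. `CaptureKind` and `input_format_name` are hypothetical names introduced for illustration and are not part of FFmpeg; using dshow for audio on Windows is an assumption, since the commands above only show it for the camera.

```cpp
#include <string>

// Hypothetical enum for illustration; not an FFmpeg type.
enum class CaptureKind { Camera, Screen, Audio };

// Returns the demuxer name that the commands above pass to -f.
// Assumption: anything that is not Windows is treated as Linux with PulseAudio.
inline std::string input_format_name(CaptureKind kind, bool is_windows) {
    switch (kind) {
        case CaptureKind::Camera: return is_windows ? "dshow" : "v4l2";
        case CaptureKind::Screen: return is_windows ? "gdigrab" : "x11grab";
        case CaptureKind::Audio:  return is_windows ? "dshow" : "pulse";
    }
    return "";  // unreachable for valid enum values
}
```

The returned string would then be passed as `input_format_name(kind, is_windows).c_str()` to `av_find_input_format()`.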

Scaling and `PIX_FMT_` conversion

Prepare the context and allocate a buffer for the converted picture:

    SwsContext * sws_ctx = sws_getContext(
            decoder_ctx->width,decoder_ctx->height,decoder_ctx->pix_fmt,
            encoder_ctx->width,encoder_ctx->height,encoder_ctx->pix_fmt,
            SWS_BICUBIC, nullptr, nullptr, nullptr
    );

    AVFrame * scaled_frame = av_frame_alloc();
    scaled_frame->height = encoder_ctx->height;
    scaled_frame->width = encoder_ctx->width;
    scaled_frame->format = encoder_ctx->pix_fmt;
    av_frame_get_buffer(scaled_frame, 0);

The scaling and conversion call:

sws_scale(
    sws_ctx,
    static_cast<const uint8_t *const *>(decoded_frame->data), decoded_frame->linesize,
    0, decoder_ctx->height,
    scaled_frame->data, scaled_frame->linesize);

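Scaling and format conversion can also be done with a filter, and filters negotiate formats automatically; see the later filter examples.

Timing

Because video consists of discrete frames, the frame rate and timestamps are needed to control when each frame is actually displayed.

First, the AVRational struct: ffmpeg uses it to represent rational numbers as a fraction num/den.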

/**
 * Rational number (pair of numerator and denominator).
 */
typedef struct AVRational{
    int num; ///< Numerator
    int den; ///< Denominator
} AVRational;

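In ffmpeg, both the frame rate (framerate) and the time base (time_base) are represented with AVRational.

framerate: the frame rate; for example, 24 fps is AVRational{24, 1}.
time_base: the unit of a timestamp (think of it as a time slice, a scaling of one second). Time in ffmpeg is not measured in a fixed unit such as 1 s, 1 ms, or 1 us; instead, the codec context (AVCodecContext) and each video/audio stream (AVStream) can set their own time_base. A time_base of 1 ms is AVRational{1, 1000}, that is, one thousandth of a second.
- AVCodecContext.time_base: gives the exact fps. If ticks_per_frame is 2, the frame rate is half of what time_base alone implies. For example, if AVCodecContext.time_base is (1, 60) and ticks_per_frame is 1, the fps is 60; if ticks_per_frame is 2, the fps is 30. In other words, AVCodecContext.time_base is tied to the fps. When the fps is fixed, AVCodecContext.time_base should be 1/framerate; when the fps is variable, there is no single fps value (or 1/AVCodecContext.time_base can be taken as a nominal one).
- AVStream.time_base: the time unit for the time-related fields of AVStream only, such as the time of a frame or start_time. In other words, AVStream.time_base just needs to be a sufficiently precise time unit. When encoding, a manually set AVStream.time_base may be replaced by ffmpeg depending on the output format.
- Why both AVCodecContext.time_base and AVStream.time_base exist: generally the codec time base is the inverse of the frame rate, so the PTS can simply be incremented by 1 for the next frame, whereas the stream time base can depend on format/codec specifications. Packet PTS/DTS must be in stream time-base units before writing, so rescaling between the codec and stream time bases is required.
pts: presentation timestamp, the time at which the frame should be shown. It is measured in time_base units, that is, how many time_base ticks have elapsed from the start of the video to this frame.
- AVPacket.pts must be in units of the corresponding stream's AVStream.time_base.
- The unit of AVFrame.pts is not fixed; it is the time_base that applies to the packet fed to the decoder or the frame fed to the encoder.
dts: decoding timestamp, the time at which the frame should be decoded. Some codecs use predicted frames, so frames are encoded and decoded in a different order from the one they are displayed in, and pts >= dts. When encoded packets are written to a file, their dts should be monotonically increasing.
duration: the interval between two frames.
AVStream.r_frame_rate: the frame rate guessed by libavformat.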
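
To make the relationship between a timestamp and its time_base concrete, here is a minimal standalone sketch. `Rational`, `ts_to_seconds`, and `invert` are illustrative stand-ins so the snippet compiles without FFmpeg; in real code the corresponding pieces are AVRational, `ts * av_q2d(time_base)`, and av_inv_q().

```cpp
#include <cstdint>

// Stand-in for AVRational so the example builds without FFmpeg headers.
struct Rational { int num; int den; };

// Seconds represented by `ts` ticks of `time_base`: ts * num / den.
constexpr double ts_to_seconds(std::int64_t ts, Rational time_base) {
    return static_cast<double>(ts) * time_base.num / time_base.den;
}

// Inverse of a rational: a frame rate of 25/1 gives a time_base of 1/25.
constexpr Rational invert(Rational q) { return Rational{q.den, q.num}; }
```

For example, with a 25 fps frame rate the time_base is `invert(Rational{25, 1})`, so a pts of 50 corresponds to `ts_to_seconds(50, Rational{1, 25})`, which is 2 seconds.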

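Setting `time_base` and `pts`

When encoding, time_base has to be set manually:

// For transcoding or screen recording, the encoder usually uses the same frame rate as the input source; a fixed frame rate of your choice also works
encoder_ctx->framerate = av_guess_frame_rate(decoder_fmt_ctx, decoder_fmt_ctx->streams[video_stream_idx], nullptr);
// The codec context time_base is usually the inverse of the frame rate, so the next frame's pts is the current pts + 1 and every pts is an integer.
// Using the same time_base as the input decoder context, or any other chosen time_base, also works
encoder_ctx->time_base = av_inv_q(encoder_ctx->framerate);
// The stream time_base must be set before calling avformat_write_header() (or left unset).
// avformat_write_header() may overwrite it, so the final value is not necessarily the one set here
encoder_fmt_ctx->streams[0]->time_base = encoder_ctx->time_base;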

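When transcoding video, decoded frames already carry pts and related timestamps. Streams from screen or camera capture, however, do not have usable timestamps, so each frame's timestamp has to be computed from the system clock.

Record the time at which recording starts as first_pts, then call av_gettime_relative() for the current time and subtract first_pts to get the frame's pts. av_gettime_relative() returns microseconds, which matches ffmpeg's internal time unit AVRational{1, 1000000}, so the result has to be rescaled to the target time base.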

int64_t first_pts = AV_NOPTS_VALUE;
first_pts = first_pts == AV_NOPTS_VALUE ? av_gettime_relative() : first_pts;
scaled_frame->pts = av_rescale_q(av_gettime_relative() - first_pts, { 1, AV_TIME_BASE }, encoder_fmt_ctx->streams[0]->time_base);
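
The same idea can be tried without FFmpeg. The sketch below is only an approximation under stated assumptions: `Rational`, `rescale`, and `WallClockPts` are hypothetical names, `std::chrono::steady_clock` stands in for av_gettime_relative(), and `rescale` is a simplified round-to-nearest version of what av_rescale_q() does (non-negative inputs only, no overflow protection).

```cpp
#include <chrono>
#include <cstdint>

// Stand-in for AVRational so the example builds without FFmpeg headers.
struct Rational { int num; int den; };

// Converts `a` ticks of `from` into ticks of `to`, rounding to nearest.
// Simplified: assumes a >= 0 and that the products fit in 64 bits.
inline std::int64_t rescale(std::int64_t a, Rational from, Rational to) {
    const std::int64_t n = static_cast<std::int64_t>(from.num) * to.den;
    const std::int64_t d = static_cast<std::int64_t>(from.den) * to.num;
    return (a * n + d / 2) / d;
}

// Produces a pts from the wall clock, starting at 0 on the first call.
class WallClockPts {
public:
    explicit WallClockPts(Rational time_base) : time_base_(time_base) {}

    std::int64_t next() {
        const auto now = std::chrono::steady_clock::now();
        if (!started_) { start_ = now; started_ = true; }
        const auto us = std::chrono::duration_cast<std::chrono::microseconds>(now - start_).count();
        return rescale(us, Rational{1, 1000000}, time_base_);
    }

private:
    Rational time_base_;
    std::chrono::steady_clock::time_point start_{};
    bool started_ = false;
};
```

With a time_base of 1/25, one second of elapsed time (1000000 us) rescales to a pts of 25, and two frames captured less than 20 ms apart round to the same pts, which is one way duplicate timestamps can appear.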

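Alternatively, you can use the pts supplied by the camera or other input source. Camera pts values usually do not start at 0, so subtract the stream's start time when a packet is read: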

packet->pts -= decoder_fmt_ctx->streams[video_stream_idx]->start_time;

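The pts must be set before encoding so that the encoder can assign the matching pts and dts to the packets it produces. Because there are different frame types, the encoder emits packets in dts order rather than pts order, and dts must be monotonically increasing when written to a file. (A monotonicity check can be added here: timestamps are usually rescaled before writing, and if the rescaled value is truncated, dts values may repeat and the write will fail.)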
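
One possible shape for that monotonicity check is sketched below. `DtsGuard` is a hypothetical helper, not an FFmpeg API; it keeps the last dts seen for one stream and bumps any value that does not increase. Requiring a strict increase is an assumption, and the caller is still responsible for keeping pts >= dts after a bump.

```cpp
#include <cstdint>

// Hypothetical per-stream guard that keeps dts strictly increasing.
class DtsGuard {
public:
    // Returns `dts` unchanged if it is greater than the previous one,
    // otherwise returns previous + 1.
    std::int64_t fix(std::int64_t dts) {
        if (has_last_ && dts <= last_) {
            dts = last_ + 1;
        }
        last_ = dts;
        has_last_ = true;
        return dts;
    }

private:
    std::int64_t last_ = 0;
    bool has_last_ = false;
};
```

In a muxing loop this would be applied to each packet's dts after rescaling to the stream time base and before the packet is written.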

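Audio recording

For audio basics, start with a blog post such as 数字音频基础－从PCM说起 (listed in References). The following parameters are commonly used in audio processing:

sample_rate: sample rate
channels: number of channels
channel_layout: channel layout
sample_fmt: sample format

Unlike video, audio is a continuous run of samples (discrete, but taken at equal time intervals) and is sensitive to timing, because people are sensitive to sound. With a fixed sample rate, setting time_base to the inverse of the sample rate makes every pts a consecutive integer that is very easy to compute: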

resampled_frame->pts = first_pts;
first_pts += resampled_frame->nb_samples;
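
The two lines above amount to a running sample counter. A minimal standalone sketch, assuming time_base = 1/sample_rate and using the hypothetical name `SamplePtsCounter`:

```cpp
#include <cstdint>

// Running pts for audio frames when time_base is 1 / sample_rate.
class SamplePtsCounter {
public:
    // Returns the pts for a frame of `nb_samples` samples and advances
    // the counter by that many samples.
    std::int64_t next(int nb_samples) {
        const std::int64_t pts = next_pts_;
        next_pts_ += nb_samples;
        return pts;
    }

private:
    std::int64_t next_pts_ = 0;
};
```

Three frames of 1024, 1024, and 960 samples get pts 0, 1024, and 2048, so frames of different sizes still line up without gaps.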

References

[Ffmpeg-devel] Frame rates and time_base
ffmpeg time unit explanation and av_seek_frame method
[Libav-user] Helo in understanding PTS and DTS
数字音频基础－从PCM说起
transcode_aac_8c-example
YUV PixelFormat
