FFmpeg source code structure AVPacket, AVPacketSideData, AVBufferRef and AVBuffer.md

AVPacket stores compressed (encoded) data. It is typically produced by a demuxer and passed to a decoder as input, or produced by an encoder and passed to a muxer.


For video, a packet should normally contain one compressed frame. For audio, it may contain several compressed frames. Encoders are allowed to output empty packets that carry no compressed data and only side data (for example, to update some stream parameters at the end of encoding).

Data ownership semantics depend on the buf field. If buf is set, the packet data is dynamically allocated and remains valid until a call to av_packet_unref() reduces the reference count to 0.

If the buf field is not set, av_packet_ref() will make a copy instead of increasing the reference count.

Side data is always allocated with av_malloc(), copied by av_packet_ref(), and freed by av_packet_unref().

Treating sizeof(AVPacket) as part of the public ABI is deprecated. Once av_init_packet() is removed, new packets can only be allocated with av_packet_alloc(), and new fields may be added to the end of the struct.

We start with AVPacket itself, then follow its fields to AVBufferRef and AVPacketSideData. Finally, we go from AVBufferRef to AVBuffer, and from AVPacketSideData to the AVPacketSideDataType enumeration.

1, AVPacket

libavcodec/packet.h

typedef struct AVPacket {
    /**
     * A reference to the reference-counted buffer where the packet data is
     * stored.
     * May be NULL, then the packet data is not reference-counted.
     */
    AVBufferRef *buf;
    /**
     * Presentation timestamp in AVStream->time_base units; the time at which
     * the decompressed packet will be presented to the user.
     * Can be AV_NOPTS_VALUE if it is not stored in the file.
     * pts MUST be larger or equal to dts as presentation cannot happen before
     * decompression, unless one wants to view hex dumps. Some formats misuse
     * the terms dts and pts/cts to mean something different. Such timestamps
     * must be converted to true pts/dts before they are stored in AVPacket.
     */
    int64_t pts;
    /**
     * Decompression timestamp in AVStream->time_base units; the time at which
     * the packet is decompressed.
     * Can be AV_NOPTS_VALUE if it is not stored in the file.
     */
    int64_t dts;
    uint8_t *data;
    int   size;
    int   stream_index;
    /**
     * A combination of AV_PKT_FLAG values
     */
    int   flags;
    /**
     * Additional packet data that can be provided by the container.
     * Packet can contain several types of side information.
     */
    AVPacketSideData *side_data;
    int side_data_elems;

    /**
     * Duration of this packet in AVStream->time_base units, 0 if unknown.
     * Equals next_pts - this_pts in presentation order.
     */
    int64_t duration;

    int64_t pos;                            ///< byte position in stream, -1 if unknown

#if FF_API_CONVERGENCE_DURATION
    /**
     * @deprecated Same as the duration field, but as int64_t. This was required
     * for Matroska subtitles, whose duration values could overflow when the
     * duration field was still an int.
     */
    attribute_deprecated
    int64_t convergence_duration;
#endif
} AVPacket;

Here's what each field means.

| Field | Meaning |
|---|---|
| `AVBufferRef *buf` | Reference to the reference-counted buffer that stores the packet data. May be NULL, in which case the packet data is not reference-counted. |
| `int64_t pts` | Presentation timestamp in AVStream->time_base units: the time at which the decompressed packet is presented to the user. |
| `int64_t dts` | Decompression timestamp in AVStream->time_base units: the time at which the packet is decompressed. |
| `uint8_t *data` | The actual data buffer of the packet. |
| `int size` | The size of the packet data in bytes. |
| `int stream_index` | The index of the stream the packet belongs to. |
| `int flags` | A combination of AV_PKT_FLAG values. |
| `AVPacketSideData *side_data` | Additional packet data that the container can provide. |
| `int side_data_elems` | The number of elements in side_data. |
| `int64_t duration` | Duration of this packet in AVStream->time_base units, or 0 if unknown. |
| `int64_t pos` | The byte position in the stream, or -1 if unknown. |

The following AV_PKT_FLAG values can be combined in the flags field.

libavcodec/packet.h

#define AV_PKT_FLAG_KEY        0x0001 // The packet contains a key frame.
#define AV_PKT_FLAG_CORRUPT    0x0002 // The packet content is corrupted.
#define AV_PKT_FLAG_DISCARD    0x0004 // Marks packets that are needed to keep the decoder in a valid state but are not needed for output; they should be dropped after decoding.
#define AV_PKT_FLAG_TRUSTED    0x0008 // The packet comes from a trusted source.
#define AV_PKT_FLAG_DISPOSABLE 0x0010 // The packet contains frames that the decoder can discard, i.e. non-reference frames.

2, AVBufferRef

A reference to a data buffer. The size of this structure is not part of the public ABI and is not intended to be allocated directly.

libavutil/buffer.h

typedef struct AVBufferRef {
     AVBuffer *buffer;
 
     /**
      * The data buffer. It is considered writable if and only if
      * this is the only reference to the buffer, in which case
      * av_buffer_is_writable() returns 1.
      */
     uint8_t *data;
     /**
      * Size of data in bytes.
      */
 #if FF_API_BUFFER_SIZE_T
     int      size;
 #else
     size_t   size;
 #endif
} AVBufferRef;

| Field | Meaning |
|---|---|
| `AVBuffer *buffer` | The reference-counted buffer. It is opaque and is meant to be used only through references (AVBufferRef). |
| `uint8_t *data` | The data buffer. It is considered writable if and only if this is the only reference to the buffer, in which case av_buffer_is_writable() returns 1. |
| `size_t` / `int size` | The size of data in bytes. |

3, AVBuffer

A reference-counted buffer type, defined in libavutil/buffer_internal.h. It is opaque and is meant to be used only through references (AVBufferRef).

libavutil/buffer_internal.h

struct AVBuffer {
    uint8_t *data; /**< data described by this buffer */
    buffer_size_t size; /**< size of data in bytes */

    /**
     *  number of existing AVBufferRef instances referring to this buffer
     */
    atomic_uint refcount;

    /**
     * a callback for freeing the data
     */
    void (*free)(void *opaque, uint8_t *data);

    /**
     * an opaque pointer, to be used by the freeing callback
     */
    void *opaque;

    /**
     * A combination of AV_BUFFER_FLAG_*
     */
    int flags;

    /**
     * A combination of BUFFER_FLAG_*
     */
    int flags_internal;
};

| Field | Meaning |
|---|---|
| `uint8_t *data` | The data described by this buffer. |
| `buffer_size_t size` | The size of data in bytes. |
| `atomic_uint refcount` | The number of existing AVBufferRef instances that refer to this buffer. |
| `void (*free)(void *opaque, uint8_t *data)` | The callback used to free the data. |
| `void *opaque` | An opaque pointer passed to the freeing callback. |
| `int flags` | A combination of `AV_BUFFER_FLAG_*` values. |
| `int flags_internal` | A combination of `BUFFER_FLAG_*` values. |

AVBuffer is an API for reference-counted data buffers.

There are two core objects AVBuffer and AVBufferRef in this API. AVBuffer represents the data buffer itself; it is opaque and cannot be accessed directly by the caller, but only through AVBufferRef. However, the caller may compare two AVBuffer pointers to check whether two different references describe the same data buffer. AVBufferRef represents a single reference to AVBuffer, which can be operated directly by the caller.

Two functions create a new AVBuffer with a single reference: av_buffer_alloc() allocates a new buffer, and av_buffer_create() wraps an existing array in an AVBuffer. Given an existing reference, av_buffer_ref() creates another one, and av_buffer_unref() releases a reference (once all references are released, the data is freed automatically).

The convention between this API and the rest of FFmpeg is that a buffer is considered writable if there is only one reference to it (and it is not marked read-only). The av_buffer_is_writable() function checks whether this is the case, and av_buffer_make_writable() automatically creates a new writable buffer when necessary.

Of course, nothing prevents the calling code from violating this Convention, but it is only safe if all existing references are under its control.

Referencing and unreferencing buffers is thread-safe, so these operations can be performed from multiple threads at the same time without any additional locking.

Two different references to the same buffer can point to different parts of the buffer (that is, their AVBufferRef.data pointers will not be equal).

4, AVPacketSideData

Additional Packet data that the container can provide. A Packet can contain several types of side information.

libavcodec/packet.h

typedef struct AVPacketSideData {
    uint8_t *data;
#if FF_API_BUFFER_SIZE_T
    int      size;
#else
    size_t   size;
#endif
    enum AVPacketSideDataType type;
} AVPacketSideData;

| Field | Meaning |
|---|---|
| `uint8_t *data` | The side data buffer. |
| `int` / `size_t size` | The size of the side data buffer in bytes. |
| `enum AVPacketSideDataType type` | The packet side data type. |

The AVPacketSideDataType enumeration defines various side data types.

libavcodec/packet.h

/**
 * @defgroup lavc_packet AVPacket
 *
 * Types and functions for working with AVPacket.
 * @{
 */
enum AVPacketSideDataType {
    /**
     * An AV_PKT_DATA_PALETTE side data packet contains exactly AVPALETTE_SIZE
     * bytes worth of palette. This side data signals that a new palette is
     * present.
     */
    AV_PKT_DATA_PALETTE,

    /**
     * The AV_PKT_DATA_NEW_EXTRADATA is used to notify the codec or the format
     * that the extradata buffer was changed and the receiving side should
     * act upon it appropriately. The new extradata is embedded in the side
     * data buffer and should be immediately used for processing the current
     * frame or packet.
     */
    AV_PKT_DATA_NEW_EXTRADATA,

    /**
     * An AV_PKT_DATA_PARAM_CHANGE side data packet is laid out as follows:
     * @code
     * u32le param_flags
     * if (param_flags & AV_SIDE_DATA_PARAM_CHANGE_CHANNEL_COUNT)
     *     s32le channel_count
     * if (param_flags & AV_SIDE_DATA_PARAM_CHANGE_CHANNEL_LAYOUT)
     *     u64le channel_layout
     * if (param_flags & AV_SIDE_DATA_PARAM_CHANGE_SAMPLE_RATE)
     *     s32le sample_rate
     * if (param_flags & AV_SIDE_DATA_PARAM_CHANGE_DIMENSIONS)
     *     s32le width
     *     s32le height
     * @endcode
     */
    AV_PKT_DATA_PARAM_CHANGE,

    /**
     * An AV_PKT_DATA_H263_MB_INFO side data packet contains a number of
     * structures with info about macroblocks relevant to splitting the
     * packet into smaller packets on macroblock edges (e.g. as for RFC 2190).
     * That is, it does not necessarily contain info about all macroblocks,
     * as long as the distance between macroblocks in the info is smaller
     * than the target payload size.
     * Each MB info structure is 12 bytes, and is laid out as follows:
     * @code
     * u32le bit offset from the start of the packet
     * u8    current quantizer at the start of the macroblock
     * u8    GOB number
     * u16le macroblock address within the GOB
     * u8    horizontal MV predictor
     * u8    vertical MV predictor
     * u8    horizontal MV predictor for block number 3
     * u8    vertical MV predictor for block number 3
     * @endcode
     */
    AV_PKT_DATA_H263_MB_INFO,

    /**
     * This side data should be associated with an audio stream and contains
     * ReplayGain information in form of the AVReplayGain struct.
     */
    AV_PKT_DATA_REPLAYGAIN,

    /**
     * This side data contains a 3x3 transformation matrix describing an affine
     * transformation that needs to be applied to the decoded video frames for
     * correct presentation.
     *
     * See libavutil/display.h for a detailed description of the data.
     */
    AV_PKT_DATA_DISPLAYMATRIX,

    /**
     * This side data should be associated with a video stream and contains
     * Stereoscopic 3D information in form of the AVStereo3D struct.
     */
    AV_PKT_DATA_STEREO3D,

    /**
     * This side data should be associated with an audio stream and corresponds
     * to enum AVAudioServiceType.
     */
    AV_PKT_DATA_AUDIO_SERVICE_TYPE,

    /**
     * This side data contains quality related information from the encoder.
     * @code
     * u32le quality factor of the compressed frame. Allowed range is between 1 (good) and FF_LAMBDA_MAX (bad).
     * u8    picture type
     * u8    error count
     * u16   reserved
     * u64le[error count] sum of squared differences between encoder in and output
     * @endcode
     */
    AV_PKT_DATA_QUALITY_STATS,

    /**
     * This side data contains an integer value representing the stream index
     * of a "fallback" track.  A fallback track indicates an alternate
     * track to use when the current track can not be decoded for some reason.
     * e.g. no decoder available for codec.
     */
    AV_PKT_DATA_FALLBACK_TRACK,

    /**
     * This side data corresponds to the AVCPBProperties struct.
     */
    AV_PKT_DATA_CPB_PROPERTIES,

    /**
     * Recommmends skipping the specified number of samples
     * @code
     * u32le number of samples to skip from start of this packet
     * u32le number of samples to skip from end of this packet
     * u8    reason for start skip
     * u8    reason for end   skip (0=padding silence, 1=convergence)
     * @endcode
     */
    AV_PKT_DATA_SKIP_SAMPLES,

    /**
     * An AV_PKT_DATA_JP_DUALMONO side data packet indicates that
     * the packet may contain "dual mono" audio specific to Japanese DTV
     * and if it is true, recommends only the selected channel to be used.
     * @code
     * u8    selected channels (0=mail/left, 1=sub/right, 2=both)
     * @endcode
     */
    AV_PKT_DATA_JP_DUALMONO,

    /**
     * A list of zero terminated key/value strings. There is no end marker for
     * the list, so it is required to rely on the side data size to stop.
     */
    AV_PKT_DATA_STRINGS_METADATA,

    /**
     * Subtitle event position
     * @code
     * u32le x1
     * u32le y1
     * u32le x2
     * u32le y2
     * @endcode
     */
    AV_PKT_DATA_SUBTITLE_POSITION,

    /**
     * Data found in BlockAdditional element of matroska container. There is
     * no end marker for the data, so it is required to rely on the side data
     * size to recognize the end. 8 byte id (as found in BlockAddId) followed
     * by data.
     */
    AV_PKT_DATA_MATROSKA_BLOCKADDITIONAL,

    /**
     * The optional first identifier line of a WebVTT cue.
     */
    AV_PKT_DATA_WEBVTT_IDENTIFIER,

    /**
     * The optional settings (rendering instructions) that immediately
     * follow the timestamp specifier of a WebVTT cue.
     */
    AV_PKT_DATA_WEBVTT_SETTINGS,

    /**
     * A list of zero terminated key/value strings. There is no end marker for
     * the list, so it is required to rely on the side data size to stop. This
     * side data includes updated metadata which appeared in the stream.
     */
    AV_PKT_DATA_METADATA_UPDATE,

    /**
     * MPEGTS stream ID as uint8_t, this is required to pass the stream ID
     * information from the demuxer to the corresponding muxer.
     */
    AV_PKT_DATA_MPEGTS_STREAM_ID,

    /**
     * Mastering display metadata (based on SMPTE-2086:2014). This metadata
     * should be associated with a video stream and contains data in the form
     * of the AVMasteringDisplayMetadata struct.
     */
    AV_PKT_DATA_MASTERING_DISPLAY_METADATA,

    /**
     * This side data should be associated with a video stream and corresponds
     * to the AVSphericalMapping structure.
     */
    AV_PKT_DATA_SPHERICAL,

    /**
     * Content light level (based on CTA-861.3). This metadata should be
     * associated with a video stream and contains data in the form of the
     * AVContentLightMetadata struct.
     */
    AV_PKT_DATA_CONTENT_LIGHT_LEVEL,

    /**
     * ATSC A53 Part 4 Closed Captions. This metadata should be associated with
     * a video stream. A53 CC bitstream is stored as uint8_t in AVPacketSideData.data.
     * The number of bytes of CC data is AVPacketSideData.size.
     */
    AV_PKT_DATA_A53_CC,

    /**
     * This side data is encryption initialization data.
     * The format is not part of ABI, use av_encryption_init_info_* methods to
     * access.
     */
    AV_PKT_DATA_ENCRYPTION_INIT_INFO,

    /**
     * This side data contains encryption info for how to decrypt the packet.
     * The format is not part of ABI, use av_encryption_info_* methods to access.
     */
    AV_PKT_DATA_ENCRYPTION_INFO,

    /**
     * Active Format Description data consisting of a single byte as specified
     * in ETSI TS 101 154 using AVActiveFormatDescription enum.
     */
    AV_PKT_DATA_AFD,

    /**
     * Producer Reference Time data corresponding to the AVProducerReferenceTime struct,
     * usually exported by some encoders (on demand through the prft flag set in the
     * AVCodecContext export_side_data field).
     */
    AV_PKT_DATA_PRFT,

    /**
     * ICC profile data consisting of an opaque octet buffer following the
     * format described by ISO 15076-1.
     */
    AV_PKT_DATA_ICC_PROFILE,

    /**
     * DOVI configuration
     * ref:
     * dolby-vision-bitstreams-within-the-iso-base-media-file-format-v2.1.2, section 2.2
     * dolby-vision-bitstreams-in-mpeg-2-transport-stream-multiplex-v1.2, section 3.3
     * Tags are stored in struct AVDOVIDecoderConfigurationRecord.
     */
    AV_PKT_DATA_DOVI_CONF,

    /**
     * Timecode which conforms to SMPTE ST 12-1:2014. The data is an array of 4 uint32_t
     * where the first uint32_t describes how many (1-3) of the other timecodes are used.
     * The timecode format is described in the documentation of av_timecode_get_smpte_from_framenum()
     * function in libavutil/timecode.h.
     */
    AV_PKT_DATA_S12M_TIMECODE,

    /**
     * The number of side data types.
     * This is not part of the public API/ABI in the sense that it may
     * change when new side data types are added.
     * This must stay the last enum value.
     * If its value becomes huge, some code using it
     * needs to be updated as it assumes it to be smaller than other limits.
     */
    AV_PKT_DATA_NB
};

| Type | Meaning |
|---|---|
| `AV_PKT_DATA_PALETTE` | A palette; the data is exactly AVPALETTE_SIZE bytes. |
| `AV_PKT_DATA_NEW_EXTRADATA` | Notifies the codec or the format that the extradata buffer has changed and that the receiving side should act on it. The new extradata is embedded in the side data buffer and should be used immediately to process the current frame or packet. |
| `AV_PKT_DATA_PARAM_CHANGE` | A parameter change; the layout depends on which AVSideDataParamChangeFlags are set in param_flags. |
| `AV_PKT_DATA_H263_MB_INFO` | A number of structures with macroblock information, relevant to splitting the packet into smaller packets on macroblock edges. |
| `AV_PKT_DATA_REPLAYGAIN` | Associated with an audio stream; contains ReplayGain information in the form of the AVReplayGain structure. |
| `AV_PKT_DATA_DISPLAYMATRIX` | A 3x3 transformation matrix describing an affine transformation that must be applied to the decoded video frames for correct presentation. |
| `AV_PKT_DATA_STEREO3D` | Associated with a video stream; contains stereoscopic 3D information in the form of the AVStereo3D structure. |
| `AV_PKT_DATA_AUDIO_SERVICE_TYPE` | Associated with an audio stream; corresponds to enum AVAudioServiceType. |
| `AV_PKT_DATA_QUALITY_STATS` | Quality-related information from the encoder. |
| `AV_PKT_DATA_FALLBACK_TRACK` | An integer value representing the stream index of the fallback track. |
| `AV_PKT_DATA_CPB_PROPERTIES` | Corresponds to the AVCPBProperties structure. |
| `AV_PKT_DATA_SKIP_SAMPLES` | Recommends skipping the specified number of samples. |
| `AV_PKT_DATA_JP_DUALMONO` | Indicates that the packet may contain "dual mono" audio specific to Japanese DTV; if so, recommends using only the selected channel. |
| `AV_PKT_DATA_STRINGS_METADATA` | A list of zero-terminated key/value strings. |
| `AV_PKT_DATA_SUBTITLE_POSITION` | The position of a subtitle event. |
| `AV_PKT_DATA_MATROSKA_BLOCKADDITIONAL` | Data found in the BlockAdditional element of the Matroska container. |
| `AV_PKT_DATA_WEBVTT_IDENTIFIER` | The optional first identifier line of a WebVTT cue. |
| `AV_PKT_DATA_WEBVTT_SETTINGS` | The optional settings (rendering instructions) that follow the timestamp specifier of a WebVTT cue. |
| `AV_PKT_DATA_METADATA_UPDATE` | A list of zero-terminated key/value strings containing updated metadata that appeared in the stream. |
| `AV_PKT_DATA_MPEGTS_STREAM_ID` | The MPEGTS stream ID as a uint8_t; required to pass the stream ID information from the demuxer to the corresponding muxer. |
| `AV_PKT_DATA_MASTERING_DISPLAY_METADATA` | Mastering display metadata (based on SMPTE-2086:2014); should be associated with a video stream and is stored in the form of the AVMasteringDisplayMetadata structure. |
| `AV_PKT_DATA_SPHERICAL` | Associated with a video stream; corresponds to the AVSphericalMapping structure. |
| `AV_PKT_DATA_CONTENT_LIGHT_LEVEL` | Content light level (based on CTA-861.3); should be associated with a video stream and is stored in the form of the AVContentLightMetadata structure. |
| `AV_PKT_DATA_A53_CC` | ATSC A53 Part 4 Closed Captions. |
| `AV_PKT_DATA_ENCRYPTION_INIT_INFO` | Encryption initialization data. |
| `AV_PKT_DATA_ENCRYPTION_INFO` | Encryption information describing how to decrypt the packet. |
| `AV_PKT_DATA_AFD` | Active Format Description data: a single byte as specified in ETSI TS 101 154, using the AVActiveFormatDescription enum. |
| `AV_PKT_DATA_PRFT` | Producer Reference Time data corresponding to the AVProducerReferenceTime structure, usually exported by some encoders (on demand, through the prft flag set in the AVCodecContext export_side_data field). |
| `AV_PKT_DATA_ICC_PROFILE` | ICC profile data: an opaque octet buffer in the format described by ISO 15076-1. |
| `AV_PKT_DATA_DOVI_CONF` | DOVI configuration. |
| `AV_PKT_DATA_S12M_TIMECODE` | Timecode conforming to SMPTE ST 12-1:2014. |
| `AV_PKT_DATA_NB` | The number of side data types. |
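Several side data types are plain little-endian byte layouts, so reading them comes down to decoding fields at fixed offsets. As an illustration, here is a minimal sketch in plain C (no FFmpeg dependency) that decodes the 10-byte AV_PKT_DATA_SKIP_SAMPLES layout documented in the enum comments; the SkipSamples struct and both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct holding the decoded AV_PKT_DATA_SKIP_SAMPLES fields. */
typedef struct SkipSamples {
    uint32_t skip_start;    /* samples to skip from the start of the packet */
    uint32_t skip_end;      /* samples to skip from the end of the packet   */
    uint8_t  reason_start;  /* reason for the start skip                    */
    uint8_t  reason_end;    /* reason for the end skip                      */
} SkipSamples;

/* Read an unsigned 32-bit little-endian value. */
static uint32_t read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0]       | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Decode the layout u32le, u32le, u8, u8. Returns 0 on success, or -1 if
 * the buffer is missing or shorter than the 10 bytes the layout needs. */
int parse_skip_samples(const uint8_t *data, size_t size, SkipSamples *out)
{
    if (!data || !out || size < 10)
        return -1;
    out->skip_start   = read_u32le(data);
    out->skip_end     = read_u32le(data + 4);
    out->reason_start = data[8];
    out->reason_end   = data[9];
    return 0;
}
```

In a real program, data and size would come from the matching AVPacketSideData entry. Checking the size first matters because side data carries no end marker of its own.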
