FFmpeg 封装格式处理

本文为作者原创，转载请注明出处：https://www.cnblogs.com/leisure_chn/p/10506636.html

FFmpeg 封装格式处理相关内容分为如下几篇文章：
[1]. FFmpeg 封装格式处理 - 简介
[2]. FFmpeg 封装格式处理 - 解复用例程
[3]. FFmpeg 封装格式处理 - 复用例程
[4]. FFmpeg 封装格式处理 - 转封装例程
这几篇文章内容联系紧密，但放在一篇文章里内容太长，遂作拆分。章节号不作调整。基于 FFmpeg 4.1 版本。

1. 概述

1.1 封装格式简介

封装格式 (container format) 可以看作是编码流 (音频流、视频流等) 数据的一层外壳，将编码后的数据存储于此封装格式的文件之内。封装又称容器，容器的称法更为形象，所谓容器，就是存放内容的器具，饮料是内容，那么装饮料的瓶子就是容器。

不同封装格式适用于不同的场合，支持的编码格式不一样，几个常用的封装格式如下：
下表引用自 “视音频编解码技术零基础学习方法”

名称 (文件扩展名)	推出机构	流媒体	支持的视频编码	支持的音频编码	目前使用领域
AVI(.avi)	Microsoft 公司	不支持	几乎所有格式	几乎所有格式	BT 下载影视
Flash Video(.flv)	Adobe 公司	支持	Sorenson/VP6/H.264	MP3/ADPCM/Linear PCM/AAC 等	互联网视频网站
MP4(.mp4)	MPEG 组织	支持	MPEG-2/MPEG-4/H.264/H.263 等	AAC/MPEG-1 Layers I,II,III/AC-3 等	互联网视频网站
MPEGTS(.ts)	MPEG 组织	支持	MPEG-1/MPEG-2/MPEG-4/H.264	MPEG-1 Layers I,II,III/AAC	IPTV，数字电视
Matroska(.mkv)	CoreCodec 公司	支持	几乎所有格式	几乎所有格式	互联网视频网站
Real Video(.rmvb)	Real Networks 公司	支持	RealVideo 8,9,10	AAC/Cook Codec/RealAudio Lossless	BT 下载影视

1.2 FFmpeg 中的封装格式

FFmpeg 关于封装格式的处理涉及打开输入文件、打开输出文件、从输入文件读取编码帧、往输出文件写入编码帧这几个步骤，这些都不涉及编码解码层面。

在 FFmpeg 中，mux 指复用，是 multiplex 的缩写，表示将多路流 (视频、音频、字幕等) 混入一路输出中 (普通文件、流等)。demux 指解复用，是 mux 的反操作，表示从一路输入中分离出多路流(视频、音频、字幕等)。mux 处理的是输入格式，demux 处理的输出格式。输入 / 输出媒体格式涉及文件格式和封装格式两个概念。文件格式由文件扩展名标识，主要起提示作用，通过扩展名提示文件类型(或封装格式) 信息。封装格式则是存储媒体内容的实际容器格式，不同的封装格式对应不同的文件扩展名，很多时候也用文件格式代指封装格式，例如常用 ts 格式 (文件格式) 代指 mpegts 格式(封装格式)。

例如，我们把 test.ts 改名为 test.mkv，mkv 扩展名提示了此文件封装格式为 Matroska，但文件内容并无任何变化，使用 ffprobe 工具仍能正确探测出封装格式为 mpegts。

1.2.1 查看 FFmpeg 支持的封装格式

使用ffmpeg -formats命令可以查看 FFmpeg 支持的封装格式。FFmpeg 支持的封装非常多，下面仅列出最常用的几种：

think@opensuse> ffmpeg -formats
File formats:
 D. = Demuxing supported
 .E = Muxing supported
 --
 DE flv             FLV (Flash Video)
 D  aac             raw ADTS AAC (Advanced Audio Coding)
 DE h264            raw H.264 video
 DE hevc            raw HEVC video
  E mp2             MP2 (MPEG audio layer 2)
 DE mp3             MP3 (MPEG audio layer 3)
  E mpeg2video      raw MPEG-2 video
 DE mpegts          MPEG-TS (MPEG-2 Transport Stream)

1.2.2 h264/aac 裸流封装格式

h264 裸流封装格式和 aac 裸流封装格式在后面的解复用和复用例程中会用到，这里先讨论一下。

h264 本来是编码格式，当作封装格式时表示的是 H.264 裸流格式，所谓裸流就是不含封装信息也流，也就是没穿衣服的流。aac 等封装格式类似。

我们看一下 FFmpeg 工程源码中 h264 编码格式以及 h264 封装格式的定义：
FFmpeg 工程包含 h264 解码器，而不包含 h264 编码器 (一般使用第三方 libx264 编码器用作 h264 编码)，所以只有解码器定义：

AVCodec ff_h264_decoder = {
    .name                  = "h264",
    .long_name             = NULL_IF_CONFIG_SMALL("H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10"),
    .type                  = AVMEDIA_TYPE_VIDEO,
    .id                    = AV_CODEC_ID_H264,
    ......
};

h264 封装格式定义如下：

AVOutputFormat ff_h264_muxer = {
    .name              = "h264",
    .long_name         = NULL_IF_CONFIG_SMALL("raw H.264 video"),
    .extensions        = "h264,264",
    .audio_codec       = AV_CODEC_ID_NONE,
    .video_codec       = AV_CODEC_ID_H264,
    .write_header      = force_one_stream,
    .write_packet      = ff_raw_write_packet,
    .check_bitstream   = h264_check_bitstream,
    .flags             = AVFMT_NOTIMESTAMPS,
};
AVOutputFormat ff_h264_muxer = {
    .name              = "h264",
    .long_name         = NULL_IF_CONFIG_SMALL("raw H.264 video"),
    .extensions        = "h264,264",
    .audio_codec       = AV_CODEC_ID_NONE,
    .video_codec       = AV_CODEC_ID_H264,
    .write_header      = force_one_stream,
    .write_packet      = ff_raw_write_packet,
    .check_bitstream   = h264_check_bitstream,
    .flags             = AVFMT_NOTIMESTAMPS,
};

1.2.3 mpegts 封装格式

再看一下 mpegts 封装格式定义，AVInputFormat 用于定义输入封装格式，AVOutputFormat 用于定义输出封装格式。mpegts 输入封装格式中并未指定文件扩展名，而 mpegts 输出封装格式中则指定了文件扩展名为 "ts,m2t,m2ts,mts"。

AVInputFormat ff_mpegts_demuxer = {
    .name           = "mpegts",
    .long_name      = NULL_IF_CONFIG_SMALL("MPEG-TS (MPEG-2 Transport Stream)"),
    .priv_data_size = sizeof(MpegTSContext),
    .read_probe     = mpegts_probe,
    .read_header    = mpegts_read_header,
    .read_packet    = mpegts_read_packet,
    .read_close     = mpegts_read_close,
    .read_timestamp = mpegts_get_dts,
    .flags          = AVFMT_SHOW_IDS | AVFMT_TS_DISCONT,
    .priv_class     = &mpegts_class,
};
AVOutputFormat ff_mpegts_muxer = {
    .name           = "mpegts",
    .long_name      = NULL_IF_CONFIG_SMALL("MPEG-TS (MPEG-2 Transport Stream)"),
    .mime_type      = "video/MP2T",
    .extensions     = "ts,m2t,m2ts,mts",
    .priv_data_size = sizeof(MpegTSWrite),
    .audio_codec    = AV_CODEC_ID_MP2,
    .video_codec    = AV_CODEC_ID_MPEG2VIDEO,
    .init           = mpegts_init,
    .write_packet   = mpegts_write_packet,
    .write_trailer  = mpegts_write_end,
    .deinit         = mpegts_deinit,
    .check_bitstream = mpegts_check_bitstream,
    .flags          = AVFMT_ALLOW_FLUSH | AVFMT_VARIABLE_FPS | AVFMT_NODIMENSIONS,
    .priv_class     = &mpegts_muxer_class,
};

1.2.4 文件扩展名与封装格式

在 FFmpeg 命令行中，输入文件扩展名是错的也没有关系，因为 FFmpeg 会读取一小段文件来探测出真正的封装格式；但是如果未显式的指定输出封装格式，就只能通过输出文件扩展名来确定封装格式，就必须确保扩展名是正确的。

做几个实验，来研究一下 FFmpeg 中文件扩展名与封装格式的关系：

测试文件下载 (右键另存为)：tnhaoxc.flv

文件信息如下：

think@opensuse> ffprobe tnhaoxc.flv 
ffprobe version 4.1 Copyright (c) 2007-2018 the FFmpeg developers
Input #0, flv, from 'tnhaoxc.flv':
  Metadata:
    encoder         : Lavf58.20.100
  Duration: 00:02:13.68, start: 0.000000, bitrate: 838 kb/s
    Stream #0:0: Video: h264 (High), yuv420p(progressive), 784x480, 25 fps, 25 tbr, 1k tbn, 50 tbc
    Stream #0:1: Audio: aac (LC), 44100 Hz, stereo, fltp

实验 1：将 flv 封装格式转换为 mpegts 封装格式
使用转封装指令将 flv 封装格式转换为 mpegts 封装格式，在 SHELL 中依次运行如下两条命令：

ffmpeg -i tnhaoxc.flv -map 0 -c copy tnhaoxc.ts
ffmpeg -i tnhaoxc.flv -map 0 -c copy tnhaoxc.m2t

生成 tnhaoxc.ts 和 tnhaoxc.m2t 文件，比较一下两文件有无不同：

diff tnhaoxc.ts tnhaoxc.m2t

命令行无输出，表示两文件内容相同。即两文件仅是扩展名不同，封装格式都是 mpegts，文件内容并无任何不同。

实验 2：为输出文件指定错误的扩展名
指定一个错误的扩展名再试一下 (误把封装格式名称当作文件扩展名)：

ffmpeg -i tnhaoxc.flv -map 0 -c copy tnhaoxc.mpegts

命令行输出如下错误信息：

ffmpeg version 4.1 Copyright (c) 2000-2018 the FFmpeg developers
Input #0, flv, from 'tnhaoxc.flv':
  Metadata:
    encoder         : Lavf58.20.100
  Duration: 00:02:13.68, start: 0.000000, bitrate: 838 kb/s
    Stream #0:0: Video: h264 (High), yuv420p(progressive), 784x480, 25 fps, 25 tbr, 1k tbn, 50 tbc
    Stream #0:1: Audio: aac (LC), 44100 Hz, stereo, fltp
[NULL @ 0x1d62e80] Unable to find a suitable output format for 'tnhaoxc.mpegts'
tnhaoxc.mpegts: Invalid argument

提示无法确定输出格式。FFmpeg 无法根据此扩展名确定输出文件的封装格式。

实验 3：为输出文件指定错误的扩展名但显式指定封装格式
通过-f mpegts选项显式指定封装格式为 mpegts：

ffmpeg -i tnhaoxc.flv -map 0 -c copy -f mpegts tnhaoxc.mpegts

命令执行成功，看一下文件内容是否正确：

diff tnhaoxc.mpegts tnhaoxc.ts

发现 tnhaoxc.mpegts 和 tnhaoxc.ts 文件内容完全一样，虽然 tnhaoxc.mpegts 有错误的文件扩展名，仍然得到了我们期望的封装格式。

不知道什么命令可以查到封装格式对应的扩展名。可以在 FFmpeg 工程源码中搜索封装格式名称，如搜索 “mpegts”，可以看到其扩展名为 “ts,m2t,m2ts,mts”。

2. API 介绍

最主要的 API 有如下几个。FFmpeg 中将编码帧及未编码帧均称作 frame，本文为方便，将编码帧称作 packet，未编码帧称作 frame。

2.1 avformat_open_input()

/**
 * Open an input stream and read the header. The codecs are not opened.
 * The stream must be closed with avformat_close_input().
 *
 * @param ps Pointer to user-supplied AVFormatContext (allocated by avformat_alloc_context).
 *           May be a pointer to NULL, in which case an AVFormatContext is allocated by this
 *           function and written into ps.
 *           Note that a user-supplied AVFormatContext will be freed on failure.
 * @param url URL of the stream to open.
 * @param fmt If non-NULL, this parameter forces a specific input format.
 *            Otherwise the format is autodetected.
 * @param options  A dictionary filled with AVFormatContext and demuxer-private options.
 *                 On return this parameter will be destroyed and replaced with a dict containing
 *                 options that were not found. May be NULL.
 *
 * @return 0 on success, a negative AVERROR on failure.
 *
 * @note If you want to use custom IO, preallocate the format context and set its pb field.
 */
int avformat_open_input(AVFormatContext **ps, const char *url, AVInputFormat *fmt, AVDictionary **options);

这个函数会打开输入媒体文件，读取文件头，将文件格式信息存储在第一个参数 AVFormatContext 中。

2.2 avformat_find_stream_info()

/**
 * Read packets of a media file to get stream information. This
 * is useful for file formats with no headers such as MPEG. This
 * function also computes the real framerate in case of MPEG-2 repeat
 * frame mode.
 * The logical file position is not changed by this function;
 * examined packets may be buffered for later processing.
 *
 * @param ic media file handle
 * @param options  If non-NULL, an ic.nb_streams long array of pointers to
 *                 dictionaries, where i-th member contains options for
 *                 codec corresponding to i-th stream.
 *                 On return each dictionary will be filled with options that were not found.
 * @return >=0 if OK, AVERROR_xxx on error
 *
 * @note this function isn't guaranteed to open all the codecs, so
 *       options being non-empty at return is a perfectly normal behavior.
 *
 * @todo Let the user decide somehow what information is needed so that
 *       we do not waste time getting stuff the user does not need.
 */
int avformat_find_stream_info(AVFormatContext *ic, AVDictionary **options);

这个函数会读取一段视频文件数据并尝试解码，将取到的流信息填入 AVFormatContext.streams 中。AVFormatContext.streams 是一个指针数组，数组大小是 AVFormatContext.nb_streams

2.3 av_read_frame()

/**
 * Return the next frame of a stream.
 * This function returns what is stored in the file, and does not validate
 * that what is there are valid frames for the decoder. It will split what is
 * stored in the file into frames and return one for each call. It will not
 * omit invalid data between valid frames so as to give the decoder the maximum
 * information possible for decoding.
 *
 * If pkt->buf is NULL, then the packet is valid until the next
 * av_read_frame() or until avformat_close_input(). Otherwise the packet
 * is valid indefinitely. In both cases the packet must be freed with
 * av_packet_unref when it is no longer needed. For video, the packet contains
 * exactly one frame. For audio, it contains an integer number of frames if each
 * frame has a known fixed size (e.g. PCM or ADPCM data). If the audio frames
 * have a variable size (e.g. MPEG audio), then it contains one frame.
 *
 * pkt->pts, pkt->dts and pkt->duration are always set to correct
 * values in AVStream.time_base units (and guessed if the format cannot
 * provide them). pkt->pts can be AV_NOPTS_VALUE if the video format
 * has B-frames, so it is better to rely on pkt->dts if you do not
 * decompress the payload.
 *
 * @return 0 if OK, < 0 on error or end of file
 */
int av_read_frame(AVFormatContext *s, AVPacket *pkt);

本函数用于解复用过程。

本函数将存储在输入文件中的数据分割为多个 packet，每次调用将得到一个 packet。packet 可能是视频帧、音频帧或其他数据，解码器只会解码视频帧或音频帧，非音视频数据并不会被扔掉、从而能向解码器提供尽可能多的信息。

对于视频来说，一个 packet 只包含一个视频帧；对于音频来说，若是帧长固定的格式则一个 packet 可包含整数个音频帧，若是帧长可变的格式则一个 packet 只包含一个音频帧。

读取到的 packet 每次使用完之后应调用av_packet_unref(AVPacket *pkt)清空 packet。否则会造成内存泄露。

2.4 av_write_frame()

/**
 * Write a packet to an output media file.
 *
 * This function passes the packet directly to the muxer, without any buffering
 * or reordering. The caller is responsible for correctly interleaving the
 * packets if the format requires it. Callers that want libavformat to handle
 * the interleaving should call av_interleaved_write_frame() instead of this
 * function.
 *
 * @param s media file handle
 * @param pkt The packet containing the data to be written. Note that unlike
 *            av_interleaved_write_frame(), this function does not take
 *            ownership of the packet passed to it (though some muxers may make
 *            an internal reference to the input packet).
 *            <br>
 *            This parameter can be NULL (at any time, not just at the end), in
 *            order to immediately flush data buffered within the muxer, for
 *            muxers that buffer up data internally before writing it to the
 *            output.
 *            <br>
 *            Packet's @ref AVPacket.stream_index "stream_index" field must be
 *            set to the index of the corresponding stream in @ref
 *            AVFormatContext.streams "s->streams".
 *            <br>
 *            The timestamps (@ref AVPacket.pts "pts", @ref AVPacket.dts "dts")
 *            must be set to correct values in the stream's timebase (unless the
 *            output format is flagged with the AVFMT_NOTIMESTAMPS flag, then
 *            they can be set to AV_NOPTS_VALUE).
 *            The dts for subsequent packets passed to this function must be strictly
 *            increasing when compared in their respective timebases (unless the
 *            output format is flagged with the AVFMT_TS_NONSTRICT, then they
 *            merely have to be nondecreasing).  @ref AVPacket.duration
 *            "duration") should also be set if known.
 * @return < 0 on error, = 0 if OK, 1 if flushed and there is no more data to flush
 *
 * @see av_interleaved_write_frame()
 */
int av_write_frame(AVFormatContext *s, AVPacket *pkt);

本函数用于复用过程，将 packet 写入输出媒体。

packet 交织是指：不同流的 packet 在输出媒体文件中应严格按照 packet 中 dts 递增的顺序交错存放。

本函数直接将 packet 写入复用器 (muxer)，不会缓存或记录任何 packet。本函数不负责不同流的 packet 交织问题。由调用者负责。

如果调用者不愿处理 packet 交织问题，应调用 av_interleaved_write_frame() 替代本函数。

2.5 av_interleaved_write_frame()

/**
 * Write a packet to an output media file ensuring correct interleaving.
 *
 * This function will buffer the packets internally as needed to make sure the
 * packets in the output file are properly interleaved in the order of
 * increasing dts. Callers doing their own interleaving should call
 * av_write_frame() instead of this function.
 *
 * Using this function instead of av_write_frame() can give muxers advance
 * knowledge of future packets, improving e.g. the behaviour of the mp4
 * muxer for VFR content in fragmenting mode.
 *
 * @param s media file handle
 * @param pkt The packet containing the data to be written.
 *            <br>
 *            If the packet is reference-counted, this function will take
 *            ownership of this reference and unreference it later when it sees
 *            fit.
 *            The caller must not access the data through this reference after
 *            this function returns. If the packet is not reference-counted,
 *            libavformat will make a copy.
 *            <br>
 *            This parameter can be NULL (at any time, not just at the end), to
 *            flush the interleaving queues.
 *            <br>
 *            Packet's @ref AVPacket.stream_index "stream_index" field must be
 *            set to the index of the corresponding stream in @ref
 *            AVFormatContext.streams "s->streams".
 *            <br>
 *            The timestamps (@ref AVPacket.pts "pts", @ref AVPacket.dts "dts")
 *            must be set to correct values in the stream's timebase (unless the
 *            output format is flagged with the AVFMT_NOTIMESTAMPS flag, then
 *            they can be set to AV_NOPTS_VALUE).
 *            The dts for subsequent packets in one stream must be strictly
 *            increasing (unless the output format is flagged with the
 *            AVFMT_TS_NONSTRICT, then they merely have to be nondecreasing).
 *            @ref AVPacket.duration "duration") should also be set if known.
 *
 * @return 0 on success, a negative AVERROR on error. Libavformat will always
 *         take care of freeing the packet, even if this function fails.
 *
 * @see av_write_frame(), AVFormatContext.max_interleave_delta
 */
int av_interleaved_write_frame(AVFormatContext *s, AVPacket *pkt);

本函数用于复用过程，将 packet 写入输出媒体。

本函数将按需在内部缓存 packet，从而确保输出媒体中不同流的 packet 能按照 dts 增长的顺序正确交织。

2.6 avio_open()

/**
 * Create and initialize a AVIOContext for accessing the
 * resource indicated by url.
 * @note When the resource indicated by url has been opened in
 * read+write mode, the AVIOContext can be used only for writing.
 *
 * @param s Used to return the pointer to the created AVIOContext.
 * In case of failure the pointed to value is set to NULL.
 * @param url resource to access
 * @param flags flags which control how the resource indicated by url
 * is to be opened
 * @return >= 0 in case of success, a negative value corresponding to an
 * AVERROR code in case of failure
 */
int avio_open(AVIOContext **s, const char *url, int flags);

创建并初始化一个 AVIOContext，用于访问输出媒体文件。

2.7 avformat_write_header()

/**
 * Allocate the stream private data and write the stream header to
 * an output media file.
 *
 * @param s Media file handle, must be allocated with avformat_alloc_context().
 *          Its oformat field must be set to the desired output format;
 *          Its pb field must be set to an already opened AVIOContext.
 * @param options  An AVDictionary filled with AVFormatContext and muxer-private options.
 *                 On return this parameter will be destroyed and replaced with a dict containing
 *                 options that were not found. May be NULL.
 *
 * @return AVSTREAM_INIT_IN_WRITE_HEADER on success if the codec had not already been fully initialized in avformat_init,
 *         AVSTREAM_INIT_IN_INIT_OUTPUT  on success if the codec had already been fully initialized in avformat_init,
 *         negative AVERROR on failure.
 *
 * @see av_opt_find, av_dict_set, avio_open, av_oformat_next, avformat_init_output.
 */
av_warn_unused_result
int avformat_write_header(AVFormatContext *s, AVDictionary **options);

向输出文件写入文件头信息。

2.8 av_write_trailer()

/**
 * Write the stream trailer to an output media file and free the
 * file private data.
 *
 * May only be called after a successful call to avformat_write_header.
 *
 * @param s media file handle
 * @return 0 if OK, AVERROR_xxx on error
 */
int av_write_trailer(AVFormatContext *s);

向输出文件写入文件尾信息。

6. 参考资料

[1] WIKI，Digital_container_format
[2] WIKI，Comparison_of_container_formats
[3] 雷霄骅，使用 FFMPEG 类库分离出多媒体文件中的 H.264 码流，https://blog.csdn.net/leixiaohua1020/article/details/11800877
[4] 雷霄骅，最简单的基于 FFmpeg 的封装格式处理：视音频分离器简化版，https://blog.csdn.net/leixiaohua1020/article/details/39767055

7. 修改记录

2019-03-08 V1.0 解复用例程初稿
2019-03-09 V1.0 拆分笔记
2019-03-10 V1.0 增加复用例程和转封装例程

Hi! Welcome