Streamer Audio Recording Tutorial (Hands-On Audio/Video Muxing to FLV, MP4 and TS)

威哥 2023-06-24 05:26:16


1. Environment setup

The audio and video data are generated in code rather than read from a file, and are then muxed into an FLV file.

The result is a muxed FLV file that can be opened in a player.

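2. FFmpeg audio/video muxing framework

(1) avformat_alloc_output_context2: allocates the AVFormatContext and binds a suitable AVOutputFormat based on the filename.

(2) Select the codecs by setting fmt->video_codec = AV_CODEC_ID_H264 and fmt->audio_codec = AV_CODEC_ID_AAC.

(3) add_stream: adds a stream and sets the encoder parameters. The custom functions open_video and open_audio then open the encoder, allocate the frame buffers, and initialize the scaler, the PCM parameters and the resampler.

(4) avio_open: opens the output file.

(5) avformat_write_header: writes the file header.

(6) Write the video and audio frames. This involves encoding the audio and video, and also time_base conversion.

(7) av_write_frame / av_interleaved_write_frame: writes a packet.

(8) av_write_trailer: writes the file trailer. For a live stream the trailer can be skipped.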

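The most important functions and structures are described below.

avformat_write_header:

Note: this call changes the time_base of the audio and video streams. For example, before the call the audio stream has AVStream->time_base = 1/44100 and the video stream has AVStream->time_base = 1/25. After avformat_write_header, both time bases become 1/1000. The new value depends on the container format, so TS and FLV, for instance, end up with different values.

Internally, avformat_write_header calls the write_header callback of the selected muxer.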

avformat_alloc_output_context2:

The function is declared in libavformat/avformat.h.

    int avformat_alloc_output_context2(AVFormatContext **ctx,
                                       ff_const59 AVOutputFormat *oformat,
                                       const char *format_name,
                                       const char *filename);

ctx: receives the newly created context; it is set to NULL on failure.

oformat: the AVOutputFormat to use. If it is NULL, FFmpeg infers the format from the two following parameters, format_name and filename.

format_name: the name of the container format, such as "flv" or "mpeg". If it is NULL, FFmpeg infers the format from filename.

filename: the path of the output file. If both oformat and format_name are NULL, FFmpeg picks a muxer from the filename extension; for example, xxx.flv selects the FLV muxer.

    int avformat_alloc_output_context2(AVFormatContext **avctx, ff_const59 AVOutputFormat *oformat,
                                       const char *format, const char *filename)
    {
        AVFormatContext *s = avformat_alloc_context();
        int ret = 0;

        *avctx = NULL;
        if (!s)
            goto nomem;

        if (!oformat) { // oformat is NULL
            if (format) {
                oformat = av_guess_format(format, NULL, NULL); // look up by the given format name
                if (!oformat) {
                    av_log(s, AV_LOG_ERROR, "Requested output format '%s' is not a suitable output format\n", format);
                    ret = AVERROR(EINVAL);
                    goto error;
                }
            } else { // both oformat and format are NULL
                oformat = av_guess_format(NULL, filename, NULL); // look up by filename extension
                if (!oformat) {
                    ret = AVERROR(EINVAL);
                    av_log(s, AV_LOG_ERROR, "Unable to find a suitable output format for '%s'\n",
                           filename);
                    goto error;
                }
            }
        }

        s->oformat = oformat;
        if (s->oformat->priv_data_size > 0) {
            s->priv_data = av_mallocz(s->oformat->priv_data_size);
            if (!s->priv_data)
                goto nomem;
            if (s->oformat->priv_class) {
                *(const AVClass**)s->priv_data= s->oformat->priv_class;
                av_opt_set_defaults(s->priv_data);
            }
        } else
            s->priv_data = NULL;

        if (filename) {
    #if FF_API_FORMAT_FILENAME
    FF_DISABLE_DEPRECATION_WARNINGS
            av_strlcpy(s->filename, filename, sizeof(s->filename));
    FF_ENABLE_DEPRECATION_WARNINGS
    #endif
            if (!(s->url = av_strdup(filename)))
                goto nomem;

        }
        *avctx = s;
        return 0;
    nomem:
        av_log(s, AV_LOG_ERROR, "Out of memory\n");
        ret = AVERROR(ENOMEM);
    error:
        avformat_free_context(s);
        return ret;
    }

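As the code shows, the two key calls are avformat_alloc_context and av_guess_format: the first allocates the context, and the second resolves the AVOutputFormat from the format name and filename. av_guess_format compares short_name, filename and mime_type against every registered muxer (not encoder) and returns the best match.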

    ff_const59 AVOutputFormat *av_guess_format(const char *short_name, const char *filename,
                                               const char *mime_type)
    {
        const AVOutputFormat *fmt = NULL;
        AVOutputFormat *fmt_found = NULL;
        void *i = 0;
        int score_max, score;

        /* specific test for image sequences */
    #if CONFIG_IMAGE2_MUXER
        if (!short_name && filename &&
            av_filename_number_test(filename) &&
            ff_guess_image2_codec(filename) != AV_CODEC_ID_NONE) {
            return av_guess_format("image2", NULL, NULL);
        }
    #endif
        /* Find the proper file type. */
        score_max = 0;
        while ((fmt = av_muxer_iterate(&i))) {
            score = 0;
            // fmt->name is "flv" for ff_flv_muxer, for example
            if (fmt->name && short_name && av_match_name(short_name, fmt->name))
                score += 100; // name match: highest weight
            // the mime_type of ff_flv_muxer is "video/x-flv"
            if (fmt->mime_type && mime_type && !strcmp(fmt->mime_type, mime_type))
                score += 10;  // mime_type match
            // the extensions of ff_flv_muxer are "flv"
            if (filename && fmt->extensions &&
                av_match_ext(filename, fmt->extensions)) {
                score += 5;   // extension match
            }
            if (score > score_max) { // remember the best score so far
                score_max = score;
                fmt_found = (AVOutputFormat*)fmt;
            }
        }
        return fmt_found;
    }
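The scoring idea can be tried in isolation with a minimal sketch. It does not use FFmpeg: the MuxerInfo type, the three-entry table and the helper names are hypothetical, and each entry carries a single extension instead of a comma-separated list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for AVOutputFormat. */
typedef struct {
    const char *name;       /* short name, e.g. "flv" */
    const char *mime_type;  /* e.g. "video/x-flv" */
    const char *extension;  /* single extension without the dot */
} MuxerInfo;

/* Hypothetical table; a real build registers many more muxers. */
static const MuxerInfo kMuxers[] = {
    { "flv",    "video/x-flv", "flv" },
    { "mp4",    "video/mp4",   "mp4" },
    { "mpegts", "video/MP2T",  "ts"  },
};

/* Return 1 if filename ends with ".<ext>", 0 otherwise. */
static int has_extension(const char *filename, const char *ext)
{
    const char *dot = strrchr(filename, '.');
    return dot != NULL && strcmp(dot + 1, ext) == 0;
}

/* Score every muxer and return the best match, or NULL if nothing matched.
 * At least one of the three hints must be non-NULL. */
static const MuxerInfo *guess_muxer(const char *short_name,
                                    const char *filename,
                                    const char *mime_type)
{
    const MuxerInfo *best = NULL;
    int best_score = 0;
    size_t count = sizeof(kMuxers) / sizeof(kMuxers[0]);

    assert(short_name != NULL || filename != NULL || mime_type != NULL);
    for (size_t i = 0; i < count; i++) {
        int score = 0;
        if (short_name && strcmp(short_name, kMuxers[i].name) == 0)
            score += 100;  /* explicit name match weighs most */
        if (mime_type && strcmp(mime_type, kMuxers[i].mime_type) == 0)
            score += 10;
        if (filename && has_extension(filename, kMuxers[i].extension))
            score += 5;    /* extension match weighs least */
        if (score > best_score) {
            best_score = score;
            best = &kMuxers[i];
        }
    }
    return best;
}
```

Because a name match outweighs an extension match, guess_muxer("mp4", "out.flv", NULL) resolves to the mp4 entry even though the filename ends in .flv.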

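AVOutputFormat

AVOutputFormat describes an output container format. The structure mainly holds the format's name and description, codec information (the default video and audio codecs and the list of supported codecs), and the functions that operate on the container (write_header, write_packet, write_trailer and so on).

FFmpeg supports many output formats, such as MP4, FLV and 3GP. The AVOutputFormat structure stores the information and common settings for each of them. Each container format has one AVOutputFormat, and FFmpeg keeps them in a linked list.

2. Structure definition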

    typedef struct AVOutputFormat {
        const char *name;
        /**
         * Descriptive name for the format, meant to be more human-readable
         * than name. You should use the NULL_IF_CONFIG_SMALL() macro
         * to define it.
         */
        const char *long_name;
        const char *mime_type;
        const char *extensions; /**< comma-separated filename extensions */
        /* output support */
        enum AVCodecID audio_codec;    /**< default audio codec */
        enum AVCodecID video_codec;    /**< default video codec */
        enum AVCodecID subtitle_codec; /**< default subtitle codec */
        /**
         * can use flags: AVFMT_NOFILE, AVFMT_NEEDNUMBER,
         * AVFMT_GLOBALHEADER, AVFMT_NOTIMESTAMPS, AVFMT_VARIABLE_FPS,
         * AVFMT_NODIMENSIONS, AVFMT_NOSTREAMS, AVFMT_ALLOW_FLUSH,
         * AVFMT_TS_NONSTRICT, AVFMT_TS_NEGATIVE
         */
        int flags;

        /**
         * List of supported codec_id-codec_tag pairs, ordered by "better
         * choice first". The arrays are all terminated by AV_CODEC_ID_NONE.
         */
        const struct AVCodecTag * const *codec_tag;

        const AVClass *priv_class; ///< AVClass for the private context

        /*****************************************************************
         * No fields below this line are part of the public API. They
         * may not be used outside of libavformat and can be changed and
         * removed at will.
         * New public fields should be added right above.
         *****************************************************************
         */
        /**
         * The ff_const59 define is not part of the public API and will
         * be removed without further warning.
         */
    #if FF_API_AVIOFORMAT
    #define ff_const59
    #else
    #define ff_const59 const
    #endif
        ff_const59 struct AVOutputFormat *next;
        /**
         * size of private data so that it can be allocated in the wrapper
         */
        int priv_data_size;

        int (*write_header)(struct AVFormatContext *);
        /**
         * Write a packet. If AVFMT_ALLOW_FLUSH is set in flags,
         * pkt can be NULL in order to flush data buffered in the muxer.
         * When flushing, return 0 if there still is more data to flush,
         * or 1 if everything was flushed and there is no more buffered
         * data.
         */
        int (*write_packet)(struct AVFormatContext *, AVPacket *pkt);
        int (*write_trailer)(struct AVFormatContext *);
        /**
         * Currently only used to set pixel format if not YUV420P.
         */
        int (*interleave_packet)(struct AVFormatContext *, AVPacket *out,
                                 AVPacket *in, int flush);
        /**
         * Test if the given codec can be stored in this container.
         *
         * @return 1 if the codec is supported, 0 if it is not.
         *         A negative number if unknown.
         *         MKTAG('A', 'P', 'I', 'C') if the codec is only supported as AV_DISPOSITION_ATTACHED_PIC
         */
        int (*query_codec)(enum AVCodecID id, int std_compliance);

        void (*get_output_timestamp)(struct AVFormatContext *s, int stream,
                                     int64_t *dts, int64_t *wall);
        /**
         * Allows sending messages from application to device.
         */
        int (*control_message)(struct AVFormatContext *s, int type,
                               void *data, size_t data_size);

        /**
         * Write an uncoded AVFrame.
         *
         * See av_write_uncoded_frame() for details.
         *
         * The library will free *frame afterwards, but the muxer can prevent it
         * by setting the pointer to NULL.
         */
        int (*write_uncoded_frame)(struct AVFormatContext *, int stream_index,
                                   AVFrame **frame, unsigned flags);
        /**
         * Returns device list with it properties.
         * @see avdevice_list_devices() for more details.
         */
        int (*get_device_list)(struct AVFormatContext *s, struct AVDeviceInfoList *device_list);
        /**
         * Initialize device capabilities submodule.
         * @see avdevice_capabilities_create() for more details.
         */
        int (*create_device_capabilities)(struct AVFormatContext *s, struct AVDeviceCapabilitiesQuery *caps);
        /**
         * Free device capabilities submodule.
         * @see avdevice_capabilities_free() for more details.
         */
        int (*free_device_capabilities)(struct AVFormatContext *s, struct AVDeviceCapabilitiesQuery *caps);
        enum AVCodecID data_codec; /**< default data codec */
        /**
         * Initialize format. May allocate data here, and set any AVFormatContext or
         * AVStream parameters that need to be set before packets are sent.
         * This method must not write output.
         *
         * Return 0 if streams were fully configured, 1 if not, negative AVERROR on failure
         *
         * Any allocations made here must be freed in deinit().
         */
        int (*init)(struct AVFormatContext *);
        /**
         * Deinitialize format. If present, this is called whenever the muxer is being
         * destroyed, regardless of whether or not the header has been written.
         *
         * If a trailer is being written, this is called after write_trailer().
         *
         * This is called if init() fails as well.
         */
        void (*deinit)(struct AVFormatContext *);
        /**
         * Set up any necessary bitstream filtering and extract any extra data needed
         * for the global header.
         * Return 0 if more packets from this stream must be checked; 1 if not.
         */
        int (*check_bitstream)(struct AVFormatContext *, const AVPacket *pkt);
    } AVOutputFormat;

3. Common fields and what they do

const char *name; // muxer name

const char *long_name; // descriptive, human-readable name of the format

enum AVCodecID audio_codec; // default audio codec

enum AVCodecID video_codec; // default video codec

enum AVCodecID subtitle_codec; // default subtitle codec

Note: most muxers define default codecs, so if you want a different codec you have to set it explicitly.

For example, ff_flv_muxer, the AVOutputFormat for FLV, defaults to MP3 audio.

The mpegts muxer defaults to AV_CODEC_ID_MP2 and AV_CODEC_ID_MPEG2VIDEO.

int (*write_header)(struct AVFormatContext *); // writes the header

int (*write_packet)(struct AVFormatContext *, AVPacket *pkt); // writes one packet; pkt may be NULL if AVFMT_ALLOW_FLUSH is set in flags

int (*write_trailer)(struct AVFormatContext *); // writes the trailer

// interleaves packets
int (*interleave_packet)(struct AVFormatContext *, AVPacket *out, AVPacket *in, int flush);

int (*control_message)(struct AVFormatContext *s, int type, void *data, size_t data_size); // lets the application send messages to a device

int (*write_uncoded_frame)(struct AVFormatContext *, int stream_index, AVFrame **frame, unsigned flags); // writes an uncoded AVFrame

// Initializes the format. Data may be allocated here, and any AVFormatContext or AVStream parameters that must be set before packets are sent can be set here.
int (*init)(struct AVFormatContext *);

void (*deinit)(struct AVFormatContext *); // deinitializes the format

int (*check_bitstream)(struct AVFormatContext *, const AVPacket *pkt); // sets up any necessary bitstream filtering and extracts any extra data needed for the global header; this one is important

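avformat_new_stream

Creates a stream in an AVFormatContext.

An AVStream is a stream channel. For example, when H.264 and AAC bitstreams are stored in an MP4 file, the file needs two streams: one for the H.264 video and one for the AAC audio (assuming the H.264 and AAC data each contain a single stream).

AVStream *avformat_new_stream(AVFormatContext *s, const AVCodec *c);

AVFormatContext:

unsigned int nb_streams; // number of streams

AVStream **streams; // the streams themselves

AVStream:

int index; // index of this stream in the AVFormatContext

After avformat_new_stream returns, the AVFormatContext contains the new AVStream, and its index has already been set. From then on the stream's parameters can be filled in, such as codec_id, format, bit_rate, width and height. The stream gives those parameters somewhere to live.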
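The bookkeeping described here (the stream count grows by one and the new stream receives the next index) can be modelled with a small sketch. Container, Stream and both helper functions are hypothetical stand-ins, not FFmpeg's implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for AVStream and AVFormatContext. */
typedef struct {
    int index;      /* position inside the container's stream array */
    int codec_id;   /* filled in by the caller after creation */
} Stream;

typedef struct {
    unsigned int nb_streams;
    Stream **streams;
} Container;

/* Append a new stream and return it, or NULL on allocation failure. */
static Stream *container_new_stream(Container *c)
{
    Stream *st;
    Stream **grown;

    assert(c != NULL);
    grown = realloc(c->streams, (c->nb_streams + 1) * sizeof(*grown));
    if (!grown)
        return NULL;
    c->streams = grown;

    st = calloc(1, sizeof(*st));
    if (!st)
        return NULL;
    st->index = (int)c->nb_streams;   /* first stream gets index 0 */
    c->streams[c->nb_streams++] = st;
    return st;
}

/* Release every stream and the array itself. */
static void container_free(Container *c)
{
    for (unsigned int i = 0; i < c->nb_streams; i++)
        free(c->streams[i]);
    free(c->streams);
    c->streams = NULL;
    c->nb_streams = 0;
}
```

Calling container_new_stream twice on an empty Container yields streams with index 0 and 1, which is the same relationship the article later relies on when it sets the stream id to nb_streams - 1.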

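av_interleaved_write_frame

Prototype: int av_interleaved_write_frame(AVFormatContext *s, AVPacket *pkt)

Purpose: writes a packet to the output media file and ensures correct interleaving (packet dts values keep increasing). The function buffers packets internally as needed so that the packets in the output file are interleaved in increasing dts order. If you do the interleaving yourself, call av_write_frame() instead, which does no buffering. When there are no B-frames and pts is set correctly, the two functions behave much the same.

Parameters: s is the output AVFormatContext, and pkt is the packet to write.

Return value: 0 on success, a negative AVERROR on error. libavformat always releases the packet, even if the call fails.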

av_compare_ts

int av_compare_ts(int64_t ts_a, AVRational tb_a, int64_t ts_b, AVRational tb_b);

Return value:

-1: ts_a is before ts_b.

1: ts_a is after ts_b.

0: ts_a and ts_b are at the same position.
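The comparison can be sketched by cross-multiplying the two fractions so that no division is needed. This is a simplified illustration, not FFmpeg's code: Rational and compare_ts are hypothetical names, and the sketch assumes positive denominators and values small enough that the 64-bit products do not overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for AVRational. */
typedef struct { int num; int den; } Rational;

/*
 * Compare ts_a (in units of tb_a) with ts_b (in units of tb_b).
 * Returns -1, 0 or 1. Assumes positive denominators and values small
 * enough that the 64-bit products below do not overflow.
 */
static int compare_ts(int64_t ts_a, Rational tb_a, int64_t ts_b, Rational tb_b)
{
    int64_t lhs, rhs;

    assert(tb_a.den > 0 && tb_b.den > 0);
    /* ts_a * num_a / den_a  versus  ts_b * num_b / den_b, cross-multiplied */
    lhs = ts_a * tb_a.num * tb_b.den;
    rhs = ts_b * tb_b.num * tb_a.den;
    return (lhs > rhs) - (lhs < rhs);
}
```

For example, one video frame at time base 1/25 (0.04 s) is later than 1024 audio samples at 1/44100 (about 0.023 s), so the comparison returns 1 for that pair.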

Analyzing the written files with MediaInfo

This part only looks at what avformat_write_header and av_write_trailer do.

[Figures: MediaInfo output for the FLV file]

For FLV, avformat_write_header and av_write_trailer make no difference.

MP4

[Figures: MediaInfo output for the MP4 file]

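Timestamps in detail

Reference article: https://www.cnblogs.com/leisure_chn/p/10584910.html

I-frames, P-frames and B-frames

I-frame: an I-frame (intra-coded picture, often called a key frame) contains a complete image. It is intra-coded, carries no motion vectors, and can be decoded without reference to any other frame. A channel switch can therefore happen at an I-frame without losing the picture or failing to decode. I-frames stop errors from accumulating and spreading. In a closed GOP, the first frame of every GOP is an I-frame, and the data in a GOP never references the GOP before or after it.

P-frame: a P-frame (predictive-coded picture) is inter-coded and is predicted from a preceding I-frame or P-frame.

B-frame: a B-frame (bi-directionally predicted picture) is inter-coded and is predicted from preceding and/or following I-frames or P-frames. In the classic scheme described here, B-frames are not used as reference frames (H.264 does permit it). B-frames give a higher compression ratio but need more buffering time and more CPU, so they suit local storage and video on demand rather than live streaming, where latency matters.

DTS and PTS

DTS (Decoding Time Stamp) is the time at which a compressed frame is decoded.

PTS (Presentation Time Stamp) is the time at which the raw frame obtained by decoding the compressed frame is displayed.

For audio, DTS and PTS are the same. For video, a B-frame is bi-directionally predicted and depends on frames both before and after it, so a stream with B-frames is decoded in a different order from the one in which it is displayed, and DTS differs from PTS. A video stream without B-frames has identical DTS and PTS. The orders below are illustrated with an open GOP.

[Figure: decoding order and display order in an open GOP]

Capture order: the order in which the image sensor captures the raw signal and produces frames.

Encoding order: the order of the frames coming out of the encoder. Frames in a local video file on disk are stored in encoding order.

Transmission order: the order of the frames while the encoded stream travels over the network.

Decoding order: the order in which the decoder decodes the frames.

Display order: the order in which the frames appear on the display.

Note: capture order is the same as display order, and encoding order, transmission order and decoding order are the same as each other.

Take frame "B[1]" as an example. Decoding "B[1]" needs both "I[0]" and "P[3]", so "P[3]" must be decoded before "B[1]". This is why decoding order differs from display order: a frame that is displayed later may have to be decoded earlier.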
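The reordering can be reproduced with a small sketch: frames arrive in decode order, and sorting them by display position yields the order in which they are shown. The Frame type and its integer positions are hypothetical; they stand in for the ordering implied by DTS and PTS rather than for real timestamp values.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical frame record: type letter plus ordering positions. */
typedef struct {
    char type;         /* 'I', 'P' or 'B' */
    int  decode_pos;   /* position in decode order (ordering implied by DTS) */
    int  display_pos;  /* position in display order (ordering implied by PTS) */
} Frame;

/* qsort comparator: ascending display position. */
static int by_display_pos(const void *a, const void *b)
{
    const Frame *fa = a, *fb = b;
    return (fa->display_pos > fb->display_pos) -
           (fa->display_pos < fb->display_pos);
}

/* Reorder frames from decode order into display order, in place. */
static void to_display_order(Frame *frames, size_t count)
{
    assert(frames != NULL);
    qsort(frames, count, sizeof(*frames), by_display_pos);
}
```

With the decode-order input I[0], P[3], B[1], B[2], the sort produces I, B, B, P: the P-frame is decoded second but displayed last.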

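Time base and timestamp

In FFmpeg, the time base (time_base) is the unit of a timestamp. Multiplying a timestamp by its time base gives the actual time, in seconds. For example, if a video frame has dts 40 and pts 160 with a time_base of 1/1000 second, the frame is decoded at 40 ms (40/1000) and displayed at 160 ms (160/1000). Timestamps (pts/dts) have type int64_t. If one time_base is treated as one clock tick, dts and pts are tick counts.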
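The multiplication can be written out as a minimal sketch. Rational, q2d and ts_to_seconds are hypothetical names; q2d follows the same idea as FFmpeg's av_q2d(), which turns a rational into a double.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for AVRational. */
typedef struct { int num; int den; } Rational;

/* Rational to double, the same idea as av_q2d(). */
static double q2d(Rational q)
{
    assert(q.den != 0);
    return q.num / (double)q.den;
}

/* Convert a timestamp counted in time_base ticks to seconds. */
static double ts_to_seconds(int64_t ts, Rational time_base)
{
    return (double)ts * q2d(time_base);
}
```

With a time base of {1, 1000}, ts_to_seconds(40, tb) gives roughly 0.04 and ts_to_seconds(160, tb) roughly 0.16, matching the 40 ms and 160 ms in the example.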

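Three time bases: tbr, tbn and tbc

Different container formats use different time bases, and FFmpeg also uses different time bases at different stages of audio/video processing.

FFmpeg has three time bases. The tbr, tbn and tbc values printed by the command-line tools are the reciprocals of these time bases:

tbn: the container time base. The value is the reciprocal of AVStream.time_base.

tbc: the codec time base. The value is the reciprocal of AVCodecContext.time_base.

tbr: guessed from the video stream; it may be the frame rate or the field rate (twice the frame rate).

Test file: tnmil3.flv.

Probing the media file with ffprobe prints these values for the video stream.

[Figure: ffprobe output for the test file]

The explanation of tbr, tbn and tbc below is quoted from the FFmpeg mailing list:

There are three different time bases for time stamps in FFmpeg. The values printed are actually reciprocals of these, i.e. 1/tbr, 1/tbn and 1/tbc. tbn is the time base in AVStream that has come from the container, I think. It is used for all AVStream time stamps. tbc is the time base in AVCodecContext for the codec used for a particular stream. It is used for all AVCodecContext and related time stamps.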

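Internal time base AV_TIME_BASE

Besides the three time bases above, FFmpeg has an internal time base, AV_TIME_BASE, and its rational form, AV_TIME_BASE_Q. AV_TIME_BASE is defined as 1000000, and AV_TIME_BASE_Q is the rational {1, AV_TIME_BASE}.

AV_TIME_BASE and AV_TIME_BASE_Q are used by FFmpeg's internal functions. Time values computed with this time base are in microseconds.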

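Converting the form of a time value

av_q2d() converts a value from AVRational form to double form. AVRational is a fraction type and double is a double-precision floating-point type. The value before and after the conversion is based on the same time base; only its representation changes. Multiplying a timestamp by the converted time base gives a result in seconds.

av_q2d() simply returns a.num / (double) a.den.

A typical use of av_q2d(): the presentation instant of a packet in seconds is packet.pts × av_q2d(stream.time_base), and its duration in seconds is packet.duration × av_q2d(stream.time_base).

Note: an instant is a single point in time, while a duration is a length of time.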

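Time base conversion functions

av_rescale_q() converts a time value from one time base to another.

int64_t av_rescale_q(int64_t a, AVRational bq, AVRational cq) converts the value a from time base bq to time base cq and returns the value expressed in cq.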

int64_t av_rescale_rnd(int64_t a, int64_t b, int64_t c, enum AVRounding rnd) is more involved, and in most cases you do not need to call it directly.

It computes a * b / c and rounds the result in one of five ways:

// Round toward zero (truncation): round(2.5) is 2 and round(-2.5) is -2
AV_ROUND_ZERO = 0

// Round away from zero: round(3.5) is 4 and round(-3.5) is -4
AV_ROUND_INF = 1

// Round toward -infinity: [-2.9, -1.2, 2.4, 5.6, 7.0, 2.4] -> [-3, -2, 2, 5, 7, 2]
AV_ROUND_DOWN = 2

// Round toward +infinity: [-2.9, -1.2, 2.4, 5.6, 7.0, 2.4] -> [-2, -1, 3, 6, 7, 3]
AV_ROUND_UP = 3

// Round to nearest: a fraction below 0.5 goes toward zero, and 0.5 or more goes away from zero
AV_ROUND_NEAR_INF = 5
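The five rounding modes can be compared side by side with a simplified sketch. It is not FFmpeg's implementation: the Rounding enum and rescale_rnd are hypothetical names that mirror the list above, and the function assumes b >= 0, c > 0 and that a * b fits in 64 bits, whereas the real function also handles overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the rounding modes listed above. */
typedef enum {
    ROUND_ZERO     = 0,  /* toward zero */
    ROUND_INF      = 1,  /* away from zero */
    ROUND_DOWN     = 2,  /* toward -infinity */
    ROUND_UP       = 3,  /* toward +infinity */
    ROUND_NEAR_INF = 5   /* nearest, halfway cases away from zero */
} Rounding;

/*
 * Compute a * b / c with the requested rounding.
 * Assumes b >= 0, c > 0 and that a * b fits in int64_t.
 */
static int64_t rescale_rnd(int64_t a, int64_t b, int64_t c, Rounding rnd)
{
    int64_t product, quot, rem;
    int negative;

    assert(b >= 0 && c > 0);
    product = a * b;
    quot = product / c;          /* C division truncates toward zero */
    rem = product % c;           /* has the sign of product, or is 0 */
    if (rem == 0)
        return quot;             /* exact result, no rounding needed */
    negative = product < 0;

    switch (rnd) {
    case ROUND_ZERO:
        return quot;
    case ROUND_INF:
        return negative ? quot - 1 : quot + 1;
    case ROUND_DOWN:
        return negative ? quot - 1 : quot;
    case ROUND_UP:
        return negative ? quot : quot + 1;
    case ROUND_NEAR_INF: {
        int64_t abs_rem = negative ? -rem : rem;
        if (abs_rem >= c - abs_rem)   /* fractional part >= 0.5 */
            return negative ? quot - 1 : quot + 1;
        return quot;
    }
    }
    return quot;
}
```

Passing a = -29, b = 1, c = 10 (that is, -2.9) returns -3 with ROUND_DOWN and -2 with ROUND_UP, which matches the first element of the two lists above.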

av_packet_rescale_ts() converts the time values in an AVPacket (pts, dts and duration) from one time base to another.

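Time base conversion when remuxing

The container time base is the AVRational field AVStream.time_base (the tbn described above).

Note: AVStream.time_base is the time unit of pts and dts in an AVPacket. The time_base of input and output streams is determined as follows:

For an input stream: after the input file is opened, calling avformat_find_stream_info() obtains the time_base of each stream.

For an output stream: after the output file is opened, calling avformat_write_header() determines the time_base of each stream from the output container format and writes it to the output file.

Different container formats have different time bases. When remuxing (converting one container format to another), the time base conversion code looks like this.

The following is boilerplate that is usually written this way before pushing a stream: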

    av_read_frame(ifmt_ctx, &pkt);
    pkt.pts = av_rescale_q_rnd(pkt.pts, in_stream->time_base, out_stream->time_base, AV_ROUND_NEAR_INF|AV_ROUND_PASS_MINMAX);
    pkt.dts = av_rescale_q_rnd(pkt.dts, in_stream->time_base, out_stream->time_base, AV_ROUND_NEAR_INF|AV_ROUND_PASS_MINMAX);
    pkt.duration = av_rescale_q(pkt.duration, in_stream->time_base, out_stream->time_base);

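The following code has the same effect as the code above:

    // read a packet from the input file
    av_read_frame(ifmt_ctx, &pkt);
    // convert the packet's time values from the input container time base to the output container time base
    av_packet_rescale_ts(&pkt, in_stream->time_base, out_stream->time_base);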

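The stream time bases used here, in_stream->time_base and out_stream->time_base, are the container time bases, i.e. the tbn described above.

For example, the time_base of the FLV container is {1, 1000} and that of the TS container is {1, 90000}.

Suppose a program converts an FLV file to TS. Capturing the presentation timestamps of the first four frames of the original FLV file, and then of the converted TS file, shows that for the same video frame the time base (tbn) differs and so the timestamp (pkt_pts) differs, but the computed time (pkt_pts_time) is the same.

For the first frame the relationship is: 80 × {1, 1000} == 7200 × {1, 90000} == 0.080000.

The conclusion to remember: the timestamps may differ, but the time they represent is always the same.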

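Time base conversion when transcoding

The codec time base is AVCodecContext.time_base (the tbc described above).

The comments on that field in the FFmpeg header state that AVCodecContext.time_base is the reciprocal of the frame rate for video, with the timestamp increasing by 1 per frame, so tbc should equal the frame rate. When encoding, the user must set this field. When decoding, the field is deprecated, and the reciprocal of the frame rate should be used as the time base instead.

This raises a question: according to those comments, a 25 fps video stream should have a tbc of 25, yet the actual value is 50. It is unclear why. Perhaps tbc is simply outdated and no longer meaningful.

The conclusion is:

Following the advice in the comments, when decoding video we do not use AVCodecContext.time_base and use the reciprocal of the frame rate as the time base instead. When encoding video, we set AVCodecContext.time_base to the reciprocal of the frame rate.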

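Video streams

Video is played frame by frame, so the time base of decoded raw video frames is 1/framerate.

For time base handling while decoding video, what the packet pts means depends on the actual situation. A packet returned by av_read_frame uses AVStream->time_base, and there is no need to convert it to AVCodecContext->time_base before sending it to the decoder.

Note that after avcodec_receive_frame it is enough to keep treating the timestamps as being in AVStream->time_base.

In other words, the time base conversion in the following code is unnecessary.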

    AVFormatContext *ifmt_ctx;
    AVStream *in_stream;
    AVCodecContext *dec_ctx;
    AVPacket packet;
    AVFrame *frame;

    // read an encoded frame from the input file
    av_read_frame(ifmt_ctx, &packet);

    // time base conversion (av_inv_q returns an AVRational, not an int)
    AVRational raw_video_time_base = av_inv_q(dec_ctx->framerate);
    av_packet_rescale_ts(&packet, in_stream->time_base, raw_video_time_base);

    // decode
    avcodec_send_packet(dec_ctx, &packet);
    avcodec_receive_frame(dec_ctx, frame);

Time base handling while encoding video: if the frames sent to the encoder use the AVStream time_base, the packets read back with avcodec_receive_packet can also be treated as being in AVStream->time_base, so no extra conversion is needed. Again, it depends on the specific case.

    AVFormatContext *ofmt_ctx;
    AVStream *out_stream;
    AVCodecContext *dec_ctx;
    AVCodecContext *enc_ctx;
    AVPacket packet;
    AVFrame *frame;

    // encode
    avcodec_send_frame(enc_ctx, frame);
    // the timestamps may already be in AVStream->time_base at this point
    avcodec_receive_packet(enc_ctx, &packet);

    // time base conversion
    packet.stream_index = out_stream->index;
    enc_ctx->time_base = av_inv_q(dec_ctx->framerate);
    av_packet_rescale_ts(&packet, enc_ctx->time_base, out_stream->time_base);

    // write the encoded frame to the output media file
    av_interleaved_write_frame(ofmt_ctx, &packet);

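Audio streams

Audio streams work the same way; it comes down to the specific case. ffplay, for instance, feeds the decoder packets whose timestamps are in the AVStream time_base, and then converts the pts of the decoded frame to seconds using that same time_base (converting through the internal time base).

Note, however, that ffplay makes one rather hidden setting: avctx->pkt_timebase = ic->streams[stream_index]->time_base; that is, the codec context's pkt_timebase is set to the same time_base as the AVStream.

Audio is played sample by sample, so the time base of decoded raw audio frames is 1/sample_rate.

Time base handling while decoding audio: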

    AVFormatContext *ifmt_ctx;
    AVStream *in_stream;
    AVCodecContext *dec_ctx;
    AVPacket packet;
    AVFrame *frame;

    // read an encoded frame from the input file
    av_read_frame(ifmt_ctx, &packet);

    // time base conversion (sample_rate is an int, so build the rational directly)
    AVRational raw_audio_time_base = (AVRational){1, dec_ctx->sample_rate};
    av_packet_rescale_ts(&packet, in_stream->time_base, raw_audio_time_base);

    // decode
    avcodec_send_packet(dec_ctx, &packet);
    avcodec_receive_frame(dec_ctx, frame);

Time base handling while encoding audio:

    AVFormatContext *ofmt_ctx;
    AVStream *out_stream;
    AVCodecContext *dec_ctx;
    AVCodecContext *enc_ctx;
    AVPacket packet;
    AVFrame *frame;

    // encode
    avcodec_send_frame(enc_ctx, frame);
    avcodec_receive_packet(enc_ctx, &packet);

    // time base conversion (sample_rate is an int, so build the rational directly)
    packet.stream_index = out_stream->index;
    enc_ctx->time_base = (AVRational){1, dec_ctx->sample_rate};
    av_packet_rescale_ts(&packet, enc_ctx->time_base, out_stream->time_base);

    // write the encoded frame to the output media file
    av_interleaved_write_frame(ofmt_ctx, &packet);

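3. Walking through the sample code

A structure wraps the data needed to encode each stream: the encoder context, the stream itself, the audio sample count, the frames before and after resampling, and so on.

AVOutputFormat *fmt is defined next. The output container format encapsulates the muxing rules; ff_flv_muxer is one example.

The AVFormatContext is allocated and bound to a suitable AVOutputFormat based on the filename. If no format can be found from the file extension, FLV is used by default. Some parameters are then read from it and filled into the output structure.

The audio and video streams are added using the chosen codecs. If audio or video is not wanted, the corresponding field, such as fmt->video_codec, can be set to AV_CODEC_ID_NONE.

What does add_stream actually do?

It finds the encoder, creates a new stream and binds it to the AVFormatContext.

[Figure: diagram of the add_stream steps]

Next the stream id is assigned. The default value is -1, and every call to avformat_new_stream increments nb_streams by 1. Ids start from 0, so the first stream gets id = nb_streams(1) - 1 = 0 and the second stream gets id = nb_streams(2) - 1 = 1.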

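The encoder context is created right after that.

The encoder parameters are then initialized: the audio parameters for the audio stream and the video parameters for the video stream.

Once the streams have been added, open_video is called. What does it do?

It first opens the encoder and allocates the frame buffer, using 32-byte alignment here. Pixel format conversion with the scaler is needed only when the encoder requires something other than AV_PIX_FMT_YUV420P; H.264 encoders generally accept YUV420P, so conversion is usually not needed. The relevant parameters of the encoder context are also copied to the stream.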
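What 32-byte alignment means for a row of pixels can be shown with a little arithmetic. This is only a sketch of the rounding involved: align_up and padded_plane_size are hypothetical helpers, and the actual buffer layout in the article's code is decided by FFmpeg.

```c
#include <assert.h>
#include <stddef.h>

/* Round value up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t value, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/* Bytes needed for one 8-bit plane whose rows are padded to the alignment. */
static size_t padded_plane_size(size_t width, size_t height, size_t align)
{
    return align_up(width, align) * height;
}
```

A 352-pixel-wide luma row is already a multiple of 32 and stays at 352 bytes, while a 100-pixel row would be padded to 128 bytes.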

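After the streams are added, open_audio is called as well. What does it do?

It also opens the encoder and initializes the PCM parameters, then allocates the frame that is sent to the encoder together with its buffer. The parameters are copied here too, and if the sample format does not match what the encoder requires, an audio resampler is created.

Note: codec_ctx->time_base is set here; in this example it is 1/44100.

With the encoders opened and the parameters set, the next step is to open the output file.

Then the header is written. Note that this is called only once. While the header is being written, the time_base of each stream is rewritten, so the timestamp units change.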

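Now the main part: the loop that writes the audio and video data. av_compare_ts compares the audio pts with the video pts to keep the two streams in sync so that the pts values are written in order, with the audio pts as the reference.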
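The shape of such a loop can be sketched without FFmpeg. On each pass the stream whose next timestamp is earlier gets written, and a stream stops once it passes the target duration. StreamState, compare_ts, stream_done and interleave are hypothetical names, the frame "write" is reduced to recording a letter, and the sketch assumes small values so the 64-bit products cannot overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for AVRational. */
typedef struct { int num; int den; } Rational;

/* Hypothetical per-stream state: next timestamp and its time base. */
typedef struct {
    int64_t  next_pts;
    int64_t  step;        /* ticks added per frame */
    Rational time_base;
} StreamState;

/* Returns -1, 0 or 1; assumes positive denominators and small values. */
static int compare_ts(int64_t ts_a, Rational tb_a, int64_t ts_b, Rational tb_b)
{
    int64_t lhs = ts_a * tb_a.num * tb_b.den;
    int64_t rhs = ts_b * tb_b.num * tb_a.den;
    return (lhs > rhs) - (lhs < rhs);
}

/* A stream is finished once its next timestamp passes duration_s seconds. */
static int stream_done(const StreamState *st, int64_t duration_s)
{
    Rational one_second = { 1, 1 };
    return compare_ts(st->next_pts, st->time_base, duration_s, one_second) > 0;
}

/*
 * Record which stream is written at each step ('V' or 'A') into order[].
 * Returns the number of frames written, at most capacity.
 */
static size_t interleave(StreamState *video, StreamState *audio,
                         int64_t duration_s, char *order, size_t capacity)
{
    size_t n = 0;
    int want_video = 1, want_audio = 1;

    assert(video && audio && order);
    while ((want_video || want_audio) && n < capacity) {
        StreamState *pick;
        /* Video goes first when it is not later than audio. */
        if (want_video &&
            (!want_audio || compare_ts(video->next_pts, video->time_base,
                                       audio->next_pts, audio->time_base) <= 0)) {
            pick = video;
        } else {
            pick = audio;
        }
        if (stream_done(pick, duration_s)) {
            if (pick == video) want_video = 0; else want_audio = 0;
            continue;
        }
        order[n++] = (pick == video) ? 'V' : 'A';
        pick->next_pts += pick->step;
    }
    return n;
}
```

With 25 fps video (one tick per frame at 1/25) and 1024-sample audio frames at 1/44100, the first eight writes come out as V A A V A A V A, because roughly two audio frames fit into each 40 ms video frame.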

What does write_audio_frame do?

It gets the audio data, converts the sample format, converts the time base and sets the pts.

av_compare_ts is used here to limit how long audio is encoded and written.

The audio is then encoded and the packet is written to the file.

And what does write_video_frame mainly do?

It gets the video data, encodes it and writes the packet to the file (likewise producing only 5 s of data).

Getting the video data includes converting the pixel format.

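In write_frame the time base is converted once more:

pts_before * 1/44100 = pts_after * 1/1000.

pts_after = pts_before * 1/44100 * 1000 = -1024 * 1/44100 * 1000 ≈ -23. The point is to store a time that is appropriate for the file. Do not assume that the unit here is always ms. I usually compute the pts from the duration at this point.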
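The arithmetic above can be checked with a small helper that converts a position counted in samples into ticks of a coarser time base. samples_to_ticks is a hypothetical function that rounds to nearest and assumes no 64-bit overflow; it is not the conversion routine used in the article's code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a position counted in audio samples (time base 1/sample_rate)
 * into ticks of a 1/ticks_per_second time base, rounding to nearest.
 * Assumes sample_rate > 0, ticks_per_second > 0 and no 64-bit overflow.
 */
static int64_t samples_to_ticks(int64_t samples, int sample_rate,
                                int ticks_per_second)
{
    int64_t product, half;

    assert(sample_rate > 0 && ticks_per_second > 0);
    product = samples * ticks_per_second;
    half = sample_rate / 2;
    return product >= 0 ? (product + half) / sample_rate
                        : -((-product + half) / sample_rate);
}
```

At 44100 Hz and 1000 ticks per second, sample positions 0, 1024, 2048 and 3072 map to 0, 23, 46 and 70, so consecutive frames are not evenly spaced in the coarser time base. This is one reason to be careful when deriving pts values from durations.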

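After the conversion, the new pts, dts and duration have all been rescaled.

The video pts changes in the same way.

Once all of the above is done, av_write_trailer must be called. MP4 and FLV differ a lot here: if it is not called for MP4, the resulting file cannot be played.

That concludes this analysis. Follows, likes, shares and bookmarks are welcome. If you need the test code, send me a private message.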
