FFmpeg lossless cutting (a detailed walkthrough of FFplay's output and scaling modules) - 爱玩科技

威哥 2023-07-30 05:23:24 367


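1. Video output module

To run on different platforms, ffplay uses SDL, a cross-platform library, as its display SDK, so that video frames can be shown on Windows, Linux, macOS and other platforms.

Video (image) output initialization.

We start with how video (images) are displayed. Because ffplay uses SDL, and video display depends on SDL's window system, we begin with the SDL initialization in main (excerpt):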

int main(int argc, char **argv)
{
    //……
    // 3. Initialize SDL
    flags = SDL_INIT_VIDEO | SDL_INIT_AUDIO | SDL_INIT_TIMER;
    if (SDL_Init(flags)) {
        av_log(NULL, AV_LOG_FATAL, "Could not initialize SDL - %s\n", SDL_GetError());
        av_log(NULL, AV_LOG_FATAL, "(Did you set the DISPLAY variable?)\n");
        exit(1);
    }
    // 4. Create the window (ffplay.c keeps the window flags in a separate local variable)
    window = SDL_CreateWindow(program_name, SDL_WINDOWPOS_UNDEFINED, SDL_WINDOWPOS_UNDEFINED,
                              default_width, default_height, flags);
    if (window) {
        // Create the renderer, preferring a hardware-accelerated one
        renderer = SDL_CreateRenderer(window, -1, SDL_RENDERER_ACCELERATED | SDL_RENDERER_PRESENTVSYNC);
        if (!renderer) {
            av_log(NULL, AV_LOG_WARNING, "Failed to initialize a hardware accelerated renderer: %s\n", SDL_GetError());
            renderer = SDL_CreateRenderer(window, -1, 0);
        }
        if (renderer) {
            if (!SDL_GetRendererInfo(renderer, &renderer_info))
                av_log(NULL, AV_LOG_VERBOSE, "Initialized %s renderer.\n", renderer_info.name);
        }
    }
    // 5. stream_open starts the read_thread reader thread
    is = stream_open(input_filename, file_iformat);
    if (!is) {
        av_log(NULL, AV_LOG_FATAL, "Failed to initialize VideoState!\n");
        do_exit(NULL);
    }
    // 6. Handle events
    event_loop(is);
}

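The main steps in main are:

(1) SDL_Init, mainly for SDL_INIT_VIDEO support.

(2) SDL_CreateWindow, which creates the main window.

(3) SDL_CreateRenderer, which creates a renderer on the main window for rendered output.

(4) stream_open.

(5) event_loop, the event loop for playback control, which also drives the video display output.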

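We focus on how set_default_window_size works. The function determines the width and height of the window, and with them the region in which the video is rendered:

// 7. Take parameters from the stream to be played and set the window width, height and aspect ratio
if (st_index[AVMEDIA_TYPE_VIDEO] >= 0) {
    AVStream *st = ic->streams[st_index[AVMEDIA_TYPE_VIDEO]];
    AVCodecParameters *codecpar = st->codecpar;
    /* Guess the sample aspect ratio from the stream and the frame; the value is only a hint */
    AVRational sar = av_guess_sample_aspect_ratio(ic, st, NULL);
    if (codecpar->width) {
        // Set the size and aspect ratio of the display window
        set_default_window_size(codecpar->width, codecpar->height, sar);
    }
}

static void set_default_window_size(int width, int height, AVRational sar)
{
    SDL_Rect rect;
    int max_width  = screen_width  ? screen_width  : INT_MAX; // was a maximum window width given?
    int max_height = screen_height ? screen_height : INT_MAX; // was a maximum window height given?
    if (max_width == INT_MAX && max_height == INT_MAX)
        max_height = height; // no maximum given: use the height of the video
    calculate_display_rect(&rect, 0, 0, max_width, max_height, width, height, sar);
    default_width  = rect.w;
    default_height = rect.h;
}

screen_width and screen_height can be set when ffplay is started, with -x screen_width -y screen_height. If neither is given, max_height = height, that is, the height of the video frame, while max_width stays at INT_MAX, so the window width is derived from that height and the display aspect ratio.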

Initializing the window display size

The key function is calculate_display_rect. From its parameters (int scr_xleft, int scr_ytop, int scr_width, int scr_height, int pic_width, int pic_height, AVRational pic_sar) it computes the origin and size of the display area (rect).

static void calculate_display_rect(SDL_Rect *rect,
                                   int scr_xleft, int scr_ytop, int scr_width, int scr_height,
                                   int pic_width, int pic_height, AVRational pic_sar)
{
    AVRational aspect_ratio = pic_sar; // sample aspect ratio
    int64_t width, height, x, y;
    if (av_cmp_q(aspect_ratio, av_make_q(0, 1)) <= 0)
        aspect_ratio = av_make_q(1, 1); // negative or zero: fall back to 1:1
    // Convert to the real display ratio
    aspect_ratio = av_mul_q(aspect_ratio, av_make_q(pic_width, pic_height));
    /* XXX: we suppose the screen has a 1.0 pixel ratio */
    // Compute the width and height of the area that shows the video frame.
    // Start from the height: fill the full available height.
    height = scr_height;
    // "& ~1" forces an even width
    width = av_rescale(height, aspect_ratio.num, aspect_ratio.den) & ~1;
    if (width > scr_width) {
        // The width needed for that height does not fit: use the window width as the reference instead
        width = scr_width;
        height = av_rescale(width, aspect_ratio.den, aspect_ratio.num) & ~1;
    }
    // Compute the origin of the video area (inside the display window)
    x = (scr_width - width) / 2;
    y = (scr_height - height) / 2;
    rect->x = scr_xleft + x;
    rect->y = scr_ytop  + y;
    rect->w = FFMAX((int)width,  1);
    rect->h = FFMAX((int)height, 1);
}

typedef struct AVRational {
    int num; ///< Numerator
    int den; ///< Denominator
} AVRational;

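Note how the display aspect ratio is computed: multiplying the sample aspect ratio by the ratio of pixel width to pixel height gives the real display ratio.

aspect_ratio = av_mul_q(aspect_ratio, av_make_q(pic_width, pic_height));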
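The sizing rules above can be tried out without SDL or libavutil. The following is a minimal sketch, not ffplay's code: Rect, rescale_near, display_rect and default_window_size are hypothetical names, the aspect ratio is kept as a plain numerator/denominator pair, and rescale_near only approximates av_rescale for small positive values.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

typedef struct { int x, y, w, h; } Rect;   /* stand-in for SDL_Rect */

/* a * b / c rounded to nearest; a simplified stand-in for av_rescale() */
static int64_t rescale_near(int64_t a, int64_t b, int64_t c)
{
    return (a * b + c / 2) / c;
}

/* Fit a pic_w x pic_h picture with sample aspect ratio sar_num:sar_den
 * into the area (left, top, scr_w, scr_h), keeping the display ratio. */
static Rect display_rect(int left, int top, int scr_w, int scr_h,
                         int pic_w, int pic_h, int sar_num, int sar_den)
{
    assert(scr_w > 0 && scr_h > 0 && pic_w > 0 && pic_h > 0);
    if (sar_num <= 0 || sar_den <= 0)      /* unknown SAR: assume square pixels */
        sar_num = sar_den = 1;

    /* display aspect ratio = SAR * (pic_w / pic_h), kept as a fraction */
    int64_t dar_num = (int64_t)sar_num * pic_w;
    int64_t dar_den = (int64_t)sar_den * pic_h;

    /* Try the full height first; "& ~1" rounds down to an even number. */
    int64_t h = scr_h;
    int64_t w = rescale_near(h, dar_num, dar_den) & ~(int64_t)1;
    if (w > scr_w) {                       /* too wide: use the full width instead */
        w = scr_w;
        h = rescale_near(w, dar_den, dar_num) & ~(int64_t)1;
    }

    Rect r;
    r.x = left + (int)((scr_w - w) / 2);   /* center horizontally */
    r.y = top  + (int)((scr_h - h) / 2);   /* center vertically */
    r.w = w > 1 ? (int)w : 1;
    r.h = h > 1 ? (int)h : 1;
    return r;
}

/* Default window size; screen_w / screen_h are 0 when -x / -y were not given. */
static Rect default_window_size(int pic_w, int pic_h, int sar_num, int sar_den,
                                int screen_w, int screen_h)
{
    int max_w = screen_w ? screen_w : INT_MAX;
    int max_h = screen_h ? screen_h : INT_MAX;
    if (max_w == INT_MAX && max_h == INT_MAX)
        max_h = pic_h;                     /* no limit given: use the video height */
    return display_rect(0, 0, max_w, max_h, pic_w, pic_h, sar_num, sar_den);
}
```

For example, a 720x576 picture with SAR 16:15 has a display ratio of (16*720):(15*576) = 4:3, so in a 1280x720 area it occupies 960x720 and is centered with a 160-pixel margin on each side.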

Video (image) output logic.

The basic call chain is:

1 main() --> 2 event_loop() --> 3 refresh_loop_wait_event() --> 4 video_refresh() --> 5 video_display() --> 6 video_image_display() --> 7 upload_texture()

event_loop starts handling SDL events:

static void event_loop(VideoState *cur_stream)
{
    SDL_Event event;
    double incr, pos, frac;
    for (;;) {
        double x;
        refresh_loop_wait_event(cur_stream, &event); // video is displayed in here
        switch (event.type) {
        case SDL_KEYDOWN:
            //……
            switch (event.key.keysym.sym) {
            //……
            case SDLK_SPACE: // the space key toggles pause/resume
                toggle_pause(cur_stream);
                break;
            }
            break;
        //……
        case SDL_QUIT:
        case FF_QUIT_EVENT: // custom event, used to quit actively on error
            do_exit(cur_stream);
            break;
        }
    }
}

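The body of event_loop is a main loop, which on every iteration:

(1) calls refresh_loop_wait_event;

(2) handles the event taken from the SDL event queue. For example, pressing the space key toggles pause/resume, and closing the window triggers do_exit, which tears down the player.

The video is displayed mainly inside refresh_loop_wait_event:

static void refresh_loop_wait_event(VideoState *is, SDL_Event *event)
{
    double remaining_time = 0.0; /* time to sleep; computed in video_refresh */
    /* SDL_PumpEvents must be called before SDL_PeepEvents: it pumps events
     * from the input devices into the event queue */
    SDL_PumpEvents();
    /*
     * SDL_PeepEvents checks whether there is an event, e.g. the mouse entering the display area.
     * It takes one event from the queue and stores it in event; if there is none, the loop body runs.
     * SDL_PumpEvents must have been called beforehand to collect keyboard and other events.
     */
    while (!SDL_PeepEvents(event, 1, SDL_GETEVENT, SDL_FIRSTEVENT, SDL_LASTEVENT)) {
        if (!cursor_hidden && av_gettime_relative() - cursor_last_shown > CURSOR_HIDE_DELAY) {
            SDL_ShowCursor(0);
            cursor_hidden = 1;
        }
        /*
         * remaining_time is what drives audio/video synchronization.
         * video_refresh compares the display time of the current frame with the actual time
         * and computes how long to sleep so that frames are shown on time.
         */
        if (remaining_time > 0.0) // sleep to control when the picture is output
            av_usleep((int64_t)(remaining_time * 1000000.0));
        remaining_time = REFRESH_RATE;
        if (is->show_mode != SHOW_MODE_NONE && // show mode is not SHOW_MODE_NONE
            (!is->paused                       // not paused
             || is->force_refresh))            // or a forced refresh was requested
            video_refresh(is, &remaining_time);
        /* Collect events from the input devices and push them into the event queue.
         * According to the author, it also updates device state for the video subsystem:
         * without this call the displayed video would lose its colors after about 10 seconds.
         * Without SDL_PumpEvents no input device event ever enters the queue,
         * and SDL cannot respond to the keyboard or any other hardware input. */
        SDL_PumpEvents(); // pick up further events
    }
}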

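While an event is being handled, the picture may pause.

With the SDL_GETEVENT argument, SDL_PeepEvents queries the queue for events without blocking. A non-zero return value means an event is available (or -1 that an error occurred); the function then returns and event_loop handles the event. Otherwise video_refresh is called to display the picture, and its output parameter remaining_time gives the time to sleep in the next round, which keeps the picture output steady.

There is also a precondition for calling video_refresh. It is called only when both of the following hold:

(1) The show mode is not SHOW_MODE_NONE (if the file contains only audio, its waveform or spectrum is displayed instead).

(2) The player is not paused, or force_refresh is set.

force_refresh is set to 1 in the following situations:

a. Inside video_refresh, when a frame is due for display. This is the normal case.

b. SDL_WINDOWEVENT_EXPOSED: the window needs to be redrawn.

c. SDL_MOUSEBUTTONDOWN with SDL_BUTTON_LEFT: two left clicks on the display window less than 0.5 seconds apart switch to full screen or back to the original window.

d. SDLK_f: pressing the f key switches to full screen or back to the original window.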
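The precondition can be written as a small predicate. This is only a sketch of the condition used in refresh_loop_wait_event: RefreshState is a hypothetical subset of VideoState, and the numeric values of the show-mode enum are an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Names follow the show modes mentioned in the text; the values are assumed. */
enum ShowMode { SHOW_MODE_NONE = -1, SHOW_MODE_VIDEO = 0, SHOW_MODE_WAVES, SHOW_MODE_RDFT };

typedef struct {
    enum ShowMode show_mode;
    bool paused;
    bool force_refresh;
} RefreshState;   /* hypothetical subset of VideoState */

/* True when the wait loop should call video_refresh in this iteration. */
static bool should_call_video_refresh(const RefreshState *s)
{
    assert(s != NULL);
    return s->show_mode != SHOW_MODE_NONE && (!s->paused || s->force_refresh);
}
```

A paused player therefore still redraws once when, for instance, the window is exposed, because that event sets force_refresh.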

video_refresh: refreshing the video display

Next we look at video_refresh, the key function for video display (simplified):

static void video_refresh(void *opaque, double *remaining_time)
{
    VideoState *is = opaque;
    double time;
    Frame *sp, *sp2;
    if (is->video_st) {
retry:
        if (frame_queue_nb_remaining(&is->pictq) == 0) { // is the frame queue empty?
            // nothing to do, no picture to display in the queue
        } else {
            double last_duration, duration, delay;
            Frame *vp, *lastvp;
            /* dequeue the picture */
            lastvp = frame_queue_peek_last(&is->pictq); // the previous frame
            vp = frame_queue_peek(&is->pictq);          // the frame waiting to be displayed
            if (vp->serial != is->videoq.serial) {
                // Not from the latest playback serial: drop it to reach frames of the latest serial quickly
                frame_queue_next(&is->pictq);
                goto retry;
            }
            if (lastvp->serial != vp->serial)
                // A new playback serial resets the frame timer to the current time
                is->frame_timer = av_gettime_relative() / 1000000.0;
            if (is->paused)
                goto display;
            /* compute nominal last_duration */
            last_duration = vp_duration(is, lastvp, vp);     // how long lastvp should be displayed
            delay = compute_target_delay(last_duration, is); // how much longer lastvp still has to play
            time = av_gettime_relative() / 1000000.0;
            // This acts as a form of flow control
            if (time < is->frame_timer + delay) { // not yet time to show vp
                *remaining_time = FFMIN(is->frame_timer + delay - time, *remaining_time);
                goto display;
            }
            // It is at least time to show vp
            is->frame_timer += delay;
            // If the timer has fallen too far behind the system time, resync it
            if (delay > 0 && time - is->frame_timer > AV_SYNC_THRESHOLD_MAX)
                is->frame_timer = time;
            SDL_LockMutex(is->pictq.mutex);
            if (!isnan(vp->pts))
                update_video_pts(is, vp->pts, vp->pos, vp->serial); // update the video clock
            SDL_UnlockMutex(is->pictq.mutex);
            // The queue holds at least 2 frames
            if (frame_queue_nb_remaining(&is->pictq) > 1) {
                // Peek at the following frame
                Frame *nextvp = frame_queue_peek_next(&is->pictq);
                // Duration of vp, from the gap to the following frame
                duration = vp_duration(is, vp, nextvp);
                // Drop vp if dropping is allowed and it is already late
                if (!is->step && (framedrop > 0 || (framedrop && get_master_sync_type(is) != AV_SYNC_VIDEO_MASTER)) && time > is->frame_timer + duration) {
                    is->frame_drops_late++;
                    // Move on to the next frame as soon as possible
                    frame_queue_next(&is->pictq);
                    goto retry; // check the next frame
                }
            }
            if (is->subtitle_st) {
                // subtitle display
                //……
            }
            frame_queue_next(&is->pictq);
            is->force_refresh = 1;
            if (is->step && !is->paused)
                stream_toggle_pause(is);
        }
display:
        /* display picture */
        if (!display_disable && is->force_refresh && is->show_mode == SHOW_MODE_VIDEO && is->pictq.rindex_shown)
            video_display(is);
    }
    is->force_refresh = 0;
}

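video_refresh is fairly long, even after simplification and removal of the minor branches.

The function works with three nodes of the FrameQueue: lastvp, vp and nextvp, where:

(1) vp is the target frame to be displayed this time (the frame awaiting display).

(2) lastvp is the frame that has already been displayed (the one currently visible on screen).

(3) nextvp is the frame to be displayed next time (queued right after vp).

The order is nextvp (to be shown the time after next) -> vp (awaiting display) -> lastvp (already displayed).

The frames before and after are fetched so that the duration can be computed accurately from their pts (the value may vary from frame to frame). The duration is computed by vp_duration:

// Compute how long a frame should last; includes a correction for abnormal values
static double vp_duration(VideoState *is, Frame *vp, Frame *nextvp)
{
    if (vp->serial == nextvp->serial) { // same playback serial, i.e. the sequence is continuous
        double duration = nextvp->pts - vp->pts;
        if (isnan(duration)                        // duration is not a valid number
            || duration <= 0                       // pts did not increase
            || duration > is->max_frame_duration)  // exceeds the maximum frame duration
            // Abnormal case: use the frame interval computed when the frame was queued
            // (derived mainly from the frame rate, i.e. 1 second / frame rate)
            return vp->duration;
        else
            // Normal case: the pts difference of the two adjacent frames
            return duration;
    } else {
        // Different playback serials, the sequence is not continuous: return 0
        return 0.0;
    }
}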
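The correction rule in vp_duration is easy to check in isolation. Below is a minimal sketch with plain values instead of ffplay's Frame and VideoState; FrameInfo and frame_gap are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <math.h>

typedef struct {
    double pts;       /* presentation time in seconds, may be NAN */
    double duration;  /* nominal duration, e.g. 1 / frame rate */
    int    serial;    /* playback serial, changes after a seek */
} FrameInfo;          /* hypothetical stand-in for ffplay's Frame */

/* How long `cur` should stay on screen, given the frame that follows it. */
static double frame_gap(const FrameInfo *cur, const FrameInfo *next,
                        double max_frame_duration)
{
    assert(cur && next);
    if (cur->serial != next->serial)
        return 0.0;                       /* discontinuity: no meaningful gap */
    double d = next->pts - cur->pts;
    if (isnan(d) || d <= 0 || d > max_frame_duration)
        return cur->duration;             /* fall back to the nominal duration */
    return d;                             /* normal case: pts difference */
}
```

With max_frame_duration set to 10 seconds, a gap of 0.5 s is returned as is, whereas a backwards pts, a NAN pts or a 99 s jump all fall back to the nominal duration of the current frame.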

The main flow of video_refresh is as follows:

[Figure (1): flow chart of video_refresh]

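Let us first look at the main path of the flow chart above, the middle column of boxes. Abstracting further, the body of video_refresh breaks down into the following steps:

(1) Fetch the previous frame lastvp and the frame awaiting display, vp.

(2) Compute how long lastvp should be displayed and decide whether to keep showing it.

(3) Estimate how long the current frame should be displayed and decide whether to drop it: a frame that is already out of date is dropped, one that is not is displayed.

(4) Call video_display to display the frame.

The overall rendering logic is this: video_display calls frame_queue_peek_last to get the frame shown last (lastvp) and displays it. So if video_refresh goes straight to video_display, lastvp is displayed (note that in this case, unless force_refresh has been triggered, lastvp is not fetched and rendered again). If frame_queue_next is called before video_display, vp is displayed.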
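The relation between frame_queue_peek_last, frame_queue_peek, frame_queue_peek_next and frame_queue_next can be sketched with a small ring buffer. This is an illustrative model based on a reading of ffplay's FrameQueue, not its actual code: MiniQueue and the mq_* functions are hypothetical, there is no locking, and the "keep the last shown frame" behavior is assumed to be always on.

```c
#include <assert.h>

#define QUEUE_CAP 4

typedef struct {
    int frames[QUEUE_CAP]; /* frame ids stand in for decoded pictures */
    int rindex;            /* read position: the last shown frame once rindex_shown is 1 */
    int windex;            /* write position */
    int size;              /* number of occupied slots */
    int rindex_shown;      /* 1 once the frame at rindex has been displayed */
} MiniQueue;

static void mq_push(MiniQueue *q, int id)
{
    assert(q->size < QUEUE_CAP);
    q->frames[q->windex] = id;
    q->windex = (q->windex + 1) % QUEUE_CAP;
    q->size++;
}

/* Frames that have not been displayed yet. */
static int mq_nb_remaining(const MiniQueue *q) { return q->size - q->rindex_shown; }

/* lastvp: the frame displayed most recently. */
static int mq_peek_last(const MiniQueue *q) { return q->frames[q->rindex]; }

/* vp: the frame waiting to be displayed. */
static int mq_peek(const MiniQueue *q)
{
    return q->frames[(q->rindex + q->rindex_shown) % QUEUE_CAP];
}

/* nextvp: the frame queued right after vp. */
static int mq_peek_next(const MiniQueue *q)
{
    return q->frames[(q->rindex + q->rindex_shown + 1) % QUEUE_CAP];
}

/* Advance: vp becomes the new lastvp. */
static void mq_next(MiniQueue *q)
{
    if (!q->rindex_shown) {    /* first call: keep the slot, mark it as shown */
        q->rindex_shown = 1;
        return;
    }
    q->rindex = (q->rindex + 1) % QUEUE_CAP;
    q->size--;
}
```

After each mq_next, the value previously returned by mq_peek is what mq_peek_last returns, which is why a display routine that always draws the "last" frame ends up drawing vp.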

Below we go through steps (2) to (4) in detail, reading them alongside the flow chart and the code.

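Computing how long the previous frame should be displayed and deciding whether to keep showing it

First, check whether pictq is empty (frame_queue_nb_remaining tells whether the queue holds frames that have not been displayed). If it is empty, the flow goes on to video_display (which shows the previous frame).

Before the display time of the previous frame can be computed accurately, we must check whether the vp returned by frame_queue_peek belongs to the latest serial, that is, if (vp->serial != is->videoq.serial). If the condition holds, a seek or a similar operation has taken place, the stream is discontinuous, and lastvp should be discarded. So frame_queue_next is called to discard lastvp, and the flow returns to the start for another round.

Now the accurate display duration of lastvp can be computed. The code is:

last_duration = vp_duration(is, lastvp, vp);
delay = compute_target_delay(last_duration, is);

Intuitively, vp_duration computing the pts difference of the two frames, i.e. the frame duration, would be enough. Once synchronization is considered, however, for example video synchronized to audio, the current offset from the master clock must also be taken into account to decide whether to repeat the previous frame, drop a frame, or display the next frame (the frame awaiting display, vp) normally. For now it is enough to understand that these two steps yield the accurate display duration of the previous frame. Finally, based on that duration (the delay variable), decide whether to keep displaying the previous frame:

time = av_gettime_relative() / 1000000.0; // current system time, in seconds
if (time < is->frame_timer + delay) {
    *remaining_time = FFMIN(is->frame_timer + delay - time, *remaining_time);
    goto display;
}

frame_timer can be understood as the moment a frame is displayed: before the update it is the display time of the previous frame, after the update that of the current frame. time < is->frame_timer + delay means the display of the previous frame has not ended yet: if the current system time has not reached the end of the previous frame, the previous frame should stay on screen. The delay here corresponds to last_duration in the figure below.

[Figure (2): illustration of last_duration]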

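Estimating how long the current frame should be displayed and deciding whether to drop it

Before this step a little preparation is needed: updating frame_timer and updating vidclk.

is->frame_timer += delay; // update frame_timer; it now holds the display time of vp
if (delay > 0 && time - is->frame_timer > AV_SYNC_THRESHOLD_MAX)
    is->frame_timer = time; // too far from the system time: correct it to the system time
SDL_LockMutex(is->pictq.mutex);
if (!isnan(vp->pts))
    update_video_pts(is, vp->pts, vp->pos, vp->serial); // update vidclk
SDL_UnlockMutex(is->pictq.mutex);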
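The two uses of frame_timer, first to decide whether the previous frame stays on screen and then to advance and resynchronize the timer, can be isolated from ffplay's state. The sketch below uses hypothetical names (RefreshDecision, frame_due) and takes the threshold as a parameter instead of the AV_SYNC_THRESHOLD_MAX constant.

```c
#include <assert.h>

typedef enum { KEEP_LAST_FRAME, ADVANCE_TO_NEXT } RefreshDecision;

/* now, *frame_timer and delay are in seconds.
 * *remaining_time is lowered to the time left until the next frame is due. */
static RefreshDecision frame_due(double now, double *frame_timer, double delay,
                                 double sync_threshold_max, double *remaining_time)
{
    assert(frame_timer && remaining_time);
    if (now < *frame_timer + delay) {
        /* The previous frame has not been shown long enough yet. */
        double wait = *frame_timer + delay - now;
        if (wait < *remaining_time)
            *remaining_time = wait;
        return KEEP_LAST_FRAME;
    }
    /* The next frame is due: move the timer to its display time. */
    *frame_timer += delay;
    /* If the timer lags too far behind the wall clock, snap it to now. */
    if (delay > 0 && now - *frame_timer > sync_threshold_max)
        *frame_timer = now;
    return ADVANCE_TO_NEXT;
}
```

Advancing the timer by delay instead of setting it to the current time keeps small scheduling jitter from accumulating; the snap only applies once the lag exceeds the threshold.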

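Now we can decide whether to drop the frame:

if (frame_queue_nb_remaining(&is->pictq) > 1) { // only check for dropping when there is a nextvp
    Frame *nextvp = frame_queue_peek_next(&is->pictq);
    duration = vp_duration(is, vp, nextvp);
    if (!is->step // only outside frame-step mode; is->step == 1 means stepping frame by frame, where dropping is best avoided
        && (framedrop > 0 // dropping explicitly enabled, e.g. when decoding is too slow for the CPU
            || (framedrop && get_master_sync_type(is) != AV_SYNC_VIDEO_MASTER)) // or video is not the master clock
        && time > is->frame_timer + duration) { // really one frame behind
        printf("%s(%d) dif:%lfs drop frame\n", __FUNCTION__, __LINE__, (is->frame_timer + duration) - time);
        is->frame_drops_late++; // count the dropped frames
        // This is where the frame is actually dropped
        frame_queue_next(&is->pictq);
        goto retry;
    }
}

The code for dropping a frame is simple: call frame_queue_next, then retry. Dropping requires frame_queue_nb_remaining(&is->pictq) > 1, i.e. a nextvp must exist, and all of the following must hold:

(1) The player is not in step mode. In other words, in step mode the drop logic is never triggered. (step is used when seeking while paused: once the seek finishes, one frame after the seek position is displayed, so the effect of the seek is visible.)

(2) Dropping is allowed: either framedrop > 0, or framedrop is non-zero and the master clock is not AV_SYNC_VIDEO_MASTER (video is not the sync reference).

(3) The current time is later than frame_timer + duration, meaning the time slot in which the frame should have been displayed has already passed.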
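The three conditions combine into a single predicate. The following is a sketch only: should_drop_frame and the SYNC_* constants are hypothetical names, and the arguments replace fields that ffplay reads from VideoState and from its global framedrop option.

```c
#include <assert.h>
#include <stdbool.h>

enum { SYNC_AUDIO_MASTER, SYNC_VIDEO_MASTER, SYNC_EXTERNAL_CLOCK }; /* assumed values */

/* framedrop: > 0 always allows dropping, 0 never, < 0 only when video is not the master clock. */
static bool should_drop_frame(int nb_remaining, bool step, int framedrop, int master_sync,
                              double now, double frame_timer, double duration)
{
    assert(nb_remaining >= 0);
    if (nb_remaining <= 1)   /* no nextvp: nothing to fall back on */
        return false;
    if (step)                /* frame stepping never drops */
        return false;
    bool drop_allowed = framedrop > 0 ||
                        (framedrop && master_sync != SYNC_VIDEO_MASTER);
    return drop_allowed && now > frame_timer + duration;
}
```

With framedrop set to a negative value and audio as the master clock, a frame whose slot ended before the current time is dropped, while the same frame is kept when video is the master clock.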

Calling video_display to display the frame

If the previous frame does not have to be repeated and the current frame does not have to be dropped, the current frame can be displayed. As mentioned earlier, video_display shows the frame returned by frame_queue_peek_last, so frame_queue_next must be called first: it moves the read position inside pictq and marks vp as shown, which ensures that frame_queue_peek_last returns vp.

static void video_display(VideoState *is)
{
    if (!is->width)
        video_open(is); // show the window if it is not shown yet
    SDL_SetRenderDrawColor(renderer, 0, 0, 0, 255);
    SDL_RenderClear(renderer);
    if (is->audio_st && is->show_mode != SHOW_MODE_VIDEO)
        video_audio_display(is); // graphical display for files that only have an audio track
    else if (is->video_st)
        video_image_display(is); // display one video frame
    SDL_RenderPresent(renderer);
}

Assuming basic familiarity with the SDL API, we go straight to video_image_display:

static void video_image_display(VideoState *is)
{
    Frame *vp;
    Frame *sp = NULL;
    SDL_Rect rect;
    vp = frame_queue_peek_last(&is->pictq); // the video frame to display
    if (is->subtitle_st) {
        // subtitle logic
    }
    // Fit the frame into the window as large as possible, honoring its sar
    calculate_display_rect(&rect, is->xleft, is->ytop, is->width, is->height, vp->width, vp->height, vp->sar);
    if (!vp->uploaded) { // when the previous frame is shown again, uploaded is already 1
        if (upload_texture(&is->vid_texture, vp->frame, &is->img_convert_ctx) < 0)
            return;
        vp->uploaded = 1; // already copied to vid_texture once, no need to copy again
        vp->flip_v = vp->frame->linesize[0] < 0; // 1: flip vertically, 0: display normally
    }
    SDL_RenderCopyEx(renderer, is->vid_texture, NULL, &rect, 0, NULL, vp->flip_v ? SDL_FLIP_VERTICAL : 0);
    if (sp) {
        // subtitle logic
    }
}

With an understanding of SDL rendering, the logic of video_image_display is not complicated: frame_queue_peek_last fetches the frame to display, upload_texture updates it into the SDL_Texture, and SDL_RenderCopyEx copies the texture to the renderer for display.

Finally, let us see how upload_texture passes the image data of an AVFrame to the SDL texture:

static int upload_texture(SDL_Texture **tex, AVFrame *frame, struct SwsContext **img_convert_ctx)
{
    int ret = 0;
    Uint32 sdl_pix_fmt;
    SDL_BlendMode sdl_blendmode;
    // Map the image format of the frame (an FFmpeg pixel format) to the matching SDL pixel format
    get_sdl_pix_fmt_and_blendmode(frame->format, &sdl_pix_fmt, &sdl_blendmode);
    // tex is actually &is->vid_texture; (re)allocate it for the SDL pixel format just obtained
    if (realloc_texture(tex, sdl_pix_fmt == SDL_PIXELFORMAT_UNKNOWN ? SDL_PIXELFORMAT_ARGB8888 : sdl_pix_fmt,
                        frame->width, frame->height, sdl_blendmode, 0) < 0)
        return -1;
    switch (sdl_pix_fmt) {
    /* The frame format is not supported by SDL: convert the image to AV_PIX_FMT_BGRA,
     * which corresponds to SDL_PIXELFORMAT_BGRA32 (this conversion may be less efficient
     * than libyuv or a shader) */
    case SDL_PIXELFORMAT_UNKNOWN:
        /* This should only happen if we are not using avfilter... */
        *img_convert_ctx = sws_getCachedContext(*img_convert_ctx,
            frame->width, frame->height, frame->format,
            frame->width, frame->height, AV_PIX_FMT_BGRA,
            sws_flags, NULL, NULL, NULL);
        if (*img_convert_ctx != NULL) {
            uint8_t *pixels[4];
            int pitch[4];
            if (!SDL_LockTexture(*tex, NULL, (void **)pixels, pitch)) {
                sws_scale(*img_convert_ctx, (const uint8_t * const *)frame->data, frame->linesize,
                          0, frame->height, pixels, pitch);
                SDL_UnlockTexture(*tex);
            }
        } else {
            av_log(NULL, AV_LOG_FATAL, "Cannot initialize the conversion context\n");
            ret = -1;
        }
        break;
    // The frame format maps to SDL_PIXELFORMAT_IYUV: no conversion, update the texture with SDL_UpdateYUVTexture()
    case SDL_PIXELFORMAT_IYUV:
        if (frame->linesize[0] > 0 && frame->linesize[1] > 0 && frame->linesize[2] > 0) {
            ret = SDL_UpdateYUVTexture(*tex, NULL,
                                       frame->data[0], frame->linesize[0],
                                       frame->data[1], frame->linesize[1],
                                       frame->data[2], frame->linesize[2]);
        } else if (frame->linesize[0] < 0 && frame->linesize[1] < 0 && frame->linesize[2] < 0) {
            ret = SDL_UpdateYUVTexture(*tex, NULL,
                                       frame->data[0] + frame->linesize[0] * (frame->height - 1), -frame->linesize[0],
                                       frame->data[1] + frame->linesize[1] * (AV_CEIL_RSHIFT(frame->height, 1) - 1), -frame->linesize[1],
                                       frame->data[2] + frame->linesize[2] * (AV_CEIL_RSHIFT(frame->height, 1) - 1), -frame->linesize[2]);
        } else {
            av_log(NULL, AV_LOG_ERROR, "Mixed negative and positive linesizes are not supported.\n");
            return -1;
        }
        break;
    // The frame format maps to another SDL pixel format: no conversion, update the texture with SDL_UpdateTexture()
    default:
        if (frame->linesize[0] < 0) {
            ret = SDL_UpdateTexture(*tex, NULL, frame->data[0] + frame->linesize[0] * (frame->height - 1), -frame->linesize[0]);
        } else {
            ret = SDL_UpdateTexture(*tex, NULL, frame->data[0], frame->linesize[0]);
        }
        break;
    }
    return ret;
}

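The pixel format of frame is one defined by FFmpeg. Many pixel formats defined in FFmpeg and many defined in SDL are in fact the same format under different names. Depending on how the pixel format of the frame matches the formats supported by SDL, upload_texture() handles three cases, one per branch of the switch statement:

(1) If the frame format corresponds to SDL_PIXELFORMAT_IYUV, no format conversion takes place and SDL_UpdateYUVTexture() updates the image data into &is->vid_texture.

(2) If the frame format corresponds to another format supported by SDL (such as AV_PIX_FMT_RGB32), no format conversion takes place either, and SDL_UpdateTexture() updates the image data into &is->vid_texture.

(3) If the frame format is not supported by SDL (i.e. it maps to SDL_PIXELFORMAT_UNKNOWN), the image format has to be converted.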
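In the branches without conversion, upload_texture treats negative linesize values specially: it passes the address of the row stored lowest in memory together with a positive pitch, and video_image_display later compensates with flip_v. The pointer arithmetic can be sketched on its own; PlaneView and normalize_plane are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *first_byte; /* lowest address belonging to the plane */
    int pitch;                 /* positive distance between rows, in bytes */
} PlaneView;

/* With a negative linesize, `data` points at the picture's first row, which is the
 * row stored highest in memory; each following row lies `-linesize` bytes earlier.
 * Return the lowest row address and a positive pitch, suitable for an API that
 * expects rows at increasing addresses. The rows then appear in reverse order,
 * which is why the caller still has to flip the image vertically. */
static PlaneView normalize_plane(const uint8_t *data, int linesize, int height)
{
    assert(data != NULL && linesize != 0 && height > 0);
    PlaneView v;
    if (linesize < 0) {
        v.first_byte = data + (ptrdiff_t)linesize * (height - 1);
        v.pitch = -linesize;
    } else {
        v.first_byte = data;
        v.pitch = linesize;
    }
    return v;
}
```

For a 3-row plane with 4 bytes per row stored bottom-up, data points 8 bytes into the buffer and linesize is -4; the normalized view starts at the beginning of the buffer with a pitch of 4.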

Looking up the SDL pixel format of a frame in the mapping table

get_sdl_pix_fmt_and_blendmode()

This function takes the input parameter format (an FFmpeg pixel format) and finds the corresponding SDL pixel format, which it stores in the output parameter sdl_pix_fmt.

static void get_sdl_pix_fmt_and_blendmode(int format, Uint32 *sdl_pix_fmt, SDL_BlendMode *sdl_blendmode)
{
    int i;
    *sdl_blendmode = SDL_BLENDMODE_NONE;
    *sdl_pix_fmt = SDL_PIXELFORMAT_UNKNOWN;
    if (format == AV_PIX_FMT_RGB32   ||
        format == AV_PIX_FMT_RGB32_1 ||
        format == AV_PIX_FMT_BGR32   ||
        format == AV_PIX_FMT_BGR32_1)
        *sdl_blendmode = SDL_BLENDMODE_BLEND;
    for (i = 0; i < FF_ARRAY_ELEMS(sdl_texture_format_map) - 1; i++) {
        if (format == sdl_texture_format_map[i].format) {
            *sdl_pix_fmt = sdl_texture_format_map[i].texture_fmt;
            return;
        }
    }
}

ffplay.c defines a table, sdl_texture_format_map[], which maps some FFmpeg pixel formats to SDL pixel formats:

static const struct TextureFormatEntry {
    enum AVPixelFormat format;
    int texture_fmt;
} sdl_texture_format_map[] = {
    { AV_PIX_FMT_RGB8,           SDL_PIXELFORMAT_RGB332 },
    { AV_PIX_FMT_RGB444,         SDL_PIXELFORMAT_RGB444 },
    { AV_PIX_FMT_RGB555,         SDL_PIXELFORMAT_RGB555 },
    { AV_PIX_FMT_BGR555,         SDL_PIXELFORMAT_BGR555 },
    { AV_PIX_FMT_RGB565,         SDL_PIXELFORMAT_RGB565 },
    { AV_PIX_FMT_BGR565,         SDL_PIXELFORMAT_BGR565 },
    { AV_PIX_FMT_RGB24,          SDL_PIXELFORMAT_RGB24 },
    { AV_PIX_FMT_BGR24,          SDL_PIXELFORMAT_BGR24 },
    { AV_PIX_FMT_0RGB32,         SDL_PIXELFORMAT_RGB888 },
    { AV_PIX_FMT_0BGR32,         SDL_PIXELFORMAT_BGR888 },
    { AV_PIX_FMT_NE(RGB0, 0BGR), SDL_PIXELFORMAT_RGBX8888 },
    { AV_PIX_FMT_NE(BGR0, 0RGB), SDL_PIXELFORMAT_BGRX8888 },
    { AV_PIX_FMT_RGB32,          SDL_PIXELFORMAT_ARGB8888 },
    { AV_PIX_FMT_RGB32_1,        SDL_PIXELFORMAT_RGBA8888 },
    { AV_PIX_FMT_BGR32,          SDL_PIXELFORMAT_ABGR8888 },
    { AV_PIX_FMT_BGR32_1,        SDL_PIXELFORMAT_BGRA8888 },
    { AV_PIX_FMT_YUV420P,        SDL_PIXELFORMAT_IYUV },
    { AV_PIX_FMT_YUYV422,        SDL_PIXELFORMAT_YUY2 },
    { AV_PIX_FMT_UYVY422,        SDL_PIXELFORMAT_UYVY },
    { AV_PIX_FMT_NONE,           SDL_PIXELFORMAT_UNKNOWN },
};

As the table shows, images in every format except the last entry can be handed to SDL and displayed directly, without any image conversion. For the meaning of these pixel formats, see the appendix "Color spaces and pixel formats".
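The lookup pattern, a constant table terminated by a sentinel entry that the search loop skips, can be reproduced with made-up formats. In the sketch below SrcFormat, TexFormat, format_map and both functions are hypothetical and only mirror the shape of sdl_texture_format_map.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { SRC_NONE = -1, SRC_YUV420P, SRC_RGB24, SRC_NV12 } SrcFormat; /* hypothetical */
typedef enum { TEX_UNKNOWN = 0, TEX_IYUV, TEX_RGB24 } TexFormat;            /* hypothetical */

static const struct {
    SrcFormat src;
    TexFormat tex;
} format_map[] = {
    { SRC_YUV420P, TEX_IYUV },
    { SRC_RGB24,   TEX_RGB24 },
    { SRC_NONE,    TEX_UNKNOWN },   /* sentinel, never matched by the loop */
};

/* Return the texture format for src, or TEX_UNKNOWN when the table has no entry. */
static TexFormat lookup_texture_format(SrcFormat src)
{
    size_t n = sizeof(format_map) / sizeof(format_map[0]);
    assert(format_map[n - 1].src == SRC_NONE);   /* the table must end with the sentinel */
    for (size_t i = 0; i + 1 < n; i++)           /* stop before the sentinel */
        if (format_map[i].src == src)
            return format_map[i].tex;
    return TEX_UNKNOWN;
}

/* A frame needs software conversion exactly when the lookup fails. */
static int needs_conversion(SrcFormat src)
{
    return lookup_texture_format(src) == TEX_UNKNOWN;
}
```

A format missing from the table, SRC_NV12 in this made-up example, yields TEX_UNKNOWN and therefore takes the conversion path.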

Reallocating vid_texture

realloc_texture()

Based on the SDL pixel format just obtained, this function reallocates &is->vid_texture. As shown below, it first destroys the texture with SDL_DestroyTexture() and then creates it with SDL_CreateTexture().

static int realloc_texture(SDL_Texture **texture, Uint32 new_format, int new_width, int new_height,
                           SDL_BlendMode blendmode, int init_texture)
{
    Uint32 format;
    int access, w, h;
    if (!*texture || SDL_QueryTexture(*texture, &format, &access, &w, &h) < 0 ||
        new_width != w || new_height != h || new_format != format) {
        void *pixels;
        int pitch;
        if (*texture)
            SDL_DestroyTexture(*texture);
        if (!(*texture = SDL_CreateTexture(renderer, new_format, SDL_TEXTUREACCESS_STREAMING, new_width, new_height)))
            return -1;
        if (SDL_SetTextureBlendMode(*texture, blendmode) < 0)
            return -1;
        if (init_texture) {
            if (SDL_LockTexture(*texture, NULL, &pixels, &pitch) < 0)
                return -1;
            memset(pixels, 0, pitch * new_height);
            SDL_UnlockTexture(*texture);
        }
        av_log(NULL, AV_LOG_VERBOSE, "Created %dx%d texture with %s.\n", new_width, new_height, SDL_GetPixelFormatName(new_format));
    }
    return 0;
}

When does realloc_texture recreate the texture?

(1) The texture used for display has not been allocated yet.

(2) SDL_QueryTexture fails.

(3) The width, height or format of the current texture differs from that of the new Frame to be displayed.

From this analysis, a change of the window size alone is not enough to make realloc_texture call SDL_CreateTexture again.

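Format conversion: reusing or allocating an SwsContext

sws_getCachedContext()

*img_convert_ctx = sws_getCachedContext(*img_convert_ctx,
    frame->width, frame->height, frame->format,
    frame->width, frame->height, AV_PIX_FMT_BGRA,
    sws_flags, NULL, NULL, NULL);

The function checks its input parameters. The first argument, *img_convert_ctx, corresponds to the parameter struct SwsContext *context. If context is NULL, sws_getContext() is called to obtain a new context.

If context is not NULL, the other input parameters are compared with the parameters stored in the context. If they differ, the context is freed and a new one is allocated with the new parameters. If they are the same, the existing context is used as is.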
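The reuse-or-reallocate rule can be illustrated with a toy context. This is a sketch of the caching pattern only, not of libswscale: ScaleParams, ScaleCtx and the scale_ctx_* functions are hypothetical, and the allocation counter exists only to make the behavior observable.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int src_w, src_h, src_fmt;
    int dst_w, dst_h, dst_fmt;
    int flags;
} ScaleParams;

typedef struct { ScaleParams params; } ScaleCtx;   /* toy stand-in for SwsContext */

static int scale_ctx_allocations = 0;              /* counts real allocations */

static ScaleCtx *scale_ctx_new(const ScaleParams *p)
{
    ScaleCtx *c = malloc(sizeof *c);
    if (!c)
        return NULL;
    c->params = *p;
    scale_ctx_allocations++;
    return c;
}

static int same_params(const ScaleParams *a, const ScaleParams *b)
{
    return a->src_w == b->src_w && a->src_h == b->src_h && a->src_fmt == b->src_fmt &&
           a->dst_w == b->dst_w && a->dst_h == b->dst_h && a->dst_fmt == b->dst_fmt &&
           a->flags == b->flags;
}

/* Return ctx when it already matches p; otherwise free it and allocate a new one. */
static ScaleCtx *scale_ctx_get_cached(ScaleCtx *ctx, const ScaleParams *p)
{
    assert(p != NULL);
    if (ctx && same_params(&ctx->params, p))
        return ctx;          /* reuse */
    free(ctx);               /* free(NULL) is a no-op */
    return scale_ctx_new(p);
}
```

Because the old context is freed inside the call, the caller must always store the returned pointer back into the variable it passed in, which is what the assignment to *img_convert_ctx in upload_texture does.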

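Displaying the image

A texture holds the image data of one frame to be displayed. Once the texture is ready, the following calls display it:

SDL_RenderClear();               // clear the current render target with a given color
SDL_RenderCopy();                // update the current render target with (part of) the texture
SDL_RenderCopyEx();              // like SDL_RenderCopy, but also supports rotation and flipping
SDL_RenderPresent(sdl_renderer); // perform the rendering and update the screen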

2. Image format conversion

The sws_scale() function in FFmpeg is mainly used to convert the pixel format and the resolution of video. Its advantage is that a single function can perform: 1. color space conversion, 2. resolution scaling, 3. filtering before and after. Its drawback is relatively low efficiency, behind libyuv or a shader. The related functions are:

(1) sws_getContext: allocates and returns an SwsContext; the input and output parameters have to be passed in.

(2) sws_getCachedContext: checks whether the context passed in can be reused. If not, a new one is allocated; if so, the one passed in is returned. Allocation is thus handled automatically.

(3) sws_freeContext: frees the SwsContext structure.

(4) sws_scale: converts one frame.

Function reference

/**
 * Allocate and return an SwsContext. You need it to perform
 * scaling/conversion operations using sws_scale().
 *
 * @param srcW the width of the source image
 * @param srcH the height of the source image
 * @param srcFormat the source image format
 * @param dstW the width of the destination image
 * @param dstH the height of the destination image
 * @param dstFormat the destination image format
 * @param flags specify which algorithm and options to use for rescaling
 * @param param extra parameters to tune the used scaler
 *              For SWS_BICUBIC param[0] and [1] tune the shape of the basis
 *              function, param[0] tunes f(1) and param[1] f´(1)
 *              For SWS_GAUSS param[0] tunes the exponent and thus cutoff
 *              frequency
 *              For SWS_LANCZOS param[0] tunes the width of the window function
 * @return a pointer to an allocated context, or NULL in case of error
 * @note this function is to be removed after a saner alternative is
 *       written
 */
struct SwsContext *sws_getContext(int srcW, int srcH, enum AVPixelFormat srcFormat,
                                  int dstW, int dstH, enum AVPixelFormat dstFormat,
                                  int flags, SwsFilter *srcFilter,
                                  SwsFilter *dstFilter, const double *param);

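(1) srcW, srcH, srcFormat: width, height and pixel format of the source data (e.g. YUV420).

(2) dstW, dstH, dstFormat: target width, target height and target pixel format (the size could be the resolution of a phone screen, the format RGBA8888). The call therefore covers both a size conversion and a pixel format conversion.

(3) flags selects one of a set of algorithms (fast bilinear, interpolation-based, matrix-based and so on). The algorithms differ in performance; fast bilinear is comparatively fast. The choice only matters for size changes, not for pixel format conversion. For the efficiency of the different algorithms, see section 3, "sws_scale algorithm performance tests in ffmpeg".

#define SWS_FAST_BILINEAR 1
#define SWS_BILINEAR 2
#define SWS_BICUBIC 4
#define SWS_X 8
#define SWS_POINT 0x10
#define SWS_AREA 0x20
#define SWS_BICUBLIN 0x40

(4) The next two parameters are filters. They are rarely needed; pass NULL. The last parameter relates to the algorithm selected by flags and may also be NULL.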

sws_getCachedContext

/**
 * Check if context can be reused, otherwise reallocate a new one.
 *
 * If context is NULL, just calls sws_getContext() to get a new
 * context. Otherwise, checks if the parameters are the ones already
 * saved in context. If that is the case, returns the current
 * context. Otherwise, frees context and gets a new context with
 * the new parameters.
 *
 * Be warned that srcFilter and dstFilter are not checked, they
 * are assumed to remain the same.
 */
struct SwsContext *sws_getCachedContext(struct SwsContext *context,
                                        int srcW, int srcH, enum AVPixelFormat srcFormat,
                                        int dstW, int dstH, enum AVPixelFormat dstFormat,
                                        int flags, SwsFilter *srcFilter,
                                        SwsFilter *dstFilter, const double *param);

int srcW                     /* width of the input image */
int srcH                     /* height of the input image */
enum AVPixelFormat srcFormat /* pixel format of the input image */
int dstW                     /* width of the output image */
int dstH                     /* height of the output image */
enum AVPixelFormat dstFormat /* pixel format of the output image */
int flags                    /* scaling algorithm (only relevant when input and output sizes differ); SWS_FAST_BILINEAR is the usual choice */
SwsFilter *srcFilter         /* filter for the input image; pass NULL if not needed */
SwsFilter *dstFilter         /* filter for the output image; pass NULL if not needed */
const double *param          /* parameters for specific scaling algorithms; NULL by default */

The only difference between sws_getCachedContext and sws_getContext is the additional struct SwsContext *context argument.

struct SwsContext *img_convert_ctx = NULL;
img_convert_ctx = sws_getCachedContext(img_convert_ctx,
    frame->width, frame->height, frame->format,
    frame->width, frame->height, AV_PIX_FMT_BGRA,
    sws_flags, NULL, NULL, NULL);

sws_scale

/**
 * Scale the image slice in srcSlice and put the resulting scaled
 * slice in the image in dst. A slice is a sequence of consecutive
 * rows in an image.
 *
 * Slices have to be provided in sequential order, either in
 * top-bottom or bottom-top order. If slices are provided in
 * non-sequential order the behavior of the function is undefined.
 *
 * @param c         the scaling context previously created with
 *                  sws_getContext()
 * @param srcSlice  the array containing the pointers to the planes of
 *                  the source slice
 * @param srcStride the array containing the strides for each plane of
 *                  the source image
 * @param srcSliceY the position in the source image of the slice to
 *                  process, that is the number (counted starting from
 *                  zero) in the image of the first row of the slice
 * @param srcSliceH the height of the source slice, that is the number
 *                  of rows in the slice
 * @param dst       the array containing the pointers to the planes of
 *                  the destination image
 * @param dstStride the array containing the strides for each plane of
 *                  the destination image
 * @return          the height of the output slice
 */
int sws_scale(struct SwsContext *c, const uint8_t *const srcSlice[],
              const int srcStride[], int srcSliceY, int srcSliceH,
              uint8_t *const dst[], const int dstStride[]);

(1) SwsContext *c: the conversion context, i.e. the value returned by sws_getContext.

(2) const uint8_t *const srcSlice[]: the data pointer of each color plane of the input image. This is the data[] array of the decoded AVFrame. Because pixel formats are stored differently, the number of entries used in srcSlice[] may differ as well.

Take YUV420P as an example. It is a planar format, laid out in memory as follows:

YYYYYYYY UUUU VVVV

After decoding with FFmpeg, the data is stored in the data[] array of the AVFrame as:

data[0]: Y component, Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8……
data[1]: U component, U1 U2 U3 U4……
data[2]: V component, V1 V2 V3 V4……

The linesize[] array holds the row size of the corresponding plane:

linesize[0]: row size of the Y component.
linesize[1]: row size of the U component.
linesize[2]: row size of the V component.

RGB24 is a packed format. It uses only one entry of the data[] array and is stored as follows:

data[0]: R1 G1 B1 R2 G2 B2 R3 G3 B3 R4 G4 B4……

Note in particular that linesize[0] is not necessarily equal to the width of the image. To align memory for the decoders and speed up CPU access, the actual size may be larger than the image width. This must be kept in mind when programming (for example with OpenGL hardware conversion or rendering), otherwise the decoded image will look wrong.

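(3) const int srcStride[]: the stride of each color plane of the input image, that is, the number of bytes per row of each plane. It corresponds to the linesize[] array of the decoded AVFrame. The stride determines where the next row starts. Stride and width are not necessarily equal, for two reasons:

First, because of the alignment used when storing frames, padding bytes may be appended to every row, so that stride = width + N.

Second, in packed formats the channels of each pixel are interleaved. In RGB24, for example, each pixel occupies 3 consecutive bytes, so reaching the next row means skipping 3*width bytes.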
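Both effects show up when a plane is copied row by row into a tightly packed buffer. The sketch below uses hypothetical helper names (rgb24_row_bytes, copy_plane_packed) and plain memory instead of an AVFrame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes of real pixel data in one RGB24 row (3 bytes per pixel, no padding). */
static int rgb24_row_bytes(int width)
{
    return 3 * width;
}

/* Copy `height` rows of `row_bytes` payload bytes each from a source whose rows
 * start `linesize` bytes apart into a destination without any padding. */
static void copy_plane_packed(uint8_t *dst, const uint8_t *src,
                              int linesize, int row_bytes, int height)
{
    assert(dst != NULL && src != NULL);
    assert(row_bytes > 0 && linesize >= row_bytes && height > 0);
    for (int y = 0; y < height; y++)
        memcpy(dst + (size_t)y * row_bytes,   /* packed rows in the destination */
               src + (size_t)y * linesize,    /* padded rows in the source */
               (size_t)row_bytes);
}
```

For a 2-pixel-wide RGB24 image stored with a linesize of 8, each row carries 6 payload bytes followed by 2 padding bytes, and only the payload ends up in the destination. libavutil ships helpers such as av_image_copy_plane for this kind of copy; the loop here only spells out the addressing.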

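(4) int srcSliceY, int srcSliceH: define the region of the input image to process. srcSliceY is the first row and srcSliceH the number of rows. srcSliceY = 0 with srcSliceH = height processes the whole image in one call. The author describes this design as a way to work in parallel: for example, two threads could be created, the first handling rows [0, h/2-1] and the second rows [h/2, h-1]. Note, however, that the API comment above requires slices to be provided in sequential order, so any such split has to respect that constraint.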
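The row ranges for such a split can be computed as follows. This is only a sketch of the arithmetic for the srcSliceY / srcSliceH pair: Slice and slice_for are hypothetical names, and any alignment a real pixel format may require for slice boundaries is ignored.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int y;   /* first row of the slice (srcSliceY) */
    int h;   /* number of rows in the slice (srcSliceH) */
} Slice;

/* Split `height` rows into `count` consecutive slices and return slice `index`.
 * The slices cover every row exactly once, even when height % count != 0. */
static Slice slice_for(int index, int count, int height)
{
    assert(count > 0 && index >= 0 && index < count && height >= 0);
    int start = (int)((int64_t)height * index / count);
    int end   = (int)((int64_t)height * (index + 1) / count);
    Slice s = { start, end - start };
    return s;
}
```

For a 1080-row image and two slices this gives rows [0, 539] and [540, 1079], matching the [0, h/2-1] and [h/2, h-1] ranges described above.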

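(5) uint8_t *const dst[], const int dstStride[]: describe the output image, namely the data pointer of each output color plane and the number of bytes per row of each plane. They mirror the input parameters above.

sws_freeContext: frees the SwsContext.

/**
 * Free the swscaler context swsContext.
 * If swsContext is NULL, then does nothing.
 */
void sws_freeContext(struct SwsContext *swsContext);

The actual conversion

if (*img_convert_ctx != NULL) {
    uint8_t *pixels[4];
    int pitch[4];
    if (!SDL_LockTexture(*tex, NULL, (void **)pixels, pitch)) {
        sws_scale(*img_convert_ctx, (const uint8_t * const *)frame->data, frame->linesize,
                  0, frame->height, pixels, pitch);
        SDL_UnlockTexture(*tex);
    }
}

The code above consists of three steps:

(1) SDL_LockTexture() locks a rect of the texture (here the whole texture). The locked area is write-only and is used to update the image data. pixels points to the locked area.

(2) sws_scale() converts the image format and writes the converted data into the area pointed to by pixels. pixels holds 4 pointers, which point to a set of image planes.

(3) SDL_UnlockTexture() unlocks the locked area and applies the changed data to the video buffer. After these three steps, the texture contains the new image data produced by the format conversion.

From this analysis, the buffer of the texture can be written to directly, which avoids a second copy.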

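3. sws_scale algorithm performance tests in ffmpeg

The performance tests here are for reference only; results may differ somewhat depending on the configuration and environment.

[Figure (3)]

First, a 1920*1080 landscape image is scaled down to 400*300, 24-bit RGB. The frame rate below is the number of scale-and-render operations per second. (In my tests the rendering time was negligible; most of the time is spent in the scaling algorithm.)

[Figure (4): results of the downscaling test]

Overall, all of the algorithms above seem to give good results after downscaling. Without a side-by-side comparison it is almost impossible to tell them apart. The terms sharp and smooth (blurry) used above describe a visual impression: sharp is not better than smooth, nor is smooth better than sharp. The Point algorithm is strikingly fast, yet its output is not bad. I also compared against scaling at draw time with CImage: its frame rate reaches 190, but the result is terrible, with severe color distortion.

In the second experiment, a 1024*768 landscape image is enlarged to 1920*1080 and rendered (the rendering time is no longer negligible here, but it stays under 5 ms and does not affect the relative accuracy of the conclusions below).

[Figure (5): results of the upscaling test]

Overall, the Point algorithm shows obvious jagged edges, the Area algorithm slightly less so, and the remaining algorithms show no visible difference to the naked eye. With CImage scaling at render time, the frame rate reaches 105 and the quality is similar to Point.

Choosing an approach

My personal suggestion: if scaling has to be fast, for example when processing video frames, and it is not known whether the image will be enlarged or reduced, simply use SWS_FAST_BILINEAR. If the image is known to be reduced for display, the Point algorithm is recommended. If it is known to be enlarged for display, CImage's Stretch is actually more efficient. And if picture quality matters more than speed, choose the algorithm with the lowest frame rate above; it generally gives the best picture.

Overall, though, the scaling algorithms in ffmpeg are very fast, considering that the test material consisted of high-resolution images. (I had meant to upload the images as well, but the differences between the sets are so small that the detail lost to format conversion during upload would probably exceed them, so they are not uploaded here.)

Note: I also tried the efficiency of OpenCV's Resize under the same conditions. In the enlargement test above it managed 52 operations per second, and in the reduction test 458 per second. Enlarging and reducing give different results; enlarging consumes more resources.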

Example FFmpeg commands using different sws_scale() scaling algorithms (bilinear, bicubic, neighbor):

ffmpeg -s 480x272 -pix_fmt yuv420p -i src01_480x272.yuv -s 1280x720 -sws_flags bilinear -pix_fmt yuv420p src01_bilinear_1280x720.yuv
ffmpeg -s 480x272 -pix_fmt yuv420p -i src01_480x272.yuv -s 1280x720 -sws_flags bicubic -pix_fmt yuv420p src01_bicubic_1280x720.yuv
ffmpeg -s 480x272 -pix_fmt yuv420p -i src01_480x272.yuv -s 1280x720 -sws_flags neighbor -pix_fmt yuv420p src01_neighbor_1280x720.yuv

That concludes the analysis in this article.
