Decode video with FFmpeg

You often need to decode video files when creating multimedia software: for example, a media player, or video playback in a game, where the solution has to be cross-platform. FFmpeg is ideal for that purpose. Encoding with FFmpeg is covered in the article Encode file with FFmpeg.

(This article is also available in Russian.)

About FFmpeg

FFmpeg is a free cross-platform software project that produces libraries and programs for handling multimedia data. It is licensed under the LGPL, with some optional components under the GPL. Consequently, you can use FFmpeg in commercial projects, subject to the license terms.

FFmpeg is used in ffmpeg2theora, VLC, MPlayer, Handbrake, Blender, Google Chrome, etc.

Preparatory measures

You need to do some preparatory work to use the FFmpeg libraries. There are two ways:

    1. You can download the FFmpeg source and compile it with gcc. Compilation takes some time, but you will know exactly which parameters were used to build the libraries.

2. The second way, which was chosen by the author of this article, is to download prebuilt binaries. In this article we use the following FFmpeg components: avcodec, avdevice, avformat, avutil, and swscale. The .lib files and header files can be found in the Dev package, and the DLLs in the Shared package.

You need to wrap the FFmpeg headers in extern "C" to include them from C++:

extern "C"
{
     #include "avcodec.h"
}

Once the preparation is done, we can start decoding video. A complete sample covering all the steps is available at the end of the article.

About video files

Video files come in different formats (for example, AVI, WMV, and OGG). A format is, figuratively speaking, a container that determines the structure of the file. A video file is structured as follows:

[Figure: structure of a media file]

The header contains information about the resolution, codecs, number of streams, etc.

Packets are the encoded data. A packet can contain a compressed image or a part of an image, i.e. only the part that differs from the previous frame. A packet can also contain compressed sound. Packets of different streams follow in no particular order.

Indexes hold information about key frames. Key frames are full frames: unlike other frames, which store only the changed part of the image, they contain the entire picture.

Media files often include several audio streams for different languages; they can also contain several video streams, though this is rare.

Decoding media files with FFmpeg

The process of decoding might be viewed as follows:

  • Step 0: Initialize FFmpeg
  • Step 1: Open the file
  • Step 2: Search for streams and open decoders
  • Step 3: Get information about streams
  • Step 4: Decode information
  • Step 5: Process frames
  • Step 6: Close the file

Steps 4 and 5 are repeated for every frame. Let us describe each step separately:

Step 0: Initialize FFmpeg

// Register all components of FFmpeg
av_register_all();

Step 1: Open the file

Open the file and get its format context:

// Open the file
if (avformat_open_input(&pFormatCtx, inputFile.c_str(), NULL, NULL) != 0)
   return false;
// Get information about the streams
if (avformat_find_stream_info(pFormatCtx, NULL) < 0)
   return false;

Step 2: Search for streams and open decoders

Find the video and audio streams and the decoders for them. If a stream was encoded with an unknown codec, we get an error.

Find the video stream and open its decoder:

// Find the video stream
videoStreamIndex = -1;
for (unsigned int i = 0; i < pFormatCtx->nb_streams; i++)
{
   if (pFormatCtx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO)
   {
      videoStreamIndex = i;
      pVideoCodecCtx = pFormatCtx->streams[i]->codec;
      // Find the decoder
      pVideoCodec = avcodec_find_decoder(pVideoCodecCtx->codec_id);
      if (pVideoCodec)
      {
         // Open the decoder
         res    = !(avcodec_open2(pVideoCodecCtx, pVideoCodec, NULL) < 0);
         width  = pVideoCodecCtx->coded_width;
         height = pVideoCodecCtx->coded_height;
      }
      break;
   }
}

Searching for the audio stream is done the same way, but you have to use AVMEDIA_TYPE_AUDIO.

Step 3: Get information about streams

We get information about the streams, such as resolution, duration, and FPS.

Get information about the video stream:

// FPS
videoFramePerSecond = av_q2d(pFormatCtx->streams[videoStreamIndex]->r_frame_rate);
// Base time unit
videoBaseTime = av_q2d(pFormatCtx->streams[videoStreamIndex]->time_base);
// Duration of the video clip, in frames
videoDuration = (unsigned long)(pFormatCtx->streams[videoStreamIndex]->duration *
                                videoFramePerSecond * videoBaseTime);
// Frame width
width = pFormatCtx->streams[videoStreamIndex]->codec->width;
// Frame height
height = pFormatCtx->streams[videoStreamIndex]->codec->height;

Getting information about the audio stream is done the same way, except that audio has no width and height.

Step 4: Decode information

First, we read a packet and find which stream it belongs to. After that, we decode the packet. Audio and video packets are decoded by different functions. To continue the explanation, we need some additional information about the time format.

Time in FFmpeg is stored in special units defined by a time base (called BaseTime in this article). To convert a timestamp to seconds, multiply it by the time base. Each stream has its own time base.

To read a packet, use this function:

while (av_read_frame(pFormatCtx, &packet) >= 0)

If the function returns a value less than zero, the packet could not be read; this usually means the end of the file.

Here is the code to check whether the current packet belongs to the video stream:

if(packet.stream_index == videoStreamIndex)

where videoStreamIndex is the index of the video stream.

Decode the frame:

// Decode a packet into a frame
AVFrame* pOutFrame = avcodec_alloc_frame();
int got_picture_ptr = 0;
int videoFrameBytes = avcodec_decode_video2(pVideoCodecCtx, pOutFrame,
                                            &got_picture_ptr, &packet);

If decoding succeeded, got_picture_ptr is greater than zero and the function returns a value greater than zero (the number of decoded bytes). If the returned value is greater than zero but got_picture_ptr is zero, it is not an error: the decoder simply cannot produce a complete frame yet.

To decode audio we use this function:

int got_frame_ptr = 0;
int packetDecodedSize = avcodec_decode_audio4(pAudioCodecCtx, audioFrame,
                                              &got_frame_ptr, &packet);

Digital sound is stored as a sequence of discrete samples. Information about the audio format is available in the stream structure. For example, in the format AV_SAMPLE_FMT_FLTP each sample is a float from -1.0 to 1.0, and each channel (left or right) is stored in a separate plane; the pointer to each plane is held in audioFrame->extended_data[i]. In the example you can find code that converts AV_SAMPLE_FMT_FLTP to AV_SAMPLE_FMT_S16.

Step 5: Process frame

There are various video frame formats: RGB and YUV (packed and planar). YUV is used most often. If you need to process the frame in RGB, you can use the swscale library, which is part of FFmpeg. Here is a sample:

// Create a conversion context
pImgConvertCtx = sws_getContext(pVideoCodecCtx->width, pVideoCodecCtx->height,
                                pVideoCodecCtx->pix_fmt,
                                pVideoCodecCtx->width, pVideoCodecCtx->height,
                                PIX_FMT_RGB24, SWS_BICUBIC, NULL, NULL, NULL);
// Convert the frame
sws_scale(pImgConvertCtx, pFrameYuv->data, pFrameYuv->linesize,
          0, pVideoCodecCtx->height, frame->data, frame->linesize);

Step 6: Close the file

When we close the file, we need to release the resources:

// Close the video codec
avcodec_close(pVideoCodecCtx);
// Close the audio codec
avcodec_close(pAudioCodecCtx);
// Close the file
avformat_close_input(&pFormatCtx);

Example of using FFmpeg

This article would not be complete without an example. You can download a sample of using FFmpeg or get it from GitHub/UnickSoft. The example shows the basic FFmpeg functions: the program opens a video file and stores its first 50 frames to disk. You can change the defines to configure the program:

#define FILE_NAME "C:\\temp\\test.avi"
#define OUTPUT_FILE_PREFIX "c:\\temp\\image%d.bmp"
#define FRAME_COUNT 50

For the Release configuration you should turn on the option "References: Keep Unreferenced Data (/OPT:NOREF)" or add /OPT:NOREF to the linker command line.

Links

- The official site of FFmpeg.
- About FFmpeg on Wikipedia.
- The FFmpeg project for Windows.
- A video player in 1,000 lines: about using FFmpeg in more detail.

Tags: C++, Video, FFmpeg, English