iOS Video How-to

Mobile Development, Software development, Video Streaming / 14.03.2017

Hi! My name is Sergii Simakhin and I’ve been an iOS software engineer at Cayugasoft since the end of 2013.

I love creating native iOS apps of any complexity, because every new project is a challenge for me 😀. At Cayugasoft, some of my projects included media features such as streaming, converting, and trimming videos, working with music and with data from the camera and microphone, mixing all of them, and so on. So I decided to walk you through the basics of merging videos using AVFoundation.


To be honest, this is not as easy as it looks. But it’s not quite hell.

The example I’ll show you is written in Objective-C, the language that I’ve really loved working with since 2013.

The Example

How to add background audio to the video clip

In this example I’ll show you how to add background music to a video clip.

As a bonus: how to add captions on top of the video. Interested?

First, you’ll need to import AVFoundation at the top of your .m file:
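For reference, the import is a single line (with modules enabled, `@import AVFoundation;` should work as well):

```objc
#import <AVFoundation/AVFoundation.h>
```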

Easy, yeah? Stop smiling, I’m just getting started. 😀

Now you can create a void method where we will do all our magic.

The next step is to create a composition, which will contain all our audio and video tracks.

I’ll call it mixComposition.
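A minimal sketch of these two steps could look like this. The class name `VideoMerger` and the method name `mergeVideo` are hypothetical, chosen only for illustration; only `mixComposition` comes from the text above.

```objc
#import <AVFoundation/AVFoundation.h>

// VideoMerger and mergeVideo are hypothetical names used only in this sketch.
@interface VideoMerger : NSObject
- (void)mergeVideo;
@end

@implementation VideoMerger

- (void)mergeVideo {
    // An empty, mutable container. Video and audio tracks are added to it
    // in the following steps.
    AVMutableComposition *mixComposition = [AVMutableComposition composition];
    NSLog(@"Tracks in the composition so far: %lu",
          (unsigned long)mixComposition.tracks.count);
}

@end
```

The helper functions sketched in the rest of the article could be called from inside such a method.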

Now ensure that you have prepared the video file’s NSURL object. We will need it now.


Great! The next step is to extract a video track and audio track from the original clip.
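Here is a sketch of both steps: wrapping the NSURL in an asset and pulling out its tracks. `FirstTrack` and `LogTracksOfVideo` are hypothetical helper names, and the sketch assumes the clip has at most one track of each kind that we care about. It uses the synchronous `tracksWithMediaType:` call; newer SDKs prefer the asynchronous track-loading APIs.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: returns the first track of the given media type,
// or nil if the asset has no such track.
static AVAssetTrack *FirstTrack(AVAsset *asset, NSString *mediaType) {
    return [[asset tracksWithMediaType:mediaType] firstObject];
}

// Hypothetical usage: videoURL is the NSURL of the original clip.
static void LogTracksOfVideo(NSURL *videoURL) {
    AVURLAsset *videoAsset = [AVURLAsset URLAssetWithURL:videoURL options:nil];
    AVAssetTrack *videoTrack = FirstTrack(videoAsset, AVMediaTypeVideo);
    AVAssetTrack *audioTrack = FirstTrack(videoAsset, AVMediaTypeAudio);
    // Either track may be nil, for example when the clip has no sound.
    NSLog(@"video track: %@, audio track: %@", videoTrack, audioTrack);
}
```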

Now it’s time to go back to our previously created mixComposition. We need to create empty composition tracks first and then fill them with the original tracks we just extracted.

I’ll pass nil for the error parameters; let’s imagine that we have an ideal situation.
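As a sketch, one helper can do this for both the video and the audio track. `AddTrackToComposition` is a hypothetical name. Unlike the ideal-situation version described above, this sketch does capture the error, which tends to save time when debugging.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: adds an empty track of the right media type to the
// composition and fills it with the given time range of sourceTrack.
// Returns nil if there is no source track or the insertion fails.
static AVMutableCompositionTrack *AddTrackToComposition(AVMutableComposition *composition,
                                                        AVAssetTrack *sourceTrack,
                                                        CMTimeRange timeRange) {
    if (sourceTrack == nil) {
        return nil;
    }
    AVMutableCompositionTrack *track =
        [composition addMutableTrackWithMediaType:sourceTrack.mediaType
                                 preferredTrackID:kCMPersistentTrackID_Invalid];
    NSError *error = nil;
    BOOL inserted = [track insertTimeRange:timeRange
                                   ofTrack:sourceTrack
                                    atTime:kCMTimeZero
                                     error:&error];
    if (!inserted) {
        NSLog(@"Could not insert track: %@", error);
        [composition removeTrack:track];
        return nil;
    }
    return track;
}

// Possible usage, assuming videoAsset, videoTrack and audioTrack exist:
//   CMTimeRange fullRange = CMTimeRangeMake(kCMTimeZero, videoAsset.duration);
//   AddTrackToComposition(mixComposition, videoTrack, fullRange);
//   AddTrackToComposition(mixComposition, audioTrack, fullRange);
```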

Now our mixComposition contains the original video clip. Adding background music to it will look straightforward.

Just as with the video, we need to turn the audio file’s NSURL into an AVFoundation asset object. Don’t forget to extract the AVAssetTrack as I do below:

Going back to the mixComposition, let’s add a new audio track to it and fill it with the background audio track:
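A sketch of the background-music step follows. `AddBackgroundAudio` is a hypothetical name, and trimming the music to the length of the video is my assumption here; you may prefer different behavior.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: adds the first audio track of audioAsset to the
// composition, trimmed so that it is no longer than the video.
static BOOL AddBackgroundAudio(AVMutableComposition *mixComposition,
                               AVAsset *audioAsset,
                               CMTime videoDuration) {
    AVAssetTrack *musicTrack =
        [[audioAsset tracksWithMediaType:AVMediaTypeAudio] firstObject];
    if (musicTrack == nil) {
        return NO; // the file contains no audio track
    }
    AVMutableCompositionTrack *compositionMusicTrack =
        [mixComposition addMutableTrackWithMediaType:AVMediaTypeAudio
                                    preferredTrackID:kCMPersistentTrackID_Invalid];
    // Assumption: the music should not run past the end of the video.
    CMTime duration = CMTimeMinimum(audioAsset.duration, videoDuration);
    NSError *error = nil;
    BOOL inserted = [compositionMusicTrack insertTimeRange:CMTimeRangeMake(kCMTimeZero, duration)
                                                   ofTrack:musicTrack
                                                    atTime:kCMTimeZero
                                                     error:&error];
    if (!inserted) {
        NSLog(@"Could not add background audio: %@", error);
        [mixComposition removeTrack:compositionMusicTrack];
    }
    return inserted;
}

// Possible usage, assuming audioURL points to the music file:
//   AVURLAsset *audioAsset = [AVURLAsset URLAssetWithURL:audioURL options:nil];
//   AddBackgroundAudio(mixComposition, audioAsset, videoAsset.duration);
```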

Now you are all set, buddy.

But it’s just the beginning! 😈

Keep moving

How to add captions to the video clip

Adding captions is also straightforward (if you know what to do from this point 😄). It’s like working with multiple layers in Photoshop, where the video is on the bottom layer and all other layers (text, picture in picture, whatever) are on top. In our case the layers will be CALayer objects.

Oh, and don’t forget to import QuartzCore!

Okay, first let’s define the caption height: it will be no taller than 40px. Of course, you can change it to any value at any time.

Along with it I’ll make a block that will help us generate a CATextLayer based on the provided text and Y-offset (if you have no idea what’s going on below, feel free to read up on Objective-C blocks first):

Now let’s generate our header and footer text layers. For the header I’ll use a zero Y-offset, which means that it will be pinned to the top. The footer will be pinned to the bottom.
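The caption height, the block, and the two text layers can be sketched together. `MakeCaptionLayers`, the font size, and the colors are assumptions made for illustration; `videoSize` would typically come from the video track's `naturalSize`.

```objc
#import <AVFoundation/AVFoundation.h>
#import <QuartzCore/QuartzCore.h>
#import <UIKit/UIKit.h> // only for UIColor

// Hypothetical helper: returns @[header, footer] caption layers.
static NSArray<CATextLayer *> *MakeCaptionLayers(CGSize videoSize,
                                                 NSString *headerText,
                                                 NSString *footerText) {
    const CGFloat captionHeight = 40.0;

    // The block builds one full-width caption at the given Y-offset.
    CATextLayer *(^makeTextLayer)(NSString *, CGFloat) =
        ^CATextLayer *(NSString *text, CGFloat yOffset) {
            CATextLayer *layer = [CATextLayer layer];
            layer.string = text;
            layer.fontSize = 24.0; // assumed value that fits into 40px
            layer.alignmentMode = kCAAlignmentCenter;
            layer.foregroundColor = [UIColor whiteColor].CGColor;
            layer.backgroundColor =
                [[UIColor blackColor] colorWithAlphaComponent:0.5].CGColor;
            layer.frame = CGRectMake(0, yOffset, videoSize.width, captionHeight);
            return layer;
        };

    // Note: which edge a zero offset lands on depends on whether the layer
    // geometry is flipped during export, so verify it on a device and swap
    // the two offsets if the captions come out reversed.
    CATextLayer *header = makeTextLayer(headerText, 0);
    CATextLayer *footer = makeTextLayer(footerText, videoSize.height - captionHeight);
    return @[header, footer];
}
```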

So far so good! Now let’s create a parent layer and put our captions on it.

It’s time to manage all layers and hierarchy.
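One way to arrange the hierarchy is sketched below: an overlay layer holds the captions, and the parent layer holds the video layer at the bottom with the overlay above it. `MakeParentLayer` is a hypothetical name, and the video layer is just an empty CALayer that AVFoundation later renders the frames into.

```objc
#import <QuartzCore/QuartzCore.h>

// Hypothetical helper: builds parentLayer -> [videoLayer, overlayLayer],
// where overlayLayer contains all caption layers.
static CALayer *MakeParentLayer(CGSize videoSize,
                                CALayer *videoLayer,
                                NSArray<CALayer *> *captionLayers) {
    CGRect fullFrame = CGRectMake(0, 0, videoSize.width, videoSize.height);

    CALayer *overlayLayer = [CALayer layer];
    overlayLayer.frame = fullFrame;
    for (CALayer *caption in captionLayers) {
        [overlayLayer addSublayer:caption];
    }

    videoLayer.frame = fullFrame;

    CALayer *parentLayer = [CALayer layer];
    parentLayer.frame = fullFrame;
    [parentLayer addSublayer:videoLayer];   // bottom: the video frames
    [parentLayer addSublayer:overlayLayer]; // top: the captions
    return parentLayer;
}
```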


We’re finished with captioning. I had hoped that it would be more intuitive.

Now we have one main layer (I call it the ‘parent layer’) which contains the video layer at the bottom and the overlay on top.

That's not all yet

Prepare to export


Don’t go, we’re almost done! 😄

Now we need to combine everything we’ve done so far: mixComposition, captions, etc. But first, a video composition instruction needs to be made.
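A sketch of this step: the instruction covers the whole clip, and the Core Animation tool ties the layer tree to the rendered video. `MakeVideoComposition` is a hypothetical name and the 30 fps frame duration is an assumed value. The `videoLayer` passed in must already be a sublayer of `parentLayer`.

```objc
#import <AVFoundation/AVFoundation.h>
#import <QuartzCore/QuartzCore.h>

// Hypothetical helper: describes how to render the composition's video track
// together with the caption layers.
static AVMutableVideoComposition *MakeVideoComposition(AVAssetTrack *compositionVideoTrack,
                                                       CGSize renderSize,
                                                       CMTime duration,
                                                       CALayer *videoLayer,
                                                       CALayer *parentLayer) {
    AVMutableVideoCompositionLayerInstruction *layerInstruction =
        [AVMutableVideoCompositionLayerInstruction
            videoCompositionLayerInstructionWithAssetTrack:compositionVideoTrack];

    // One instruction that spans the whole clip.
    AVMutableVideoCompositionInstruction *instruction =
        [AVMutableVideoCompositionInstruction videoCompositionInstruction];
    instruction.timeRange = CMTimeRangeMake(kCMTimeZero, duration);
    instruction.layerInstructions = @[layerInstruction];

    AVMutableVideoComposition *videoComposition =
        [AVMutableVideoComposition videoComposition];
    videoComposition.renderSize = renderSize;
    videoComposition.frameDuration = CMTimeMake(1, 30); // assumed 30 fps
    videoComposition.instructions = @[instruction];
    // Video frames are drawn into videoLayer, then parentLayer is rendered.
    videoComposition.animationTool =
        [AVVideoCompositionCoreAnimationTool
            videoCompositionCoreAnimationToolWithPostProcessingAsVideoLayer:videoLayer
                                                                    inLayer:parentLayer];
    return videoComposition;
}
```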

Now we have everything we need to export video. Finally!

Let’s create an export session and set the output quality, file type, and location. Don’t forget to prepare the output URL!

From this moment we are ready to launch the export. It runs asynchronously, and the completion handler is called after rendering is done.
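The export can be sketched as one helper. `ExportComposition` and its completion signature are hypothetical, and the preset and file type are example choices rather than requirements. The sketch removes any existing file at the output URL first, because the export fails when the destination already exists.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: exports the composition to outputURL and reports the
// result on the main queue.
static void ExportComposition(AVAsset *mixComposition,
                              AVVideoComposition *videoComposition,
                              NSURL *outputURL,
                              void (^completion)(BOOL success, NSError *error)) {
    // A leftover file from a previous run would make the export fail.
    [[NSFileManager defaultManager] removeItemAtURL:outputURL error:nil];

    AVAssetExportSession *session =
        [[AVAssetExportSession alloc] initWithAsset:mixComposition
                                         presetName:AVAssetExportPresetHighestQuality];
    session.outputURL = outputURL;
    session.outputFileType = AVFileTypeQuickTimeMovie; // example: a .mov file
    session.videoComposition = videoComposition;       // carries the captions

    [session exportAsynchronouslyWithCompletionHandler:^{
        BOOL success = (session.status == AVAssetExportSessionStatusCompleted);
        NSError *error = session.error;
        dispatch_async(dispatch_get_main_queue(), ^{
            if (completion != nil) {
                completion(success, error);
            }
        });
    }];
}

// Possible usage, writing into the temporary directory:
//   NSURL *outputURL = [NSURL fileURLWithPath:
//       [NSTemporaryDirectory() stringByAppendingPathComponent:@"merged.mov"]];
//   ExportComposition(mixComposition, videoComposition, outputURL,
//                     ^(BOOL success, NSError *error) {
//       NSLog(@"Export %@: %@", success ? @"finished" : @"failed", error);
//   });
```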

Done ✅

Now you can test it, but please remember, TRY IT ON A REAL DEVICE ONLY (captioning DOES NOT WORK on a simulator).

I hope this example will help you understand the basics of AVFoundation.

That's it!


I did it without captions, but you’ll definitely love the result. Just look!

Debug result 😄

Isn’t it cool, ha? 😎
P.S. Why didn’t I write this article for Swift? Oh, dear 🙃 I have already adapted the same code to Swift in another project, but since the language is so new, I need to rewrite it every time a new version comes out. So, believe me, things like this, which should work for many years without changing, are better written in Objective-C.

