Editing Audio and Subtitle Tracks in .mkv Files with ffmpeg

I have been trying to consolidate a bunch of servers recently, and one of the ones that I am looking to sunset is a box that has been used for BD ripping and media encoding.  The reason it HASN’T been sunset yet is that, well, there are a bunch of things on it that have been ripped but not yet encoded for playback on an AppleTV, and I needed to get on that.

Anyway, that’s in process, and I was deleting a bunch of source files that had already been encoded and that’s when I ran across Maken-Ki.

If you’ve never seen it, it’s a really forgettable show whose chief draw is a violent pettanko who may or may not be a dragon.  I kind of regret watching both seasons of it, and I am very unlikely to ever watch it again, but finding the source files made me remember something about the series that had REALLY annoyed me: a frustrating typo in the subtitles in one of the episodes.

So I thought to myself, how hard could this be to fix?  I’ve got ffmpeg, which is like a Swiss army knife on steroids when it comes to video file manipulation, so I’m sure I can sort it out.

What could possibly go wrong?

My first step was to look at how the original file was laid out, which you can do just by throwing it into ffmpeg, like so:

ffmpeg -i MakenKiS02E08.mkv

And this gives a lot of output:

    Stream #0:0: Video: hevc (Main 10), yuv420p10le(tv), 1920x1080 [SAR 1:1 DAR 16:9], 23.98 fps, 23.98 tbr, 1k tbn, 23.98 tbc (default)
      _STATISTICS_WRITING_APP-eng: mkvmerge v30.1.0 (‘Forever And More’) 64-bit
      _STATISTICS_WRITING_DATE_UTC-eng: 2019-02-10 22:33:00
      BPS-eng         : 2583237
      DURATION-eng    : 00:23:42.046000000
      NUMBER_OF_BYTES-eng: 459185284
      NUMBER_OF_FRAMES-eng: 34095

    Stream #0:1(eng): Audio: aac (LC), 48000 Hz, 5.1, fltp (default)
      title           : English 5.1 channel AAC
      _STATISTICS_WRITING_APP-eng: mkvmerge v30.1.0 (‘Forever And More’) 64-bit
      _STATISTICS_WRITING_DATE_UTC-eng: 2019-02-10 22:33:00
      BPS-eng         : 440524
      DURATION-eng    : 00:23:42.036000000
      NUMBER_OF_BYTES-eng: 78305206
      NUMBER_OF_FRAMES-eng: 66658

    Stream #0:2(jpn): Audio: aac (LC), 48000 Hz, stereo, fltp
      title           : Japanese 2.0 channel AAC
      _STATISTICS_WRITING_APP-eng: mkvmerge v30.1.0 (‘Forever And More’) 64-bit
      _STATISTICS_WRITING_DATE_UTC-eng: 2019-02-10 22:33:00
      BPS-eng         : 179997
      DURATION-eng    : 00:23:42.087000000
      NUMBER_OF_BYTES-eng: 31996579
      NUMBER_OF_FRAMES-eng: 66661

    Stream #0:3(zxx): Subtitle: ass (default)
      title           : Signs/Karaoke [Hatsuyuki] – [WHW]
      _STATISTICS_WRITING_APP-eng: mkvmerge v30.1.0 (‘Forever And More’) 64-bit
      _STATISTICS_WRITING_DATE_UTC-eng: 2019-02-10 22:33:00
      BPS-eng         : 470
      DURATION-eng    : 00:23:04.820000000
      NUMBER_OF_BYTES-eng: 81502
      NUMBER_OF_FRAMES-eng: 601

    Stream #0:4(eng): Subtitle: ass
      title           : [Hatsuyuki] – [WHW]
      _STATISTICS_WRITING_APP-eng: mkvmerge v30.1.0 (‘Forever And More’) 64-bit
      _STATISTICS_WRITING_DATE_UTC-eng: 2019-02-10 22:33:00
      BPS-eng         : 562
      DURATION-eng    : 00:23:37.210000000
      NUMBER_OF_BYTES-eng: 99571
      NUMBER_OF_FRAMES-eng: 897

    Stream #0:5: Attachment: ttf
      filename        : HeyGorgeous.ttf
      mimetype        : application/x-truetype-font

    Stream #0:6: Attachment: ttf
      filename        : INCOLHUA_R.ttf
      mimetype        : application/x-truetype-font

I’ve cut a lot out of this because what I want to know about is the Streams, which are all of the different components that make up the video file.  This file has one video stream, two audio streams, two subtitle streams and a bunch of embedded fonts.  I’ve cut off all the fonts after the first couple because the output was already long enough.
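If you’d rather not trim the output by hand, ffprobe (which ships alongside ffmpeg) can print just the fields that matter here.  This is a rough sketch rather than what I actually ran; the function name list_streams is made up for illustration.

```shell
# Print one compact line per stream: index, codec, type, and the
# language/title tags if the stream has them.
# list_streams is a hypothetical helper name, not part of ffmpeg.
list_streams() {
    ffprobe -v error \
        -show_entries stream=index,codec_type,codec_name:stream_tags=language,title \
        -of compact=p=0 \
        "$1"
}

# Usage: list_streams MakenKiS02E08.mkv
```

The index it reports is the same number that goes after the colon in a -map argument.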

The English audio track and the “Signs/Karaoke” subtitle track are unnecessary, so let’s get rid of those first:

ffmpeg -i MakenKiS02E08.mkv -map 0 -map -0:1 -map -0:3 -acodec copy -vcodec copy -scodec copy tt.mkv

Breaking this down:

ffmpeg -i

just tells ffmpeg which file to read.

-map 0

selects every stream from file 0 for the output file.  ffmpeg starts counting everything from 0.  It’s annoying but you get used to it.  Except, we don’t want the English audio or signs and karaoke subtitles, so we use two more -map options to exclude those.

-map -0:1 -map -0:3

tells ffmpeg to drop streams 1 and 3 from file 0.  The minus sign in front of the stream specifier is what turns a -map from “include this” into “exclude this.”

-acodec copy -vcodec copy -scodec copy

tells ffmpeg to copy the remaining audio, video, and subtitle streams without transcoding them.  This is very fast and doesn’t affect quality.
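If you have a pile of files with the same layout, the -map arguments can be generated instead of typed.  Here’s a minimal sketch of that idea; build_map_args is a name I made up, and it only prints the command as a dry run.  It also uses -c copy, which as far as I know is just the shorthand for copying every stream type at once.

```shell
# Build "-map 0 -map -0:N ..." with one negative map per stream index to drop.
# build_map_args is a hypothetical helper, not an ffmpeg feature.
build_map_args() {
    args="-map 0"
    for idx in "$@"; do
        args="$args -map -0:$idx"
    done
    printf '%s\n' "$args"
}

# Dry run: print the command instead of running it.
echo ffmpeg -i MakenKiS02E08.mkv $(build_map_args 1 3) -c copy tt.mkv
# prints: ffmpeg -i MakenKiS02E08.mkv -map 0 -map -0:1 -map -0:3 -c copy tt.mkv
```

Drop the echo once the printed command looks right.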

Finally, the last thing on your ffmpeg command line is the filename you want ffmpeg to write to.  “tt.mkv” is just my regular shorthand for temporary output files.

This cheerfully made a new mkv file, and I threw it into VLC to confirm that the subtitles and audio I wanted were present.  It was also 80 megabytes smaller without the English audio so you could use this to reduce the size of mkv files where you don’t care about some of the languages.
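You can sanity-check that number from the NUMBER_OF_BYTES tags in the stream listing.  A quick back-of-the-envelope sketch, using the values for the two dropped streams and ignoring container overhead:

```shell
# NUMBER_OF_BYTES for the two streams that were dropped (from the listing above).
audio_bytes=78305206    # Stream #0:1, English 5.1 AAC
signs_bytes=81502       # Stream #0:3, Signs/Karaoke subtitles

# Integer megabytes (1 MB = 1,000,000 bytes).
saved_mb=$(( (audio_bytes + signs_bytes) / 1000000 ))
echo "$saved_mb"
# prints: 78
```

So roughly 78 MB, nearly all of it the audio track, which lines up with the “80 megabytes” I eyeballed.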

Then I needed to separate the subtitle file from the mkv file so I could make changes.

ffmpeg -i tt.mkv -map 0 -map -0:2 -acodec copy -vcodec copy tt_nosubs.mkv

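Again I’m using a negative -map, this time -map -0:2, to exclude a stream.  Note that the numbering has shifted: with the English audio and the signs track gone, tt.mkv holds the video at 0:0, the Japanese audio at 0:1, and the subtitles at 0:2, so stream 2 is now the subtitle stream.  This gives me a file with just video, audio, and fonts.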
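The index bookkeeping is easy to get wrong, so here is a small sketch that works out where each surviving stream ends up.  It assumes ffmpeg keeps the remaining streams in their original order, which is what I saw, and it only counts the seven streams shown in the listing (the real file has more fonts).  The renumber function is purely illustrative.

```shell
# Print "old -> new" for every stream that survives.
# Arguments: total stream count, then the indices that were dropped.
renumber() {
    total=$1; shift
    old=0
    new=0
    while [ "$old" -lt "$total" ]; do
        keep=1
        for d in "$@"; do
            [ "$d" -eq "$old" ] && keep=0
        done
        if [ "$keep" -eq 1 ]; then
            echo "$old -> $new"
            new=$((new + 1))
        fi
        old=$((old + 1))
    done
}

renumber 7 1 3
# prints:
# 0 -> 0
# 2 -> 1
# 4 -> 2
# 5 -> 3
# 6 -> 4
```

The old stream 4 (the English dialogue subtitles) lands at index 2, which is the one being excluded here.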

I also extracted the subtitle track to a text file with a .ass extension, like so:

ffmpeg -i tt.mkv -scodec copy script.ass

ffmpeg is smart enough to know that you can’t put video or audio into a .ass file so it drops those and we’re left with just the script.
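That works because tt.mkv only has one subtitle track left.  If a file had several, I’d want to be explicit about which one to pull.  A hedged sketch, where extract_subs is a hypothetical wrapper and 0:s:0 means “the first subtitle stream of file 0”:

```shell
# Extract one subtitle stream, chosen by its position among the
# subtitle streams (0 = first), without re-encoding it.
# extract_subs is a made-up helper name.
extract_subs() {
    # $1 = input mkv, $2 = subtitle position, $3 = output .ass file
    ffmpeg -i "$1" -map "0:s:$2" -scodec copy "$3"
}

# Usage: extract_subs tt.mkv 0 script.ass
```

The s in the specifier counts only subtitle streams, so it doesn’t matter how many audio tracks sit in front of it.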

After that, I made my changes to the script and needed to mash it all back into one file.
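I fixed mine in a text editor, but for the record the same thing can be done from the command line.  This is only a sketch: the typo here (“teh” for “the”) and the sample line are stand-ins, not the actual error from the episode, and fix_typo is a made-up name.

```shell
# Apply a substitution only to Dialogue lines, leaving styles and
# script headers alone. The "teh" -> "the" typo is hypothetical.
fix_typo() {
    sed '/^Dialogue:/ s/teh/the/g' "$1"
}

# Tiny stand-in for a real script, just to show the effect.
printf '%s\n' \
    '[Events]' \
    'Dialogue: 0,0:01:02.00,0:01:04.00,Default,,0,0,0,,What is teh matter?' \
    > sample.ass

fix_typo sample.ass
```

For the real file that would be fix_typo script.ass > script_fixed.ass, then moving the fixed file back over script.ass once it looks right.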

ffmpeg -i tt_nosubs.mkv -i script.ass -map 0 -map 1 -acodec copy -vcodec copy -scodec copy ttt.mkv

Two -i options tell ffmpeg to read from two input files, and two -map options tell it to take all streams from file 0 (the mkv file) and all streams from file 1 (the subtitles file).  Everything gets copied without transcoding into another file called ttt.mkv.
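One thing I’d expect to go missing on this round trip is the language tag, since a bare .ass file has nowhere to store it.  I didn’t bother, but if you care, something along these lines should put it back and mark the track as default.  Treat it as a sketch; remux_with_subs is a hypothetical name and I’m assuming English subtitles.

```shell
# Remux video/audio/fonts with an external subtitle script, tagging the
# first subtitle stream of the output as English and as the default track.
# remux_with_subs is a made-up helper name.
remux_with_subs() {
    # $1 = mkv without subtitles, $2 = .ass script, $3 = output mkv
    ffmpeg -i "$1" -i "$2" -map 0 -map 1 -c copy \
        -metadata:s:s:0 language=eng \
        -disposition:s:0 default \
        "$3"
}

# Usage: remux_with_subs tt_nosubs.mkv script.ass ttt.mkv
```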

I am not particularly inventive with my file names.

I was then able to throw ttt.mkv through Handbrake and it gave me a lovely AppleTV-compatible .m4v file that I will probably never actually watch because Maken-Ki was not really worth a second watch, but damnit I fixed the typo and this was very important to me.

Though Himegami – that’s the violent pettanko’s name, had to look it up – WAS a hilarious character and it might be worth watching a couple episodes.  Someday.  When I get through all of the other shows in my queue.
