Purpose

One day I set out to make a photo VCD (a.k.a. stills VCD) for my parents-in-law. Hardware VCD players (and of course nowadays DVD players) are commonplace in most Chinese households. That is what motivated the creation of a stills VCD instead of a simple CD-ROM or an MPEG-4 based solution, either of which would require a PC. And of course I wanted to do this under Linux!

The result is the possibility to create an MPEG stream (video only) from JPEG pictures stored in a given directory. Furthermore, these pictures do NOT all need to have the same size. So if you just bought yourself an X-megapixel camera but are still stuck with tons of low-resolution JPEGs, you can put them all in your slide show without having to resize them first.

Restrictions

The patched tools have exactly the same restrictions as the original Mjpegtools of course. That is:
  1. The size of the input images must be EVEN
  2. jpeg2yuv (and therefore jpeg2yuvstills) tends to crash on some otherwise innocent-looking images. I do not know whether this is a bug or another constraint on the input format.
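
Restriction 1 can be checked before the pipeline is started. The following is only a minimal sketch and not part of the patched tools: it reads the width and height from the start-of-frame header of a JPEG and reports whether both are even. The names jpeg_size and has_even_size are made up for this illustration, and the sketch assumes that "size" in restriction 1 means both the width and the height.

```python
import struct

# Start-of-frame markers (SOF0..SOF15, without DHT 0xC4, JPG 0xC8 and DAC 0xCC).
SOF_MARKERS = {0xC0, 0xC1, 0xC2, 0xC3, 0xC5, 0xC6, 0xC7,
               0xC9, 0xCA, 0xCB, 0xCD, 0xCE, 0xCF}


def jpeg_size(data):
    """Return (width, height) read from the first SOF segment of a JPEG."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file (missing SOI marker)")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:
            # Fill byte before the real marker: skip it.
            pos += 1
            continue
        if marker == 0x01 or 0xD0 <= marker <= 0xD9:
            # Stand-alone markers carry no length field.
            pos += 2
            continue
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            fields = data[pos + 5:pos + 9]  # skip the 1-byte sample precision
            if len(fields) != 4:
                raise ValueError("truncated SOF segment")
            height, width = struct.unpack(">HH", fields)
            return width, height
        # The length field counts itself but not the two marker bytes.
        pos += 2 + length
    raise ValueError("no SOF segment found")


def has_even_size(data):
    """True if both the width and the height of the JPEG are even."""
    width, height = jpeg_size(data)
    return width % 2 == 0 and height % 2 == 0
```

To use it, read a file in binary mode and pass its contents, for example has_even_size(open(path, "rb").read()) for every file in the input directory.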

Why patch the Mjpegtools utilities

After a few hours of Google searching I found very little: some scripts (mostly bash and Perl) but no free all-in-one solution. The thing that came closest to what I was looking for was the Mjpegtools. The manual has a section dedicated to the creation of an MPEG stream from separate images. That sounded good, but the software had some limitations:
  1. The input files have to be specified using a rather simple syntax, which assumes images coming straight from a modern digital camera.
  2. I believe that the documentation is not completely in sync with the code as far as the jpeg2yuv utility is concerned.
  3. To make a slide show you need the same picture to stay on screen for several seconds, which was not possible.

What was changed

I patched the jpeg2yuv conversion utility so that it processes all the files in the directory specified on the command line. I also had to modify yuvscaler to duplicate frames so that you can generate a real slide show.

The first change alters the meaning of the -j option of jpeg2yuv. Instead of specifying the input file(s), it now specifies the input directory.

The second change is the addition of a -l option to yuvscaler. It specifies how many times every image has to be duplicated in the output stream. Adding this option to yuvscaler saves a lot of work (convert to YUV and rescale once, duplicate many times). Note that the value of the -l option depends on your frame rate: it is the display time in seconds multiplied by the number of frames per second!
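
The dependency of -l on the frame rate is plain arithmetic, and can be sketched as follows. The helper name frames_per_still is made up for this illustration and is not part of the patched tools; the NTSC rate of 30000/1001 frames per second is only included as an example of a non-integer frame rate.

```python
from fractions import Fraction


def frames_per_still(seconds, fps):
    """Number of times a picture must be repeated to stay on screen for
    `seconds` at `fps` frames per second (rounded, at least 1)."""
    frames = Fraction(seconds) * Fraction(fps)
    return max(1, round(frames))


# PAL, 10 seconds per picture.
print(frames_per_still(10, 25))                      # 250
# NTSC, 10 seconds per picture.
print(frames_per_still(10, Fraction(30000, 1001)))   # 300
```

The result is the number you would pass to the -l option.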

Usage example

Suppose you have copied the patched utilities (called jpeg2yuvstills and yuvscalerstills) into the current directory and your input JPEGs are in the Images directory. To create a PAL SVCD slide show at 25 frames per second with each picture being shown for 10 seconds (10 x 25 => 250) you would type:

./jpeg2yuvstills -f 25 -I p -n 1 -j Images/ | ./yuvscalerstills -O SVCD -l 250 | mpeg2enc -f 4 -o svcd.m2v

For the other parameters, see the SourceForge homepage of the mjpegtools or the man pages.

Download

The binary compiled for x86 architecture:

If you wish to recompile for your own architecture, copy the files below into the appropriate subdirectories of the mjpegtools-1.6.2 directory. Better back up the originals somewhere first. NOTE: the yuvscaler and jpeg2yuv utilities will be replaced by the patched versions! This is because I did NOT change the Makefile or the other configuration files. Neither did I add any indication of the changes at the beginning of the source files.

Details

Actually the initial patch (which added -l to yuvscaler) led to another, and then another, and so on. In fact, behind the scenes I had to modify quite a few things, fortunately without losing any of the initial features. So yuvscalerstills can replace yuvscaler without any problem. This is not true for the jpeg2yuv tool because of the redefinition of the -j option.

Here is what happens with the original tools (at least, that is how I understood it from looking at the code). jpeg2yuv generates a stream which is fed into yuvscaler. yuvscaler then does its work before passing the stream on to the next tool, usually the MPEG encoder (mpeg2enc). The streams produced by jpeg2yuv and yuvscaler look like this:

SH = stream header
FH = frame header


[SH [FH YUV YUV .... YUV] [FH YUV YUV ... YUV] ... [ FH YUV YUV ... YUV]]

This works very well because the original tools expect images coming from one and the same source (i.e. a digital camera), so all the input images are supposed to have the same size. This was rather inconvenient. Therefore I modified the stream generated by jpeg2yuv a little, and of course had to modify yuvscaler to handle the new stream structure.

Following the Keep It Simple Stupid principle I kept the changes to a minimum. The new stream structure is actually nothing but a concatenation of original stream structures (simple enough ;) ). In other words, jpeg2yuvstills can produce a stream looking like this (for example):

[SH [FH YUV YUV .... YUV] ... [FH YUV YUV ... YUV]] [SH [FH YUV YUV ... YUV] ... [FH YUV YUV ... YUV]]

For my slide show purposes each image is processed only once by jpeg2yuvstills. So the output stream consists of several streams, each containing only one frame. For a directory containing 3 pictures this would give:

[SH [FH YUV YUV .... YUV]]  [SH [FH YUV YUV ... YUV]] [SH [FH YUV YUV ... YUV]]
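
To make the concatenated layout concrete, here is a minimal sketch of a reader for it. It is not the code of yuvscalerstills. It assumes that the stream is in the YUV4MPEG2 format used by the mjpegtools, where SH is a text line starting with "YUV4MPEG2" that carries W (width) and H (height) tags, FH is a text line starting with "FRAME", and the frame data is planar 4:2:0. The function name iter_frames is made up for this illustration.

```python
import io


def iter_frames(stream):
    """Yield (width, height, payload) for every frame of a concatenation
    of streams, picking up a new size at every stream header (SH)."""
    width = height = None
    while True:
        line = stream.readline()
        if not line:
            return  # clean end of input
        tokens = line.rstrip(b"\n").split(b" ")
        if tokens[0] == b"YUV4MPEG2":
            # Stream header: each tag is one letter followed by its value.
            tags = {t[:1]: t[1:] for t in tokens[1:] if t}
            width, height = int(tags[b"W"]), int(tags[b"H"])
        elif tokens[0] == b"FRAME":
            if width is None:
                raise ValueError("frame header before any stream header")
            # Assumption: 4:2:0 data, one full-size luma plane followed by
            # two chroma planes of half the width and half the height.
            size = width * height + 2 * ((width + 1) // 2) * ((height + 1) // 2)
            payload = stream.read(size)
            if len(payload) != size:
                raise ValueError("truncated frame")
            yield width, height, payload
        else:
            raise ValueError("unexpected header line: %r" % line[:20])
```

Because every sub-stream carries its own header, the reader always knows how many bytes the next frame occupies, even when consecutive pictures differ in size.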

This stream contains enough information for yuvscalerstills to do its work on each frame, even if the frames have different sizes. yuvscalerstills itself still produces a valid stream (i.e. with a single stream header) and thus can readily feed its output to the other tools.
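
The output side can be sketched in the same spirit: one stream header, then every picture repeated as many times as the -l option asks for. This is only an illustration and not the actual yuvscalerstills code. It assumes YUV4MPEG2 text headers ("YUV4MPEG2 ..." and "FRAME") with 4:2:0 data, it assumes that the frames have already been scaled to the common output size, and the name write_slideshow and its parameters are made up.

```python
import io


def write_slideshow(out, width, height, frames, repeat, fps=b"25:1"):
    """Write ONE stream header, then each frame payload `repeat` times."""
    expected = width * height + 2 * ((width + 1) // 2) * ((height + 1) // 2)
    # Single stream header; "Ip" marks the material as progressive.
    out.write(b"YUV4MPEG2 W%d H%d F%s Ip\n" % (width, height, fps))
    for payload in frames:
        if len(payload) != expected:
            raise ValueError("frame does not match the output size")
        for _ in range(repeat):
            out.write(b"FRAME\n")  # frame header
            out.write(payload)     # the same scaled picture every time


# Two tiny 2x2 "pictures", each repeated 3 times.
buf = io.BytesIO()
write_slideshow(buf, 2, 2, [b"\x01" * 6, b"\x02" * 6], repeat=3)
```

Scaling happens once per picture and only the cheap copy is repeated, which is the work saving mentioned for the -l option.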

Bugs

I think there must still be some memory leaks somewhere, both in jpeg2yuvstills and in yuvscalerstills. I did not read ALL the code, you see... Furthermore, I noticed that the original tools were not always implemented with great care in that respect either (missing free() calls, you know). I tried to fix the things I changed, but I am sure I overlooked some details in the existing structure.

Sample

Well, words are nice, but here are the VCD (CLICK) and SVCD (CLICK) MPEG files generated by the patched tools. It's a PAL (25 frames per second) slide show with 5 seconds per picture, made from several JPEGs of different sizes displayed below: