Enigmamachine - first working release

Posted by dave
on Monday, July 12

I’ve just put the finishing touches on the first working release of Enigmamachine. It’s a RESTful video encoder written in Ruby. It uses the thin webserver, eventmachine event framework, sinatra web framework, and datamapper ORM.

There’s a lot more info in the README, but basically it’s a standalone HTTP service which accepts a POST request triggering a video encode on the local filesystem. In short, your web app gets a video uploaded to it, and once it’s saved on your filesystem, you’d do something like:

wget http://username:password@localhost:2002/videos --post-data 'video[file]=/path/to/your/video.mp4&encoder_id=1'
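If your web app is itself written in Ruby, the same request could probably be built with Net::HTTP from the standard library. This is only a sketch: build_encode_request is a made-up helper name, and the credentials, port, and path are just the placeholders from the wget line above.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: builds (but does not send) the POST request that asks
# Enigmamachine to encode a video already saved on the local filesystem.
def build_encode_request(video_path, encoder_id, user: "username", password: "password")
  uri = URI("http://localhost:2002/videos")
  request = Net::HTTP::Post.new(uri)
  request.basic_auth(user, password)
  # Same two form fields as the wget example, URL-encoded by Net::HTTP.
  request.set_form_data("video[file]" => video_path, "encoder_id" => encoder_id.to_s)
  [uri, request]
end

uri, request = build_encode_request("/path/to/your/video.mp4", 1)

# With the service running, the request would be sent like this:
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
```

Building the request separately from sending it keeps the sketch runnable even when nothing is listening on port 2002.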

This would apply whatever ffmpeg tasks you define to the uploaded video. Advantages for you? You don’t need to worry about process daemonization, threading issues, or maintaining a processing queue. You just fire off a POST request and enigmamachine does the rest.

There aren’t any callbacks yet, and I can see that this is probably the next thing to do (as well as slugging the encoder_id). That’ll wait, though. Basically it’s ready for first use, after a long, long time messing with the idiosyncrasies of combining ffmpeg (which logs to stderr), thin, a gem binary, and eventmachine’s multi-threaded mode. This was an interesting process all on its own, and I’ll do a write-up about Thin daemonization soon, as I think it’s a really great way to make low-ceremony services in Ruby.

Ruby Procs and Eventmachine callbacks

Posted by dave
on Sunday, January 17

For the last little while, I’ve been messing around to see whether I can whip off a video-encoding server which is easily installable. It’s put together using Sinatra, Thin, and Eventmachine (three of my favourite Ruby libraries lately).

At the point where it all started to work, I ran into a nasty order-of-operations problem. To fix it, I had to do a little research into how Ruby sets up blocks and procs.

The basic system is fairly simple. There’s a web interface written in Sinatra, which runs inside Thin. Off to the side of this running webapp is an Eventmachine process sitting around waiting to encode videos. The webapp interface allows users to easily define encoder profiles, which are sets of tasks for ffmpeg. As an example, we might define an Encoder with three EncodingTask objects attached to it:

  • output to 320×240 flash video
  • output to 720×576 ogg theora
  • make a 320×240 thumbnail jpg.

When this Encoder is applied to a new video, the tasks are run in order on the video. Once there are no more tasks, the video state is changed to “complete”.
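Leaving threads out of it for a moment, the run-in-order-then-mark-complete idea can be sketched synchronously. Everything here is an assumption for illustration: the Struct definitions stand in for the real models, run_tasks is a made-up name, the ffmpeg arguments are placeholders, and the commands are only built, never executed.

```ruby
# Hypothetical stand-ins for the real Video and EncodingTask models.
EncodingTask = Struct.new(:name, :command, :output_file_suffix)
Video        = Struct.new(:file, :state)

# Walks the task list in order, collecting the command each task would run,
# and marks the video "complete" once no tasks are left.
def run_tasks(tasks, video, index = 0)
  if index >= tasks.length
    video.state = "complete"
    return []
  end

  task = tasks[index]
  video.state = "encoding"
  command = "ffmpeg -i #{video.file} #{task.command} #{video.file + task.output_file_suffix}"
  [command] + run_tasks(tasks, video, index + 1)
end

tasks = [
  EncodingTask.new("flash video", "-s 320x240", ".flv"),
  EncodingTask.new("thumbnail", "-vframes 1 -s 320x240", ".jpg")
]
video = Video.new("/path/to/your/video.mp4", "unencoded")
puts run_tasks(tasks, video)
puts video.state
```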

The code that accomplishes this is pretty simple, except for the fact that it’s multithreaded:

  def ffmpeg(task, video)
    command_string = "ffmpeg -i #{video.file} #{task.command} #{video.file + task.output_file_suffix}" 
    encoding_operation = proc {
      video.state = "encoding" 
      video.save
      Log.info("Executing: #{task.name}")
      `nice -n 19 #{command_string}`
    }
    completion_callback = proc {|result|
      if task == encoding_tasks.last
        video.state = "complete" 
        video.save
      else
        current_task_index = encoding_tasks.index(task)
        next_task_index = current_task_index + 1
        next_task = encoding_tasks[next_task_index]
        ffmpeg(next_task, video)
      end
    }
    EventMachine.defer(encoding_operation, completion_callback)
  end

The first time through, the code is called with the first EncodingTask and whatever video is being encoded. The method then calls itself again for every EncodingTask until there are no more EncodingTasks left in the queue, at which point the video is marked as complete and everything chills out. The main motivations for this code are:

  • We need to execute a long-running, blocking process (the ffmpeg encoding)
  • We want the Sinatra/Thin application to keep running while we’re encoding, because we want to be able to queue up other videos to encode

The EventMachine.defer method does what we need here. EventMachine is mainly designed to be blindingly fast for network operations by using a single-threaded evented reactor loop, but the single-threaded approach only works as long as nothing blocks the loop. For blocking work, like the call to ffmpeg here, EventMachine.defer comes to the rescue: by default, EventMachine sets up a pool of 20 Ruby threads which EventMachine.defer can use in cases where it’s really necessary to run something off the reactor thread.
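The general shape of that hand-off can be imitated with nothing but Ruby’s built-in threads. To be clear, this is not EventMachine and not how EventMachine implements it: run_deferred is a made-up helper, it starts one thread per call rather than drawing from a pool, and the polling loop is only a stand-in for a reactor that stays responsive while the blocking work runs elsewhere.

```ruby
# Stdlib-only illustration of the pattern (hypothetical helper, not EventMachine):
# the blocking operation runs on a worker thread, the calling thread stays free
# to do other things, and the operation's return value is then passed to the
# callback on the calling thread.
def run_deferred(operation, callback)
  worker = Thread.new { operation.call }   # blocking work happens here

  ticks = 0
  while worker.alive?                      # stand-in for the loop staying responsive
    ticks += 1
    sleep 0.01
  end

  callback.call(worker.value)              # Thread#value returns the block's result
  ticks
end

encoding_operation  = proc { sleep 0.1; "pretend ffmpeg finished" }
completion_callback = proc { |result| puts "callback got: #{result}" }
run_deferred(encoding_operation, completion_callback)
```

The part that carries over to the real thing is the contract: whatever the first proc returns is what the second proc receives.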

Don’t confuse EventMachine.defer with EventMachine Deferrables, which are related to the single-threaded part of EventMachine. EM.defer is a method which takes two proc objects as its arguments – the first should contain your long-running operation, the second is a callback which should be executed once the long-running operation is completed. The main problem with my code above is that it didn’t work.

I naively expected that I could set the local variable current_task_index inside my completion_callback proc and use it normally. This was a really dumb thing to do, as I realized once I did a little refresher research into Proc objects and how they work in Ruby.

Procs are closures: they take their variables from the scope in which they are defined, not the scope from which they are called. This led to all kinds of exciting order-of-operations bugs in my code, which were all speedily fixed once I moved the current_task_index = encoding_tasks.index(task) line so that it was the first thing defined in the method.
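A tiny standalone example shows the scoping rule at work. The names build_callbacks, bump, and read are invented for illustration and have nothing to do with the Enigmamachine code.

```ruby
# Hypothetical example: two procs defined in the same method share the
# method's local variable, because both close over the defining scope.
def build_callbacks
  index = 0                      # defined in the method, before the procs
  bump  = proc { index += 1 }    # modifies the method's `index`
  read  = proc { index }         # sees whatever `index` currently holds
  [bump, read]
end

index = 100                      # an unrelated variable in the calling scope
bump, read = build_callbacks
bump.call
bump.call

puts read.call   # prints 2: the procs use the `index` from where they were defined
puts index       # prints 100: the caller's `index` is untouched
```

A variable that already exists in the method when a proc is created is shared with that proc, while a variable first assigned inside a proc stays local to it.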