I need a way to call a process on a regular basis inside of my Rails application. I have been working in and around this idea for over a year and have tried several different methods.
One thing to mention is that CRON jobs are an obvious solution to these problems. However, as I am serving off a Joyent server, I have had difficulty setting up my CRON jobs and getting them into the right file with the right RAILS_ROOT settings to my application files. If anyone knows a good way to set these up on Joyent, please let me know!
I have been reading several excellent blog posts, including Igvita.com – http://www.igvita.com/2007/03/29/scheduling-tasks-in-ruby-rails/, and the Rails Wiki itself – http://wiki.rubyonrails.org/howtos/background-processes.
Rufus Scheduler
My current attempt is to try Rufus. I highly recommend that you examine the article Dead simple task scheduling in Rails by Brent Collier.
You can find the online documentation for the Rufus-Scheduler here: http://rufus.rubyforge.org/rufus-scheduler/
The first error I encountered in my odyssey was:
config/initializers/task_scheduler.rb:6: undefined method `start_new' for Rufus::Scheduler:Module (NoMethodError)
The cause was that I had left out the require line for rufus/scheduler at the top of my config/initializers/task_scheduler.rb file:
require 'rufus/scheduler'
I mention this oversight only in case someone else happens to encounter the same error in their code.
Back To BackgrounDRb
Rufus Scheduler worked great for the simple tasks, but some of what I am trying to do has the potential to take up a lot of time. The problem I encountered with Rufus was that as the running time of the tasks grew, they tied up more and more instances from my Mongrel cluster, eventually drawing the server request/response cycle to a complete stop. Not good when you are trying to host a public site!
Because I know that these processes can be time consuming, they need to be relegated to a separate server instance. So I am returning to BackgrounDRb.
My previous experiences with BackgrounDRb produced the following opinion from my own blog post:
I really like the theory and concept behind BackgrounDRb, but it seems to be a bit overkill for what I am trying to accomplish. Also, it was almost impossible to keep the BackgrounDRb server running, even with the use of Monit. There just always seemed to be something that was crashing our beloved process.
Learning from my previous experiences, I have thoroughly tested and retested the code that will be called from my workers. I am setting things up to run using the add_periodic_timer method as my first attempt. On my local machine, everything seems to be working fantastically. I plan to set everything in motion on my production server tonight and see where we sit in the morning.
The First Results
I am happy to report that with the proper configuration, and the strategic placement of begin .. rescue Exception code, BackgrounDRb is successfully performing the functions I require!
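As a rough illustration of that begin .. rescue placement, here is a minimal sketch in plain Ruby. The helper name run_safely is hypothetical (it is not part of BackgrounDRb), and the StringIO logger only stands in for a real log file. The idea is simply that a failing job gets logged instead of taking the whole worker down with it.

```ruby
require 'logger'
require 'stringio'

# Hypothetical helper, not part of BackgrounDRb: runs the given block and
# logs any exception instead of letting it propagate and kill the worker.
def run_safely(logger, label)
  yield
  true
rescue Exception => e
  # Rescuing Exception is deliberately broad here, matching the approach
  # described above. It also catches things like Interrupt and SystemExit,
  # so rescuing StandardError is usually the safer choice.
  logger.error("#{label} failed: #{e.class}: #{e.message}")
  false
end

# Stand-in for a real log file so the sketch is self-contained.
log_output = StringIO.new
logger = Logger.new(log_output)

ok = run_safely(logger, "do_foo") { 1 + 1 }
failed = run_safely(logger, "do_foo") { raise ArgumentError, "bad input" }
```

In a worker, the body of the periodic job would go inside the block, so one bad run is recorded and the next timer tick still fires.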
One complaint I have is that it is difficult to figure out what it is doing and when. To counter this, I have written a very simple background_logger to monitor the processes and let me know when things start and stop.
To do this, you first make a class for the BackgroundLogger:
From lib/background_logger.rb:
class BackgroundLogger < Logger
  def format_message(severity, timestamp, progname, msg)
    "#{timestamp.to_formatted_s(:db)} #{severity} #{msg}\n"
  end
end
Then you add awareness of this logger in your environment file:
From config/environment.rb:
background_logfile = File.open("#{RAILS_ROOT}/log/background.log", 'a')
background_logfile.sync = true
BACKGROUND_LOG = BackgroundLogger.new(background_logfile)
This adds a constant called BACKGROUND_LOG that you can then use in your workers:
From lib/workers/foo_worker.rb:
class FooWorker < BackgrounDRb::MetaWorker
  set_worker_name :foo_worker

  def create(args = nil)
    BACKGROUND_LOG.info("Foo Worker Started")
    # 5400 seconds = 90 minutes between runs
    add_periodic_timer(5400) { do_foo }
  end

  def do_foo
    BACKGROUND_LOG.info("Starting Foo...")
    MyModel.do_foo
    BACKGROUND_LOG.info("Foo Done!")
  end
end
Each line of the resulting log carries a timestamp and a severity level, which helps you keep track of what is going on and, in the event of a hiccup, tells you when it happened and in which worker (provided your messages name it, as the ones here do). This is of course in addition to the information provided in BackgrounDRb’s own logs.
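To give a feel for what those log lines look like, here is a small standalone sketch. It is an approximation rather than the exact class above: the name PlainBackgroundLogger is made up for this example, and it swaps ActiveSupport's to_formatted_s(:db) for strftime so that it runs outside of Rails. It also writes to a StringIO buffer instead of log/background.log.

```ruby
require 'logger'
require 'stringio'

# Standalone variant of BackgroundLogger for illustration only. It uses
# strftime instead of ActiveSupport's to_formatted_s(:db), producing the
# same "YYYY-MM-DD HH:MM:SS" timestamp shape without needing Rails.
class PlainBackgroundLogger < Logger
  def format_message(severity, timestamp, progname, msg)
    "#{timestamp.strftime('%Y-%m-%d %H:%M:%S')} #{severity} #{msg}\n"
  end
end

# In-memory buffer standing in for the real log file.
buffer = StringIO.new
log = PlainBackgroundLogger.new(buffer)

log.info("Starting Foo...")
log.info("Foo Done!")

# Each entry is one line of the form "<timestamp> INFO <message>".
lines = buffer.string.lines
puts lines
```

Because the format is one entry per line with the timestamp first, the file is easy to tail or grep when you want to see how long a given run took.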
So, BackgrounDRb is NOT the answer… what else is there?
Everything was peachy until my processes grew to be very large in number. Not good when it takes so long to do something that the machine chokes and it all grinds to a halt!
So what are my other options? It turns out that everyone now recommends delayed_job, a plugin from Shopify that does the same thing, only better. The folks at Engine Yard highly recommend delayed_job over BackgrounDRb, citing severe memory leaks and other not so favorable conditions in the latter.
I am happy to report that the results are excellent! Without even having Monit or God running, delayed_job has been kicking serious butt on my project. I think I’ll stick with it for a while.
Monitoring All Of These Processes
The next part is to explore how to keep it all running when I run off to play music for the weekend. I know that Monit is a common choice, and having read up on it, it might be the best option. I am, however, going to start with a different option… you can follow my progress in this area from my Daemon Monitoring blog post.
Cheers!