Friday, August 3, 2012

Shallow Copy in Ruby

I often wonder why some organizations only consider candidates who have computer science or computer engineering degrees. In the real world I have seen many great developers who come from diverse backgrounds. However, a recent experience made me really happy that I have a degree in computer engineering, and it helped me understand why the favoritism toward developers with computer science credentials exists. I hit an issue with an email template in Ruby on Rails that, without a good mental model of compilers and, specifically, compiler optimization, would likely have taken a long time to identify and fix. Here is what happened:


In a database somewhere...
UPDATE app_configs SET value = 'Welcome to UrbanBound #[FirstName], ' WHERE config_name = 'affiliate_template';

def send_email_to_todays_movers
  Mover.all.each do |m|
    @boiler_plate_email = AppConfig.find_by_config_name('affiliate_template').value
    # [business logic goes here]
    prep_mover m, @boiler_plate_email
  end
end

def prep_mover(m, boiler_plate)
  # [business logic goes here]
  # Assumes Mover exposes a first_name attribute.
  @movers_name = Mover.find_by_id(m.id).first_name
  send_welcome_email @movers_name, boiler_plate
end

def send_welcome_email(first_name, boiler_plate)
  # String#[]= replaces the placeholder in place, modifying boiler_plate itself
  boiler_plate['#[FirstName]'] = first_name
  # ......
end

*Whether this is good code architecture or not I'll leave for another post. We do most initial development of features as an MVP at UrbanBound. It's usually not worth optimizing code the first time we are testing a feature out. That being said, I have a strict rule of using TDD and keeping 95% code coverage on models, helpers and controllers. Generally speaking, no business logic is allowed in the views.


The first time I ran this code (via rspec - I'm not this guy) the outgoing email message had the correct first name for the first mover. However, the others all had the same first name as mover #1. It didn't make sense to me at all. The boilerplate email template was changed in the send_welcome_email method, but we refreshed it each time a mover was processed. Then I thought about my days doing C/C++ and how compilers work. Maybe the Ruby interpreter couldn't see three levels deep into the execution tree to notice that the boilerplate had been updated. Perhaps it just made a shallow copy of the boilerplate after the first execution and was effectively passing around the memory location of the boilerplate, never updating it after the initial read from the database. After some experimentation I found that I was indeed correct. The interpreter was performing an optimization that made the code faster but didn't produce the behavior the programmer intended.

This is the trade-off compiler builders always make. Back in graduate school I could get my compiler to generate code that ran 2x-3x faster than the benchmark compilers; however, after running millions of test data sets against it I noticed errors in 1 or 2 of them. That may be fine for playing media, such as an mp3, but it is not good for computations that affect people's lives and safety.
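The symptom can be reproduced without Rails. What follows is only a sketch under stated assumptions, not the UrbanBound code: CACHED_TEMPLATE, fetch_template and fill_in are hypothetical names that model a lookup handing back the very same String object on every call, and sub! stands in for the in-place replacement. Once the placeholder has been replaced in that shared object, every later mover sees the first mover's name.

```ruby
# Hypothetical stand-in for a config lookup that hands back the same
# String object on every call (no copy is made).
CACHED_TEMPLATE = String.new('Welcome to UrbanBound #[FirstName], ')

def fetch_template
  CACHED_TEMPLATE
end

def fill_in(first_name, boiler_plate)
  # sub! edits the receiver in place; it returns nil when nothing matches,
  # which is what happens once the placeholder is already gone.
  boiler_plate.sub!('#[FirstName]', first_name)
  boiler_plate
end

greetings = %w[Ann Bob Cy].map do |name|
  fill_in(name, fetch_template)
end

puts greetings
```

All three entries in greetings are references to one object, so the name written for the first mover is the only one that ever appears.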

It turns out the solution is simple. Explicitly tell Ruby that you want a copy. Here is the code:

  Mover.all.each do |m|
    @boiler_plate_email = AppConfig.find_by_config_name('affiliate_template').value.dup
    # [business logic goes here]
    prep_mover m, @boiler_plate_email
  end
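Why the copy helps can be seen in plain Ruby. This is a minimal sketch that assumes nothing about Rails internals: STORED_TEMPLATE is a hypothetical stand-in for the value read from app_configs, and the sample names are made up. Each mover gets its own duplicate, so the in-place replacement never touches the original.

```ruby
# Hypothetical stand-in for the template value read from the database.
STORED_TEMPLATE = String.new('Welcome to UrbanBound #[FirstName], ')

def send_welcome_email(first_name, boiler_plate)
  # String#[]= replaces the first match in place, mutating boiler_plate.
  boiler_plate['#[FirstName]'] = first_name
  boiler_plate
end

emails = %w[Ann Bob Cy].map do |name|
  # dup returns a new String with the same contents, so the edit above
  # only changes this mover's copy.
  send_welcome_email(name, STORED_TEMPLATE.dup)
end

puts emails
puts STORED_TEMPLATE
```

Note that dup is itself a shallow copy. That is enough for a String, because the characters belong to the new object, but duplicating an Array of strings would still leave the element strings shared between the two arrays.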


I found another great description of the problem [here]