Ruby hashes are wonderfully versatile. You can store any kind of object using any kind of key and do all sorts of wacky and magical things. Hashes bring joy. Odes have been written to the wonderful “hashrocket” (Ruby’s affectionate name for the => operator).
But in our day-to-day work, we don’t see a lot of this:
{ [2, 3, 4] => 1.82343 }
More often, we see this:
{ name: "Steve" }
The latter is a natural fit for most data: this attribute has this value. So simple! But not so long ago (cough, 12 years), Rubyists used this syntax: { :name => "Steve" }. This maps neatly onto { "name" => "Steve" }, and describes the same thing. So symmetrical. Strings, symbols, pick whatever you want. It’s Ruby, my friend; free yourself from the tyranny of static types.
But obviously { key: value } is prettier. Fewer characters! Less ceremony! It looks like JSON! :symbol-based keys have won; they are the ubiquitous way to store data in hashes.
Except when they’re not.
I’ve worked on many Rails upgrades in the last few years, and dealing with hashes is where everything gets messy. In this post, I’ll discuss some of the changes that have affected hashes in the last few versions of Rails, and how you can manage those changes in your next upgrade.
Before getting into the advice, though, let’s talk a little context. Where did this all begin?
HashWithIndifferentAccess
HashWithIndifferentAccess is the Rails magic that paved the way for symbols in hashes. Unlike Hash, this class allows you to access data using either symbols (:key) or strings ("key"). This was wonderful in controllers, because you could mix and match client parameters with server parameters and not spend any time thinking about key types. ActiveRecord didn’t care if your keys were symbols or strings; it just worked. This undoubtedly saved a lot of time in a language that doesn’t have a compiler to enforce types. When was the last time you worried about your key types?
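For contrast, here is what a plain Ruby Hash does with the two key types. This is a minimal sketch in core Ruby only; the constant names are made up for illustration and do not come from Rails.

```ruby
# A plain Hash treats :name and "name" as two unrelated keys.
PERSON = { name: "Steve" }.freeze

puts PERSON[:name].inspect    # "Steve"
puts PERSON["name"].inspect   # nil, because the string key was never stored

# Converting the keys up front is the manual alternative to indifferent access.
STRING_KEYED = PERSON.transform_keys(&:to_s).freeze
puts STRING_KEYED["name"].inspect   # "Steve"
```

HashWithIndifferentAccess hides this conversion, which is convenient until the hash leaves Rails-aware code.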
All was well in Rails-land. Hashes were ubiquitous: the universal data type. But Rails and Ruby don’t stand still; both are continually evolving to bring us new and powerful features. The rules around hashes changed, and so should our ways of thinking about them.
So, what potential problem points should you look out for when upgrading a Rails app that uses hashes?
Strong parameters
Strong parameters were introduced in Rails 4 to avoid unintentional mass updates to database tables. Historically, ActionController::Parameters extended HashWithIndifferentAccess to get all the benefits of being a hash with indifferent keys. Unfortunately, because it was just a hash, it was easy to call hash methods and inadvertently bypass the mass assignment protections. So in Rails 5, ActionController::Parameters no longer extends HashWithIndifferentAccess, and we must think about what happens with those request parameters.
Advice: Here are some general guidelines for handling params in Rails 5+.
- If we’re updating or creating a model, use strong parameters to make those changes safe and easy.
- If we’re sending the data to another class to use, then we can choose. Most teams will use a HashWithIndifferentAccess, as that’s considered conventional in a Rails application. It allows us to use those symbol keys we prefer. But if that hash will ever be passed to a third-party library, we may want to choose a Hash.
- If we’re serializing the data, we definitely want to convert it to a Hash.
Serialization
A HashWithIndifferentAccess quacks like a Hash, but it does not serialize like one. To get a HashWithIndifferentAccess back out of its serialized representation, we must store that type information, because it’s not a native Ruby type. The serialized representation will contain a lot of class metadata, which can cause subtle class loading issues when moving between Rails, Ruby, or library versions. For instance, as I mentioned above, in Rails 5 ActionController::Parameters stopped extending HashWithIndifferentAccess. This means a params that was serialized in Rails 4 cannot be deserialized as a Hash in Rails 5. Gem upgrades (particularly of database libraries) often change class hierarchies in similar ways. Even well-tested libraries often do not test running an older version side by side with a newer one.
Advice: If you’re enqueuing data in Redis (Sidekiq) or using ActiveRecord serialization, stick with native Ruby types (like Hash) to avoid messy deserialization bugs.
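To see where the class metadata comes from, here is a small sketch using Ruby’s standard YAML library. StandInHash is a hypothetical Hash subclass standing in for HashWithIndifferentAccess, so the exact tag Rails writes will differ; the point is only that a subclass is serialized with its class name while a plain Hash is not.

```ruby
require "yaml"

# Hypothetical stand-in for any Hash subclass; not a Rails class.
class StandInHash < Hash; end

def yaml_for(hash)
  YAML.dump(hash)
end

SUBCLASSED = StandInHash["name" => "Steve"]

# The subclass is dumped with a tag naming its class, so loading it later
# requires that class to still exist with a compatible definition.
puts yaml_for(SUBCLASSED)

# Hash#to_h on a subclass returns a plain Hash, which dumps without a class tag.
puts yaml_for(SUBCLASSED.to_h)
```

Converting to a plain Hash before the data crosses a serialization boundary keeps the stored form independent of any library’s class hierarchy.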
Keyword arguments
I recently upgraded the i18n gem and some tests started failing, but it took me a while to figure out why.
My call looked like this:
translation = I18n.t("#{name}.title", data.merge(year: year))
So I popped into a debugger:
> data.merge(year: year)
=> { year: 2008 }
> "#{name}.title"
=> "outliers.title"
> I18n.t("#{name}.title", data.merge(year: year))
=> "Outliers was published in %{year}"
So it looked like my data was correct, but the interpolation wasn’t happening.
It turns out i18n changed a method signature to use keyword arguments:
def translate(*args)
to
def translate(key = nil, *, throw: false, raise: false, locale: nil, **options)
Hmm…does this work?
> I18n.t("outliers.title", year: 2008)
=> "Outliers was published in 2008"
> I18n.t("outliers.title", { year: 2008 })
=> "Outliers was published in 2008"
> I18n.t("outliers.title", data.merge(year: 2008))
=> "Outliers was published in %{year}"
> data.class
=> ActiveSupport::HashWithIndifferentAccess
Here’s the trick - a HashWithIndifferentAccess will not “splat” into keyword arguments like a Hash will. A HashWithIndifferentAccess stores its keys as String types by default. A Hash with symbol keys will be treated by Ruby as keyword arguments. If any non-symbol keys are included in the hash, the whole hash is treated as a single positional argument.
> I18n.t("#{name}.title", {}.merge(year: year))
=> "Outliers was published in 2008"
> I18n.t("#{name}.title", data.symbolize_keys.merge(year: year))
=> "Outliers was published in 2008"
Advice: Use symbolize_keys (which converts to Hash) when passing a hash of arguments to non-model classes.
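The same idea can be sketched without Rails. Below, render_title is a hypothetical stand-in for a keyword-argument API like I18n.t (it is not the real i18n code), and transform_keys(&:to_sym) plays the role of symbolize_keys. The explicit double splat is an assumption of this sketch: it makes the intent unambiguous, since implicit hash-to-keyword conversion varies across Ruby versions.

```ruby
# Hypothetical stand-in for a keyword-argument API such as I18n.t.
# It is NOT the real i18n implementation.
def render_title(template, **options)
  # Kernel#format fills %{name} placeholders from a symbol-keyed hash.
  format(template, options)
end

TEMPLATE = "Outliers was published in %{year}"

# String keys, as a HashWithIndifferentAccess stores them internally.
STRING_KEYED_DATA = { "year" => 2008 }.freeze

def title_for(data)
  # Convert the keys to symbols, then splat explicitly into keyword arguments.
  render_title(TEMPLATE, **data.transform_keys(&:to_sym))
end

puts title_for(STRING_KEYED_DATA)   # Outliers was published in 2008
```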
JSON
JSON looks so much like a Ruby hash.
{ "name": "steve" }
But ugh, look at all those quotes. Given the above, it might feel natural to write this test:
json = '{ "name": "steve" }'
expect(JSON.parse(json)).to eq({ name: "steve" })
However, this test will never pass, because :name is not the same as "name". This is frustrating, because when you deal with JSON in a Rails controller, you can easily access the keys with symbols:
> json[:name]
=> "steve"
The secret, of course, is HashWithIndifferentAccess. I used to spend a great deal of effort converting my JSON to HashWithIndifferentAccess, but I realized I wasn’t getting much value out of it. There’s nothing inherently wrong with string keys; it’s just a stylistic preference.
expect(JSON.parse(json)).to eq({ "name" => "steve" })
Advice: When dealing with JSON outside of a controller, just embrace string keys. It’s easier to debug your code if the data looks just like the API documentation. Using with_indifferent_access or forcing symbol keys requires discipline and just adds to debugging time.
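A short sketch with the standard json library shows both options. The constant names here are illustrative only.

```ruby
require "json"

API_RESPONSE = '{ "name": "steve" }'

# Default parsing keeps keys as strings, exactly as they appear in the payload.
PARSED = JSON.parse(API_RESPONSE)
puts PARSED["name"]          # steve
puts PARSED[:name].inspect   # nil

# symbolize_names: true is the standard-library switch for symbol keys,
# but every caller then has to remember which flavor it was handed.
SYMBOLIZED = JSON.parse(API_RESPONSE, symbolize_names: true)
puts SYMBOLIZED[:name]       # steve
```

Sticking with the default means the parsed data matches the API documentation key for key.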
Conclusion
After working with Rails for 10 years, I’ve developed a strong preference towards using Hash over HashWithIndifferentAccess. It just works, everywhere, and doesn’t surprise - unless you expect symbol keys all the time.
Letting go of my preference for symbol keys helped me when dealing with 3rd party APIs. Your team might really appreciate the indifference of HashWithIndifferentAccess and make different choices. In either case, the takeaway for me is that recent versions of Rails require a bit more care with the edges of an application.
I hope the advice above at least gives you something to think about, and hopefully helps your next upgrade go a little more smoothly. Thanks for reading!