LIKE injection
Looking through our exception tracker the other day, I ran across a notice from our slow-query logger that caught my eye. I saw a SELECT … WHERE … LIKE query…
Looking through our exception tracker the other day, I ran across a notice from our slow-query logger that caught my eye. I saw a SELECT … WHERE … LIKE
query with lots of percent signs in the LIKE
clause. It was pretty obvious that this term was user-provided and my first thought was SQL injection.
[3.92 sec] SELECT ... WHERE (profiles.email LIKE '%64%68%6f%6d%65%73@%67%6d%61%69%6c.%63%6f%6d%') LIMIT 10
Looking at the code, it turned out that we were using a user-provided term directly in the LIKE
clause without any checks for metacharacters that are interpreted in this context (%
, _
, ).
def self.search(term, options = {})
limit = (options[:limit] || 30).to_i
friends = options[:friends] || []
with_orgs = options[:with_orgs].nil? ? false : options[:with_orgs]
if term.to_s.index("@")
users = User.includes(:profile)
.where("profiles.email LIKE ?", "#{term}%")
.limit(limit).to_a
else
users = user_query(term, friends: friends, limit: limit)
end
...
end
While this isn’t full-blown SQL injection, it got me thinking about the impact of this kind of injection. This kind of pathological query clearly has some performance impact because we logged a slow query. The question is how much?
I asked our database experts and was told that it depends on where the wildcard is in the query. With a %
in the middle of a query, the database can still check the index for the beginning characters of the term. With a %
at the start of the query, indices may not get used at all. This bit of insight led me to run several queries with varied %
placement against a test database.
mysql> SELECT 1 FROM `profiles` WHERE `email` LIKE "chris@github.com";
1 row in set (0.00 sec)
mysql> SELECT 1 FROM `profiles` WHERE `email` LIKE "%ris@github.com";
1 row in set (0.91 sec)
mysql> SELECT 1 FROM `profiles` WHERE `email` LIKE "chris@github%";
1 row in set (0.00 sec)
mysql> SELECT 1 FROM `profiles` WHERE `email` LIKE "%c%h%r%i%s%@%g%i%t%h%u%b%.%c%o%m%";
21 rows in set (0.93 sec)
It seems that unsanitized user-provided LIKE
clauses do have a potential performance impact, but how do we address this in a Ruby on Rails application? Searching the web, I couldn’t find any great suggestions. There are no Rails helpers for escaping LIKE
metacharacters, so we wrote some.
module LikeQuery
# Characters that have special meaning inside the `LIKE` clause of a query.
#
# `%` is a wildcard representing multiple characters.
# `_` is a wildcard representing one character.
# `` is used to escape other metacharacters.
LIKE_METACHARACTER_REGEX = /([\%_])/
# What to replace `LIKE` metacharacters with. We want to prepend a literal
# backslash to each metacharacter. Because String#gsub does its own round of
# interpolation on its second argument, we have to double escape backslashes
# in this String.
LIKE_METACHARACTER_ESCAPE = '\\1'
# Public: Escape characters that have special meaning within the `LIKE` clause
# of a SQL query.
#
# value - The String value to be escaped.
#
# Returns a String.
def like_sanitize(value)
raise ArgumentError unless value.respond_to?(:gsub)
value.gsub(LIKE_METACHARACTER_REGEX, LIKE_METACHARACTER_ESCAPE)
end
extend self
module ActiveRecordHelper
# Query for values with the specified prefix.
#
# column - The column to query.
# prefix - The value prefix to query for.
#
# Returns an ActiveRecord::Relation
def with_prefix(column, prefix)
where("#{column} LIKE ?", "#{LikeQuery.like_sanitize(prefix)}%")
end
# Query for values with the specified suffix.
#
# column - The column to query.
# suffix - The value suffix to query for.
#
# Returns an ActiveRecord::Relation
def with_suffix(column, suffix)
where("#{column} LIKE ?", "%#{LikeQuery.like_sanitize(suffix)}")
end
# Query for values with the specified substring.
#
# column - The column to query.
# substring - The value substring to query for.
#
# Returns an ActiveRecord::Relation
def with_substring(column, substring)
where("#{column} LIKE ?", "%#{LikeQuery.like_sanitize(substring)}%")
end
end
end
ActiveRecord::Base.extend LikeQuery
ActiveRecord::Base.extend LikeQuery::ActiveRecordHelper
We then went through and audited all of our LIKE
queries, fixing eleven such cases. The risk of these queries turned out to be relatively low. A user could subvert the intention of the query, though not in any meaningful way. For us, this was simply a Denial of Service (DoS) vector. It’s nothing revolutionary and it is not a new vulnerability class, but it’s something to keep an eye out for. Three second queries can be a significant performance hit and application-level DoS vulnerabilities need to be mitigated.
Update: A number of people pointed out that in Rails 4.2, ActiveRecord includes a sanitize_sql_like
helper for sanitizing LIKE clauses.
Written by
Related posts
Breaking down CPU speed: How utilization impacts performance
The Performance Engineering team at GitHub assessed how CPU performance degrades as utilization increases and how this relates to capacity.
How to make Storybook Interactions respect user motion preferences
With this custom addon, you can ensure your workplace remains accessible to users with motion sensitivities while benefiting from Storybook’s Interactions.
GitHub Enterprise Cloud with data residency: How we built the next evolution of GitHub Enterprise using GitHub
How we used GitHub to build GitHub Enterprise Cloud with data residency.