Skip to content

LIKE injection

Looking through our exception tracker the other day, I ran across a notice from our slow-query logger that caught my eye. I saw a SELECT … WHERE … LIKE query…

Author

Looking through our exception tracker the other day, I ran across a notice from our slow-query logger that caught my eye. I saw a SELECT … WHERE … LIKE query with lots of percent signs in the LIKE clause. It was pretty obvious that this term was user-provided and my first thought was SQL injection.

[3.92 sec] SELECT ... WHERE (profiles.email LIKE '%64%68%6f%6d%65%73@%67%6d%61%69%6c.%63%6f%6d%') LIMIT 10

Looking at the code, it turned out that we were using a user-provided term directly in the LIKE clause without any checks for metacharacters that are interpreted in this context (%, _, ).

def self.search(term, options = {})
  limit = (options[:limit] || 30).to_i
  friends = options[:friends] || []
  with_orgs = options[:with_orgs].nil? ? false : options[:with_orgs]

  if term.to_s.index("@")
    users = User.includes(:profile)
                .where("profiles.email LIKE ?", "#{term}%")
                .limit(limit).to_a
  else
    users = user_query(term, friends: friends, limit: limit)
  end

  ...
end

While this isn’t full-blown SQL injection, it got me thinking about the impact of this kind of injection. This kind of pathological query clearly has some performance impact because we logged a slow query. The question is how much?

I asked our database experts and was told that it depends on where the wildcard is in the query. With a % in the middle of a query, the database can still check the index for the beginning characters of the term. With a % at the start of the query, indices may not get used at all. This bit of insight led me to run several queries with varied % placement against a test database.

mysql> SELECT 1 FROM `profiles` WHERE `email` LIKE "chris@github.com";
1 row in set (0.00 sec)

mysql> SELECT 1 FROM `profiles` WHERE `email` LIKE "%ris@github.com";
1 row in set (0.91 sec)

mysql> SELECT 1 FROM `profiles` WHERE `email` LIKE "chris@github%";
1 row in set (0.00 sec)

mysql> SELECT 1 FROM `profiles` WHERE `email` LIKE "%c%h%r%i%s%@%g%i%t%h%u%b%.%c%o%m%";
21 rows in set (0.93 sec)

It seems that unsanitized user-provided LIKE clauses do have a potential performance impact, but how do we address this in a Ruby on Rails application? Searching the web, I couldn’t find any great suggestions. There are no Rails helpers for escaping LIKE metacharacters, so we wrote some.

module LikeQuery
  # Characters that have special meaning inside the `LIKE` clause of a query.
  #
  # `%` is a wildcard representing multiple characters.
  # `_` is a wildcard representing one character.
  # `` is used to escape other metacharacters.
  LIKE_METACHARACTER_REGEX = /([\%_])/

  # What to replace `LIKE` metacharacters with. We want to prepend a literal
  # backslash to each metacharacter. Because String#gsub does its own round of
  # interpolation on its second argument, we have to double escape backslashes
  # in this String.
  LIKE_METACHARACTER_ESCAPE = '\\1'

  # Public: Escape characters that have special meaning within the `LIKE` clause
  # of a SQL query.
  #
  # value - The String value to be escaped.
  #
  # Returns a String.
  def like_sanitize(value)
    raise ArgumentError unless value.respond_to?(:gsub)
    value.gsub(LIKE_METACHARACTER_REGEX, LIKE_METACHARACTER_ESCAPE)
  end

  extend self

  module ActiveRecordHelper
    # Query for values with the specified prefix.
    #
    # column - The column to query.
    # prefix - The value prefix to query for.
    #
    # Returns an ActiveRecord::Relation
    def with_prefix(column, prefix)
      where("#{column} LIKE ?", "#{LikeQuery.like_sanitize(prefix)}%")
    end

    # Query for values with the specified suffix.
    #
    # column - The column to query.
    # suffix - The value suffix to query for.
    #
    # Returns an ActiveRecord::Relation
    def with_suffix(column, suffix)
      where("#{column} LIKE ?", "%#{LikeQuery.like_sanitize(suffix)}")
    end

    # Query for values with the specified substring.
    #
    # column    - The column to query.
    # substring - The value substring to query for.
    #
    # Returns an ActiveRecord::Relation
    def with_substring(column, substring)
      where("#{column} LIKE ?", "%#{LikeQuery.like_sanitize(substring)}%")
    end
  end
end
ActiveRecord::Base.extend LikeQuery
ActiveRecord::Base.extend LikeQuery::ActiveRecordHelper

We then went through and audited all of our LIKE queries, fixing eleven such cases. The risk of these queries turned out to be relatively low. A user could subvert the intention of the query, though not in any meaningful way. For us, this was simply a Denial of Service (DoS) vector. It’s nothing revolutionary and it is not a new vulnerability class, but it’s something to keep an eye out for. Three second queries can be a significant performance hit and application-level DoS vulnerabilities need to be mitigated.

Update: A number of people pointed out that in Rails 4.2, ActiveRecord includes a sanitize_sql_like helper for sanitizing LIKE clauses.

Explore more from GitHub

Engineering

Engineering

Posts straight from the GitHub engineering team.
GitHub Universe 2024

GitHub Universe 2024

Get tickets to the 10th anniversary of our global developer event on AI, DevEx, and security.
GitHub Copilot

GitHub Copilot

Don't fly solo. Try 30 days for free.
Work at GitHub!

Work at GitHub!

Check out our current job openings.