SQL Single Quotes for Numeric Values: Is It Really Good Practice?

Overview

I discuss whether to put single quotes around numeric values in SQL query from  two angles: security concern (SQL injection) and performance (index) trade off.

Single Quotes around Numeric Value

The problem is simple: what is the difference between the following two SQL queries where col is some attribute of type Integer?

SELECT id FROM table_name WHERE col = '11';
SELECT id FROM table_name WHERE col = 11;

The first question would be is such a syntax to put quotes around number accepted? The answer is yes and no. As far as I know different DBMS behaves differently about whether accpet such syntax or not, probably most of them like MySQL, Oracle would accpet it , but please do check the document first before you use it.

However, I am note sure about the standard SQL, I remember it does not accept such a syntax. Anyone could point me to such official document on this? Thanks 

So the next question would be suppose, both syntax are OK, then what is the difference? There are two aspects I would like to show: Security Concerns (SQL Injection) and Performance Trade Off (Index).

Security Concerns (SQL Injection)

From the following referenced link, someone in the forum posted that MySQL says “A common mistake is to protect only string data values. Remember to check numeric data as well. If an application generates a query such as SELECT * FROM table WHERE ID=234 when a user enters the value 234, the user can enter the value 234 OR 1=1 to cause the application to generate the query SELECT * FROM table WHERE ID=234 OR 1=1. As a result, the server retrieves every row in the table. This exposes every row and causes excessive server load. The simplest way to protect from this type of attack is to use single quotes around the numeric constants: SELECT * FROM table WHERE ID=’234′. If the user enters extra information, it all becomes part of the string. In a numeric context, MySQL automatically converts this string to a number and strips any trailing non-numeric characters from it.”

Indeed, single quotes could protect your database from the common attack called SQL Injection. Take a look at one more example from the PHP manual and I run the test to show the results:

// We didn't check $_POST['password'], it could be anything the user wanted! For example:
$_POST['username'] = 'aidan';
$_POST['password'] = "' OR ''='";

// Query database to check if there are any matching users
$query = "SELECT * FROM users WHERE user='{$_POST['username']}' AND password='{$_POST['password']}'";

// This means the query sent to MySQL would be:
echo $query;

$link = mysql_connect('host_ip', 'name', 'password')
OR die(mysql_error());
$user = $_POST['username'];
$password = $_POST['password'];

$query = sprintf("SELECT * FROM users WHERE user='%s' AND password='%s'",
mysql_real_escape_string($user),
mysql_real_escape_string($password));
echo $query;

The results of the above code would be the following

SELECT * FROM users WHERE user='aidan' AND password='' OR ''=''
SELECT * FROM users WHERE user='aidan' AND password='\' OR \'\'=\''

As we could see, the first one allows the user to login without a password as long as such user name exists, while the second one use mysql_real_escape_string() to escape special characters and thus is more secure.

So get back to our original issue, one opinion is that we should put single quotes and it is more secure as what MySQL suggested. However, I doubt about it because to protect our database from SQL injection such as WHERE ID=234 OR 1=1, we should check the type in the code rather than leave this job to database engine. I mean, good practice should be to distinguish between numeric and string values, in the example WHERE ID=234 OR 1=1, the value ’234 OR 1 = 1′ obviously is string, not valid integer, then before you make the sql query, you should do validation first and make sure it is an integer. In this way, we could avoid using single quotes for numeric values, and maybe we got performance gain as we discuss in next section.

Performance Trade Off (Index)

So as in the original problem, col is an attribute of type integer, while the value we provide will be treated as a string if we put quotes around the value, then in most of the database engines, it would do type conversion (obviously more time cost). To make things even worse, what if col is made an index? As far as I know, some DBMS such as Oracle will fail to use this index on the numeric column (called col in our example). Quoted and add more clarification inline from someone else “Oracle will convert it (numeric to string) as well, but will fail to use an index on the column because of the implicit conversion. So there are potential performance problems”.

More generally,  if you provide the wrong data type to compare against, the DBMS would fail to use the compared attribute as index and the performance would be downgraded to a big extent.

Conclusion

Well, so far, we have discussed quote a lot, so here comes my own suggestion: Do not use quote around numeric values in SQL query for performance. Rather, make sure in your say PHP code, make sure the numeric values are indeed valid numeric ones before making up the complete SQL query to protect your db from SQL injection.

One final interesting note: 

The Oracle, SQLite, MSSQL, and Firebird drivers all quote as the PDO MySQL driver, ignoring the param type. (refer to this stackoverflow thread: http://stackoverflow.com/questions/5356206/mysql-pdoquote-putting-single-quotes-around-integers)

So maybe we will not be bothered at all about this issue, are these PDO drivers optimized? Or do they actually not care about the performance? I am not sure about this at all, hope someone could help like again, point me to some explanation documents. Thanks!

Reference

http://www.dbforums.com/showthread.php?1639433-SQL-Quotes-around-the-values-of-numeric-columns

http://stackoverflow.com/questions/5356206/mysql-pdoquote-putting-single-quotes-around-integers

http://www.codeproject.com/Articles/35665/Top-steps-to-optimize-data-access-in-SQL-Serv

Summary

Whether to put single quotes around numeric values in SQL query is considered from  two angles: security concern (SQL injection) and performance (index) trade off. Please feel free to share your comments and correct me if I miss anything. Thanks!

Written on November 13, 2014