Next Previous Contents

6. Data injection into scripts

In the previous section we had ordinary text with perhaps some occasional active part. At least as interesting are scripts that do something for a user. Shell scripts, or perl scripts, or php scripts or mysql scripts, or ...

The most common situation here is that of scripts embedded in some web application. The user comes with requests and the application comes back with the results, perhaps to be viewed by a browser. The user request or input may come in via the URL, or via text typed to a form, or perhaps via direct http requests.

Very often it is possible to trick an application into doing something unintended by inserting special characters into the input data. Whitespace, control characters, especially NUL and backslash, string delimiters like single and double quote, active filename elements like slash, backslash and the .. sequence, active programming characters like semicolon, parentheses, braces, brackets, dollar sign, ampersand, less-than and greater-than signs, and so on.

The Telenor Nextel websites destroyed On 6 April 1997 over 11000 homepages and 70 commercial sites, including all material of several on-line newspapers, altogether 14 GB of data, was erased from the Telenor Nextel server by a 27-year old Norwegian programmer. He was acquitted on 5 March 1998. [På Norsk: Slik ble filene slettet, hack.html, dommen, ingen anke]

What happened was a simple case of perl/shell injection. The user interface supported sending mail to a specified user: a perl script would do open(MAIL, "| mail $adresse"); where $adresse was the user-specified address. Now if the user specifies the string a; rm -rf / as the address, the script will first send mail to a, and then delete all files on the server.

The programmer said he was only testing the security. The judge put part of the blame on the company: The server had not been configured in a way so that it would refuse, or limit the effects of, the command rm -rf.

The Valus payment system hacked Valus is a Danish on-line micropayment system that was started in May 2002. People soon started to discuss weaknesses. It was trivially possible to do arbitrary things on the data base server. Someone showed how to view other people's transactions by modifying a URL. Someone else showed how to inject SQL and see under what username the server ran. And wrote a moment later I guess one could kill the server with
http://www.valus.dk/publisering/default.asp?Cid=3%01SHUTDOWN
(Don't do it, it is almost too easy.) But (at least) four people tried it anyway, and it worked. Five months later these five people were arrested. Their computers were taken for investigation. In the end they got a suspended sentence.

The above perl/shell injection example was based on the fact that a semicolon is a command separator for the shell. The SQL injection discussed here worked because the character with code 1 (0x01, Ctrl-A) is a line separator for SQL Server.

6.1 SQL injection - first example

Let us try. We go to http://site and are redirected to http://site/home.php3. Looking at the source of home.php3 we see fragments like

menu.addMenuItem("Inschrijven","window.open('gpnlinfo.php3?cat_id=3', '_parent');");
that sound like the PHP page gpnlinfo.php3 accepts parameters and then generates Javascript.

Let us get http://site/gpnlinfo.php3?cat_id=1;, adding a final semicolon. This yields an error message about a failed SQL query in the browser window. But the page source does not contain the string SQL. Hmm. The page source contains fragments like

layer2.setup("gpnlinfo_inhoud.php3?a=regiofinales&cat_id=1;");
showing that the error message is probably from gpnlinfo_inhoud.php3.

We ask the browser for http://site/gpnlinfo_inhoud.php3?a=regiofinales&cat_id=1; and get the reply

SQL query "SELECT * FROM regiofinales WHERE cat_id = 1; AND jaar = 2004 ORDER BY regio" mislukt:
Syntax fout in query bij '; AND jaar = 2004 ORDER BY regio' in regel 1
Good. So we are talking to some SQL server and are able to introduce syntax errors in the script, and the site is friendly enough to tell us precisely what command it is executing.

We would like to know what data bases and tables and columns exist. Trying http://site/bandpagina_inhoud.php3?a=bandlist&cat_id=2; yields

SQL query "SELECT b.naam, f.band_id FROM bands b, regiofinalisten f, regiofinales r WHERE r.cat_id = 2; AND r.jaar = 2004 AND f.regiofinale_id = r.id AND b.id = f.band_id ORDER BY naam" mislukt:
Syntax fout in query bij '; AND r.jaar = 2004 AND f.regiofinale_id = r.id AND b.id = f.band_id ORDER BY na' in regel 1

Now the first example of SQL injection. We ask for http://site/bandpagina_inhoud.php3?a=bandlist&cat_id=2%20OR%200=1. (Here %20 is the representation of the space character.) The above error message has already revealed what the query will be:

SELECT b.naam, f.band_id FROM bands b, regiofinalisten f, regiofinales r
 WHERE r.cat_id = 2 OR 0=1 AND r.jaar = 2004 AND f.regiofinale_id = r.id AND b.id = f.band_id ORDER BY naam"
That means that we succeeded in removing three conditions from the query. (Indeed, AND binds stronger than OR, so the condition becomes r.cat_id = 2 OR FALSE, that is, just r.cat_id = 2.

6.2 SQL injection

The above example was a bit messy, and did not achieve very much, but it shows the principle. The approach: fiddle with the website, if possible get page and script sources, provoke some helpful error messages and then use a suitably crafted URL that changes the flow of control or parse tree of some program.

The above trick can be usefully repeated on login pages. When input is from a login form, and the SQL query is SELECT * FROM users WHERE name=$name AND password=$password then giving user name bill OR 0=1 will eliminate the password check, allowing one to login when a username is known.

If the query uses name='$name', then try as user name bill' OR 'a'='b. Similarly with double quotes.

If the query uses parentheses, a good reply might be ') OR ('a'='b.

A standard Login sequence is: Username: admin, Password: ' or '%' Like '%.

Some SQL implementations use -- as a comment introducer. MySQL uses # or -- for this purpose, and also allows /* comment */. This means that we can also remove the password check by giving user name bill #, with # encoded as %23. (In a URL the # character has a different meaning.)

A comment character is also very useful if the rest of the statement would get bad syntax because of our SQL injection.

Quote delimited strings and comments interact in an unknown way. Try to make sure that both inside and outside comments quotes are matched (unless you want to provoke error messages, of course). For example, if the field we are playing with is quoted, as in SELECT * FROM users WHERE name='$name' AND password='$password' then bill' or 1=1 -- ' may be a suitable name to enter.

Some servers allow multiple requests separated by semicolon. In such an environment playing may be very easy. Just put a semicolon followed by an arbitrary sequence of SQL commands in your input field.

Some servers allow multiple requests combined into a single one, as in SELECT * FROM users WHERE priv=0 AND user='$name' UNION SELECT * FROM users WHERE 1=1.

Some servers allow SELECT statements in a parenthesized subexpression.

Some servers allow one to obtain more information using the HAVING trick: append having 1=1 or ' having 1=1 to a reply. Microsoft Transact SQL replied

Microsoft OLE DB Provider for SQL Server (0x80040E14)

Column 'foo.bar' is invalid in the select list because it is not contained
in an aggregate function and there is no GROUP BY clause.
That is good, because it teaches us the name of a field. Now there are further possibilities for injection. E.g. ' or foo.bar like 'a%' --.

There are many variations.

Maybe SQL injection was first discussed by Rain Forest Puppy. See e.g. Phrack 54#8 and How I hacked Packetstorm, by rfp. It continues to be a useful technique. A 2011 example: Shop leaks credit card data.

6.3 Escapes and multibyte characters

Many web servers are built on the LAMP platform: Linux + Apache + MySQL + PHP/Perl/Python. In this environment various attempts have been made to increase security and combat SQL injection.

PHP has the function addslashes() that escapes quotes and backslashes and null characters by preceding them with a backslash, to be used just before passing a string to MySQL. Good. Or? Many data bases use two adjacent single quotes as representation for a single quote, and ignore the backslash. So, addslashes() should not be used.

PHP also has the boolean option magic_quotes_gpc that, if set, will automatically force an addslashes() on all GET, POST, and COOKIE strings, regardless of how they will be used. Bad. This was off by default for PHP3, on by default in some configurations of PHP4 and will not exist in PHP6. The existence of magic_quotes_gpc was an annoyance for all programmers. Since its value was determined in the php.ini configuration file, and the application could not set it, one had to test everywhere whether this option was set, and addslashes() if not.

There is a confusing mess here, ample room for security flaws. See also register_globals (defaults to off since 4.2.0), mysql_escape_string(), mysql_real_escape_string(), magic_quotes_sybase, and the SQL function sql_quote().

If magic_quotes_sybase is on it will completely override magic_quotes_gpc. Having both directives enabled means only single quotes are escaped as ''. Double quotes, backslashes and NUL's will remain untouched and unescaped.

Unescaped strings can be obtained if the input is urldecoded by urldecode().

Recently (Jan 2006) it was noticed that unescaped strings can also be obtained by playing with multibyte character sets. If the server uses a Chinese or Japanese 16-bit character set, and the second byte of a 2-byte character has the value 0x27 or 0x5c then code that is not encoding-aware might view this as a single quote or backslash, and escape it by inserting a backslash. But now the inserted backslash is part of the multibyte character and the quote or backslash that was there has been brought into the open. Similar possibilities arise when UTF8 is being used. If the server deletes illegal characters and the frontend escapes a quote that illegally ends a multibyte character, then afterwards the backslash illegally ends the multibyte character and is deleted by the server, so that the quote is visible again. See also PostgreSQL techdocs.50.


Next Previous Contents