Security under Linux: the Buffer Overflow Problem
No French version for the moment
Introduction
As a Web server administrator, I am concerned about security
holes. After many weeks of setup, I was really CERTAIN that our server was clean
and secure. One day, I read in the news that a new Web site about security had
been launched: Jason T. Murphy's Linux Security Home Page. I decided to browse
it and to retrieve some exploits to test my system.
I was alarmed to discover that any local
user could gain root access simply by using LPR, MOUNT or UMOUNT, and even through
the network by using a fake library which fooled telnetd! I patched
everything I had tested and tried to find other bugs.
I understood that Linux is able to execute code on the stack (!), which makes
it possible to overflow a buffer in any program running as root and make it
execute a shell or any other program. This is the most common method of gaining
root through suid-root programs. That's why I'll describe it here, along with
some ways to avoid it in your own programs.
Description of the methods
The problem comes from the fact that many
programmers often use parameter parsing code like this:

    void parse(char *arg) {
        char param[1024];
        int localdata;
        strcpy(param,arg);
        .../...
        return;
    }
    main(int argc, char **argv) {
        parse(argv[1]);
        .../...
    }

or even:

    main(int argc, char **argv) {
        char param[1024];
        int index=0;
        while ((param[index++]=getchar())!=EOF);
        .../...
    }

As you see, no check is made on the string length. Lazy
programmers think "bahh, no one gives a file name which is 1024 chars long, and
if someone does, he will just get a Segmentation Fault". Good programmers don't
think like that, but sometimes they quickly add a little feature to one of their
programs to help them debug, and once it works well, they forget to add all the
necessary boundary tests.
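A bounded version of the second loop could look like the sketch below. The name read_param() and the decision to stop silently at the limit are assumptions made for this illustration; they are not taken from any of the programs discussed here.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: reads at most size-1 bytes from `in` into `buf`
 * and always terminates the result with a zero byte.
 * Returns the number of bytes stored (terminator not counted). */
size_t read_param(FILE *in, char *buf, size_t size)
{
    size_t index = 0;
    int c;                      /* int, so that EOF can be told apart from data */

    if (size == 0)
        return 0;
    while (index < size - 1 && (c = getc(in)) != EOF)
        buf[index++] = (char)c;
    buf[index] = '\0';
    return index;
}
```

Called as read_param(stdin, param, sizeof param), it can never write past the end of param, whatever the length of the input.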
If you don't understand the problem very well, here's a short explanation:
Under Linux (and nearly all Intel x86-based systems), the stack grows
downward. That means that when a program calls a function, the return address
is pushed and the stack pointer (ESP) is decremented by 4 (32 bits). Then, if
the function defines local variables (like "param" in the examples), the array
is located below the return address of the function. When executing strcpy() in
the first program, the stack looks approximately like this:

    [unused stack] [localdata] [  param   ] [return address] [caller's data] [caller's return address]
    <------------> <---------> <----------> <--------------> <-------------> <---------------------->
       N bytes       4 bytes    1024 bytes      4 bytes          M bytes              4 bytes
                   ^
                   |_ Current ESP when calling strcpy()

Now you should understand that if you fill param with *EXACTLY* 1024 bytes
containing the code needed to execute a shell, AND you append a 4-byte pointer
to the beginning of your code inside param, then when the function
returns, it branches directly to your code inside param.
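The effect can be reproduced harmlessly with a model of the frame held in an ordinary array. The sketch below is only a simulation: the names sim_frame, sim_unchecked_copy() and sim_return_address() are invented for the illustration, and nothing here touches the real machine stack.

```c
#include <stdint.h>
#include <string.h>

#define PARAM_SIZE 1024u

/* Simulated frame, lowest address first: the 1024-byte buffer followed
 * by the 4-byte slot that holds the return address. */
struct sim_frame {
    unsigned char bytes[PARAM_SIZE + 4];
};

/* Copies `len` input bytes to the start of the simulated buffer, as an
 * unchecked strcpy() would, but never beyond the model itself. */
void sim_unchecked_copy(struct sim_frame *f, const unsigned char *in, size_t len)
{
    if (len > sizeof f->bytes)
        len = sizeof f->bytes;
    memcpy(f->bytes, in, len);
}

/* Reads the simulated return address (little-endian, as on x86). */
uint32_t sim_return_address(const struct sim_frame *f)
{
    const unsigned char *p = f->bytes + PARAM_SIZE;

    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}
```

Copying exactly 1024 bytes leaves the simulated return address untouched; copying 1028 bytes replaces it with the last four bytes of the input.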
So I wrote a very small (20 bytes) "assembler-level execlp() call" and tried
to overflow many buffers (especially in suid-root programs).
My program is a shell script which
consists of four parts:
- garbage filling to reach the end of a buffer
- execlp() assembler coding
- choose the executable to be run by execlp (typically /bin/sh or /bin/id)
- calculation of an approximate stack value
All these parts generate
data on their output which can then be concatenated and given as a parameter to
an executable.
Garbage filler
The garbage filler fills the
beginning of the buffer you want to overflow with nearly any data. In reality,
you must know exactly what you put here because, as you will see below, the stack
calculation isn't very precise and the program you are testing can branch
into the garbage. The first idea is to use the NOP opcode (0x90), but its high
bit is set, so it is not printable, which makes tests awkward. It is better to
use another opcode which does something harmless: INC ECX (0x41 = 'A'). ECX is
reloaded by the first instruction of the execlp() code, so that's not a problem,
and it's sometimes useful to be able to select the 'A' string on the screen with
the mouse and paste it anywhere else!
The garbage filler I use accepts one parameter,
which is the number of bytes you want to fill. This is the most important
parameter of the tester because it determines how the code will be aligned
relative to the stack pointer. The return pointer is the ONLY part that
MUST land outside the buffer. The only output is a long 'AAA...AAA' string whose
length is the number you requested.
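A filler of this kind takes only a few lines. The sketch below is an assumption about how such a tool might be written, not the filler from my package; write_filler() is a name chosen for the example.

```c
#include <stdio.h>

/* Writes `count` copies of the byte 'A' (0x41) to `out`.
 * Returns 0 on success, -1 on a write error. */
int write_filler(FILE *out, unsigned long count)
{
    while (count-- > 0) {
        if (putc('A', out) == EOF)
            return -1;
    }
    return 0;
}
```

A small main() would convert its first argument with strtoul() and call write_filler(stdout, n), so that the output can be concatenated with the other parts.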
execlp() assembler coding
To detect as many bugs as possible, you must have
the smallest possible execlp() code so that it fits in smaller buffers. This one
is only 20 bytes long! To make it so small, I've used the fact that we are
executing on the stack, so we already know where our parameters are (relative to
ESP, of course!). To perform the call, you have to execute INT 80h with EAX=0bh
(the execve system call on Linux/i386), EBX pointing to the zero-terminated name
of the executable you want to run, ECX pointing to argv and EDX pointing to the
environment array. Here ECX and EDX point to the same place. First, if
argv[0]=NULL, that's not a problem: it only means the executable won't have a
name. So argv can point to the zero terminating the executable name, provided it
is a full 32-bit zero. The executable name must be given just after the code.
The stack looks like this:

before the RET:
    [garbage] [program] [progname] [stack pointer]
    <---N---> <---20--> <---X----> <------4------>
                                   ^_ ESP
after the RET:
    [garbage] [program] [progname] [stack pointer]
    <---N---> <---20--> <---X----> <------4------>
                                                  ^_ ESP
before int 80:
    [garbage] [program] [progname] [0000]
    <---N---> <---20--> <---X----> <--4->
                        ^          ^_ ECX=EDX -> NULL
                        |_ EBX -> prog name

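The
problem is that you must know the program name length (7) to make EBX point to
it, because it is addressed relative to ESP. Another way would be to CALL IP+5
and POP EBX, and then add the code length to EBX to calculate the program name
address, but this has not been used so far. The zero is simply pushed onto the
stack, so there is no need to write explicitly through an indirect address. The
code follows:

    mov ecx,esp        ; ECX = ESP as it is just after the RET
    xor eax,eax        ; EAX = 0
    push eax           ; 32-bit zero over the used return address, ending the name
    lea ebx,[esp-7]    ; EBX = start of the 7-byte program name
    add esp,12
    push eax           ; store a zero word
    push ebx           ; store the name pointer at the address held in ECX
    mov edx,ecx        ; EDX = ECX
    mov al,11          ; EAX = 0bh
    int 0x80           ; system call
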
Get the assembler version and/or
the binary version.
Choose the executable to be run
The executable name is simply produced by
an 'echo -n "/tmp/sh"'. The reason for using "/tmp/sh" instead of "/bin/sh" is
that you can copy any program to /tmp/sh (I usually use /usr/bin/id), which makes
it easy to redirect the result to a log file and avoids the risk of someone
getting a shell on your console while the test runs in a loop. Moreover,
sometimes the program gets executed without root privileges. This is a
"semi-bug": a buffer is overflowed, but not in a part running suid. In this
case, /usr/bin/id is more useful than /bin/sh for finding out what is really
happening.
Stack pointer calculation
To calculate the stack pointer that the
program will use, I rely on the fact that when you run a program immediately
after another one, both are started from the same function in the shell, and
therefore with the same ESP value. That makes it possible to have a program
which calculates ESP before executing the "victim program". My program also
provides an option for subtracting a value from the actual stack pointer, and
returns the result in binary form (it outputs the stack value twice because this
sometimes helps to find a working value faster). This value doesn't need to be
very precise, because you just need to make the program branch into the garbage
preceding the execlp() code.
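One simple way to estimate the stack pointer from C is to take the address of a local variable. The sketch below is an assumption about how such an estimate could be obtained, not the code from my package; both function names are invented, and the result is only close to ESP, not equal to it.

```c
#include <stdint.h>

/* Returns the address of a local variable, which lies close to the
 * current stack pointer on common implementations. */
uintptr_t approx_stack_pointer(void)
{
    volatile char marker = 0;

    return (uintptr_t)&marker;
}

/* Applies the offset option: subtracts `offset` from an estimate. */
uintptr_t adjust_estimate(uintptr_t estimate, uintptr_t offset)
{
    return estimate - offset;
}
```

Printing the adjusted value with printf("%#lx\n", (unsigned long)value) is enough to compare two successive runs and see whether they start with the same stack position.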
How to prevent this from being used on your system
The only way to be
secure is to be faster than the crackers. You should follow mailing lists or web
sites which are updated often, and quickly test all the newly announced bugs.
Get my
package and test it on every suid-root executable you have. To find them,
first make a list and print it:

    (echo "SUID-ROOT LIST" ; find / -user root -type f -perm -4000) | lpr

Also search for root-owned files and directories that are writable by group and
others:

    (echo "WORLD WRITABLE LIST" ; find / -user root -perm -022) | lpr

For
EACH suid-root executable you find, read its man page and try my scripts with ALL
options and parameters. For example, this test will succeed on not-so-old
versions of LPR (as in Slackware 3.1):

    ./tryall.generic lpr -C
    ./tryall.generic lpr -J

You may get lots of Segmentation
Faults. If a program gives you a core dump or a Segmentation Fault, this means
it has a bug anyway and it is risky. Note it on your paper list; you'll try it
again later. Once you have noted some risky programs, modify the script to adapt
the values so that it starts with normal behaviour and reaches the segmentation
fault afterwards. In this case, you could get a shell. Sometimes, the shell only
appears long after the first Seg Faults.
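Spotting the Segmentation Faults can be automated by looking at how the tested program ended. The sketch below shows one possible way to do it with fork() and waitpid(); it is not the method used by my scripts, and killed_by_segv() is a name chosen for the example.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv[0] with the given arguments and waits for it.
 * Returns 1 if the child was killed by SIGSEGV, 0 if it ended in any
 * other way (including a failed exec), -1 if fork() or waitpid() failed. */
int killed_by_segv(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);             /* exec failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV;
}
```

A test loop would build argv with the option under test and a long argument, call this function, and note every program for which it returns 1.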
I *STRONGLY* recommend that every program that gives a
Segmentation Fault be CHMODed to 755 so that it won't be suid anymore. Note that
even if you can't succeed in making it give you a shell, 2 things are possible:
- someone has more patience than you and makes it work;
- someone misuses it so that it makes your system hang.
All the programs that seem buggy should be replaced. You should always
get the sources and compile them yourself.
I give a list of addresses to
consult at the bottom of this page. You can use them to search for fixes and
patches for your buggy programs.
What to do in your programs
Each time you use strcpy(), you should
replace it with strncpy() or test the length of the string you are about to copy.
Remember that strncpy() does not write the terminating zero when the source is
at least as long as the limit, so add it yourself. The easiest way to do this is
to define MACROs for strcpy(), fread(), read(), memcpy(), bcopy() ... that test
the argument size each time it is possible. When these tests are difficult to
make, it is better to malloc() the buffers instead of leaving them in your local
variables. With malloc(), the buffer is no longer next to a return address on
the stack, so the method described here does not apply. If you don't want to
malloc(), try defining your buffers globally. They will then be in the data
sections, so once more they are not next to a return address. After that, you
could try my program on yours to verify that there's no risk. A simpler way is
to generate a long string with 'rpt' and pass it to your program. If this makes
it crash or hang, review it.
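A size-checking macro for strcpy() can be sketched as follows. The names checked_strcpy() and CHECKED_STRCPY are hypothetical, and refusing the copy instead of truncating is a design choice made for this example.

```c
#include <string.h>

/* Hypothetical checked copy: copies `src` into `dst` (capacity `size`)
 * only if it fits, terminator included. Returns 0 on success and -1 if
 * the string is too long; in that case `dst` is set to the empty
 * string when it has room for it. */
int checked_strcpy(char *dst, size_t size, const char *src)
{
    size_t len = strlen(src);

    if (size == 0)
        return -1;
    if (len >= size) {
        dst[0] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1);
    return 0;
}

/* Macro form for real arrays only: sizeof gives the array capacity.
 * It must not be used with a pointer, where sizeof is the pointer size. */
#define CHECKED_STRCPY(array, src) \
    checked_strcpy((array), sizeof(array), (src))
```

In the parse() example, strcpy(param,arg) would become CHECKED_STRCPY(param, arg), with the return value tested so that an oversized argument is rejected instead of overwriting the stack.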
What could be done in Linux
Intel CPUs provide two things to make this
bug harder or impossible to use:
- it is possible to choose the direction of the stack: ascending or
descending. When it is ascending, you can fill the buffer as far as you want:
the return address is located before your first character, so you can't
overwrite it. At first, I thought it was really impossible to do anything
against this method, but Aleph One gave me an example where it
was still possible. In general, when a function which contains its own buffer
calls another one with a pointer to that buffer, the return pointer of
this second function can be overwritten. I admit this is more difficult, but it
is possible.
- the stack segment can be marked read-write-noexec. That should really be
enough, because if the return pointer pointed into the stack area, this would
generate a segmentation fault and nothing more. After reading BugTraq
postings, I can say that this is very difficult to do because some Linux
mechanisms rely on an executable stack (signal handling, gcc trampolines). A
patch exists for the kernel to deny execution on the stack, but its author
explains that it could cause problems with some *very rare* programs such as
SuperProbe which, in fact, do not need to run as suid root.
Related sites
Willy Tarreau
tarreau@aemiaif.ibp.fr