0x8048441
0x8048443
0x8048446
0x8048449
0x804844b
0x804844c
0x8048451
0x8048454
0x8048456
0x8048457
(Some NOPs here. NOP stands for No Operation, meaning nothing is done.)
End of assembler dump.
.########...######..########..########
.##.....##.##....##.##.....##.##......
.##.....##.##.......##.....##.##......
.########...######..########..######..
.##.....##.......##.##...##...##......
.##.....##.##....##.##....##..##......
.########...######..##.....##.##......
http://blacksun.box.sk/
Unix Clan
Lecturer: Ghost_Rider
Tutorial: Buffer overflow
Converter: DKsk8
Hello, here I am again. This time I'll show you what a buffer overflow actually is and how you can detect whether a program is vulnerable to buffer overflow exploits. This tutorial contains C source code, so if you don't know C you may have some problems following it. You also need some notions of ASM and of how to use gdb.
I tried to make it as easy as I could, but this still isn't one of those tutorials where you start out not knowing shit about anything and finish knowing it all. This one takes some work to understand. Hey, it took a huge amount of work to write!
A little inside note: like everyone who is reading these lines, I like to learn. So some weeks ago I said to myself "Hey, what the heck, why not start reading some texts about buffer overflows? I know how everything works, but just superficially". So I started learning, and now I'm trying to pass on the knowledge I gained to everyone who is interested. This won't be one of those texts where you'll learn everything. It will be like a walkthrough, or like the title says, an Introduction (at the end I'll give you some nice texts). If you have any questions concerning this tutorial, post them on our message board. If you find any "bug" in this tutorial, please email me and I'll correct it. Enjoy.
Well, probably everyone knows what an exploit is. But those who are entering the security world for the first time probably have no idea what it is, and that's why I wrote this tiny section.
So, for the ones that don't know: an exploit is a program, usually written in C, that exploits some problem that another program has. The exploit allows you to run arbitrary code that lets you do something you shouldn't be able to do with your normal status on the system.
Nowadays, most exploits are what we call Buffer Overflow Exploits. What's that, you ask? Wait, because we'll get there. After all, this is the subject of this tutorial.
Another thing you should know is that everyone knows how to use them (how do you think most websites get defaced?). The script kiddies just go to sites like SecurityFocus, Packet Storm or Fyodor's Exploit World, download an exploit, run it, and then get busted. But why doesn't everybody write exploits? Well, the problem is that many people don't know how to spot a vulnerability in source code, or even if they can, they aren't able to write an exploit. So now that you have an idea of what an exploit is, let's go ahead to the buffer overflow section.
A buffer overflow problem lies in the memory where the program stores its data. Why's that, you ask? Well, because what a buffer overflow does is write past the end of a buffer and overwrite specific memory locations next to it, replacing what the program keeps there with something you chose, which makes the program do what you want.
Well, some of you are now thinking "WOW, I know how buffer overflows work", but you still don't know how to spot them.
Let's follow a program and try to find and fix the buffer overflow.
So let's say that the important variable stores some system command like, let's say, "chmod o-r file", and since that file is owned by root, the program runs as root too. This means that if you can send commands to it, you can execute ANY system command. So you start thinking: how the hell can I put something that I want in the important variable? Well, the way is to overflow the memory so we can reach it. But first let's look at the variables' memory addresses. To do that you need to rewrite the code. Check the following code.
Well, we added 2 lines to the source code and left the rest unchanged. Let's see what those two lines do.
The printf("%p\n%p", somevar, important); line prints the memory addresses of the somevar and important variables. The exit(0); just keeps the rest of the program from running. After all, you don't need it for anything right now: your goal was to find out where the variables are stored.
After running the program you would get output like the following (you will probably not get the same memory addresses):
As we can see, the important variable comes right after somevar. This lets us use our buffer overflow skills, since somevar is taken from argv[1]. Now we know that one follows the other, but let's check each memory address so we get a precise picture of how the data is stored. To do this, let's rewrite the code again.
Now let's say that in normal use argv[1] should be send. So you just type at your prompt:
You'll get an output like:
Starting To Print memory address:
Nice, isn't it? You can now see that there are 12 unused memory addresses between somevar and important. So let's say that you run the program with a command line like:
You'll get an output like:
Hey cool, newcommand overwrote command. Now the program does something you want instead of what it was supposed to do.
Now let's think a little. Why does this happen? As you can see in the source code, somevar is declared before important, which most of the time means that somevar comes first in memory. Now let's check how each one gets its value. somevar gets its value from argv[1], and important gets it from the strcpy() function. The real problem is that the value of important is assigned first, so when you then assign a value to somevar, which sits before it in memory, important can be overwritten. This program could be patched against this buffer overflow by switching those two lines, becoming:
If the program had been written this way, even an argument long enough to reach the memory address of important would be overwritten by the true command, since command is assigned to important after somevar gets its value.
This kind of buffer overflow is a heap buffer overflow. As you have probably seen, they are really easy to do in theory, but in the real world it's not that easy. After all, the example I gave was a really dumb program, right? It's a real pain in the ass to find those important variables, and to overflow such a variable you also need to be able to write to one that sits at a lower memory address. Most of the time all these conditions don't come together, which is why we are now going to talk about stack buffer overflows.
heap - the memory area where you reserve space for a variable at run time (you use the heap when you call the malloc() function).
stack - the place where a function's arguments, local variables and return address are pushed. When you try to overflow the stack, you try to change the return address, making the code jump to some place in memory where you have put the commands that you want to execute.
So let's get into the stack stuff. This is the part that gave me the most problems, and still does. Here we will need to know ASM and how to handle gdb (believe me, it will become one of your best friends). Still, don't give up.
We will talk about Smashing the Stack, a kind of "attack" that changes the return address (RET). By doing this you can make the function return to an address where you have already placed some commands that you want executed.
As with the heap overflow, let's see some source code. Now we will try to call the exploit() function two times. How will we do this? Well, first we need to find some nice addresses. This time let's use gdb. First we compile.
This is your prompt. Now we will disassemble main. To do this we just need to type disassemble main (you can also type disas main). Hard, isn't it?
First, you are probably wondering what the x/3bc command is. Well, this is the command that lets us examine memory.
(For more info type help x at the gdb prompt)
I did it because I was wondering what was being pushed onto the stack at 0x80484cc, and as you can see it is the string we want to print.
Doing this, we rewrite the return address with 0x0804844c, which makes the function return to the call to exploit() again. This puts us in an endless loop. Why could we exploit this program? Well, because there was no check on the length of the string we were sending. So here's some advice: if you code something that needs to be secure, always use functions that do length checking, like fgets() and strncpy(), instead of gets(), strcpy(), and so on.
Wanna see how an exploit affects the vulnerable program? Start gdb and type:
Then you can see what the exploit does, and correct any problems you are having.
Well, we've reached the end. I hope this was of some help to you... I have some "upgrades" to this tutorial in mind, since it doesn't have everything I wanted to say. But I think it's better to check everything I want to say first, instead of saying something that I'm not 100% sure about.
If you find something in this tutorial that doesn't add up, please feel free to email me about it.
These 3 texts will give you a huge amount of info that you may need. They helped me... They can be found at packetstorm.securify.com
This appendix was written by a friend, Predator, whom I gratefully thank for his efforts. The original text is below.
I wrote this as part of Ghost Rider's buffer overflow tutorial, which you can download at http://blacksun.box.sk
Now I will talk about shell code. Shell code is a char array that consists of machine instructions used to spawn a shell. Since the program we are trying to exploit doesn't contain code that will execute a shell, we must write it ourselves. For this you must know a little assembly, C and the x86 architecture. Linux is also required. But only C and assembly are really needed. Well, let's start.
Usually shell code is written in a program as ->
Both are correct, so you can use either. :))
This program is used to run a shell. Why execve, if there are a lot of exec functions? The answer is simple: execve is the only exec function that is a real system call, called with int $0x80 (the others are library wrappers around it), and that is very important to us.
Well, let's compile this with the -static option and run it in gdb.
Well, let's look at main :) Everything starts from there.
Things to do ->
Well, we need the exact address in memory of our "/bin/sh" string. We can simply put "/bin/sh" right after a call instruction, which will push EIP on the stack, and the pushed EIP will be the address of our string... Look at pic 0.1
At the beginning of the code we will put a JMP instruction that jumps to the call, and the call will save EIP and go to the offset of a. The saved EIP will be our "/bin/sh" address.
Well, let's write this in asm ->
Let's compile this
Let's write our shell code ->
This works... "\x2f\x62\x69\x6e\x2f\x73\x68" is the same as writing "/bin/sh" (this is at the end of the code). Take a look at this shell code... There is \x00 or '\0' in some places. As we know, '\0' marks the end of a string, so strcpy or any other string function will copy only until it finds a '\0', and our shell code would not be copied completely. Let's get rid of these '\0' bytes.
Rewrite the c0de with these changes and we get this
Compile it like this
Rewrite the program:
It works... and it is smaller than our previous c0de and has no 0x00 or \x00 or '\0', so strcpy() and sprintf() will copy all of it...
Here is a simple program to print the stack pointer of the current program: