Smart web form validation with PHP

Reading time: About 3 minutes

Since I’ve been working on my CMS the past few days I’ve been on a little PHP trip. I’ll probably be writing a few PHP posts over the next week or so, intermingled with other topics of course.

This tutorial should help you build smarter web forms with PHP and save you a ton of time.

Let’s start with web form sanitization.

Sanitize all user input through a standard PHP function

User input sometimes has unnecessary whitespace, so we’ll trim that off. We’ll also check if magic_quotes_gpc is automatically adding backslashes to user input and remove as necessary.

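// get_magic_quotes_gpc() was removed in PHP 8.0, so check that it exists first
$hasgpc = function_exists('get_magic_quotes_gpc') && get_magic_quotes_gpc();
$filtered_post = array(); // cleaned copy of $_POST
foreach($_POST as $field => $value)
{
	$value = trim($value); // strip leading and trailing whitespace
	if($hasgpc)
	{
		$value = stripslashes($value); // undo the backslashes magic quotes added
	}
	$filtered_post[$field] = $value;
}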
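
The loop above assumes every posted value is a string. As a rough sketch, the same clean-up can be wrapped in a function that also walks nested arrays (fields named like tags[]) and only consults magic quotes on PHP versions older than 5.4, where the feature still existed. The name sanitize_input() and the sample array are my own for illustration, not built-ins.

```php
<?php
/**
 * Hypothetical helper: trims every value and, on old PHP versions with
 * magic quotes enabled, strips the added backslashes.
 */
function sanitize_input(array $input)
{
    // Magic quotes were removed in PHP 5.4; the short-circuit keeps
    // get_magic_quotes_gpc() from being called on newer versions.
    $hasgpc = version_compare(PHP_VERSION, '5.4.0', '<') && get_magic_quotes_gpc();

    $clean = array();
    foreach ($input as $field => $value) {
        if (is_array($value)) {
            // Nested input such as name="tags[]" arrives as an array.
            $clean[$field] = sanitize_input($value);
            continue;
        }
        $value = trim((string) $value);
        if ($hasgpc) {
            $value = stripslashes($value);
        }
        $clean[$field] = $value;
    }
    return $clean;
}

// A hand-built array standing in for $_POST.
$filtered_post = sanitize_input(array(
    'username' => '  brian ',
    'tags'     => array(' php', 'forms '),
));
// $filtered_post['username'] is 'brian'; $filtered_post['tags'] is array('php', 'forms')
```

In a real script the call would be sanitize_input($_POST).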

Add user input verification

To verify user input, we’ll first add meta information about each web form input field to its name attribute.

Adding field properties to each field on the web form display side

Notice the use of the underscore to separate a field’s properties in the name.

<form method="post" action="myscript.php">
<input type="text" name="username_required_alphanumeric">
<input type="text" name="email_required_emailaddress">
</form>
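
To make the naming convention concrete, here is a small sketch of how such a name splits into a field name and a list of rules. parse_field_name() is a made-up helper for illustration only.

```php
<?php
/**
 * Hypothetical helper: splits "name_rule1_rule2" into the field name
 * and an array of rule names.
 */
function parse_field_name($posted_name)
{
    $rules = explode('_', $posted_name); // e.g. array('username', 'required', 'alphanumeric')
    $name  = array_shift($rules);        // the first piece is the real field name
    return array($name, $rules);
}

list($name, $rules) = parse_field_name('username_required_alphanumeric');
// $name is 'username'; $rules is array('required', 'alphanumeric')
```

A name with no underscore, such as "email", comes back with an empty rule list.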

Verifying form input on the web form submission side

// get_magic_quotes_gpc() was removed in PHP 8.0, so check that it exists first
$hasgpc = function_exists('get_magic_quotes_gpc') && get_magic_quotes_gpc();
$filtered_post = array();
$error = FALSE; // stays FALSE unless a rule below fails
foreach($_POST as $field => $value)
{
	/* first we extract the actual intended field name */

	$fieldinfo = explode('_', $field); // turn the posted field name into an array of properties
	$field = array_shift($fieldinfo); // take the first element as the intended field name

	/* do the standard stuff we discussed above */

	$value = trim($value);
	if($hasgpc)
	{
		$value = stripslashes($value);
	}
	$filtered_post[$field] = $value;

	/* new stuff to verify user input based on $fieldinfo */

	// in_array() gives a plain TRUE/FALSE; array_search() would return key 0 (falsy) for the first property
	if(in_array('required', $fieldinfo) && $value === '')
	{
		$error = TRUE;
	}
	// {2,} rather than {2,4} so longer top-level domains are accepted
	if(in_array('emailaddress', $fieldinfo) && !preg_match('|^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}$|i', $value))
	{
		$error = TRUE;
	}
	if(in_array('alphanumeric', $fieldinfo) && preg_match('|[^a-z0-9]|i', $value))
	{
		$error = TRUE;
	}
}

/* if there were errors in the web form, send the user back to make changes */
if($error)
{
	// HTTP_REFERER is sent by the browser and may be missing; falling back to '/' is a placeholder choice
	$back = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '/';
	header('Location: ' . $back);
	exit;
}

/*
at this point the form has been cleaned and validated.
reference the fields with $filtered_post, such as $filtered_post['username']
*/
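
A note on looking up a rule in $fieldinfo, sketched with a hand-written array: array_search() returns the key of the match, and for the first rule that key is 0, which PHP treats as false inside an if. in_array() returns a plain boolean and avoids the trap.

```php
<?php
// Rules as they look once array_shift() has removed the field name.
$fieldinfo = array('required', 'alphanumeric');

// array_search() returns the matching key, or FALSE when nothing matches.
$key = array_search('required', $fieldinfo); // 0: found, but falsy in an if()

// in_array() answers the yes/no question directly.
$has_required = in_array('required', $fieldinfo, true); // true

// If array_search() is kept, compare against FALSE explicitly.
$found = ($key !== false); // true
```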

Things to consider about this method

  • This tutorial is meant to emphasize web form sanitization and validation with PHP. You can greatly improve the user feedback on errors versus what I’ve used above. In fact, I urge you to, for the user’s sake. One way to do this is to pass the errors back in the session and display them above the web form. I’d also recommend sending back the submitted values so the user doesn’t have to retype them.
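  • While this method kicks ass, it has its limitations. The one I first thought of is that a field cannot be named the same thing as a filter, as in <input type="text" name="emailaddress_required_emailaddress">. That case actually works, because the field name is split off before the rules are checked. The real limitation is that a field name cannot itself contain an underscore, since everything after the first underscore is read as a rule.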
  • Starting with PHP 5.2.0 you can use filter_var() to validate fields instead of regular expressions as used above.
  • A few people have pointed out that this is not the most secure method of form validation because someone can hack the form field names. Keep that in mind if you decide to use this method of form validation.
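
One way to act on the last two points, as a sketch rather than a finished library: keep the rules in a table on the server so the HTML field names stay plain, and lean on filter_var() and ctype_alnum() instead of hand-written regular expressions. The $rules table, the validate_fields() name, and the error codes are all assumptions of mine. Unlike the loop above, this version skips the format checks on an empty optional field.

```php
<?php
// Hypothetical server-side rule table; the form would use name="username" and name="email".
$rules = array(
    'username' => array('required', 'alphanumeric'),
    'email'    => array('required', 'emailaddress'),
);

/**
 * Hypothetical helper: returns array(field => name of the failed rule).
 */
function validate_fields(array $input, array $rules)
{
    $errors = array();
    foreach ($rules as $field => $field_rules) {
        // Missing or non-string input is treated as an empty value.
        $value = (isset($input[$field]) && is_string($input[$field]))
            ? trim($input[$field])
            : '';

        if ($value === '') {
            if (in_array('required', $field_rules, true)) {
                $errors[$field] = 'required';
            }
            continue; // nothing else to check on an empty field
        }
        if (in_array('emailaddress', $field_rules, true)
            && filter_var($value, FILTER_VALIDATE_EMAIL) === false) {
            $errors[$field] = 'emailaddress';
        } elseif (in_array('alphanumeric', $field_rules, true) && !ctype_alnum($value)) {
            $errors[$field] = 'alphanumeric';
        }
    }
    return $errors;
}

// Hand-built input standing in for $_POST.
$errors = validate_fields(
    array('username' => 'brian!', 'email' => 'brian@example.com'),
    $rules
);
// $errors is array('username' => 'alphanumeric')
```

The returned array could be stored in the session before redirecting, then shown above the form along with the submitted values.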

If you have any questions or comments, feel free to leave them below!

16 comments

  1. Michael said— 56 minutes later

    You should not put your validation logic into the client’s hands. I can easily change the HTTP parameters for

    to

    or

    and the email value will get through your validation logic without causing an error. Do not give the user the power to adjust your validation; it’s more work to write the logic out explicitly, but it is also more secure and more predictable.

    #1
  2. Michael said— 1 hour later

    Correction to my last comment (since this blog does not escape my HTML special characters)

    I can easily change the HTTP parameters for

    <input type="text" name="email_required_emailaddress">

    to

    <input type="text" name="email_emailaddress">

    or

    <input type="text" name="email">

    #2
  3. xmmm said— 2 hours later

    Replace “Smart” with “Hilarious” and we will pretend it was always like this.

    #3
  4. Brian Cray said— 4 hours later

    Michael: Yes, and anybody with that knowledge and time probably also has the knowledge to hack the session vars. This is one method, and I never said it was a bulletproof method.

    XMMM: Perhaps offer suggestions when you provide criticism? Just a thought.

    #4
  5. Nick Yeoman said— 5 hours later

    Sorry, I’m with the other two guys; this system seems a bit flaky.

    I’d stick with writing a validation class then use your controller to validate the posts based on.

    Also, what about when you want a corner case? Example: you’re building a site for a Canadian company that wants only people with .ca domain email addresses.

    In your code you would have to add that to your base function (which you may not want for other projects), which will make your code bloated and unusable in the future. Alternatively you could duplicate the above function, but then you would have to fix the other validation checks (alphanumeric, phone, etc.) in duplicate places, which is just asking for bugs.

    Also, if this is for production you shouldn’t be building your own CMS, as it would be quicker to learn and modify an existing one. If you are a corner case and have to build your own, you should be using a PHP framework such as CodeIgniter or Kohana, which already have the above functions built in.

    #5
  6. Nick Yeoman said— 5 hours later

    Sorry replace:
    “I’d stick with writing a validation class then use your controller to validate the posts based on.”

    with

    “I’d stick with writing a validation class then use your controller to validate the posts based on each form.”

    #6
  7. Brian Cray said— 7 hours later

    Nick: This actually isn’t what I’m using in my CMS, but I thought about it while I was writing my CMS, and I thought it was unique enough to share. Thanks for your suggestions though.

    #7
  8. david said— 17 hours later

    If your method is only used to set PHP validation, it’s a bad process. And why can’t you use a fieldname that is also a rulename? The fieldname gets split off before the rules are checked.

    I think the use of the underscore to split the fieldname and the rules is very limiting. It’s better to use more metacharacters, for example fieldname.rule1-rule2:param

    A way to improve security is to process the html to extract the rules from the fields and only leave the name of the field when the form gets displayed.

    It’s not a bad idea if you think about giving CMS mods the rights to create forms without messing with code but it’s too basic to call it smart.

    #8
  9. Baba Nutboltoo said— 18 hours later

    Your clients won’t pay you more for web services if you make them in this way!!

    “This actually isn’t what I’m using in my CMS, but I thought about it while I was writing my CMS, and I thought it was unique enough to share.”

    - Please don’t share things you haven’t used. First use it, test it, check for errors or holes, fix them, and only then share it. PHP has a good set of functions for these validations: filter_var, ctype_alnum, ctype_alpha, etc. So why go to the trouble of writing those regular expressions? Just learn how to KISS and be effective :D

    #9
  10. Tutorial City said— 20 hours later

    I think it’s a good decision to use filter_var. This library has almost anything you would want, and even an option to use your own regular expressions. The best idea (in my opinion) is to create a class to hold all validation and use it across all, or almost all, of your projects. ;)

    #10
  11. AntonioCS said— 1 day later

    This method is great for client side validation, if you also have server side validation.

    I think there is a jQuery plugin that does something like this (it looks at the field’s name to know how to validate it).

    Good article :)

    #11
  12. Brian Cray said— 1 day later

    David: You’re right, the idea here is to separate back-end and front-end development in cases where it’s handled by two different people. Great suggestions!

    Baba Nutboltoo: filter_var is definitely the way to go, as I outlined in “Things to consider about this method.” However, many people are not running PHP 5.2.0 or above, which is required to use filter_var. I do like the idea to stay with KISS =)

    #12
  13. Brian Reavis said— 1 day later

    As Michael said, control of data validation should never be put in the hands of the user.

    It’d be super easy for me to open Firebug and adjust the ‘name’ attribute on your form fields to insert whatever I like in your database. That’s not hacking session vars. If you use this with the assumption that the data going into your database is solid (like most app developers would, for basic fields like email addresses at least), XSS holes will pop up all over the place. I could change “email_required_emailaddress” to “email” and send:

    "><script type="text/javascript">/*send cookie data to remote source here*/</script>

    …to steal users’ passwords. If people have accounts and data on your CMS, security really should be a concern—no matter how difficult you perceive hacking to be. Data validation isn’t something that just affects the person sending the form. If data isn’t validated/sanitized properly, it becomes everyone’s problem and concern.

    Also, a bit of a side note: The array_map function is a really handy one to use. Your sanitation step could be a lot faster and smaller:

    $postVars = array_map('trim', get_magic_quotes_gpc() ? array_map('stripslashes', $_POST) : $_POST);

    Anyways, just my two cents. All the best.

    #13
  14. Brian Cray said— 1 day later

    Brian: cool use of the array_map function for sure, and good point about the XSS attack. I wanted to emphasize the methodology; extra filtering, such as htmlspecialchars or strip_tags, should always be applied to user input, depending on how the input data will be displayed. Thanks so much for your insight!

    #14
  15. Tutorial City said— 3 days later

    Just a note: filter_var_array and filter_input_array were created to handle arrays, so you do not need to use array_map.

    #15
  16. Brian Cray said— 3 days later

    Tutorial City: Very cool, I didn’t realize those existed on top of filter_var.

    #16