Perl Programming Page 3 - Perl 101: The email form
So far we have a working script, but it lacks many of the refinements which distinguish amateur scripting from professional scripting. More importantly, it is insecure - if your web server is cracked because of one of your CGI scripts then your provider will not be very happy. The golden rule in CGI programming is to never trust user input. Some users are stupid, others are malicious - just because you ask them for their name, it doesn't guarantee that they won't enter a series of commands which *could* cause your web server to display its password files. Characters such as '|`>< have special meanings to Perl, or to the shell that Perl may pass them to, and can be used by the ingenious cracker to make your script do things it shouldn't.

In fact, it's much safer to list what we *should* allow rather than what we should *disallow*. In this particular case, limiting user input to letters, numbers, underscores, spaces, periods, question marks, exclamation marks, hyphens, and at signs (@) should be sufficient.

Here's where the fun starts. We're going to be using regular expressions, which are a very important (and often complex) part of Perl. If you are coming from another programming language you may already be familiar with regular expressions (or regexes, as they are called for short). If this is your first time then they can look very daunting, but fear not, as all will be explained. First off, a simple one:

    if ($name =~ /[^\w ]/) { print "Oops you entered your name incorrectly - please go back and check it<br>"; die; }

The important part of this is $name =~ /[^\w ]/. Here we are testing to see if the name supplied by the user contains any characters which are not letters, numbers, underscores, or spaces. The two forward slashes are delimiters: everything between them is treated by Perl as a regex, so our regex is:

    [^\w ]

\w is a shorthand meaning 'any word character', however Perl's definition of a word character is not what you might expect: to Perl it means 'any letter, number or an underscore'.

The blank space following the \w is literal, i.e. Perl will look for a blank space. The square brackets ([ and ]) form a character class, which matches any one of the characters listed inside it: Perl will be happy if it finds either a word character *or* a space. Finally, we invert the meaning of [\w ] by putting a caret (^) immediately after the opening bracket, so that the class matches any character which is *not* listed. The position matters: a caret placed outside the brackets, as in /^[\w ]/, does not invert anything. It anchors the match to the start of the string, and would only check the first character of the name.

    if ($name =~ /[^\w ]/) { print "Oops you entered your name incorrectly - please go back and check it<br>"; die; }

Now that you have a basic understanding of regular expressions, this line could be rewritten in English as: If $name contains any character which is NOT a letter, number, underscore or space, THEN print an error message and stop.

So that was your first lesson in pattern matching and the bizarre world of regular expressions. If you understand everything we've covered then give yourself a pat on the back - it can be very hard at first, but soon you'll be able to write your own regexes with your eyes shut. If you are still unsure of regular expressions, then please take the time to re-read this page and suss them out - the effort will pay off once you start writing your own Perl scripts.
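
To tie the allow-list idea to the regex discussion, here is a minimal sketch of how the full list of permitted characters mentioned earlier (letters, numbers, underscores, spaces, periods, question marks, exclamation marks, hyphens and at signs) might be checked. The helper names is_safe_input and has_bad_char are hypothetical and are not part of the script built in this tutorial; treat this as one possible approach rather than the definitive one. Note that the caret sits inside the square brackets when we want to negate a character class.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from the tutorial's script):
# returns 1 only if every character in $value is on the allow-list.
sub is_safe_input {
    my ($value) = @_;
    return 0 unless defined $value && length $value;
    # \A and \z anchor the pattern to the whole string, so one bad
    # character anywhere causes the match to fail. The class allows
    # word characters (\w), space, period, ?, !, @ and hyphen.
    return $value =~ /\A[\w .?!\@\-]+\z/ ? 1 : 0;
}

# The same idea written as a negated character class: the caret
# *inside* the brackets means "any character not listed here".
sub has_bad_char {
    my ($value) = @_;
    return $value =~ /[^\w ]/ ? 1 : 0;
}

for my $sample ('Jane Doe', 'jane@example.com', 'rm -rf / | cat') {
    printf "%-20s %s\n", $sample, is_safe_input($sample) ? 'ok' : 'rejected';
}
```

Run as written, the first two samples are accepted and the third is rejected because it contains a slash and a pipe. Checking the whole string against a list of allowed characters, rather than hunting for known-bad ones, means a character you forgot to think about is refused by default.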