PHP Tutorial
The purpose of this tutorial is to give you enough
of a broad grounding in PHP that you can understand
examples, write your own simple scripts and, most importantly,
learn to find your way around the reference manual. A
little familiarity with programming concepts is required, but I
do explain what variables are, so the level should be simple
enough for anyone comfortable using a PC.
Those who know Perl already can skip great
chunks of this tutorial - you'll be surprised how similar the
two languages are. Read everything up to the Syntax section,
then you can probably skip to the examples. It may be to your
benefit to read the section on arrays though since they are very
different from Perl arrays.
Updated: October 2002 for PHP
4.2.x (just in time for PHP 4.3 ;-)
Table of contents
- What is it?
- How does it compare to ASP? Perl?
- Notes for Perl coders
- Handy resources
- Syntax
- Where to put PHP code
- Statements
- Variables
- Constants
- Strings
- Operators
- Type casting
- Arrays
- Loops
- Conditionals
- Functions
- Returning values
- Examples with walkthroughs
- Database access
- Writing secure code
What is it?
PHP is an embedded web-based scripting language
similar to ASP, though with a syntax that is closer to Perl, C
and (to some extent) Java than VBScript, which is the commonly
used language in ASP.
Embedded means that PHP code is put inside
normal HTML to provide dynamic content. The advantages of this
include:
- Graphic designers can design the look and
  feel of a site without having to know how to code. Then
  coders can add the dynamic stuff without having to know much
  about good HTML design. Changing the look & feel of a
  site can be done without much change in the code.
- Rapid application development -- the HTML of
  a site can be thrown out quickly to build a rapid prototype,
  and then the code can be added to this framework.
- Placing PHP code on an HTML form allows you
  to use the same page for filling out the form and for
  correcting validation errors (like a missing phone number
  entry), so there's no need to look after two separate pages.
PHP's similarity to Perl makes it easy to learn;
the differences in the syntax are mostly to tidy it up. Those
familiar with Perl (but not too rabidly fond of it)
will find PHP refreshingly tidy in comparison, while the more
avid Perl fans will find that they can do almost anything in PHP
that Perl could do - in fact, in some places it's a lot easier.
String manipulation in PHP is much simpler than in Perl.
How does it compare to ASP? Perl?
- PHP has the same context as ASP - that is,
  it's embedded inside HTML. Perl on the other hand has a
  shell context - it was designed for use on the command line,
  not inside a web page.
- ASP's native implementation is Windows only,
  using the IIS server. Some UNIX implementations also exist,
  but their feature sets aren't as complete, and at the time
  of writing, none were free. Conversely, PHP is available for
  well over a dozen platforms and web servers; in fact, any
  server that supports the CGI standard can run PHP, and
  several optimised server-specific PHP modules also exist and
  are supplied with the standard package.
- Both PHP and Perl are modular -
  functionality can be added and removed by including modules.
  In Perl, modules must be loaded by the script or specified
  on the command line, but in PHP modules can be either
  dynamically loaded (in the script) or specified in the
  php.ini file to load automatically when PHP starts up.
- In tests, PHP's performance came out
  marginally ahead of ASP, though to be honest, there's not
  much in it. Both are approximately 3-4 times faster than
  Cold Fusion, however.
- PHP, like Perl, is open source. You can
  download the code, compile it, modify it, basically do
  whatever you want.
Notes for Perl coders
Skip this bit if you don't know Perl.
The major changes that most Perl coders find (in
my experience at least) when using PHP are:
- In PHP there is only one kind of array, and
  it's the hash (or named array), and you should treat every
  array like it's a hash (i.e., in a loop over an array, step
  through each key rather than counting through the numbers
  on a list).
- In PHP, variables, lists, and hashes are ALL
  signified by a $ at the front of the name. In Perl,
  depending on the context, either $, @, or % is required.
- Filehandles are replaced by file pointers,
  and the file pointers are stored in a variable. This is
  different from Perl because in Perl, filehandles have their
  own namespace. So where in Perl one would do
      open(MYFILE, ">newfile.txt") or die("Couldn't open file to write");
  in PHP, this is
      if (! $myfile = fopen("newfile.txt", "w")) die("Couldn't open file to write");
- There is no regular expression operator in
  PHP - regular expression matching, substitution, etc. are
  all standard functions. This tidies up the syntax no end, in
  the author's opinion. However, most PHP builds have PCRE
  built in, which supplies Perl-compatible regular expression
  functions for some seriously funky wizardry if you're that
  perverted a programmer that you feel you need to make your
  code illegible ;-)
- In Perl, the default return value of a
  function (if one is not specifically given) is the result of
  the last statement in the function. In PHP, a function with
  no return statement returns NULL. In all cases, it's wise to
  return a value explicitly.
- Only one item can be returned from a
  function, though that one item can be a list. This is
  discussed later on.
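To show what "regular expressions are just functions" looks like in practice, here is a small sketch using two of the PCRE functions, preg_match() and preg_replace(). The sample strings are made up for the illustration.

```php
<?php
$line = "Order 42 shipped on day 7";

// Matching: preg_match() returns 1 on a match and fills $matches.
// In Perl this would be something like: $line =~ /(\d+)/
$found = preg_match('/(\d+)/', $line, $matches);
$first_number = $matches[1];

// Substitution: Perl's s/\s+/ /g becomes a call to preg_replace().
$tidy = preg_replace('/\s+/', ' ', "too    many   spaces");

echo $first_number, "\n";   // prints 42
echo $tidy, "\n";           // prints "too many spaces"
```

The pattern goes inside a quoted string, delimiters and all, which is the main thing Perl coders need to get used to.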
Handy resources
- The PHP Manual, available as downloadable versions and as an
  online annotated version.
The compiled help format is in the new Windows HTML Help
style, which is extremely easy to search, browse and use.
It's utterly indispensable.
The online version of the manual has some advantages
however; most importantly, the user comments. Here people
have added notes with examples, caveats and other useful
comments. If you're stuck with a new function, you could do
a lot worse than checking the online manual out.
Syntax
Where to put PHP code.
When the PHP program runs through your script,
it looks for either <?php or <?
then runs everything from there until the next ?>
You can configure PHP to run code between ASP-style <%
and %> tags too, though on IIS
this probably isn't the wisest of ideas.
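As a rough illustration of how PHP and HTML mix in one file, here is a minimal sketch. The $visitor variable and its value are invented for the example; anything outside the tags is passed through to the browser as-is.

```php
<?php
// Everything between the opening tag above and the closing tag below is run as PHP.
$visitor = "Bob";   // hypothetical value, just for the demonstration
?>
<p>This line is ordinary HTML and is sent to the browser untouched.</p>
<p>Hello, <?php echo $visitor; ?>!</p>
<?php
// Back in PHP mode for the rest of the file.
echo "<p>2 + 2 = " . (2 + 2) . "</p>\n";
```

I've used the long <?php form here because it works whether or not the short <? form is switched on in your configuration.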
You need to tell your web server to run PHP
files through the PHP program, too. This depends on which web
server you have, though the PHP package has good instructions
for most of the popular types. You should put your PHP code in
files with the extension .php and
configure your webserver to run .php
files through the PHP program, though you can also force PHP to
handle all .html files. The advantage
to setting up .php as an extension is
that files with no PHP code won't go through the PHP program, so
you'll get better server performance.
Statements
All statements end with a ;
Whitespace is ignored, so you can stack several
commands on one line, as long as you end every statement with a ;
Single line comments begin with //
and finish at the end of the line
Multiple line comments begin with /*
and end with */
You can group operations in brackets
( ) to make them more readable to humans. In fact it's
highly recommended anyway so you know exactly what's going to
happen when you run the code.
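Here is a short sketch that puts those rules together: both comment styles, several statements on one line, and brackets used to make the order of a calculation obvious. The variable names are arbitrary.

```php
<?php
// A single-line comment runs to the end of the line.

/* A multi-line comment
   can span several lines. */

$a = 2; $b = 3; $c = 4;        // three statements stacked on one line

$plain   = $a + $b * $c;       // multiplication happens first: 2 + 12
$grouped = ($a + $b) * $c;     // brackets force the addition first: 5 * 4

echo $plain, "\n";             // prints 14
echo $grouped, "\n";           // prints 20
```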
Variables
In PHP as in most computer programming
languages, data can be stored in named areas of memory called
variables. Variables can contain numbers, text, even binary
data, as well as arrays (lists). They're also used to store
references to files and database query results.
Arrays are special, so I'll go into those
separately, but for the most part using variables is simple. All
variable names begin with a $ followed by a letter or an
underscore ( _ ). After that first character you can use letters,
numbers and underscores, but not spaces. Variable names are case
sensitive - beware! $FOO is not the
same as $foo.
$foo = 3;
$myline = "Hello world";
Unlike some other languages, you don't have to
declare your variables before you use them, though for security
reasons, it's recommended that you declare some of them (for
example, session variables). I'll cover this in greater detail
later.
Constants
Constants are much like variables, except you
can't change them. You can declare them at the beginning of a
script and use them instead of any fixed number that you can't
be sure will never change. PHP sets some constants for you, like
TRUE, FALSE
and NULL.
To create a constant, you use the define()
function. Constants can be named just like variables, and they
can store anything a variable can except an array.
define("MY_CONSTANT", "Hello World");
echo MY_CONSTANT; // prints "Hello World"
A lot of mathematical constants are also preset
for you when PHP starts up, for example M_PI
for pi. Read the "Mathematical functions" section of
the PHP manual for the complete list.
Strings
Strings are text, pure and simple. There are two
ways of storing strings: inside single quotes and inside double quotes.
Using "double-quotes" allows you to use special
characters like \n for a newline. The
backslash character marks the next character as special, though
the only characters you can backslash in 'single-quoted' strings
are \' (a single quote) and \\
(a backslash).
Some other special characters (for double-quoted
strings):
Sequence   Meaning
\t         TAB
\n         newline
\r         carriage return. In DOS text files, a new line is
           \r\n, though in practice in Windows you can get away
           with UNIX format, which is just \n
\$         $
\"         "
\'         ' (only needed inside single-quoted strings; inside
           double quotes you can type a plain ')
The most important difference between single and
double-quoted strings is that double-quoted strings can contain
variable names:
$foo = 'This doesn\'t work'; // Oh yes it does!
$foo = "Matt's second line also works";
$age = 18;
echo 'Bob is $age'; // prints "Bob is $age"
echo "Bob is $age"; // prints "Bob is 18"
Check out the manual for more string stuff --
this is generally enough to get along with in daily use.
Operators
There are some operators that work directly on
variables to save you time and effort in programming. For
example, ++ and -- add 1 to and subtract 1 from the contents of
the variable given:
$foo = 3;
$foo = $foo + 1; // The 'old' way of doing this. $foo is now 4
$foo++; // $foo is now 5
++$foo; // $foo is now 6
The difference between ++$foo
and $foo++ is the point at which the
number is added. If you're using $foo
in a statement more complicated than those above, you'll find
that :
$foo = 3;
$bob = $foo++; // $bob is 3, $foo is 4.
$foo = ++$bob; // $bob and $foo are both 4
In the first line, we set $foo
to 3. The following line reads in plain English as "Set $bob
to the value of $foo, then add one to $foo",
whereas the last line reads "Add one to $bob
and set $foo to the new value".
In general, if you have to think about what's going to
happen, then you should write your code in a more simple way --
think of the poor git who's going to replace you! You're excused
if:
If you want to add/subtract more than one, use
the += and -= operators.
For strings you can use the .=
operator.
$foo = "Fred is ";
$foo .= "Bob's friend"; // $foo now contains "Fred is Bob's friend"
$foo = 1;
$bob = 3;
$foo += 2; // $foo is now 3
$foo += $bob; // $foo is now 6
The usual operators
Use + to add, -
to subtract, / to divide, *
to multiply, and % to get the modulus
(remainder). These make sense with numbers. They don't with
words. Use . to join multiple items
(like pieces of text):
echo "You have " . $items . " items.";
Comparison operators
Use comparison operators in tests like if, while
and for loops to test a condition. These operators will compare
the items on either side and will return true or false. Again,
most of these only work meaningfully with numbers.
$a == $b    Returns true if the contents of $a match those of $b, false otherwise.
$a < $b     True if $a is less than $b
$a > $b     True if $a is greater than $b
$a <= $b    True if $a is less than, or equal to $b
$a >= $b    True if $a is greater than, or equal to $b
Logical operators
As above, these operators return true or
false depending on the outcome of the test they perform on the
two sides of the operator. Each side is taken to be a binary
value, either true or false. In PHP, 0, 0.0, the empty string "",
the string "0", an empty array and NULL are false;
everything else (including negative numbers) is TRUE.
Operator   Name      Description
&&         AND       Returns true if both sides of the
                     operator evaluate to true. However, if the left
                     hand side isn't TRUE, PHP won't bother testing
                     the right hand side.
&& !       AND NOT   Not an operator of its own, just && followed
                     by !. $a && !$b returns true if the left side
                     is TRUE and the right side is FALSE.
||         OR        Returns true if the left side or the
                     right side is true. Note that PHP won't bother
                     checking the right hand side if the left hand
                     side is true.
!          NOT       This is one-sided only. !$a will be
                     TRUE if $a is FALSE.
Because PHP (like C and Perl)
"short-circuits" these operators when it can, you can
use them as control structures:
($fp = fopen("filename.txt", 'r')) || die("Couldn't open file for read");
PHP runs the first side, ($fp
= fopen("filename.txt", 'r')). If the fopen()
function call opens the file, the file pointer is stored in $fp
and this side of the line returns TRUE.
Since || means "one side or the other
(or both) must be true", PHP doesn't bother running the
right hand side, because it knows that one side is already true,
which is all the operator requires. Now, if the file-open
function had failed, the left hand side would've returned FALSE,
so PHP would have to check the right hand side to
see if that's true. The right hand side runs the die()
function, which quits PHP with an error message, in this case
"Couldn't open file for read". In short, PHP quits if
it can't open the file.
Type casting
Like Perl, variables are cast automatically into
the right types (integer, floating point, string, etc.) though
you can force them if you want. Just put the required type in
front of the variable in brackets
$mynum = 3.01;
$intnum = (int) $mynum;
Check this out in the PHP manual if you want to
know more about typecasting. In general it's not necessary since
there are functions for dealing with (for example) number
formatting.
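A quick sketch of both kinds of conversion, plus one of those formatting functions (number_format()). The variable names and values are just examples.

```php
<?php
$mynum  = 3.99;
$intnum = (int) $mynum;               // forced cast: the fraction is dropped
$text   = (string) $intnum;           // forced cast to a string
$sum    = "10" + 5;                   // automatic: the string "10" is treated as a number
$price  = number_format(1234.5, 2);   // formatting without any casting

echo $intnum, "\n";   // prints 3
echo $sum, "\n";      // prints 15
echo $price, "\n";    // prints 1,234.50
```

Note that casting to an integer chops the fraction off rather than rounding it.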
Arrays
Arrays in PHP are pretty much different from any
other implementation the author has seen in other languages. The
concept of an array is simple - it's a list of items:
$my_array = array( "bob", "fred", "barney", "thelma" );
Accessing the array is done as a whole ($my_array)
or by array item. Items are numbered from 0
at the beginning, so $my_array[0] is
"bob" and $my_array[3] is
"thelma"
So far, this should be familiar to many coders.
You don't have to declare arrays in PHP, or their length - the
size of an array will adapt to meet the contents. Array items
can be strings, numbers, even other arrays:
$my_array[4] = array("billy-anne", "billy-bob", "billy-sue");
A useful function to learn here is the print_r()
function to show the structure and contents of any variable. You
can use it to help you visualise how this data is arranged. You
can also run PHP from the command line if you have the php-cli
program in your PHP installation package (from version 4.2.x
onward). To run a script from the command line, use:
php -q myscript.php
The -q option stops the HTTP headers from showing on the
command line, which keeps your screen tidy. (myscript.php stands
for whatever you called your file.)
Try saving the following lines to a file then
running it through PHP.
<?
$my_array = array( "bob", "fred", "barney", "thelma" );
$my_array[4] = array("billy-anne", "billy-bob", "billy-sue");
print_r($my_array);
?>
Now, the powerful and unique feature of PHP
arrays (and also a big stumbling block for many) is that PHP
arrays aren't just numbered, they're named. For good coding, you
shouldn't assume that your arrays are numbered, or even numbered
in order! This means that a traditional for(
$i = 0; $i < count($my_array); $i++) loop won't
necessarily work.
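To see why the counting loop can let you down, here is a small sketch that knocks a hole in a numbered array and then steps through it with foreach (covered in the Loops section) instead.

```php
<?php
$my_array = array("bob", "fred", "barney", "thelma");
unset($my_array[1]);   // remove "fred"; the remaining keys are 0, 2 and 3

// count($my_array) is now 3, so a counting loop would visit keys 0, 1 and 2:
// key 1 no longer exists, and key 3 ("thelma") would never be reached.

$seen = array();
foreach ($my_array as $key => $value) {
    $seen[] = "$key=$value";
}
echo implode(", ", $seen), "\n";   // prints "0=bob, 2=barney, 3=thelma"
```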
Named arrays are assigned slightly differently,
but not much - you can use this format for numbered arrays too:
$this_user = array( "firstname" => "Bob",
"surname" => "Smith",
"age" => 24 );
$this_user['firstname'] = "Robert";
Hopefully you can see from this example why
named arrays are useful. Later on in the database section we
show how you can retrieve rows from a database query and store
each row as a named array. Identifying in your HTML code what $row[4]
is can be annoying and time-consuming, because you have to go
back and find the SQL statement and count to the 5th field...
etc. But if you see $row["age"]
you know exactly what part of the data is being shown.
(Un)fortunately it doesn't end there with arrays
- you can treat them like a stack and push things onto, or pop
them off of the top, like some programmers would with assembler
or forth. PHP also lets you shift and unshift items to/from the
bottom of the stack too. And because you can store arrays in
arrays, you can simulate trees. If you're not a full programmer
yet and this means nothing to you.. good! Less work for me :-)
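For the curious, a brief sketch of the stack-style functions and of a tiny tree built from arrays inside arrays. The names in the arrays are just sample data, and the "name"/"children" layout of the tree is my own choice for the example, not anything PHP requires.

```php
<?php
$stack = array("bob", "fred");

array_push($stack, "barney");       // add to the end:      bob, fred, barney
$top = array_pop($stack);           // take from the end:   "barney"

array_unshift($stack, "thelma");    // add to the start:    thelma, bob, fred
$first = array_shift($stack);       // take from the start: "thelma"

// A tree is just arrays stored inside arrays.
$tree = array(
    "name"     => "root",
    "children" => array(
        array("name" => "leaf1", "children" => array()),
        array("name" => "leaf2", "children" => array()),
    ),
);
echo count($tree["children"]), "\n";   // prints 2
```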
Read the manual's function reference - it has a
whole section of array functions that let you shuffle, reverse,
sort, splice, count, merge, and filter arrays. You can subtract
one array from another, or build an array of the unique values
of another array.. in short, you can do a helluva lot with
arrays. They're one of the most powerful features of PHP.
I'll cover some simple array functions (like
looping over the contents) and ask you to play with the manual a
little.
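As a taster before you open the manual, here is a sketch of a few of those array functions at work. I've wrapped some calls in array_values() purely to renumber the results from 0 so they're easier to compare; the numbers are arbitrary.

```php
<?php
$numbers = array(3, 1, 2, 3, 1);

$unique = array_values(array_unique($numbers));   // duplicates removed: 3, 1, 2
sort($unique);                                    // sorted in place:    1, 2, 3

// "Subtract" one array from another
$diff = array_values(array_diff(array(1, 2, 3, 4), array(2, 4)));   // 1, 3

$merged   = array_merge(array(1, 2), array(3));   // 1, 2, 3
$reversed = array_reverse($merged);               // 3, 2, 1

print_r($unique);
print_r($diff);
print_r($reversed);
```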
Control Structures
Loops
for
syntax: for ( expr1; expr2; expr3 ) { // code }
example: for ( $i = 0; $i < 10; $i++ ) { echo $i; }
// prints 0123456789
description: This is a C-like for-loop. It runs expr1
once, and once only at the start of the loop. On every iteration
of the loop, PHP checks expr2 -- if
it's TRUE, the contents of the { }
block are run, otherwise it stops. At the end of every loop, expr3
is run.
foreach
syntax: foreach ($array as $array_item_value) { // code }
or: foreach ($array as $array_item_name => $array_item_value) { // code }
description: PHP didn't have a foreach construct
until version 4 because of the way PHP arrays are different from
normal arrays. There's a similar way of doing this in PHP 3
which I'll show later because it's what I'm used to doing -- if
you have to maintain anyone else's code, you'll need to
recognise it too. Basically what this loop does is step through
every item in the array called $array,
and put the value of that item into $array_item_value.
If you're looking at the second form, the name/number of that
item is also stored in $array_item_name.
Using this loop you can perform the same action on every item
in an array (like printing it to the screen, or inserting it
into a database, or building an HTML table).
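A minimal sketch of the second form in use, stepping through a named array and collecting one line of text per item. The $this_user data is the same sample used in the Arrays section.

```php
<?php
$this_user = array(
    "firstname" => "Bob",
    "surname"   => "Smith",
    "age"       => 24,
);

$lines = array();
foreach ($this_user as $name => $value) {
    // $name holds the key ("firstname", ...), $value holds the item itself
    $lines[] = "$name: $value";
}
echo implode("\n", $lines), "\n";
// prints:
// firstname: Bob
// surname: Smith
// age: 24
```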
The PHP3 alternative for going through each item
in an array was:
while (list($array_item_name,$array_item_value) = each($array)) {
// code
}
You'll probably see a lot of that if you're
maintaining someone else's code.
while
syntax: while ( expr ) { // code }
example: $i = 0; while ( $i < 10 ) { print $i; $i++; }
// prints 0123456789
description: This evaluates expr and, if it's TRUE,
runs the code, then evaluates expr again, and so on until expr
is FALSE.
If the expr is FALSE before the first loop
starts, the loop doesn't run at all.
do..while
syntax: do { // code } while ( expr );
example: $i = 0; do { print $i++; } while ( $i < 10 );
// prints 0123456789
description: Almost exactly the same as a while
loop except the expression is tested at the end, so the loop is
guaranteed to run at least once.
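The difference between the two is easiest to see when the expression is already FALSE at the start. This sketch counts how many times each kind of loop runs its code; the counter names are made up for the example.

```php
<?php
// while: the test comes first, so the code never runs
$i = 10;
$while_runs = 0;
while ($i < 10) {
    $while_runs++;
    $i++;
}

// do..while: the test comes last, so the code runs once
$i = 10;
$do_runs = 0;
do {
    $do_runs++;
    $i++;
} while ($i < 10);

echo "while ran $while_runs times, do..while ran $do_runs times\n";
// prints "while ran 0 times, do..while ran 1 times"
```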
Conditionals
if
syntax: if ( expr ) { // code }
example: if ( 1 == 2 ) { echo "Mathematics is a LIE!"; }
description: Only runs the code if the
expression is TRUE, else does nothing.
if else
syntax:
if ( expr ) {
    // code
} else {
    // code
}
example:
if ( 10 < 20 ) {
    echo "10 is less than 20.. phew!";
} else {
    echo "Ye cannae change the laws o' maths, Jim!";
}
description: As with the if statement, if the
expression in brackets is TRUE then the code in the first set of
curly braces is run, but in this statement if the expression is
false, then the code in the curly braces after the 'else'
statement is run.
If you live in the same dimension as me, then
the example above should never talk like Scotty.
if .. elseif .. else
syntax:
if ( expr1 ) {
    // code1
} elseif ( expr2 ) {
    // code2
} else {
    // code3
}
example:
if ( $a == 1 ) {
    echo "\$a is 1!";
} elseif ( $a == 2 ) {
    echo "\$a is 2!";
} else {
    echo "\$a is something else!";
}
description: As with the if-else statement
above, PHP starts at the top and tests expr1.
If it evaluates TRUE, then code1 is
run, then PHP leaves the block. If expr1
is false, then expr2 is tested and so
on. If no if/elseif expressions are true, then PHP runs the else
block. You can have as many elseif
blocks as you like, and the end else
is optional.
switch
syntax:
switch ( expr ) {
    case result1:
        // code
        break;
    case result2:
        // code
        break;
    default:
        // code
}
example:
switch ( $i ) {
    case 1:
        echo "\$i is 1";
        break;
    case 2:
        echo "\$i is 2";
        break;
    default:
        echo "Dunno what \$i is";
}
description: The switch statement is quite close
to the if-elseif-else statement; expr
is evaluated, then PHP goes through each case until it finds a
matching value, then runs all the code in the switch block until
the end (including the code of all cases below it). Sometimes
that's useful; the rest of the time, use a 'break'
statement to leave the switch block when you've run the right
case, like in the example above. If no cases match and there is
a default case, PHP runs it. The default case is optional, and
by convention it goes at the end of the statement.
With a switch, the expression has to be
something simple like an integer, or a floating point number, or
a string. Objects and arrays don't work well in switch-blocks.
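Here is a sketch of one of those times when running on into the next case is useful: two values sharing the same code. The describe_day() function and its day codes are hypothetical, made up for this example.

```php
<?php
function describe_day($day) {
    switch ($day) {
        case "sat":            // no break here, so "sat" runs on into "sun"
        case "sun":
            $label = "weekend";
            break;
        case "fri":
            $label = "nearly there";
            break;
        default:
            $label = "weekday";
    }
    return $label;
}

echo describe_day("sat"), "\n";   // prints "weekend"
echo describe_day("tue"), "\n";   // prints "weekday"
```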
Functions
Functions are extremely powerful, and should be
familiar to anyone who knows another programming language. PHP
has MANY functions built in, or available through modules you
can load at run-time or in php.ini
.. and you can make your own. To call a function, you just type
its name in your code, followed by a pair of brackets. Functions
have their own variable space (called a scope) that lasts as
long as the function is running, then it's shut down. You can
import variables from outside a function by declaring them
global within the function, though if you unset a variable
inside a function, it won't be unset outside.
The simplest way of declaring a function is
this:
function my_function() {
    // code
}
And to run it:
my_function();
Put re-usable code inside functions to save
yourself time and effort. You can store many useful functions
together in one file and include() or require()
them in other scripts so that you never have to repeat your
code.
If you want your functions to return values, use
the 'return' statement. You can only
return one thing, but that one thing can be an array so this
isn't much of a limitation (you've seen that arrays can store
just about everything anyway).
The most useful functions take arguments -
options, as it were - and return values. In PHP3, one had to
name all the arguments that a function required, and supply
defaults for the optional values. In PHP4, we can now use the
simple function declaration above and check for arguments. This
lets us be a lot more flexible. Still, sometimes the old ways
are best -- if you declare the function arguments in the old
style, it's very easy to see what is required to make a function
run.
function my_func($a, $b) {
    $output = $a * $b;
    return $output;
}
Here we have created my_func
to require two arguments. Then it multiplies them together and
returns the result. $output never
exists outside the function, by the way, and if you call my_func()
again, it won't remember what it was last time. Functions that
remember their state are called generators, and you see mention
of them in languages like Python. Using global and static
variables lets you simulate that kind of behaviour with PHP
functions, but that's something to play with on a rainy day,
right?
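If the rainy day has arrived, here is a rough sketch of the two ideas just mentioned: a function that remembers something between calls using a static variable, and a function declared with no arguments that checks what it was given using func_get_args(). Both function names are invented for the example.

```php
<?php
// Remembers its state between calls
function next_ticket() {
    static $ticket = 0;   // set to 0 once, then kept between calls
    $ticket++;
    return $ticket;
}

// Declared with no arguments, but accepts as many as you pass
function sum_all() {
    $total = 0;
    foreach (func_get_args() as $number) {
        $total += $number;
    }
    return $total;
}

echo next_ticket(), "\n";       // prints 1
echo next_ticket(), "\n";       // prints 2
echo sum_all(1, 2, 3), "\n";    // prints 6
```

The catch with the second style is the one described above: nothing in the declaration tells the reader what the function expects.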
Now, to call my_func and catch the answer:
$result = my_func(10, 20);
Or how about:
Or even:
if ( $result = my_func(10,20) ) {
    echo "My_Func returned a value of: $result";
} else {
    echo "My_func returned zero or nothing.";
}
Returning values
You've seen above that you can simply use a return
statement to hand back a single value.
But if you want to return more than one thing,
you can use the array() function to build an array to return:
return array($item1, $item2, $item3);
or:
return array("result" => $item1,
"error" => 0 );
Capture your function's return value as above,
then treat it like the array it is. You could do:
$output = complex_func(10, "bob", 28.73);
if ($output['error']) {
echo "There was an error!";
} else {
echo "Returned output: " . $output['result'];
}
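To make that pattern complete, here is a sketch of a function that builds such an array itself. safe_divide() is a hypothetical stand-in for complex_func(), chosen only because it has an obvious way to fail.

```php
<?php
function safe_divide($a, $b) {
    if ($b == 0) {
        // Nothing sensible to return, so flag the error instead
        return array("result" => null, "error" => 1);
    }
    return array("result" => $a / $b, "error" => 0);
}

$output = safe_divide(10, 4);
if ($output['error']) {
    echo "There was an error!\n";
} else {
    echo "Returned output: " . $output['result'] . "\n";   // prints "Returned output: 2.5"
}
```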
That's it! That's as much grounding on the
basics as I can give. It should be enough for you to understand
these examples -- except perhaps the database code, which is
going to need some new concepts introduced. Work through, run
them, play with them a bit.. you should find them useful, and
they'll show you how to step through arrays, how to access
files, etc. Read the comments.. there's more comment than code
below, but it's there for a reason :-)
Examples with walkthroughs
Web page counter
A simple application and one you can put in any
page. For example, you could drop a single include() line into
any PHP page you want a hit-counter on, or tell your web
server to include this script at the bottom of every page (IIS
and Apache at least can do this).
This is a text counter, not one with images.. I
hate those!
<?
/* Start of the script. If there's a counter file, read it and get the current
 * number of hits from it. Then we add one, display the number, and rewrite the
 * file. */
if ( is_file("counter.txt") ) {
    /* The file exists, so read it. The file() function reads a whole file
     * into memory. With a counter file, which is tiny, there's no problem
     * with this. The file in memory is stored as an array with one line
     * per array item, so array[0] is the first line, etc. We only need the
     * first line */
    $file = file("counter.txt");
    $count = (int) rtrim($file[0]);
    /* So $count now contains the first line, minus any whitespace and/or
     * newlines at the end, cast to a number. Lucky for us, the (int) cast
     * turns useless text into 0, so we can just add 1 to $count and save
     * that, whatever was in there before! */
    $count++;
} else {
    // There was no counter file, so we'll start from scratch.
    $count = 1;
}
/* Now write the count back to a file, and display the number */
/* First, open the file to write .. */
if ($file_pointer = fopen("counter.txt", "w") ) {
    /* Write the counter to the file we just opened */
    fwrite($file_pointer, $count);
    /* Close the file */
    fclose($file_pointer);
} else {
    /* $file_pointer wasn't set, so fopen() failed,
     * so we can't save the counter. Say so! */
    echo "Couldn't save counter!";
}
// Finally, show the counter.
echo "This page has been visited $count times";
?>
Browser-specific code
Sometimes when you're writing HTML you'll find
yourself in a situation where the code you want to use will only
work on one browser (and we all know which browser that is). Not
only won't it work on other browsers, but one specific browser
(and we all know which that is) crashes and burns, or won't show
the page at all. In this case you can detect the browser with
PHP and only show the dodgy code on IE, or show something else
on Netscape.
Whenever a browser asks a webserver for a page,
it presents some information to the server (like which page it
wants, and what browser it is.. ) and PHP turns this information
into variables when it loads. Check out the
$_SERVER['HTTP_USER_AGENT'] variable; it contains the name and
version of the browser. The problematic browser, especially
regarding Cascading Style Sheet bugs, is Netscape 4, between 4.1
and 4.8. We don't include 4.0 because that's what Internet
Explorer pretends to be.
So now, in your HTML code, or in your
style-sheet if you keep them separate:
<style type="text/css">
<!--
<? // Browser check: Netscape 4.x can't deal with borders on CSS elements
if (!eregi("^Mozilla/4\\.[1-8]", $_SERVER['HTTP_USER_AGENT'])) {
    $css_border = "border: 1px #FFFFFF solid;";
} else {
    $css_border = "";
}
?>
SELECT {
    background: #002F54;
    color: #FFFFFF;
    font-family: Tahoma, Verdana, Arial, Helvetica, sans-serif;
    font-size: 10pt;
    <?=$css_border?>
}
--></style>
It's the "border: 1px #FFFFFF
solid;" that makes Netscape heave.. for whatever
reason. So on Netscape 4.x, we don't print it (it doesn't work
anyway, right?) The eregi() function
tests a regular expression, which is a powerful pattern-matching
algorithm. There's no way I have time to teach you those now,
but there's another tutorial on my site that introduces you to
Perl regular expressions. They're very similar.
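One caveat if you're on a newer PHP: the eregi() function was removed in PHP 7, so there the same check has to be written with the PCRE functions. Below is a sketch of the equivalent test using preg_match(). The is_netscape4() name is my own, and the user-agent strings in the usage lines are sample values rather than anything captured from a real browser.

```php
<?php
// Returns true if the user-agent string looks like Netscape 4.1 to 4.8.
// The trailing "i" makes the match case-insensitive, as eregi() was.
function is_netscape4($user_agent) {
    return preg_match('#^Mozilla/4\.[1-8]#i', $user_agent) === 1;
}

$samples = array(
    "Mozilla/4.7 [en] (WinNT; U)",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
);
foreach ($samples as $agent) {
    echo $agent, " => ", (is_netscape4($agent) ? "Netscape 4" : "something else"), "\n";
}
```

In a real page you would pass it $_SERVER['HTTP_USER_AGENT'] instead of a sample string.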
Redirection
Redirection is sending someone from one page to
another. Or maybe reloading the current page. With PHP, as long
as no data has been sent to the browser yet (any HTML, any echo
commands, etc) you can send an additional HTTP header to the
browser that tells it to go somewhere else:
<?
if (!headers_sent()) {
header("Location: http://www.lazycat.org/tutorials.php");
exit;
}
?>
The exit statement is
important, because you don't want to bother running the rest of
the whole script again when no one's looking. The headers_sent()
function returns TRUE if the headers have already been sent - if
they have and you try to use the header()
function, PHP throws a wobbly and prints errors to the screen
and stuff. Very unprofessional.
Session management to persist user data
securely
This is a trickier concept than most of the
stuff we've covered above, so I'm going to go into some
background first. The protocol that we use today on the world
wide web is HTTP. This much you probably know. It's a state-less
protocol, which you probably didn't. What this means is that
when someone requests a page, the page is sent and the
connection is closed. End of story to the webserver. But for
you, the application writer, you want some way to identify a
single visitor through their visit, because they're not just
getting one page.. they're getting a dozen as they browse, and
maybe they typed in a password on that first page and don't want
to have to log in to every page as they go through your site.
Netscape saw that this was an issue and their
answer was the "magic cookie". A magic cookie is a
little piece of text that a server gives the browser with their
page. The cookie is stored on the browser and it has certain
instructions with it, like how long it's supposed to last, and
which servers it should give the cookie to. Then whenever the
browser asks for a new page, it gives the cookie to the server
as part of the request. So by giving data in a cookie to
someone, then the webserver (and the application) can maintain
variables across connections.
Now the problem with cookies is that people can
read them, and they can change them, and they can make them up
completely because they're on the browser and bad people have
browsers just like good people do. So what session management
does is that it keeps all the data, all the variables on the
server, where they're much safer than on some guy's hard drive,
and links the data to browsers with a unique number, a number
that's very hard to guess. So now when a browser asks for a
page, and gives its cookie, which has a long number in it, PHP
can load the data in the session file with that number, and
retrieve all the variables saved in it.
Which means that if your visitor logs in, you
can save a $_SESSION['logged_in']
variable in the session file and every time you load a page, you
start the session and see if $_SESSION['logged_in']
is set. Which means you don't have to make him log in on every
page, and you can be sure that the user didn't fake a login by
changing the cookie file.
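The login check itself can be very small. This is only a sketch: is_logged_in() is a hypothetical helper, and I've made it take the session array as an argument so you can try it out without a web server. In a real page you would call session_start() first and pass it $_SESSION.

```php
<?php
// Hypothetical helper: true only if the session says this visitor logged in.
function is_logged_in($session) {
    return isset($session['logged_in']) && $session['logged_in'] === true;
}

// Trying it out with hand-made arrays standing in for $_SESSION:
$new_visitor = array();
$member      = array('logged_in' => true);

echo is_logged_in($new_visitor) ? "in" : "out", "\n";   // prints "out"
echo is_logged_in($member) ? "in" : "out", "\n";        // prints "in"
```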
You need to start a session before any data is
sent.. similarly, you need to register session variables before
any data is sent. Data means HTML.. your page. So do this code
right at the start, with no white space before the top of the
page.
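<?
session_start();
// Start the counter at 0 the first time this visitor arrives
if (!isset($_SESSION['count'])) {
    $_SESSION['count'] = 0;
}
$_SESSION['count']++;
echo "You have loaded this page " . $_SESSION['count'] . " times!";
?>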
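The default lifetime of a PHP session cookie is 0
(the session.cookie_lifetime setting), which means the browser
discards it when it closes. Until then, you can browse to other
sites and come back to this one, and the session will still
exist, provided the server hasn't cleaned up the idle session
data in the meantime (session.gc_maxlifetime, 1440 seconds by
default).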
Look up the "Session Management
Functions" in the PHP manual -- you can even write your own
handler to save sessions, storing data in a database or in a
different format.
Remember, the thing that catches nearly everyone
out when writing session code is getting everything done before
the headers are sent. If you want to be sure this never happens,
and don't mind a little performance hit, you can switch on
output buffering in php.ini (the output_buffering setting), which
makes PHP wait until it has finished drawing the entire page
before sending it, so you can send headers anywhere in the script
without worrying about errors.
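If you can't change the ini file, buffering can also be switched on from inside a script. The sketch below uses ob_start() and ob_get_clean(), which is a per-script alternative to the ini setting rather than the same mechanism; the page content is invented for the example.

```php
<?php
// Start buffering: nothing echoed from here on is sent to the browser yet.
ob_start();

echo "<p>Some page content produced early in the script.</p>\n";

// The output above is still sitting in the buffer, so on a real page
// header(), setcookie() or session_start() could still be called here.

// Fetch the buffered page and stop buffering. A real page could simply
// call ob_end_flush() instead of capturing the text in a variable.
$page = ob_get_clean();
echo $page;
```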
Database access
PHP has native connections to about a dozen
kinds of database, each with its own set of functions for
preparing, performing and interpreting queries, though you
should make your life easier and learn the ODBC functions, or
use the PEAR DB or PEAR MDB modules instead. With ODBC you have a
single set of functions that works on any database, and all you
need is an ODBC driver for that database; the PEAR modules give
you a similarly uniform interface in front of PHP's native
database functions. PEAR is
the PHP equivalent of Perl's CPAN - a central repository of code
modules that can be downloaded and updated via a command-line
tool. Check out the PEAR website to
see about installing it - DB is part of the default install, and
anyone who's had enough of PHP's database-specific quirks will
find it refreshingly consistent. MDB is newer and so
supports fewer databases, but has many more features. Using
either of these modules requires you to know how to create and
modify objects though, and that's for another tutorial.
For now I'll show the "old method", which is the ODBC
functions.
In PHP, to get data from a database, you create
a connection to the database and tie it to a variable. Then
every command you want to send to the database uses that link
identifier (meaning you can connect to multiple databases in the
same script, see?)
For an ODBC connection, you have to create a
system DSN using the ODBC administrator -- you'll find that in
the Control Panel on Windows (Administrative Tools on Win2k).
Once you've created the DSN, then you can use the odbc_connect()
or odbc_pconnect() functions. The
difference between them is that the latter is persistent - if
PHP is in memory for more than one script, it will keep
pconnects open after the first script exits, and then if another
script requests a connection using the same username/password
and database, it'll recycle the connection. This can save a lot
of time if you're contacting a remote database. On the CGI
version of PHP though, ALL connections are closed when PHP quits
at the end of the script so there's no difference.
ODBC Database connection:
$link_id = odbc_connect("DSN", "username", "password");
or
$link_id = odbc_pconnect("DSN", "username", "password");
or more safely, if your page NEEDS a database
connection:
if (! ($link_id = odbc_pconnect("DSN", "username", "password")) ) {
die("Couldn't connect to database. Reload the page");
}
This basically quits if a connection fails. You
could go a step up and show some HTML that waits a few seconds
and reloads the page.
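One way to sketch that "wait and reload" page is a meta refresh tag. The helper name retry_page() and the wording of the message are invented for this example; the commented-out lines show where it would slot into the connection check.

```php
<?php
// Hypothetical helper (not a PHP built-in): builds a small HTML page that
// asks the browser to reload the current URL after $seconds seconds.
function retry_page($seconds)
{
    $seconds = (int) $seconds;
    return "<html><head>"
         . "<meta http-equiv=\"refresh\" content=\"" . $seconds . "\">"
         . "<title>Please wait</title></head>"
         . "<body><p>Couldn't connect to database. Retrying in "
         . $seconds . " seconds...</p></body></html>";
}

// Intended use (left commented out because it needs a real DSN):
// if (! ($link_id = odbc_pconnect("DSN", "username", "password")) ) {
//     die(retry_page(5));
// }
echo retry_page(5);
```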
Concepts: result pointers
I'm going to assume you know SQL since you
wouldn't want to be connecting to a database if you didn't.
There are SQL tutorials all over the place, and it's very simple
(to start with, anyway :) NOTE: A lot of tutorials I've seen
preach in MySQL, and tell you how to do stuff like create and
alter databases in PHP, which is stunningly pointless -- the
only things that should be messing around with your DB
design/structure are administration and design tools, or the
MySQL interface. These functions are only useful for people
writing one of the above; skip those bits.
When you have an open database connection, you
need to run a query and catch the result pointer. PHP runs the
query for you and creates a place in memory to store the result.
You can use a whole bunch of functions to move around in, fetch
and display and otherwise mess with the result by using this
pointer, just like the database link identifier. The most useful
functions are
odbc_exec() | Run a query
odbc_fetch_into() | Get a row from the query result, and store it in an array.
odbc_result_all() | Prints an HTML table with the entire database result - really handy for testing, but not really pretty enough for production use.
Look up the others in the manual under
"Unified ODBC Functions"; there's some funky stuff in
there.
This little example creates an HTML table based
on the result of the query given. You can put any query you like
in there instead - the code can display any number of rows and
columns. It's a custom version of odbc_result_all(),
in fact.
if ($result = odbc_exec($link_id, "SELECT username, password FROM mytable ORDER BY username") ) {
    /* $result was set, so the query worked. Beware that the result might
     * actually be empty - a working query doesn't have to return anything,
     * so don't assume anything =^.^= */
    // Start an HTML table
    echo "<TABLE border=1 cellspacing=1>\n";
    // Build a header row (with the field names); ODBC fields count from 1
    echo "<TR>";
    for ($field_num = 1; $field_num <= odbc_num_fields($result); $field_num++) {
        echo "<TH bgcolor=silver>" . odbc_field_name($result, $field_num) . "</TH>";
    }
    echo "</TR>\n";
    /* Now show the data in the result.
     * odbc_fetch_into() builds an array with every field in the row,
     * and join turns an array into a string by joining every item in
     * the array with a set string. The first and last items in an
     * array, of course, don't have the joining string attached, which
     * is why the line below echos "<TR><TD>" and "</TD></TR>" to start
     * and end the HTML table row. odbc_fetch_into() returns false when
     * there are no more rows to show, so it's good for a while loop,
     * which stops when there are no more rows.
     * From PHP 4.2 on, the array comes second and the row number is an
     * optional third argument; leaving it out fetches the next row. */
    $row = array();
    while ( odbc_fetch_into($result, $row) ) {
        echo "<TR><TD>" . join("</TD><TD>", $row) . "</TD></TR>\n";
    }
    // End the table
    echo "</TABLE>";
} else {
    echo "Database error";
}
Writing secure code before PHP 4.2.x
PHP is a honking fat security hole, apparently.
Of course so's a webserver. Imagine something that runs as the
system administrator, and lets ANYONE read files from your hard
drive, and run scripts without them ever needing to log in.
That's a bruiser, and PHP doesn't make much of a difference
after that, unless you write bad code that people can mess
around with.
Be especially careful with session
variables. Say you've got a variable called $logged_in
and you save it in a session. Every time the page loads and you
load the session, PHP restores the value of $logged_in
and then you check to see if it's 1 or 0.
A strength, and a potential issue of PHP is that
you can type stuff in a URL that gets turned into variables when
PHP loads. This is how PHP handles GET forms (and POST forms,
though it's a tiny bit harder to spoof those) - someone could do
http://www.yoursite.net/index.php?logged_in=1&username=admin
And $logged_in would be set to 1, and $username
would be 'admin'
If $logged_in hasn't been registered in the
session yet, someone could do that and pretend they've logged
into your site. This supposes that they know your variable
names, of course.. but nothing's stopping them from guessing. To
get around this, declare your variables at the start of your
code before you load the session, then malicious users can't
"pre-load" those variables.
<?
$logged_in = 0;
session_start();
if (!$logged_in) {
echo "You aren't logged in, go away!";
} else {
include("secret.html");
}
?>
Let's say that you have a form on your page, and
you use the contents of that form to build an SQL query. Someone
could quite easily save the form to their PC, mess around with
it and then submit it, sending data that runs SQL of their
choosing on the end of your query. It's easy enough to do, so
you need to check for this kind of thing.
The easiest way around this is never to use data
from a web form inside database queries, or file-open calls, or
system calls. Sometimes you have to. Fortunately, PHP
has something called 'magic quotes' enabled by default - it'll
add backslashes in front of quotes, backslashes and NUL
characters in data from web forms. Treat that as a safety net
rather than a complete defence; it doesn't validate anything for
you.
But really, I can't rub it in enough -- CHECK
USER INPUT. Whatever comes from outside can potentially be
faked, sometimes by error, sometimes maliciously. Don't trust
anything.
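A minimal sketch of what checking input can look like: decide what a value is allowed to contain and reject everything else before it gets anywhere near a query. The rule used here (3 to 16 letters, digits or underscores) and the function name is_valid_username() are assumptions made up for the example, not anything PHP prescribes.

```php
<?php
// Hypothetical rule for this example: a username is 3 to 16 characters,
// made up only of letters, digits and underscores.
function is_valid_username($input)
{
    // The D modifier makes $ match only at the very end of the string,
    // so a trailing newline can't sneak through.
    return is_string($input)
        && preg_match('/^[A-Za-z0-9_]{3,16}$/D', $input) === 1;
}

$good = "alice_01";
$bad  = "x' OR '1'='1";

var_dump(is_valid_username($good));
var_dump(is_valid_username($bad));
```

Only values that pass a check like this should be placed in a query. Where your ODBC driver supports it, odbc_prepare() and odbc_execute() also let you hand values to the database separately from the SQL text.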
Ways to deal with malicious input (you can't check
too much):
-
If it's a file/folder-name, disallow the .
character (ESPECIALLY at the start, and especially ../ ),
and the / at the beginning of the string. This stops people
going back one or more folders (i.e.
../../../../../../../etc/passwd :-)
-
If it's a filename, don't let the user choose
the full name - add the file extension yourself. Or use a
naming rule where you can and check it with a regular
expression
-
If your form allows file uploads, then use
the is_uploaded_file() and/or move_uploaded_file()
functions to handle them because they make sure that no one
is spoofing the filename and stealing a file from your
server.
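The first two rules in the list can be sketched as one small function. The name safe_filename(), the 64-character limit and the default "txt" extension are all assumptions made for this example; pick a naming rule that suits your own files.

```php
<?php
// Hypothetical helper (not a PHP built-in): turns a user-supplied name
// into a file name we are willing to open, or returns false.
// Assumes all files live in one folder and we choose the extension.
function safe_filename($input, $extension = "txt")
{
    // Letters, digits, underscores and hyphens only: no dots and no
    // slashes, so "../" and absolute paths can never get through.
    if (!is_string($input)
        || preg_match('/^[A-Za-z0-9_-]{1,64}$/D', $input) !== 1) {
        return false;
    }
    // The extension is added by us, never taken from the user.
    return $input . "." . $extension;
}

var_dump(safe_filename("report_2002"));
var_dump(safe_filename("../../etc/passwd"));
```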
If you think some of these are nasty or unlikely:
it can be done, it has been done, and it will be done again.
Don't imagine for a second that you're immune!
Security changes in PHP 4.2.x
These methods have been around for ages, but
weren't enforced. Up until PHP 4.2.x, the default
behaviour of PHP on starting up (the register_globals setting,
which was on by default) was to turn all input from the
GET, POST, COOKIE and SESSION environments into variables.
So if you had a form input on a page called "myvar",
then when PHP loads up the script for that request, it would set
$myvar to the content of the input
field. The problem with this is that sometimes you don't want
people to be setting variables in your script - what if you
forget to declare a variable like login-status and someone sets
it for themselves, as described above? Many people who
dislike PHP have this as their primary argument. Well in
PHP 4.2.x the behaviour has changed.
Now, form, cookie and session variables are all
stored in their own arrays, and not set as global variables when
PHP starts up. In PHP 4.1.x and earlier you could access
form inputs in the $HTTP_POST_VARS['var'],
$HTTP_GET_VARS['var'], and $HTTP_SESSION_VARS['item']
arrays, but it was pretty rare for people to do so. Now it's forced,
but to help us beleaguered programmers out, starting from PHP
4.1.x these variables also have short versions called "superglobals".
You don't have to declare them inside functions, and you know exactly
where your data is coming from: $_SESSION,
$_FILES, $_COOKIE,
$_SERVER, $_ENV,
$_POST, and $_GET.
Also, if you use the $_SESSION array,
new variables are registered automatically when you assign to
it, so you don't have to worry about session_register().
You still need to call session_start() first, unless
session.auto_start is switched on in php.ini.
If you're writing scripts that have to run on
PHP 4.0.6 as well as 4.2.x, then you can still use the $HTTP_x_VARS
variables, but performance enhancements, PEAR, and security
fixes should combine to be a pretty good excuse to upgrade, and
using the new superglobals ensures that you know where your data
is coming from and that you can access it from anywhere in your
scripts, even inside functions and objects. Don't let this give
you a false sense of security though - always check and validate
any data that comes from outside the script; files, form inputs,
even environment variables. You can never be too
safe.
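As a small sketch of validating data on its way out of a superglobal, the helper below only accepts a value made up entirely of digits and falls back to a default otherwise. The name int_param() is invented for this example, and a plain array stands in for $_GET so the code can be run outside a web server.

```php
<?php
// Hypothetical helper (not a PHP built-in): fetch a whole-number
// parameter from an input array such as $_GET or $_POST.
function int_param($source, $name, $default)
{
    if (isset($source[$name])
        && is_string($source[$name])
        && ctype_digit($source[$name])) {
        return (int) $source[$name];
    }
    // Missing, empty, or not purely digits: use the default instead.
    return $default;
}

// Stand-in for $_GET after a request like index.php?page=3&sort=name
$fake_get = array('page' => '3', 'sort' => 'name');

echo int_param($fake_get, 'page', 1) . "\n";
echo int_param($fake_get, 'sort', 1) . "\n";
```

On a real page the call would read something like int_param($_GET, 'page', 1), which keeps it obvious where the value came from.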
Conclusion
I hope I've given enough of a theoretical
grounding in PHP that you can write (and have written) some short
scripts. Now it's time to play with the manual - look at the
functions there, put them into your scripts, play around. If all
else fails, well you have the examples above which are generally
quite useful. I'm working on an Object Oriented PHP
tutorial for people who've found this tutorial easy enough and
have gotten a little practice in, so keep an eye out for it if
you thought this was easy. Object orientation in PHP is about to
take a big leap forward with the new Zend Engine 2, which is
slated for PHP 5 rather than the 4.x series; if you're familiar
with other OO languages then you'll
find the new PHP version brings in some of your favourite
features to make it a better OO language than it ever has been
before. If you don't know what I'm talking about... well, read
the tutorial when it's done.