Application Techniques in PHP

Tarun Aror

In this article, the author explains some techniques that you may find useful in your PHP programming, such as code libraries, templating systems, efficient output handling, error handling, and performance tuning.

As you've seen, PHP ships with numerous extension libraries that combine useful functionality into distinct packages that you can access from your scripts. In addition to using the extensions that ship with PHP, you can create libraries of your own code that you can use in more than one part of your web site. The general technique is to store a collection of related functions in a file, typically with a .inc file extension. Then, when you need to use that functionality in a page, you can use require_once( ) to insert the contents of the file into your current script.

For example, say you have a collection of functions that help create HTML form elements in valid HTML—one function creates a text field or a textarea (depending on how many characters you tell it the maximum is), another creates a series of pop-ups from which to set a date and time, and so on. Rather than copying the code into many pages, which is tedious, error-prone, and makes it difficult to fix any bugs found in the functions, creating a function library is the sensible choice.

When you are combining functions into a code library, you should be careful to maintain a balance between grouping related functions and including functions that are not often used. When you include a code library in a page, all of the functions in that library are parsed, whether you use them all or not. PHP's parser is quick, but not parsing a function is even faster. At the same time, you don't want to split your functions over too many libraries, so that you have to include lots of files in each page, because file access is slow.
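As a rough illustration of the idea, here is a minimal sketch of what one function in such a form-element library might look like. The file name form_helpers.inc, the function name form_text_field( ), and the 80-character cutoff are all assumptions made up for this example, not part of any real library.

```php
<?php
// form_helpers.inc -- hypothetical library file; all names are illustrative.

/**
 * Return a text input for short values or a textarea for long ones.
 * The 80-character cutoff is an arbitrary choice for this sketch.
 */
function form_text_field($name, $maxLength)
{
    $safeName = htmlspecialchars($name, ENT_QUOTES);
    if ($maxLength > 80) {
        return '<textarea name="' . $safeName . '" rows="4" cols="40"></textarea>';
    }
    return '<input type="text" name="' . $safeName . '" maxlength="' . (int) $maxLength . '" />';
}

// A page that needs the helper would normally begin with:
//     require_once('form_helpers.inc');
echo form_text_field('name', 40), "\n";
echo form_text_field('comments', 500), "\n";
```

Because the function lives in one file, a bug fix made there reaches every page that includes it.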

Templating Systems

A templating system provides a way of separating the code in a web page from the layout of that page. In larger projects, templates can be used to allow designers to deal exclusively with designing web pages and programmers to deal (more or less) exclusively with programming. The basic idea of a templating system is that the web page itself contains special markers that are replaced with dynamic content. A web designer can create the HTML for a page and simply worry about the layout, using the appropriate markers for different kinds of dynamic content that are needed. The programmer, on the other hand, is responsible for creating the code that generates the dynamic content for the markers.

To make this more concrete, let's look at a simple example. Consider the following web page, which asks the user to supply a name and, if a name is provided, thanks the user:

<html>
<head>
<title>User Information</title>
</head>

<body>
<?php if (!empty($_GET['name'])) {
// do something with the supplied values
?>

<p><font face="helvetica,arial">Thank you for filling out the form,
<?php echo $_GET['name'] ?>.</font></p>
<?php }
else { ?>
<p><font face="helvetica,arial">Please enter the
following information:</font></p>

<form action="<?php echo $_SERVER['PHP_SELF'] ?>">
<table>
<tr>
<td>Name:</td>
<td><input type="text" name="name" /></td>
</tr>
</table>
</form>
<?php } ?>
</body>
</html>

The placement of the different PHP elements within various layout tags, such as the font and table elements, is better left to a designer, especially as the page gets more complex. Using a templating system, we can split this page into separate files, some containing PHP code and some containing the layout. The HTML pages will then contain special markers where dynamic content should be placed. Example-1 shows the new HTML template page for our simple form, which is stored in the file user.template. It uses the {DESTINATION} marker to indicate the script that should process the form.

Example-1. HTML template for user input form

<html>
<head>
<title>User Information</title>
</head>

<body>
<p><font face="helvetica,arial">Please enter the following
information:</font></p>

<form action="{DESTINATION}">
<table>
<tr>
<td>Name:</td>
<td><input type="text" name="name" /></td>
</tr>
</table>
</form>
</body>
</html>

Example-2 shows the template for the thank you page, called thankyou.template, that is displayed after the user has filled out the form. This page uses the {NAME} marker to include the value of the user's name.

Example-2. HTML template for thank you page

<html>
<head>
<title>Thank You</title>
</head>

<body>
<p><font face="helvetica,arial">Thank you for filling out the form,
{NAME}.</font></p>
</body>
</html>

Now we need a script that can process these template pages, filling in the appropriate information for the various markers. Example-3 shows the PHP script that uses these templates (one for before the user has given us information and one for after). The PHP code uses the FillTemplate( ) function to join our values and the template files.

Example-3. Template script

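// $PHP_SELF exists only when register_globals is on; use the superglobal instead
$bindings['DESTINATION'] = $_SERVER['PHP_SELF'];

// avoid an "undefined index" notice when the form has not been submitted yet
$name = isset($_GET['name']) ? $_GET['name'] : '';

if (!empty($name)) {
    // do something with the supplied values
    $template = "thankyou.template";
    // escape user input before it is placed into the HTML page
    $bindings['NAME'] = htmlspecialchars($name);
}
else {
    $template = "user.template";
}

echo FillTemplate($template, $bindings);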

Example-4 shows the FillTemplate( ) function used by the script in Example-3. The function takes a template filename (to be located in the document root in a directory called templates), an array of values, and an optional instruction denoting what to do if a marker is found for which no value is given. The possible values are: "delete", which deletes the marker; "comment", which replaces the marker with a comment noting that the value is missing; or anything else, which just leaves the marker alone.

Example-4. The FillTemplate( ) function

function FillTemplate($inName, $inValues = array(),
                      $inUnhandled = "delete") {
    $theTemplateFile = $_SERVER['DOCUMENT_ROOT'] . '/templates/' . $inName;
    $theTemplate = '';   // stays empty if the template file cannot be opened
    if ($theFile = fopen($theTemplateFile, 'r')) {
        $theTemplate = fread($theFile, filesize($theTemplateFile));
        fclose($theFile);
    }

    foreach ($inValues as $theKey => $theValue) {
        // look for and replace the key everywhere it occurs in the template
        $theTemplate = str_replace('{' . $theKey . '}', $theValue,
                                   $theTemplate);
    }

    // eregi_replace() was removed in PHP 7, so the PCRE functions are used here
    if ('delete' == $inUnhandled) {
        // remove remaining keys
        $theTemplate = preg_replace('/\{[^ }]*\}/', '', $theTemplate);
    } elseif ('comment' == $inUnhandled) {
        // comment remaining keys
        $theTemplate = preg_replace('/\{([^ }]*)\}/', '<!-- $1 undefined -->',
                                    $theTemplate);
    }

    return $theTemplate;
}

Clearly, this example of a templating system is somewhat contrived. But if you think of a large PHP application that displays hundreds of news articles, you can imagine how a templating system that used markers such as {HEADLINE}, {BYLINE}, and {ARTICLE} might be useful, as it would allow designers to create the layout for article pages without needing to worry about the actual content.
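To give a feel for the article case, here is a small sketch that fills {HEADLINE}, {BYLINE}, and {ARTICLE} markers in a template held in a string. The helper name fill_markers( ) and the sample template are assumptions invented for this illustration; the sketch uses strtr( ) rather than the FillTemplate( ) function above, and it escapes every value, which may not be what you want if the article body is already HTML.

```php
<?php
// Hypothetical helper: replace {MARKER} placeholders in an in-memory template.
function fill_markers($template, array $values)
{
    $pairs = array();
    foreach ($values as $key => $value) {
        // Escape values so that content cannot inject markup into the layout.
        $pairs['{' . $key . '}'] = htmlspecialchars($value, ENT_QUOTES);
    }
    // strtr() substitutes all markers in a single pass over the template.
    return strtr($template, $pairs);
}

$template = "<h1>{HEADLINE}</h1>\n"
          . "<p class=\"byline\">{BYLINE}</p>\n"
          . "<div>{ARTICLE}</div>\n";

echo fill_markers($template, array(
    'HEADLINE' => 'Templates & You',
    'BYLINE'   => 'A. Writer',
    'ARTICLE'  => 'Markers keep layout and content apart.',
));
```

Markers with no matching value are simply left in place by this version.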

While templates may reduce the amount of PHP code that designers have to see, there is a performance trade-off, as every request incurs the cost of building a page from the template. Performing pattern matches on every outgoing page can really slow down a popular site. Andrei Zmievski's Smarty is an efficient templating system that neatly side-steps this performance problem. Smarty turns the template into straight PHP code and caches it. Instead of doing the template replacement on every request, it does it only whenever the template file is changed. See http://www.phpinsider.com/php/code/Smarty/ for more information.

Handling Output

PHP is all about displaying output in the web browser. As such, there are a few different techniques that you can use to handle output more efficiently or conveniently.

Output Buffering

By default, PHP sends the results of echo and similar commands to the browser after each command is executed. Alternatively, you can use PHP's output buffering functions to gather the information that would normally be sent to the browser into a buffer and send it later (or kill it entirely). This allows you to specify the content length of your output after it is generated, capture the output of a function, or discard the output of a built-in function.

You turn on output buffering with the ob_start( ) function:

ob_start([callback]);

The optional callback parameter is the name of a function that post-processes the output. If specified, this function is passed the collected output when the buffer is flushed, and it should return a string of output to send to the browser. You can use this, for instance, to turn all occurrences of http://www.yoursite.com/ to http://www.mysite.com/.

While output buffering is enabled, all output is stored in an internal buffer. To get the current length and contents of the buffer, use ob_get_length( ) and ob_get_contents( ):

$len = ob_get_length(  );
$contents = ob_get_contents(  );

If buffering isn't enabled, these functions return false.

There are two ways to throw away the data in the buffer. The ob_clean( ) function erases the output buffer but does not turn off buffering for subsequent output. The ob_end_clean( ) function erases the output buffer and ends output buffering.

There are three functions involved in sending the collected output to the browser (this action is known as flushing the buffer). The ob_flush( ) function sends the output data to the web server and clears the buffer, but doesn't terminate output buffering. The ob_end_flush( ) function sends the output data to the web server and ends output buffering. The flush( ) function does not touch the buffer created by ob_start( ); it flushes PHP's own write buffers and tries to make the web server send the data to the browser immediately, so call ob_flush( ) first and then flush( ) when you want buffered output pushed out right away. If you specified a callback with ob_start( ), that function is called by ob_flush( ) and ob_end_flush( ) to decide exactly what gets sent to the server.

If your script ends with output buffering still enabled (that is, if you haven't called ob_end_flush( ) or ob_end_clean( )), PHP calls ob_end_flush( ) for you.
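Capturing the output of a function, one of the uses mentioned earlier, can be sketched as follows. The helper name capture_output( ) is an assumption for this example; it relies on ob_get_clean( ), which returns the buffer contents and ends that buffer in one call.

```php
<?php
// Hypothetical helper: run a callable and return whatever it printed.
function capture_output($callable)
{
    ob_start();                 // begin a new buffer
    call_user_func($callable);  // anything echoed now lands in the buffer
    return ob_get_clean();      // fetch the contents and end this buffer
}

$text = capture_output(function () {
    echo "Hello, ";
    print "buffer!";
});

// Nothing has reached the browser yet; the text can be altered first.
echo strtoupper($text), "\n";
```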

The following code collects the output of the phpinfo( ) function and uses it to determine whether you have the PDF module installed:

ob_start(  );
phpinfo(  );
$phpinfo = ob_get_contents(  );
ob_end_clean(  );

if (strpos($phpinfo, "module_pdf") === FALSE) {
echo "You do not have PDF support in your PHP, sorry.";
} else {
echo "Congratulations, you have PDF support!";
}

Of course, a quicker and simpler approach to check if a certain extension is available is to pick a function that you know the extension provides and check if it exists. For the PDF extension, you might do:

if (function_exists('pdf_begin_page')) {
    // PDF support is available
}

To change all references in a document from http://www.yoursite.com/ to http://www.mysite.com/, simply wrap the page like this:

<?php // at the very start of the file
ob_start(  );
?>

Visit <A HREF="http://www.yoursite.com/foo/bar">our site</A> now!

<?php
$contents = ob_get_contents(  );
ob_end_clean(  );
echo str_replace('http://www.yoursite.com/', 'http://www.mysite.com/',
$contents);
?>

Visit <A HREF="http://www.mysite.com/foo/bar">our site</A> now!

Another way to do this is with a callback. Here, the rewrite( ) callback changes the text of the page:

<?php // at the very start of the file
function rewrite ($text) {
    // the buffered page arrives in $text; return the rewritten version
    return str_replace('http://www.yoursite.com/', 'http://www.mysite.com/',
                       $text);
}
ob_start('rewrite');
?>

Visit <A HREF="http://www.yoursite.com/foo/bar">our site</A> now!

Visit <A HREF="http://www.mysite.com/foo/bar">our site</A> now!

Compressing Output

Recent browsers support compressing the text of web pages; the server sends compressed text and the browser decompresses it. To automatically compress your web page, wrap it like this:

<?php
ob_start('ob_gzhandler');
?>

The built-in ob_gzhandler( ) function is designed to be used as a callback with ob_start( ). It compresses the buffered page according to the Accept-Encoding header sent by the browser. Possible compression techniques are gzip, deflate, or none.

It rarely makes sense to compress short pages, as the time for compression and decompression exceeds the time it would take to simply send the uncompressed text. It does make sense to compress large (greater than 5 KB) web pages, though.

Instead of adding the ob_start( ) call to the top of every page, you can set the output_handler option in your php.ini file to a callback to be made on every page. For compression, this is ob_gzhandler.

Error Handling

Error handling is an important part of any real-world application. PHP provides a number of mechanisms that you can use to handle errors, both during the development process and once your application is in a production environment.

Error Reporting

Normally, when an error occurs in a PHP script, the error message is inserted into the script's output. If the error is fatal, the script execution stops.

There are three levels of conditions: notices, warnings, and errors. A notice is a condition encountered while executing a script that could be an error but could also be encountered during normal execution (e.g., trying to access a variable that has not been set). A warning indicates a nonfatal error condition; typically, warnings are displayed when calling a function with invalid arguments. Scripts will continue executing after issuing a warning. An error indicates a fatal condition from which the script cannot recover. A parse error is a specific kind of error that occurs when a script is syntactically incorrect. All errors except parse errors are runtime errors.

By default, all conditions except runtime notices are caught and displayed to the user. You can change this behavior globally in your php.ini file with the error_reporting option. You can also locally change the error-reporting behavior in a script using the error_reporting( ) function.

With both the error_reporting option and the error_reporting( ) function, you specify the conditions that are caught and displayed by using the various bitwise operators to combine different constant values, as listed in Table-1. For example, this indicates all error-level options:

(E_ERROR | E_PARSE | E_CORE_ERROR | E_COMPILE_ERROR | E_USER_ERROR)

while this indicates all options except runtime notices:

(E_ALL & ~E_NOTICE)
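The same constants work with the error_reporting( ) function at runtime. The following is a small sketch of setting a level, testing individual bits, and restoring the earlier setting; the variable names are arbitrary.

```php
<?php
// Report everything except runtime notices, remembering the old level.
$previous = error_reporting(E_ALL & ~E_NOTICE);

$level = error_reporting();   // with no argument, returns the current level

// Test individual bits with the bitwise AND operator.
$noticesOn  = ($level & E_NOTICE)  !== 0;
$warningsOn = ($level & E_WARNING) !== 0;

var_dump($noticesOn, $warningsOn);

error_reporting($previous);   // restore the original setting
```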

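If you set the track_errors option on in your php.ini file, a description of the most recent error is stored in $php_errormsg. This option was deprecated in PHP 7.2 and removed in PHP 8.0; the error_get_last( ) function returns the most recent error as an array instead.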

Table-1. Error-reporting values

Value                Meaning
E_ERROR              Runtime errors
E_WARNING            Runtime warnings
E_PARSE              Compile-time parse errors
E_NOTICE             Runtime notices
E_CORE_ERROR         Errors generated internally by PHP
E_CORE_WARNING       Warnings generated internally by PHP
E_COMPILE_ERROR      Errors generated internally by the Zend scripting engine
E_COMPILE_WARNING    Warnings generated internally by the Zend scripting engine
E_USER_ERROR         Runtime errors generated by a call to trigger_error( )
E_USER_WARNING       Runtime warnings generated by a call to trigger_error( )
E_USER_NOTICE        Runtime notices generated by a call to trigger_error( )
E_ALL                All of the above options

Error Suppression

You can disable error messages for a single expression by putting the error suppression operator @ before the expression. For example:

$value = @(2 / 0);

Without the error suppression operator, the expression would normally emit a "Division by zero" warning in PHP 7 and earlier. As shown here, the warning is silenced. In PHP 8, division by zero throws a DivisionByZeroError exception instead, which @ does not suppress. The error suppression operator cannot trap parse errors, only the various types of runtime errors.

To turn off error reporting entirely, use:

error_reporting(0);

This ensures that, regardless of the errors encountered while processing and executing your script, no errors will be sent to the client (except parse errors, which cannot be suppressed). Of course, it doesn't stop those errors from occurring. Better options for controlling which error messages are displayed in the client are shown below.

Triggering Errors

You can throw an error from within a script with the trigger_error( ) function:

trigger_error(message [, type]);

The first parameter is the error message; the second, optional, parameter is the condition level, which is either E_USER_ERROR, E_USER_WARNING, or E_USER_NOTICE (the default).

Triggering errors is useful when writing your own functions for checking the sanity of parameters. For example, here's a function that divides one number by another and throws an error if the second parameter is zero:

function divider($a, $b) {
if($b == 0) {
trigger_error('$b cannot be 0', E_USER_ERROR);
}

return($a / $b);
}

echo divider(200, 3);
echo divider(10, 0);

66.666666666667

Fatal error: $b cannot be 0 in page.php on line 5

Defining Error Handlers

If you want better error control than just hiding any errors (and you usually do), you can supply PHP with an error handler. The error handler is called when a condition of any kind is encountered and can do anything you want it to, from logging to a file to pretty-printing the error message. The basic process is to create an error-handling function and register it with set_error_handler( ).

The function you declare can take either two or five parameters. The first two parameters are the error code and a string describing the error. The final three parameters, if your function accepts them, are the filename in which the error occurred, the line number at which the error occurred, and a copy of the active symbol table at the time the error happened. PHP 8 no longer passes the symbol table, so give that fifth parameter a default value or omit it if your code must run there. Your error handler should check the current level of errors being reported with error_reporting( ) and act appropriately.

The call to set_error_handler( ) returns the current error handler. You can restore the previous error handler either by calling set_error_handler( ) with the returned value when your script is done with its own error handler, or by calling the restore_error_handler( ) function.
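Here is a minimal sketch of a handler that checks the current reporting level before acting, then hands control back to the previous handler. The function name collect_errors( ) and the global $seen array are assumptions made up for this example.

```php
<?php
$seen = array();   // messages collected by the handler (illustrative only)

function collect_errors($errno, $errstr, $errfile = '', $errline = 0)
{
    global $seen;

    // Ignore conditions that the current error_reporting() level masks out.
    if (!(error_reporting() & $errno)) {
        return false;   // false lets PHP's standard handling take over
    }

    $seen[] = "[$errno] $errstr";
    return true;        // true tells PHP the condition has been dealt with
}

$previous = set_error_handler('collect_errors');

trigger_error('disk is nearly full', E_USER_WARNING);

restore_error_handler();   // back to whatever handler was active before

print_r($seen);
```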

The following code shows how to use an error handler to format and print errors:

function display_error($error, $error_string, $filename, $line, $symbols = null) {
    echo "<p>The error '<b>$error_string</b>' occurred in the file '<i>$filename</i>'
on line $line.</p>";
}

set_error_handler('display_error');
// "Division by zero" warning in PHP 7 and earlier;
// PHP 8 throws DivisionByZeroError, which bypasses the handler
$value = 4 / 0;

<p>The error '<b>Division by zero</b>' occurred in the file
'<i>err-2.php</i>' on line 8.</p>

Logging in error handlers

PHP provides a built-in function, error_log( ), to log errors to the myriad places where administrators like to put error logs:

error_log(message, type [, destination [, extra_headers ]]);

The first parameter is the error message. The second parameter specifies where the error is logged: a value of 0 logs the error via PHP's standard error-logging mechanism; a value of 1 emails the error to the destination address, optionally adding any extra_headers to the message; a value of 3 appends the error to the destination file.

To save an error using PHP's logging mechanism, call error_log( ) with a type of 0. By changing the value of error_log in your php.ini file, you can change which file to log into. If you set error_log to syslog, the system logger is used instead. For example:

error_log('A connection to the database could not be opened.', 0);

To send an error via email, call error_log( ) with a type of 1. The third parameter is the email address to which to send the error message, and an optional fourth parameter can be used to specify additional email headers. Here's how to send an error message by email:

error_log('A connection to the database could not be opened.', 1, 'errors@php.net');

Finally, to log to a file, call error_log( ) with a type of 3. The third parameter specifies the name of the file to log into:

error_log('A connection to the database could not be opened.', 3, '/var/log/php_errors.log');

Example-5 shows an example of an error handler that writes logs into a file and rotates the log file when it gets above 1 KB.

Example-5. Log-rolling error handler

function log_roller($error, $error_string) {
    $file = '/var/log/php_errors.log';

    // check that the log exists first, so filesize() cannot raise a warning
    // from inside the error handler
    if (file_exists($file) && filesize($file) > 1024) {
        rename($file, $file . (string) time());
        clearstatcache();   // forget the cached size of the renamed file
    }

    error_log($error_string, 3, $file);
}

set_error_handler('log_roller');
for ($i = 0; $i < 5000; $i++) {
    trigger_error(time() . ": Just an error, ma'am.\n");
}
restore_error_handler();

Generally, while you are working on a site, you will want errors shown directly in the pages in which they occur. However, once the site goes live, it doesn't make much sense to show internal error messages to visitors. A common approach is to use something like this in your php.ini file once your site goes live:

display_errors = Off
log_errors = On
error_log = /tmp/errors.log

This tells PHP to never show any errors, but instead to log them to the location specified by the error_log directive.

Output buffering in error handlers

Using a combination of output buffering and an error handler, you can send different content to the user, depending on whether various error conditions occur. For example, if a script needs to connect to a database, you can suppress output of the page until the script successfully connects to the database.

Example-6 shows the use of output buffering to delay output of a page until it has been generated successfully.

Example-6. Output buffering to handle errors

<html>
<head><title>Results!</title></head>
<body>
<?php
function handle_errors ($error, $message, $filename, $line) {
ob_end_clean(  );
echo "<b>$message</b> in line $line of <i>$filename</i></body></html>";
exit;
}
set_error_handler('handle_errors');
ob_start(  );
?>

<h1>Results!</h1>

Here are the results of your search:<p />
<table border=1>
<?php
require_once('DB.php');
$db = DB::connect('mysql://gnat:waldus@localhost/webdb');
if (DB::iserror($db)) die($db->getMessage(  ));
// ...
?>
</table>
</body>
</html>

In Example-6, after we start the <body> element, we register the error handler and begin output buffering. If we cannot connect to the database (or if anything else goes wrong in the subsequent PHP code), the heading and table are not displayed. Instead, the user sees only the error message. If no errors are raised by the PHP code, however, the user simply sees the HTML page.

Performance Tuning

Before thinking much about performance tuning, get your code working. Once you have working code, you can then locate the slow bits. If you try to optimize your code while writing it, you'll discover that optimized code tends to be more difficult to read and to take more time to write. If you spend that time on a section of code that isn't actually causing a problem, that's time that was wasted, especially when it comes time to maintain that code, and you can no longer read it.

Once you get your code working, you may find that it needs some optimization. Optimizing code tends to fall within one of two areas: shortening execution times and lessening memory requirements.

Before you begin optimization, ask yourself whether you need to optimize at all. Too many programmers have wasted hours wondering whether a complex series of string function calls is faster or slower than a single Perl regular expression, when the page that this code is in is viewed once every five minutes. Optimization is necessary only when a page takes so long to load that the user perceives it as slow. Often this is a symptom of a very popular site: if requests for a page come in fast enough, the time it takes to generate that page can mean the difference between prompt delivery and server overload.

Profiling

PHP does not have a built-in profiler, but there are some techniques you can use to investigate code that you think has performance issues. One technique is to call the microtime( ) function to get an accurate representation of the amount of time that elapses. You can surround the code you're profiling with calls to microtime( ) and use the values returned by microtime( ) to calculate how long the code took.

For instance, here's some code you can use to find out just how long it takes to produce the phpinfo( ) output:

<?php
ob_start();
$start = microtime(true);   // true returns the time as a float, in seconds
phpinfo();
$end = microtime(true);
ob_end_clean();

echo "phpinfo() took " . ($end - $start) . " seconds to run.\n";
?>

Reload this page several times, and you'll see the number fluctuate slightly. Reload it often enough, and you'll see it fluctuate quite a lot. The danger of timing a single run of a piece of code is that you may not get a representative machine load—the server might be paging as a user starts emacs, or it may have removed the source file from its cache. The best way to get an accurate representation of the time it takes to do something is to time repeated runs and look at the average of those times.
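One way to time repeated runs by hand is sketched below. The helper name average_runtime( ) and the choice of five runs are assumptions for illustration; usleep( ) merely stands in for the code you actually want to measure.

```php
<?php
// Hypothetical helper: run $code several times and return the mean duration.
function average_runtime($code, $runs = 10)
{
    $total = 0.0;
    for ($i = 0; $i < $runs; $i++) {
        $start = microtime(true);            // float timestamp in seconds
        call_user_func($code);
        $total += microtime(true) - $start;  // add this run's elapsed time
    }
    return $total / $runs;
}

$average = average_runtime(function () {
    usleep(1000);   // stand-in workload: sleep for about one millisecond
}, 5);

printf("average: %.6f seconds\n", $average);
```

Averaging smooths out one-off spikes, although a consistently loaded server will still skew every run.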

The Benchmark class available in PEAR makes it easy to repeatedly time sections of your script. Here is a simple example that shows how you can use it:

<?php
require_once 'Benchmark/Timer.php';

$timer = new Benchmark_Timer;

$timer->start(  );
sleep(1);
$timer->setMarker('Marker 1');
sleep(2);
$timer->stop(  );

$profiling = $timer->getProfiling(  );

foreach($profiling as $time) {
echo $time['name'] . ': ' .  $time['diff'] . "<br>\n";
}
echo 'Total: ' . $time['total'] . "<br>\n";
?>

The output from this program is:

Start: -
Marker 1: 1.0006979703903
Stop: 2.0100029706955
Total: 3.0107009410858

That is, it took 1.0006979703903 seconds to get to marker 1, which is set right after our sleep(1) call, so it is what you would expect. It took just over 2 seconds to get from marker 1 to the end, and the entire script took just over 3 seconds to run. You can add as many markers as you like and thereby time various parts of your script.

Optimizing Execution Time

Here are some tips for shortening the execution times of your scripts:

*      Avoid printf( ) when echo is all you need.

*      Avoid recomputing values inside a loop, as PHP's parser does not remove loop invariants. For example, don't do this if the size of $array doesn't change:

       for ($i=0; $i < count($array); $i++) { /* do something */ }

       Instead, do this:

       $num = count($array);
       for ($i=0; $i < $num; $i++) { /* do something */ }

*      Include only files that you need. Split included files to include only functions that you are sure will be used together. Although the code may be a bit more difficult to maintain, parsing code you don't use is expensive.

*      If you are using a database, use persistent database connections, because setting up and tearing down database connections can be slow.

*      Don't use a regular expression when a simple string-manipulation function will do the job. For example, to turn one character into another in a string, use str_replace( ), not preg_replace( ).

Optimizing Memory Requirements

Here are some techniques for reducing the memory requirements of your scripts:

*      Use numbers instead of strings whenever possible:

       for ($i="0"; $i < "10"; $i++)      // bad
       for ($i=0; $i < 10; $i++)          // good

*      When you're done with a large string, set the variable holding the string to an empty string. This frees the memory to be reused.

*      Only include or require files that you need. Use include_once and require_once instead of include and require.

*      If you are using MySQL and have large result sets, consider using the MySQL-specific database extension, so you can use mysql_unbuffered_query( ). This function doesn't load the whole result set into memory at once; instead, it fetches it row by row, as needed. (The original mysql extension was removed in PHP 7; with mysqli, passing MYSQLI_USE_RESULT to mysqli_query( ) gives the same unbuffered behavior.)
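To see the effect of releasing a large string, you can compare readings from memory_get_usage( ) before and after. This is only a sketch: the 5 MB size is arbitrary, and the exact numbers depend on your PHP version and memory manager.

```php
<?php
$before = memory_get_usage();

// Build a large string (about 5 MB) to stand in for a big document.
$big = str_repeat('x', 5 * 1024 * 1024);
$during = memory_get_usage();

// Done with the string: replace it with an empty one to release the memory.
$big = '';
$after = memory_get_usage();

printf("before: %d bytes\nduring: %d bytes\nafter:  %d bytes\n",
       $before, $during, $after);
```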

You can reach the author at tarun.aror1@gmail.com.







