Writing Clean and Efficient PHP Code

If you’ve ever had to go back to an application you wrote after an extended period of time, you already know the value of clean, well-documented, and efficient code. But how can you make your code better? Here are some tips that’ll help you speed up and clean up your development cycle.

Introduction

Articles preaching the virtues of writing clean code are abundant, with good reason: some things bear repeating. As with any subjective topic, there is a wide range of opinions as to what exactly defines “clean code”. At the end of the day, there are two criteria for clean code:

  1. Efficiency: Does the code run as quickly and efficiently as possible? Does the code make the most of its objects and variables with maximum reuse and minimal waste?

  2. Maintainability: Is the code easy to understand for other developers? Is it well planned, logical, properly documented, and easy to update?

This article will discuss the various elements comprising these two broad points regarding clean code, and provide examples in everyone’s favorite open source language: PHP. The target audience here is the beginner PHP programmer, although programmers at other levels may find some useful information within.

Loops & Other Conditionals

One way or another, all source code is eventually translated into instructions a machine can execute, and those instructions occupy some finite amount of memory. Some platforms, such as Java and the .NET Framework, take an extra step for the sake of portability, compiling first to an intermediate form (Java bytecode, or .NET’s Intermediate Language) before machine code is produced. PHP likewise compiles each script to opcodes that the Zend Engine executes. Although it is not always a point of concern, the size of the compiled code can vary widely based on the source a developer writes. In PHP, there are a few key ways to minimize the memory and compiled-code overhead generated by your application. In a small personal site this may not matter, but in a performance-critical situation, the savings here can be just as valuable as the savings earned by writing efficient database queries. The best place to start the discussion is with loops and conditional statements.

Case Statements – In PHP, a case statement is written with switch. A switch can compile to less code than an if/else chain that completes the same task (the figure of roughly one fifth is a rough estimate, not a measured guarantee), so prefer it where it fits. You can use a switch when you expect a variable to contain one value chosen from a set of known values. A switch also lets you provide a default branch for unknown values, which makes it usable in the majority of places where you might otherwise write an if/else chain. Keep in mind that switch compares with loose equality (==), and that the difference in memory overhead between the two constructs is small; the switch is simply more efficient to some small degree.
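As a minimal sketch of the pattern described above, the following maps a value from a known set to a label and falls back to a default branch. The function name statusLabel() and the status values are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper: maps an order status code to a display label.
function statusLabel(string $status): string
{
    switch ($status) {
        case 'new':
            return 'Awaiting payment';
        case 'paid':
            return 'Ready to ship';
        case 'shipped':
            return 'On its way';
        default:
            // The default branch catches any value outside the known set.
            return 'Unknown status';
    }
}

echo statusLabel('paid'), "\n";      // a known value
echo statusLabel('refunded'), "\n";  // falls through to the default branch
```

Because switch uses loose comparison, this sketch declares the parameter as a string so that every case compares like with like.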

Loops – Loops vary more drastically than conditionals due to the nature of the operations they perform. PHP offers a number of loops (for, while, do-while), along with the higher-level foreach construct for arrays and objects; the older each() function was deprecated in PHP 7.2 and removed in PHP 8.0. When deciding what type of loop to use, consider the operations taking place. Let’s look at the following example, where we have a query that will return roughly 1,500 results.

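// NOTE: the mysql_* functions belong to the original ext/mysql extension,
// which was removed in PHP 7.0; they are kept here to match the discussion.
$r = mysql_query('SELECT * FROM someTable LIMIT 1500');

// OPTION 1: read each cell from the buffered result by row index and column
for ($i = 0; $i < mysql_num_rows($r); $i++) {
    print mysql_result($r, $i, 'ColA') . mysql_result($r, $i, 'ColB');
}

// OPTION 2: fetch each row as an associative array until no rows remain
while (($row = mysql_fetch_assoc($r)) !== false) {
    print $row['ColA'] . $row['ColB'];
}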

The first option requires only one scalar variable as a loop counter and uses a lightweight access method to read buffered data from a result handle. The second option calls a function that builds an array on each iteration, tests its result, and then references the array twice per iteration. Despite what you may guess, Option 2 runs significantly faster, with lower memory and compiled-code overhead. Part of the reason is that Option 1 calls mysql_num_rows() on every pass and mysql_result() twice per row, while Option 2 makes a single function call per row.

The examples are admittedly poor in that they show database access handled in-line with iteration, with output mixed in. Ignore that for now. The building blocks are what contribute to the overall overhead of the application, and understanding them will help you make the right choices. You may find it easier to use an associative array in iterations than a result handle cursor, but in a proper app you would perform this operation at the object level, and output would be handled elsewhere, driven by object-oriented iteration over the values that were loaded into an object from the query.
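The separation mentioned above, with loading in one place and output in another, might look something like the following sketch. It assumes the rows have already been fetched; a plain array stands in for the query result, and the class RowCollection and function renderRows() are hypothetical names.

```php
<?php
// Stand-in for rows fetched from the database (hypothetical data).
$rows = [
    ['ColA' => 'alpha', 'ColB' => 'one'],
    ['ColA' => 'beta',  'ColB' => 'two'],
];

// Hypothetical class: holds the loaded rows and knows nothing about output.
class RowCollection implements IteratorAggregate
{
    private array $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    // Lets callers iterate the object directly with foreach.
    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->rows);
    }
}

// Output lives in its own function and simply iterates over the object.
function renderRows(RowCollection $collection): string
{
    $out = '';
    foreach ($collection as $row) {
        $out .= $row['ColA'] . $row['ColB'] . "\n";
    }
    return $out;
}

echo renderRows(new RowCollection($rows));
```

The rendering function can now be changed or replaced without touching the code that loads the data.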

Objects

Objects can cripple your page load time in PHP. PHP 4 copied an object every time it was assigned or passed to a function unless you explicitly used a reference (&). PHP 5 introduced “real” objects, in which a variable holds a handle to the object rather than the object itself, with the promise of faster object handling. This is a good thing, as there is little room for a non-object-oriented application in the real world. When you are assembling objects in code, the natural thing to do is to simply copy or re-create objects as needed. For example, say every object in your application requires a database object to populate itself and save its state. Since PHP acts as middleware and a single PHP page rarely needs concurrent database access, it makes sense to have your objects share a single database object. This point spans a vast array of situations where objects need to be shared and passed around. C++ programmers understand this concept well, because C++ has no built-in garbage collection and object lifetimes must be managed by hand. Fortunately, PHP does not share this burden, but poor memory use within an application page can still result in slow load times and an overloaded server if there are a lot of concurrent users and a lot of objects being copied haphazardly.

The point of this section is this: carefully consider the way in which you use objects in your PHP code. It is perfectly fine to have PHP that is completely object oriented and has nothing more than the equivalent of a main() in a C++ app, but you must consider how your objects will be used, how large the contents of those objects will be, and how they will share those contents and methods.
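One way to share a single database object, sketched under the assumption of PHP 7.4 or later, is to create it once and pass it into each object that needs it. The Database and Customer classes here are hypothetical stand-ins; Database opens no real connection and only counts how many times it was constructed.

```php
<?php
// Hypothetical stand-in for a real connection wrapper; it only counts
// how many instances have been created so the sharing is visible.
class Database
{
    public static int $connectionsOpened = 0;

    public function __construct()
    {
        self::$connectionsOpened++;
    }
}

// Hypothetical domain object that receives its database object
// instead of creating one of its own.
class Customer
{
    private Database $db;

    public function __construct(Database $db)
    {
        $this->db = $db;
    }

    public function database(): Database
    {
        return $this->db;
    }
}

$db = new Database();        // created once per page
$alice = new Customer($db);  // both objects hold the same handle,
$bob = new Customer($db);    // not separate copies

var_dump($alice->database() === $bob->database()); // bool(true)
var_dump(Database::$connectionsOpened);            // int(1)
```

Passing the object in also makes it easy to substitute a different implementation later, which ties in with the reusability points made elsewhere in this article.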

Best Practices

This section could be an entire book on its own, and often has been. It is subject to much debate and a variety of standards. As with any language, some best practices apply to PHP specifically. An example that truly stands out is using single quotes (') versus double quotes ("). The PHP parser treats these two delimiters very differently. PHP treats anything inside a matching pair of single quotes as a string literal and does not parse its contents (apart from the escapes \' and \\). However, PHP treats anything inside a matching pair of double quotes as string data that may contain escape sequences and references to variables, array elements, or object properties, and will parse the contents of the string. This can result in performance disparities in a page with extensive data output. I ran a quick test printing 1,000 lines of output using the following two methods:

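// OPTION 1: double quotes, variables interpolated by the parser
print "The value of variable A is $a. The value of variable B is $b. The value of variable C is $c.\n";

// OPTION 2: single quotes, values joined with the concatenation operator
// (a single-quoted string does not expand \n, so the line break is literal)
print 'The value of variable A is ' . $a . '. The value of variable B is ' . $b . '. The value of variable C is ' . $c . '.
';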

Option 2 ran roughly thirty percent faster on my server. Results will vary by server, PHP version, and page size, so measure on your own setup before relying on this. The big difference between these two examples is that in Option 1, PHP is required to fully parse the contents of the string and substitute the value of each variable it finds. Option 2 is the equivalent of concatenating seven values (four string literals and three variables) and printing them out, which requires no parsing of string contents, just some in-memory operations and a write to the output stream.
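If you want to repeat a comparison like this yourself, a rough timing harness along the following lines may help. It is only a sketch: the helper timeIt() is hypothetical, the strings are built rather than printed so that output speed does not skew the numbers, and a loop of 1,000 iterations is too short to give stable figures on a fast machine.

```php
<?php
// Rough timing sketch: builds the same line 1,000 times each way.
$a = 1;
$b = 2;
$c = 3;
$iterations = 1000;

// Hypothetical helper: calls $build $iterations times and returns the
// elapsed seconds together with the last string that was built.
function timeIt(callable $build, int $iterations): array
{
    $start = microtime(true);
    $last = '';
    for ($i = 0; $i < $iterations; $i++) {
        $last = $build();
    }
    return [microtime(true) - $start, $last];
}

// Double-quoted string with interpolated variables.
[$interpolatedTime, $interpolated] = timeIt(function () use ($a, $b, $c) {
    return "The value of variable A is $a. The value of variable B is $b. The value of variable C is $c.\n";
}, $iterations);

// Single-quoted literals joined by concatenation.
[$concatenatedTime, $concatenated] = timeIt(function () use ($a, $b, $c) {
    return 'The value of variable A is ' . $a . '. The value of variable B is ' . $b
        . '. The value of variable C is ' . $c . ".\n";
}, $iterations);

printf("interpolated: %.6f s\n", $interpolatedTime);
printf("concatenated: %.6f s\n", $concatenatedTime);
```

Run it several times and raise the iteration count before drawing conclusions; the two timings can easily swap places from one run to the next.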

Other “best practices” that are more thoroughly documented include:

  • Using a consistent and logical naming convention for objects, functions, classes, etc.
  • Commenting code line by line (or as near to line by line as is needed to clearly explain what operations are being performed).
  • Indenting code to reflect the beginning and ending of statements (particularly statements enclosed in curly braces).
  • Organizing files in a logical directory structure with easy to understand directory and file names.
  • Making code as modular and reusable as possible.
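As a small illustration of several of these points together (a descriptive name, a documenting comment, consistent indentation, and a routine that is easy to reuse), consider the sketch below. The function calculateGrossPrice() and its parameters are hypothetical.

```php
<?php
/**
 * Calculates the gross price for a net amount.
 *
 * @param float $netPrice Price before tax.
 * @param float $taxRate  Tax rate as a fraction (0.2 means 20%).
 * @return float Price including tax, rounded to two decimals.
 */
function calculateGrossPrice(float $netPrice, float $taxRate): float
{
    // Reject nonsense input early so callers fail loudly.
    if ($netPrice < 0 || $taxRate < 0) {
        throw new InvalidArgumentException('Price and tax rate must not be negative.');
    }

    return round($netPrice * (1 + $taxRate), 2);
}

echo calculateGrossPrice(100.0, 0.2), "\n";
```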

These things may seem obvious to seasoned programmers, but beginners often overlook them in haste. All of these best practices contribute to code reusability, readability, and maintainability, which is the topic of the next section.

Code Reusability & Maintainability

This is arguably the most critical point of the article. So maybe a project was a one-off and you’ll never use the code again. Maybe you will never have to go back and add support for another vendor’s database product. But guess what: someone else will have to. And, in all likelihood, at some time or another in every programmer’s career, they will be that “someone”. Assembly language programmers can easily relate to this, probably more so than any other developers. Code that is difficult or impossible to understand and extend is often referred to as “write-only” code. This is less common in higher-level languages but can still exist. “Write-only” means that the code is so convoluted or complex that, without proper comments and documentation, it is impossible to modify or extend in any way. That cool algorithm you wrote with one-letter variable names and excessive use of by-reference parameter passing may be a great achievement to you as a developer, but six months later, when you need to do the same routine plus something extra and you pull up the old code and realize it’s neither documented nor maintainable, you get to go back to the drawing board. That cool algorithm just became a pain in the neck.

So, come to terms with it now: you need to document your code. Not only do you need to document your code, but you need to follow a lot of other guidelines to make sure your code is as reusable and maintainable as possible (many of which are discussed throughout this article).

We’ve talked about maintainability; now let’s touch on reusability. Routines are often written as band-aids for a change in application design after development. These band-aid routines tend to be hard-coded for a particular task and are often far more complex than the application routines that were in place before, because they are solutions applied to a problem that is outside of their scope. That is, they are applied in a situation where a higher-level design change was required. These are the most notoriously difficult routines to maintain. Your day-to-day routines can also be excessively annoying to reuse, because they were written to perform one particular task in one particular way. The key here is abstraction. If you are writing code for one specific problem, try to analyze the problem at the interface level: “How does this routine talk to the application?” It isn’t so much about “What does this routine do?”, because you can write a routine capable of just about anything. The key to reuse is to find the level of abstraction at which the routine becomes reusable. You must weigh the cost of producing a highly reusable routine when you are patching code, because in the real world we don’t always have time to kick back and refactor an application to suit one or two new features that seem small to the end user and the client.
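To make the idea of finding the right level of abstraction concrete, here is a small sketch that contrasts a routine tied to one task with a more general one. Both function names and the sample data are hypothetical.

```php
<?php
// Narrow version: tied to one field and one separator.
function customerEmailsAsCsv(array $customers): string
{
    $emails = [];
    foreach ($customers as $customer) {
        $emails[] = $customer['email'];
    }
    return implode(',', $emails);
}

// More abstract version: the caller states which field and separator to use.
function joinField(array $rows, string $field, string $separator = ','): string
{
    return implode($separator, array_column($rows, $field));
}

$customers = [
    ['name' => 'Ann', 'email' => 'ann@example.com'],
    ['name' => 'Bob', 'email' => 'bob@example.com'],
];

echo customerEmailsAsCsv($customers), "\n";       // the one task it was written for
echo joinField($customers, 'email'), "\n";        // same result from the general routine
echo joinField($customers, 'name', ' / '), "\n";  // reused for a different field
```

The general version costs little extra here, but that will not always be the case, which is why the cost of abstraction has to be weighed each time.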

Conclusion

There are a number of things that contribute to the development of “clean code”. How clean your code is can be determined by measuring the efficiency and maintainability of the application code you (or your team) have written and assessing the weak areas of the code. The fewer weak areas you find, and the simpler the code is to understand and reuse, the better. Code that is obvious in function (through logically organized, well-documented routines and application flow) and easy to maintain (through portability and well-planned architecture) is clean code. Writing clean code simply requires that you take the time and effort to plan, document, and organize your projects. As you develop more applications, you will notice that planning and development take less time and that you are able to reuse more and more code in future projects.

I hope this article will be of use to beginning PHP programmers, and that it will help you grow as a developer!
