How to document the code in PHP?


Reflecting On Your Code

Everyone knows the value of keeping good documentation, but correspondingly everyone hates publishing new documentation as an application evolves.

Reflection is an OOP concept for accessing a script’s extended information, also known as metadata. Other languages such as C# offer many metadata tools: attributes, XML documentation, manifest files, code regions and the like. In PHP, however, there is only one little-known trick for including metadata in your application.

This trick relates to the way PHP code is parsed. To understand it, you first have to understand the phases of PHP execution under the hood. Explaining this in full is far beyond the scope of this article, but the basics are as follows.

First, a request is received by the web server and routed to the PHP module. Once the PHP module has received a request, it will find the requested file, and begin the parsing phase. This is all fairly complicated, but to put it simply, the code is run through a part of the Zend Engine called the Zend_Language_Parser.

The result of this phase is a series of objects describing every piece of code. They include the operation codes (opcodes) and, fortunately for us, one member of each such object is the doc_comment.

These objects are then used by PHP to begin compilation, execution and output back to the web server. All we need to know for our purposes is that the doc_comment is not discarded during this parsing operation, as one might expect.

So what defines a doc_comment? A doc_comment as PHP defines it is any comment prefixing a class, interface, method or function that begins with a /** and ends with a */.

For example:

Listing 1 listing-1.php
<?php
    /**
     * This is a doc_comment.
     */
    class someclass
    {
        // this is not a doc comment because
        // it doesn't start with /** and end
        // with */ and doesn't prefix a class,
        // interface, method or function
    }
?>
The above comment is now stored in PHP’s metadata, but it’s not very useful to us. It isn’t parsable, it doesn’t contain any useful information and it’s certainly not in valid phpdoc format.

A Crash Course In PhpDoc

For those of you not familiar with phpdoc, it is a method for documenting code that derives from Java’s JavaDoc format. Basically, it is a way to format comments so that they are machine readable, meaningful and easy for the programmer to write.

Some people are die-hard phpdoc users; others use derivatives that borrow the best from phpdoc and add some extra formatting. Changing an established technology is always controversial, but for the purposes of reflection a few changes arguably make it more useful. The main one is the addition of tab-delimited fields for common attributes, which will be explained later in this article.

I will not discuss the original phpdoc standard here, because it is quite large. If you are interested in the original way of doing this, you should visit http://phpdoc.org. From now on, when I refer to phpdoc I am referring purely to the derivative my studio uses, which is in no way associated with phpdoc.org or the de facto standard phpdoc.

Our first look at actual phpdoc comments

To begin, let’s look at the general format of a phpdoc comment.

Listing 2 listing-2.php
<?php
    /**
     * Classname Additionaltitle
     *
     * Summary
     *
     * @attribute   some    something       something else
     */
    class classname
    {
 
    }
?>
The first word of the comment’s first line mirrors the item’s name. It forms part of the comment’s title and is there for readability. In my experience, code blocks are occasionally copied and pasted without regard to the comment; the function is then modified and the comment never updated. Mirroring the name makes it easy to confirm that a comment belongs to the item it precedes, which helps keep documentation straight from project to project.

The rest of the first line is what you want the entire title to show up as when the information is output. For example, the additional title for a merchant class might be:

Merchant provides credit card processing services.

The next part is always a blank line, which marks the end of the title and the beginning of the summary block. The summary, unlike the title, should describe in brief how the class does its job. For a Merchant class, a summary of several lines is appropriate:

This class accepts multiple parameters via the constructor, depends on the global configuration manager for merchant account information and uses curl to communicate the credit card information to the processor. There are also various methods for checking the authenticity of the transaction and receiving processor status messages.

Next we have attributes. An attribute is a string that has special meaning to the parser. Attributes are always prefixed with @, and the data they refer to is tab delimited. There can be one or more tabs between the fields of an attribute, so you can make all the data align nicely. Spaces are not recognized as a delimiter; only the \t tab character is.

One such attribute is @var:

Listing 3 listing-3.php
<?php
    /**
     * Title
     *
     * Summary
     *
     * @var     type    $varname    Description
     * @var     array   $someVar    This array contains something
     */
?>
This data, unlike the summary, must all be on one line, and the tab delimiting is important. This is a deviation from the official phpdoc standard, which only provides the name and a description.

When working with variables that change type the value mixed should be used for the type field. When working with objects, the name of the class should be used for the type field.
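
To make the tab-delimited layout concrete, here is a minimal sketch of how a single attribute line could be split into its fields. The function name parseAttributeLine() and the keys of the array it returns are hypothetical, chosen only for illustration; they are not part of phpdoc or of the parser mentioned later in this article.

```php
<?php
    /**
     * Splits one attribute line into its name and tab-delimited fields.
     * Hypothetical helper, for illustration only.
     */
    function parseAttributeLine($sLine)
    {
        $sLine = trim($sLine);
        if ($sLine === '' || $sLine[0] !== '@') {
            return null; // not an attribute line
        }

        // One or more tabs count as a single delimiter; spaces do not.
        $aFields = preg_split('/\t+/', $sLine);
        $sName = substr(array_shift($aFields), 1); // drop the leading @

        return array('name' => $sName, 'fields' => $aFields);
    }

    $aAttribute = parseAttributeLine("@var\t\tarray\t\$someVar\tThis array contains something");
    // $aAttribute['name'] is "var"; $aAttribute['fields'] holds the type,
    // the variable name and the description, in that order.
```

Because only tabs are treated as delimiters, the description field can contain spaces without being split.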

Here is a list of attributes that we make use of, some deviating from the official standard. There are many more attributes, so this is just a partial list of the most commonly used ones.

Global attributes

These can be used on all classes, interfaces, methods and functions.

Listing 4 listing-4.php
<?php
    /**
     * ...
     *
     * @author      Firstname Lastname
     * @copyright   ACME Inc.
     * @license     GNU LGPL
     * @see         Something
     * @link        http://www.example.com      Description of site
     * @remarks     Something people using the class might want to know.
     */
?>
Class specific attributes

Listing 5 listing-5.php
<?php
    /**
     * ...
     *
     * @var         type        $varname        Description
     * @extends     SomeClass
     * @implements  SomeInterface, SomeOtherInterface
     */
?>
Interface and class attributes

Listing 6 listing-6.php
<?php
    /**
     * ...
     *
     * @depends     SomeClass
     */
?>
Method specific attributes

Listing 7 listing-7.php
<?php
    /**
     * ...
     *
     * @access      public|private|protected
     */
?>
Function and method attributes

Listing 8 listing-8.php
<?php
    /**
     * ...
     *
     * @param       type            $variableName       Description
     * @throws      someException   Description
     * @return(s)   type            Description
     */
?>

Extracting Code Comments With The Reflection API

Now you know how metadata is stored, and how to format it. To access metadata, we need to use the Reflection API.

The Reflection API is a set of classes, new in PHP 5, to which you pass the name of the item you wish to introspect. Each class then exposes a number of methods and properties containing the relevant information.

You can reflect functions with ReflectionFunction, and classes and interfaces with ReflectionClass.
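
As a rough illustration, the sketch below reads the doc comment of a function and of a method. The names greet, Greeter and wave are made up for the example, and ReflectionMethod, which is not covered in this article, is the built-in class for reflecting a single method.

```php
<?php
    /**
     * greet Builds a greeting.
     */
    function greet($sName)
    {
        return "Hello, " . $sName;
    }

    class Greeter
    {
        // An ordinary comment: not stored as a doc comment.
        public function wave()
        {
        }
    }

    $oFunctionReflect = new ReflectionFunction('greet');
    $sFunctionDoc = $oFunctionReflect->getDocComment();
    // $sFunctionDoc holds the full comment text, from "/**" to "*/".

    $oMethodReflect = new ReflectionMethod('Greeter', 'wave');
    $mMethodDoc = $oMethodReflect->getDocComment();
    // $mMethodDoc is false, because wave() has no doc comment.
```

Checking for false before parsing avoids treating undocumented items as empty comments.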

To reflect a class you simply provide the name as a string to the constructor of ReflectionClass. We then have a number of properties and methods available, but the one you care about is the getDocComment() method.

Listing 9 listing-9.php
<?php
    $oClassReflect = new ReflectionClass("classname");
    $sDocComment = $oClassReflect->getDocComment();
?>
You now have the doc comment as a string, but in that form it isn’t very useful. To extract the phpdoc-style fields from it, use a series of string processing functions: a regular expression removes the comment markers and leading asterisks, and the result is then normalized into an array of lines. Additionally, each line may contain multiple successive tabs, which should be treated as a single tab.

Listing 10 listing-10.php
<?php
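    // Normalize line endings first so the pattern below only deals with \n.
    $sDocComment = str_replace("\r", "", $sDocComment);

    // Strip the opening /**, the closing */ and the leading * of each line.
    // [ \t] is used rather than \s so that blank lines are not swallowed.
    $sDocComment = preg_replace('#^[ \t]*/\*\*[ \t]?|^[ \t]*\*/|^[ \t]*\*[ \t]?|^[ \t]+#m', '', $sDocComment);

    // Treat runs of tabs as a single delimiter, then split into lines.
    $sDocComment = preg_replace("/\t+/", "\t", $sDocComment);
    $aDocCommentLines = explode("\n", trim($sDocComment));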
?>
Next you will want to loop through the resulting array and do some basic parsing. String processing is beyond the scope of this article so I will simply point you to our website and you can look at the working parser in the StormAPI package (/model/docgen/).
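
For readers who want a starting point, here is a minimal sketch of such a loop. It is not the StormAPI parser: the function name parseDocCommentLines() and the layout of the array it returns are assumptions made for illustration. It expects lines that have already been stripped of comment markers, with runs of tabs collapsed to a single tab.

```php
<?php
    /**
     * Sorts cleaned doc comment lines into title, summary and attributes.
     * Hypothetical helper, for illustration only.
     */
    function parseDocCommentLines(array $aLines)
    {
        $aResult = array('title' => '', 'summary' => '', 'attributes' => array());
        $aSummary = array();

        foreach ($aLines as $sLine) {
            $sLine = rtrim($sLine);
            if ($sLine === '') {
                continue; // blank separators carry no data in this sketch
            }
            if ($sLine[0] === '@') {
                // Attribute: the name comes first, then the tab-delimited fields.
                $aFields = explode("\t", $sLine);
                $sName = substr(array_shift($aFields), 1);
                $aResult['attributes'][$sName][] = $aFields;
            } elseif ($aResult['title'] === '') {
                $aResult['title'] = $sLine; // first non-blank line
            } else {
                $aSummary[] = $sLine;       // any other non-attribute line
            }
        }

        $aResult['summary'] = implode(' ', $aSummary);
        return $aResult;
    }

    $aParsed = parseDocCommentLines(array(
        'Merchant provides credit card processing services.',
        '',
        'This class accepts multiple parameters',
        'via the constructor.',
        '',
        "@author\tFirstname Lastname",
        "@var\tarray\t\$someVar\tThis array contains something",
    ));
    // $aParsed['attributes']['var'][0] is
    // array('array', '$someVar', 'This array contains something')
```

Attributes are stored as a list per name because some of them, such as @var or @param, can legitimately appear more than once in a comment.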

Using The Extracted Document Comments

Now you should have been able to parse out the data from the comments and store it in an object. You can then do pretty much anything you want with that data. I personally like to output this data to an XML file and use an XSL stylesheet to interpret the data.
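
One possible way to produce such an XML file is PHP’s DOM extension. The sketch below is only an assumption about what the output might look like: the element names class, title, summary and attribute, and the sample data, are invented for the example.

```php
<?php
    // Hypothetical parsed data; in practice this would come from your parser.
    $aParsed = array(
        'title'      => 'Merchant provides credit card processing services.',
        'summary'    => 'Accepts card details & talks to the processor.',
        'attributes' => array('author' => 'Firstname Lastname'),
    );

    $oDocument = new DOMDocument('1.0', 'UTF-8');
    $oDocument->formatOutput = true;

    $oClass = $oDocument->appendChild($oDocument->createElement('class'));
    foreach (array('title', 'summary') as $sField) {
        $oNode = $oClass->appendChild($oDocument->createElement($sField));
        // createTextNode() escapes characters such as & and < for us.
        $oNode->appendChild($oDocument->createTextNode($aParsed[$sField]));
    }
    foreach ($aParsed['attributes'] as $sName => $sValue) {
        $oNode = $oClass->appendChild($oDocument->createElement('attribute'));
        $oNode->setAttribute('name', $sName);
        $oNode->appendChild($oDocument->createTextNode($sValue));
    }

    $sXml = $oDocument->saveXML();
    // To write a file instead, $oDocument->save($sPath) could be used.
```

Building the document through DOM calls rather than string concatenation means the summary text cannot break the XML, whatever characters it contains.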

The next hurdle you will need to clear is loading all the files for a site into memory without executing any of their code.

Fortunately there are several tools that allow you to do just this.

There are three main components to this process. These are:

Finding all the files in the site using SPL’s RecursiveDirectoryIterator.
Loading those files without running their contents and without causing scope resolution issues.
Programmatically determining all the classes, interfaces and functions that have been defined.

First, the Standard PHP Library (SPL) provides a useful class for traversing directories and files programmatically: RecursiveDirectoryIterator. This class is actually far easier to use than the old directory functions.
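
As a small illustration of how compact the SPL approach can be, the sketch below collects the matching files in a single loop by wrapping the directory iterator in SPL’s RecursiveIteratorIterator. This is an alternative to hand-written recursion, not the method used in the listings of this article, and the function name findSourceFiles() is hypothetical.

```php
<?php
    /**
     * Returns the paths of all .php and .inc files below a directory.
     * Hypothetical helper, for illustration only.
     */
    function findSourceFiles($sDirectory)
    {
        $aAllowedTypes = array('php', 'inc');
        $aFiles = array();

        // RecursiveIteratorIterator flattens the tree into one loop.
        $oIterator = new RecursiveIteratorIterator(
            new RecursiveDirectoryIterator($sDirectory)
        );
        foreach ($oIterator as $oFile) {
            if (!$oFile->isFile()) {
                continue; // skips directories and the . and .. entries
            }
            $sExtension = strtolower(pathinfo($oFile->getFilename(), PATHINFO_EXTENSION));
            if (in_array($sExtension, $aAllowedTypes)) {
                $aFiles[] = $oFile->getPathname();
            }
        }

        sort($aFiles);
        return $aFiles;
    }
```

Returning a plain list of paths keeps the directory walk separate from whatever is done with each file afterwards.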

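Second, to load the files located by the RecursiveDirectoryIterator, we use a side effect of the function php_check_syntax(): once it has run, the functions, classes and interfaces defined in those files remain in memory and can be reflected. This differs from include in that nothing the file would print is sent to the browser. Two caveats apply. The PHP manual describes php_check_syntax() as checking the syntax of, and executing, the file, so top-level code with side effects may still run. The function was also removed in PHP 5.0.5, so this technique only works on PHP 5.0.0 through 5.0.4.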

Listing 11 listing-11.php
<?php
    /**
     * includeTree
     *
     * Invocation method for recursive directory load.
     *
     * @param   string      $sDirectory     A path to include, all subdirs will be scanned.
     * @access  private
     */
    private function includeTree($sDirectory)
    {
        $this->recurseTree(new RecursiveDirectoryIterator($sDirectory));
    }
 
    /**
     * recurseTree
     *
     * Recursive function that uses an SPL RecursiveDirectoryIterator
     * to scan directories for files to include.
     *
     * @param   iterator    $oIterator  The directory iterator to walk.
     * @access  private
     */
    private function recurseTree($oIterator)
    {
        $allowedTypes = array('php', 'inc');
        while ($oIterator->valid()) {
            if ($oIterator->isDir() && !$oIterator->isDot()) {
                if ($oIterator->hasChildren()) {
                    $this->recurseTree($oIterator->getChildren());
                }
            }
            else if ($oIterator->isFile()) {
                $path = $oIterator->getPathname();
                // PATHINFO_EXTENSION yields '' for files without an extension
                $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));
                if (in_array($extension, $allowedTypes)) {
                    php_check_syntax_external($path);
                }
            }
            $oIterator->next();
        }
    }
?>
The above two methods are designed to exist within a class. However, php_check_syntax() needs to be called from outside class scope to function as expected, so we create a function in the global scope called php_check_syntax_external() so that class scope is not inherited.

Listing 12 listing-12.php
<?php
    /**
     * PHP Check Syntax External
     *
     * Marshals php_check_syntax() for the global scope and discards the result
     *
     * @param   string  $sFile  The filename to pass along.
     */
    function php_check_syntax_external($sFile)
    {
        php_check_syntax($sFile);
    }
?>
Next, we need to determine what has been declared. There are three distinct functions for this job:

get_defined_functions()
get_declared_classes()
get_declared_interfaces()

get_defined_functions() works differently from the other two in that it returns a multidimensional array. See the PHP documentation for the full details; for our purposes, we simply read the top-level key user.

Listing 13 listing-13.php
<?php
    $aFuncs = get_defined_functions();
    $aFuncs = $aFuncs['user'];
?>
$aFuncs now contains an array of all user-defined function names, which you can loop over, passing each one to new ReflectionFunction($functionName).
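
Put together, that loop might look like the following minimal sketch. The function hello_world() and the array $aDocComments are invented for the example.

```php
<?php
    /**
     * hello_world Example function
     */
    function hello_world()
    {
    }

    $aDocComments = array();
    $aFuncs = get_defined_functions();
    foreach ($aFuncs['user'] as $sFunctionName) {
        $oFunctionReflect = new ReflectionFunction($sFunctionName);
        // getDocComment() returns false when a function has no doc comment.
        $aDocComments[$sFunctionName] = $oFunctionReflect->getDocComment();
    }
    // $aDocComments['hello_world'] now holds the raw comment text.
```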

To get the classes and interfaces, we must use a combination of their list functions as well as reflection to determine if they are built-in or if they were created by the user.

Listing 14 listing-14.php
<?php
    foreach(get_declared_classes() as $sClassName) {
        $oClassReflect = new ReflectionClass($sClassName);
        if($oClassReflect->isUserDefined()) {
            // Do something
        }
    }
?>
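
Interfaces can be handled the same way, since ReflectionClass reflects interfaces as well as classes. In the sketch below, the interface Payable and the array $aUserInterfaces are made up for the example.

```php
<?php
    /**
     * Payable Something that can be charged.
     */
    interface Payable
    {
    }

    $aUserInterfaces = array();
    foreach (get_declared_interfaces() as $sInterfaceName) {
        $oInterfaceReflect = new ReflectionClass($sInterfaceName);
        // Built-in interfaces such as Traversable are skipped here.
        if ($oInterfaceReflect->isUserDefined()) {
            $aUserInterfaces[$sInterfaceName] = $oInterfaceReflect->getDocComment();
        }
    }
```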

 

Courtesy - http://www.phpriot.com/articles/reflection-api
