PHP syntax checker for Git repository changes

So I was reading this article about writing a PHP lint checker that checks your PHP files for syntax errors before a commit. Since I use git, I decided to write something similar. I put this in .git/hooks/pre-commit and made the hook executable. So far it’s working very well.

git status | cut -c 3- | egrep  '^(modified|new file)' | egrep '\.(php|html)$' |awk -F: '{ gsub(/[[:space:]]*/,"", $2); print $2}' |xargs -n1 php -l

In Linux, you can take the output of one command and make it the input of the next command using the pipe character, | (Shift + \). So let’s break down this command and see how it helps us keep developers from committing PHP code with errors.

The first piece of code just shows us the status of a git project:

git status
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#
#       modified:   html/assets/css/images/bg-nav.gif
#       modified:   html/assets/css/site.css
#       modified:   mplat/Application/Processor/Industry.php
#       modified:   mplat/Themes/default/site/Industry.tpl2
#       modified:   mplat/Themes/default/site/Page.default.tpl2
#
no changes added to commit (use "git add" and/or "git commit -a")

Once we have the list of changed files, we need to parse out their names. The cut command can strip leading characters from each line. Git prefixes each file line with two characters (a # and a tab), so we keep everything from the third character onward:

git status | cut -c 3-
On branch master
Changed but not updated:
 (use "git add ..." to update what will be committed)

modified:   html/assets/css/images/bg-nav.gif
modified:   html/assets/css/site.css
modified:   mplat/Application/Processor/Industry.php
modified:   mplat/Themes/default/site/Industry.tpl2
modified:   mplat/Themes/default/site/Page.default.tpl2

changes added to commit (use "git add" and/or "git commit -a")

As you can see, we can now tell which files have changed and process them accordingly. Next we pass this result into the extended grep command (egrep) and pull out only the files that are modified or new. We can skip deleted files, since those will no longer be part of the repository after the commit.

git status | cut -c 3- | egrep  '^(modified|new file)'
modified:   html/assets/css/images/bg-nav.gif
modified:   html/assets/css/site.css
modified:   mplat/Application/Processor/Industry.php
modified:   mplat/Themes/default/site/Industry.tpl2
modified:   mplat/Themes/default/site/Page.default.tpl2

Once we have a clean list of files that we need to check, we can limit them based on their extension. This could be done in a single egrep, but for the sake of this article it is easier to explain as two steps than as one complicated regular expression. In my applications both .html and .php files are processed as PHP, so we limit the syntax check to these files.

git status | cut -c 3- | egrep  '^(modified|new file)' | egrep '\.(php|html)$'
modified:   mplat/Application/Processor/Industry.php
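For reference, here is a rough sketch of what the single-egrep version could look like. The helper name filter_php_lines is made up for illustration, grep -E is used as the equivalent of egrep, and the "deleted" and "new file" sample lines are invented so the snippet runs without a repository.

```shell
# Hypothetical helper: keep only "modified" / "new file" lines
# whose path ends in .php or .html, using one pattern instead of two.
filter_php_lines() {
    grep -E '^(modified|new file):.*\.(php|html)$'
}

# Sample input shaped like the output of `git status | cut -c 3-`.
# The last two lines are invented for the example.
printf '%s\n' \
    'modified:   html/assets/css/site.css' \
    'modified:   mplat/Application/Processor/Industry.php' \
    'deleted:    mplat/Old.php' \
    'new file:   html/index.html' | filter_php_lines
# prints the Industry.php line and the index.html line
```

The two-step version in the article selects the same lines for this input; the combined pattern just saves one process.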

Now in this example we only have one file (Industry.php) that matches our regular expression. To extract the file name from the line, I chose to use awk, an incredibly powerful language for processing text. This particular awk command uses the colon ‘:’ as the field separator to split the line. It then uses the gsub function to strip the whitespace out of the second field and prints the result. In awk, fields are referenced with a dollar sign ‘$’: $0 is the entire line, and in this case $1 is everything before the colon and $2 is everything after.

git status | cut -c 3- | egrep  '^(modified|new file)' | egrep '\.(php|html)$' |awk -F: '{ gsub(/[[:space:]]*/,"", $2); print $2}'
mplat/Application/Processor/Industry.php
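To see the awk step in isolation, here is a small sketch that feeds it hand-written lines. The helper name extract_path and the second sample path are assumptions for illustration. Note that gsub removes every whitespace character in the field, not only the leading ones, so a path containing a space comes out altered.

```shell
# Hypothetical helper wrapping the awk step from the article.
extract_path() {
    awk -F: '{ gsub(/[[:space:]]*/, "", $2); print $2 }'
}

# $1 is "modified", $2 is "   mplat/Application/Processor/Industry.php"
printf '%s\n' 'modified:   mplat/Application/Processor/Industry.php' | extract_path
# prints: mplat/Application/Processor/Industry.php

# Invented path with a space in it: the inner space is stripped as well.
printf '%s\n' 'new file:   docs/my page.html' | extract_path
# prints: docs/mypage.html
```

For a repository without spaces in file names this makes no difference, which matches the output shown above.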

Once we have our list of files to test, we pass them to the xargs command. xargs takes the input we’ve given it and turns it into arguments for another command; with -n1 it runs that command once per file. In this case php -l runs a syntax check on a file. If php -l detects a syntax error, it exits with an error, xargs passes that failure on, and git ends up aborting the commit. The normal php -l errors are displayed, so it is easy for the developer to fix them and reissue the commit. The final command looks like this:

git status | cut -c 3- | egrep  '^(modified|new file)' | egrep '\.(php|html)$' |awk -F: '{ gsub(/[[:space:]]*/,"", $2); print $2}' |xargs -n1 php -l
No syntax errors detected in mplat/Application/Processor/Industry.php
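The reason a failed check blocks the commit is that the hook's exit status is the exit status of the last command in the pipeline, which is xargs, and xargs reports failure if any of the commands it ran failed. The sketch below shows that behaviour with a stand-in checker instead of php -l, so it runs without PHP installed. The name check_all and the file names are invented for the example.

```shell
# Stand-in for `xargs -n1 php -l`: the inner command succeeds unless
# its argument is "broken.php". (Hypothetical checker, not php itself.)
check_all() {
    xargs -n1 sh -c '[ "$1" != "broken.php" ]' sh
}

if printf '%s\n' good.php other.php | check_all; then
    echo "all files pass: the commit would proceed"
fi

if ! printf '%s\n' good.php broken.php other.php | check_all; then
    # xargs exits non-zero here (GNU xargs reports 123)
    echo "a file failed: the commit would be blocked"
fi
```

Because xargs keeps going after a failure, every file is still checked, and the developer sees all the errors at once.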

Once you have written and tested this command in a terminal, you can add it to the git repository hooks. From the root of the git repository, edit ‘.git/hooks/pre-commit’ to look something like this:

#!/bin/bash
#
# An example hook script to verify what is about to be committed.
# Called by git-commit with no arguments.  The hook should
# exit with non-zero status after issuing an appropriate message if
# it wants to stop the commit.
#
# To enable this hook, make this file executable.

# This is slightly modified from Andrew Morton's Perfect Patch.
# Lines you introduce should not have trailing whitespace.
# Also check for an indentation that has SP before a TAB.
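
# Syntax-check every modified or new .php/.html file.
# A non-zero exit status from this pipeline stops the commit.
git status | cut -c 3- | egrep  '^(modified|new file)' | egrep '\.(php|html)$' | awk -F: '{ gsub(/[[:space:]]*/,"", $2); print $2}' | xargs -n1 php -l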

You can remove the comments at the top, but I chose to leave them in since this was my first attempt at a git hook. To make the hook run whenever a git commit is issued, all you have to do is make the file executable using chmod. Since I added this hook, I have done dozens of commits with no unexpected errors. I will leave this article as is; if I make any changes to my pre-commit, I will post a new article. If anyone can think of a better way to do this, I’m always open to suggestions.
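For completeness, the chmod step looks roughly like this. The sketch works on a throwaway file created with mktemp so it does not touch a real repository; in a real project the path would be .git/hooks/pre-commit.

```shell
# Throwaway stand-in for .git/hooks/pre-commit (assumption for the demo).
hook=$(mktemp)
printf '#!/bin/bash\nexit 0\n' > "$hook"

# The same command you would run on .git/hooks/pre-commit.
chmod +x "$hook"

# The permissions string now shows the execute bit for the owner.
ls -l "$hook"
```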

  1. It’s nice, but you’d better use git status --porcelain for getting the list of files to commit, as its output is guaranteed not to change.
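A minimal sketch of that suggestion, assuming the short "XY path" line format that git status --porcelain prints, where the first column is the staged status. The helper name staged_php_files is hypothetical, and the sample input is made up so the snippet runs without a repository. Renamed files and paths that git quotes (for example, names with spaces) are not handled here.

```shell
# Hypothetical helper: read `git status --porcelain` lines on stdin and
# print the paths of staged added/modified files ending in .php or .html.
staged_php_files() {
    sed -n -E 's/^[AM]. (.*\.(php|html))$/\1/p'
}

# Made-up porcelain output: staged edit, unstaged edit, staged new file,
# staged deletion, untracked file, staged non-PHP file.
printf '%s\n' \
    'M  mplat/Application/Processor/Industry.php' \
    ' M html/unstaged.php' \
    'A  html/index.html' \
    'D  mplat/Old.php' \
    '?? notes.php' \
    'M  html/assets/css/site.css' | staged_php_files
# prints:
#   mplat/Application/Processor/Industry.php
#   html/index.html

# In the hook itself this might become (an assumption, not the author's script):
#   git status --porcelain | staged_php_files | xargs -n1 php -l
```

Unlike the original pipeline, this only picks up files whose changes are staged, which is closer to what will actually be committed.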
