Tagged with php

Zend Framework vs. Django Performance

This is neither a scientific nor a rigorous test, but here’s some interesting data for people to chew on.  I’ve got a production site built using the Zend Framework and a beta site built using django.  What’s interesting is the “Time spent downloading a page” graphs from Google Webmaster Tools.

The Zend Framework graph: average speed is 458 ms, with a minimum of 246 ms.

[Graph: wink_time]

The django graph: you can see in the speed where I changed from CGI style to FCGI.  The average is 531 ms and the minimum is 81 ms, though my eyeball average for the month of April is 100 ms.

[Graph: zapquiz_time1]

The nice part is that these graphs have the same scale, so it’s pretty clear that once I switched from CGI to FCGI (under lighttpd) the django performance is roughly 4x better than ZF (about 100 ms per page versus an average of 458 ms).


Performance of Python, PHP and Perl

Had a 7GB text file that I needed to run some parsing on (to prepare for a DB import).  Out of habit I pulled out perl and whipped up a quick program to parse it and generate some loadable files.  While watching it run I got to thinking about why: why perl?  (Yes, I know habits are hard to break.)  So while it ran I rewrote the program in PHP and Python.

Performance Numbers (on 5 million lines worth of the file)

  $ time ./split.pl  p.test           # Perl 5.8.8

  real    0m38.577s
  user    0m33.554s
  sys     0m0.848s

  $ time ./split.py p.test            # Python 2.4.4
  real    0m44.895s
  user    0m42.975s
  sys     0m0.900s

  $ time php split.php p.test         # PHP 5.2.6RC4
  real    1m10.887s
  user    0m51.251s
  sys     0m18.677s
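
To put those three runs side by side, here is a quick sketch in Python 3 that converts the timings above to seconds and reports how much slower each run is than the Perl one.  The `slowdown` helper and the dictionary layout are just illustrative names, not part of the original scripts.

```python
# Wall-clock ("real") and CPU ("user", "sys") seconds, copied from the runs above.
timings = {
    "perl":   {"real": 38.577, "user": 33.554, "sys": 0.848},
    "python": {"real": 44.895, "user": 42.975, "sys": 0.900},
    "php":    {"real": 70.887, "user": 51.251, "sys": 18.677},
}


def slowdown(lang, metric, baseline="perl"):
    """Return how much slower `lang` is than `baseline`, as a percentage."""
    base = timings[baseline][metric]
    return (timings[lang][metric] / base - 1.0) * 100.0


for lang in ("python", "php"):
    for metric in ("real", "user"):
        print("%-6s %-4s %+.0f%%" % (lang, metric, slowdown(lang, metric)))
```

Comparing both "real" and "user" matters here because the PHP run spends far more time in "sys" than the other two, so its wall-clock gap is wider than its CPU gap.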

So it appears that Perl is the right choice for this job.  Python is a good second choice, but PHP is roughly 50% slower in user time (and over 80% slower in wall-clock time, with a large 18.7 seconds of sys time), most likely due to not having compiled regular expressions.  I also might note that I’m not fond of the python if/else problem with a chained expression match, where I want to “side effect” out the results of the match.  Is there better syntax?
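
One possible answer to the syntax question, offered as a sketch rather than a definitive fix: on Python 3.8 or later (much newer than the 2.4.4 used for the timings), the assignment expression `:=` binds the match object inside the condition, so the chain stays as flat as the Perl if/elsif.  The `tally` helper, the slightly tightened patterns, and the third sample line are assumptions made for illustration; the first two sample lines come from the comments in the Perl script in this post.

```python
import re
from collections import defaultdict

# Illustrative patterns: a non-greedy (.*?) keeps the whole trailing count in group 3.
MULTI = re.compile(r'^__MULTI_TOKEN__\s+(\S+)\s+(.*?)\s+(\d+)\s*$')
SINGLE = re.compile(r'^__SINGLE_TOKEN__\s+(\S+)\s+(\d+)\s*$')


def tally(lines):
    """Sum counts per first token; return (totals, multi_rows, unknown_lines)."""
    totals = defaultdict(int)
    multi_rows = []
    unknown = []
    for line in lines:
        line = line.strip()
        # `:=` assigns the match and tests it in one expression,
        # so no nested else/if is needed to try the second pattern.
        if m := MULTI.match(line):
            totals[m.group(1)] += int(m.group(3))
            multi_rows.append((m.group(1), m.group(2), int(m.group(3))))
        elif m := SINGLE.match(line):
            totals[m.group(1)] += int(m.group(2))
        else:
            unknown.append(line)
    return dict(totals), multi_rows, unknown


sample = [
    "__SINGLE_TOKEN__ adrianenamorado                 1",
    "__MULTI_TOKEN__ a aaron yalow        1",
    "__MULTI_TOKEN__ a aardvark\t12",   # invented line with a two-digit count
    "garbage",
]
totals, rows, unknown = tally(sample)
print(totals)
print(rows)
print(unknown)
```

On older interpreters the same flattening can be had by looping over a list of (pattern, handler) pairs and stopping at the first match, at the cost of moving each branch body into its own function.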

Here’s the code for your viewing pleasure and possible commentary.

Perl

#!/usr/bin/env perl
use strict;

my %first;

open(FULL, ">full.txt") or die "Cannot open full.txt: $!";

while (<>) {
# __SINGLE_TOKEN__ adrianenamorado                 1
# __MULTI_TOKEN__ a aaron yalow        1
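    chomp;    # strip only the newline; chop removes the last character whatever it is
    # Non-greedy (.*?) so that every digit of the trailing count lands in $3.
    if (/^__MULTI_TOKEN__\s+(\S+)\s+(.*?)\s+(\d+)\s*$/) {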
        $first{$1} += $3;
        print FULL  $1," ", $2, "\t", $3, "\n";
    } elsif (/^__SINGLE_TOKEN__\s+(\S+)\s*\t?\s*(\d+)\s*$/) {
        $first{$1} += $2;
    } else {
        print "Unknown: ", $_, "\n";
    }
}

close(FULL);

open(FIRST, ">first.txt") or die "Cannot open first.txt: $!";
while (my($k, $c) = each %first) {
    print FIRST $k,"\t",$c,"\n";
}
close(FIRST);

Python

#!/usr/bin/env python
# Python 2 syntax, matching the 2.4.4 interpreter that was timed.
import sys, re

first = dict()

ofd = open("full.txt", 'w')

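# Raw strings keep the backslashes intact; the non-greedy (.*?) leaves
# every digit of the trailing count to the last group.
mre = re.compile(r'^__MULTI_TOKEN__\s+(\S+)\s+(.*?)\s+(\d+)\s*$')
sre = re.compile(r'^__SINGLE_TOKEN__\s+(\S+)\s*\t?\s*(\d+)\s*$')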

ifd = open(sys.argv[1], 'r')

for line in ifd :
    line = line.strip()
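    m = mre.match(line)
    if m :
        # Accumulate the count per first token, as the Perl and PHP versions do with +=
        first[m.group(1)] = first.get(m.group(1), 0) + int(m.group(3))
        # write() avoids the extra spaces that print inserts between its arguments
        ofd.write("%s %s\t%s\n" % m.groups())
    else :
        m = sre.match(line)
        if m :
            first[m.group(1)] = first.get(m.group(1), 0) + int(m.group(2))
        else :
            print "Unknown ", line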

ofd.close()

ofd = open("first.txt", 'w')
for (k, c) in first.iteritems() :
    # write() keeps the key and count separated by a single tab, as in the Perl output
    ofd.write("%s\t%s\n" % (k, c))
ofd.close()

PHP

<?php
$first = array();

$fd = fopen("full.txt", 'w');
$in = fopen($argv[1], 'r');

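// Compare against false explicitly so a final line of "0" does not end the loop early.
while (($line = fgets($in)) !== false) {
    $line = trim($line);
    // Non-greedy (.*?) so that every digit of the trailing count lands in $m[3].
    if (preg_match('/^__MULTI_TOKEN__\s+(\S+)\s+(.*?)\s+(\d+)\s*$/', $line, $m)) {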
        $first[$m[1]] += $m[3];
        fprintf($fd, "%s %s\t%d\n", $m[1], $m[2], $m[3]);
    } else if (preg_match('/^__SINGLE_TOKEN__\s+(\S+)\s*\t?\s*(\d+)\s*$/', $line, $m)) {
        $first[$m[1]] += $m[2];
    } else {
        print "Unknown: {$line}\n";
    }
}

fclose($fd);

$fd = fopen("first.txt", 'w');
foreach ($first as $k => $c) {
    fprintf($fd, "%s\t%d\n", $k, $c);
}
fclose($fd);

Bad Metrics for Trends in Programming Languages

Was reading a posting about Trends in web development according to Google, and one section got me thinking.  Their trends show end-user search behavior as it relates to popularity.  However, when I think about language popularity I think in terms of programs written or lines of code produced.  The Google trend, I think, is a misplaced metric.

Specifically, when I’m writing in various languages, my search behavior looks like this:

  • PHP – I’m frequently found typing a query like “php strpos” or “php call_user_func”, basically using “php” as a Google keyword to get me to the manual page for the specific PHP function.  It’s fast and easy.
  • Perl – My man page behavior is a bit different: it’s “perldoc -f index”, since perl has all of its documentation built into the system, with simple command line tools to bring it forth.
  • Java – Very little of Java is focused on language issues; it’s all about the libraries.  You’re going to be doing a query like “HashMap” without the Java qualification.
  • C++ – Very similar to Java: you know the language, so it’s queries like “STL list iterator” or other library- or feature-specific bits.  Sure, once in a while I might use “C++ abstract template”, but when you’re in C++ land the word “template” plus %$(#*(#^ (that’s the current g++ error on your screen) is usually sufficient to get the right help.
  • JavaScript – Can’t really leave it out.  It is a web-search-centric language and I do qualify my queries with “JavaScript”, but I’ve found that the results aren’t very good (the pagerank for the sites that have the good info is low).

That’s it.  Unless you aggregate all of the sub-search terms, I think it’s premature to use Google Trends as a valid measure of true language trends.
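
To make “aggregate all of the sub-search terms” a bit more concrete, here is a rough Python sketch.  It assumes you already have some per-query volume numbers from somewhere; the `aggregate` function, the `QUERY_TERMS` table, and especially the volumes are hypothetical placeholders invented only to exercise the code, not real Google Trends data.

```python
# Example queries per language, taken from the search habits listed above.
QUERY_TERMS = {
    "PHP": ["php strpos", "php call_user_func"],
    "Java": ["HashMap"],
    "C++": ["STL list iterator", "C++ abstract template"],
}


def aggregate(volumes, terms_by_language):
    """Sum the volume of every listed query for each language."""
    return {
        lang: sum(volumes.get(term, 0) for term in terms)
        for lang, terms in terms_by_language.items()
    }


# Placeholder volumes, made up purely for illustration.
made_up_volumes = {
    "php strpos": 3,
    "php call_user_func": 2,
    "HashMap": 4,
    "STL list iterator": 1,
    "C++ abstract template": 1,
}

language_totals = aggregate(made_up_volumes, QUERY_TERMS)
print(language_totals)
```

The hard part is not the summing but building a term list that is complete enough per language, which is exactly why a single keyword makes a weak proxy.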
