Skis / Toys / Fun

Appeal to greatness not guilt

Facebook vs. Google

June 8th, 2009

Had a great discussion this morning with a co-worker.  The crux of the conversation was Google vs. Facebook: who is going to win? Is Google Wave the game changer that will kill Facebook?

Nope.  We’ve really created three different product spaces in these organizations.  We’ve got Google, which is “defined” as an applications company (Gmail, Docs, Apps, etc.), and Facebook, which is social.

Can Google beat Facebook, or Facebook beat Google? Probably not. I don’t quite see how my “who’s got prettier eyes” news feed item has any meaning on my “apps” page on iGoogle.  Sure, in a tab, etc., etc.  But at the same time, should Facebook be a news feed, a platform, or ???  Fundamentally, I don’t see why Facebook isn’t pushing its platform much harder than Google.  It would be nice to see every site offer a login with Facebook and let me move my data and my personality around the web.  Make my friends part of my web experience, and cut down on the “do you want to give permission” gunk.  Go for two-tier access: anonymous FB vs. logged-in FB.

That way when I click on the “whose eyes are these” trivia quiz, I can participate without sharing all of my data, but then they can impose the “join” functionality when I feel like it.

Side note: the funny part is that Facebook is great for accepted bi-directional friendship, but Twitter proves there’s a solid marketplace for uni-directional friendship, which is handled (weakly) by “fan” pages on Facebook.  It’s probably a topic unto itself to discuss how businesses can use those “fan” tendencies on Twitter/Facebook to bring in new customers.

Content aggregation vs. Human Editors

June 2nd, 2009

This idea came up in a conversation the other day with somebody… The crux of it is that many startups, toy websites, and research projects have been created over the years to aggregate blogs, tweets, and lives into a single stream: everything from Facebook, FriendFeed, Google Reader, Netvibes, etc. Hey, even I wrote one (feedini.com; it’s probably not running at the moment).

The problem is that none of these experiences can compare to a newspaper in a few ways:

  • Focus
    We’ve all read and re-read the same story across 37 different blogs or news outlets, which gets to be a tedious pain.
  • Breadth
    If I’m interested in “food,” I’m interested in food; reading a single blog isn’t going to make me feel like I’m reading the food section of the newspaper.
  • Diversity
    I don’t read the sports section, but every now and then, because it’s there, I see something that’s worth reading.

This got me thinking: everybody has been trying to make an NLP/machine learning system that creates the “right” content for you.  What happens if that’s the wrong idea?

My quick proposal is that somebody should create an editor’s desk.  Think of it as Bloglines meets About.com: I can create a personal newspaper that I can focus and drive any way I want…  Create editorial content or republish content quickly and easily, etc.  With the added benefit that I could format it like a news site (rather than a blog).

This way my readership could look at a collection of stories that I was publishing, or technically republishing (think AP wire).  I could establish the editorial voice of the site and have conversations/discussions that are close to the readership (think Hacker News).  Then make sure there is a solid set of editor tools. This is where my inner geek gets off: this is where the NLP/machine learning could help an editor focus in on important content to republish on their site.

Thus if you wanted to make a food portal, you would create “food.example.com”, select a layout from the inventory (à la Bloglines templates), then feed it a collection of interesting feeds, and it would suggest a bunch more that were similar… Voilà, you’re now 80% of the way to making an editorial portal.
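
As a rough illustration of the "suggest a bunch more that were similar" step, here is a toy sketch that ranks candidate feeds by word overlap (Jaccard similarity) with the feeds an editor has already picked. The feed names, the sample text, and the `suggest_feeds` helper are all made up for the example; a real editor tool would probably want something much smarter than bag-of-words overlap.

```python
import re


def words(text):
    # Lowercased word set; deliberately crude tokenization.
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(x, y):
    # Size of the overlap divided by size of the union (0.0 when both are empty).
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def suggest_feeds(selected, candidates, top_n=2):
    """Rank candidate feeds by similarity to the combined text of the selected feeds."""
    profile = set()
    for text in selected.values():
        profile |= words(text)
    scored = [(jaccard(profile, words(text)), name) for name, text in candidates.items()]
    scored.sort(reverse=True)
    # Drop candidates that share no words at all with the selection.
    return [name for score, name in scored[:top_n] if score > 0]


# Hypothetical feeds: name -> a sample of recent post text.
selected = {
    "bread-blog": "sourdough bread baking recipe flour oven",
    "grill-notes": "barbecue grill recipe smoked brisket",
}
candidates = {
    "pastry-daily": "pastry baking recipe butter flour oven",
    "box-scores": "baseball scores playoffs pitcher",
    "wine-weekly": "wine pairing recipe dinner",
}

print(suggest_feeds(selected, candidates))  # ['pastry-daily', 'wine-weekly']
```

The point of the sketch is only that the machine does the ranking while the editor still decides what actually gets added to the portal.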

Maybe it’s just democratizing Huffington Post…

Tags:   · ·

Array Intersection Bake-off

May 15th, 2009

One of those moments where an interview question turns into a research project, or is it really a bake-off?  The simple problem is to demonstrate an algorithm to intersect two lists of numbers; fundamentally it’s a question about using modern interpreted languages and their associative array bits to make a simple intersection routine.  However, many languages support many different ways to do things.  I’ve put together a test of Python vs. Java vs. Ruby vs. Perl vs. PHP and got a few interesting benchmarks.

Short version of the results: Java wins, with Python coming in second.

But things are not so simple. For instance, there is a simple base approach (Python example):

def isect(a, b) :
    o = []
    h = {}
    for i in a :
        h[i] = True
    for i in b :
        if h.get(i, False) :
            o.append(i)
    return o

or a more language-specific version

def isect(a, b) :
    h = dict(iter([(i, True) for i in a]))
    return [i for i in b if h.get(i,False)]

or just using features of the language

def isect(a, b) :
    return list(set(a) & set(b))

All of these return the same result, but clearly we can think of this as a progression of an algorithm.  Version 1 focuses on textbook to code, version 2 says there are some cool language features, and version 3 says, dude, like, don’t you really know what you’re doing?
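
A quick sanity check on the "same result" claim, as a sketch: the three versions are renamed `isect_v1` to `isect_v3` here (names chosen for this example only) so they can sit side by side. They agree on which elements are common, but the set-based version does not promise the order of `b` and collapses duplicates, so the comparison sorts first and the sample input has no repeats.

```python
def isect_v1(a, b):
    # Textbook approach: mark everything in a, then scan b.
    seen = {}
    for i in a:
        seen[i] = True
    out = []
    for i in b:
        if seen.get(i, False):
            out.append(i)
    return out


def isect_v2(a, b):
    # Same idea, written with a generator expression and a comprehension.
    seen = dict((i, True) for i in a)
    return [i for i in b if seen.get(i, False)]


def isect_v3(a, b):
    # Let the built-in set type do the work; result order is not guaranteed.
    return list(set(a) & set(b))


a = [1, 2, 3, 4, 5, 6]
b = [4, 6, 8, 10, 2]

results = [sorted(f(a, b)) for f in (isect_v1, isect_v2, isect_v3)]
print(results[0])  # [2, 4, 6]
assert results[0] == results[1] == results[2]
```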

The upshot was that I sat down and wrote 12 different versions of this just to see what the language differences were.

Language  Version  Alg. Time  Run Time
java      1            3.076     5.888
perl      1            3.622     6.691
php       1            3.901    22.878
python    1            1.740     4.526
ruby      1            3.517     9.853
java      2            0.817     4.362
python    2            3.550     6.441
ruby      2            7.984    14.281
python    3            1.500     4.356
ruby      3            3.809    10.184
c++       1            0.830     1.209
python    4            1.040
php       2            2.000
php       3           10.064
java      3         3992.045
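
The post does not show the timing harness or define the two columns, so the following is only a guess at how an "Alg. Time" style number could be collected in Python: time the function call alone, leaving out interpreter start-up and input generation. The list size, value range, repeat count, and the `time_isect` helper are assumptions for illustration, not the settings behind the table.

```python
import random
import time


def isect(a, b):
    # Python Version 4 from the post: set membership test per element of b.
    h = set(a)
    return [i for i in b if i in h]


def time_isect(func, size=100_000, repeats=5, seed=42):
    """Return the best wall-clock time in seconds for one func(a, b) call."""
    rng = random.Random(seed)
    a = [rng.randrange(size * 10) for _ in range(size)]
    b = [rng.randrange(size * 10) for _ in range(size)]
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(a, b)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    print("isect best of 5: %.4f s" % time_isect(isect))
```

Taking the best of several runs is one common way to reduce noise from whatever else the machine is doing; a whole-process number would instead be taken from outside the interpreter.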

Of course you’re probably wondering about language versions:

  • java 1.6.0
  • perl 5.8.8
  • php 5.2.9
  • python 2.5.4
  • ruby 1.8.6

    // Java Version 1
    private static int[] isect(int a[], int b[]) {
        int            l[] = new int[a.length];
        TreeMap        h = new TreeMap();
        int            idx = 0;
 
        for (int i = 0; i < a.length; i++) {
            h.put(new Integer(a[i]), 1);
        }
        for (int i = 0; i < b.length; i++) {
            if (h.containsKey(new Integer(b[i])))
                l[idx++] = b[i];
        }
 
        int o[] = new int[idx];
        for (int i = 0; i < idx; i++) {
            o[i] = l[i];
        }
 
        return o;
    }

# Perl Version 1
sub isect {
    my($a, $b) = @_;
    my(@o, %h);
 
    for my $i (@$a) {
        $h{$i} = 1;
    }
    for my $i (@$b) {
        push(@o, $i) if $h{$i};
    }
 
    return @o;
}

# PHP Version 1
function isect($a, $b) {
    $h = array();
    $o = array();
 
    foreach ($a as $i) {
        $h[$i] = true;
    }
    foreach ($b as $i) {
        if ($h[$i]) {
            array_push($o, $i);
        }
    }
 
    return $o;
}

# Python Version 1
def isect(a, b) :
    o = []
    h = {}
    for i in a :
        h[i] = True
    for i in b :
        if h.get(i, False) :
            o.append(i)
    return o

# Ruby Version 1
def isect(a, b)
  return a & b
end

// Java Version 2
    private static Integer[] isect(Integer a[], Integer b[]) {
        ArrayList<Integer> l = new ArrayList<Integer>(a.length);
        HashSet<Integer>   h = new HashSet<Integer>();
 
        for (int i = 0; i < a.length; i++) {
            h.add(a[i]);
        }
        for (int i = 0; i < b.length; i++) {
            if (h.contains(b[i]))
                l.add(b[i]);
        }
 
        return (Integer[])l.toArray(new Integer[0]);
    }

# Python Version 2
def isect(a, b) :
    h = dict(iter([(i, True) for i in a]))
    return [i for i in b if h.get(i,False)]

# Ruby Version 2
def isect(a, b) 
  # Convert array to Set objects and perform intersection
  a = a.to_set
  b = b.to_set
 
  return a & b
end

# Python Version 3
def isect(a, b) :
    return list(set(a) & set(b))

# Ruby Version 3
def isect(a, b) 
    o = Array.new
    h = Hash.new
 
    a.each do |i|
        h[i] = 1
    end
    b.each do |i|
        if h[i] 
            o.push(i)
        end
    end
 
    return o
end

# PHP Version 2
function isect($a, $b) {
    $b = array_flip($b);
    $o = array();
 
    foreach ($a as $i) {
        if(isset($b[$i])) {
            $o[] = $i;
        }
    }
 
    return $o;
}

# PHP Version 3
function isect($a, $b) {
    return array_intersect($a, $b);
}

    // Java Version 3
    private static Integer[] isect(Integer a[], Integer b[]) {
        Set<Integer> l = new HashSet<Integer>(Arrays.asList(a));
        l.retainAll(Arrays.asList(b));
 
        return (Integer[])l.toArray(new Integer[0]);
    }

# Python Version 4
def isect(a, b) :
    h = set(a)
    return [i for i in b if i in h]

// C++ Version 1
std::list<int>* isect(int len, int a[], int b[]) {
    __gnu_cxx::hash_set<int>   h = __gnu_cxx::hash_set<int>();
 
    for (int i = 0; i < len; i++) 
        h.insert(a[i]);
 
    std::list< int>   *l = new std::list< int>();
 
    for (int i = 0; i < len; i++) 
        if (h.find(b[i]) != h.end()) 
            l->push_back(b[i]);
 
    return l;
}

Would enjoy people’s thoughts and feedback, or alternate implementations for these and other languages.

Zend Framework vs. Django Performance

April 24th, 2009

This is neither a scientific nor a rigorous test…  But here’s some interesting data for people to chew on.  I’ve got a production site built using the Zend Framework and a beta site built using Django.  What’s interesting is the “Time spent downloading a page” graphs from Google Webmaster Tools.

The Zend Framework graph: average speed is 458 ms with a min of 246 ms.

[Graph: wink_time]

The Django graph: you can see in the speed where I changed from CGI style to FCGI. The average is 531 ms and the min is 81 ms, though my eyeball average for the month of April is 100 ms.

[Graph: zapquiz_time1]

The nice part is that these graphs have the same scale, so it’s pretty clear that once I switched from CGI to FCGI (under lighttpd), the Django performance is about 4x better than the ZF (roughly 100 ms vs. 458 ms).

Django performance

April 23rd, 2009

I’ve been working on ZapQuiz and am still having the love/hate with Python and Django.  Areas that I would like to see improved:

Template Variables: I’m currently doing strange things like:

 <body id="{% block tmpl_id %}{% endblock %}" class="{% block tmpl_class %}{% endblock %}">

and then the included template is setting those variables, which when you think about it is a bit off…

Footer Scripts: I would like to be able to stuff all the scripts in the footer, but there’s no easy way to facilitate a “block append”, where any included template could include a short snippet to be appended to the footerscript block.

Template Performance: OK, it’s good.  It really helped when I found a script to cache the parsed version of a template for rendering.  My page render time for most pages is < 100 ms, which is pretty good when you consider it’s running on a 5-year-old machine.

Models: The models are a slippery slope. I’m finding that 90% of my functionality is drifting into the model layer, which is good… but I don’t feel like there’s good separation.

File Layouts: This really is Django’s strongest and weakest point. It wants everything to be modular, but almost takes it too far for an application.  What’s an app, what’s a webapp?

Auth: django.contrib.auth sucks!  It forces two paradigms that should be abolished… SESSIONS and USER=AUTHENTICATION.  Long ago I learned that using a SESSION object is a flawed approach: you should cache the database objects and reconstruct state as necessary, not depend on having things like a shopping cart in memory.  The second bit is that if I want to provide multiple authentication systems (email+password, user+password, OAuth, Facebook), having your user object tied to authentication is a bad practice.  The biggest problem is that I don’t want to rewrite all of auth yet…
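
To make the "user is not authentication" point concrete, here is a small sketch of one possible shape: the user record holds profile data only, and any number of (provider, identifier) credentials point at it. The class names, provider labels, and sample values are assumptions for illustration, not django.contrib.auth's model, and actual password or token verification is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class User:
    # Profile data only; nothing here says how the person logs in.
    user_id: int
    display_name: str


@dataclass
class Directory:
    """Maps (provider, identifier) credentials onto users; several may point at one user."""
    users: Dict[int, User] = field(default_factory=dict)
    credentials: Dict[Tuple[str, str], int] = field(default_factory=dict)

    def add_user(self, user):
        self.users[user.user_id] = user

    def link(self, provider, identifier, user_id):
        # provider is a free-form label such as "email", "username" or "facebook".
        self.credentials[(provider, identifier)] = user_id

    def find_user(self, provider, identifier) -> Optional[User]:
        user_id = self.credentials.get((provider, identifier))
        return self.users.get(user_id) if user_id is not None else None


directory = Directory()
directory.add_user(User(1, "Pat"))
directory.link("email", "pat@example.com", 1)
directory.link("facebook", "fb-12345", 1)

# Two different login methods resolve to the same user object.
assert directory.find_user("email", "pat@example.com") is directory.find_user("facebook", "fb-12345")
```

Adding another login method then means adding another credential row, with no change to the user record.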

That’s it today…more ranting later when…  Check out my quick quiz site and send me feedback… I just need one solid day of love and I think it’ll be done…

iPhone Objective-C bug?

April 16th, 2009

Funny bug from Xcode and Objective-C…


warning: initialization from distinct Objective-C type

For the following block of code:

        // Quiz *qz = [[Quiz alloc] initFromDict: qdata];
        Quiz *quiz = [Quiz alloc];
        [quiz initFromDict: qdata];

From what I’ve read, I ended up with the uncommented code as the operational code, since the warning implies that something bad will happen…  Before you ask, “initFromDict” returns a (Quiz *) object.
