One of those moments where an interview question turns into a research project, or is it really a bake off? The simple problem is demonstrate an algorithm to intersect two lists of numbers, fundamentally it’s a question about using modern interpreted languages and their associative array bits to make a simple intersection routine. However many languages support many different ways to do things. I’ve put together a test of Python vs. Java vs. Ruby vs. Perl vs. PHP and got a few interesting benchmarks.
Short Version of the results: java wins with python comming in second.
But, things are not so simple for instance there is a simple base approach (python example)
def isect(a, b) :
o = []
h = {}
for i in a :
h[i] = True
for i in b :
if h.get(i, False) :
o.append(i)
return o
or a more language unique version
def isect(a, b) :
h = dict(iter([(i, True) for i in a]))
return [i for i in b if h.get(i,False)]
or just using features of the language
def isect(a, b) :
return list(set(a) & set(b))
All of these return the same result, but clearly we can think of this as a progression of an alorigthm. Version 1 focusing on textbook to code, version 2 says there’s some cool language features and version 3 says, dude like don’t you really know what your doing?
The upshot was that I sat down and wrote 12 different versions of this just to see what the language differences were.
| Language | Version | Alg. Time | Run Time |
|---|---|---|---|
| java | 1 | 3.076 | 5.888 |
| perl | 1 | 3.622 | 6.691 |
| php | 1 | 3.901 | 22.878 |
| python | 1 | 1.740 | 4.526 |
| ruby | 1 | 3.517 | 9.853 |
| java | 2 | 0.817 | 4.362 |
| python | 2 | 3.550 | 6.441 |
| ruby | 2 | 7.984 | 14.281 |
| python | 3 | 1.500 | 4.356 |
| ruby | 3 | 3.809 | 10.184 |
| c++ | 1 | 0.830 | 1.209 |
| python | 4 | 1.040 | |
| php | 2 | 2.000 | |
| php | 3 | 10.064 | |
| java | 3 | 3992.045 |
Of course you’re probably wondering about language versions:
- java 1.6.0
- perl 5.8.8
- php 5.2.9
- python 2.5.4
- ruby 1.8.6
// Java Version # 1
private static int[] isect(int a[], int b[]) {
int l[] = new int[a.length];
TreeMap h = new TreeMap();
int idx = 0;
for (int i = 0; i < a.length; i++) {
h.put(new Integer(a[i]), 1);
}
for (int i = 0; i < b.length; i++) {
if (h.containsKey(new Integer(b[i])))
l[idx++] = b[i];
}
int o[] = new int[idx];
for (int i = 0; i < idx; i++) {
o[i] = l[i];
}
return o;
}
# Perl Version 1
sub isect {
my($a, $b) = @_;
my(@o, %h);
for my $i (@$a) {
$h{$i} = 1;
}
for my $i (@$b) {
push(@o, $i) if $h{$i};
}
return @o;
}
# PHP Version 1
function isect($a, $b) {
$h = array();
$o = array();
foreach ($a as $i) {
$h[$i] = true;
}
foreach ($b as $i) {
if ($h[$i]) {
array_push($o, $i);
}
}
return $o;
}
# Python Version 1
def isect(a, b) :
o = []
h = {}
for i in a :
h[i] = True
for i in b :
if h.get(i, False) :
o.append(i)
return o
# Ruby Version 1
def isect(a, b)
return a & b
end
// Java Version 2
private static Integer[] isect(Integer a[], Integer b[]) {
ArrayList l = new ArrayList (a.length);
HashSet h = new HashSet ();
for (int i = 0; i < a.length; i++) {
h.add(a[i]);
}
for (int i = 0; i < b.length; i++) {
if (h.contains(b[i]))
l.add(b[i]);
}
return (Integer[])l.toArray(new Integer[0]);
}
# Python Version 2
def isect(a, b) :
h = dict(iter([(i, True) for i in a]))
return [i for i in b if h.get(i,False)]
# Ruby Version 2
def isect(a, b)
# Convert array to Set objects and perform intersection
a = a.to_set
b = b.to_set
return a & b
end
# Python Version 3
def isect(a, b) :
return list(set(a)& set(b))
# Ruby Version 3
def isect(a, b)
o = Array.new
h = Hash.new
a.each do |i|
h[i] = 1
end
b.each do |i|
if h[i]
o.push(i)
end
end
return o
end
# PHP Version 2
function isect($a, $b) {
$b = array_flip($b);
$o = array();
foreach ($a as $i) {
if(isset($b[$i])) {
$o[] = $i;
}
}
return $o;
}
# PHP Version 3
function isect($a, $b) {
return array_intersect($a, $b);
}
// Java Version 3
private static Integer[] isect(Integer a[], Integer b[]) {
Set l = new HashSet (Arrays.asList(a));
l.retainAll(Arrays.asList(b));
return (Integer[])l.toArray(new Integer[0]);
}
# Python Version 4
def isect(a, b) :
h = set(a)
return [i for i in b if i in h]
// C++ Version 1
std::list* isect(int len, int a[], int b[]) {
__gnu_cxx::hash_set h = __gnu_cxx::hash_set ();
for (int i = 0; i < len; i++)
h.insert(a[i]);
std::list< int> *l = new std::list< int>();
for (int i = 0; i < len; i++)
if (h.find(b[i]) != h.end())
l->push_back(b[i]);
return l;
}
Would enjoy peoples thoughts and feedback, or alternate implementations for these and other languages