Skis / Toys / Fun

Appeal to greatness not guilt

What’s an expert?

August 29th, 2008

Found this on skidiva.com, in the thread “Advanced or Expert - Not Sure”. It’s worth considering for any pursuit, not just skiing.

skigrl27 wrote:

Tiger Woods is an expert golfer, but he is challenged every time he plays, right? You will always be challenged on the slopes - heck, that’s kinda why we’re out there, right? There will always be harder trails, bigger cliffs, sicker lines, etc.

I think if you willingly and frequently ski double-blacks and “expert only” terrain - and do so respectfully (ie: not tumbling down the whole thing) then you are considered an expert skier. If you kind of warily and timidly ski double-blacks - and only do so occasionally, then you are more likely advanced.

But then that starts debate about double blacks in Colorado vs. them in VT or Canada…or in Europe. ugh…who cares, just ski and be merry!

altagirl replied:

I absolutely agree with that.

There is NO SUCH THING as an athlete so good they can’t improve. Even the pros are always pushing themselves - you never get to say “well, I passed the test, now I know it all and I’m done learning and trying to improve.”

I know this topic comes up on ski boards every now and then and it gets on my nerves, to be honest. I think we can all agree that no one wants to be that person walking around bragging about how they’re an “expert skier” - it’s meaningless and sounds stupid. It’s like some guy in the bar who says they can ski “everything” at a big resort like Alta or Snowbird. Pfft - yeah right. There are plenty of lines with massive drops and crazy exposure that no one has been crazy enough to ski yet. Skiing the marked “trails” is only the beginning. So there’s no real reason to classify yourself for that purpose as far as I can tell. Whatever you’re doing is death defying in one person’s eyes and easy and boring to someone else.

So it comes down to two things: Fulfilling a personal goal - in which case, who cares if we all agree on the same definition? Define your goal however you like and work towards it.

Or being able to classify yourself properly for a class, group, or trail choice. And to be honest - those things all vary so widely there’s nothing even close to being a universal definition. Because they are all relative to the other people in the group or the other trails on the mountain. There is no universal definition for what a black or double black run is or how hard it is - and the same run can be easy in some conditions and impossible in others. Even when you try to say - well, I like steeps - even if you have your inclinometer out and can say - I skied a 50 degree slope - that’s great. But that could be very easy because it was short or had a nice open run out at the bottom or very difficult because it was tight/rocky/exposed.

So don’t stress over it one way or the other. It’s not a label on your forehead. And don’t undersell yourself when you’re taking lessons. If your friends are telling you you’re better than you think you are - listen to them when it comes to describing your skiing for an instructor or a guide. If you have to - say “I have a hard time saying it because I feel like I have so much room for improvement, but my friends all tell me that I’m an expert skier.” NO ONE who is trying to divide people into groups is expecting the definition of “expert” skiers to be perfection or anywhere near it. And Good Lord - if you walked into a ski shop and told the employees you’re anything other than an expert, they assume you need beginner gear. Because they’re without a doubt assuming that if you can make it down a black run alive, skis on or not, you’re calling yourself an expert.


Skiing in Chile

August 28th, 2008

Ok, spent a week in Chile skiing at Portillo, totally cool!  But, like anything, I’m both totally psyched and totally frustrated…  Here’s some video from the last day…  My left-hand turns suck (ok, it’s relative…)


Greylisting spam…

August 7th, 2008

I think I’ve now spent more time in my life than I ever want to think about fighting spam…  I guess that comes from spending 20 years with some involvement in SMTP mail systems.  Of course, the stint writing the core infrastructure for MailFrontier’s anti-spam engine didn’t hurt my knowledge base.

My current pain is that I run a server that receives lots of spam, even though there are only three users on the whole machine.  At last check I was getting over 500 pieces of spam a day.  For years I’ve been reasonably happy with a well-trained spamassassin setup, but recently that’s started failing.  One of the anti-spam technologies that I’ve not deployed is greylisting…  So, here are the notes from my days spent getting a good greylisting system in place.

Initial round:

postgrey - solid perl package, fairly basic in features; it did cut down the flow from about 100 missed messages to 40.  Very nice, but in the course of watching it for a few days I noticed that a few classes of senders were making it through.  Not its fault, just a limitation of greylisting.  Note: Most of these were large clusters of hosts that are hosted at The Planet (an ISP known for spam).

greyboa - my quick homebrew system based on postgrey (don’t ask, it was also a project to learn python asyncore/asynchat).  Worked just about as well as postgrey.  Moved it to be based on SQLite, since in one of the nightly postgrey runs the DB had some corruption, which of course crashed the server and then I wasn’t getting any mail… sigh…

sqlgrey - The current experiment.  Not that I don’t like my code, but I’m going out of town for a few weeks and don’t want things to die while I’m not watching.  Nice things: better logging than postgrey/greyboa, and it does some good sender (envelope level) matching to notice things like ‘+’ addressing and Y!Groups unique sender patterns, which should reduce the retry behavior a bit.
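To make the envelope-level sender matching a bit more concrete, here is a rough sketch of the idea in Python. It is not sqlgrey’s actual rule set: the function name normalize_sender and the “replace digit runs” rule are my own assumptions, chosen only to show how per-message sender variations can be collapsed into one greylist key.

```python
import re

def normalize_sender(address):
    """Collapse per-message variations in an envelope sender (illustrative only)."""
    local, _, domain = address.lower().rpartition("@")
    # Drop '+' extensions: user+tag@example.com -> user@example.com
    local = local.split("+", 1)[0]
    # Assumption: bulk senders often vary a numeric token per message,
    # so replace every run of digits with a placeholder.
    local = re.sub(r"\d+", "#", local)
    return "%s@%s" % (local, domain)

plus_example = normalize_sender("Bob+lists@Example.com")
bounce_example = normalize_sender("bounce-12345-67@lists.example.org")
```

With a key like this, a retry that arrives with a slightly different sender address still matches the earlier attempt, so the sending host does not have to start its delay over.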

Here’s my random comment: I’m convinced that while spam is a problem, there has to be a way to solve it…  No, it’s impossible to have no unsolicited messages at all, but there should be much better ways to take advantage of the fact that spammers must operate on a shotgun approach to sending messages.  That shotgun leaves a lot of scatter…

Fundamentally messages fall in a few buckets:

  • Immediate — communication between well known parties, with a long history
  • Unknown  — messages between people who don’t have an established relationship, or little history
  • Junk         — clearly “low priority” messages.

We should be able (greylisting is a good example) to delay “unknown” messages for “long” periods of time until enough history has built up to re-bucket them into one of the other classes.  That’s the basics of what I’m doing with greylisting: delay a message until RBL/Razor/Pyzor has a chance to build a little history before I deliver it to my mailbox…
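The delay mechanism itself is small enough to sketch. The following is a toy in-memory version, not the code from postgrey, greyboa or sqlgrey: the class name Greylist, the 300 second delay and the injectable clock are all assumptions made for illustration. It keys on the usual greylisting triplet of client IP, envelope sender and recipient.

```python
import time

class Greylist:
    """Toy in-memory greylist keyed on (client_ip, sender, recipient)."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay   # seconds a new triplet must wait (assumed value)
        self.clock = clock           # injectable clock, handy for testing
        self.first_seen = {}         # triplet -> timestamp of first attempt

    def check(self, client_ip, sender, recipient):
        """Return 'defer' (temporary failure, retry later) or 'accept'."""
        key = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        # Remember the first attempt; later attempts reuse that timestamp.
        first = self.first_seen.setdefault(key, now)
        if now - first < self.min_delay:
            return "defer"
        return "accept"

# Usage with a fake clock so the example does not have to sleep.
fake_now = [0.0]
greylist = Greylist(min_delay=300, clock=lambda: fake_now[0])
first_try = greylist.check("192.0.2.10", "news@example.org", "me@example.net")
fake_now[0] = 600.0
second_try = greylist.check("192.0.2.10", "news@example.org", "me@example.net")
```

In a real mail server the “defer” answer would be turned into an SMTP temporary failure, and the waiting period is what gives the RBL/Razor/Pyzor checks time to catch up before the retry is accepted.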


Python import improvements

August 4th, 2008

[originally posted to comp.lang.python -- but worth repeating]

Ruby has been getting pummeled for the last year or more on the performance subject, and they’ve been working hard at improving it. From my armchair perspective Python is sitting on its laurels and not taking this as seriously as it probably should. It’s possible to make many comments that swirl around religion and approach, but one of the things I’ve noticed is that while Python has a much more disciplined developer community, they’re not using this for the greater good.

Specifically, in a benchmark that was posted, python was an order of magnitude (20 secs vs. 1 sec) slower than ruby. On investigation, the performance bottleneck is the random module. This got me thinking: why does python have both “pickle” and “cPickle”? It comes down to the lowest common denominator, or the inability of somebody to write an optimized package that can transparently stand in for a python package.

To that end, why should somebody have to write big try/except blocks to see if modules exist and, if they do, alias their names? Wouldn’t it be better if, when I have an “interface compatible” native (aka C) module with better performance, there were a way for python to give it preference?

e.g.

  import random(version=1.2, lang=c)

or

  import random(version=1.2, lang=py)   # use the python version by default

or

  import random     #  use the latest version in the "fastest" code (C given preference)

where there could be a nice set of “standard” key/value pairs that provide additional hints as to which language and version of a library is to be used.
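For comparison, here is roughly what the try/except dance looks like when wrapped in a helper today. This is only a sketch: import_preferred is a hypothetical name, not part of Python, and it only approximates the “give the faster module preference” idea by trying candidates in order.

```python
import importlib

def import_preferred(*candidates):
    """Return the first importable module from candidates (hypothetical helper)."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of %r could be imported" % (candidates,))

# Prefer the C-accelerated module when it exists, else fall back to pure Python.
pickle = import_preferred("cPickle", "pickle")
data = pickle.loads(pickle.dumps({"answer": 42}))
```

The caller still has to know the names of both implementations, which is exactly the bookkeeping the proposed import hints would take over.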


Notes from the comp.lang.python discussion

People fell into a few camps:

  • There are many “optimization” projects, like PyPy or ShedSkin (compile Python to C/C++ code).  In essence they JIT the code, though not as passively as a true JIT.
  • “most bottlenecks are from IO (disk, network) or interaction” so worrying about library performance is not worthwhile.
  • Why worry about it, what we’ve got is good enough.

What amazes me is that time and time again, people don’t realize who their customers are.  Languages like PHP realize their customers are hack VB programmers looking to build web pages.  So, having solid libraries and other things that can be totally abused (used) is a good use of “higher order design”.

In general, systems should support progressive improvement.  Somehow I fundamentally believe the python community doesn’t grasp that, and believes it is building for itself, not tools that enable a different class of engineer.


Python vs. Ruby Performance

July 29th, 2008

Not sure why I’m spending the time on this problem, but it looked interesting.  For starters, I read a Hacker News article that mentioned Python vs. Ruby performance, which in turn linked to a Polish blog post.

The core of the blog post was this:

20 threads * 100,000 iterations

  1. Ruby 1.9 = 1.54 s.
  2. Ruby Enterprise = 3.01 s.
  3. JRuby 1.1.2 = 5.82 s.
  4. Jython 2.2.1 = 11.86 s.
  5. Python 2.5.2 = 12.32 s.
  6. Ruby 1.8.7 = 22.68 s.

Which is totally amazing from a performance improvement standpoint, but on digging further, the original code is:

from time import time
from random import Random
from threading import Thread
rand = Random().randint # alias
 
class Test(Thread):
   def __init__ (self):
      Thread.__init__(self)
      print "Starting %s" % self.getName()
   def run(self):
      a = [rand(0,SIZE) for x in xrange(SIZE)]
      a.sort()
      print "%s finished" % self.getName()
 
print "Start"
start = time()
THREADS, SIZE = 20, 100000
threads = []
for i in xrange(THREADS):
    t = Test()
    threads.append(t)
    t.start()
while True in [t.isAlive() for t in threads]:
    pass
print "Time: %s s" % (time() - start)

The first observation is that (unlike the ruby version) the python version has the overhead of a busy wait on the threads, so with that tiny fix (which reduced runtime by 1 second):

for t in threads :
    if t.isAlive() :
        t.join()

Time: 17.6256890297 s

Doing a quick decomposition of this, we really have a program that’s doing the following

from time import time
from random import Random
rand = Random().randint # alias
 
THREADS, SIZE = 20, 100000
 
print "Start"
start = time()
 
for t in xrange(THREADS) :
      a = [rand(0,SIZE) for x in xrange(SIZE)]
      a.sort()
 
print "Time: %s s" % (time() - start)

Time: 14.3786399364 s

Not getting into numbers, but this executes in almost the same time as the threaded version… Hmm, so is the ruby version really all about “Threading Performance”? Can’t be, has to be either in the random or the loop… Let’s look further.

from time import time
from random import Random
rand = Random().randint # alias
 
THREADS, SIZE = 20, 100000
 
print "Start"
start = time()
 
for t in xrange(THREADS) :
      for x in xrange(SIZE) :
          rand(0,SIZE) 
 
print "Time: %s s" % (time() - start)

Time: 10.9540541172 s

There we have it: rand() is taking roughly 75% of the total time (10.95 of 14.38 seconds).  Building and sorting the list accounts for the remaining 25% or so (~3.4 seconds), but there’s not much that can be done to improve that part.
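The same decomposition can be repeated on a current interpreter. The sketch below is a Python 3 port of the two loops above, not the original benchmark: the helper names timed and measure are my own, and the sizes are scaled down so it finishes quickly (pass rounds=20, size=100000 to mirror the original run).

```python
import random
import time

def timed(fn):
    """Run fn once and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

def measure(rounds=20, size=100000):
    """Time the full loop (randint + list build + sort) and the randint calls alone."""
    rand = random.Random().randint  # same alias as in the original code

    def full():
        for _round in range(rounds):
            a = [rand(0, size) for _i in range(size)]
            a.sort()

    def rand_only():
        for _round in range(rounds):
            for _i in range(size):
                rand(0, size)

    return timed(full), timed(rand_only)

t_full, t_rand = measure(rounds=2, size=20000)
print("full: %.3f s, randint only: %.3f s, share: %.0f%%"
      % (t_full, t_rand, 100.0 * t_rand / t_full))
```

The exact share will vary by machine and interpreter version, so treat the printed percentage as a rough indication only.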

Conclusion: Ruby might be faster, but to mix up a bunch of performance stats with threading is going to be problematic.

Update: On digging deeper, python’s randint() is implemented in python (in random.py), so of course it’s slow…  It’s competing against a C version.
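One way to get a feel for that overhead on a current CPython is to time randint() against a hand-rolled equivalent built on random(), which is a C-implemented method there, while randint() goes through Python-level code in random.py. This is a sketch assuming Python 3; the variable names are mine and no particular ratio is implied.

```python
import random
import timeit

N = 100000
rng = random.Random()

# randint() is a Python-level wrapper in random.py.
t_randint = timeit.timeit(lambda: rng.randint(0, N), number=N)

# random() is the C-level call; scale it to an integer in 0..N by hand.
t_random = timeit.timeit(lambda: int(rng.random() * (N + 1)), number=N)

print("randint: %.3f s, int(random() * (N + 1)): %.3f s" % (t_randint, t_random))
```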


Google Knol is not Wikipedia

July 25th, 2008

After giving the hype a few days to cool down (not the hype around Dark Knight, but around Google Knol), I think I understand the business, the opportunity, and the shortcomings.

The press keeps equating Knol with Wikipedia, which I personally think is totally inaccurate: it’s really all about crowdsourcing about.com.  Let’s do a quick comparison:

                        about.com      wikipedia    google knol
user contributed        only editors   yes          yes
personal profit         yes            no           yes
easy to create          no             wiki-cabal   yes
any subject             yes            no           yes
SEO easy                yes            yes          yes
multiple similar docs   yes            no           yes
authority               yes            maybe        no

The biggest win of knol is the ability for people to create any arbitrary page about any subject, which you can’t really do in wikipedia (this isn’t a valid wikipedia page: Learning to Ski, since it’s not really about a thing).  While that’s a rich topic area for about.com or for personal blogs, the addition of knol makes it something where anybody can create a meaningful document.  The downside is that if _you_ disagree with the Learning to Ski page, you have two choices:

  • Contribute your edits back
  • Create your own page

Now this is where the fun begins.  By offering no personal gain, Wikipedia promotes a model that fosters a community of people who create and improve documents.  Google Knol, by contrast, creates a disincentive to improve existing documents: while it’s possible to set up a document as a collaborative entity, if there’s ad revenue associated with a document, an above-average user would be better served by creating a new document and fracturing the market to gain in $$.

The biggest problem is that right now google knol is being dominated by “doctors” creating pages on medical conditions for personal gain.  It’s still unclear whether, as this product matures, it’ll be a collaborative environment or a morass of pages and disconnected content.  Sure, there are some star ratings to indicate the credibility of the document or author, but will this fundamentally influence the SEO bits on the page?
