Golang Impala Client

Yesterday’s post, where I worked out what it takes to build Thrift interfaces for Cloudera Impala, got a big improvement today. I combined my work with the hivething project that Derek Greentree wrote. The result is, of course, called impalathing and lives over on GitHub; it really does clean up the API.

The big thing is that, since this uses the ImpalaServer Thrift API behind the scenes, everything is marshaled back from the server as TAB-delimited strings. No helping that, but it gets a working system up and running pretty quickly. Now off to pull this into a real application to test some service architecture and real performance.
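Since every row comes back as one TAB-delimited string, the caller has to split it back into columns. This is not the impalathing API, just a minimal sketch of that post-processing step; the function names splitRow and rowToMap and the column names are made up for illustration, and it assumes the values themselves never contain a TAB.

```go
package main

import (
	"fmt"
	"strings"
)

// splitRow splits one TAB-delimited row into its column values.
// Assumption: the values never contain a TAB character themselves.
func splitRow(row string) []string {
	return strings.Split(row, "\t")
}

// rowToMap pairs hypothetical column names with the values of one row.
// It returns an error when the number of values does not match.
func rowToMap(columns []string, row string) (map[string]string, error) {
	fields := splitRow(row)
	if len(fields) != len(columns) {
		return nil, fmt.Errorf("expected %d columns, got %d", len(columns), len(fields))
	}
	record := make(map[string]string, len(columns))
	for i, name := range columns {
		record[name] = fields[i]
	}
	return record, nil
}

func main() {
	// Illustrative data only; a real row would come from the server.
	record, err := rowToMap([]string{"id", "name"}, "42\talice")
	if err != nil {
		fmt.Println("bad row:", err)
		return
	}
	fmt.Println(record["id"], record["name"])
}
```

Everything stays a string here; converting to ints or timestamps would be a separate step once the column types are known.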

Here is the code to make calls – which is much simpler than everything else. It’s been good to look at some API patterns to make it this simple.

 

 

Golang and Hive/Impala – Thrift

This started out as a quick project to see about taking a component of our service and migrating it from Python to Go. We’ve been talking about migrating services from the semi-monolithic version to something more loosely coupled; the general idea is to move to Thrift-oriented services. A core component of our system uses Impala as a key backend, and it’s a very stable service that could logically be broken out.

As an exercise for the weekend, I started looking at Go and whether that component could be migrated to it. I can’t share the full code or details, but as I was putting this together I found that the resources for Go and Thrift are weak at best.

The first thing is that you need to get your “thrift build”, aka the Makefile, right.

It turns out that package_prefix is a big deal. Since the thrift build generates a collection of interfaces (not just one), we need a way to make sure that they’re all available and namespaced.

Also worth noting: HiveThing is a useful reference, though it uses HiveServer2, not Beeswax.

The example code I ended up writing is here – hopefully this helps somebody in the future figure things out just a bit faster.

 

 

NPS for Support Feedback

Quick observation:

Every support email thread seems to end with a request to rank it NPS-style. NPS is based on the key question “How likely is it you would recommend us to a friend?” However, when dealing with support it’s never a “recommend”. Could the question be one of these:

  • How satisfied were you with the handling of your issue?
  • How likely is it that support would be a reason for you to keep using our service?

Just thinking that I want to rate a current interaction with a company as a “5” …

ioloop as a core concept

In the beginning there was main() and that was good. But under the surface that has changed. It’s still main(), but what really happens is dynamic linking, exit handling, resources… We’re even throwing garbage collection in for good luck. But it’s still main().

Why?

If you do anything that isn’t linear, top-to-bottom programming, you see that ioloop() is really main, and you have boilerplate to set up everything for your call to ioloop(). So now main() isn’t really important; we’re boilerplating main() into a bunch of setup work for ioloop()… But, alas, the languages don’t support this. Sometimes you can do things like

Why should I have to do all that? If the language assumed all IO operations needed to yield to ioloop(), the syntax would support that model. You wouldn’t need to worry about what was sync and async, and you wouldn’t need to wrap and decorate. It would just be good.

Imagine: open(), read(), write(), socket(), connect() all assumed they were async and would yield to the ioloop by default…

Now to write this language… Maybe I’ll call it “eve”, but the $64k question is whether it’s compiled or interpreted…

Update: This really is the “async” that is proposed for Python 3.5 – hope it makes it.

Reproducible installation

The learning curve on Chef is a lot steeper than I’d like.

Can’t say that I’ve got it running the way I’d like.

But, I can now provision a devbox on AWS in 5 minutes with basic packages and users in a sane state.

It took 2 days of fiddling.

Now back to developing systems on top of basic environments.

Things left to learn –

  • How to get private keys onto the machine so git integration is smooth
  • How to check out private repos with Chef onto the boxes
  • The big challenge is how to “auto” configure an environment with AWS Elastic IPs or other tidbits.

 

Twitter is the Forest

Was reminded this morning.

I check my Facebook page for updates, but I never check my Twitter page. I’ll watch the Twitter feed during the day.

Thinking it’s like a tree in the woods: if it falls and nobody’s watching, do you care? If it’s important, it’s on Facebook.