Golang Impala Client

Yesterday’s post, where I figured out what it took to build Thrift interfaces to attach to Cloudera Impala, got a big improvement today. I combined my work with the hivething project that Derek Greentree wrote. The result is, of course, called impalathing and lives over on GitHub, and it really does clean up the API.

The big caveat is that, since this uses the ImpalaServer Thrift API behind the scenes, everything is marshaled back from the server as tab-delimited strings. There’s no helping that, but it gets a working system up and running pretty quickly. Now off to pull this into a real application to test some service architecture and real performance.
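To give a feel for what tab-delimited rows mean in practice, here is a minimal sketch of turning one such row into typed Go values. It is not how impalathing does it internally; the `engagement` struct and `parseEngagement` function are hypothetical names I made up for illustration, and the three columns are assumed to match the query used in this post.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// engagement is a hypothetical typed row with the three columns
// (user_id, action, yyyymm) used as the example in this post.
type engagement struct {
	UserID string
	Action string
	YYYYMM int
}

// parseEngagement splits one tab-delimited row and converts each field
// to its Go type. It returns an error on a wrong field count or a
// non-numeric yyyymm value.
func parseEngagement(line string) (engagement, error) {
	fields := strings.Split(line, "\t")
	if len(fields) != 3 {
		return engagement{}, fmt.Errorf("expected 3 fields, got %d", len(fields))
	}
	yyyymm, err := strconv.Atoi(fields[2])
	if err != nil {
		return engagement{}, fmt.Errorf("bad yyyymm %q: %w", fields[2], err)
	}
	return engagement{UserID: fields[0], Action: fields[1], YYYYMM: yyyymm}, nil
}

func main() {
	// A made-up row, shaped like what would come back over the wire.
	row, err := parseEngagement("u123\tclick\t201401")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%+v\n", row)
}
```

The point is only that every column arrives as text, so anything numeric has to go through a conversion step that can fail.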

Here is the code to make calls, which is much simpler than yesterday’s version. It’s been good to look at some API patterns to get it this simple.

package main

import (
    "fmt"
    "log"
    "time"

    "github.com/koblas/impalathing"
)

func main() {
    host := "impala-host"
    port := 21000

    con, err := impalathing.Connect(host, port, impalathing.DefaultOptions)
    if err != nil {
        // log.Fatalf exits the program, so no return is needed.
        log.Fatalf("Error connecting: %v", err)
    }

    query, err := con.Query("SELECT user_id, action, yyyymm FROM engagements LIMIT 10000")
    if err != nil {
        log.Fatalf("Error running query: %v", err)
    }

    startTime := time.Now()
    total := 0
    for query.Next() {
        var (
            userID string
            action string
            yyyymm int
        )

        // Scan fills the variables from the current row, in SELECT column order.
        query.Scan(&userID, &action, &yyyymm)
        total++

        fmt.Println(userID, action)
    }

    log.Printf("Fetched %d row(s) in %.2fs", total, time.Since(startTime).Seconds())
}