Golang Impala Client

Yesterday’s post where I figured out what it took to build thrift interfaces to attach to Cloudera Impala got a big improvement today. I combined my work with the hivething project that Derek Greentree wrote. It’s of course called impalathing over on github, it really does clean up the API.

The big thing is that since this behind the scenes uses the ImpalaServer Thrift API everything is marshaled back from the server as TAB delimitated strings.  No helping that, but it gets a working system up and running pretty quick.  Now off to pull this into a real application to test some service architecture and real performance.

Here is the code to make calls – which is much simpler than everything else. It’s been good to look at some API patterns to make it this simple.

package main

import (
    "log"
    "fmt"
    "time"
    "github.com/koblas/impalathing"
)

func main() {
    host := "impala-host"
    port := 21000

    con, err := impalathing.Connect(host, port, impalathing.DefaultOptions)

    if err != nil {
        log.Fatal("Error connecting", err)
        return
    }

    query, err := con.Query("SELECT user_id, action, yyyymm FROM engagements LIMIT 10000")

    startTime := time.Now()
    total := 0
    for query.Next() {
        var (
            user_id     string
            action      string
            yyyymm      int
        )

        query.Scan(&user_id, &action, &yyyymm)
        total += 1

        fmt.Println(user_id, action)
    }

    log.Printf("Fetch %d rows(s) in %.2fs", total, time.Duration(time.Since(startTime)).Seconds())
}