Counting Digits

One of blogs that I enjoy reading is The Daily WTF. Recently there was this article that highlighted some weird Geoinformatics WTF. What was interesting was that the comments on this post revolved around the performance of counting digits. Should somebody use Math.Log10 or do some basic counting.

1var digits = 0;
2while (num >= 1) {
3  digits++;
4  num /= 10;
5}

I had to wonder, how many digits before the two approaches really differ. Since benchmarking in Go is so much easier the quick test showed that they cross around 10 digits in length.

 1BenchmarkMatrix / Log2 - 10                        225116233       5.453 ns / op
 2BenchmarkMatrix / Cnt2 - 10                       1000000000       0.993 ns / op
 3BenchmarkMatrix / Log2000 - 10                     231034142       5.212 ns / op
 4BenchmarkMatrix / Cnt2000 - 10                     462714898       2.549 ns / op
 5BenchmarkMatrix / Log2000000 - 10                  231727612       5.189 ns / op
 6BenchmarkMatrix / Cnt2000000 - 10                  333076740       3.620 ns / op
 7BenchmarkMatrix / Log2000000000 - 10               227900226       5.160 ns / op
 8BenchmarkMatrix / Cnt2000000000 - 10               217574472       5.470 ns / op
 9BenchmarkMatrix / Log2000000000000 - 10            227399866       5.184 ns / op
10BenchmarkMatrix / Cnt2000000000000 - 10            158506080       7.632 ns / op
11BenchmarkMatrix / Log2000000000000000 - 10         232024206       5.186 ns / op
12BenchmarkMatrix / Cnt2000000000000000 - 10         100000000      10.06  ns / op
13BenchmarkMatrix / Log2000000000000000000 - 10      232336196       5.175 ns / op
14BenchmarkMatrix / Cnt2000000000000000000 - 10       94450698      12.48  ns / op

Sidebar about using floating point for counting digits, this is much harder than it looks. For example in go (floating point) you end up with math.Log10(1_000_000_000_000_010) = 15.000000000000005 and math.Log10(1_000_000_000_000_000) = 14.999999999999998. In JavaScript you get something else Math.log10(1_000_000_000_000_000) = 15 and Math.log10( 999_999_999_999_999) = 15. So in general using floating point Log10 is error prone.

The source code for reference

digits.go

 1package digits
 2
 3import (
 4	"math"
 5)
 6
 7func Log(num int) int {
 8	return int(math.Log10(float64(num)) + 1)
 9}
10
11func Count(num int) int {
12	digits := 0
13
14	for ; num >= 1; num /= 10 {
15		digits++
16	}
17
18	return digits
19}

digits_test.go

 1package digits
 2
 3import (
 4	"strconv"
 5	"testing"
 6)
 7
 8func IntPow(base, exp int) int {
 9	result := 1
10	for {
11		if exp&1 == 1 {
12			result *= base
13		}
14		exp >>= 1
15		if exp == 0 {
16			break
17		}
18		base *= base
19	}
20
21	return result
22}
23
24var testValues = []int{
25	2,
26	2_000,
27	2_000_000,
28	2_000_000_000,
29	2_000_000_000_000,
30	// 1_000_000_000_000_000, // rounding error Math.Log10 returns 14.999999999999998
31	2_000_000_000_000_000,
32	2_000_000_000_000_000_000,
33}
34
35func TestBase(t *testing.T) {
36	for _, val := range testValues {
37		numL := Log(val)
38		numC := Count(val)
39		if numL != numC {
40			t.Errorf("Log(%d)[%d] != Count(%d)[%d]", val, numL, val, numC)
41		}
42	}
43}
44
45func BenchmarkMatrix(b *testing.B) {
46	for _, val := range testValues {
47		b.Run("Log"+strconv.Itoa(val), func(b *testing.B) {
48			for n := 0; n < b.N; n++ {
49				Log(val)
50			}
51		})
52		b.Run("Count"+strconv.Itoa(val), func(b *testing.B) {
53			for n := 0; n < b.N; n++ {
54				Count(val)
55			}
56		})
57	}
58}