Advanced Go Fuzzing Techniques

April 28, 2022 ~ 10 min read
Tutorial


Fuzz testing is a powerful tool that finds bugs and vulnerabilities by running your code with millions of procedurally generated inputs. Basic fuzz tests can easily find crashes, hangs, and unexpected behavior, but with some tweaks they can uncover a whole host of other critical bugs.

This post will introduce some advanced fuzz testing strategies that are applicable to projects of any size and complexity. If you find yourself stuck, or just want to chat fuzzing, we’ve got a Fuzzbuzz Discord just for that!

Note: This is the second in a series of posts about Go’s new fuzz testing tools. If you’re new to fuzzing with Go, we recommend reading our Go Fuzzing Basics post first.

Fuzzing with Assertions

Beego recently published a patch for a high-severity vulnerability in their router that caused certain URL paths to be matched to the incorrect handlers. Specifically, file extensions within the route were ignored (x.html/y.html would be matched to x/y.html), potentially allowing bad actors to gain unauthorized access to sensitive resources.

Let’s take a look at how this vulnerability could’ve been caught with fuzzing, by adding checks that validate the underlying logic of the routing code.

All of the fuzz tests in this post are in github.com/fuzzbuzz/go-fuzzing-tutorial. To get started, clone it and enter the 02-advanced-techniques directory.

git clone https://github.com/fuzzbuzz/go-fuzzing-tutorial.git
cd go-fuzzing-tutorial/02-advanced-techniques

All commands will be run from within this folder.

It’s important to remember that fuzzing is great at taking the question “what property should hold true for all inputs to my code?”, and finding counterexamples developers may have missed.

When it comes to route matching, there are a couple of properties, or invariants, that must hold true, one of which is: if a pattern x matches a route y, then y should be a prefix of x. This can be codified with the following fuzz test (it’s not as scary as it looks):

// 02-advanced-techniques/beego_test.go
package advancedtechniques


import (
	"regexp"
	"strings"
	"testing"

	"github.com/beego/beego/v2/server/web"
	context "github.com/beego/beego/v2/server/web/context"
)

func FuzzMatch(f *testing.F) {
	// Create some example routes for our rest API
	route1 := "/prefix/abc.html"
	route2 := "/1/2/3/hi.json"
	route3 := "/hel/lo/wo/rl/d"

	// Add them as valid routes to the router
	tree := web.NewTree()
	tree.AddRouter(route1, route1)
	tree.AddRouter(route2, route2)
	tree.AddRouter(route3, route3)

	// Seed the fuzzer with the routes we just created
	f.Add(route1)
	f.Add(route2)
	f.Add(route3)

	slashRE := regexp.MustCompile("//+")
	f.Fuzz(func(t *testing.T, pattern string) {
		// Filter repeated slashes, Beego handles this for us
		// i.e. /prefix//abc.html -> /prefix/abc.html
		pattern = slashRE.ReplaceAllString(pattern, "/")

		// Try to match the fuzzed pattern to one of our defined API routes
		obj := tree.Match(pattern, context.NewContext())

		// Check if the pattern matches the example route
		if obj != nil {
			if matchedRoute := obj.(string); matchedRoute != "" {
				// Make sure the pattern and example route share the same prefix
				if !strings.HasPrefix(pattern, matchedRoute) {
					t.Fatal("Found match with incorrect prefix", pattern, matchedRoute)
				}
			}
		}
	})
}

Running this fuzz test turns up an input that fails the match check after about 30 seconds of testing on my machine (or about 800k total tests):

go test -v -run FuzzMatch -fuzz FuzzMatch .

=== FUZZ  FuzzMatch
fuzz: elapsed: 0s, gathering baseline coverage: 0/25 completed
fuzz: elapsed: 0s, gathering baseline coverage: 25/25 completed, now fuzzing with 8 workers
...
fuzz: elapsed: 36s, execs: 795998 (21746/sec), new interesting: 21 (total: 46)
fuzz: minimizing 49-byte failing input file
fuzz: elapsed: 37s, minimizing
--- FAIL: FuzzMatch (37.15s)
    --- FAIL: FuzzMatch (0.00s)
        match_test.go:36: Found match with incorrect prefix /prefix.html/abc.html /prefix/abc.html
    
    Failing input written to testdata/fuzz/FuzzMatch/1db4abe997139201a4c3c8914996af7dfa8a6898a52f56d2aa7b4f341947bb9b
    To re-run:
    go test -run=FuzzMatch/1db4abe997139201a4c3c8914996af7dfa8a6898a52f56d2aa7b4f341947bb9b
FAIL
FAIL
exit status 1
FAIL	github.com/fuzzbuzz/go-fuzzing-tutorial/02-advanced-techniques	142.200s

If we open the input in testdata/fuzz/FuzzMatch/1db4abe997139201a4c3c8914996af7dfa8a6898a52f56d2aa7b4f341947bb9b, we see:

go test fuzz v1
string("/prefix.html/abc.html")

The route matcher thinks that /prefix.html/abc.html matches /prefix/abc.html, which clearly shouldn’t be the case, since a malicious actor could circumvent access controls for /prefix/abc.html’s handler. As such, once found, it was marked as a critical vulnerability and assigned CVE-19381. You can see the patch for this CVE here.

Sure enough, if we change our beego version to one after the fix, we can see that the fuzz test we wrote no longer finds the bug:

go get github.com/beego/beego/v2@<patched-version>
go test -v -run FuzzMatch .
=== RUN   FuzzMatch
=== RUN   FuzzMatch/seed#0
=== RUN   FuzzMatch/1db4abe997139201a4c3c8914996af7dfa8a6898a52f56d2aa7b4f341947bb9b
=== RUN   FuzzMatch/8433cf0c82fc839c8524cd4c35ee6f7f7368be31ab565a80dd50b63b8690298e
--- PASS: FuzzMatch (0.00s)
    --- PASS: FuzzMatch/seed#0 (0.00s)
    --- PASS: FuzzMatch/1db4abe997139201a4c3c8914996af7dfa8a6898a52f56d2aa7b4f341947bb9b (0.00s)
PASS
ok  	github.com/beego/beego/v2/server/web	0.004s
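The FuzzMatch/1db4… entry in the run above is the saved failing input being replayed: plain go test re-runs every file under testdata/fuzz/FuzzMatch/ as a regression case. As a rough sketch of how such a file is laid out, the helper below rebuilds the entry we opened earlier from a plain string. The name corpusEntry is our own (hypothetical), and it only covers a fuzz target with a single string argument; the header line and the string(...) encoding are copied from the file go test wrote.

```go
package main

import "strconv"

// corpusEntry renders one string argument in the "go test fuzz v1" layout
// seen in the failing-input file above. Hypothetical helper for illustration;
// it handles only fuzz targets that take a single string.
func corpusEntry(input string) string {
	// strconv.Quote produces a double-quoted Go string literal, which is the
	// form the value takes inside string(...).
	return "go test fuzz v1\nstring(" + strconv.Quote(input) + ")\n"
}
```

Writing the returned text to a new file under testdata/fuzz/FuzzMatch/ is one way to pin a known-bad input by hand; calling f.Add with the same string inside the fuzz test is another.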

What’s important to note is that the fuzz test we wrote checks if the logic of the underlying code holds true, rather than just stuffing random strings into a function and hoping to find crashes.

Thinking about the properties that must hold true for all inputs makes fuzz testing more powerful than manually written tests when it comes to finding logic bugs, as you can cover much more of your program’s state space.
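If you want to try the same idea without pulling in Beego, here is a minimal sketch that uses only the standard library. It assumes three properties of path.Clean that we chose for illustration: the result is never empty, it never contains a repeated slash, and cleaning an already clean path changes nothing. The names checkCleanInvariants and FuzzCleanInvariants are ours, not part of the tutorial repository.

```go
package main

import (
	"fmt"
	"path"
	"strings"
	"testing"
)

// checkCleanInvariants returns an error if path.Clean breaks one of the
// properties we expect to hold for every possible input.
func checkCleanInvariants(p string) error {
	cleaned := path.Clean(p)

	// Invariant 1: the result is never empty (Clean returns "." instead).
	if cleaned == "" {
		return fmt.Errorf("Clean(%q) returned an empty string", p)
	}
	// Invariant 2: repeated slashes are always collapsed.
	if strings.Contains(cleaned, "//") {
		return fmt.Errorf("Clean(%q) = %q still contains \"//\"", p, cleaned)
	}
	// Invariant 3: cleaning is idempotent.
	if again := path.Clean(cleaned); again != cleaned {
		return fmt.Errorf("Clean is not idempotent: %q -> %q -> %q", p, cleaned, again)
	}
	return nil
}

// FuzzCleanInvariants lets the fuzzer search for a counterexample.
func FuzzCleanInvariants(f *testing.F) {
	f.Add("/prefix//abc.html")
	f.Add("a/../b/./c")
	f.Fuzz(func(t *testing.T, p string) {
		if err := checkCleanInvariants(p); err != nil {
			t.Fatal(err)
		}
	})
}
```

Keeping the property in its own function means the same check can be called from ordinary unit tests as well as from the fuzz target.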

Round-Trip Fuzzing

If you have code that takes input of type A and returns a type B (i.e., A -> B), and code that does the opposite operation (B -> A), you can combine the two functions to discover data integrity and logic bugs.

Let’s take a look at a concrete example using the github.com/Rhymond/go-money safe money library. The library claims to provide reliable operations for currency-based calculations that don’t lose pennies due to rounding errors.

We can check if this property holds true by writing a test that splits up an arbitrary currency value into a random set of pieces, and then adds them back together. We should expect the same amount of money at the end of the test, regardless of the provided values:

// 02-advanced-techniques/currency_roundtrip_test.go
package advancedtechniques

import (
	"testing"

	"github.com/Rhymond/go-money"
)

func FuzzCurrency(f *testing.F) {
	f.Fuzz(func(t *testing.T, currencyAmount int64, splitAmount int) {
		// Set up a new money struct with a random amount
		amount := money.New(currencyAmount, money.GBP)

		// Split that amount into a random amount of pieces
		split, err := amount.Split(splitAmount)
		if err != nil {
			return
		}

		// Add those split pieces back together
		final := money.New(0, money.GBP)
		for _, s := range split {
			final, err = final.Add(s)
			if err != nil {
				// This shouldn't happen
				t.Fatal("Error adding split currency back together:", s, err)
			}
		}

		// Make sure the summed pieces equal the starting amount
		eq, err := amount.Equals(final)
		if err != nil {
			t.Fatal("Error when comparing currency values that should be valid", err)
		}

		if !eq {
			t.Fatal("Splitting currency into", splitAmount, "parts, and adding back together, produces mismatch")
		}
	})
}

Run this test:

go test -fuzz FuzzCurrency -run FuzzCurrency .

I let the fuzzer run 100 million test cases before stopping it, without any signs of a bug. There are of course other functions we could check within this library, but we can be quite confident that the splitting code works as described.

Unfortunately, not all code is this reliable. A round-trip bug in Go’s encoding/xml library resulted in 3 Critical Severity CVEs.

It’s likely these bugs could have been found with a round-trip fuzz test of the encoding/xml package: decode XML into a struct, re-encode it back into XML, and look for differences between the input and the output.

If you have data that can be converted back and forth, round-trip fuzzing can uncover subtle edge cases. API clients and server handlers, import and export, and encoding and decoding are all good candidates for round-trip fuzzing.
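For an encode/decode pair you can try without any third-party dependencies, here is a small sketch built on the standard library’s strconv.Quote (string to Go string literal) and strconv.Unquote (literal back to string). The function names are ours, and the property being assumed is simply that decoding an encoded value returns the original value.

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// checkQuoteRoundTrip encodes s as a Go string literal (A -> B), decodes the
// literal again (B -> A), and reports any difference from the original.
func checkQuoteRoundTrip(s string) error {
	quoted := strconv.Quote(s)
	unquoted, err := strconv.Unquote(quoted)
	if err != nil {
		return fmt.Errorf("Unquote(%s) failed: %w", quoted, err)
	}
	if unquoted != s {
		return fmt.Errorf("round trip changed %q into %q", s, unquoted)
	}
	return nil
}

// FuzzQuoteRoundTrip feeds arbitrary strings through the round trip.
func FuzzQuoteRoundTrip(f *testing.F) {
	f.Add("plain text")
	f.Add("tabs\tand \"quotes\"")
	f.Add("\xff\xfe invalid UTF-8")
	f.Fuzz(func(t *testing.T, s string) {
		if err := checkQuoteRoundTrip(s); err != nil {
			t.Fatal(err)
		}
	})
}
```

The seeds deliberately include escape characters and invalid UTF-8, since those are the inputs most likely to be altered somewhere between encoding and decoding.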

Differential Fuzzing

Differential fuzzing uses previously written reference code as the “invariant”. Simply put, differential fuzzing runs inputs provided by the fuzzer through two different pieces of code that are meant to do the same thing, and then checks to make sure their outputs are equal.

For example, take two popular YAML parser libraries: github.com/go-yaml/yaml, and github.com/goccy/go-yaml. In a perfect world, these two libraries would handle inputs the same way, allowing for interoperability between different systems that consume YAML. We could write the following fuzz test to check if this property holds true:

// 02-advanced-techniques/yaml_test.go
package advancedtechniques

import (
	"reflect"
	"testing"

	yaml1 "gopkg.in/yaml.v2"
	yaml2 "github.com/goccy/go-yaml"
)

func FuzzYamlDifferential(f *testing.F) {
	// We could add some seeds here using f.Add if we wanted
	// to provide the fuzzer with some example YAML strings

	f.Fuzz(func(t *testing.T, data []byte) {
		map1 := map[string]interface{}{}
		map2 := map[string]interface{}{}

		err1 := yaml1.Unmarshal(data, &map1)
		err2 := yaml2.Unmarshal(data, &map2)

		if err1 == nil && err2 == nil {
			if len(map1) == 0 && len(map2) == 0 {
				// reflect.DeepEqual treats a nil map and an empty map as unequal, so skip this case
				return
			}

			// If both think the data is valid, make sure they got the same structure
			if !reflect.DeepEqual(map1, map2) {
				t.Logf("Yaml1: %+v", map1)
				t.Logf("Yaml2: %+v", map2)
				t.Fatalf("Parsed yaml mismatch.")
			}
		}
	})
}

This fuzz test discovers an input that the two libraries disagree on almost immediately:

go test -fuzz FuzzYamlDifferential -run FuzzYamlDifferential .
warning: starting with empty corpus
fuzz: elapsed: 0s, execs: 0 (0/sec), new interesting: 0 (total: 0)
fuzz: elapsed: 1s, execs: 1432 (1331/sec), new interesting: 18 (total: 18)
--- FAIL: FuzzYamlDifferential (1.08s)
    --- FAIL: FuzzYamlDifferential (0.00s)
        yaml_test.go:34: Yaml1: map[0:8]
        yaml_test.go:35: Yaml2: map[:0]
        yaml_test.go:36: Parsed yaml mismatch.
    
    Failing input written to testdata/fuzz/FuzzYamlDifferential/84153dc131dc035f4b5918b4c18c5fbf517c6af946339b25abe8565baab7e85b
    To re-run:
    go test -run=FuzzYamlDifferential/84153dc131dc035f4b5918b4c18c5fbf517c6af946339b25abe8565baab7e85b
FAIL
exit status 1
FAIL	github.com/fuzzbuzz/go-fuzzing-tutorial/02-advanced-techniques	1.079s

This test has the potential to produce a wide array of inputs, so yours may look different, but the test case in my testdata/fuzz/FuzzYamlDifferential/84153dc131dc035f4b5918b4c18c5fbf517c6af946339b25abe8565baab7e85b was:

go test fuzz v1
[]byte("0: 08")

This tells us that these two Go libraries interpret 0: 08 differently. We were curious how other languages handle it, so we ran the same input through a few more YAML libraries, but were left with more questions than answers:

  • Go’s yaml.v2: interprets it as a map with key “0” and a float64 value of 8
  • Go’s goccy/go-yaml: interprets it as a map with key “” and a uint64 value of 0
  • JavaScript: completely ignores the leading zero in 08
  • Python 2: errors on 08 because it interprets it as an octal value
  • Python 3: errors on the leading zero with a SyntaxError
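To get a feel for why a token like 08 is ambiguous, here is a small sketch using Go’s standard strconv package. To be clear, this is not the code either YAML library runs; the three helper names are ours, and they only show that equally reasonable number-parsing strategies disagree on the same text.

```go
package main

import "strconv"

// parseAutoBase lets strconv infer the base from the prefix (base 0), so a
// leading "0" selects octal.
func parseAutoBase(s string) (int64, error) {
	return strconv.ParseInt(s, 0, 64)
}

// parseDecimal always reads base 10, so leading zeros are harmless.
func parseDecimal(s string) (int64, error) {
	return strconv.ParseInt(s, 10, 64)
}

// parseFloat reads the same text as a floating-point number.
func parseFloat(s string) (float64, error) {
	return strconv.ParseFloat(s, 64)
}
```

Here parseAutoBase("08") returns an error because 8 is not an octal digit, while parseDecimal("08") and parseFloat("08") both return 8. A parser that tries these strategies in a different order, or falls back from one to another, can end up with a different type or value for the same input.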

Inconsistencies like these can lead to subtle interoperability bugs, and some are more serious than others: last year a cryptographic differential fuzzer found a vulnerability in Go’s crypto/elliptic library.

Differential fuzzing is one of the most effective ways to test for compatibility between different implementations of the same functionality. It’s a great way to make sure refactored code doesn’t introduce new bugs or change functionality, or to validate the correctness of new, optimized versions of slow code.
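As a sketch of the "optimized version versus reference" case, the test below compares a hand-written bit counter against the standard library’s bits.OnesCount64. The function fastPopCount is a hypothetical stand-in for whatever new code you want to validate; only bits.OnesCount64 is a real library call.

```go
package main

import (
	"fmt"
	"math/bits"
	"testing"
)

// fastPopCount is a hypothetical "new" implementation under test. It clears
// the lowest set bit on every iteration and counts the iterations.
func fastPopCount(x uint64) int {
	n := 0
	for x != 0 {
		x &= x - 1
		n++
	}
	return n
}

// checkPopCount compares the new code against the reference implementation.
func checkPopCount(x uint64) error {
	want := bits.OnesCount64(x)
	if got := fastPopCount(x); got != want {
		return fmt.Errorf("fastPopCount(%#x) = %d, reference says %d", x, got, want)
	}
	return nil
}

// FuzzPopCountDifferential asks the fuzzer for inputs where the two disagree.
func FuzzPopCountDifferential(f *testing.F) {
	f.Add(uint64(0))
	f.Add(uint64(0xff))
	f.Add(^uint64(0))
	f.Fuzz(func(t *testing.T, x uint64) {
		if err := checkPopCount(x); err != nil {
			t.Fatal(err)
		}
	})
}
```

The reference implementation plays the role of the invariant here, so the test needs no knowledge of what the correct answer is for any particular input.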

How to get started

It can be daunting to get started with fuzz testing - generalizing your code’s behavior for all inputs isn’t always simple. We’ve found that the best way to start adding fuzz tests to your codebase is to write some simple ones that work end-to-end, using some of the techniques we’ve described so far. Even if you think a test is too basic or unlikely to find any bugs, fuzzing can surprise you.

If you’re struggling to get started, stop by our Discord; we’d love to help.

In our next post, we’ll take a look at how you can use fuzz testing to validate your REST API by calling HTTP handlers just like a user would. You can sign up below to be notified as soon as it comes out!