A Bit Of Gardening

A Bit Of Gardening

The focus with Hazeltest in the past weeks has been to implement a certain feature set so I can use the application as early as possible in a project with my current client. While this goal was achieved, there is, of course, still “no free lunch”, and the price tag here was a couple of things I didn’t like about the code. Because I’m an early learner of Go, these things combined with the new Go-foo I’ve learned in the past weeks provided just the perfect opportunity to get out the gardening gloves and do a little refactoring.

Since the previous blog post was published, the Hazeltest source code has seen both a couple of refactorings and some new features. In the following sections, I’d like to talk about the refactorings I’ve made in the code, and cover the new features in a dedicated blog post.

Introduction

In the context of refactoring, the authors of the excellent book The Pragmatic Programmer make the following analogy: Rather than construction, software is more like gardening. They continue to point out that software is more organic than concrete and that it’s quite natural for its source code to evolve over time (very much to the chagrin of management, who prefer the construction analogy because it makes planning software projects, and reasoning about the software, so much more straightforward). While the construction analogy may be useful for management, the idea of thinking about software development as being a bit like gardening seems to reflect the reality of the craft much more accurately and realistically: As source code gets added, modified or deleted, new abstraction layers emerge and offer themselves as new “building blocks” to be refactored into, thus leading to improvements of the code’s -ilities (like extensibility and maintainability). A classic example of this might be a class or module that needs to be split into smaller classes or modules because it got too large and exceeded the responsibility boundaries the developer had in mind when initially designing it.

As far as Hazeltest is concerned, the top priority in the past two months or so was to make the application useful for a project with my current client – we urgently needed one specific piece of functionality, and as a young learner of Golang, I wasn’t able to provide it without creating some rough edges around the code. On the bright side, those rough edges then offered plenty of opportunities for some impactful gardening work, and as a little bonus, performing refactorings on Hazeltest also gave me the chance of applying the new Go-foo I’ve learned in the past weeks.

The Gardening Threesome

So, let’s pull out the gloves, dust off the gardening tools and get to work! In the following sections, I’d like to show you three little refactorings the code base has undergone (although the first two were improvement ideas at first that kind of turned into refactorings in the process).

Signalling Readiness

Issue on GitHub: Raise Readiness Only Once Hazelcast Clients Have Connected

Context

Because Hazeltest is optimized to run on Kubernetes, it needs to implement an endpoint informing about the application’s readiness the Deployment object’s readiness probes can query. Generally, a readiness endpoint should state the application is ready only once it has booted correctly and is available for handling requests or otherwise able to perform its intended functionality. In the case of Hazeltest, intended functionality refers to the various runners the application uses to interact with Hazelcast, and so it follows that Hazeltest’s readiness endpoint should yield readiness only once the runners have everything they need to do their jobs. There are two prerequisites for the latter, namely, (a) correct configuration, and (b) a connection to a Hazelcast cluster.

Problem

Before the pull request linked to the issue above was merged, the readiness endpoint had signalled readiness once a subset of (a) was evaluated to be true, but long before (b) was true for the internal Hazelcast clients the runners employ to connect to Hazelcast. The purpose of this issue, then, was simply to raise readiness only once all enabled runners have been able to successfully initialize their Hazelcast clients.

Solution

The improvement was put into practice according to the following idea:

  1. Parse runner configuration
  2. If runner is enabled, signal non-readiness to the API responsible for exposing the readiness endpoint
  3. Initialize Hazelcast client in runner
  4. If initialization was successful, runner signal readiness to the API

If all runners adhere to this, then all the API has to do is keep track of the number of non-ready clients and return HTTP 200 once this number has reached zero. The code for making this happen is very straightforward (excerpt from api/endpoints.go as of pull request #6):

// From api/endpoints.go
// (Note that some renamings have taken place since then, and to improve readability, I have included them here)

type readiness struct {
	Up                         bool
	atLeastOneRunnerRegistered bool
	numNonReadyRunners         int
}

// ...

func RaiseNotReady() {

	m.Lock()
	{
		r.numNonReadyRunners++
		if !r.atLeastOneRunnerRegistered {
			r.atLeastOneRunnerRegistered = true
		}
		lp.LogApiEvent(fmt.Sprintf("runner has raised 'not ready', number of non-ready runners now %d", r.numNonReadyRunners), log.InfoLevel)
	}
	m.Unlock()

}

func RaiseReady() {

	m.Lock()
	{
		r.numNonReadyRunners--
		lp.LogApiEvent(fmt.Sprintf("runner has raised readiness, number of non-ready runners now %d", r.numNonReadyRunners), log.InfoLevel)
		if r.numNonReadyRunners == 0 && r.atLeastOneRunnerRegistered && !r.Up {
			r.Up = true
			lp.LogApiEvent("all runners ready", log.InfoLevel)
		}
	}
	m.Unlock()

}

// ...

The only exported field, Up, of the little readiness struct right on top of the snippet is what gets encoded into a tiny JSON object to serve as the response to requests on /readiness along with HTTP 200 (once readiness has been achieved, that is; if not, the client will receive HTTP 503). The other two properties are for internal state keeping only:

  • numNonReadyRunners: Used to keep track of the number of non-ready runners. Once this number hits zero, Up is set to true, and from this point on, clients will receive the aforementioned JSON response on the /readiness endpoint. Because the functions writing and reading this property get invoked concurrently by runners on different goroutines, access to the property has to be protected by means of a mutex.
  • atLeastOneRunnerRegistered: According to the logic above, the readiness endpoint would signal readiness even if no runner has registered. That wouldn’t make much sense despite the fact that the number of non-ready runners is technically really zero in this case, so to prevent this from happening, the atLeastOneRunnerRegistered property is used to make sure Up only gets set to true if at least one runner has previously “registered” by signalling it is not ready yet.

So far, so good, but that doesn’t really give you refactoring vibes, does it? The refactoring bit came when I felt like a boy scout while looking at the following (diff as of pull request #6 over here):

// From an older version of maps/testloop.go

type TestLoopStatus struct {
	Source            string
	Config            *runnerConfig
	NumMaps           int
	NumRuns           int
	TotalRuns         int
	TotalRunsFinished int
}

As you can probably tell from its name, this is the struct holding the status of a test loop (for maps, to be precise), which can be queried via another endpoint in the API, /status. There is nothing wrong with the struct itself – the fields it defines are appropriate and they all need to be exported for the JSON encoding to take them into consideration –, but rather with its placement in the code, namely, inside the maps package. This is unelegant because of the following reasons:

  • The most obvious one is that runners for other data structures, such as queues, might produce a very similar or even the same kind of status, hence would benefit from using the TestLoopStatus struct, too. Having become commonly used functionality, it should not reside in the maps package. (You might point out I could have YAGNI‘d this until a runner outside the maps package really needs it, but in fact, this had already happened with the implementation of two new queue runners.)
  • With the maps/testloop.go file encapsulating the test loop status, a piece of functionality in the api package has to query the struct in order to do something with it. This is a violation of the Tell, Don’t Ask guideline and, as a consequence of this, introduces a rather ugly coupling between the field in the maps/testloop.go file exposing the TestLoopStatus variable and the logic in the api package acting on it.

What do to about this? Well, for reasons I’ve summarized in this issue, the most elegant and sustainable solution probably would have been to redesign the means of communicating test loop status across package boundaries altogether by – potentially, I’ll have to try this and see what happens – ditching the struct and only pass around a simple map[string]interface{} while isolating the knowledge of how to populate it in the individual runners. This would have resulted in a refactoring exceeding the idea of simply cleaning up a little mess, though, and would thus have diluted the purpose of the original issue dealing with fixing when the application signals readiness. As a compromise – to make this part of the code at least a tiny bit better –, I moved the TestLoopStatus struct into the the api/status.go file, forcing the application’s runners to inform the API about their status rather than the latter asking for it. This is most obviously represented by the reversal of package imports – the individual runner packages now import the api package to access the functions for updating the test loop status rather than vice-versa. Of course, this means there is still coupling between runner packages and the api package, and a change to the TestLoopStatus in the api package would still ripple to all places where it is used, but TestLoopStatus and the functionality acting on it are much closer together now since they now both reside in the api package. Thus, the seemingly very small act of simply moving a struct to another package means the cross-package coupling between state and the functionality for acting on it was removed.

You can find the diff of the pull request by which the issue was resolved over here.

Reading Configuration

Issue on GitHub: Refactor Configuration Mechanism

Context

Hazeltest currently offers four runners for generating load on the Hazelcast cluster under test, and they can be configured using a range of options for users of the application to fine-tune the load generation and adjust it to whatever aspect of their Hazelcast clusters they would like to test. Although the runners come with what I believe are reasonable defaults, all configuration options are exposed to the outside, and can be overridden by means of a Yaml file, the path to which users can specify by means of a command-line argument on application startup. Because there are so many configuration options – and because more will surely follow –, I tried to make providing the configuration as convenient as possible by enabling users to specify only a subset of the available options in their provided file, and make the application apply the defaults wherever the user hasn’t provided a value.

Problem

The first iteration for a design to implement this desired behavior was based on the idea that if the user specified a path to a configuration file, then read it, and read the default configuration file otherwise. The latter was a Yaml file, defaultConfig.yaml, placed alongside the Go source files and embedded into the application by means of the go:embed directive. The problem with this approach was that it did not allow the contents of both files to exist simultaneously as state in the application, and therefore, to allow for the previously described behavior, the source code itself had to define defaults, too, such that when the user specified an override for the default configuration file not containing the full set of options, the remainder could be populated from the code-internal defaults. Consequently, there was no single source of truth for the notion of default – you’d expect the defaultConfig.yaml file to be that single source of truth because of its name, only to then learn about the code-internal defaults. So, not only did the initial design create the need to keep two sources of default values in sync, but also constituted a violation of the Principle Of Least Astonishment. On top of that, the implementation of this first design iteration was very clunky and resulted in much code duplication (not necessarily a result of the aforementioned design flaw, but rather due to the fact that, having just started to learn Golang, my understanding of the language at the time of implementing this first design was still very basic).

To make this more tangible, let’s back it up with a short example. Say you’re a Hazeltest user and you would like to override the number of maps the PokedexRunner uses. The default is 5, and because you want the runner to spawn 10 map goroutines, you supply the following configuration file:

maptests:
  pokedex:
    numMaps: 10

What happened internally when you specified the path to this configuration file via the --config-file argument was this (excerpt from the pre-refactoring version of maps/runner.go):

// From an old version of maps/runner.go

// ...

const (
	// ...
	defaultNumMaps	= 10
	// ...
	defaultNumRuns  = 10000
	// ...
)

// ...

func (b runnerConfigBuilder) populateConfig() *runnerConfig {

	keyPath := b.runnerKeyPath + ".enabled"
	valueFromConfig, err := config.ExtractConfigValue(b.parsedConfig, keyPath)
	// ...

	keyPath = b.runnerKeyPath + ".numMaps"
	valueFromConfig, err = config.ExtractConfigValue(b.parsedConfig, keyPath)
	var numMaps int
	if err != nil {
		lp.LogErrUponConfigExtraction(keyPath, err, log.WarnLevel)
		numMaps = defaultNumMaps
	} else {
		numMaps = valueFromConfig.(int)
	}

	// ...

	keyPath = b.runnerKeyPath + ".numRuns"
	valueFromConfig, err = config.ExtractConfigValue(b.parsedConfig, keyPath)
	var numRuns int
	if err != nil {
		lp.LogErrUponConfigExtraction(keyPath, err, log.WarnLevel)
		numRuns = defaultNumRuns
	} else {
		numRuns = valueFromConfig.(int)
	}

	// ...

}

// ...

The config.ExtractConfigValue function extracts the configuration value at the given Yaml path from the specified configuration file, which can be either the default or the user-provided one. In our example, since you’ve provided a configuration file containing the maptests.pokedex.numMaps key, the function will return the corresponding value, and this value will be applied instead of the default. In case of all other values, however – exemplified here by the maptests.pokedex.numRuns property –, the function will return an error since the key path in question was not contained in the configuration file, causing the caller to log a warning and apply the default instead. Apart from the obvious code duplication, this also is a violation of the Principle Of Least Astonishment from the user’s perspective – you wouldn’t expect the application the log a warning in case, say, numRuns hasn’t been provided when it was precisely your intention to not provide it and let the application use the default instead.

Solution

The issue that needed to be resolved first was the flaw in the initial design preventing two configuration files from existing simultaneously as state to use in the application. As the very first step, the structure of the client package was flattened (having worked mostly with Java thus far, I was used to assemble relatively small packages and create nested package hierarchies, but that doesn’t really work in Go for reasons explained further down below in the next section), and all functionality related to reading configuration – both provided via command-line arguments and by means of configuration files – as well as extracting the values contained in that configuration were moved to a single file, config.go. Being located in the same package, it became possible for most identifiers to be unexported, thus reducing the API surface visible from outside the package. More precisely, config.go now only exports three functions (excerpt from the state of client/config.go as of pull request #10):

// From client/config.go

// ...

func ParseConfigs() {

	parseCommandLineArgs()
	parseDefaultConfigFile()
	parseUserSuppliedConfigFile()

}

func RetrieveArgValue(arg string) interface{} {

	return commandLineArgs[arg]	

}

func PopulateConfigProperty(keyPath string, assignValue func(any)) {

	if value, err := retrieveConfigValue(keyPath); err != nil {
		lp.LogErrUponConfigExtraction(keyPath, err, log.FatalLevel)
	} else {
		assignValue(value)
	}

}

// ...

The first function, ParseConfigs, gets called by main itself and is responsible for parsing all sources of configuration into internal state the application can later rely on. As you can see here, the option of two configuration files being present was accounted for – the default configuration file will always be present thanks to go:embed and get parsed into a map[string]interface{}, and if the user has provided a custom configuration file, it will be parsed, too; otherwise, its corresponding map[string]interface{} will simply stay nil. The second function is not relevant for this discussion, but the third one, PopulateConfigProperty, is: It lets callers provide a key path (such as maptests.pokedex.numMaps, to stick with the example above) and a function for assigning the retrieved value to its target property. PopulateConfigProperty is key to removing the redundancy in assigning the retrieved value to the correct property on the caller’s side. Thanks to this function, the caller can do the following (excerpt from the state of maps/runner.go as of pull request #10):

// From maps/runner.go

// ...

func (b runnerConfigBuilder) populateConfig() *runnerConfig {

	// ...

	var numMaps int
	client.PopulateConfigProperty(b.runnerKeyPath+".numMaps", func(a any) {
 		numMaps = a.(int)
 	})

	// ...

	var numRuns int
	client.PopulateConfigProperty(b.runnerKeyPath+".numRuns", func(a any) {
 		numRuns = a.(int)
 	})

	// ...

}

// ...

You’ve seen earlier on how the implementation of the assignment of the numMaps and numRuns properties prior to the refactoring carried a lot of redundancy. Now, after the refactoring, the code is much cleaner and a lot less elaborate, making it easier to read, understand, and maintain. This is also due to the fact that the caller doesn’t need to know anymore which value might be assigned to the target property – whether this value is a default or a specific user-provided one is now handled entirely in the client package. There, the config value retrieval is implemented by the retrieveConfigValue function (excerpt from client/config.go as of pull request #10):

// From client/config.go

// ...

func retrieveConfigValue(keyPath string) (any, error) {

 	if value, err := retrieveConfigValueFromMap(userSuppliedConfig, keyPath); err == nil {
 		lp.LogConfigEvent(keyPath, "config file", "found value in user-supplied config file", log.TraceLevel)
 		return value, nil
 	}

 	if value, err := retrieveConfigValueFromMap(defaultConfig, keyPath); err == nil {
 		lp.LogConfigEvent(keyPath, "config file", "found value in default config file", log.TraceLevel)
 		return value, nil
 	}

 	errMsg := fmt.Sprintf("no map provides value for key '%s'", keyPath)
 	lp.LogConfigEvent(keyPath, "config file", errMsg, log.WarnLevel)
 	return nil, errors.New(errMsg)

 }

 // ...

The idea of two configuration files that have been parsed into to different internal maps is represented here, too, and this function also expresses the fact that the map populated from the user-supplied configuration file always takes precedence over the default map. In case no value could be bound – which can only happen if I made a mistake either setting up the default configuration file or coding the key paths to be extracted from there –, the function logs a warning and returns a non-nil error, upon which its caller will cause the application to exit with a non-zero exit code by, ultimately, invoking log.Fatal. Thus, if execution flow continues, the callers – which are the application’s various runners – can be sure all the configuration values they rely on have been populated successfully.

Ultimately, this refactoring resulted in a single source of truth for default configuration values, and made the code for parsing and assigning these values a lot cleaner. You can find the entire diff of the pull request resolving the issue here.

Go Ain’t Java

Issue on GitHub: Refactor Size Of Go Files

Context

If you’ve taken a look at my blog posts from quite some time ago, such as this one, you may have noticed the source code examples are all in Java. Indeed, I come from the Java world – it was the programming language that introduced me to programming and the craft of software engineering during my studies of Computer Science, and due to its popularity, it appeared in every stage of my professional career thus far. Java has been a well-known and highly appreciated companion of mine since 2012, and because it was my first programming language, it will always occupy a special place in my heart.

With the introduction and rise of Kubernetes, though, another player has entered the stage: Golang. It’s a relatively young programming language – a lot younger than Java, anyway –, but it has already seen massive adoption. There are great articles out there whose authors are much better at describing the characteristics and advantages of the language than me (this one, for example), so I won’t go (ha-ha) into that here – for me, the main reason to be interested in the language is the fact that it fits so well into the world of containerization with its outstanding execution performance and small executables, and when the need for a Hazelcast testing program arose, I saw an opportunity to finally take my Golang learning to the next level.

When starting to code in Golang, many differences to Java became obvious immediately – the fact that it has explicit support for pointers, for example. Other things were less obvious at first, such as the way Golang code is organized into files and packages. Java comes with various visibility modifiers for its language structures (such as classes and methods) – public, protected, and private, as well as the default visibility that does not have an explicit modifier. The existence of the private modifier seems to encourage Java developers to write relatively small units of functionality because it provides a means for doing so: The private modifier allows for “locking down” state and the functionality to act on it. And indeed, this seems to be necessary in Java: Once a Java class becomes too large and/or its responsibility borders become too fuzzy, developers will usually want to refactor it into multiple smaller classes, and because a Java file can only contain one top-level Java class or interface, a higher number of Java classes as the result of refactorings or simply a growing code base necessarily translates to a higher number of Java files, which then need to be organized into packages. Consequently, large Java programs tend to contain many files organized into deeply nested and sophisticated package hierarchies, and the language’s visibility modifiers enable developers to impose fine-grained rules on how the many pieces of functionality these files contain can interact with each other. Therefore, the availability of low-visibility, especially private, means that a higher number of packages to organize Java files and the identifiers they encapsulate does not necessarily result in a larger API surface as seen from outside each package.

The code organization paradigm in Golang seems to differ greatly from the one present in Java. The first indication for this is that Golang offers only two visibility modes (using the word modifier may not be entirely accurate here because these two modes do not, in fact, have dedicated terms in the language to express them):

  • Unexported: The identifier in question is package-private, i. e. is visible throughout the package it has been declared in
  • Exported: The identifier is visible throughout the entire code base and can therefore by imported by any other package

Since unexported translates to package-private, there is no motivation anymore to create very small files because the identifiers they contain will be visible in the entire package anyway, so organizing code into small files becomes nothing more than a means to structure the code for increasing readability, but it seems to have no meaning as far as the language is concerned. The question that follows would then be: If all the files can access each other’s identifiers anyway, then why not simply create larger files? After all, one Go file can contain multiple top-level identifiers, so the limitation from Java that one file can contain only one top-level class or interface does not apply (which also because Golang does not know the concept of a class as such, but uses structs and the methods defined on them as the closest equivalent, which was yet another interesting feat about Golang to learn).

Problem

Coming from the Java world, I ditched the idea of creating files with larger scope at first because, while Java’s limitation of only one top-level class or interface might seem limiting, it also made thinking about the content of a single file more straightforward, at least as far as said top-level object was concerned. So, my naive me sailed on happily creating many, many Go files and then – inevitably – starting to organize them into packages, and when different pieces of functionality needed to invoke other pieces across package boundaries, I simply exported the identifiers subject to invocation. Things did not go well for long, because, for starters, simply exporting all identifiers seems wrong in principle, but in Golang, exporting an identifier carries quite some significance because it means this identifier is visible throughout the entire code base, thus exporting a lot of identifiers quickly created a huge API surface for the various packages, which, in turn, greatly increased coupling between the various identifiers in the involved packages.

As it turns out, one is discouraged to create many small files in Golang because problems arise as soon as the high number of files needs to be organized into packages. Instead, it seems wise to apply a different paradigm when thinking about what scope, and thus what size, a single Go file should have. The great Dave Cheney formulated the rule of thumb that one Java package is equivalent to one Go source file, and a Go package is equivalent to a whole Maven module. This seems to make a lot of sense – widening the scope of one’s Go source files alleviates, or even erases, the need for deeply nested package hierarchies, so the package hierarchy of the average Go program seems to be much more shallow than that of the average Java program.

One such example of having created too nested a package hierarchy was the maps package with its different runners. When first implementing those runners, I had the idea that the runners must be separated from each other, and in my mind, that meant isolating each one in its own package. Thus, the PokedexRunner was placed in maps/pokedex, and the LoadRunner in maps/load – so far, so good. The result, however, was that commonly used identifiers in the maps package had to be exported to make them accessible in the two sub packages, which meant vastly increasing the API surface of the maps package. Why would other packages unrelated to map runners want to import and make use of the MapRunnerConfig or the SleepConfig structs? Highly unlikely they would, and if so, they probably shouldn’t because it would be a smell for bad design.

Solution

The first step was to flatten the package hierarchy by unifying all identifiers both the PokedexRunner and the LoadRunner (and any other map runner that may be implemented) have to rely on in only two files:

  • runner.go encapsulates all functionality for initializing and configuring a runner, such as the runnerConfig struct, and its accompanying runnerConfigBuilder. The only exported identifiers are now the MapTester struct and its TestMaps method, both of which actually must be exported because main initializes the former and then invokes the latter. Thus, the API surface as seen from outside is – as far as my current understanding goes, anyway – exactly what it should be.
  • testloop.go contains – as the name suggests – the implementation of the test loop the various runners initialize and use. The test loop executes actions against the Hazelcast cluster on behalf of those runners using the datasets and adhering to the configuration the latter have provided them with. This file has existed prior to the pull request, but has undergone many changes in scope of it, most of them related to unexporting identifiers. Today, testloop.go does not expose any identifier to outside the package because it’s used only by runners within the maps package, so the API surface seems much more on point now.

You can find the diff for both of these files as of pull request #5 over here and here, respectively. The queues package has undergone a very similar set of changes, but they weren’t quite as plentiful because, back when the refactoring took place, there was only one queue runner available, the TweetRunner. You can find its diff over here.

Closing Thoughts

I love refactorings, especially when the code is written in a language I’ve just started learning, and I hope that Hazeltest’s code base was much improved both in terms of the functionality it offers and in terms of its -ilities. The reason I’m so fond of refactorings is that, while you might well have a vision of where you want to end up because someone described the goal in a ticket or issue of some sorts, you’ll never really know how exactly the path to get there may look like at the beginning. So, you let your intuition guide you to the first change, and applying it may create the opportunity for another change, which then maybe creates the opportunity for yet another one, and so on, gradually and organically transforming bits and pieces of the code base, until, various commits and mugs of coffee later, it has evolved into a new, improved version of itself. Coming back to the notion conveyed in the introduction of this blog post, while the ticket or issue that sent you on the journey might have contained a very vivid and exact formulation of the goal, it seems entirely impossible – to me, anyway – to plan each step to get there in perfect precision because the planning becomes a fuzzy chain of “what ifs” right after having agreed on the first step. And so, if the idea of precise planning falls short for refactorings, then how could it possibly succeed for the implementation of an entire software system, of which refactoring is only one part? It seems to me, then, that the authors of The Pragmatic Programmer had a point when they exclaimed that software, and thus all work involved in creating it, is more like gardening and less like construction. So, when will you get your gardening gloves?