Good test, bad test
May 28, 2021 · 10 minute read · ruby, testing
I ❤ tests! They give me confidence that the code does what it’s supposed to do, they document behaviour, they let me discover edge cases when writing the code in the first place. All tests are not created equal, though, and I don’t love them all the same.
I’d like to put into words which properties make tests good but… it depends. It depends on many factors: language, type of test, trust in cooperating components, criticality of the code under test, cost of running a test, time pressure and many more. There are definitely attributes which transcend all these aspects (e.g. clarity, focus) but they might manifest differently based on the circumstances of the test. What I can do is present two versions of a specific test and describe what I find valuable.
The test is fictional but heavily inspired by a test I discussed with a colleague during a review a few weeks back. It’s in Ruby, using RSpec as the testing framework and FactoryBot for fixtures.
Bad test
describe PlayerCreator do
  describe 'validations' do
    before(:each) do
      @c = create :customer
      @s = create :sport, customer: @c
      @t = create :team, customer: @c
      @current_user = build_stubbed(:admin)
      @params = {
        player: {
          sport_ids: [@s.id],
          team_id: @t.id,
          nick: 'Alice',
        }
      }
    end

    it 'should allow leading and trailing whitespace in nick' do
      @params[:player][:nick] = " Alice\n"
      result = PlayerCreator.(
        current_user: @current_user, params: @params
      )
      expect(result).to be_success
    end

    it 'should prohibit whitespace within nick value' do
      @params[:player][:nick] = "Al ice"
      result = PlayerCreator.(
        current_user: @current_user, params: @params
      )
      expect_error(
        result,
        :nick,
        'Spaces, new lines and other whitespace characters are prohibited'
      )
    end
  end
end
Good test
describe PlayerCreator do
  it 'strips off leading and trailing whitespace(s) from the nick and succeeds' do
    result = PlayerCreator.(
      current_user: build_stubbed(:admin),
      params: player_attributes(nick: " Alice\n"),
    )
    expect(result).to be_success
    expect(result[:model].nick).to eq 'Alice'
  end

  it 'prohibits whitespace within the nick' do
    result = PlayerCreator.(
      current_user: build_stubbed(:admin),
      params: player_attributes(nick: "Al ice"),
    )
    expect_error(
      result,
      :nick,
      'Spaces, new lines and other whitespace characters are prohibited'
    )
  end

  def player_attributes(**fields)
    c = create :customer
    s = create :sport, customer: c
    t = create :team, customer: c
    attrs = attributes_for(:player, **fields) do |attrs|
      attrs.merge!(team_id: [t.id], sport_ids: [s.id])
    end
    { player: attrs }
  end
end
The differences
Test case names
RSpec has a built-in documentation output format which prints the full test case names and “paths”:
# bad test
PlayerCreator
  validations
    should allow leading and trailing whitespace in nick
    should prohibit whitespace within nick value

# good test
PlayerCreator
  strips off leading and trailing whitespace(s) from the nick and succeeds
  prohibits whitespace within the nick
This format is great for understanding code you’re not familiar with. Ideally, it should tell you what the code under test does. While it’s possible to discern it from the bad test’s output, the second example is more comprehensible in my opinion.
It is also in line with the BDD (behaviour driven development) nature of RSpec - to test (and describe) behaviour rather than implementation. It might be hard to accomplish with unit tests of a low-level component, but I strive for as much documentation value as possible.
A related point, reflected in the preferred matcher style in RSpec, is to write test names as statements of fact instead of wishes, e.g. prohibits whitespace within the nick instead of should prohibit whitespace within nick value.
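If you want this output for your own suite, the documentation formatter can be enabled per run with rspec --format documentation, or set as the default - a minimal sketch for spec_helper.rb:

# spec_helper.rb - print the test case names and "paths" instead of dots
RSpec.configure do |config|
  config.formatter = :documentation
end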
Nesting
The good test doesn’t wrap the test cases in a describe/context block labelling them as validation test cases. My stance here is weaker, more nuanced and less dogmatic.
Often, the nesting wrapper breaks the flow of test case names discussed in the previous section. This needs to be balanced against the benefit of grouping the tests. This test file has only five test cases in total so I lean towards not including it. If the test file has thirty cases which fall into four groups, I’d be much more likely to use nesting.
Nesting contexts often leads to nesting lets which override each other on every level and make it hard to figure out what the setup of a particular test actually is. It doesn’t have to happen but it’s oh so tempting. I know I’ve written test files nested eight (maybe more) levels deep trying to test all combinations of inputs. They were extremely useful when untangling the existing spaghetti code with low test coverage but touching them months later is painful.
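To illustrate that failure mode, here is a contrived sketch (not taken from any real file) of lets overridden across nested contexts:

describe PlayerCreator do
  let(:nick) { 'Alice' }
  let(:params) { { player: { nick: nick } } }

  context 'with an admin' do
    let(:current_user) { build_stubbed(:admin) }

    context 'when the nick contains whitespace' do
      let(:nick) { 'Al ice' } # silently overrides the outer let

      it 'fails' do
        # to know what params end up being, the reader has to walk all three levels
        result = PlayerCreator.(current_user: current_user, params: params)
        expect(result).not_to be_success
      end
    end
  end
end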
Less nesting has one weird technical benefit - it leaves you more characters per line for the test code. Most projects have some limit on line length. The lower it is, the more you feel every level of indentation you sacrifice to nesting.
Setup data
The test needs associated records. The bad test sets up this data in a before block and stores it in a bunch of instance variables referenced by the tests themselves. While instance variables are an extreme approach which might outrage many Rubyists, using the aforementioned lets might fly under the radar:
let(:c) { create :customer }
let(:s) { create :sport, customer: c }
let(:t) { create :team, customer: c }
let(:current_user) { build_stubbed(:admin) }
let(:params) do
  {
    player: {
      sport_ids: [s.id],
      team_id: t.id,
      nick: 'Alice',
    }
  }
end
It is basically the same thing as instance variables but with extra visual clutter, slightly worse support in IDEs and editors (e.g. jump to definition) and a promise that “it won’t even get run unless it’s needed”.
Tangent 1: As mentioned in the previous section, lets and nested contexts tend to make tests harder to read. This example is short and not deeply nested so it’s just more typing and visual clutter but I still prefer not having to deal with it.
Tangent 2: The promise that lets are evaluated only when used by the actual test is used as an argument for lets being cheaper than a helper function: if the test no longer needs that value, the test gets faster. That might be true but as a reader of the test, I’m going to wonder why it’s in there. What needs it? It leads to dead code which IDEs have a much harder time pointing out for you because it’s a DSL, not plain Ruby.
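A contrived illustration of the difference - the same unused setup expressed both ways (names made up):

# Never referenced by any test, yet hidden behind the let DSL,
# so tooling is unlikely to flag it as dead code.
let(:legacy_team) { create :team }

# The same setup as a plain helper method is ordinary Ruby,
# which editors and "unused method" checks have a much easier time reporting.
def legacy_team
  create :team
end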
What the bad test achieves is separating the test setup from the code execution and assertions. That’s great! As long as there are only a handful of tests sharing the setup. When you need to understand test number five or later, you have to figure out that the setup lives outside the test and then scroll up and down between the shared setup and the test itself to grasp the interaction.
The good test moves the test setup into a helper method. The method lives within the test file and is intended to be very specific to that file. Its advantage is that it’s a normal method call that can be followed with any editor. If it becomes unused, the tooling might highlight it. It also doesn’t change with every layer of nesting; you simply get what you see.
Setup, execute, assert
A commonly advised test structure splits a test into three steps: setup, execute, assert. The bad test follows the order but could make it clearer with blank lines between the steps. The good test kinda merges setup and execution. Personally, I don’t mind because it’s so simple. However, a more complex test might benefit from being a bit more explicit:
player_params = player_attributes(nick: " Alice\n")

result = PlayerCreator.(
  current_user: build_stubbed(:admin),
  params: player_params,
)

expect(result).to be_success
expect(result[:model].nick).to eq 'Alice'
Assertions
The first bad test case is called should allow leading and trailing whitespace in nick and then asserts expect(result).to be_success. Apart from the wording issues described earlier, the test case name and the assertion match. However, the code actually strips the whitespace before creating the player. It’s a kinda trivial transformation but why not test and document it?
The good test case is called strips off leading and trailing whitespace(s) from the nick and succeeds and adds the assertion expect(result[:model].nick).to eq 'Alice'. It describes the transformation and tests it as well.
Most developers who have heard about testing have heard about a single assertion per test and might start splitting the test in two in order to have a single expect call per test. They can knock themselves out, as long as we don’t work in the same codebase. I’m not saying “stuff as many assertions as possible into a single test” but I see several legitimate reasons to have more than one expect call per test.
A pragmatic reason is performance. Each test adds to the suite execution time and slow suites are a drag. Complex software needs large suites with many tests but let’s not add unnecessary tests because of a misunderstanding of a recommendation.
Another reason is the overlooked distinction between a single assertion and a single expect call. Often, people take them to mean the same thing but consider the built-in RSpec matcher have_attributes:
expect(person).to have_attributes(name: 'Tiff', age: 32)
It is a single expect call but technically, it compares name and age. I have never heard of anyone objecting to a test using have_attributes and trying to rewrite it into two (or more) tests. Let’s test that the attributes have changed and consider it an atomic operation for the purposes of testing.1
The example test expects:

expect(result).to be_success
expect(result[:model].nick).to eq 'Alice'
One can argue that a success and an attribute of a model are distinct enough to not fit under the umbrella of a single assertion.2 Actually, this is where some familiarity with other languages and paradigms comes in handy. Let’s imagine an equivalent expectation in Elm:
playerParams [ ( Name, " Alice " ) ]
    |> createPlayer
    |> Result.map (.model >> .nick)
    |> Expect.equal (Ok "Alice")
The result of the Elm function equivalent to PlayerCreator would be a Result which can be one of the values Ok or Err, each carrying data relevant only for that result. It clearly is a single expectation which encompasses both pass/fail and the actual value/error. Ruby doesn’t allow us to express the result via types in a similar way but the underlying behaviour is the same.3
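For the curious, footnote 3 mentions the dry-monads gem; a rough Ruby analogue of the Elm pipeline could look something like this (a sketch only - it assumes PlayerCreator returned a Dry::Monads::Result, which the code in this post does not):

require 'dry/monads'

describe PlayerCreator do
  include Dry::Monads[:result] # provides the Success()/Failure() constructors

  it 'strips off leading and trailing whitespace from the nick' do
    result = PlayerCreator.(
      current_user: build_stubbed(:admin),
      params: player_attributes(nick: " Alice\n"),
    )

    # fmap transforms the value inside a Success and leaves a Failure untouched,
    # so a single expectation covers both the pass/fail and the resulting value.
    expect(result.fmap { |payload| payload[:model].nick }).to eq Success('Alice')
  end
end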
Focus
A rule of thumb for structuring code I like is: each function should have a single level of abstraction. A measurable approximation of this abstract concept is the number of nesting levels: one layer of if/case/switch/loop per function. Both versions of the test are similar in this regard - they don’t have any conditionals (as most tests don’t) or any loops. However, the good test has a slightly more complex call of the code under test:
result = PlayerCreator.(
  current_user: build_stubbed(:admin),
  params: player_attributes(nick: "Al ice"),
)
It’s not terrible and pretty standard RSpec so I consider it good enough. In a more complex test though, I might write another helper (a possible shape of which is sketched after the list below) to get to:
params = player_attributes nick: 'Al ice'
result = create_player params
expect_error(
  result,
  :nick,
  'Spaces, new lines and other whitespace characters are prohibited'
)
This matches the setup-execute-assert structure to three function calls. I don’t consider it a necessity but the advantage is that the reader’s focus is pulled directly towards the important things in the test:
- key inputs of the code under test,
- execution of the code without all the implementational noise and
- assertion about the result
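For completeness, here is one possible shape of that create_player helper (a sketch; defaulting current_user to a stubbed admin is my assumption, not something from the original test):

def create_player(params, current_user: build_stubbed(:admin))
  # Hypothetical helper wrapping the "execute" step of the test;
  # it hides the arguments that don't matter for the test case at hand.
  PlayerCreator.(current_user: current_user, params: params)
end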
Even the most tired reader will grasp what is important in this test without being bogged down in the ceremony of calling a method with many/complex arguments and processing complex types. If they need more detail, they can navigate to the part they need to dig into. The longer you stick with a codebase, the more you appreciate its readability and the readability of its test suite.
These are by no means all the properties of good tests as I value them. Being a human, my opinions are shaped by my experiences and keep evolving; this post is just a snapshot. You might have a different perspective or different reasons for the same approach - I’m not saying my way is the only right way and I’m happy to hear about your testing philosophy and the reasoning behind it. By explaining my approach, I hope to help developers at an earlier point on their path and save them from running in some unpleasant directions.
1. There’s a non-trivial number of developers who might try to reconcile the difference by using or writing complex matchers. I don’t find the RSpec API for complex matchers easy to navigate but if it’s your jam, go ahead.
2. You could get down to a single expect call if you added some implicit assumptions about the code: expect(Player.find_by(name: 'Alice')).to exist. The code assumes that (a) there is no other record matching that criteria and (b) the record being persisted implies success.
3. There’s the dry-monads gem implementing precisely this behaviour.