What Over-Mocking Revealed to Me

Recently, I took the idea of “units” in code to the extreme. After researching several programming methodologies and weighing their advantages and disadvantages, I found myself very keen on the “unit” methodology: every function and method is its own functional “unit”, with its own unit tests and functionality. All of these functional units should be “unit-testable”, meaning that they should not rely on a long list of other functions being called first. From my understanding, code like this also fits the Clean Code methodology, although I have not read the hailed guide.

In making code that follows the “unit” methodology, one simply assumes that all other functions and methods besides the one in question work properly. That is, there should be no unexpected bugs or kinks within them. They are assumed to have complete test coverage and no special failure cases. Thus, when testing units of code, one can safely mock out all external functions (even those from within the same class, module, or project) and ensure that the code still follows the intended flow-of-logic.

When one attempts to write these tests, however, one quickly notices, as I did, that the tests no longer check specific input and output; only the code-flow is being tested. That is, the only thing being ensured is that the expected lines are executed. It is good to confirm that code flows as intended, but such a test doesn’t actually cover any specific corner cases or verify that the output is as expected (a big problem).
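To make this concrete, here is a minimal sketch of a fully mocked test using Python's unittest.mock. The Invoice class and its methods are hypothetical, invented purely for illustration. Notice that the test would pass no matter what subtotal() and tax() really compute, because it only confirms that total() calls them and combines their results.

```python
from unittest import mock


class Invoice:
    """Hypothetical class, used only to illustrate a flow-only test."""

    def __init__(self, prices):
        self.prices = prices

    def subtotal(self):
        return sum(self.prices)

    def tax(self, amount):
        # Assumed flat 10% tax, rounded down to keep the arithmetic exact.
        return amount // 10

    def total(self):
        amount = self.subtotal()
        return amount + self.tax(amount)


def test_total_flow():
    invoice = Invoice([10, 20])
    # Mock out the sibling methods so that only total()'s own logic runs.
    with mock.patch.object(Invoice, "subtotal", return_value=100) as subtotal, \
         mock.patch.object(Invoice, "tax", return_value=7) as tax:
        result = invoice.total()

    # These assertions check the flow: which calls were made, with what.
    subtotal.assert_called_once_with()
    tax.assert_called_once_with(100)
    # The "output" checked here is built entirely from mocked values.
    assert result == 107


test_total_flow()
```

The values 100 and 7 are arbitrary stand-ins; nothing in this test says whether a real invoice is ever totalled correctly.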

If there exists a test where all references are mocked and only code-flow is tested, it must then follow that there is another test that tests input and output, ensuring that output is as expected and that certain input generates the proper output. After all, these are the important tests that ensure that the user will not be surprised when they provide a string as a parameter to a multiply method.
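As a rough sketch of such an input/output test, assume a hypothetical multiply function whose specification says it accepts only numbers. Both the function and that specification are my own assumptions for the sake of the example. The tests never look inside the function; they only pair inputs with expected results.

```python
def multiply(a, b):
    """Hypothetical function, assumed to accept only ints and floats."""
    for value in (a, b):
        # bool is a subclass of int in Python, so it is excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError("multiply() expects numbers, got %r" % (value,))
    return a * b


def test_multiply_numbers():
    assert multiply(3, 4) == 12
    assert multiply(-2, 5) == -10
    assert multiply(0, 7) == 0


def test_multiply_rejects_strings():
    # Without the type check, "3" * 4 would quietly evaluate to "3333".
    try:
        multiply("3", 4)
    except TypeError:
        pass
    else:
        raise AssertionError("expected TypeError for a string argument")


test_multiply_numbers()
test_multiply_rejects_strings()
```

Whether a string should raise an error or be converted is a design decision; the point is only that the test pins the decision down from the outside.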

However, it seems extremely tedious to think of and write two types of tests for every unit. Not only does the programmer need to think of proper ways to mock references, but he also has to think about possible corner-cases that may break his code. A StackOverflow post [1] and long sessions of thinking allowed me to come to a conclusion.

Since the tests where references are mocked require knowledge of the actual code, these are called white-box tests, meaning that it is easy to see what goes on in the “box” – the unit of code. The tests where only input and output are tested are known as black-box tests, because the test-writer shouldn’t care what goes on in the box, only that certain input results in certain output.

Given the requirements of both white-box tests and black-box tests, it is easy to see who should be writing the tests. White-box tests should be written by the developer himself at the time of writing code. These tests ensure that no variable is used where a different one was intended, that all necessary code executes, and that nothing is left out. The creation of these white-box tests also gets the developer to think about possible problematic inputs.

When the white-box test-assisted code is complete, the code is then given to a quality engineer, who writes the black-box tests to ensure that all inputs, no matter how wacky, generate expected results. This ensures that the end-user (whether it be other developers, clients, or simply other functions within the same module) doesn’t get stuck on any unexpected behavior. The quality engineer is the perfect person to write these tests, as he doesn’t know how the code works on a technical level, only what it is supposed to do and how it should react to certain inputs.

This makes the idea of functional “units” a bit more understandable. Someone who knows the code should write tests to ensure that the flow is as intended, and someone unaffiliated should make sure that each input produces the expected output. Of course, on a single-developer team, both jobs are for that single developer.

With that being said, white-box tests are not always necessary. If a method is simple enough, as in get_first_elem_of_array(int* arr) -> int, it doesn’t need to have a white-box test associated with it. It is easy to see that the code should function as required. However, if a function is more complicated, a white-box test should be written.

White-box tests are something special, however. Since they are written based on the specific flow of code possessed by a functional unit, the test’s passing is entirely reliant on the code that was in place at the time of writing the test. If the code in the function is changed, the test will likely fail. This may strike some as a bad thing; however, it forces the developer to design easily-testable code, even when making just a small change. After all, small changes can indeed break things, so small changes should be tested. As long as the same functionality is maintained, the black-box tests should not fail.
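The difference can be sketched with a hypothetical Report class that has two implementations of the same behavior: total_v1 stands in for the original code and total_v2 for a refactored version. All of these names are made up for illustration. The black-box check accepts both versions, while the white-box check, which expects a helper to be called once per word, only accepts the first.

```python
from unittest import mock


class Report:
    """Hypothetical class with two behaviorally identical implementations."""

    def __init__(self, words):
        self.words = words

    def _length(self, word):
        return len(word)

    def total_v1(self):
        # Original implementation: delegates to the helper.
        return sum(self._length(w) for w in self.words)

    def total_v2(self):
        # Refactored implementation: same result, helper no longer used.
        return sum(len(w) for w in self.words)


def black_box_passes(method_name):
    # Only input and output matter here.
    report = Report(["ab", "cde"])
    return getattr(report, method_name)() == 5


def white_box_passes(method_name):
    # The flow matters here: the helper must be called once per word.
    report = Report(["ab", "cde"])
    with mock.patch.object(Report, "_length", return_value=1) as helper:
        getattr(report, method_name)()
    return helper.call_count == 2


for name in ("total_v1", "total_v2"):
    print(name, "black-box:", black_box_passes(name),
          "white-box:", white_box_passes(name))
```

After the refactoring, the white-box check has to be revisited even though no behavior changed, which is exactly the prompt to re-examine the change that the paragraph above describes.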

I am applying this newfound understanding of functional units while working on PyCFramework, and so far, it has produced very high-quality, modular, extensible code. Although writing tests takes a large chunk of time, the process of writing tests has forced me to think about the design of my code, how it could be improved, and what mistakes I may have made while coding.

References
[1] https://stackoverflow.com/questions/32622040/python-unit-testing-should-other-classmethods-be-mocked/32624367?noredirect=1#comment53142597_32624367