Learning from Mistakes

Recently our Android app; Open Secret Santa received a user review that was actually useful.

That in of itself is probably newsworthy, if you’ve ever had the misfortune of attempting to track down bugs from single sentence problem descriptions containing nothing more than some vague indication that something didn’t work and a brief treatise on how your app sucks, then you’ll understand what getting value from user reviews on Google Play is like.

In this case, the review provided a complete description of the failure scenario, which in lead in turn to the uncovering of a bug, how testing was inadequate and a rethink about how I should have written some of our code to make it more testable in the first place.

As explained in my previous post, the Draw Engine is a library I wrote that is responsible for creating Secret Santa draws using a mapping between participants and the other participants to whom they restricted from giving. Refer to the Draw Engine source code on GitHub.

The review reported -

The program looks good, but when I tried to use this for our family it said it couldn’t draw.

Two married grandparents, two daughters and sons-in-law and two children for each daughter. I restricted spouses from giving to each other. I also restricted the grand children from giving to their siblings or their own parents and the middle generation parents from giving to their own children.

Maybe a tough challenge, but everyone should have had at least six people they could give to.

The reviewer seemed to have a good point and given they’d put all this detail into a review I owed it to them to check it out. I tried the scenario using the app on my phone and it worked, hmm. I then wrote a quick unit test to verify the scenario programmatically in the Draw Engine in the hope that I could eliminate the possibility of a faulty Draw Engine and isolate the bug further. That test also passed. Hmm…

Then… I ran the “same” test again and it failed. My attention was then drawn to the code inside the Draw Engine that shuffles the members to faciliate redraws being in a new order.

Collections.shuffle(randomMembers, new Random());

It was then pretty obvious that the algorithm was failing due to the variation in the order the nodes were being visited, which was being determined by the Random object.

The code in the BasicDrawEngine was updated, but I was still surprised that I’d managed to miss the problem initially. After all, this was the part of the app I was most confident in! But, it was clear that I wasn’t actually able to exercise all the states and path of the algorithm by simply varying the input. I hadn’t and couldn’t test it properly.

It’s now reasonably obvious that the DrawEngine shouldn’t be responsible for the rerandomisation of the members. This should be performed externally – by the component that actually knows that it’s a redraw (ie in my Android app code). By doing this and providing the randomisation element to the DrawEngine – either explicitly (as a new Random() instance) or via the randomising of the members list order itself then the code can be tested more easily as the test can now better control the execution paths that were previously at the mercy of the Random object hidden deep inside the engine.

Interestingly enough, I recently stumbled upon an old-ish but relevant article that explains the principles I’ve just described, especially in the context of testability. The article (and my experiences!) certainly gave me a better understanding on how my method signature choices failed me and how designing for test pays off.

Most importantly; I can say that a Google Play review actually managed to help a developer find a problem – surely that’s a first!


ps- You can track the testing refactor here.