14 Sept 2009

Testing Grails Apps With Selenium-RC

I mentioned in my last post that I'd started to think we were going about Selenium testing all wrong. For years I've been writing Selenium tests using the Selenium IDE, so the tests are in HTML files. When doing pre-commit testing and on CI the tests are run with the HtmlTestRunner. This technique works well enough - developing tests with the IDE is fast and they can easily be worked on and re-tested in the IDE until they are correct. However, some things have always bothered me, and Adam Goucher's thoughts on the Selenium Value Chain crystallised some of those suspicions.

So what are the big advantages of using Selenium RC over just the IDE and HtmlTestRunner?

Test design

Selenium tests written in HTML tend either to become amorphous blobs that don't test one simple thing, or to proliferate to the point of having hundreds of small test files. With Selenium RC you can write Selenium test classes with multiple test case methods just as you would unit tests. Each test case can be neat and self-contained and can share set-up with other test cases in the same class. Also, by abstracting common command sequences into utility methods you can start to develop a higher-level DSL for testing your application. In HTML tests this can be done by writing extension functions in JavaScript, but it's not as straightforward.
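As a rough sketch of what that can look like (the class name, the injected selenium property and the page URL are all made up for the example):

class ArticleListTests extends GroovyTestCase {

    def selenium // assume the test harness provides a started Selenium instance

    void testListShowsLatestArticlesHeading() {
        openListPage()
        assertTrue selenium.isTextPresent('Latest Articles')
    }

    void testListCanBeSortedByTitle() {
        openListPage()
        selenium.click('link=Title')
        selenium.waitForPageToLoad('30000')
        assertTrue selenium.isElementPresent('css=th.sorted')
    }

    // common command sequences abstracted into the beginnings of a DSL
    private void openListPage() {
        selenium.open('/article/list')
        selenium.waitForPageToLoad('30000')
    }
}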

Data fixtures

Data set-up is the biggest headache when writing Selenium tests in HTML. You can't drop out of selenese to set up a couple of domain objects, so you end up either trying to run all your tests against a monolithic data fixture (a Sisyphean task if any of your tests modify the data) or writing some sort of URL endpoints that can set up and tear down data. We've done the latter and it's resulted in the single most bloated, horrible, misused bit of code in our app. Fixtures get re-used by other tests because they set up approximately appropriate data - until some change requires a change in the fixture and suddenly a bunch of unrelated tests fail. Selenium RC tests don't have this problem; you're in Groovy-land, so you can access domain classes directly and use the invaluable build-test-data plugin to set up data.
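For example, something like this becomes possible (Article is a hypothetical domain class; build comes from the build-test-data plugin):

void testListShowsNewlyCreatedArticle() {
    // plain Groovy data set-up, no fixture endpoints required
    def article = Article.build(title: 'Selenium RC rocks')

    selenium.open('/article/list')
    selenium.waitForPageToLoad('30000')

    assertTrue selenium.isTextPresent(article.title)
}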

Data verification

I mentioned previously the anti-pattern of false moniker testing, and HTML-based Selenium tests are terrible for this. If you have a test that creates some data, how do you verify it has done so correctly? The only way is to scrape the data on another screen - e.g. a CRUD list or show view. Ugh! With Selenium RC you can read the data from the database with standard Grails domain classes and verify your expectations directly.
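A rough sketch, again with a hypothetical Article class:

void testCreateActionSavesArticle() {
    selenium.open('/article/create')
    selenium.waitForPageToLoad('30000')
    selenium.type('title', 'A brand new article')
    selenium.click('css=input[type=submit]')
    selenium.waitForPageToLoad('30000')

    // verify against the database directly instead of scraping a list screen
    assertNotNull Article.findByTitle('A brand new article')
}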

I started experimenting over the weekend with getting Selenium RC tests written in Groovy to run using grails test-app -functional. It turns out to be fairly straightforward and I've put some alpha code up on GitHub.

11 Sept 2009

Thoughts on Testing

These are some of the practices I try to live by when writing test coverage. Some of the detail may be Grails-centric since that's been my development platform of choice for the last couple of years, but the gist of the points is generally applicable.


I've been thinking about posting something like this for quite a while and was pushed over the edge by reading some interesting things...



Make just one (logical) assertion per test

The prime attribute of a good test case is legibility. When my change breaks your test I don't want to be scratching my head trying to figure out what it is the test is trying to prove. The best way to achieve legibility is to have each test case test just one thing. Sure, it may take more than one actual assert statement to test it properly. For example, to test that a query returns correct values you may want to assert that the return value is non-null, that it is a collection with the correct size and that each element of the collection complies with the query parameters. Those are separate assertion statements but in the service of a single logical assertion.
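Something like this, say (Asset is a made-up domain class and build comes from the build-test-data plugin):

void testFindAllLiveReturnsOnlyLiveAssets() {
    2.times { Asset.build(state: 'live') }
    Asset.build(state: 'draft')

    def results = Asset.findAllByState('live')

    // three assert statements in service of one logical assertion:
    // "the query returns exactly the live assets"
    assertNotNull results
    assertEquals 2, results.size()
    assertTrue results.every { it.state == 'live' }
}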

One of the worst habits you can get into when adding new functionality is to just bung a few more asserts into an existing test case. Imagine we're testing a list action on a Grails controller class. There's an existing test case to confirm that the list of instances in the model is correct. You implement a new requirement to load some sidebar content for the page by adding some new stuff to the model (leaving aside the fact that there are better ways to achieve this). This change should be tested with a separate set of test cases from those verifying the list model. That may mean some duplication, with several test cases invoking the same controller action with the same parameters, but the trade-off is that each test case is self-contained, legible and will not break due to further changes that aren't related to the functionality the test is supposed to be verifying.
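A sketch of the shape I mean (the controller, action and model keys are illustrative):

void testListModelContainsCorrectInstances() {
    def model = controller.list()
    assertTrue model.articleInstanceList.every { it.state == 'live' }
}

void testListModelContainsSidebarContent() {
    // duplicates the invocation above, but stays self-contained and legible
    def model = controller.list()
    assertNotNull model.sidebarContent
}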



Name test cases according to what they are testing

This sounds obvious, but how often do you come across a test case named something like testSaveArticle that gives no clue what aspect of saving an article it is testing? A good test case name is verbose, but only as verbose as it needs to be; names like testListReturnsEmptyListWhenNoResultsFound, testListOnlyReturnsLiveAssets and testUserIsRedirectedToLoginFormIfNotLoggedIn tell me what I have broken when my changes cause those tests to fail. On the other hand, if a test case name is too long or contains a lot of 'and's and 'or's it's probably a sign that the test is doing too much and not making a single logical assertion.



Avoid obliquely named 'helper' methods

Few things bug me more when trying to decipher other people's tests than helper methods whose names either don't properly describe what they do or that have a raft of side effects. These things make tests harder to read, not easier! Names like initMocks (to do what?), createArticles (how many? anything special about them? what behaviour are they trying to drive out?) or checkModel (what about it? how reusable is this?) require the person maintaining your test cases to find the implementation and figure out what it's doing.

Worse, such methods have a tendency to accumulate complex parameters over time (often with defaults designed to minimise impact on existing code at the expense of making any sense) as they get re-used in new test cases that require them to do slightly different things.



Avoid the test-per-method anti-pattern

The 'test-per-method' anti-pattern is characterised by test cases named testMethodNameX that attempt to test all the paths through a given API method in a single test case (or worse, test only the happy path). A lot of people start out writing unit tests this way, probably because a lot of unit testing tutorials are written this way.

The problem is that this pattern bundles a bunch of assertions into a single test; if the earlier ones fail, the later ones won't even run, meaning you can waste time uncovering layers of broken functionality as you work your way through the failures and re-run the test.

Legibility suffers too, as it may not be obvious at a glance what edge cases, boundary conditions and side-effects are covered and what the expectations for them are. If any kind of state has to be reset between different calls to a method, then you will end up writing code that you simply wouldn't need if you separated things out into multiple test cases.



Test edge cases and variant behaviour

Tests that only cover the happy path give you false confidence. Coverage should include things like...

  • What happens when null is passed to this method?
  • What is returned when a query finds no results?
  • What happens when you can't connect to that web service or the connection times out?
  • What happens when a user submits a form with incorrect data?
  • What happens when someone figures out your URL scheme includes domain instance ids and starts crafting their own URLs to retrieve records that don't exist?
  • What happens when someone figures out your URL includes a maxResults parameter and crafts a URL to request 10^100 results?

TDD is your friend here; if you don't write the code until you've got a test for it, it's a lot harder to miss coverage of edge cases. Similarly, the first thing to do when a bug is uncovered is to write a failing test that reproduces it, then fix it.
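A couple of those cases sketched out (the Asset class and articleService are made up for the example):

void testFindAllLiveReturnsEmptyListWhenNoResultsFound() {
    def results = Asset.findAllByState('live')
    assertNotNull results // an empty list, never null
    assertTrue results.isEmpty()
}

void testSummarizeCopesWithNullArticle() {
    // the method under test should handle null input gracefully
    assertNull articleService.summarize(null)
}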



Test the thing you care about not a signifier

Max introduced me to the anti-pattern of false moniker testing and it's really changed the way I write assertions. It's an easy trap to fall into. Imagine you're testing that a query only returns 'live' records. Do you set up some live and some non-live records with particular names and then test that the names you expect show up in the results, or do you set up the data then assert that all the results actually are 'live' records? The former is terribly error-prone (you assume you've set the data up correctly) and much less legible (which records are we expecting? why that one in particular?).

Groovy's every iterator method is a godsend here, enabling you to write code like:

def results = MyDomain.findAllLiveRecords()
// guard against a vacuous pass on an empty result set
assertFalse results.isEmpty()
// assert on the property we actually care about, not on record names
assertTrue results.every { it.state == 'live' }


Avoid the monolithic setUp method

Just as it's bad to overload individual test cases with too many assertions, it quickly becomes a problem when your setUp method tries to set up data for too many test cases. Different test cases will probably require slightly different data to drive them, which results in setUp trying to be all things to all cases or doing things that only a subset of the test cases actually need. As new functionality is developed the setUp becomes a black hole, accumulating more and more mass and becoming less legible and harder to refactor. The cohesion of the individual tests is harmed as well, since it's hard to tell at a glance what data exists when an individual test case is run.

Ideally setUp should only be used for bootstrapping things that are truly common between all test cases and the test cases themselves should be responsible for setting up data to drive the test case's assertions.

Really bad examples of this problem will have setUp create a bunch of data and then some of the test cases actually delete it again before doing anything!

The worst example I've ever seen was from a project I worked on where the entire integration suite used a single DBUnit fixture! It was next to impossible to figure out what data would be returned in particular scenarios, impossible to change anything in the fixture without breaking a raft of other tests, and the setUp and tearDown phases were so slow that the integration suite took 3 1/2 hours to run!
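A sketch of the shape I aim for instead (all names illustrative; build comes from the build-test-data plugin):

class ArticleQueryTests extends GroovyTestCase {

    def site

    void setUp() {
        super.setUp()
        // bootstrap only what is genuinely common to every test case
        site = Site.build(name: 'default')
    }

    void testFindAllPublishedIgnoresDrafts() {
        // the data driving this test's assertions lives in the test itself
        Article.build(site: site, state: 'published')
        Article.build(site: site, state: 'draft')

        def results = Article.findAllByState('published')
        assertTrue results.every { it.state == 'published' }
    }
}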



Keep data fixtures simple

On a related note, it's worth striving to keep test data fixtures as simple as possible. For example, if a test has to wire up a bunch of properties and sub-graphs on domain object instances to satisfy constraints, it can be really hard to see the wood for the trees when trying to figure out what data the test actually cares about.

When testing Grails apps the build-test-data plugin is a fantastic tool here. It allows sensible defaults to be configured so that tests only need to specify the properties they need to drive their assertions. With well thought out defaults in TestDataConfig.groovy the resulting test code can be very simple.
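For example, with defaults along these lines (the package and property names are mine) a test case only has to mention the properties it actually cares about:

// grails-app/conf/TestDataConfig.groovy
testDataConfig {
    sampleData {
        'com.example.Article' {
            title = 'default title'
            state = 'live'
        }
    }
}

// in a test case, only the property under test needs specifying:
def draft = Article.build(state: 'draft')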

In unit tests the situation is easier. Since domain objects don't need to pass constraints to simulate persistent instances via mockDomain, you don't need to jump through hoops setting properties and wiring up sub-graphs; test cases can set up simple skeleton objects that contain just enough state to drive the functionality and assertions the test case is making.
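For example, a sketch using the mockDomain support (class names are made up):

class ArticleServiceUnitTests extends grails.test.GrailsUnitTestCase {

    void testOnlyLiveArticlesAreListed() {
        // skeleton objects: just the state the assertion needs, with no
        // constraint-satisfying boilerplate since constraints never fire
        mockDomain(Article, [
            new Article(state: 'live'),
            new Article(state: 'draft')
        ])

        def results = Article.findAllByState('live')
        assertEquals 1, results.size()
    }
}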



Test at the appropriate level

Slow-running integration tests that are not testing anything that couldn't be adequately tested with unit tests are just wasting time. In Grails pre-1.1 this was a real issue, as the enhancements that Grails makes to artefacts (domain classes, controllers, etc.) were not available in unit tests, so integration tests were the only practical way to test Grails artefacts. Now, with the unit testing support found in the grails.test package, this is much less of an issue.

Integration tests definitely have their place but, I believe, should be used sparingly for things that a unit test cannot test and not just written as a matter of course. In Grails apps, for example, criteria queries and webflows cannot be tested with unit tests. I would also argue that there is value in integration testing things like complex queries or transactional behaviour where the amount of mocking required to get a unit test to work would result in you simply testing your own assumptions or expectations.

Do...

  • Write integration tests that add coverage that unit tests cannot.
  • Learn about the Grails unit testing support and understand the differences between unit and integration tests.

Don't...

  • Write an integration test when you should be writing a unit test or functional test.
  • Write integration tests that re-cover what is already adequately covered by unit tests.

I find that integration test classes named for a single class (e.g. if I have a SecurityController and an integration test called SecurityControllerTests) are a bad code smell worth looking out for. They often turn out to be integration tests masquerading as unit tests.

Refactoring Grails integration tests into unit tests doesn't (usually) take long, speeds up your build and, I find, often results in up to 50% less test code with the same amount of coverage.



Write functional test coverage

Okay, so your unit test verifies that your controller's list action will accept a sort parameter and return query results appropriately, but it can't test that the associated view renders a column-heading link that sends that param, or that clicking on it again reverses the order. When you start developing rich functionality on a web front-end, a back-end unit test isn't going to do you a lot of good either.

Grails' functional testing plugin is a pretty good tool although for rich interaction testing I'd rather use Selenium.

Discussion of practices for these kinds of tests really justifies a whole other post.

7 Sept 2009

Session Bound Temporary Files

I've just released Session Temp Files, a very simple plugin for managing a temporary file storage space bound to the HTTP session. I'm working on something where a user uploads a file at the start of a webflow conversation; the flow then validates the file and may require the user to enter extra information before the file is finally saved and an associated record added to the database. The uploaded files need to be stored somewhere between the initial upload phase and the successful outcome of the flow, when they will be copied into a permanent storage space. The file is really too big to keep in memory in the flow scope. It's easy enough to create a temp file and delete it at the end of the flow (even if the user cancels out of the flow); however, users can also do things like close the browser or suddenly decide to do something different and type in a new URL. The webflow then gets abandoned in an intermediate state. The flow itself will be destroyed when the HTTP session ends, but the temp files I created would hang around until the OS decides to sweep the temp directory.

The plugin simply allows you to create a directory within the normal temp directory - System.properties."java.io.tmpdir" - that will get deleted when the HTTP session expires. It binds two new methods onto HttpSession: getTempDir(), which returns the session's temp directory, creating it if it doesn't exist, and createTempFile(prefix, suffix), which works like File.createTempFile except that the file is created inside the directory returned by getTempDir.
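Assuming a multipart file upload, using the plugin from a controller might look something like this (the controller, action and parameter names are made up for the example):

class UploadController {
    def upload = {
        def uploaded = request.getFile('file') // multipart upload
        def tempFile = session.createTempFile('upload-', '.dat')
        uploaded.transferTo(tempFile)
        // tempFile lives under session.tempDir and will be cleaned up
        // automatically when the HTTP session expires
        render tempFile.absolutePath
    }
}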

The code is up on GitHub and the plugin is available from the standard Grails plugin repository via grails install-plugin session-temp-files.