Today I had a discussion about the representation of test data. Using the system functions will achieve a 100% complete representation of the objects we want to test. But will we achieve the same if we use a separate test data-loading tool, which will insert the records in the database directly?
Basically there are 3 ways to build your test data:
- Through system functions
- Through separate test data loading tools (via SQL for instance)
- Using production data (if you’re working on a migration project)
Personally I don’t like to use production data. It can introduce a lot of fuzziness into the new system, and I’ve seen a lot of extra testing time wasted because of that. It can still be useful, though, to replay a situation from production and see whether the new system improves or corrects it.
Here are some of the pros and cons of using system functions and data loading tools.
| | Through system functions | Through load tool |
|---|---|---|
| Pros | | |
| Cons | | |
Until now I’ve been using a mix of loading tools and system functions during testing. When we need to load a large amount of data into the database to set up a shared initial situation, we use the loading tools. To test the situations themselves (like updating and reading objects) we use the system functions. But be careful! Maintaining this to-be-imported data can be a hell of a job. Changing one of the values in the batch can break other tests, so someone should gatekeep these changes in a structured way.
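A minimal sketch of that mix, assuming an in-memory SQLite database. The `customers` table, the baseline rows and the functions `load_baseline`, `block_customer` and `get_status` are all made up for illustration; in a real project the system functions would be the application's own API or UI actions.

```python
import sqlite3

# Hypothetical shared baseline; every test that reads it depends on these values.
BASELINE_CUSTOMERS = [
    (1, "Alice", "active"),
    (2, "Bob", "active"),
    (3, "Carol", "blocked"),
]


def load_baseline(conn):
    """The 'loading tool': inserts records directly, bypassing the system."""
    conn.execute(
        "CREATE TABLE customers ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, status TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", BASELINE_CUSTOMERS)
    conn.commit()


def block_customer(conn, customer_id):
    """Stand-in for a system function that updates an object."""
    cur = conn.execute(
        "UPDATE customers SET status = 'blocked' WHERE id = ?", (customer_id,)
    )
    if cur.rowcount != 1:
        raise LookupError(f"no customer with id {customer_id}")
    conn.commit()


def get_status(conn, customer_id):
    """Stand-in for a system function that reads an object."""
    row = conn.execute(
        "SELECT status FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
load_baseline(conn)      # bulk initial situation via the loading tool
block_customer(conn, 2)  # the situation under test via a system function
```

Note that a test expecting Bob to start as `active` breaks as soon as someone edits that row in `BASELINE_CUSTOMERS`, which is exactly why the baseline needs a gatekeeper.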
For most of the test cases I like to generate the data through system functions. This also makes sure that the system creates unique instances of test objects, so they won’t influence other tests running at the same time. Sometimes I reuse the generated data in a script to test multiple variations of a situation. After every completed test I simply remove all the data the test created, so it won’t hurt any of the other tests.
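One way to sketch this in Python is a small helper that creates data through a system function, remembers what it created, and removes it when the test finishes. Everything here is an assumption for illustration: the `orders` table, the `create_order` and `delete_order` stand-ins for system functions, and the `TestData` helper itself.

```python
import sqlite3
import uuid


def create_order(conn, product):
    """Stand-in for a system function; returns a unique order reference."""
    reference = f"order-{uuid.uuid4().hex}"
    conn.execute(
        "INSERT INTO orders (reference, product) VALUES (?, ?)",
        (reference, product),
    )
    conn.commit()
    return reference


def delete_order(conn, reference):
    """Stand-in for the system function that removes an order."""
    conn.execute("DELETE FROM orders WHERE reference = ?", (reference,))
    conn.commit()


def count_orders(conn):
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


class TestData:
    """Tracks everything a test creates so it can be removed afterwards."""

    def __init__(self, conn):
        self.conn = conn
        self.created = []

    def __enter__(self):
        return self

    def order(self, product):
        reference = create_order(self.conn, product)
        self.created.append(reference)
        return reference

    def __exit__(self, exc_type, exc, tb):
        # Clean up in reverse creation order, even if the test failed.
        for reference in reversed(self.created):
            delete_order(self.conn, reference)
        self.created.clear()
        return False  # never swallow the test's own exception


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (reference TEXT PRIMARY KEY, product TEXT NOT NULL)")

with TestData(conn) as data:
    first = data.order("book")
    second = data.order("book")  # same input, still a unique instance
    during = count_orders(conn)
after = count_orders(conn)
```

Because each reference is generated at creation time, two tests running side by side never fight over the same record, and the cleanup only touches what this test created.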
From my point of view it’s better to create test data on the spot. And if you can cluster some of the test data, organize your tests by feature (or User Story). When a test tells you something is broken, you can simply investigate that test (or set of tests), which contains the data it needed. If the test data needs to be changed, you won’t need to worry about other tests breaking.








2 comments so far
Hi Kishen, hope you are well.
Something to consider perhaps…
In such systems, you are often testing “Data in flow” and “Data out flow”. That is to say, “If I press these buttons and enter these values, I should end up with this dataset”, and conversely, “If I start with this dataset, my UI should display these widgets with these values”.
My suggestion would be to make the dataset the same in both cases. So in the first instance, the dataset becomes your test verification tool (“Ok, I have automated the following actions, database should now match my dataset file”), and in the second instance your bootstrap (“Ok, I have inserted my dataset, now my widgets should be correct”).
By doing this, you give yourself the option of testing data in/data out separately, as above, or combining the two tests into a full integration test where the data out test is performed immediately after running the data in test. Effectively it means you can alternate between your two approaches at will, and receive both sets of ‘pros’ with very little effort, and eliminate the ‘cons’ by having both options available ‘at the flick of a switch’.
Developers are working towards the full end to end, but QA can be performed on individual functions up to that ramp up…
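The shared-dataset idea above could look roughly like this in Python. It is only a sketch: the `users` table, the `DATASET` contents, and the functions `register_user` (data in) and `list_users` (data out) are hypothetical stand-ins for real system functions.

```python
import sqlite3

# Hypothetical dataset, used both as expected result and as bootstrap input.
DATASET = [
    {"name": "Alice", "email": "alice@example.com"},
    {"name": "Bob", "email": "bob@example.com"},
]


def new_database():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT PRIMARY KEY, email TEXT NOT NULL)")
    return conn


def register_user(conn, name, email):
    """Stand-in for the 'data in' system function (e.g. submitting a form)."""
    conn.execute("INSERT INTO users (name, email) VALUES (?, ?)", (name, email))
    conn.commit()


def list_users(conn):
    """Stand-in for the 'data out' system function (what the UI would show)."""
    rows = conn.execute("SELECT name, email FROM users ORDER BY name").fetchall()
    return [{"name": name, "email": email} for name, email in rows]


def insert_dataset(conn, dataset):
    """Bootstrap: insert the dataset directly, bypassing the system."""
    conn.executemany(
        "INSERT INTO users (name, email) VALUES (:name, :email)", dataset
    )
    conn.commit()


def dump_table(conn):
    """Verification: read the table directly, bypassing the system."""
    rows = conn.execute("SELECT name, email FROM users ORDER BY name").fetchall()
    return [{"name": name, "email": email} for name, email in rows]


def check_data_in():
    # Actions through the system, dataset as the verification tool.
    conn = new_database()
    for user in DATASET:
        register_user(conn, **user)
    return dump_table(conn) == DATASET


def check_data_out():
    # Dataset as the bootstrap, output checked through the system.
    conn = new_database()
    insert_dataset(conn, DATASET)
    return list_users(conn) == DATASET


def check_end_to_end():
    # Data out is verified immediately after data in, no direct DB access.
    conn = new_database()
    for user in DATASET:
        register_user(conn, **user)
    return list_users(conn) == DATASET
```

The three checks share one dataset, so switching between the separate tests and the combined integration test is a matter of choosing which function to run.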
Hi Wez!
Thanks for your reply.
So basically this happens?
1) Create a dataset with an automated action suite using system functions.
2) Verify the dataset with an automated test suite using system functions (like UI, xml, json, etc.)
I like the idea! I think we used it a couple of times right?
In this case you don’t need a data load tool at all!
So when we want to extend tests (by Exploratory Testing), we need to add some actions to the automated action suite (1) and verify the output with the automated test suite (2).
Would you use a central dataset in that case? Or use several for specific cases?