Tuesday, September 14, 2010

Precision Solutions

Precision and accuracy are two hallmarks of software quality that should always be emphasized. The best way of ensuring precision and accuracy is via a rigorous testing process. At Charonite we employ the following test practices:

- automated test harnesses that run a number of tests on each module and function, saving the results in a test log and database

- automated regression testing performed after each major build to ensure that new bugs do not creep into tested code and that new features do not break old code

- dependency and impact analysis to identify possible areas that may be affected by the new changes

- prioritised change management procedures that work hand-in-hand with testing procedures to ensure that changes and updates occur in a reliable and pre-planned fashion, with a backup rollback plan always ready in case of problems during the update process

The Obulus platform is also adding automated system simulation and statistical stress test functions that give additional confidence to developers that their application will work properly once deployed on live systems.

Some common issues that we encounter during testing for precision:

- rounding errors, especially with float data types, can creep in when processing massive datasets, leaving very small residual values that can wreak havoc with code like (account_balance <= 0), which will fail unexpectedly. A solution that we commonly add is to compare against a small threshold instead, with code like (account_balance <= 0.01) or (account_balance <= OB_PAYMENTS_MIN_THRESHOLD).

- the order of operations in long sequences of floating-point calculations may affect the accuracy of the result, especially when the GPU is used to speed up the calculations. A CPU-GPU result set comparison may be useful in such cases to identify discrepancies. We have noticed that these errors may only become noticeable when the dataset has a couple of million entries, so large-scale testing is a must.

- the result calculations should be as traceable as possible. It is very convenient to keep track of the base sources from which a specific result has been produced, as this will aid in identifying errors.

- in cases where performance needs override precision, quick approximations may be developed to give a roughly accurate result which is then refined if more time and resources are available. Quick approximation functions are also useful to generate previews and samples.

- good-quality routines for generating simulated sample data are a must. Random number sequences should be truly random, artificially generated names and addresses should resemble real-life ones, and most importantly, there should be a percentage of erroneous inputs to simulate real life accurately.

- it is not enough to test for positive cases only; negative cases must be tested as well. This is one of the most overlooked aspects of testing, and one that can really improve precision: do not make assumptions about the system behaviour, and try to test for events that should not happen but that may happen if there is an error.
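
The first two issues in the list above (float residues and order-dependent results) can be illustrated with a minimal Python sketch. This is not Obulus code: the value of OB_PAYMENTS_MIN_THRESHOLD and the helper names is_settled and naive_sum are assumptions made for the example, and math.fsum from the standard library stands in for a trusted reference result.

```python
import math

# Assumed value; the text only names the constant.
OB_PAYMENTS_MIN_THRESHOLD = 0.01


def is_settled(account_balance: float) -> bool:
    """Hypothetical helper: treat balances at or below the threshold as cleared."""
    return account_balance <= OB_PAYMENTS_MIN_THRESHOLD


# Ten payments of 0.1 against a balance of 1.0 should leave exactly zero,
# but binary floats leave a tiny positive residue instead.
balance = 1.0
for _ in range(10):
    balance -= 0.1

naive_check = balance <= 0              # False: the residue is above zero
tolerant_check = is_settled(balance)    # True: the residue is below the threshold


def naive_sum(xs):
    """Plain left-to-right float accumulation, with no error compensation."""
    total = 0.0
    for x in xs:
        total += x
    return total


# The same four numbers give different totals depending on the order
# in which they are added. The exact total is 2.0.
values = [1e16, 1.0, -1e16, 1.0]
in_given_order = naive_sum(values)           # 1.0
in_sorted_order = naive_sum(sorted(values))  # 0.0
reference = math.fsum(values)                # 2.0, correctly rounded sum
```

Comparing a fast result against a slower reference like this is the same idea as the CPU-GPU result set comparison, only on a toy scale.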
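
The last two points, simulated data with a share of erroneous inputs and testing of negative cases, might be combined roughly as follows. Everything here is hypothetical: parse_amount stands in for whatever function is under test, and make_samples and check are illustrative names. The sketch uses a seeded pseudo-random generator so that a failing run can be reproduced; random.SystemRandom could be swapped in where non-deterministic input is wanted.

```python
import math
import random


def parse_amount(text: str) -> float:
    """Hypothetical function under test: parse a non-negative payment amount."""
    amount = float(text)  # raises ValueError on malformed text
    if not math.isfinite(amount) or amount < 0:
        raise ValueError(f"invalid amount: {text!r}")
    return amount


def make_samples(count: int, error_rate: float, seed: int):
    """Generate (input, should_parse) pairs, with a share of bad inputs."""
    rng = random.Random(seed)  # seeded so that failures are reproducible
    bad_inputs = ["", "abc", "-5.00", "nan", "12,50"]
    samples = []
    for _ in range(count):
        if rng.random() < error_rate:
            samples.append((rng.choice(bad_inputs), False))
        else:
            samples.append((f"{rng.uniform(0, 1000):.2f}", True))
    return samples


def check(samples):
    """Return the inputs whose outcome differs from what was expected."""
    failures = []
    for text, should_parse in samples:
        try:
            parse_amount(text)
            parsed = True
        except ValueError:
            parsed = False
        if parsed != should_parse:
            failures.append(text)
    return failures


samples = make_samples(count=1000, error_rate=0.1, seed=42)
failures = check(samples)  # an empty list means both kinds of case passed
```

The negative cases matter as much as the positive ones here: a parse_amount that silently accepted "nan" or "-5.00" would pass every positive test and still be wrong.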