The Software Evaluation and Testing Process

Ben Tasker

2010-11-28 19:05 (updated 2012-06-27 12:22)

Verification, Validation and Testing has become an intrinsic part of the Software Development Life Cycle, the
importance of Testing throughout an applications development has been recognised with most companies now
designing their processes by using the 'V' Model.

Recognition of the importance of formal testing and planning is only a relatively recent development in comparison to the art
of programming itself. Not so many years ago, software was designed on the back of a cigarette packet in the
pub, and then implemented, there was little to no documentation, and very little structured testing.

Contrary to Popular Belief, Testing is not just about seeing if the end product works, if Testers are involved as soon as the Initial Specification is written, they can begin to plan later tests and also pick up on requirements that may have been omitted from the Specification. If a requirement has been incorrectly stipulated, badly defined or omitted it is far more expensive to rectify one the Implementation stage has been passed.

Consider the old waterfall model;

Analysis and Design

-->

Planning

-->

Implementation and Execution

-->

Reporting

-->

Testing

-->

End Product

Using this process map, Testing is left until the last minute, and any errors will require moving at least a section back to the Analysis and Design section. If a programmer has misinterpreted a requirement, then the process must once again be moved back. Using the V Model combats this, allowing Testers to become involved in every aspect of the process, checking requirements are well defined, looking for bugs in the software etc.

Once the software is written, it needs to be tested and evaluated. Whilst programmers should run basic tests on their code, full testing should be implemented by an Independant party. Depending on the size of the organisation, there will probably be an Independant Test Team. Ths is done to avoid the bias felt by programmers
towards their own code, there is often a stronger urge to show how good their code is than there is to actually find errors.

Once the code is in the hands of the test team, they will begin checking it for errors. First they will use Static Analysis to check the code itself. They should look for structural errors, where the coder may have forgotten to close an IF statement, or forgotten to Declare and Define a variable. This can be done manually or automatically.

The tool will also check for Cyclomatic Complexity (Count the 'Decisions' and add 1, should be less than 11), and various other errors that can be introduced. This type of testing is known as 'White Box' Testing as it requires the Tester to know and understand how the software makes the end product work.

In contrast to this, Black Box testing is conducted with no understanding of how the product works, just that it should. This type of testing is best conducted by someone who could use the product in future, the test design will stipulate what the Input Variables should be, and also what the expected output variables are. The tester would the proceed to test the item using methods such as Equivalence partitioning, Boundary Value Analysis or State Testing.

Whilst the term Equivalence Testing sounds quite complex, the actual procedure is relatively simple. For example a tester is provided with a box that will output "Valid" for any number between 1000 and 2000, and any numbers outside that range will produce an "Invalid" Output. To test the box using Equivalence Partitioning, the tester would enter a number less than 1000 (expecting an invalid output) a number between 1000 and 2000 (Expecting a valid output) and finally a number greater than 2000 (again expecting an invalid output). This simple, and reasonably obvious test, is Equivalence Partitioning.

A test complementary to this is Boundary Value Analysis, this tests the boundaries themselves so in the above test you would also enter 999 (should be invalid), 1000 (Should be valid), 2000 (Should be valid), 2001 (Should be invalid). When programming, the boundary values are often the source of an error, especially if the coder is tired. It is all to easy to forget to state that the value should be greater than OR equal to another value.

Not every application can be tested using Boundary Value Analysis or Equivalence Partitioning. Some applications just do not accept input in a range in quite the same way, some of these can be tested using State Transition Diagrams. Consider the CD Player below;

OFF	---->	Press Power Button	---->	STANDBY	---->	Press Play	---->	PLAY
				STANDBY	<----	Press Stop	<----	PLAY

Although a very simple example, it does show how each link between a state can be tested, and checks for other transitions can be made. For example if the CD Player is switched off, and I press the Play button, will it bypass the Standby state and start playing? Whilst a CD is playing, does it actually switch back to Standby when I press Stop?

State Transition diagrams are usually far more complex, but ideally you want to test every possible route through the systems states, it doesn't matter if you test each transition more than once, so long as they all get tested.

Of a similar ilk, is Decision Table testing. As an example - If it is a workday, John goes to work, but if it is not and it is early he will go fishing (But only if the weather is good). If it is late he will clean the house.

So we would draw this scenario out in the following table;

Workday?	T	T	T	F	F	F	T	F
Early?	T	T	F	F	F	T	F	T
Good Weather?	T	F	F	F	T	T	T	F
Go to Work?	T	T	T	F	F	F	T	F
Go Fishing?	F	F	F	F	F	T	F	F
Tidy?	F	F	F	T	T	F	F	F

Each column contains a possible outcome, based on a simple True or False. In that example, we identified a missing requirement as the scenario did not state what John actually does if it's not a workday, is early but the weather is bad. You would then pass this information back so that the requirements specification could be updated.
This is a very good example of exactly why Testers should be involved as early as possible in the development lifecycle.

Black box testing can also look at asthetic requirements, if the requirements say it should be in a brown 3x3 box, is it? If the specification is vague and states that it should be in a dark box, this can be reported back for clarification.

These forms of testing can be used throughout the lifecycle, ideally many of the tests will be designed as soon as the requirements specification is published. That way any vague or missing requirements will be identified as quickly as possible.

Once the software is actually written, each component of the software should be tested using the various techniques detailed above, and then an Integration test should take place. This is done to ensure that all the modules will work together, and largely focuses on the interfaces between each module, whether it be hardware or software.

Once this is done a System Test will take place, this helps to check that the intended environment for the software is suitable.

An acceptance test will then be run, aimed primarily at ensuring the system meets the specifications laid out in the relevant documentation, you may or may not invite the customer to participate in these tests. They will have a far keener understanding of what they require the system to do, and seeing a working system in test conditions will help to build confidence about the suitability of the product.

However, be aware that once the product is released to the customer, the number of bugs found typically soars. This is because end-users are ingenious at finding ways to break a system, and are using it with real world data.

So with this in mind, when should Testing end? Managers will refer to timelines and budgets, but that has little relevance. It depends on the risk involved in a failure, if the software is a 'critical system' then a failure could affect life and limb, if your system is a commerical game then the risk is not so high. No matter what the product, customers will be upset about any failure, but consider the real world implications, compare the Ariane V (caused by insufficient testing of software) to when MS Windows crashes. What has more of an impact?

The testing end point will be set based on this criteria, if it is a critical system then there should always be extra time and money available (there often isn't unfortunately). This point will be laid down in the Exit Criteria drafted at the beginning of Testing, depending on the type of product being made it will usually refer to the number of bugs found per week. So for example, once we are finding less than 10 bugs a week in software A, it is ready for release.

Even after the software is released, testing will continue. Regression testing is carried out whenever an update is applied, it tests the unchanged bit of the software to ensure that no new bugs have appeared, or been highlighted by the update (if this had been done on the Ariane V.....)
Confirmation testing will be carried out once a bug has been fixed, simply to ensure that it has been, there is likely to be the heaviest workload during the beta stage of testing.

Beta testing is the stage that follows alpha testing (testing on the developers site), Beta testing takes palce with a select few customers on their own site, traditionally they often get the software for free as an incentive to help develop the application. This is, of course, not true in all cases.

In conclusion, each stage in the Software Development lifecycle should have an equivalent testing stage. One the Requirement specification is written, testers should be involved. Finding faults early saves money, especially if the fault is in the documentation that the programmers are working from. There are many forms and styles of testing, but no one test will test everything, different tests need to be designed. If timescales are running short, testers can try to find as many bugs as possible by using Defect Clustering (the theory that states that Defects are normally grouped, whether through an inexperienced programmer or due to a bottle of wine - 80% of your bugs will come from 20% of your code).

Testing should not end until it's objectives have been acheived, whilst there is no such thing as error free code, the risk of failure must govern the level of testing.

Related Snippets