5.1 Software Test Automation
1. Automation saves time, as software can execute test cases faster than humans do. The time
thus saved can be used effectively by test engineers to:
a. Develop additional test cases to achieve better coverage
b. Perform some esoteric or specialized tests like ad hoc testing; or
c. Perform some extra manual testing.
2. Test automation can free the test engineers from mundane tasks and make them focus on
more creative tasks.
3. Automated tests can be more reliable.
4. Automation helps in immediate testing.
5. Automation can protect an organization against attrition of test engineers.
6. Test automation opens up opportunities for better utilization of global resources.
7. Certain types of testing cannot be executed without automation.
8. Automation means end-to-end, not test execution alone.
The skills needed for test automation vary with the generation of automation:
Automation – First generation (skills for test case automation): record-playback tools usage.
Automation – Second generation (skills for test case automation): scripting languages; programming languages; knowledge of data generation techniques; usage of the product under test.
Automation – Third generation (skills for test case automation): programming languages; design and architecture of the product under test; usage of the framework.
Automation – Third generation (skills for framework): scripting languages; programming languages; design and architecture skills for framework creation; generic test requirements for multiple products.
External Modules
o TCDB and DB
o The steps to execute all the test cases and the history of their execution are stored in
the TCDB (test case database).
o DB (Defect database / Defect Repository) contains details of all the defects that
are found in various products that are tested in a particular organization.
Scenario and Configuration File Modules
o Scenarios are information on “how to execute a particular test case”.
o A configuration file contains a set of variables that are used in automation. This
file is important for running the test cases for various execution conditions and for
running the tests for various input and output conditions and states.
Test Cases and Test Framework Modules
o Test cases here are the automated test cases that are taken from the TCDB and
executed by the framework.
o A test framework is a module that combines “what to execute” and “how they
have to be executed”.
o The test framework is considered the core of automation design. It subjects the
test cases to different scenarios.
o The framework monitors the results of every iteration and the results are stored.
o It can be developed internally by the organization or bought from a vendor.
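As a rough sketch (not the textbook's design; every name below is hypothetical), a framework core might pair test cases taken from the TCDB with scenarios and configuration variables, run each combination, and store the result of every iteration:

# Minimal sketch of a test framework core: it combines "what to execute"
# (test cases) with "how to execute" (scenarios and configuration variables),
# runs every combination, and records the outcome of each iteration.
def run_framework(test_cases, scenarios, config):
    results = []
    for case in test_cases:              # automated test cases taken from the TCDB
        for scenario in scenarios:       # "how to execute a particular test case"
            outcome = case["run"](scenario, config)
            results.append({
                "case_id": case["id"],
                "scenario": scenario["name"],
                "config": dict(config),
                "passed": outcome,
            })
    return results

# Example usage with one dummy test case and two scenarios.
config = {"timeout_seconds": 30, "platform": "linux"}        # configuration file variables
scenarios = [{"name": "normal_load"}, {"name": "stress"}]
test_cases = [{"id": "TC-001", "run": lambda sc, cfg: True}]
results = run_framework(test_cases, scenarios, config)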
Tools and Results Modules
o When a test framework performs its operations, a set of tools may be required.
o To run the compiled code, certain runtime tools and utilities may be required.
o The results for each of the test cases, along with the scenarios and variable values,
have to be stored for future analysis and action.
o The history of all the previous tests run should be recorded and kept as archives.
Report Generator and Reports / Metrics Modules
o Preparing reports is a complex and time-consuming effort and hence it should be
part of the automation design.
o The reports should also give pointers to how the data can be used for future
planning and continuous improvement.
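Continuing the hypothetical sketch above, a report generator might simply summarize the stored results per scenario (the record fields follow the framework sketch above and are not from the textbook):

# Hypothetical report generator: pass/fail counts per scenario from the
# result records produced by the framework sketch above.
from collections import Counter

def generate_report(results):
    # Tally pass/fail per scenario from the result records.
    summary = Counter()
    for record in results:
        summary[(record["scenario"], "pass" if record["passed"] else "fail")] += 1
    lines = ["Scenario            Pass  Fail"]
    for name in sorted({r["scenario"] for r in results}):
        lines.append(f"{name:<20}{summary[(name, 'pass')]:>4}{summary[(name, 'fail')]:>6}")
    return "\n".join(lines)

# Example usage with two invented result records.
sample_results = [
    {"scenario": "normal_load", "passed": True},
    {"scenario": "stress", "passed": False},
]
print(generate_report(sample_results))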
Metrics are derived from measurements using appropriate formulae or calculations.
Types of Metrics
Project Metrics: A set of metrics that indicates how the project is planned and executed.
Progress Metrics: A set of metrics that tracks how the different activities of the project
are progressing. It can be further classified into test defect metrics and development
defect metrics.
Productivity Metrics: A set of metrics that takes into account various productivity
numbers that can be collected and used for planning and tracking testing activities.
Project Metrics
Effort Variance
o Baselined effort estimates, revised effort estimates, and actual effort are plotted
together for all the phases of the SDLC.
o If there is a substantial difference between the baselined and revised effort, it
points to incorrect initial estimation.
o Calculating effort variance for each of the phases provides a quantitative measure
of the relative difference between the revised and actual efforts.
o Variance % = [ (actual effort – revised estimate) / revised estimate ] * 100 (a worked
example appears after the Effort Distribution item below).
Schedule Variance
o Schedule variance is the deviation of the actual schedule from the estimated
schedule. The variance percentage for schedule is calculated in the same way as for effort.
Effort Distribution
o Adequate and appropriate effort needs to be spent in each of the SDLC phase for
a quality product release.
o The distribution percentage across the different phases can be estimated at the
time of planning and compared with the actuals at the time of release, to gain
confidence in the release and in the estimation methods.
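A small worked example of the effort and schedule variance formula above (all figures are invented, in person-days):

# Effort variance: variance % = (actual - revised estimate) / revised estimate * 100.
# Schedule variance uses the same formula applied to scheduled durations.
def variance_percent(actual, revised):
    return (actual - revised) / revised * 100.0

phases = {
    # phase: (revised estimate, actual effort) in person-days -- hypothetical numbers
    "requirements": (20, 22),
    "design":       (35, 40),
    "coding":       (60, 55),
    "testing":      (45, 54),
}
for phase, (revised, actual) in phases.items():
    print(f"{phase:>12}: effort variance = {variance_percent(actual, revised):+.1f}%")
# testing: (54 - 45) / 45 * 100 = +20.0%, i.e. actual effort exceeded the revised estimate by 20%.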
Test defect metrics help us understand how the defects that are found can be used to improve
testing and product quality. The defects can be classified by assigning a defect priority. The
priority of a defect provides a management perspective on the order of defect fixes. Defects can
also be classified by severity level. The severity of a defect gives the test team a perspective
of its impact on product functionality.
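A tiny sketch of classifying defects along both dimensions (the defect records are invented):

# Count defects by priority (management view: order of fixes) and by
# severity (test-team view: impact on product functionality).
from collections import Counter

defects = [   # hypothetical defect records
    {"id": 1, "priority": "P1", "severity": "critical"},
    {"id": 2, "priority": "P2", "severity": "major"},
    {"id": 3, "priority": "P1", "severity": "major"},
    {"id": 4, "priority": "P3", "severity": "minor"},
]
print("By priority:", dict(Counter(d["priority"] for d in defects)))
print("By severity:", dict(Counter(d["severity"] for d in defects)))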
Productivity Metrics
Measurement-related data, and other useful test-related information such as test documents and
problem reports, should be collected and organized by the testing staff. The test manager can
then use these items for presentation and discussion at the periodic meetings used for project
monitoring and controlling. These are called project status meetings.
Test-specific status meetings can also serve to monitor testing efforts, to report test progress, and
to identify any test-related problems. Testers can meet separately and use test measurement data
and related documents to specifically discuss test status. Following this meeting they can then
participate in the overall project status meeting, or they can attend the project meetings as an
integral part of the project team and present and discuss test-oriented status data at that time.
Each organization should decide how to organize and partition the meetings. Some deciding
factors may be the size of the test and development teams, the nature of the project, and the
scope of the testing effort.
Another type of project-monitoring meeting is the milestone meeting that occurs when a
milestone has been met. A milestone meeting is a mechanism for the project team to
communicate with upper management and in some cases user/client groups. Testing staff, project
managers, SQA staff, and upper managers should attend.
Status meetings usually result in some type of status report published by the project manager
that is distributed to upper management. Test managers should produce similar reports to inform
management of test progress.
Rakos recommends that the reports be brief and contain the following items:
(i) Activities and accomplishments during the reporting period: All tasks that were attended
to should be listed, as well as which are complete.
(ii) Problems encountered since the last meeting period: The report should include a
discussion of the types of new problems that have occurred, their probable causes,
and how they impact the project.
(iii) Problems solved: At previous reporting periods problems were reported that have now
been solved. Those should be listed, as well as the solutions and the impact on the
project.
(iv) Outstanding problems: These have been reported previously, but have not been solved to
date. Report on any progress.
(v) Current project (testing) state versus plan: This is where graphs using process
measurement data play an important role.
(vi) Expenses versus budget: Plots and graphs are used to show budgeted versus actual
expenses. Earned value charts and plots are especially useful here.
(vii) Plans for the next time period: List all the activities planned for the next time period as
well as the milestones.
Bar graphs are useful for monitoring purposes; for example, based on the defect data, the total
number of faults found can be plotted against weeks of testing effort.
Graphs especially useful for monitoring testing costs are those that plot staff hours versus time,
both actual and planned. Earned value tables and graphs are also useful. In status report graphs,
earned value is usually plotted against time, and on the same graph budgeted expenses and actual
expenses may also be plotted against time for comparison. Although actual expenses may be
more than budget, if earned value is higher than expected, then progress may be considered
satisfactory.
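A minimal sketch of such an earned-value comparison, using the standard quantities (the money figures are invented):

# Basic earned-value check: even if actual expenses exceed the budget to date,
# progress may be satisfactory when earned value is higher than planned.
planned_value = 40_000   # budgeted cost of work scheduled to date (hypothetical)
earned_value  = 46_000   # budgeted cost of work actually performed
actual_cost   = 44_000   # actual expenses to date

schedule_variance = earned_value - planned_value   # > 0: ahead of plan
cost_variance     = earned_value - actual_cost     # > 0: under cost for the work done
print(f"Schedule variance: {schedule_variance:+}")
print(f"Cost variance:     {cost_variance:+}")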
The agenda for a status meeting on testing includes a discussion of the work in progress since
the last meeting period. Measurement data is presented, graphs are produced, and progress is
evaluated. Test logs and incident reports may be examined to get a handle on the problems
occurring. If there are problem areas that need attention, they are discussed and solutions are
suggested to get the testing effort back on track
As testing progresses, status meeting attendees have to make decisions about whether to stop
testing or to continue on with the testing efforts, perhaps developing additional tests as part of
the continuation process. They need to evaluate the status of the current testing efforts as
compared to the expected state specified in the test plan. In order to make a decision about
whether testing is complete the test team should refer to the stop test criteria included in the test
plan. If they decide that the stop-test criteria have been met, then the final status report for
testing, the test summary report, should be prepared
At project postmortems the test summary report can be used to discuss successes and failures
that occurred during testing. It is a good source for test lessons learned for each project.
If we stop testing now, we do save resources and are able to deliver the software to our clients.
However, there may be remaining defects that will cause catastrophic failures, so if we stop now
we will not find them. As a consequence, clients may be unhappy with our software and may not
want to do business with us in the future. Even worse there is the risk that they may take legal
action against us for damages.
On the other hand, if we continue to test, perhaps there are no defects that cause failures of a
high severity level. Therefore, we are wasting resources and risking our position in the
marketplace.
Managers should not have to use guesswork to make this critical decision. The test plan should
have a set of quantifiable stop-test criteria to support decision making.
The weakest stop test decision criterion is to stop testing when the project runs out of time and
resources.
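Stronger criteria are quantifiable and can be checked mechanically; the sketch below uses hypothetical criteria and status figures:

# Evaluate hypothetical quantifiable stop-test criteria from a test plan.
criteria = {
    "statement coverage >= 85%":   lambda s: s["statement_coverage"] >= 0.85,
    "no open severity-1 defects":  lambda s: s["open_severity_1_defects"] == 0,
    ">= 95% of planned tests run": lambda s: s["planned_tests_executed"] >= 0.95,
}
status = {   # current test status (invented numbers)
    "statement_coverage": 0.88,
    "open_severity_1_defects": 2,
    "planned_tests_executed": 0.97,
}
unmet = [name for name, check in criteria.items() if not check(status)]
print("Stop testing" if not unmet else f"Continue testing; unmet criteria: {unmet}")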
Fault (defect) seeding technique is based on intentionally inserting a known set of defects into a
program. This provides support for a stop-test decision. It is assumed that the inserted set of
defects are typical defects; that is, they are of the same type, occur at the same frequency, and
have the same impact as the actual defects in the code. One way of selecting such a set of defects
is to use historical defect data from past releases or similar projects.
Several members of the test team insert (or seed) the code under test with a known set of
defects.
The other members of the team test the code to try to reveal as many of the defects as
possible.
The number of undetected seeded defects gives an indication of the number of total
defects remaining in the code (seeded plus actual).
A ratio can be set up as follows:
o Detected seeded defects / Detected actual defects = Total seeded defects / Total
actual defects. Using this ratio we can say, for example, if the code was seeded
with 100 defects and 50 have been found by the test team, it is likely that 50% of
the actual defects still remain and the testing effort should continue. When all the
seeded defects are found the manager has some confidence that the test efforts
have been completed.
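A worked sketch of the seeding ratio (all counts are invented):

# Fault seeding estimate:
#   detected seeded / detected actual = total seeded / total actual
# so the total number of actual defects can be estimated from how many of the
# seeded defects the team has found.
total_seeded    = 100
detected_seeded = 50
detected_actual = 130

estimated_total_actual = detected_actual * total_seeded / detected_seeded   # 260
remaining_actual = estimated_total_actual - detected_actual                 # 130
print(f"Estimated total actual defects: {estimated_total_actual:.0f}")
print(f"Estimated actual defects remaining: {remaining_actual:.0f}")
# Half the seeded defects were found, so roughly half the actual defects remain.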
1. Identification of Configuration Items
The items that will be under configuration control must be selected, and the relationships
between them must be formalized. The four configuration items are: a design specification, a test
specification, an object code module, and a source code module.
2. Change Control
There are two aspects of change control—one is tool-based, the other team-based. The team
involved is called a configuration control board. This group oversees changes in the software
system. The members of the board should be selected from SQA staff, test specialists,
developers, and analysts. It is this team that oversees, gives approval for, and follows up on
changes. They develop change procedures and the formats for change request forms.
3. Configuration Status Reporting
These reports help to monitor changes made to configuration items. They contain a history of all
the changes and change information for each configuration item. Each time an approved change
is made to a configuration item, a configuration status report entry is made. The reports can
answer questions such as:
who made the change;
what was the reason for the change;
what is the date of the change;
what is affected by the change.
Reports for configuration items can be distributed to project members and discussed at status
meetings.
4. Configuration audits
The audit is usually conducted by the SQA group or members of the configuration control board.
They focus on issues that are not covered in a technical review. A checklist of items to cover can
serve as the agenda for the audit. For each configuration item the audit should cover the
following:
(i) Compliance with software engineering standards. For example, for the source code modules,
have the standards for indentation, white space, and comments been followed?
(ii) The configuration change procedure. Has it been followed correctly?
(iii) Related configuration items. Have they been updated?
(iv) Reviews. Has the configuration item been reviewed?
The benefits of a review program include:
higher-quality software;
increased productivity (shorter rework time);
closer adherence to project schedules (improved process control);
increased awareness of quality issues;
a teaching tool for junior staff;
opportunity to identify reusable software artifacts;
reduced maintenance costs;
higher customer satisfaction;
more effective test planning;
a more professional attitude on the part of the development staff.
Managerial reviews usually focus on project management and project status. Technical reviews
are used to:
o verify that a software artifact meets its specification;
o detect defects; and
o check for compliance with standards
Informal reviews are an important way for colleagues to communicate and get peer input with
respect to their work. There are two major types of technical reviews—inspections and
walkthroughs— which are more formal in nature and occur in a meeting-like setting.
Inspections are a type of review that is formal in nature and requires pre-review preparation on
the part of the review team. Several steps are involved in the inspection process as outlined in the
following figure.
The inspection leader plans for the inspection, sets the date, invites the participants, distributes
the required documents, runs the inspection meeting, appoints a recorder to record results, and
monitors the follow up period after the review.
The checklist contains items that inspection participants should focus their attention on, check,
and evaluate. The inspection participants address each item on the checklist. The recorder
records any discrepancies, misunderstandings, errors, and ambiguities; in general, any problems
associated with an item. The completed checklist is part of the review summary document.
[Figure: the inspection process steps (initiation, preparation, inspection meeting, reporting results, exit), supported by inspection policies and plans, entry criteria, and a checklist, with the reported results feeding the defect and metric databases.]
When the inspection meeting has been completed (all agenda items covered) the inspectors are
usually asked to sign a written document that is sometimes called a summary report.
The inspection process requires a formal follow-up process. Rework sessions should be
scheduled as needed and monitored to ensure that all problems identified at the inspection
meeting have been addressed and resolved. The inspection process is complete only when all
problems have been resolved and the reworked item has been either reinspected by the group or
checked by the moderator.
Walkthroughs are a type of technical review where the producer of the reviewed material serves
as the review leader and actually guides the progression of the review. If the presenter gives a
skilled presentation of the material, the walkthrough participants are able to build a
comprehensive mental (internal) model of the detailed design or code and are able to both
evaluate its quality and detect defects.
In a round-robin review, the review cycles through the team members so that everyone gets to
participate in an equal manner. In another variation, each reviewer in a code walkthrough leads
the group in inspecting a specific line or section of the code.
Reason 1
During a review there is a systematic process in place for building a real-time mental model of
the software item. The reviewers step through this model building process as a group. If
something unexpected appears it can be processed in the context of the real-time mental model.
There is a direct link between the incorrect, missing, or superfluous item and the line/page/figure that is of
current interest in the inspection. Reviewers know exactly where they are focused in the
document or code and where the problem has surfaced.
They can basically carry out defect detection and defect localization tasks at the same time.
Reason 2
Reviews also have the advantage of a two-pass approach for defect detection. Pass 1 has
individuals first reading the reviewed item and pass 2 has the item read by the group as a whole.
If one individual reviewer did not identify a defect or a problem, others in the group are likely to
find it. The group evaluation also makes false identification of defects/problems less likely.
Individual testers/developers usually work alone, and only after spending many fruitless hours
trying to locate a defect will they ask a colleague for help.
Reason 3
Inspectors have the advantage of the checklist which calls their attention to specific areas that are
defect prone. These are important clues. Testers/developers may not have such information
available.
1. Review Goals
Items that are usually reviewed include:
requirements documents;
design documents;
code;
test plans (for the multiple levels);
user manuals;
training manuals;
standards documents.
4. Review Procedure
For each type of review, there should be a set of standardized steps that define the given review
procedure. For each step in the procedure the activities and tasks for all the reviewer participants
should be defined. The review plan should refer to the standardized procedures where applicable.
5. Review Training
Review participants need training to be effective. Responsibility for reviewer training classes
usually belongs to the internal technical training staff. Review participants, and especially those
who will be review leaders, need the training. Test specialists should also receive review
training. Suggested topics for a training program are shown below.
6. Review Checklist
Correctness
Are there any incorrect items?
Are there any contradictions?
Are there any ambiguities?
Review outputs include:
1. For inspections—the group checklist with all items covered and comments relating to
each item.
2. For inspections—a status, or summary, report (described below) signed by all
participants.
3. A list of defects found, and classified by type and frequency. Each defect should be
cross-referenced to the line, pages, or figure in the reviewed document where it occurs.
4. Review metric data
The inspection report on the reviewed item is a document signed by all the reviewers. It may
contain a summary of defects and problems found and a list of review attendees, and some
review measures. The reviewers are responsible for the quality of the information in the written
report. There are several status options available:
1. Accept: The reviewed item is accepted in its present form or with minor rework required
that does not need further verification.
2. Conditional accept: The reviewed item needs rework and will be accepted after the
moderator has checked and verified the rework.
3. Reinspect: Considerable rework must be done to the reviewed item. The inspection
needs to be repeated when the rework is done.
If the software item is given a conditional accept or a reinspect, a follow-up period occurs where
the authors must address all the items on the problem/defect list.
The moderator reviews the rework in the case of a conditional accept. Another inspection
meeting is required to reverify the items in the case of a “reinspect” decision.
The defect report contains a description of the defects, the defect type, severity level, and the
location of each defect.
The inspection report contains vital data such as:
(i) number of participants in the review;
(ii) the duration of the meeting;
(iii) size of the item being reviewed (usually LOC or number of pages);
(iv) total preparation time for the inspection team;
(v) status of the reviewed item;
(vi) estimate of rework effort and the estimated date for completion of the rework.
A ranking scale for defects can be developed in conjunction with a failure severity scale.
The walkthrough report lists all the defects and deficiencies, and contains data such as:
the walkthrough team members;
the name of the item being examined;
the walkthrough objectives;
list of defects and deficiencies;
recommendations on how to dispose of, or resolve the deficiencies
A final important item to note: The purpose of a review is to evaluate a software artifact, not the
developer or author of the artifact. Reviews should not be used to evaluate the performance of a
software analyst, developer, designer, or tester.
Maintainability can be partitioned into the quality subfactors: testability, correctability, and
expandability.
o Completeness: The degree to which the software possesses the necessary and sufficient
functions to satisfy the user's needs.
o Correctness: The degree to which the software performs its required functions.
o Security: The degree to which the software can detect and prevent information leak,
information loss, illegal use, and system resource destruction.
o Compatibility: The degree to which new software can be installed without changing
environments and conditions that were prepared for the replaced software.
o Interoperability: The degree to which the software can be connected easily with other
systems and operated.
The standards document also describes a five-step methodology that guides an organization in
establishing quality requirements, applying software metrics that relate to these requirements,
and analyzing and validating the results.
A template called an “attribute specification format template” can be used to clearly describe
measurable system attributes. The template components are:
Scale: describes the scale (measurement units) that will be used for the measurement; for
example, time in minutes to do a simple repair.
Test: This describes the required practical tests and measurement tools needed to collect
the measurement data.
Plan: This is the value or level an organization plans to achieve for this quality metric. It
should be of a nature that will satisfy the users/clients.
Best: This is the best level that can be achieved; it may be state-of-the-art, an engineering
limit for this particular development environment, but is not an expected level to be
reached in this project. (An example would be a best system response time of 3.5
seconds.)
Worst: This indicates the minimal level on the measurement scale for acceptance by the
users/clients. Any level worse than this level indicates total system failure, no matter how
good the other system attributes are. (An example would be a system response time of 6
seconds.)
Now: This is the current level for this attribute in an existing system. It can be used for
comparison with planned and worst levels for this project.
See: This template component provides references to more detailed or related documents
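A sketch of how one such attribute specification might be recorded against this template (the attribute and its values are hypothetical; the best/worst levels echo the response-time examples above):

# Hypothetical attribute specification following the template components
# scale / test / plan / best / worst / now / see described above.
response_time_attribute = {
    "name":  "system response time",
    "scale": "seconds to respond to a user request",
    "test":  "measure with a load-generation tool under normal and stress load",
    "plan":  5.0,    # level planned for this release (hypothetical)
    "best":  3.5,    # engineering limit; not expected to be reached in this project
    "worst": 6.0,    # any level worse than this is unacceptable to the users
    "now":   5.8,    # current level in the existing system (hypothetical)
    "see":   "related performance requirements documents",
}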
Suppose a quality goal is to reach a specified level of performance. It is appropriate for testers to
collect data during system test relating to:
1. Response time. Record time in seconds for processing and responding to a user request.
(Descriptive remarks for data collection: An average value for the response time should
be derived from not less than 75 representative requests, both under normal load and
under stress conditions.)
2. Memory usage. Record number of bytes used by the application and for overhead.
(Descriptive remarks for data collection: Data should be collected under normal conditions
and under heavy stress and volume conditions.)
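A minimal sketch of collecting the response-time data described in item 1 (send_request is a hypothetical stand-in for issuing one representative user request):

import time

def send_request(i):
    time.sleep(0.01)                     # placeholder for a real request to the system under test

def average_response_time(num_requests=75):
    # Average over at least 75 representative requests, as suggested above.
    samples = []
    for i in range(num_requests):
        start = time.perf_counter()
        send_request(i)
        samples.append(time.perf_counter() - start)
    return sum(samples) / len(samples)

print(f"Average response time over 75 requests: {average_response_time():.3f} s")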
To address quality goals such as testability, the following can be collected by the testers:
(i) Cyclomatic complexity.
(ii) Number of test cases required to achieve a specified coverage goal. Count for code
structures such as statement or branch.
(iii)Testing effort-unit test. Record cumulative time in hours for testers to execute unit tests
for an application.
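For item (i) above, cyclomatic complexity can be computed from the control-flow graph as V(G) = E - N + 2P; the graph below is an invented example for a single routine:

# Cyclomatic complexity: V(G) = E - N + 2P, where E = edges, N = nodes,
# P = connected components (1 for a single routine).
edges = [("start", "a"), ("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "end")]
nodes = {"start", "a", "b", "c", "d", "end"}
E, N, P = len(edges), len(nodes), 1
print(f"V(G) = {E} - {N} + 2*{P} = {E - N + 2 * P}")   # 6 - 6 + 2 = 2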
For addressing goals with respect to maintainability, the following measurements are appropriate
for testers to collect:
1) Number of changes to the software. (Descriptive remarks for data collection: Count the
number of problem reports.)
2) Mean time to make a change or repair. Record mean-time-to-repair (MTTR) in time
units of minutes.
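A small sketch of deriving MTTR from individual repair records (times invented, in minutes):

# Mean time to repair (MTTR) from individual repair times recorded in minutes.
repair_times_minutes = [35, 90, 20, 55, 40]          # hypothetical repair records
mttr = sum(repair_times_minutes) / len(repair_times_minutes)
print(f"MTTR = {mttr:.1f} minutes")                   # (35 + 90 + 20 + 55 + 40) / 5 = 48.0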