The latest version of the Achilles Satellite (3.3) was released in December 2010. If you aren't familiar with the Satellite, check out the Satellite Overview page here for a description of this cyber security test tool for automated systems. This week's blog describes fault isolation, just one of the many capabilities of the Satellite.
Before we begin, there are two terms that need explaining: anomaly and subtest. In Achilles-speak, an anomaly is any unusual device behavior that is detected by the Achilles monitors during execution of a test case. The important point to note is that the Satellite does not attempt to report a device pass or failure for each test, because it is simply impossible to know the performance characteristics of every device that might be tested. Likewise, it is impossible to know the failure or security mechanisms of every device. Suppose, for example, that your secure device is configured to ignore what it considers to be a faulty client after a certain number of invalid packets have been sent. According to your configuration, the device behaves correctly when it stops responding to a test that is sending invalid packets.
Without knowing every configuration of every device, the Satellite cannot tell if a test causes a device to behave as designed or exposes an implementation issue. So it alerts you, by reporting an anomaly, whenever the device stops responding. You can then analyze monitor results, packet capture files and the device specification to determine whether this unusual behavior of the device under test (DUT) is acceptable.
The second term, subtest, describes an individual interaction or a group of packets with a common characteristic.
As any tester knows, the ability to accurately reproduce anomalies is an important aspect of discovering and solving software issues, and is vital when verifying that these issues no longer exist. The Satellite supports automatic fault isolation during the execution of tests to identify packets that cause unusual device behavior.
So how does fault isolation work? When you execute an Achilles test in fault isolation mode, it attempts to run the full range of subtests specified for the test. If an anomaly is reported, the Satellite begins an automated binary search to isolate the problem subtest. The initial range is narrowed down and the new range of subtests is executed. Here is an example of how the Satellite's fault isolation works.
In the illustration above, an anomaly is discovered in the full range of subtests. Although the anomaly is reported when subtest 4 is executing, an earlier subtest may have actually caused the unusual device behavior. To thoroughly investigate the cause of the anomaly, the test halves the range between the first subtest and the subtest that was running when the anomaly was reported (1 to 4) and searches the first half (1 to 2). It does not find an anomaly, so it searches the second half of the range (3 to 4). It finds an anomaly, divides the range in two and searches the first half (subtest 3). The anomaly is reported when subtest 3 is executed, indicating that this is a problem subtest. This part of the binary search is now complete and the isolated subtest (3) is displayed in the Achilles Client, which is the Satellite's UI. To ensure that there are no other problem subtests, the test searches the remaining range (4 to 10). In this example, no other anomalies are reported, so it is safe to conclude that subtest 3 is the only problem subtest.
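For readers who think in code, here is a minimal Python sketch of the search described above. It is not the Satellite's implementation or API: the names `run_range`, `isolate_one`, `isolate_faults` and `make_simulated_runner` are all hypothetical, and the simulated device simply assumes that an anomaly is reported one subtest after the one that caused it (or on the last subtest of the range, if the range ends first).

```python
from typing import Callable, List, Optional, Set

# Hypothetical interface: run subtests lo..hi (inclusive) and return the number
# of the subtest that was executing when an anomaly was reported, or None.
RunRange = Callable[[int, int], Optional[int]]


def isolate_one(run_range: RunRange, lo: int, hi: int) -> Optional[int]:
    """Binary search inside lo..hi, a range already known to report an anomaly."""
    while lo < hi:
        mid = (lo + hi) // 2
        if run_range(lo, mid) is not None:        # anomaly in the first half
            hi = mid
        elif run_range(mid + 1, hi) is not None:  # anomaly in the second half
            lo = mid + 1
        else:
            # Neither half reproduces the anomaly on its own, so no single
            # subtest can be blamed (assumption: we simply give up here).
            return None
    return lo


def isolate_faults(run_range: RunRange, first: int, last: int) -> List[int]:
    """Return every isolated problem subtest in first..last."""
    isolated: List[int] = []
    start = first
    while start <= last:
        reported_at = run_range(start, last)
        if reported_at is None:
            break  # the remaining range is clean
        culprit = isolate_one(run_range, start, reported_at)
        if culprit is None:
            start = reported_at + 1
        else:
            isolated.append(culprit)
            start = culprit + 1  # go on to search the remaining range
    return isolated


def make_simulated_runner(faulty: Set[int], delay: int = 1) -> RunRange:
    """Simulated device: the anomaly shows up `delay` subtests after its cause."""
    def run_range(lo: int, hi: int) -> Optional[int]:
        for n in range(lo, hi + 1):
            if n in faulty:
                return min(n + delay, hi)
        return None
    return run_range


# Subtest 3 is the culprit, but the full run reports the anomaly at subtest 4.
print(isolate_faults(make_simulated_runner({3}), 1, 10))  # [3]
```

With this simulated device, the sketch executes the ranges 1 to 10, 1 to 2, 3 to 4, 3 and finally 4 to 10, which is the same sequence as the walkthrough above.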
Now that the problem subtest has been isolated, you can rerun the test using only this subtest, capture the traffic for your records, and examine it to investigate whether the issue is caused by an implementation error on the device. You also have the ability to exactly replicate the traffic so that, if a software fix is applied to the device, you can test it using the same subtest that caused the problem in the first place and be confident that the issue has been resolved.