Surface Ship Torpedo Defense (SSTD)
SSTD TECHEVAL/OPEVAL LESSONS LEARNED
A Study in Test and Evaluation Management
STATEMENT A: APPROVED FOR PUBLIC RELEASE; DISTRIBUTION IS UNLIMITED
by George Axiotis, NAVSEA T&E Office
The following is an outline of the significant "Lessons Learned" from the development and operational testing of the Surface Ship Torpedo Defense (SSTD) program. Each "Lesson Learned" is related to a fundamental Test and Evaluation (T&E) management concept or practice utilized within the Naval Sea Systems Command (NAVSEA) and its affiliated Program Executive Offices (PEOs).
The SSTD program was an ACAT II level defensive system development to counter specific undersea weapon threats to high value surface ships. The system at the time of this study consisted of detection, control, and counter weapon subsystems. The counter weapon portion was comprised of a hardkill subsystem for outer layer engagement and a seduction subsystem (softkill) for inner layer defense. SSTD is the first undersea warfare program to use a layered-attrition approach for the defense of surface ships.
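The layered-attrition idea described above can be sketched numerically: each independent defensive layer attrites the incoming threat, so the combined probability of defeating it compounds across layers. The probabilities below are hypothetical placeholders for illustration only, not SSTD program data.

```python
# Illustrative sketch of layered attrition: a threat survives only if it
# leaks through every layer, so layer kill probabilities compound.
# All numeric values are hypothetical, not program data.

def defeat_probability(layer_kill_probs):
    """P(threat defeated) when independent layers engage in sequence."""
    p_leak = 1.0
    for p_kill in layer_kill_probs:
        p_leak *= (1.0 - p_kill)  # threat must leak through this layer too
    return 1.0 - p_leak

# Hypothetical example: hardkill outer layer, softkill (seduction) inner layer
p_defeat = defeat_probability([0.6, 0.7])
print(f"Combined probability of defeating the threat: {p_defeat:.2f}")
```

Even modest per-layer effectiveness compounds quickly, which is the rationale for the layered approach; it also hints at why partitioned thresholds (discussed below) can obscure each layer's contribution.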
The overall development effort was rather modest and was treated as "low risk" because it used existing detection and torpedo systems. However, the mission these systems would be required to perform was rather new and complex. The test and evaluation (T&E) program consisted of separate in-water phases for each subsystem, followed by partial integrated system testing on small surface ships. The last phase consisted of in-water testing of the integrated system on an aircraft carrier. This last phase was designated TECHEVAL, and the system's readiness for independent operational testing would be judged on its results.
The program was under tight fiscal and schedule constraints but was still considered low risk. Based on its performance during OPEVAL, the Commander, Operational Test and Evaluation Force (OPTEVFOR) assessed it as neither operationally effective nor operationally suitable. The following are "lessons learned" from a T&E planning perspective that were reinforced by what happened in this OPEVAL.
Operational Utility & Testing Linkage
Based primarily on the performance in OPEVAL, OPTEVFOR did not assess the overall system as effective, even though a majority of the sub-systems were effective in countering the incoming threats. There were performance thresholds in the Test and Evaluation Master Plan (TEMP) for system effectiveness both for the overall system and for each of the major sub-systems. The system level thresholds included performance both with and without each of the two "kill" mechanisms. This partitioning of the thresholds enabled a separate evaluation of each subsystem, but pitted one subsystem against another -- and perhaps indirectly one "kill" mechanism against the other. In effect, this diluted the objective of an integrated systems approach. It was never clear how important the hardkill capability was if the softkill proved to be fully effective -- which it did.
LESSON LEARNED --

The development process for the ORD and the TEMP must force an understanding of, and document, the expected operational utility of a system. If requirements are to be specified below the system level, the development of the ORD and TEMP must force an understanding not only of the operational utility of each subsystem but also of its weighted contribution to overall system utility.
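One way to make each subsystem's weighted contribution explicit is to roll subsystem utilities up to a single system-level figure of merit, so no subsystem is evaluated in isolation. The weights and scores below are illustrative assumptions, not values from the SSTD ORD or TEMP.

```python
# Hypothetical sketch: roll subsystem utilities up to a system-level figure
# of merit with explicit weights. Names, weights, and scores are illustrative
# assumptions only, not program requirements.

def system_utility(subsystems):
    """Weighted sum of subsystem utilities; weights must sum to 1."""
    total_weight = sum(w for _, w, _ in subsystems)
    assert abs(total_weight - 1.0) < 1e-9, "weights must sum to 1"
    return sum(w * u for _, w, u in subsystems)

# (name, weight in overall mission, demonstrated utility 0..1) -- illustrative
layers = [
    ("detection", 0.40, 0.90),
    ("softkill",  0.35, 0.95),
    ("hardkill",  0.25, 0.50),
]
print(f"Overall system utility: {system_utility(layers):.3f}")
```

With the weights documented up front, a shortfall in one subsystem (here the hypothetical hardkill score) can be judged against its agreed share of the mission, rather than pitting subsystems against one another after the fact.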
Testable and Measurable Thresholds
The "kill" performance thresholds were expressed in the ORD and TEMP in feet, but the range and weapon tracking data could realistically provide accuracy measured only in yards. Also, in at-sea testing, there were not enough intercepts close to the threshold from which to validate the data analysis methodology. During OPEVAL, there were two runs which came very close to the threshold which OPTEVFOR scored as "misses". The methodology used for this scoring left much to subjective interpretation. Later analysis showed that these runs could just as readily have been scored as "kills".
LESSON LEARNED --
Thresholds must be both realistic and testable. Any inherent limitations due to the T&E process itself, such as range capabilities, safety restrictions, and instrumentation, should be reflected in the TEMP thresholds and their definitions. When it is anticipated that there may be ambiguities in the test results, there must be tolerances in the thresholds to allow for them. Clear data analysis plans must be developed and validated for DT and OT when the T&E complexity warrants.
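A scoring rule that builds the instrumentation tolerance into the threshold itself can be sketched as follows. The threshold and range-accuracy numbers are hypothetical, not the SSTD values; the point is that runs inside the tolerance band are flagged for the agreed analysis protocol rather than scored subjectively.

```python
# Sketch of threshold scoring that makes measurement uncertainty explicit.
# Threshold and accuracy values are hypothetical, not the SSTD values.

KILL_THRESHOLD_FT = 30.0       # hypothetical "kill" miss-distance threshold, feet
RANGE_ACCURACY_FT = 3.0 * 3.0  # range data good only to ~3 yards = 9 feet

def score_run(measured_miss_ft):
    """Classify a run, flagging results inside the instrumentation tolerance."""
    if measured_miss_ft <= KILL_THRESHOLD_FT - RANGE_ACCURACY_FT:
        return "kill"
    if measured_miss_ft >= KILL_THRESHOLD_FT + RANGE_ACCURACY_FT:
        return "miss"
    return "ambiguous"  # resolve via the agreed data-analysis protocol

print(score_run(15.0))  # well inside the threshold
print(score_run(32.0))  # inside the tolerance band, not a hard "miss"
print(score_run(50.0))  # well outside the threshold
```

Had the two near-threshold OPEVAL runs fallen into an explicitly defined "ambiguous" band like this, their disposition would have been a documented analysis step rather than a subjective call.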
Testing Old Systems in New Mission Areas
The hardkill mechanism in SSTD was an existing weapon which was modified for use against a new target and operating in a near-surface environment. It was recognized that its success was highly dependent upon the launch and run in this environment. So that was studied, modeled, and lab-tested. With little at-sea test time to do so earlier, the majority of TECHEVAL was spent verifying the results of this earlier work. It was not until that same time frame that it became evident that there were problems with the weapon's interaction with the target that had not been anticipated and that were not fully understood.
LESSON LEARNED --
Applying new mission areas to existing systems and/or putting them into new operating environments frequently requires significant regression engineering and testing. Certainly do not underestimate the time required to iterate the design to accommodate the unknowns. Unfortunately, obtaining aircraft carrier test time is not easy to do and will stretch out most test plans beyond comfortable limits. For this program there were no substitutes that could provide the same operational environment as a carrier.
Ensuring System Readiness For Operational Test
A critical computer software modification developed to correct a problem raised in TECHEVAL was not installed in time for OPEVAL. We are generally very reluctant to allow software changes between TECHEVAL and OPEVAL, but in this case the risk was minimal and the fix was so critical that the PEO approved this mod and specified its incorporation as part of the configuration for which he was certifying readiness for OPEVAL. The patch did not reach the platform in time for the first phase of OPEVAL, but was available for the second. However, the OPTEVFOR Operational Test Director decided not to allow the patch to be installed because he believed it would corrupt the data and invalidate much of the results from the first period. As expected, without this patch, the threshold for False Alert Rate was not achieved. This also had the cascading effect of causing the crew to erroneously fire weapons, ultimately affecting the False Alarm Rate and system effectiveness results.

Additionally, the program had been forced to change ships between TECHEVAL and OPEVAL, which had a major impact on training and logistics. The problems were compounded by a requirement in the ORD that the system be unmanned. The aircraft carrier was undermanned at the time and was not receptive to yet another "unmanned" system that had very real demands for ship's force to operate and maintain. So, OPTEVFOR reported that not only were False Alert and Alarm rates not achieved, but that training and manning were not satisfactory because too few operators were available to constantly man the watch station.
LESSON LEARNED --

The Program Office and TDA must ensure the entire system is ready to support OT, and that the status of all changes, both incorporated and not, is fully understood by both the program team and OPTEVFOR.
Pressured Into OPEVAL
The difficulty in getting the services of the aircraft carrier and the fixing of new technical problems uncovered during TECHEVAL limited the amount of valid test results to support the OPEVAL Readiness Review. At the start of OPEVAL, not all thresholds had been demonstrated. Combining the TECHEVAL data with runs from earlier DT (though there were many significant dissimilarities) and existing Fleet data on certain sub-systems increased everyone's confidence in the system's readiness for OPEVAL, but many felt that the system performance -- while greatly improved -- still might not be robust enough for OPEVAL. However, the schedule did not allow for more at-sea testing, and there were no additional fiscal resources to continue TECHEVAL. In addition, the OPNAV Sponsor felt that a further slip in the program schedule that delayed OPEVAL (it had already been delayed once) would present a high risk of program cancellation. The system was certified ready for OT, but ALCON had reservations.
LESSON LEARNED --

TECHEVAL should be the "dress rehearsal" for OPEVAL. When it turns out to be a test-analyze-fix period, there is greatly increased risk that the system will not perform well in OPEVAL. TECHEVAL time should be dedicated to grooming the system and increasing crew proficiency. When that objective is short-changed, problems encountered in OPEVAL will be -- or will appear to be -- "surprises" which will undermine OPTEVFOR's confidence that the system is ready for fleet introduction. We should be wary of anticipating success in OPEVAL based on the fact that we have installed fixes for the list of mission-critical failures from the last set of at-sea tests. OPEVAL Readiness Review Boards like to see that achievement of the OT&E thresholds has actually been demonstrated, not merely "predicted".
We must avoid pressures to certify readiness for OPEVAL because a schedule slippage might jeopardize the program. The CNO's criteria for a SYSCOM, a DRPM or PEO certifying readiness are that the system works properly; that it is technically, operationally and logistically ready; and that it can be expected to be found operationally effective and suitable by COMOPTEVFOR. Possible adverse programmatic impacts of delaying OPEVAL are not a criterion for the PM; such problems should be dealt with by OPNAV and ASN(RDA).
Testing the Same
Runs which probably would have been scored as "kills" in DT were "misses" in OT. The target was designed to operate at a specific depth below the surface which emulated the postulated threat. The DT target was set to run within a specific stratum. Too shallow -- and the target could hit the platform during the test. Too deep -- and it would not emulate the threat. To avoid damage to or loss of the limited target assets, an additional stratum separation between the hardkill weapon and the target was specified for DT. The PM fought vigorously to constrain the operation of the target in OT to the depths previously tested, and there were intense arguments over whose intelligence data was more accurate. During OT, however, OPTEVFOR ran the targets a few feet shallower to more closely emulate the threat in accordance with the intelligence data they had.
LESSON LEARNED --
Clear operating guidelines for targets and weapons with wide performance envelopes must be established early in the program for both DT and OT. The PM needs to understand how OPTEVFOR plans to employ the targets during OT so he can evaluate the realism and the risks, as well as focus on those capabilities in his engineering and DT testing. If the PM cannot resolve such issues with OPTEVFOR, he needs to refer them to the OPNAV Requirements Officer, providing, as applicable, any cost estimates for accommodating OPTEVFOR's desires.
Analyzing the Same
OPTEVFOR favored a method of data analysis which was based primarily on range data for track reconstruction. Other relevant launch platform and weapon internal homing and fuse data was considered secondary and used only when issues arose. This led to questionable data extrapolation on missed targets. The program office knew early that success was dependent upon correct and accurate test data processing based on using all available data and adherence to a test data analysis protocol. This protocol was utilized in DT and was expected to be used in OT. OPTEVFOR's trusted agent used a different methodology because it was felt that the DT protocol was not binding for OT analysis.
LESSON LEARNED --

Make sure there is a good data analysis plan and that all parties are signed up to follow it. Test data issues uncovered during DT must be thoroughly addressed and shared with the OT evaluators to reduce test analysis methodology differences. Encourage the Operational Test Director to seek assistance from the program when ambiguities arise.
Validity of Simulation
Because the first time the full system would go to sea would be for TECHEVAL, there was significant dependence on the use of simulation from the beginning of the program. The in-water performance of the system never measured up to that predicted by the simulator. Thus, some critical interactions were not fully understood early, and were not accurately assessed until we had in-water failures. This simulator could not adequately represent environmental conditions such as wake, surface effects, and mixed thermal layers, because models were not developed for the new operating environment. The simulator was assessing the performance of a system in an environment in which it had never been used before. The simulator used throughout the program was continually evolving as in-water data was collected. In retrospect, since extensive aircraft carrier testing on range would be required to validate the models, more time and money should have been allocated.
LESSON LEARNED --

Simulation is only as good as the assumptions put into it. The system was being applied in a new environment for which the simulation was not validated. Testers should use simulation to establish hypotheses, and then test those hypotheses on range.
When a simulation has not been previously validated for use in the primary mission and environment, it brings its own risks and uncertainties into the program. In such a situation, it may still be well worth investing in the simulation anyway, and it may provide a useful planning tool for design, engineering and even testing, but it should not be relied upon as a substitute for actual testing.
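The "establish hypotheses with simulation, then test on range" discipline implies comparing the simulator's predictions against in-water results before trusting them. A minimal sketch of such a check follows; all run values and the tolerance are hypothetical, not SSTD data.

```python
# Sketch of checking a simulation against range data before relying on it:
# compare predicted vs. observed miss distances and flag a model whose mean
# absolute error exceeds an agreed tolerance. All values are hypothetical.

from statistics import mean

def model_validated(predicted_ft, observed_ft, tolerance_ft):
    """True if the simulator's mean absolute error is within tolerance."""
    errors = [abs(p - o) for p, o in zip(predicted_ft, observed_ft)]
    return mean(errors) <= tolerance_ft

# Hypothetical paired runs (feet): simulator prediction vs. in-water result
sim = [12.0, 18.0, 25.0, 30.0]
sea = [20.0, 31.0, 44.0, 52.0]

print(model_validated(sim, sea, tolerance_ft=10.0))  # model not yet validated
```

A simulator that fails such a check can still guide design and test planning, but, as noted above, its predictions should not substitute for demonstrated at-sea performance.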
Testing With Artificial Targets
The primary target used for the program was a modified in-service weapon. As a target, it met most, but not all, expectations for emulating the threat. Changes to its speed, noise augmentation, and maneuverability were suggested, but the program budget could not support them. This affected how many tests were done, as well as OPTEVFOR's assessment of the system's performance against the true threat. There were only eight targets available, which constrained the number of runs that could be accomplished during a given period. At the depot, the target had to compete with other programs for assembly and turn-around work. Its reliability was relatively good, but even one target failure would cause major perturbations in the program's test schedule. The pressure to avoid damaging a target affected the rigor of the scenarios to which we were willing to expose it, and ultimately limited our technical understanding of the system/target interaction.
LESSON LEARNED --

Modifications to an existing weapon for use as a target and for firing on T&E ranges must allow for a full characterization of its capabilities and limitations, and its relationship to the actual threat. Enough target assets must be planned and budgeted for early in the test planning process. When possible, use existing target assets as-is and develop test criteria to reflect their limitations.
Available Surrogate Test Sites
A land-based integration facility did not exist for the overall system. During the latter part of EMD, each new technical issue required access to the ship -- which was extremely limited -- to examine problems, to assess alternative solutions, and to verify fixes. Earlier in the program, testing was performed on other ships, but the performance of the system on those ships varied considerably from that for a carrier. There were significant shortfalls in the environmental, self-noise, flow, and turbulence effects available. Every time the system went to sea, even late in the program, something new was learned. TECHEVAL was spent characterizing the system, which detracted from what TECHEVAL should have been focused on: ensuring the system and crew were ready for OPEVAL.
LESSON LEARNED --

Each program must judiciously investigate its risk mitigation efforts to ensure not only that the system performance will meet its requirements at the end of the program, but also that technical problems that could jeopardize mission success are identified and addressed as early as possible. Reduce the number of technical risks early, so that we are not still trying to sort them out late in the EMD phase when we should be refining training and tactics, and incorporating only small system changes. Early risk mitigation can be accomplished by good SYSTEMS engineering, well-structured modeling and simulation, an aggressive test-analyze-and-fix reliability program, and carefully tailored life-testing of components and subsystems. In addition, an integration test site -- either at a land-based facility or in a surrogate ship -- can greatly help in ensuring that the program is not still finding and fixing major problems at the time of TECHEVAL and OPEVAL. Such a test site can certainly be a boon for a program where access to Navy ships for at-sea testing will be very limited. A specific test period should also be planned to establish an environmental baseline when one does not exist.
Following OPEVAL, the PEO convened a Technical Advisory Panel (TAP) to review the requirements and the design of SSTD. In essence, the TAP found the design concept to be valid. They recommended the installation of the planned fixes and the resumption of OPEVAL as soon as possible. Unfortunately, the resources necessary to implement the recommendation were not fully available, and the program has since been restructured. SSTD today is a modular system comprised of the original detection and softkill sub-systems, with capability to accommodate new hardkill systems in the future. The re-baselined SSTD program was again certified ready for OPEVAL and has since successfully completed testing. In retrospect, the system presented for the original OPEVAL simply did not have the requisite maturity to handle the complex mission and demanding environment. With a little more developmental testing and the correction of emergent deficiencies, the system might have passed OPEVAL and perhaps been introduced into the fleet earlier.
Revised: March 16, 1998