Beware of tests that use simulated data to validate product performance, researchers warn

WALTHAM, MASS. -- Testing against simulated data may not be worth much, depending on how closely the simulation reflects real-world conditions, researchers told attendees at an IEEE conference on homeland security.

Info-assurance products such as anomaly detectors check only a certain set of characteristics within the data, and if those characteristics aren’t accurately simulated, the results will lack validity, says John DeVale, a researcher at Johns Hopkins Applied Physics Lab who presented the research at the 2010 IEEE International Conference on Technologies for Homeland Security.

As a result, anomaly detectors tested against simulated data and found effective may prove ineffective when deployed in real networks monitoring real traffic, he says. They may have trouble operating in real-world environments, or may generate blizzards of false alarms.

“Dealing with false alarms is manpower intensive,” DeVale says. “You can’t look at them all, so you want a detector to have a low false-alarm rate.”
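The article doesn't give figures, but the manpower problem is easy to illustrate with hypothetical numbers: at network scale, even a false-alarm rate that sounds low produces more alerts than analysts can triage.

```python
# Illustrative arithmetic only -- these numbers are assumptions,
# not figures from the research.
events_per_day = 10_000_000   # hypothetical daily traffic volume
false_alarm_rate = 0.001      # 0.1% sounds low on paper

false_alarms_per_day = int(events_per_day * false_alarm_rate)
print(false_alarms_per_day)   # 10000 alerts a day for analysts to review
```

At that volume, "you can't look at them all" becomes literal, which is why a low false-alarm rate measured on unrealistic simulated data can be so misleading.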

The solution is to recognize what data subset the anomaly detector looks for and make sure that subset is accurately represented in the simulation, he says. “If assumptions are wrong, you get problems,” DeVale says. “It’s more difficult than people realize.”
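A toy sketch of the pitfall DeVale describes (the detector, thresholds, and traffic values below are all invented for illustration): a detector keyed to a single characteristic, packet size, looks flawless against simulated traffic that holds that characteristic artificially uniform, then trips on legitimate variation in real traffic.

```python
# Hypothetical example -- not DeVale's detector or data.
import random

random.seed(0)

def detector(packet_size):
    # Flags anything outside the "normal" size band the detector assumes.
    return packet_size < 40 or packet_size > 1500

# Simulated benign traffic: every packet a tidy 512 bytes, so the
# characteristic the detector checks is never realistically exercised.
simulated = [512] * 1000
sim_false_alarms = sum(detector(p) for p in simulated)

# More realistic benign traffic: sizes vary, from tiny ACKs to jumbo frames.
real = [random.choice([40, 64, 512, 1400, 9000]) for _ in range(1000)]
real_false_alarms = sum(detector(p) for p in real)

print(sim_false_alarms)   # 0 -- the simulation hides the weakness
print(real_false_alarms)  # nonzero: benign jumbo frames raise false alarms
```

The fix DeVale suggests maps directly onto this sketch: know that the detector keys on packet size, and make sure the simulation reproduces the real-world distribution of packet sizes before trusting the test results.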