Privacy matters: When is personal data truly de-identified?

25.07.2009

They often point to four cases:

And last month, two Carnegie Mellon researchers made headlines when they released the results of a study in which they were able, in fewer than 1,000 attempts, to identify all nine digits of the Social Security numbers of 8.5% of deceased people who were born after 1988.

This academic one-upmanship, much like hackers trying to outdo each other, has led de-identification purists to gravitate toward the so-called "k-anonymity" method of statistical de-identification. Hopefully HHS will back-burner this option, because k-anonymity is to data what chemotherapy is to human tissue: It destroys the good when going after the bad.

According to Columbia University epidemiologist and statistical de-identification expert Daniel Barth-Jones, "The problem with certain de-identification approaches [such as k-anonymity] is that they can badly distort the accuracy of statistical analyses."

"Progress on numerous goals for the government's health IT agenda like quality improvement, patient safety and reducing health disparities could be seriously stunted, or even do more harm than good," he added, "if we aren't conducting our analyses with data that has been de-identified with a rigorous approach for preserving statistical accuracy."
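To make the trade-off concrete, here is a minimal sketch of one common way to reach k-anonymity: coarsening quasi-identifiers until every combination of values is shared by at least k records. The toy records, the function names and the choice to bucket ages by decade and truncate ZIP codes are all assumptions made for illustration. This is not the method used by any researcher or agency mentioned above.

```python
from collections import Counter

# Hypothetical toy records: (age, ZIP code). Invented for illustration only.
RECORDS = [
    (27, "15213"),
    (28, "15217"),
    (29, "15232"),
    (37, "15206"),
    (38, "15208"),
    (39, "15201"),
]

def is_k_anonymous(rows, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    counts = Counter(rows)
    return all(n >= k for n in counts.values())

def generalize(row):
    """Coarsen one record: age to its decade, ZIP code to its first three digits."""
    age, zip_code = row
    return ((age // 10) * 10, zip_code[:3])

def mean_age_exact(rows):
    """Mean age computed from the original, precise ages."""
    return sum(age for age, _ in rows) / len(rows)

def mean_age_from_decades(rows):
    """Mean age estimated from generalized rows, using each decade's midpoint."""
    return sum(decade + 5 for decade, _ in rows) / len(rows)

generalized = [generalize(r) for r in RECORDS]

print(is_k_anonymous(RECORDS, 2))          # every original record is unique
print(is_k_anonymous(generalized, 3))      # each generalized group has 3 members
print(mean_age_exact(RECORDS))             # mean from the precise ages
print(mean_age_from_decades(generalized))  # mean recovered after generalization
```

In this toy case the generalized table is 3-anonymous, but the mean age an analyst can recover from it is 30.0 rather than the true 33.0. That gap is a small-scale instance of the distortion Barth-Jones describes.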