Gov't turns to software to redact public documents

The personal data of millions of U.S. residents may have been exposed by the public posting of official documents, and local governments are increasingly looking for ways to automate the process of cleaning up data being put online.

Among the solutions available is redaction software that allows government agencies to remove sensitive personal data from the online images of public records. The software, which is being used in at least two Florida counties now, works in much the same way antispam software does -- by using algorithms to analyze images for specific phrases or words.

Some vendors use multiple levels of automatic analysis. Others use software to narrow down the number of documents likely to need redaction, then rely on human reviewers to pick out the data to be removed and to train the applications for better automatic redaction.
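The second approach, automatic narrowing followed by human review, can be sketched roughly as below. This is a minimal illustration and not any vendor's actual method: it assumes the document text has already been extracted from the image, and the pattern, the cue words, the confidence scores and the function names (find_candidates, triage) are all hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern: nine digits in 3-2-4 groups, like a Social Security number.
SSN_PATTERN = re.compile(r"\b\d{3}[- ]\d{2}[- ]\d{4}\b")

# Hypothetical cue words that raise confidence when they appear just before a match.
CUE_WORDS = ("ssn", "social security", "soc. sec.")


@dataclass
class Candidate:
    start: int         # offset of the first character of the match
    end: int           # offset just past the last character of the match
    confidence: float  # made-up score used only to route the match


def find_candidates(text, window=20):
    """Flag number patterns and score them by the words that precede them."""
    lowered = text.lower()
    candidates = []
    for match in SSN_PATTERN.finditer(text):
        context = lowered[max(0, match.start() - window):match.start()]
        confidence = 0.9 if any(cue in context for cue in CUE_WORDS) else 0.5
        candidates.append(Candidate(match.start(), match.end(), confidence))
    return candidates


def triage(text, auto_threshold=0.8):
    """Mask high-confidence matches; queue the rest for a human reviewer."""
    redacted = text
    review_queue = []
    for c in find_candidates(text):
        if c.confidence >= auto_threshold:
            # Same-length masking keeps the offsets of later candidates valid.
            redacted = redacted[:c.start] + "X" * (c.end - c.start) + redacted[c.end:]
        else:
            review_queue.append(c)
    return redacted, review_queue


sample = "Grantor SSN: 123-45-6789. Parcel ref 987-65-4321 recorded."
redacted, review_queue = triage(sample)
print(redacted)
print(len(review_queue), "candidate(s) left for human review")
```

In this sketch the number labeled "SSN" is masked automatically, while the unlabeled number is left in place and handed to a reviewer. A reviewer's decisions could in principle be fed back to adjust the cue words or the threshold, which is the training step the vendors describe.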

"It's a new technology, but a proven technology," said Paul Miller, president of Aptitude Solutions Inc. in Casselberry, Fla. Aptitude Solutions provides its aiRedact software to Broward and Hillsborough counties in Florida, as well as to counties in other states.

The removal of sensitive information -- including Social Security numbers, bank account information, driver's license data and personally identifying details -- from public documents is gaining attention because of concerns raised by privacy advocates. They argue that the volume of public documents posted online with sensitive data intact could open the door to a wave of identity theft and fraud. In response, county officials across the nation are increasingly turning to software to remove that data.

Because finding information in scanned images is more complex than simply locating instances of specific words in a text file, redaction can't be done using traditional methods such as word-pattern analysis alone, according to Aptitude Solutions.
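One reason, offered here as an illustration rather than as Aptitude Solutions' explanation, is that text recovered from a scanned image often contains misread characters, so an exact pattern can miss a number that a person would recognize at once. The short sketch below assumes a few common look-alike confusions (the letter S for the digit 5, B for 8, and so on). The substitution table and the function names are hypothetical.

```python
import re

# Exact pattern: works on clean text, such as a word-processing file.
STRICT_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Illustrative look-alike characters that character recognition may confuse with digits.
OCR_LOOKALIKES = str.maketrans(
    {"O": "0", "o": "0", "I": "1", "l": "1", "S": "5", "B": "8"}
)

# Tolerant pattern: accepts digits or their look-alikes in each position.
SLOT = "[0-9OoIlSB]"
TOLERANT_SSN = re.compile(
    rf"(?<![0-9A-Za-z]){SLOT}{{3}}-{SLOT}{{2}}-{SLOT}{{4}}(?![0-9A-Za-z])"
)


def strict_matches(text):
    """Return numbers found by the exact pattern."""
    return STRICT_SSN.findall(text)


def tolerant_matches(text):
    """Return numbers found by the tolerant pattern, with look-alikes mapped to digits."""
    return [m.group().translate(OCR_LOOKALIKES) for m in TOLERANT_SSN.finditer(text)]


clean = "Borrower SSN 123-45-6789 filed"
scanned = "Borrower SSN 123-4S-67B9 filed"  # same number as misread from an image

print(strict_matches(clean))
print(strict_matches(scanned))
print(tolerant_matches(scanned))
```

The exact pattern finds the number in the clean text but returns nothing for the scanned version, while the tolerant pattern recovers it. A looser pattern also produces more false matches, which is one reason a human review step may still be needed.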