Gov't turns to software to redact public documents

13.04.2006

AiRedact automatically indexes and redacts images using algorithms that look for targeted numbers or words, or for related words in context, such as the adjacent phrases 'account number' or 'Social Security number.' Once keywords are found, the software automatically redacts the information, Miller said. The software can also remove personal information from a designated area of a scanned form, as long as the forms have a standard layout with information in fixed locations.
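The two approaches described above, context-based matching and fixed-zone redaction, can be illustrated with a rough sketch. This is not AiRedact's actual code: it assumes the page text has already been extracted from the image, and the phrase list, the number pattern, the 30-character window and the function names are all hypothetical choices made for illustration.

```python
import re

# Hypothetical context phrases; the article names only these two examples.
CONTEXT_PHRASES = ("social security number", "account number")

# Assumed pattern for a "targeted number": digits, optionally joined by hyphens.
NUMBER = re.compile(r"\d[\d-]*\d")


def redact_in_context(text, window=30):
    """Mask a number only if a context phrase appears shortly before it."""
    lowered = text.lower()
    pieces = []
    last = 0
    for match in NUMBER.finditer(text):
        # Look at the characters immediately preceding the number.
        context = lowered[max(0, match.start() - window):match.start()]
        if any(phrase in context for phrase in CONTEXT_PHRASES):
            pieces.append(text[last:match.start()])
            pieces.append("X" * len(match.group()))
            last = match.end()
    pieces.append(text[last:])
    return "".join(pieces)


def redact_zone(pixels, top, left, bottom, right, fill=0):
    """Blank a fixed rectangle of a grayscale image stored as a list of rows.

    Covers rows top..bottom-1 and columns left..right-1, which only makes
    sense when every form places the field in the same location.
    """
    return [
        [fill if top <= r < bottom and left <= c < right else value
         for c, value in enumerate(row)]
        for r, row in enumerate(pixels)
    ]
```

In this sketch a number such as a book or page reference is left alone because no context phrase precedes it, while a number that follows 'Social Security number' is masked character for character.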

As the application looks for candidates for redaction from among millions of document images, several thousand pages are culled and analyzed individually by a person who can verify that the information should be redacted. As the pool of documents is reviewed, the software automatically adjusts to redact the remaining records based on the choices made manually, Miller said.
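The article does not say how the software adjusts to a reviewer's choices. One simple way such feedback could work, offered purely as an assumption, is to track how often the reviewer confirms each detection rule and then apply only the well-confirmed rules to the remaining records. The rule names, the 0.8 threshold and both function names below are hypothetical.

```python
from collections import defaultdict


def learn_trusted_rules(reviewed, min_accept_rate=0.8):
    """Return the rules a reviewer confirmed often enough to trust.

    reviewed: iterable of (rule_name, reviewer_approved) pairs taken
    from the manually checked sample of pages.
    """
    approved = defaultdict(int)
    total = defaultdict(int)
    for rule, ok in reviewed:
        total[rule] += 1
        if ok:
            approved[rule] += 1
    return {rule for rule in total
            if approved[rule] / total[rule] >= min_accept_rate}


def select_for_redaction(candidates, trusted_rules):
    """Pick the unreviewed candidates that were flagged by a trusted rule.

    candidates: iterable of (record_id, rule_name) pairs.
    """
    return [record for record, rule in candidates if rule in trusted_rules]
```

A rule that the reviewer rejects half the time would drop out of the trusted set here, so the records it flagged would not be redacted automatically.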

The amount of time needed to redact the records depends on the hardware used and the number of records that must be checked, he said, but a typical review process takes two to three months. The software typically costs US$200,000 to $300,000, depending on the size of the county, he said.

Although the software won't pick up 100 percent of the data that needs to be redacted, it does do the vast majority of the work, according to Miller. 'It certainly is a challenge,' he said.

In Florida, a recently enacted state law requires counties to have sensitive personal information redacted from all online public records by Jan. 1, 2007. All newly posted public records in the state will have to be redacted automatically after that date. The deadline is being challenged by some county clerks in the state, but remains in place.