New York police department boosts data warehouse

05.09.2006
The New York Police Department, the recipient of a steady stream of plaudits since launching its Real-Time Crime Center (RTCC) more than a year ago, is continuing to build up the data warehouse underpinning it.

Already under way is the addition of more data sources and real-time alerts to the IBM-built data warehouse used by the RTCC and the rest of the 37,000-officer force. The data warehouse will also accelerate the NYPD's famed CompStat crime-mapping tool, though it won't replace it.

"We're in a war, so we need to give our guys in the front lines the best tools possible," said CIO James Onalfo. The NYPD has spent about US$300 million in the past three years on new technology tools.

One of those tools was the RTCC, which was built by systems integrator Dimension Data Holdings PLC and launched in June 2005. The RTCC is essentially a centralized help desk tasked with quickly providing data to the department's 8,000 detectives, who used the tool while solving 74% of all homicides last year.

Though the RTCC has garnered more publicity, the IBM-built data warehouse underpinning it is more fundamental to the entire NYPD.

In the first phase, completed in the middle of 2005, several years of historical data, including complaints, arrests, stops, "question and frisks," criminal summonses, shootings and homicides, was imported into an IBM DB2 8 database running on a pair of IBM P650 eight-way AIX servers, according to Christine Tyler, an associate partner in the public-sector practice at IBM's Global Business Services division. The second phase, which just finished, brought in complaint records going back to 1995 and arrest data going back to 1990, Tyler said. The imported data now includes the free-text notes in the arrest records.
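
As a rough illustration of what searching free-text notes can look like, the sketch below runs a substring query against a small in-memory SQLite table. It is not the NYPD's or IBM's implementation: the article describes a DB2 database queried through Cognos, and the table name `arrest_notes`, the column names, the `search_notes` helper and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical schema and sample rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE arrest_notes (arrest_id TEXT PRIMARY KEY, notes TEXT)")
conn.executemany(
    "INSERT INTO arrest_notes VALUES (?, ?)",
    [
        ("A-1", "Suspect carried a silver gun in waistband."),
        ("A-2", "Tattoo on left arm reads 'Maria'."),
        ("A-3", "No weapon recovered."),
    ],
)


def search_notes(conn, phrase):
    """Return IDs of arrests whose notes contain the phrase.

    SQLite's LIKE is case-insensitive for ASCII by default. Any % or _ in
    the phrase would act as a wildcard; this sketch does not escape them.
    """
    rows = conn.execute(
        "SELECT arrest_id FROM arrest_notes WHERE notes LIKE ? ORDER BY arrest_id",
        (f"%{phrase}%",),
    )
    return [row[0] for row in rows]


matches = search_notes(conn, "silver gun")
```

A substring scan like this is fine for a toy table; a production warehouse would more likely rely on a full-text index.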

"Now we can look for things like a silver gun or a name on a tattoo, or search for a person's name," Onalfo said.

Those databases weren't easy to import. The NYPD has about 55 databases scattered in various locations, most using older technologies such as FoxPro, Microsoft Access, flat file storage and even mainframe-era technologies such as VSAM and Adabas that Tyler joked only "old ladies" like herself are still familiar with.

The third phase, which the NYPD is now beginning, involves connecting additional data sources from within the NYPD and from outside agencies. In doing so, the NYPD is moving from homegrown data integration software based on Cobol to Informatica Corp.'s PowerCenter tools.

"We won't initially save time, but our investment is reusable, so it will lower future development [costs]," Tyler said.

Informatica is already used to send 911 emergency call data to the RTCC, which transmits alerts straight to detectives in the field. For the data warehouse, Informatica will help integrate sources such as the fingerprint databases run by New York state and the FBI, cutting a fingerprint-matching process that can take up to three weeks to one that is completed in seconds, Onalfo said.

Other real-time features in the works include generating alerts when a "stop and frisk" report matches a name in an outstanding warrant in the data warehouse. New York is even considering emulating a city of London project and setting up thousands of cameras at major intersections around the city to scan license plates in real time, Tyler said (see "Closed-circuit TV may aid London bombing investigation"). The data warehouse would be able to match those license plates and generate alerts if matches are made.
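
The warrant-matching alert described above can be sketched as a lookup against an index of outstanding warrants. This is a minimal illustration under stated assumptions, not the department's actual logic: the `Alert` record, the function names, the identifiers and the sample data are all hypothetical, and exact matching on a normalized name is a simplification.

```python
from dataclasses import dataclass


def normalize_name(name: str) -> str:
    """Lowercase a name and collapse whitespace so trivial variations still match."""
    return " ".join(name.lower().split())


@dataclass(frozen=True)
class Alert:
    report_id: str
    name: str
    warrant_id: str


def build_warrant_index(warrants):
    """Map each normalized name to the list of warrant IDs outstanding for it."""
    index = {}
    for warrant_id, name in warrants:
        index.setdefault(normalize_name(name), []).append(warrant_id)
    return index


def check_report(report_id, name, warrant_index):
    """Return one Alert per outstanding warrant whose name matches the report."""
    key = normalize_name(name)
    return [Alert(report_id, key, w) for w in warrant_index.get(key, [])]


# Hypothetical sample data, invented purely for illustration.
warrants = [("W-1001", "John  Doe"), ("W-1002", "Jane Roe"), ("W-1003", "john doe")]
index = build_warrant_index(warrants)
alerts = check_report("R-42", "JOHN DOE", index)
```

The same dictionary-lookup shape would apply to license plates scanned by cameras, with plate strings in place of names. Real name matching would also need to handle misspellings, aliases and people who share a name.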

"Like the private sector, the NYPD data warehouse appears to be all about CDI -- except in this case, it's not customer data integration, it's criminal data integration," said James Kobielus, an analyst at Sterling, Va.-based Current Analysis Inc.

Kobielus, who had not been briefed by the NYPD, said the department's drive for real time is a good idea and parallels what many Fortune 500 firms are doing today. To ensure no latency, however, he recommended that the NYPD consider adding management and monitoring tools, as private-sector organizations are doing.

The data warehouse, currently 80GB in size and accessible through Cognos 8 business intelligence tools, should grow to about 400GB by the time the third phase is completed, Tyler said.

Onalfo said he would eventually like the data warehouse to serve as a central information hub for as many as 30 local government agencies, the district attorney's office and police departments in neighboring Nassau and Westchester counties. The data warehouse will also take over CompStat, launched in 1994 and widely credited with helping reduce crime in the city. Currently, it takes two to three employees at each precinct a full day to prepare weekly crime-trend reports. Eventually, precincts will be able to generate reports on a weekly, daily or even real-time basis via dashboards that commanding officers can monitor.

To speed up the capture of data from reports still mostly written with pen and paper, the NYPD is trying special wireless pens and pads from IBM that let officers continue to handwrite reports in the field while the data is simultaneously uploaded to a precinct server, Onalfo said.

Asked whether this massive centralization of data should arouse privacy concerns, Onalfo said that procedures such as proper warrants are still required to search for data. "It's a well-controlled process. Even I don't have the ability to look at data," he said.

As for data theft or exposure due to mobile devices that the NYPD is slowly rolling out to police officers, Onalfo said that laptops and handheld devices will eventually be protected by a combination of smart cards, fingerprint-based biometrics and passwords.