Google Flu Trends spreads privacy concern

10.12.2008

last month that search queries such as "flu symptoms" tend to be very common during flu season each year. A comparison of the number of such queries with the actual number of people reporting flu-like symptoms shows a very close relationship, it said. As a result, tallying each day's flu-related searches in a particular geography allows the company to estimate how many people have a flu-like illness in that region.

In making the announcement, Google noted that it had shared results from Flu Trends with the Epidemiology and Prevention Branch of the Influenza Division at CDC during the last flu season and noticed a strong correlation between its own estimates and CDC's surveillance data based on actual reported cases. Google said that by making flu estimates available each day, Google Flu Trends could provide epidemiologists with an early-warning system for flu outbreaks.

Rotenberg said the service was potentially useful, but much depended on the kind of search data that Google is collecting and analyzing to make its predictions. Google has said that the database it uses for Flu Trends retains no identity information, IP addresses or any physical user locations. What is not clear, however, is whether the company is completely deleting IP addresses and, if so, when. Another issue, he said, is whether Google is merely anonymizing IP addresses by redacting some of the numbers in each address.

Google also claims that, as part of its overall privacy policy, it anonymizes all IP addresses associated with searches after nine months. Yet in an apparent contradiction, when introducing Flu Trends, Google noted that it uses both current and historical search data, dating back to 2003, to make its predictions, Rotenberg said.

Jeffrey Chester, executive director of the Center for Digital Democracy, said Google's also makes it important for the company to disclose what kind of data it is collecting and using for Flu Trends.