Using data drawn from queries entered into Google, Microsoft and Yahoo
search engines, scientists at Microsoft, Stanford and Columbia
University have for the first time been able to detect evidence of
unreported prescription drug side effects before they were found by the
Food and Drug Administration's warning system.
Using automated software tools to examine queries by 6 million Internet
users taken from Web search logs in 2010, the researchers looked for
searches relating to an antidepressant, paroxetine, and a
cholesterol-lowering drug, pravastatin. They were able to find evidence that the
combination of the two drugs caused high blood sugar.
The study, which was reported in the Journal of the American Medical
Informatics Association on Wednesday, is based on data-mining techniques
similar to those employed by services like Google Flu Trends, which has
been used to give early warning of the prevalence of the sickness.
Currently, the F.D.A. asks physicians to report side effects through a
system known as the Adverse Event Reporting System. But its scope is
limited by the fact that data is generated only when a physician
notices something and takes the initiative to report it.
The new approach is a refinement of work done by the laboratory of Russ
B. Altman, the chairman of the Stanford bioengineering department. The
group had previously explored whether it was possible to automate the
process of discovering “drug-drug” interactions by using software to
hunt through the data found in F.D.A. reports.
The group reported in May 2011 that it was able to detect the
interaction between paroxetine and pravastatin in this way. Its research
determined that the patient’s risk of developing hyperglycemia was
increased compared with taking either drug individually.
The new study was undertaken after Dr. Altman wondered whether there was
a more immediate and more accurate way to gain access to data similar
to what the F.D.A. had access to.
With the aid of Microsoft researchers, he was able to acquire anonymized
data taken from a software toolbar installed in Web browsers by users
who allowed their search histories to be collected. The Microsoft
researchers were able to look at 82 million searches for drug, symptom
and condition information.
The researchers first identified individual searches for the terms
paroxetine and pravastatin, as well as searches for both terms, in 2010.
They then computed the likelihood that users in each group would also
search for hyperglycemia as well as roughly 80 of its symptoms — words
or phrases like “high blood sugar,” “blurry vision,” “frequent
urination” or “dehydration.”
They were able to determine that people who searched for both drugs
during the 12-month period were significantly more likely to search for
terms related to hyperglycemia, compared with those who searched for
just one of the drugs. (Approximately 10 percent, compared with 5
percent and 4 percent for just one drug.)
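
The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' actual method: the function name group_rates, the input format (a dictionary mapping each user to a list of query strings), the simple substring matching and the shortened symptom list are all hypothetical. The study itself used roughly 80 symptom terms.

```python
from collections import defaultdict

# Hypothetical, abbreviated term list; the study used roughly 80 terms.
HYPERGLYCEMIA_TERMS = {
    "hyperglycemia",
    "high blood sugar",
    "blurry vision",
    "frequent urination",
    "dehydration",
}


def group_rates(user_queries):
    """Return, for each drug group, the share of users who also
    searched for a hyperglycemia-related term.

    user_queries: dict mapping a user id to a list of query strings
    (an assumed input format, chosen for illustration).
    """
    # group name -> [number of users, number of users with a symptom search]
    counts = defaultdict(lambda: [0, 0])

    for queries in user_queries.values():
        text = [q.lower() for q in queries]
        has_parox = any("paroxetine" in q for q in text)
        has_prava = any("pravastatin" in q for q in text)

        if has_parox and has_prava:
            group = "both"
        elif has_parox:
            group = "paroxetine only"
        elif has_prava:
            group = "pravastatin only"
        else:
            continue  # user searched for neither drug

        counts[group][0] += 1
        if any(term in q for q in text for term in HYPERGLYCEMIA_TERMS):
            counts[group][1] += 1

    return {g: hits / total for g, (total, hits) in counts.items()}
```

Applied to a full year of logs, a calculation of this shape would yield one rate per group, which is the kind of figure the researchers compared across the three groups.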
They were also able to determine that people who did the searches for
symptoms relating to both drugs were likely to do the searches in a
short time period: 30 percent did the search on the same day, 40 percent
during the same week and 50 percent during the same month.
“You can imagine how this kind of combination would be very, very hard
to study given all the different drug pairs or combinations that are out
there,” said Eric Horvitz, a managing co-director of Microsoft
Research’s laboratory in Redmond, Wash.
The researchers said they were surprised by the strength of the “signal”
that they were able to detect in the millions of Web searches and
argued that it would be a valuable tool for the F.D.A. to add to its
current system for tracking adverse effects.
“There is a potential public health benefit in listening to such
signals,” they wrote in the paper, “and integrating them with other
sources of information.”
In an interview, the researchers said that they were now thinking about
how to add new sources of information, like behavioral data and
information from social media sources. The challenge, they noted, was to
integrate new sources of data while protecting individual privacy.
The F.D.A. has financed the Sentinel Initiative, an effort begun in 2008 to
assess the risks of drugs already on the market. Eventually, that project plans to
monitor drug use by as many as 100 million people in the United States.
The system will be based on information collected by health care
providers on a massive scale.
“If you have a well-instrumented population of 100 million Americans,
then it can become trivial to test our predictions,” Dr. Altman said.