[tt] New HIV Browser

Isabelle Hakala <ismirth at gmail.com> on Tue Jun 3 18:29:28 UTC 2008

http://www.ucsc.edu/news_events/text.asp?pid=2242
By Branwyn Wagman (831) 459-3077; bwagman at soe.ucsc.edu; Tim Stephens  
(831) 459-2495; stephens at ucsc.edu

A new HIV data browser developed by the University of California,  
Santa Cruz, and the nonprofit organization Global Solutions for  
Infectious Diseases (GSID) will give researchers access to a wealth  
of data collected during clinical trials of an AIDS vaccine. Although  
the vaccine did not succeed in preventing infections, the clinical  
trial generated a huge amount of valuable data for researchers  
studying how the virus evolves and causes new infections.

Modeled on the UCSC Genome Browser, the GSID HIV Data Browser is the  
brainchild of Phillip Berman, professor and chair of biomolecular  
engineering in UCSC's Baskin School of Engineering. Berman helped  
oversee the clinical trials, which ended in 2003, when he was senior  
vice president for research and development at VaxGen, the company  
that developed the vaccine and conducted Phase III clinical trials in  
North America, Europe, and Thailand.

"After the trials concluded, I spent a couple of years trying to  
think what was the most important thing I could do for HIV research,"  
Berman said. "I concluded it was using new technology to preserve the  
data from these clinical trials and present it in a form useful to  
the scientific community."

In 2004, Berman cofounded GSID, based in South San Francisco and  
dedicated to combining knowledge and expertise from the biotechnology  
industry and the public health sector to address infectious disease  
problems in the developing world. He joined the UCSC faculty in 2006.

"Despite the fact that the vaccine trial didn't work, a huge amount  
of useful information was obtained," Berman said. The "North  
American" trial included about 60 different clinical sites in North  
America and one site in the Netherlands. Of particular value to  
researchers are the genetic sequences of the viruses that infected  
participants during the trial.

"The trial represented the only up-to-date broad survey of virus  
sequences from new infections that had ever been carried out," Berman  
said. "Every time there was a new infection in the vaccine or placebo  
group, the virus was sequenced. The sequence information provides the  
best picture we have about what the immune system sees when there is  
a new infection."

This is important, Berman said, because other major repositories of  
HIV sequence data are not annotated for the time after infection, the  
clinical status of the patient, or the histories of the specimens  
sequenced. That limits their usefulness for studying such a rapidly  
evolving virus.

HIV is highly mutable and evolves in response to attacks by the  
immune system. As a result, HIV isolated from a patient years after  
the initial infection is genetically different from the virus that  
caused the infection in the first place. A vaccine should target the  
most infectious form of the virus, Berman said. Yet all the vaccines  
tested so far have been based on viruses isolated from patients with  
longstanding infections.

"A current hypothesis in HIV vaccine research is that the antigenic  
structures of HIV viruses that mediate new infections differ from  
those recovered from people long after infection," Berman explained.  
"The specimens in this set represent the largest group from new  
infections that has ever been collected."

Besides viral genome-sequence data, the database links to a  
repository of preserved specimens (blood samples and cells) that  
researchers can access from GSID and the National Institutes of Health  
(NIH) for further study.

"This is the first time that an HIV sequence database has been linked  
to a specimen repository and a database of clinical information,"  
Berman said. "These clinical specimens are longitudinal, collected  
from the same person during a two-year follow-up period. This will  
allow investigators to study the evolution of the virus and the  
evolution of the immune response and clinical outcomes."

At UCSC, Berman teamed up with the Genome Browser group to develop a  
browser for the sensitive clinical data collected during the vaccine  
trial. Jim Kent, associate research scientist for the UCSC Genome  
Browser and principal investigator on the project, said it was the  
first time his group had worked with data from participants in a  
clinical trial.

"This data must be handled differently and great care taken with  
confidentiality," Kent said. "We learned from this project how to  
build the infrastructure to cope with that. This will be useful for  
other medical projects, such as cancer genomics, in the future."

Fan Hsu, director of proteomics for the UCSC Genome Browser, said the  
emphasis on security was very different from past projects. "Before,  
everything we have worked on is totally open, totally public. With  
the GSID project, only authorized users can access the data, so we  
needed to set up special controls," Hsu said.

How to display the very large number of HIV sequences on the browser  
was another challenge. "Our original genome browser has only one  
reference genome. For this HIV database, we have about 350 infected  
people and more than 1,000 sequences," Hsu said.

Hsu and software developer Galt Barber adapted the genome browser  
software to accommodate the large number of HIV sequences, the  
data-security requirements, and interactive selection criteria for  
viewing the data. As the project evolved, Hsu also coordinated the transfer  
of the software to GSID. The UCSC team, which also included Erich  
Weiler, Robert Kuhn, and Ann Zweig, worked nights and weekends to  
bring the new browser online.

The resulting GSID HIV Data Browser is a customized version of the  
UCSC Genome Browser. It provides researchers with searchable  
demographic and clinical data from volunteers who became HIV infected  
during the VaxGen clinical trial. The browser allows users to align  
viral sequences with one another and with reference or consensus  
sequences.
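
To illustrate what comparing sequences against a consensus can mean, here is a
minimal Python sketch. It is not taken from the GSID HIV Data Browser, whose
implementation the article does not describe. The function names `consensus`
and `differences` are hypothetical, the input is assumed to be already aligned
(equal-length strings), and the toy sequences are invented for illustration,
not real HIV data.

```python
from collections import Counter


def consensus(aligned_seqs):
    """Return the most common character in each column of an alignment.

    Assumes the sequences are already aligned, i.e. all the same length.
    Ties go to the character seen first in that column.
    """
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    # zip(*aligned_seqs) walks the alignment column by column.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned_seqs)
    )


def differences(seq, reference):
    """List (position, reference_char, observed_char) where seq differs."""
    return [
        (pos, ref, obs)
        for pos, (ref, obs) in enumerate(zip(reference, seq))
        if ref != obs
    ]


# Invented toy alignment, for illustration only.
toy_alignment = ["ACGTAC", "ACGTTC", "ACCTAC"]
toy_consensus = consensus(toy_alignment)
toy_diffs = differences(toy_alignment[2], toy_consensus)
```

Positions here are zero-based offsets into the alignment. A real tool would
also need to handle gaps and ambiguity codes, which this sketch ignores.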

"This is something where the university can make a difference,  
because the private sector is not so interested in vaccines; they're  
not so profitable," Kent said. "There is very little economic  
incentive to develop an AIDS vaccine, but there is a tremendous  
humanitarian incentive."

Kent hopes that just as the UCSC Genome Browser has continued to  
build the collaborative nature of the genomics research community,  
this HIV data browser will help motivate the AIDS research community  
to work together and pool their data.

Vaccine development efforts have been repeatedly frustrated. An HIV  
vaccine candidate developed by the pharmaceutical company Merck  
recently failed in clinical trials cosponsored by NIH. "The recent  
failure of the Merck HIV vaccine has thrown the field into turmoil,"  
Berman said. "All the best ideas for an HIV vaccine in the past 20  
years have failed. The information in this database is now more  
critical than anyone could have imagined. It tells us what's being  
transmitted."

The next phase of the HIV browser project involves releasing the  
sequence data from infected participants in the Phase III clinical  
trial that VaxGen conducted in Thailand.

"In the future, the database will be expanded to allow associations  
between virus sequences, clinical data, immune response data, and  
host genetics," Berman said. "We hope to eventually include data from  
other HIV vaccine trials sponsored by the NIH, private companies, and  
other HIV vaccine research organizations."

GSID is making these data and serological samples available to the  
HIV research community through an agreement with VaxGen and with  
funding provided by the Bill and Melinda Gates Foundation.

For information on accessing the GSID HIV Data Browser and background  
on the clinical trials, visit the GSID web site.
