A new computational method is helping researchers design cheaper and more effective drugs.
"The longer a peptide is, the less effective it is as a drug," said Meghan Bellows-Peterson, a doctoral candidate in the department of chemical and biological engineering at Princeton University.
Peptides are chains of amino acids that serve as the building blocks of life, forming molecules such as proteins.
"It [a long peptide] may target the protein you're trying to hit very well," Bellows-Peterson explained, but longer peptides are more likely to have difficulty penetrating the cell membrane, or get degraded or chopped up by the body's natural enzymes.
Longer peptides are more expensive to manufacture, too. Fuzeon, a peptide used to treat HIV-1, consists of 36 amino acids and costs nearly $20,000 US per patient per year. And, as you might expect with so many amino acids, it breaks down rapidly; consequently, patients have to inject their treatments twice daily.
Now, by applying well-understood mathematical techniques, Christodoulos Floudas' Princeton group (of which Bellows-Peterson is a member) has demonstrated a general method for discovering more peptides that can bind to a given protein.
In the case of HIV-1, for example, they found five candidate peptides, each consisting of only 12 amino acids. Then Johns Hopkins medical researchers led by Robert Siliciano tested the five candidates to see if they could really prevent HIV-1 from infecting human cells in vitro. The result? Four inhibited HIV-1, and one of those was even able to inhibit strains of HIV that are resistant to Fuzeon.
Peptide drugs like Fuzeon work by binding to a key site on a protein. Researchers use something called an energy function, or forcefield, to describe the energy of peptides before and after they have bound to a target site; the more the energy of a peptide drops when it binds to the target site, the more strongly a peptide will bind to the target site.
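To make that relationship concrete, here is a minimal sketch of ranking candidates by how far their energy drops on binding. The peptide names and energy values are invented for illustration (arbitrary units) and do not come from any real forcefield or from the Princeton group's work.

```python
# Hypothetical energies in arbitrary units; not produced by a real forcefield.

def binding_energy_change(energy_unbound, energy_bound):
    """Change in energy on binding; a more negative value means a larger drop."""
    return energy_bound - energy_unbound

def rank_by_binding(candidates):
    """Sort (name, energy_unbound, energy_bound) tuples, largest energy drop first."""
    return sorted(candidates, key=lambda c: binding_energy_change(c[1], c[2]))

candidates = [
    ("peptide_A", -10.0, -14.0),  # energy drops by 4.0
    ("peptide_B", -12.0, -21.0),  # energy drops by 9.0
    ("peptide_C", -8.0, -9.5),    # energy drops by 1.5
]

# Names ordered from strongest to weakest predicted binder.
ranked = [name for name, _, _ in rank_by_binding(candidates)]
print(ranked)
```

Under these made-up numbers, the peptide whose energy falls the most is listed first, which mirrors the rule of thumb in the paragraph above.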
Before they can design a new peptide drug, researchers must search the Protein Data Bank for a protein target associated with the disease or virus they want to treat, and for a peptide that is known to bind to that target. The proteins in the data bank have all been solved, meaning their structures have been determined, and are stored as 3-D templates that provide the x-, y-, and z-coordinates for all of the atoms in the protein. If researchers wish to target a protein that isn't in the data bank, they must first solve the protein's structure using structural biology methods. (You can read more about how Structural Biology Grid generates these templates here.)
They must also decide what constraints they wish to place on the peptide in terms of biological characteristics, in what way it will be allowed to mutate, and which energy function they wish to use.
In particular, the number of possible peptide mutations can be reduced by allowing only hydrophilic, or "water-loving," amino acids on the surface of the peptide and only hydrophobic, or "water-hating," amino acids buried in its core.
The de novo design framework, created by Floudas' research group, finds new peptide drugs via a two-stage process.
"In step one, the optimization model finds amino acid mutations for the peptide that would lower the energy of the template structure," Bellows-Peterson said. "The energy function is dependent upon the distances between amino acids, so it takes into account interactions between all the amino acids in the complex."
This process varies somewhat depending on the type of template they have; single templates are more common and quicker to compute, but less accurate. The method used by the de novo design framework was pioneered by Floudas' group in the early 2000s. It models the problem as a "quadratic assignment-like problem." That means that the energy function is quadratic and it "assigns" mutations to certain positions to minimize the energy.
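The idea of "assigning" amino acids to positions so as to minimize a pairwise energy can be sketched on a toy problem. Everything below is an assumption made for illustration: the three-letter alphabet, the allowed residues per position, the distances, and the contact scores are invented, and the sketch simply enumerates every sequence, whereas a real design framework would hand the model to an optimization solver.

```python
from itertools import product

# Hypothetical per-position choices (e.g. position 0 restricted to one residue).
ALLOWED = [["K"], ["A", "L"], ["A", "L", "K"]]

# Hypothetical distances between template positions (arbitrary units).
DISTANCE = {(0, 1): 4.0, (0, 2): 8.0, (1, 2): 5.0}

# Hypothetical symmetric contact scores, keyed by alphabetically sorted pair.
CONTACT = {
    ("A", "A"): -1.0, ("A", "K"): 0.5, ("A", "L"): -1.5,
    ("K", "K"): 2.0, ("K", "L"): 0.5, ("L", "L"): -2.0,
}

def pair_energy(i, a, k, b):
    """Toy energy for residue a at position i and residue b at position k."""
    return CONTACT[tuple(sorted((a, b)))] / DISTANCE[(i, k)]

def total_energy(sequence):
    """Sum the pairwise terms over all pairs of positions (the quadratic part)."""
    n = len(sequence)
    return sum(
        pair_energy(i, sequence[i], k, sequence[k])
        for i in range(n)
        for k in range(i + 1, n)
    )

# Brute-force search over every allowed assignment; feasible only for toy sizes.
best = min(product(*ALLOWED), key=total_energy)
print("".join(best), total_energy(best))
```

The objective is quadratic in the sense that each term depends on the choices made at two positions at once, so no position can be optimized in isolation.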
In this case, the de novo design framework fed a single template for a 14-amino acid peptide binding to the HIV-1 virus into the General Algebraic Modeling System (GAMS) program to generate 500 12-amino acid peptide sequences. The calculations took approximately 10 minutes on Bellows-Peterson's workstation, a compute time that will increase exponentially with the number of amino acids in the sequences.
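A rough way to see why sequence length matters so much is to count the candidate sequences. The sketch below assumes the 20 standard amino acids at every position, and, purely as a hypothetical, a restriction that leaves 10 choices per position (the actual number permitted by a hydrophilic/hydrophobic rule depends on the classification used). It counts sequences only; it does not model how a solver's running time actually scales.

```python
def sequence_space(length, choices_per_position=20):
    """Number of distinct sequences of the given length."""
    return choices_per_position ** length

# Unrestricted 12-residue peptides built from the 20 standard amino acids.
full_12 = sequence_space(12)

# Hypothetical restriction to 10 allowed amino acids at each position.
restricted_12 = sequence_space(12, choices_per_position=10)

# Adding one more position multiplies the unrestricted space by 20.
growth_factor = sequence_space(13) // sequence_space(12)

print(full_12, restricted_12, growth_factor)
```

Each extra residue multiplies the number of candidates, which is why constraints that trim the choices at every position pay off so quickly.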
The second step in the de novo design framework is broken into two parts. First, the researchers examine something called fold specificity, which measures the probability that a given peptide sequence from step one will fold into the same shape as the template structure. For this part, the de novo framework uses ASTRO-FOLD or Tinker/CYANA. The former, which was developed by Floudas' group, uses global optimization to determine the 3-D structure of a protein; Tinker/CYANA combines the separate Tinker and CYANA programs to generate the structures more quickly, but identifies only local solutions.
Second, the top-ranked fold specificity sequences are fed into Rosetta++, which generates 46,000 structures per sequence in order to calculate the binding affinity, a measure of how strongly the peptide will bind to the target protein site. The peptides with the greatest affinity are recommended for experimental investigation. The whole second step, including the fold specificity and binding affinity calculations, required between four and six hours on 482 compute nodes, said Bellows-Peterson.
While the potential replacements for Fuzeon are promising, it is the general applicability of this process that is truly exciting. "The new aspect of this model is the approximate binding affinity calculations, which give a way of filtering sequences for the desired function. Most design methods in the past simply check the structure of the designed peptides, basing the design on the hypothesis that structure implies function," Bellows-Peterson explained. "It can be applied to any system. We've applied it to HIV, auto-immune disease targets, cancer targets, diabetes targets. It is a way to greatly decrease the sequence space to 5 or 10 lead candidate peptides which can then be tested experimentally."