Showing posts with label structure-function. Show all posts

Sunday, March 25, 2012

FRETting about folding

Nature is the consummate magician. There are things that human intelligence and its servant computers fail to do despite the mightiest struggles; nature does them with an insouciant shrug. For a biologist, the most maddening example of this is the folding of proteins.


A protein is a linear molecule, hundreds or thousands of atoms long. Every third atom in this chain has a chemical decoration; there are twenty different types of decorations, some acidic, some basic, some neutral, some positively charged, or negatively, or relatively large or small. Depending on the linear arrangement of these decorations, the protein can coil up like an old-fashioned telephone cord, or fold itself in pleats, or sort of randomly squiggle about, in any combination of different patterns in three-dimensional space. Here’s an example, the botox protein I wrote about earlier.

If this protein were to fold up in the wrong shape, it wouldn’t work at all. So how does a protein always fold up into the right shape?


Now, we know that the decision about how the bits of the protein line up next to each other is in some way dependent upon the order of decorations on the atoms in the protein chain. Some decorations like to be next to each other, while others shun each other’s company. In a really simplified image, you can imagine a protein as being like a whip decorated with a plus and a minus static charge, a north and a south pole magnet, a bit of fuzz and a bit of claw Velcro, a “male” and a “female” Lego block, a similar pair of Duplo blocks, and an electrical plug and a socket. If you were to randomly shake that whip around, you could predict that you would always end up at the same end state: plus with minus, north with south, and so on. This arrangement is the most stable state, the lowest-energy state.


Proteins (in theory) behave similarly, only the “whip” is shaken by the random jiggling of thermal motion—and, usually, a specific protein with a specific arrangement of decorations will always end up in the same three-dimensional shape. And, as a testament to human ingenuity and the power of computers, we can actually predict the most stable, lowest-energy state of short proteins with relatively simple arrangements of decorations.


We run into problems, though, when we try to predict the three-dimensional structure of more everyday proteins—which have hundreds of decorations. The most sophisticated computers get bogged down with all the possible permutations, and we have a mixed record at best for understanding how these things fold up. And while we crack our skulls about the problem, nature casually takes proteins and effortlessly folds them into the right shape, over and over again. It keeps a biologist humble.


We don’t even really understand the kinetics of the process—some proteins fold up into their finished shape in milliseconds, while others that are not much longer take thousands of times longer to fold up. What takes longer? Do the slower proteins have more possibilities to try out before they settle on the best shape? Are they just not as flexible? A neat technical tour-de-force gives us at least a little clue towards this last problem. A group of researchers at the National Institutes of Health (I approve of this use of my taxes) found an interesting similarity between the behavior of “fast-folding” and “slow-folding” proteins.


So, consider these two proteins.

This one, nicknamed WW, folds into this shape rapidly.

This one, named GB1, folds into shape 10,000 times more slowly.


What does that mean, what I just wrote? Those numbers are based on taking a huge number of unfolded protein molecules of WW or GB1, putting them in solution, and measuring how long it takes for half of them to assume their folded shape. So, in this case, “how fast something folds” is descriptive of a large population, but doesn’t tell us much about how an individual molecule behaves. How long does it take a single individual protein to transition from unfolded to properly folded?


A morbid analogy would be the half life of a human population: if you looked at all the people born in 1903, you could calculate a half-life, or how long it takes for half of that group of people to die. This tells you a lot about how long an average person lives, which is many years. It tells you nothing about how long it takes to transition from alive to not-alive, which is usually a rapid transition.


What the NIH researchers did was to modify these proteins so that they could examine them, and distinguish more precisely how long it took an individual molecule to change from unfolded to properly folded. To do this, they attached specific dye molecules to either end of the unfolded protein. These dye molecules have a really cool property: if you zap one with the right amount of energy in the form of purple light, it will actually dump that energy onto the other dye molecule, which will fluoresce, shining with red light. They will only do this, though, if the dye molecules are really close together, and in this setting, they are only close together if the protein is properly folded. This process is called Förster Resonance Energy Transfer, or FRET. To go back to the image of a decorated whip, this is like attaching dye markers to either end of the whip. If the whip were unfolded and you shone purple light on it, it wouldn't fluoresce, because energy couldn't get from one end of the whip to the other:

If it were partially folded, it would fluoresce weakly, because it would be hard for energy to get from one dye molecule to the other.


If it were fully folded, and you shone purple light on it, it would be easy for energy to get from one dye molecule to the other, so it would fluoresce brightly.

So, you could measure how long it takes the whip to get folded by measuring how rapidly red fluorescence increases. In the same way, measuring how long it takes an individual protein to change from unfolded to properly folded was a matter of measuring how rapidly FRET efficiency increased.
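For readers who like to see numbers, the distance dependence behind this trick can be sketched in a few lines of Python. This is a minimal sketch and not the researchers' analysis: it uses the standard Förster relation, in which transfer efficiency falls off with the sixth power of the distance between the two dyes. The Förster radius of 5 nanometers and the three end-to-end distances are assumed, illustrative values, not figures from the paper.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float = 5.0) -> float:
    """Fraction of the donor dye's energy handed to the acceptor dye.

    Standard Forster relation: E = 1 / (1 + (r / R0)**6).
    The default R0 of 5 nm is an illustrative assumption, not a value
    taken from the paper.
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be non-negative and R0 positive")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Hypothetical end-to-end distances for the three cartoons of the whip.
for label, r in [("unfolded", 10.0), ("partially folded", 6.0), ("folded", 2.5)]:
    print(f"{label:>16}: r = {r:4.1f} nm, efficiency = {fret_efficiency(r):.2f}")
```

With these made-up distances the efficiency climbs from about 0.02 to about 0.25 to about 0.98, which is why a jump in red fluorescence is such a sharp signal that the two ends of the protein have come together.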


The data from these experiments are not all that fun to look at, involving a fair amount of

But the bottom line was that for both fast- and slow-folding proteins, the transition from no structure at all to completely folded structure was about the same, in the range of a hundredth of a millisecond. The "slow-folding" proteins seem to dawdle and delay and do everything they can to put off folding, but once they decide to fold, they fold just as rapidly as the "fast-folding" ones. It's kind of like the situation mentioned earlier with the population of humans born in 1903; some may live a long time, others die in infancy, but the transition between alive and dead always takes the same, brief amount of time.
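The difference between a population half-time and a single molecule's transition time can be made concrete with a toy simulation. This is only a sketch under stated assumptions: each molecule waits a random, exponentially distributed time in the unfolded state, then takes the same brief time to actually fold. The mean waiting times (one millisecond and ten seconds) are invented so that they differ ten thousand-fold; only the transition time of a hundredth of a millisecond comes from the text.

```python
import random
import statistics

TRANSITION_TIME_S = 1e-5  # "a hundredth of a millisecond", from the text


def simulate_folding(mean_wait_s: float, n_molecules: int, rng: random.Random) -> list:
    """Return the time at which each simulated molecule finishes folding.

    Each molecule dawdles for an exponentially distributed waiting time,
    then spends the same short transition time going from unfolded to folded.
    """
    return [
        rng.expovariate(1.0 / mean_wait_s) + TRANSITION_TIME_S
        for _ in range(n_molecules)
    ]


rng = random.Random(2012)  # fixed seed so the run is repeatable
# Assumed mean waiting times, chosen only to differ by 10,000-fold.
fast = simulate_folding(mean_wait_s=1e-3, n_molecules=20_000, rng=rng)
slow = simulate_folding(mean_wait_s=10.0, n_molecules=20_000, rng=rng)

# Population "half-time": the moment at which half the molecules have folded.
fast_half = statistics.median(fast)
slow_half = statistics.median(slow)

print(f"fast half-time: {fast_half:.2e} s")
print(f"slow half-time: {slow_half:.2e} s")
print(f"ratio of half-times: {slow_half / fast_half:,.0f}")
print(f"transition time for every molecule: {TRANSITION_TIME_S:.0e} s")
```

The half-times come out roughly ten thousand-fold apart even though, by construction, every molecule in both populations takes exactly the same time to make the jump. The simulation says nothing about why the slow molecules wait longer, which is exactly the open question.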


So, we know a little more about the process of protein folding now. If we are trying to understand why two proteins fold up at rates that differ ten thousand-fold, we at least know where not to look for answers. However, we still don’t really know what the answer is--what the slow protein is doing when it's not folding up. Nature, like a good magician, is still reminding us that we are in the dark.


Hoi Sung Chung, Kevin McHale, John M. Louis, and William A. Eaton (2012). Single-Molecule Fluorescence Experiments Determine Protein Folding Transition Path Times. Science 335: 981-984.


The Wikipedia web page on FRET is not bad. The above is obviously a gross simplification.

Tuesday, March 20, 2012

A lesson from Botox

There's not much in common between what the Real Doctor studies--ophthalmology--and my chosen field of microbial genetics and physiology. One of these rare commonalities is an interest in Botox. For the Real Doctor, it's a handy tool to paralyze an ocular muscle and cure a case of walleye. For me, "Botox" is botulinum toxin, a poison secreted by Clostridium botulinum, and a nice example of how bacterial toxins work.


Right off the bat, I gotta say that bacterial toxins are amazing. It always gives me pause when I see examples of evolution reaching across domains of life. I mean, it is not too surprising when a bacterium evolves a chemical signal to communicate with another bacterium--after all, they share the same biochemistry. But with toxins, bacteria have evolved a very complicated set of molecules that target very specific proteins on the surface of very specific nerve cells in only a few types of organisms; the bacterium is reaching across domains to communicate biochemically with an alien biochemistry.


That said, there are other reasons to be interested in bacterial toxins. They’re amazingly potent: 30 nanograms of botox, or a volume of about one billionth of a sugar cube, is enough to kill a human. Their power and specificity make them valuable tools for understanding the biochemical processes of our own cells, and some (such as botox) have medical uses. Bacteria, in their diversity, have evolved a huge number of bacterial toxins specific to different tissues in different hosts, but most of these toxins are variations on a couple of themes.


One major class of bacterial toxins can be thought of like nuclear missiles; they require two sophisticated components to do their deadly job. By itself, an A-bomb is not so useful a weapon--unless you deliver it to its target, you will only damage yourself. By itself, a guided missile won't do too much damage to its target--it's merely a bus to deliver the payload. But put together the A-bomb and the bus, and you have yourself a tremendously destructive weapon. The bacterial "A-B" toxins are the same way. The "A" component of these toxins is like the A-bomb--a tremendously destructive protein molecule, but it needs to be delivered to its target. The "B" component is like the missile (or bus), a protein that is by itself harmless, but protects the "A" component and delivers it to its target.


Botulinum toxin is an example of an A-B toxin, with a wicked but delicate A component and a very interesting B component. The bacteria release this toxin in our guts, an extremely acidic environment, and it must travel into our blood and then find a specific molecule on a specific type of nerve cell to do its dirty work. Some recent work by a group at the Sanford-Burnham Medical Research Institute in La Jolla, California, has given us a peek into how the B component delivers the destructive A component to its target, and also--like any good study--raised some more interesting questions.


The researchers focused on the first stage of this deadly missile's trajectory--the trip from the acidic digestive system into the more neutral bloodstream, a hazardous journey that almost no proteins can survive. As long as people have known that botulinum toxin is made of protein, it has been a mystery how it avoided destruction. So, what about B allowed this? By making various modified versions of the B component, they found that only one of the four parts of this carrier was necessary for protecting the A component from acid. Then, since the function of any protein is intimately tied to its structure, they tried to find out the structure of this minimal B and how it fit together with the A. They found a couple of interesting surprises.


First, the B protein is sensitive to acid, just as susceptible to acid as the delicate A component of the toxin. Both B and A, individually, are attacked by acid at a few specific sensitive places, and break into a few distinctive pieces. However, if you put them together in an acidic environment, they laugh it off. The reason is in how these proteins fit together: all the bits that are sensitive to acid are covered up by the way the proteins nestle up against each other. They snuggle against each other to protect each other from harm, a romantic image if we weren’t dealing with deadly poisons.


A second interesting feature came to light when the researchers looked at exactly how these proteins fit together. The researchers were able to recognize the specific parts of each protein that touched each other. These parts were notable because of how they responded to acidity: in the harsh environment of the stomach, these parts of the proteins remained neutral, and helped to hold the A and B complex together. However, when the protein complex was moved to a neutral environment, these parts of the protein turned acidic--enough that they would cut the bonds between the A and the B proteins. The connections between the A and B proteins acted like the explosive bolts that hold together the stages of a missile, holding them together through the boost phase but dramatically separating the two when the booster is done. Indeed, the researchers conclude that this is how the first part of botulinum toxin's journey goes: the minimal part of the B component protects the A component in the stomach and carries it into the bloodstream. Once the two are in the bloodstream, the neutral environment causes B to cut its links to A, and the A component can go on its merry way.


This is all pretty cool. It solves a mystery about how botulinum toxin works, and actually suggests a way that biotechnologists could design protein-based drugs that could be taken orally. If a protein drug could be designed to fit with a carrier similar to the botulinum toxin B component, then it could survive the trip through the stomach and be absorbed into the bloodstream, where it could do its job.


However, for my money, the most surprising (and instructive) thing revealed by the structure of the minimal botulinum toxin is that the "A" and "B" components have almost identical structures. Proteins are linear strings of hundreds of amino acids, and these strings clump and fold together in unique and characteristic ways. The structures are often quite complex, with enough helixes and turns and twists that a single protein's structure would resemble a plate of spaghetti, if the entire plateful were one loooong noodle. It is this unique molecular shape that gives each different type of protein the ability to do its specific job--say, disable a nerve cell for the A component of botulinum toxin, or protect another protein from acid for the B component. It is surprising that two proteins that you might expect to have structures as different as an A-bomb and a booster rocket look almost the same.


Check out this picture: the A component is an orange stringy ribbon, and the B component is a green stringy ribbon. The two structures are superimposed upon each other, and you can see that they are almost identical. (This is just the first third of each of the proteins, but the remaining 2/3 are available, in stereo, at http://www.sciencemag.org/content/335/6071/977/suppl/DC1)


One of the great big hairy problems of modern biology is how proteins--linear molecules--fold up into unique three-dimensional shapes, and botulinum toxin gives us another bracing surprise here. I teach my intro bio students that the "primary structure," or linear arrangement of amino acids in the protein, determines the three-dimensional structure of the protein. A change in the amino acid sequence should result in a change in the three-dimensional structure. I'd expect the botulinum toxin A and B components to have very similar amino acid sequences, given their structural similarity. However, their amino acid sequences are only 20% identical. That's crazy.
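For anyone wondering what "20% identical" means in practice, here is a minimal sketch of the calculation in Python. It assumes the two sequences have already been lined up position by position and are the same length; real comparisons need an alignment step first, which is skipped here. The two ten-letter sequences are invented for illustration and are not the actual toxin sequences.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which two sequences match.

    Assumes the sequences are already aligned and of equal length;
    gaps and the alignment step itself are deliberately left out.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Two invented ten-letter "proteins" in one-letter amino acid code.
# They are NOT the real toxin sequences; they agree at 2 of 10 positions.
toy_a = "MKLVDTSGEA"
toy_b = "MRIANQSPHW"

print(f"{percent_identity(toy_a, toy_b):.0f}% identical")  # prints "20% identical"
```

At that level of identity, four out of every five positions differ, which is why two such proteins folding into nearly the same shape is so startling.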


The authors, being most interested in biochemistry, don't pursue a Big Question raised by this--how does this evolve? The genes for botulinum toxin A and B components are probably also about 20% identical (mea culpa--I don't have my computer with me, so I have a hard time looking this up. It is left as an exercise for the interested reader). It's possible that two entirely different genes evolved to have such similar structures, but I don't think it's at all likely. More likely, it is a case of an ancestral gene being duplicated, and then each of the two copies undergoing evolutionary change (this hypothesis is weakly supported by the observation that the A and B genes tend to be clustered). These changes altered the sequences of the proteins, while preserving their three-dimensional structures. This would be like sitting down with a sonnet and a thesaurus, and changing 8 of every 10 words. If you were careful, you could preserve the structure, meter, and rhyme scheme of the sonnet, while completely changing the meaning of the poem—and a textual analysis would show no relationship between the source and the end product.


If I could tell these researchers what to do, I would have them gather more structure and sequence information from a variety of related toxin proteins. It would be fun to make a family tree, and see just how much structure could be conserved as amino acid sequence changes.


Shenyan Gu, Sophie Rumpel, Jie Zhou, Jasmin Strotmeier, Hans Bigalke, Kay Perry, Charles B. Shoemaker, Andreas Rummel, Rongsheng Jin (2012). Botulinum Neurotoxin Is Shielded by NTNHA in an Interlocked Complex. Science 335: 977-981.


Peck, MW (2009). Biology and genomic analysis of Clostridium botulinum. Advances in Microbial Physiology 55: 183-265.

Wednesday, September 28, 2011

Proteins, Puzzles, and Perjury

There was a bit of news last week that generated headlines such as “Gamers Solve Problem that Stumped Scientists.” As always with science by press release, the reality is cool but not that cool.


The “Protein Folding Problem” is one of the most damnable problems facing biology. It would really be nice to reliably predict protein structures. Knowing the structure of a protein allows us to understand how the protein works, so we can do useful things like design effective drugs. However, precisely determining the 3-D structure of a protein is extremely time-consuming, fiddly work that has a low probability of success. So, there’s a lot of interest in using computers to predict the 3-D structure of a protein.


The problem is this: genes encode proteins, and we can easily “read” a gene to predict the linear sequence of amino acids in a protein. However, a linear sequence of amino acids is useless: it must fold on itself in an often-incredibly complicated structure to make a functional protein. Starting with a linear sequence—basically a string—there’s a nearly infinite number of three-dimensional structures that are possible. Some possible shapes can be eliminated, since certain amino acids in the string don’t want to be near each other or near water. Some other possible shapes are more likely, since certain amino acids in the string want to be near each other, or near water.


In principle, those simple rules should make it possible to predict how a linear sequence of amino acids will fold to make a protein. However, a typical protein is made of several hundred amino acids. So, while computers are OK at predicting structures of very short fragments of proteins, predicting the structure of a protein requires more power. Lots of power: the number of possible ways a typical protein can fold far exceeds the number of possible moves in a game of chess (about 10^46), so IBM built a successor to the chess-playing “Deep Blue” supercomputer and called it “Blue Gene,” intending it to work on this problem. Blue Gene has been among the most powerful supercomputers for several years, but it still is far from efficient at predicting protein structures.
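A back-of-the-envelope count shows how quickly the possibilities pile up. The sketch below makes a deliberately crude assumption: each amino acid can independently sit in one of just three states. Real proteins are far messier, and the three-state figure is an illustrative assumption, not a measured quantity. The chess figure from the paragraph above is read here as ten to the 46th power.

```python
def conformation_count(n_residues: int, states_per_residue: int = 3) -> int:
    """Number of chain shapes if each amino acid independently picks a state.

    Three states per residue is a deliberately crude, assumed figure.
    """
    if n_residues < 0 or states_per_residue < 1:
        raise ValueError("need n_residues >= 0 and states_per_residue >= 1")
    return states_per_residue ** n_residues


CHESS_ESTIMATE = 10 ** 46  # the chess figure from the text, read as 10 to the 46th

for n in (10, 50, 100, 300):
    count = conformation_count(n)
    exponent = len(str(count)) - 1  # order of magnitude of the count
    verdict = "more" if count > CHESS_ESTIMATE else "fewer"
    print(f"{n:>3} amino acids: about 10^{exponent} shapes ({verdict} than chess)")
```

Even under this stingy assumption, a chain of only 100 amino acids already has more possible shapes than the chess figure, and a protein of several hundred amino acids leaves it far behind.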


A somewhat more effective approach to “the protein problem” has been to use distributed computing—borrowing time on hundreds or thousands of networked PCs when their owners are not using them. SETI@home, which screens huge amounts of radio telescope data for potential signals of extraterrestrial life, is a famous example of this. Biochemists have Rosetta@home, which uses the same approach to predict protein structure. This venture has actually produced some predictions which jibed pretty well with the actual structures. But Rosetta is still limited; being a computer program, it relies on brute force and wastes resources looking at possibilities that are “stupid.”


One way to get around this problem is to borrow from humans something that computers lack: intuition. This has been the approach of the creators of “Foldit,” a program that turns the protein folding problem into a game. Players are given a snippet of a protein, and (not needing to understand anything about Van der Waals forces or acid-base interactions), jiggle it around until it reaches a very stable conformation—which corresponds to a high score. As the authors of the paper that made the headlines say, this program uses the power of games…

“to channel human intuition and three-dimensional pattern-matching skills to solve challenging scientific problems. Although much attention has recently been given to the potential of crowdsourcing and game playing, this is the first instance that we are aware of in which online gamers solved a longstanding scientific problem. These results indicate the potential for integrating video games into the real-world scientific process: the ingenuity of game players is a formidable force that, if properly directed, can be used to solve a wide range of scientific problems.”

So what did the gamers actually do? They started with a bunch of predicted structures for one protein, generated by Rosetta@home, and tweaked them. Once the actual protein structures were experimentally determined (again, a terribly painful and difficult task), the gamers’ predictions were noticeably better than Rosetta’s. Here’s a picture comparing their results with the actual structure—the linear string of amino acids is sometimes presented as a flat ribbon, sometimes as a noodle; it can curl up like a telephone cord, or lie flat in a sheet, but this picture shows one string.

The red ribbons represent the predictions of Rosetta; the yellow represent the predictions of the gamers; and the blue is the real structure of the protein. All three are superimposed. In almost all parts of the protein, the yellow, gamers’ structure is closer to the real, blue structure than the red, Rosetta structure. Bravo gamers! However, it is worth noting that the gamers started from structural predictions by Rosetta, and there are still places where neither Rosetta nor the gamers predicted reality very well.


This result leaves the protein structure problem in an interesting place. On the one hand, progress could be made by using more of that intangible, unquantifiable whatzit, human intuition. However, this is not intellectually satisfying; it would be nice to say that we really understood the rules of protein folding, and if we could understand them, we could teach these rules to a sufficiently powerful computer. After all, a computer has no intuition, but then again, nor does a string of amino acids, which just follows the rules of physical law. So, clearly, we need bigger, more powerful computers which can more closely simulate reality.


This seemed like an insurmountable challenge: only so many people will join a distributed network such as Rosetta@home, and machines much bigger than Blue Gene are prohibitively expensive. However, Felix Balatro and his coworkers at Miskatonic University and in the Ukraine arrived at a devious solution to the problem. In a series of stunning papers starting in the December 2011 issue of the (admittedly rather obscure) Ukrainskii Zhurnal Tsilkovita Durnitsya, Balatro predicted the structure of a half-dozen difficult proteins with unprecedented accuracy.


These results were not widely reported in the popular news, but they raised a lot of questions in academia. After all, Miskatonic was not known as a computer science powerhouse, and the Ukrainian group seemed suspiciously difficult to contact for discussion about methods. Nonetheless, the results kept coming in the early part of 2012, and the predictions only gained in sophistication. In fact, one of the predictions was actually used to develop an anti-retroviral drug.


The curtain was finally lifted on the mystery by the German weekly der Zwiebel. The elusive Ukrainians were a front group for an organized crime syndicate that rented out time on the botnet of more than seven million computers infected with the “Conficker” worm. Balatro realized that this botnet was by far the world’s largest distributed computing network, and that its masters—although very punctilious about their payment schedule—were essentially in the business of renting computing power. Granted, nearly all of their other customers were criminals, and the power was typically used for card-hacking and DDOS attacks, but the rates were very cheap and the programmers very clever. Balatro arrived at the conclusion that this was the best way he could use his insubstantial research funding.


This disclosure left the scientific community, and society as a whole, in a quandary. Some demanded that Balatro’s papers should be retracted—but they couldn’t say exactly why, since the results were valid and there were no obvious conflicts of interest. Some prosecutors wanted to bring suit—but there really weren’t any injured parties, and no US laws were broken. An intriguing new avenue for drug design had been suggested by some of his results—but would such a drug be ethically tainted?


Although the scientific worth of Balatro’s results remains unchallenged, the ethical clouds surrounding the results continue to gather. An anonymous whistleblower recently revealed to der Zwiebel that DARPA actually considered and partially developed a worm that would allow it to run simulations of atomic weapon tests at low cost. Balatro himself provides the most recent puzzle; he was unexpectedly absent for the first day of his own class in the summer 2012 session at Miskatonic University, and the university administration has not been able to get in contact with him for over a month. There is concern that the Ukrainians did not appreciate the attention he drew to them, or worse—that he failed to make a payment.


Allen, F., et al. (2001). Blue Gene: A vision for protein science using a petaflop supercomputer. IBM Systems Journal 40: 310-327.


Firas Khatib, Frank DiMaio, Foldit Contenders Group, Foldit Void Crushers Group, Seth Cooper, Maciej Kazmierczyk, Miroslaw Gilski, Szymon Krzywda, Helena Zabranska, Iva Pichova, James Thompson, Zoran Popović, Mariusz Jaskolski, David Baker (2011). Crystal structure of a monomeric retroviral protease solved by protein folding game players. Nature Structural and Molecular Biology. Published online 18 September 2011; doi:10.1038/nsmb.2119.


Balatro, Felix, and Українська асоціація обманщики (2012). You shouldn’t believe everything you read. український журнал цілковита дурниця 22: 18-41.


Balatro, Felix, and Українська асоціація обманщики (2012). It’s probably a good idea to run these author names through Google translate. український журнал цілковита дурниця 22: 138-141.


Balatro, Felix, and Українська асоціація обманщики (2012). Miskatonic University may ring a bell for sci-fi fans. український журнал цілковита дурниця 23: 77-91.