My PhD is in Biophysics. Yeah, biophysics. I normally get the following response:
Starting only this week, I'm still finding my feet in the lab where I'm working, and I'm only beginning to scratch the surface of the mountains of literature which are pertinent to my (impenetrable)PhD title:
G-protein regulation phosphoinositide 3-kinase conformation and dynamics in signal transduction, autophagy and cellular sorting.
Or to paraphrase:
Cancer.
Because, eventually, everything you learn in biochemistry will, in some way, and at some time be made applicable to cancer.
In all seriousness though, this PhD is actually relevant to cancer. The main focus of my study are phosphoinositide 3-kinases (PI3 Kinases for short), which are pretty interesting little enzymes. Kinases are enzymes which phosphorylate - i.e. tag a phosphate group onto a target. The target of PI3 Kinases being a molecule of fat - called phosphatidylinositol - which is part of the cell membrane.
Phosphorylation is a big deal in biochemisty. It's how everything gets done. Attaching a phosphate group to an enzyme is the atomic equivalent to flicking a molecular switch (although not necessarily onto the "on" position). Phosphate groups consist only of a handful of atoms, but can radically change the workings of a entire enzyme -consisting of many thousands of atoms. These little phosphate achieve these enzymatic adjustments by altering the local electrostatic environment fairly drastically - causing a of modification to the shape and dynamics of protein, fundamentally responsible for any changes in activity. It's these very small, very negatively charged bundles of atoms which can ultimately change the fate of the cell and , by extension, the organism as a whole i.e you and me.
A diagram of a G-protein (or more accurately a G-protein coupled receptor (GPCR).*
Back to PI3 Kinases though. Changing cell activity is ultimately down to signalling, such as sending a hormone from the pancreas, through the blood stream to the cells in question. Signals from outside the cell however have to somehow manage to effect the internal workings of the cell. Some signals can just saunter in across the cell membrane, if they're small and of the correct charge. Many of the larger signals, like hormones, however have a specific protein with is a receptor sticking out of the cell membrane - and that's where the "G-protein" part comes in. G-proteins are a class of cell receptors which span the cell membrane, allowing the translation of signals which were received on the external face of the cell membrane to be transmitted into signals inside of the cell. G proteins only function when they are bound to their own unique highly specific signalling molecule - their ligand.
G-proteins are a HUGE deal in biochemistry. More time and effort around the world goes into understanding G-proteins than many people care to imagine. Some are fairly well understood, whilst others are complete black boxes of understanding. One fairly well understood G-protein is rhodopsin, the protein responsible for all that vision you take for granted.
The general outline for this kind of signalling is the following:
signal binds to G-protein on the cell surface-> G-protein binds to PI3 Kinase inside of the cell -> activates PI3 Kinase -> PI3 Kinase phosphorylates lipid
So now you know what the G-proteins and phosphoinositide 3-kinase and singal transduction bits of the PhD title are about. As for the autophagy and cellular sorting I'm still in the dark about them.
“If you want to make an apple pie from scratch, you must first create the universe.”
Life is composed primarily of 6 types of atom (or elements); carbon, nitrogen, oxygen, hydrogen, sulphur, and phosphorus. That's it really - and I included sulphur because I was feeling generous. Collectively, these atoms are sometimes refered to as CHON (or CHNOPS). All the other atoms which life uses are just the bells and whistles really - things like potassium, magnesium, and iron give biochemistry an extra kick when it needs to do something in a hurry.
Newts are cool, ok?
It's remarkable then that life is so diverse. With a toolbox consisting of only 6 fundamental units, life somehow seems to be able to achieve a extraordinary level of diversity and complexity, from newts to newfoundland terriors, whilst only having six fundamental building blocks to do this with.
With this in mind I'm going to start a series called life from the ground up. Life from the ground up is going to be a series of posts about how life can be viewed concurrently in both a reductionist and holistic way. In other words, it's a series on how a handful of atoms can be the root of a disease, the difference between cure and poison, and a cause of speciation.
Affixing Atoms into Amino Acids
With only 6 atoms to play with, producing life was always going to be difficult. It's amazing that it happened in the first place, but with the benefit of hindsight, it's probably a good thing that it did. For simplicity I'm going to annex Life (rather arbitrarily) into two realms, into the DNA and Protein domains, and for simplicity's sake (it'll become clear later) I'm going to focus on proteins for the moment.
Proteins are the "do-ers" of the biochemical world. DNA doesn't really do anything, it's a repository of information - which in itself is immensely important - but in isolation DNA would just sit around and not much Life would happen. It's proteins which interact with DNA, allowing it to be read and utilised.
Proteins, or more accurately enzymes, are the facilitators, capable of achieving chemical slights of hand, enabling chemical reactions to proceed within the cell at speeds several orders of magnitude faster than they would do so uncatalysed. What makes enzymes amazing is their plasticity and their diversity of function - just about every chemical reaction that keeps us ticking over is aided by enzymes.
Where do those six types of atom come in? Well, proteins are exclusively composed of these six elements. With only 6 components, it could be imagined that proteins consist of these atoms in every imaginable configuration. These proteins are huge molecules, consisting of thousands of atoms - surely the diversity of life can be attributed to the innumerable, most likely infinite, number of different ways that these 6 atoms can be joined together, in endless complexity?
These atoms are joined together in 20-odd ways. Yup, 20 ways, and what's more these 20 ways are all strikingly similar, and are known as amino acids.
It's like asking a painter to paint thousands of pictures of innumerable different things, and giving him oil paints of every conceivable colour, tone and hue, and coming back to find that every image he painted was done in 20 odd shades of purple.
Here's an amino acid (you can zoom, spin, change the viewing style, go nuts) (oh you need Java enabled):
Above (hopefully) you'll see a 3D copy of the amino acid alanine. In the viewer, carbon atoms are grey, oxygen red, nitrogen blue, and hydrogen are white. Alanine doesn't contain sulphur or phosphorus (in fact phosphorus isn't found really in nascent amino acids), but later on we'll see amino acids with sulphur incorporated, and phosphorus tacked on.
Alanine is one of the simplest of all the amino acids. Amino acids are so called because they have a carboxylic acid group (the red carbon/oxygen end of the molecule) and an amine group (the blue nitrogen end. You can see a carbon atom joining the carboxylic and amine groups, which is known as the C-alpha carbon. Above the C-alpha carbon is another carbon atom with three hydrogen atoms coming off it - known as a methyl group. The methyl group is the only bit that changes between each of the 20 amino acids, and it's known as the R group.
And that's about it. The diversity of the amino acids is found in R groups - of which there are twentyish different types. Next time we'll see how these R groups impact on the character of the amino acids, and we'll get a flavour of how they join up to make proteins.
“-It'll be easy-peasy-lemon-squeezy. -No, it won't. It'll be difficult-difficult-lemon-difficult”
Doing molecular biology is becoming easy. It’s facile. It’s easier to cock-up baking a Victoria Sponge than to fail a SDS-PAGE experiment – trust me. The tools are easier to use, the reagents purer, and the equipment getting cheaper. This progress means that a lay person with the perquisite levels of intelligence and resource can quite easily dabble in molecular biology. The gentleman (or lady (!)) scientist – a staple figure of Victorian Britain – can be quickly making a comeback.
With this rise in accessibility, recent graduate students in biochemistry are often hit by a one-two punch of anxiety and inferiority when they are about to accept their £9,000 (soon to be £27,000) diploma. A disquieting revelation dawns insidiously; I have spent the past three years learning to follow various recipes which anyone with the inclination and resource could follow.
Granted, biochemistry graduates are (hopefully) very well versed in the molecular basis for these recipes, with an intimate knowledge of which atoms go where, when and how along the metabolic pathway. That knowledge is largely defunct however when the application of the science can be achieved in the relatively simplistic terms of “add 300µl of colourless liquid A to 300µl of colourless liquid B, leave at 4˚C for 30 minutes, measure fluorescence at 450nm”. Although that may sound unduly technical, the reality is that after a half-hour introduction to the workings of a pipette and fluorimeter, the exercise becomes wonderfully (or woefully) easy, with the greater part of the brain being occupied with a crossword during the 30 minute wait.
With this greater accessibility, the application of biotechnological advances is far more amenable. Putting molecular biology into the hands of the curious, a notion espoused by groups such as DIYBio or the BioPunk Movement, may well create a generation of gentleman scientists. A Gentleman (or lady) scientist, tinkering with bits of DNA and protein, may well stumble across the secret of life. That wasn’t me being facetious – Einstein came up with the theory of special relativity sitting on his arse after a hard day in the patent office.
There are limitations however imposed on the gentleman molecular biologist scientist. Firstly, it’s widely believed that there are no more “low hanging fruit” discoveries in the scientific arena. All the easy stuff has already been found. Basic research of life processes is difficult-difficult-lemon-difficult, and it takes more than a couple of professors, some funding, and an attractive/gifted PhD student to make any new discoveries.
Secondly, and related to the first point, is that the stay-at-home scientist is limited in their scope of research by the proper grown-up scientists. Equipment especially can be highly specialised to certain aspects of the science. It’s hard to imagine any gentleman structural biologists – a science which requires access to an extremely powerful X-Ray sources.
Finally, there is the serious concern of bioterrorism, or more precisely, the neologism “bioerrorism”. Bioerrorism is a term given to a biohazard cock-up on grand proportions – either putting themselves at risk or the wider (global?) public. Bioerrorism may be over hyped, but certain chemicals which are prevalent in biochemical labs are pretty nasty. There’s certainly more potential for disaster in a home-made biochemistry lab then in a beetle collectors or amateur botanist.
One huge advantage that the Gentleman Scientist has over the student however is that he can go whenever his mind wanders. Assuming he is self-financed, his pocket decides how to fund the science – rather than a bureaucratic research council. Whether it matters or not that no-one in necessarily concerned with the ethics of his research may raise concern.
More people practicing biotechnology is surely nothing but a good thing; inoculating the public with some knowledge of biotech will hopefully dampen the torrent of Frankenstein-Huxley-Faust-Piranhamoose media frenzy which accompanies any discovery or achievement in biochemistry or biotechnology.
"I predict that within one-hundred years, computers will be twice as powerful, 10,000 times larger and so expensive that only the five richest kings of Europe will be able to afford one".
The mind boggling rise of computer technology from abstract plaything of Victorian gentry, to 70s technophobe joke, to inevitable robot overloads, is both marvellous and boringly ubiquitous to my generation, brought up with Windows '95 and Netscape.
For people old enough to remember when a Casio digital watch was both a technological wonder and status symbol of conspicuous consumption however, computer technology holds an uneasy place in there life, occupying both the "it's cool to Google my name" and the "it's scary there's a picture of me when I Google my name" parts of their brains.
Two properties which helped stoke this explosion in computer technology were both the plasticity and "scalability" of computer technology. The plasticity of computer tech is the fact that the "bits" of a computer can really be made of anything - in a "normal" computer it's electricity, but it could be anything given enough imagination. Even things which we don't fully understand - such as quarks.
Lego Vs PlayMobil.
Scalability describes whether a technology is capable of being grown, or whether it's a stand-alone application. A good analogy is Lego. A Lego model can be pulled apart and reduced to it's basic and easily-understood components, i.e. the bricks, and built and scaled up into more complex structures. Compare this to something like PlayMobil - a "standalone" product which can't be reduced and changed or built upon. Computers are like Lego - they have basic parts which can grow in complexity.
One amazing example which illustrates these two properties was unveiled the other week in the journal Science, a scalable computer - made of DNA (Science 113:1196-1201).
This isn't the first DNA computer, but it definitely shows more promise that previous attempts - which were stand-alone constructs. This new computer can solve the square root of any number up to and including 15 (which is 3.87298335 by the way), and it only takes 7 to 10 hours figure this out - I know impressive right?
Being facetious aside, it's impressive that DNA can be cajoled into achieving this mathamatical feat. But further to this is that the design of this DNA computer is eminently scalable as it is essentially based on the same logic used in computers - Boolean logic.
An AND gate.
Boolean logic is the essence behind the earliest computers and electronics - manifest in a technology called logic gates. If you did GCSE physics, you probably came across these as AND and OR gates in circuit diagrams, but there are plently of exotic logic gates like NOR, NAND, XOR and XNOR.
The basic principle is that the gates takes a various "input" signals and then transforms them into something else, an "output signal". An AND gate (pictured) works when the various input signals (labelled as A and B in the image) are both True - or in binary terms, 1. When this is true, the signal coming from the other side of the AND gate (from wire O) will also be True (or 1). An OR gate alternatively has an output of 1 when either A or B (or both) equal 1.
These logic gates can be wired up in large numbers in various sequences to produce some very nasty and complex circuit diagrams which can achieve wonderfully complex computations. The DNA computer was achieved using some of the most basic properties of DNA to produce logic gates.
Compulsary Image of DNA
DNA is a double stranded molecule, with the two complementary strands being highly specific for each other. The DNA computer uses this property by creating DNA nanostructures called "seesaw gates".
Essentially seesaw gates consist of pieces of double-stranded DNA which wait around for a signal - a complementary single-stranded piece of DNA - to reach them. Using some DNA-based jiggery-pokery, the incoming single strand (if sufficiently concentrated) displaces one of the strands inside the seesaw gate, and hybridises to the free strand. The displaced single strand of DNA is then a new signal, which can go to another seesaw gate - and so the signal is propagated throughout the circuit.
Seesaw gates can be tuned into make their properties more interesting. This is achieved by increasing the concentration of the incoming signal DNA required to cause of strand displacement of the gate DNA. By having a few seesaw gates in order, along with a fluorescence-based (and slightly more complex) "reporter gates", we can produce DNA-based NOT and AND gates.
What's great about this method is it's scalibility - just string together the NOT and AND gates and you too can create a DNA computer. What's more is that stringing up about 30 of these seesaw gates can produce a circuit which can produce the square root for 15 numbers - imagine what can be done with 300 or 3000 seesaw gates! Entire fully understood DNA computers could , if robust enough, be used to create synthetic life in an entirely comprehensive and determinate way.
"If you want to change the world in some big way - that's where you should start - biological molecules." - Bill Gates
Molecular biology is starting to undergo a cultural change - moving from labs to garages and spare rooms. Soon every man and his dog will soon be meddling with molecular biology in a vision which is reminiscent of the the 1980s computer-garage-hackers - who were so last millennium by the way.
At the heart of this movement is the hope that the manipulation of DNA, the operating system of the cell, which was once so mind-bogglingly-Nobel-Prize-winningly complex that it was the preserve of Professors and PhDs only, is gradually becoming facile, and may one-day be commonplace. It's a gradual distillation of the disciplines of recombinant DNA production and synthetic biology into workaday and useful "tools".
I use the word tool in an abstract sense, rather than a hammer and spanner sense, in the way that a PC is a tool - a facilitator for (hopefully) greater things. Extending the analogy of computer technology, people don't really need to understand the concepts of binary and logic circuits to print off an essay from Word '95 - and the same can be said for synthetic biology. If the world of molecular biology was standardised enough, and familiar enough, and most importantly cheap enough, people would be able to manipulate (and create) biology to their heart's content.
Facilitating this movement from lab space to loft space are two movements.
First, molecular biology is pretty open source - and the hope is that it will become increasingly open source. Extraordinary useful resources, little things like the Human Genome for example, are essentially open to Joe Public - even if it's difficult for Joe Public (or myself for that matter) to make head or tail of the information presented. Programs which present the information in a sensible and utilitarian manner however are all over the web - are are extremely helpful. Other databases, such as the PDB which focuses on different (but not necessarily separate) areas of molecular biology, are popping into existence all the time - with a basic prerequisite of open access for all. These databases are the raw material for the budding garage DNA enthusiast - it's a library is to an aspiring authors. Garage DNA meddlers - just like authors - can cut and paste DNA from databases to create new and inventive organisms.
An OpenPCR machine - for $512 you too can make copies of DNA!
Second, is that the price of manipulating DNA is gradually moving into the amenable zone. Reading or "sequencing" DNA is almost becoming redundantly inexpensive. The individual letters (known as bases) of a string of DNA can be read for far less that $0.01 each. Within the decade (if not sooner) it is believed that sequencing your 3 trillion bases long genome could be $1,000 (or less - if the dollar is still in existence by 2020 and not replaced with renminbi). In 2020 you'll probably be able to spit in a tube, or swab your cheek, send it off mail order, and hey presto in a few days you'll have your genome sent back to you on a futuristic floppy disk. To put this in perspective, the sequencing of the first Human Genome took the best part of 21 years and three billion dollars. Other applications are abound; hand-held DNA sequences may make it quicker to sample the DNA of an unknown species of plant rather than look it up in a field guide. Producing your own DNA (now as oligonucleotide synthesis) however is still more expensive and labour intensive, but there are plenty of plans to overcome this. And as my old pal Craig Venter showed, synthesising an entire genome is within the realms of possibility - although rather out of the price range of (presumably) you and (definitely) I.
When manipulating DNA, sequencing DNA, and synthesising DNA becomes easy and affordable interesting things are set to happen. When what was once a major limitation of "programming" biology is removed, the whole arena becomes infinitely more accessible. Imagine early computer programmers worrying about how much each line of code cost the programmer to write or even read - it would have been an omnipresent inhibitor to everyone's creativity. When the cost-constraints for tinkering with DNA are a relic of the past - that's when coming home from work and playing with the genome of the organism your designing in your garage becomes a possibility.
Telling people you're a biochemistry student is often greeted with a mixture of a non-committal head movement and a slow but determined escape. More foolhardy people then go on to ask what I actually do, given I’m working in industry at the moment, and I cough out an answer along the lines of ‘X-ray crystallography’, which is normally causes their face to have a flash of frustration - I think that people believe I am trying to be obstructive or aloof for the sake of appearing mysterious. The main problem when explaining X-Ray crystallography is that it’s hard to be succinct and in any way true to the subject at the same time. Also another problem is that, as a student, most of the time I have no idea what exactly I am doing.
Ribosome structure determined by X-ray crystallography from Venki Ramakrishnan
What I do have some idea about however is the end result of X-ray crystallography; a model of a protein molecule or a nucleic acid– a very, very detailed model.
In the very best models, consisting of hundreds or thousands of amino acids, single atoms of each amino acid can be discerned – giving an amazingly complex model of the protein totaling tens of thousands of atoms. Navigating and orientating yourself around such a vast and detailled model is difficult enough, but producing such a thing is far, far more difficult. So, why would you want to expend so such time and effort on producing such a model of a protein or a nucleic acid?
Producing these models - although innately intricate and beautiful - isn't an end in itself. These models are capable of telling us vast amounts of information on how the proteins work and interact with other proteins. A detailed-enough model can describe the mechanism of an enzyme's catalysis to the atom; something which reminds me of how astounding this field of science is. There is something both unsettling and supremely comforting when a process is understood in such detail that we understand it on the most fundamental level.
However, achieving these models is anything but easy. A whistle-stop tour towards a model would go something like this;
Firstly, there is the matter of producing enough protein in order to examine it via crystallography. This involves (normally) knowing the genetic sequence, and implanting the gene into our good old friend E.coli. There's all sorts of issues here, such as your protein falling apart (degradation), or clumping together (precipitation/aggregation) or the protein simply being insoluble.
Secondly, there's the small matter of purifying the protein. There's a few thousand proteins in E.coli which you won't want to look at, and so finding a method of extracting your protein of interest is an art in itself. Methods generally rely on the concept of chromatography, a method of separating a mixture of stuff based on its affinity to something else. Being able to produce a large amount of pure protein isn't simple nor easy, and this step alone can take an inordinately large amount of time.
Pretty...
Third, there's the crystallography. Taking your high-purity protein and producing crystals requires a skill somewhere between alchemy and magic. It involves screening your protein against hundreds (to thousands) of different solutions. There's all sorts of frustrations to be found here with two near identical solutions able to give widely different results. Protein crystals are typically less that a millimetre long, and some grow in horrible solutions at 4 degrees centigrade - which makes harvesting them not so much fun. They also like fragmenting apart when you touch them and they also like no being exposed to the air for any amount of time. They're fussy little twats. Very small nylon loops are typically used to manipulate and play with them. Before the last step, the crystals are frozen in liquid nitrogen to about -150 degrees centigrade to stop them being damaged prior to diffraction.
Finally, you diffract your crystal using X-Rays, to produce a diffraction pattern - something between a Rorschach test and a join-the-dots pattern. Using a reasonable amount of computer power and complex mathematics (including Fourier Transformation)these images can be analysed and used to produce one of those lovely models I was talking about. Between a gene and a protein model is (on average) a year of tears, blood, swearing and sweat.
It's not a particularly romantic process, and it's not really completely comprehensible. But somewhere between the complex maths, and the thousands of crystallisation trials, the end product seems justified.
It’s a sad aspect of modern science that if you want to puncture the public consciousness on a particular topic, you’ve got to either pretend you’ve recreated life, or throw a press conference to announce wobbly arsenic-based results. It tends to over-shadow scientific endeavor which is important and earnestly presented in a non-egotistical manner. One big leap to understanding the possible origins of life came from Philipp Holliger’s lab last month, and as I’m more of a Nature guy than a Science guy it passed me (and the wider public) by.
RNA is DNA’s feisty side-kick (if you’re really sad like me). RNA is often deemed DNA’s “simpler cousin” but in reality it’s anything but. RNA is truly the handy-man of life; it pops up anywhere and everywhere, in the ribosome, during translation, transcription, and splicing - all pretty important cellular processes which keep you, me and all the rest of known biological life alive. There's a whole host of types of RNA also - tRNAs, mRNAs, snRNAs... it's pretty diverse and a relatively new area of exploration in molecular biology.
What’s also mightily impressive about RNA is that it is capable of being two things at once. RNA can act as an enzyme (a biological catalyst), in which case it is called a “ribozyme”- and it can also act as a repository of information, like DNA in our genomes. It’s this ability for a single macromolecule to straddle dual roles, along with its (when compared to DNA) chemical simplicity which lead to RNA being proposed as a possible precursor to DNA-based life. This is known as the “RNA World” hypothesis – a time about 4-billion old years ago when life as we know it consisted of bits of RNA floating about and interacting with each other.
Underpinning the RNA World theory is the assumption that RNA would need to be self-replicating at one point – allowing natural selection to kick in and allowing the eventual boot up of Life 1.0. An important step in the RNA World theory would be the production of a polymerase. These are enzymes which act like replicators, accurately copying other RNAs, allowing them to multiply and flourish (and evolve).
What Phillip Holliger’s lab produced was a RNA molecule called tc19Z capable of reproducing other RNAs up to 95 bases long with an accuracy of about 1 in 250 bases copied being incorrect. What’s even more impressive is that tc19Z is capable of replicating functional ribozymes. One such ribozyme is the hammerhead ribozyme, a quasi-suicidal self-destroying RNA molecule.
At 198 bases, tc19Z isn't far from replicating itself. Here's hoping that discovery punctures the public's consciousness a little more deeply.