Chemical Aspects of Synthetic Biology Luisi 2007


Chemical Aspects of Synthetic Biology by Pier Luigi Luisi

Synthetic biology, as a broad and novel field, also has a chemical branch: whereas synthetic biology generally has to do with the bioengineering of new forms of life (generally bacteria) that do not exist in nature, "chemical synthetic biology" is concerned with the synthesis of chemical structures, such as proteins, nucleic acids, vesicular forms, and others, that do not exist in nature.

Three examples of this "chemical synthetic biology" approach are given in this article. The first example deals with the synthesis of proteins that do not exist in nature, dubbed "the never born proteins" (NBPs). This research addresses the question of why and how the protein structures existing in our world were selected, with the underlying question of whether they have something very particular from the structural or thermodynamic point of view (for example, in their folding). The NBPs are produced in the laboratory by a modern molecular biology technique, phage display, so as to produce a very large library of proteins having no homology with known proteins.

The second example of chemical synthetic biology also has to do with the laboratory synthesis of proteins, but this time adopting a prebiotic synthetic procedure: the fragment condensation of short peptides, where "short" means a length that can be obtained by prebiotic methods, for example, from the condensation of N-carboxy anhydrides. The scheme, which is illustrated and discussed, is based on fragment condensation catalyzed by peptides endowed with proteolytic activity. Selection during chain growth is determined by solubility under the contingent environmental conditions, i.e., the peptides that turn out to be insoluble are eliminated from further growth. The scheme is tested preliminarily with a synthetic chemical fragment-condensation method and leads to the synthesis of a 4-residues-long protein, which has no homology with known proteins and has a stable tertiary folding.

Finally, the third example is dubbed "the minimal cell project". Here, the aim is to synthesize a cell model having the minimal and sufficient number of components to be defined as living. For this purpose, liposomes are used as shell membranes, and attempts are made to introduce a minimal genome into their interior. Several groups around the world are active in this field, and significant results have been obtained, which are reviewed in this article. For example, protein expression has been obtained inside liposomes, generally with the green fluorescent protein (GFP). Our latest attempts are with a minimal genome consisting of 37 enzymes, a set which is able to express proteins using the ribosomal machinery. These minimal cells are not yet capable of self-reproduction, and this and other shortcomings within the project are critically reviewed.

Introduction. – I am a chemist who left his original avenues of polymer chemistry to move towards biochemistry and biology. Reflecting on the research field I am now pursuing, I can, however, see that somehow the "genes" of chemistry have remained, and they have actually helped me to move at the interface between biology and chemistry. Thus, the projects that I will describe below can be categorized within the broad and novel field of synthetic biology, but they all have a strong character of chemistry.

© 2007 Verlag Helvetica Chimica Acta AG, Zürich

In fact, this novel and fashionable term "synthetic biology" is now used to indicate a field where, generally, existing life forms are modified and their genomic content redirected towards novel, non-existing life forms; for example, bacterial life that does not exist on Earth. Examples of these techniques can be found in recent issues of Nature [1] and Science [2]. This is all based on the hard hand of the bio-engineering approach that stems from classic DNA molecular biology. The chemical approach to synthetic biology I am talking about is one that, instead of tampering with living life forms and creating some more or less fortunate imitations, aims more simply at the synthesis of molecular structures and/or multi-molecular organized systems that do not exist in nature. These man-made biological molecular or supramolecular structures, which do not exist in nature, can be obtained either by chemical or biochemical synthesis, possibly with the help of mechanical manipulations thereof.

I would put in this category of chemical synthetic biology the well-known work of Albert Eschenmoser on nucleic acids containing pyranose instead of ribose [3], structures that have been synthesized in the laboratory and that do not exist in Nature. The question is possibly why Nature did not make them, and much can be learned from the very asking of this question. The chemical modifications of nucleic acid bases pursued by Steve Benner [4] also belong to this class of studies.

There are also examples in the field of proteins: the synthesis of proteins containing a reduced alphabet – only 3, 5, 7, or 9 amino acids – already described in the literature [5][6] belongs, in my opinion, to this field of "chemical synthetic biology". The approach pioneered by Craig Venter and co-workers, aimed at synthesizing an entire genome by chemical methods [7], can also be considered one of the examples of chemical synthetic biology.

In the following, I would like to present three projects carried out in my laboratory that can also be considered as belonging to this chemical frontier of synthetic biology. One project is carried out under the name of "the never born proteins", meaning proteins that have not been produced and/or selected by nature in the course of biological evolution. This synthetic procedure also produces the corresponding "never born mRNA".

The second project deals with the synthesis of specific macromolecular sequences by fragment condensation under simulated environmental pressure which corresponds to molecular evolution.

The third is the "minimal cells" project, meaning semi-synthetic cells that do not exist in nature, which may represent the simplest form of cellular life.

These three projects are at different degrees of progress in my laboratory, and I will describe their present stage and outlook.

1. The "Never Born Proteins" (NBPs). – The starting point is the numerology of proteins, in particular the well-known consideration that the proteins existing in nature make up only an infinitesimal fraction of the theoretically possible structures. This paradox has been emphasized by various authors, including Christian de Duve in his latest book [8]. There are many ways to express this. For example, one can say that the ratio between the number of theoretically possible proteins having a chain of 100 residues and the actual number of all existing proteins (probably something around 10^14) comes close to the ratio between the space of the universe and the space of a single H-atom; or, using a more earthly example, close to the ratio between all the sand of the Sahara and one single grain of sand [9].
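The order of magnitude behind this comparison is easy to check. The following is a minimal sketch, assuming an alphabet of the 20 standard amino acids and taking the estimate of about 10^14 existing proteins from the text; the function names are illustrative only.

```python
import math


def log10_sequence_space(chain_length: int, alphabet_size: int = 20) -> float:
    """Return log10 of the number of possible chains of the given length."""
    return chain_length * math.log10(alphabet_size)


def log10_ratio_to_existing(chain_length: int, log10_existing: float = 14.0) -> float:
    """Return log10 of (possible sequences / existing proteins).

    log10_existing = 14 reflects the rough estimate of 10^14 quoted in the text.
    """
    return log10_sequence_space(chain_length) - log10_existing


if __name__ == "__main__":
    # Work in log10 throughout, since 20**100 is far too large to reason about directly.
    print(f"possible 100-residue chains: about 10^{log10_sequence_space(100):.0f}")
    print(f"ratio to existing proteins:  about 10^{log10_ratio_to_existing(100):.0f}")
```

Under these assumptions, there are about 10^130 possible chains of 100 residues, which exceeds the estimated number of existing proteins by roughly 116 orders of magnitude.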

These astronomic figures may appear deprived of practical physical meaning. However, they convey a very simple, well graspable concept, i.e., that our life is based on a very limited number of structures; and this, in turn, elicits a very relevant question: how and why have these few structures been selected out?

1.1. The Different Viewpoints: Determinism vs. Contingency. There are different answers one can tentatively give to this last question.

One first possible answer is that "our" proteins have something very special that made the selection possible. For example, they might be the only ones to be stable; or water soluble; or those which have very particular viscosity and/or rheological properties. In all these cases, they would have been selected because of their particular physical properties.

A second point of view is that our proteins have no extraordinary physical properties at all; they have been selected by chance among an enormous number of possibilities of quite similar compounds. They came out by "chance", and it happened that they were capable of fostering cellular life. The term "chance" is nowadays substituted by the more elegant term "contingency". Cast the dice again, and the probability that exactly our 10^14 or so proteins come out again is, for all practical purposes, zero, so that life as it is now might not have started. One can conceive of some different forms of life thriving on quite different proteins, but this remains to be established.

Of course, contingency never works alone; it is always accompanied by some deterministic laws – certainly by thermodynamics and energy minimization processes – but, according to the contingency view, "our" life would basically be a serendipitous property of these casually determined structures.

The deterministic view mentioned above may, instead, assume some alternative extreme positions, up to the point of saying that life is an inescapable outcome of the laws of nature, and that, therefore, all prerequisites for making life, including the basic macromolecular structures, are determined.

In this sense, an author who should be particularly kept in mind is Christian de Duve, who in his book [8] stated:

"It is self-evident that the universe was pregnant with life and the biosphere with man. Otherwise, we would not be here. Or else, our presence can be explained only by a miracle..."

This is basically the view that the origin of life was an obligatory, inevitable process, and if one literally takes this view, then one has to conclude that the proteins must have been chosen in the right way so as to make life possible. One cannot, in fact, assume the inevitability of life and then let contingency shape the structure of proteins as chance structures.

To me, as I expressed in my recent book [9], the view that life is inescapable implies a form of intelligent designer, and this, as the Anglican priest Paley said hundreds of years ago [10], cannot be other than God. In fact, I dubbed the authors who adhere to this view, including those of the anthropic principle [11][12], "crypto-creationist" [9] (not to be confused, however, with American creationism, which is simply a form of fundamentalism). The view of contingency in evolution and life is advocated, among many others, most notably by Stephen J. Gould [13] and J. Monod [14].

1.2. The Experimental Project. The basic idea of the project is to test whether "our" proteins really have something particular with respect to the proteins that have never existed. How can one conduct this project? Simply by synthesizing proteins that do not exist in nature, and comparing them with "our" proteins. It is a project of chemical synthetic biology, as outlined in the Introduction, aimed at producing a quite different "grain of sand", which should be the product of random choice, and then asking whether "our" proteins are really so different and peculiar with respect to those synthetic biology products in terms of stability, solubility, or folding. Actually, folding is a particularly important and stringent criterion, as the prerequisite for the biological activity of proteins is their globular folding, which is a consequence of the primary structure.

Such a project was initiated at the Federal Institute of Technology in Zürich, Switzerland, and is being pursued by my group, since transferred to the University of Roma Tre, Italy, in particular by Cristiano Chiarabelli and Davide de Lucrezia. The first set of papers describing these results on the "never born proteins" (NBPs) has recently been published [15a–d].

The principle for producing NBPs is simple: if one makes a long string of DNA (say 150 bases) purely at random, the probability of hitting a sequence existing on Earth is practically zero (it corresponds to a number equivalent to the ratio between one grain of sand and the entire Sahara). If you then let this DNA be processed by standard recombinant DNA and in vivo expression techniques, you will obtain a 50-residue-long polypeptide that does not exist on Earth, and when this polypeptide is globularly folded, you have already obtained an NBP.
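The relation between a 150-base random DNA string and a 50-residue polypeptide can be illustrated in silico. The following is a minimal sketch, not the laboratory procedure: it assumes uniformly random bases and the standard genetic code, and the function names are illustrative only.

```python
import random

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def random_dna(length: int, rng: random.Random) -> str:
    """Return a DNA string of the given length with uniformly random bases."""
    return "".join(rng.choice(BASES) for _ in range(length))


def translate(dna: str) -> str:
    """Translate DNA codon by codon from position 0; trailing bases are ignored.

    Stop codons are kept as '*' instead of ending translation, so the output
    always has one symbol per complete codon.
    """
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    dna = random_dna(150, rng)
    peptide = translate(dna)
    print(dna)
    print(peptide, len(peptide))
```

In this naive sketch, a random sequence may contain stop codons (shown as `*`), which in a real expression system would truncate the product.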

In practice, what we do is work with a large library of DNA by the so-called phage-display method. We first obtain from commercial sources a library of totally random DNA sequences of the desired length (150 base pairs in our case). The random DNA segment is then inserted into a phage genome so that the corresponding random protein is linked to a capsid protein. The production of the phage library actually requires the infection of cells that provide the machinery for the synthesis of viral proteins. Those proteins will be displayed on the capsid of the phage (one per phage), and they are (in the N-terminal portion) totally random, de novo proteins. In our case, the sequence of the NBPs is not completely random, since a tripeptide sequence has been inserted in the middle of the random sequence for the purpose of selection (vide infra).
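The layout of one library member, random codons flanking three fixed ones, can be sketched as follows. This is an illustration only, not the authors' construct: the exact position of the fixed codons, the specific codons chosen, and the function names are assumptions. The text later identifies the inserted tripeptide as Pro-Arg-Gly (PRG).

```python
import random

BASES = "ACGT"
# Hypothetical codon choice encoding Pro-Arg-Gly (CCG, CGT, GGT);
# synonymous codons would serve equally well.
FIXED_CODONS = "CCGCGTGGT"


def random_bases(n: int, rng: random.Random) -> str:
    """Return n uniformly random DNA bases."""
    return "".join(rng.choice(BASES) for _ in range(n))


def make_library_member(rng: random.Random, total_length: int = 150,
                        fixed: str = FIXED_CODONS) -> str:
    """Return a random DNA insert with `fixed` placed in frame near the middle."""
    n_random = total_length - len(fixed)
    # Round the left flank down to a multiple of 3 so the fixed codons stay in frame.
    left = (n_random // 2) // 3 * 3
    right = n_random - left
    return random_bases(left, rng) + fixed + random_bases(right, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    member = make_library_member(rng)
    print(member)
    print(len(member), member.find(FIXED_CODONS))
```

With the default values, the fixed codons occupy bases 70 to 78 of the 150-base insert (codons 24 to 26 of 50), leaving 47 random codons around them.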

This is the basis of the work carried out by us [15]. In this way, in a first run, a library of ca. 10^9 50-residue-long polypeptides was obtained. The first questions at this point were: i) are they really all "never born proteins", i.e., more specifically, are they really absent from the protein data bank collected until now? and ii) what will be the fraction of folded polypeptides, i.e., of globular proteins?

It is also clear that, for practical reasons, one cannot study all 10^9 clones; one can only refer to a selected sample of them, chosen, however, without any preconceived bias, so that the statistical relevance of their properties still holds.

Let us begin with the question about being "never born". The 79 sequences which were selected at random were compared with known protein sequences, and no similarity was found, although a permissive criterion was adopted for the comparative analysis [15].

In conclusion, then, the proteins so synthesized can indeed be considered as non-extant, which permits the terminology of "never born proteins" (NBPs). Of course, it is possible that some of these sequences were proposed in the course of molecular evolution and then lost; or that some of them are present in some unexplored plants or micro-organisms of our Earth. But, to a first and good approximation, they are not present in any living form we know.

The other question (about folding) has been tackled based on the well-accepted observation that folded proteins are not easily digested by proteases. The strategy involved the insertion of the tripeptide PRG (proline-arginine-glycine), a substrate for the proteolytic enzyme thrombin, in the otherwise totally random protein sequence (i.e., the DNA library was designed to have three non-random codons in the middle of the sequence). In this way, each of the new proteins had the potential to be digested by the enzyme, with the expectation, however, that globularly folded NBPs would be protected from digestion. With this idea in mind, the 79 randomly selected clones were incubated in a medium in the presence of thrombin. The larger part of the population was rapidly hydrolyzed, but ca. 20% of the population was highly resistant to the action of thrombin.
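Because only 79 clones were examined, the figure of about 20% carries a sampling uncertainty. The sketch below estimates it with a Wilson score interval, a standard method for a binomial proportion. The count of 16 resistant clones is an assumption (roughly 20% of 79), not a number reported in the text, and the function name is illustrative.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple:
    """Return the Wilson score interval (low, high) for a binomial proportion.

    z = 1.96 corresponds to an approximately 95% confidence level.
    """
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # ASSUMPTION: 16 of 79 clones resistant, i.e. "ca. 20%" as stated in the text.
    low, high = wilson_interval(16, 79)
    print(f"resistant fraction: {16 / 79:.1%} (95% interval {low:.1%} to {high:.1%})")
```

With this assumed count, the interval runs from roughly 13% to 30%, so a sample of this size fixes the folded fraction only to within a factor of about two.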
