(Part 1 of 3)

Genomics of cellulosic biofuels

Edward M. Rubin1,2

The development of alternatives to fossil fuels as an energy source is an urgent global priority. Cellulosic biomass has the potential to contribute to meeting the demand for liquid fuel, but land-use requirements and process inefficiencies represent hurdles for large-scale deployment of biomass-to-biofuel technologies. Genomic information gathered from across the biosphere, including potential energy crops and microorganisms able to break down biomass, will be vital for improving the prospects of significant cellulosic biofuel production.

The capture of solar energy through photosynthesis is a process that enables the storage of energy in the form of cell wall polymers (that is, cellulose, hemicellulose and lignin). The energy stored in these polymers can be accessed in a variety of ways, ranging from simple burning to complex bioconversion processes. The high energy content and portability of biologically derived fuels, and their significant compatibility with existing petroleum-based transportation infrastructure, help to explain their attractiveness as a fuel source. Despite the increasing use of biofuels such as biodiesel and sugar- or starch-based ethanol, evidence suggests that transportation fuels based on lignocellulosic biomass represent the most scalable alternative fuel source1. Lignocellulosic biomass in the form of plant materials (for example, grasses, wood and crop residues) offers the possibility of a renewable, geographically distributed and relatively greenhouse-gas-favourable source of sugars that can be converted to ethanol and other liquid fuels. Calculations of the productivity of lignocellulosic feedstocks, in part based on their ability to grow on marginal agricultural land, indicate that they can probably have a large impact on transportation needs without significantly compromising the land needed for food crop production2.

Lignocellulosic biofuel production involves collection of biomass, deconstruction of cell wall polymers into component sugars (pretreatment and saccharification), and conversion of the sugars to biofuels (fermentation) (Fig. 1). Partially because of the historically low demand for biologically based transportation fuels, each step in this process is in the early stages of optimization for efficiency and throughput. The crops from which biomass is currently derived have not been domesticated for this particular purpose and the present methods for saccharification and fermentation are inefficient and expensive. However, the recent and pressing desire to develop alternatives to fossil fuels has made the rapid improvement of biofuel production a high priority, in which biologically derived energy (‘bioenergy’)-relevant genomic insights and resources will have an important role (Table 1).


From the perspective of transportation fuels, plants can be viewed as solar energy collectors and thermochemical energy storage systems. It is the storage of energy in a form that can later be accessed via thermochemical or enzymatic conversion that distinguishes biomass from other renewable energy sources. Cellulosic biomass, sometimes referred to as lignocellulosic biomass, is an abundant renewable resource that can be used for the production of alternative transportation fuels3. The three main components of lignocellulose are cellulose, hemicellulose and lignin (Fig. 2), with the relative proportions of the three dependent on the material source4. Cellulose, the main structural component of plant cell walls, is a long chain of glucose molecules, linked to one another primarily by glycosidic bonds5. Hemicellulose, the second most abundant constituent of lignocellulosic biomass, is not a chemically well defined compound but rather a family of polysaccharides, composed of different 5- and 6-carbon monosaccharide units, that links cellulose fibres into microfibrils and cross-links with lignin, creating a complex network of bonds that provide structural strength5. Finally, lignin, a three-dimensional polymer of phenylpropanoid units, can be considered as the cellular glue providing the plant tissue and the individual fibres with compressive strength and the cell wall with stiffness6, in addition to providing resistance to insects and pathogens.

1DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, California 94598, USA. 2Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, California 94720, USA.

Figure 1 Biology of bioconversion of solar energy into biofuels. Solar energy is collected by plants via photosynthesis and stored as lignocellulose. Decomposition of the cellulosic material into simple 5- and 6-carbon sugars is achieved by physical and chemical pretreatment, followed by exposure to enzymes from biomass-degrading organisms. The simple sugars can be subsequently converted into fuels by microorganisms. (Figure labels: solar energy; feedstock; physical pre-treatment, chemicals and enzymes; sugars; fuel-producing microorganisms.)

©2008 Macmillan Publishers Limited. All rights reserved

As we can retrospectively view the features that made certain wild plants desirable for domestication thousands of years ago to become today’s food crops, we are now prospectively defining criteria to choose plants with potential to serve as dedicated bioenergy crops in the future. These include cell wall composition, growth rate, suitability for growth in different geographical regions, and resource-use efficiencies. With these features in mind, a list of potential bioenergy crops is being developed and targeted for different growing conditions7. Most plants assimilate their CO2 first into a C3 compound, whereas a smaller subset use a C4 compound. Plants using C4 photosynthesis tend to be among the most productive, having higher maximum efficiencies of light, nitrogen and water use in assimilating carbon. The C4 group of potential energy crops includes various perennial grasses such as switchgrass and Miscanthus. These grasses have the advantages of not requiring replanting after a yearly harvest, rapid growth, high biomass density per unit area, and low nutrient and water needs, enabling growth on marginal agricultural land.

Disadvantages are that C4 plants are rare in cold climates and unable to grow at temperatures below 10 °C. In these environments, trees, which exclusively depend on C3 photosynthesis, provide the only candidate species. The C3 group of potential energy crops includes trees, such as poplar and eucalyptus, which have relatively rapid growth potential in difficult-to-plough environments. It is highly likely that multiple different energy feedstocks will be deployed depending on latitude, geography, water availability and landowner acceptance.

Until recently, minimal effort has been directed towards optimizing potential energy crops for the generation of transportation fuels. This is in stark contrast to the agronomic development of food crops, which have been domesticated for thousands of years to maximize productivity. Teosinte, the wild precursor to modern maize, was first recognized by Native Americans more than 5,000 years ago as a potential food crop. The domestication of teosinte resulted in its conversion from a wild plant, the characteristics of which had been orchestrated by natural selection maximizing survival and reproduction, into a plant whose morphology and physiology had been extensively altered by artificial selection to increase its nutritional yield and ease of harvest8. More recently, selective breeding as well as agronomic advances have resulted in improvement over several orders of magnitude in the nutritional value per acre of modern maize compared to that of teosinte. Some of the most rapid increases have occurred in the past 40 years, both from advances in agronomic practices and, importantly, from the application of modern genetics. The optimization of bioenergy crops as feedstocks for transportation fuels is in its infancy, but already genomic information and resources are being developed that will be essential for accelerating their domestication. Many of the traits targeted for optimization in potential cellulosic energy crops are those that would improve growth on poor agricultural lands, to minimize competition with food crops over land use.

Table 1 Bioenergy genomes

Organism | Genome size (megabases) | Status | Reference

Feedstocks and feedstock models
Populus trichocarpa (poplar) | 480 | Complete | Ref. 9
Chlamydomonas reinhardtii | 120 | Complete | Ref. 34
Glycine max (soya bean) | 1,200 | Draft | –
Manihot esculenta (cassava) | 770 | In progress | –
Sorghum bicolor | 760 | In progress | –
Eucalyptus globulus | 600 | In progress | –
Brachypodium distachyon | 355 | In progress | –
Zea mays (maize) | 2,500 | In progress | –
Elaeis guineensis (oil palm) | ~3,400 | In progress | http://www.checkbiotech.org/green_News_Biofuels.aspx?infoId=15100
Panicum virgatum (switchgrass) | ~5,600 | In progress | –
Setaria italica (foxtail millet) | ~515 | In progress | –

Biomass degraders
Acidothermus cellulolyticus 11B | 2.4 | Complete | –
Bacillus pumilus SAFR-032 | 3.7 | Complete | Ref. 35
Caldicellulosiruptor saccharolyticus DSM 8903 | 3.0 | Complete | –
Clostridium phytofermentans ISDg | 4.8 | Complete | –
Clostridium thermocellum ATCC 27405 | 3.8 | Complete | –
Cytophaga hutchinsonii ATCC 33406 | 4.4 | Complete | –
Flavobacterium johnsoniae UW101 | 6.1 | Complete | –
Rubrobacter xylanophilus DSM 9941 | 3.2 | Complete | –
Saccharophagus degradans | 5.1 | Complete | Ref. 36
Thermobifida fusca strain YX | 3.6 | Complete | Ref. 37
Clostridium cellulolyticum H10 | 4.0 | Draft | –
Elusimicrobium minutum Pei191 | 1.6 | Draft | –
Nectria haematococca/Fusarium solani | 51 | Draft | –
Phanerochaete chrysosporium | 35.1 | Draft | –
Postia placenta | 3 | Draft | –
Sagittula stellata E-37 | 5.3 | Draft | –
Trichoderma reesei/Hypocrea jecorina | 3 | Draft | –
Cellulomonas flavigena DSM 20109 | ~4.0 | In progress | –
Cellvibrio japonicus Ueda107 | ~6.0 | In progress | –
Fibrobacter succinogenes subsp. succinogenes S85 | ~3.8 | In progress | –
Ruminococcus albus | 4.0 | In progress | –
Teredinibacter turnerae T7902 | ~2 | In progress | –
Termite hindgut community | NA | Complete | Ref. 23
Poplar biomass degrading community | NA | In progress | http://www.jgi.doe.gov/sequencing/lspssseqplans2007.html
Asian longhorned beetle (Anoplophora glabripennis) gut community | NA | In progress | http://www.jgi.doe.gov/sequencing/DOEmicrobes2007.html
Bovine rumen community transcriptome | NA | In progress | http://www.energybiosciencesinstitute.org/index.php?option=com_content&task=view&id=159&Itemid=20

Fuel producers
Clostridium acetobutylicum ATCC 824 | 4.0 | Complete | Ref. 38
Clostridium beijerinckii NCIMB 8052 | 6.0 | Complete | –
Pichia stipitis | 15.4 | Complete | Ref. 27
Thermoanaerobacter tengcongensis MB4 | 2.7 | Complete | Ref. 39
Zymomonas mobilis subsp. mobilis ZM4 | 2.1 | Complete | Ref. 40
Bacillus coagulans 36D1 | 2.9 | Draft | –
Thermoanaerobacter pseudethanolicus 39E | 2.4 | Draft | –
Clostridium ljungdahlii | ~4.0 | In progress | –

Bioenergy-relevant organisms for which large-scale genome projects have been completed or are under way are listed. Information on genome projects without references can be found at http://www.ncbi.nlm.nih.gov/sites/entrez?db=genomeprj.

Populus trichocarpa (poplar), the first tree and potential bioenergy crop to have its genome sequenced (Table 1)9, illustrates some of the issues and potential of applying genomics to the challenge of optimizing energy crops. The traits for which the genetic underpinnings will be sought in the genomes of bioenergy-relevant plants, such as poplar, include those affecting growth rates, response to competition for light, branching habit, stem thickness and cell wall chemistry. Significant effort will go into maximizing biomass yield per unit land area, because this more than any other factor will minimize the impact on overall land use. One can imagine trees optimized to have short stature to increase light access and enable dense growth, large stem diameter, and reduced branch count to maximize energy density for transport and processing. Trees have evolved with highly rigid and stable cell walls due to heavy selective pressure for long life and an upright habit. Plants domesticated for energy production, with a crop cycle time of only a few years, would have less need for a rigid cell wall than wild plants with lifetimes of a hundred years or more. Alterations in the ratios and structures of the various macromolecules forming the cell wall are a major target in energy crop domestication to facilitate post-harvest deconstruction at the cost of a less rigid plant.

Already, by comparing several of the presently available plant genomes (poplar9, rice10,11, Arabidopsis12; see Table 1) coupled with large-scale plant gene function and expression studies, a number of candidate genes for domestication traits have been identified13,14. These include many genes involved in cellulose and hemicellulose synthesis as well as those believed to influence various morphological growth characteristics such as height, branch number and stem thickness15. In addition to homology-based strategies, other genome-enabled strategies for identifying domestication candidate genes are being used. These include quantitative trait analysis of natural variation and genome-wide mutagenesis coupled with phenotypic screens for traits such as recalcitrance to sugar release, acid digestibility and general cell wall composition. The availability of high-throughput transgenesis in several plant systems16 will facilitate functional studies to determine the in vivo activities of the large number of domestication candidate genes. Using these strategies, genes affecting plant height, stem elongation and trunk radial growth, drought tolerance, and cell wall stability are but a few of those likely to be identified as targets for domestication in a fraction of the time required to carry out similar studies unaided by the plant genomes and genomic approaches17.

Figure 2 Structure of lignocellulose. The main component of lignocellulose is cellulose, a β(1–4)-linked chain of glucose molecules. Hydrogen bonds between different layers of the polysaccharides contribute to the resistance of crystalline cellulose to degradation. Hemicellulose, the second most abundant component of lignocellulose, is composed of various 5- and 6-carbon sugars such as arabinose, galactose, glucose, mannose and xylose. Lignin is composed of three major phenolic components, namely p-coumaryl alcohol (H), coniferyl alcohol (G) and sinapyl alcohol (S). Lignin is synthesized by polymerization of these components and their ratio within the polymer varies between different plants, wood tissues and cell wall layers. Cellulose, hemicellulose and lignin form structures called microfibrils, which are organized into macrofibrils that mediate structural stability in the plant cell wall. (Figure labels: plant; plant cell; cell wall; hemicellulose; pentose; crystalline cellulose; hydrogen bond; cellodextrin; macrofibril, 10–20 nm; p-coumaryl alcohol; coniferyl alcohol; sinapyl alcohol.)
