
The genome sequence of the plant pathogen Xylella fastidiosa

The Xylella fastidiosa Consortium of the Organization for Nucleotide Sequencing and Analysis*, São Paulo, Brazil

* A full list of authors appears at the end of this paper

Xylella fastidiosa is a fastidious, xylem-limited bacterium that causes a range of economically important plant diseases. Here we report the complete genome sequence of X. fastidiosa clone 9a5c, which causes citrus variegated chlorosis—a serious disease of orange trees. The genome comprises a 52.7% GC-rich 2,679,305-base-pair (bp) circular chromosome and two plasmids of 51,158 bp and 1,285 bp. We can assign putative functions to 47% of the 2,904 predicted coding regions. Efficient metabolic functions are predicted, with sugars as the principal energy and carbon source, supporting existence in the nutrient-poor xylem sap. The mechanisms associated with pathogenicity and virulence involve toxins, antibiotics and ion sequestration systems, as well as bacterium–bacterium and bacterium–host interactions mediated by a range of proteins. Orthologues of some of these proteins have only been identified in animal and human pathogens; their presence in X. fastidiosa indicates that the molecular basis for bacterial pathogenicity is both conserved and independent of host. At least 83 genes are bacteriophage-derived and include virulence-associated genes from other bacteria, providing direct evidence of phage-mediated horizontal gene transfer.

Citrus variegated chlorosis (CVC), which was first recorded in Brazil in 1987, affects all commercial sweet orange varieties1. Symptoms include conspicuous variegations on older leaves, with chlorotic areas on the upper side and corresponding light brown lesions, with gum-like material, on the lower side. Affected fruits are small, hardened and of no commercial value. A strain of Xylella fastidiosa was first identified as the causal bacterium in 1993 (ref. 2) and found to be transmitted by sharpshooter leafhoppers in 1996 (ref. 3). CVC control is at present limited to removing infected shoots by pruning, the application of insecticides and the use of healthy plants for new orchards. In addition to CVC, other strains of X. fastidiosa cause a range of economically important plant diseases including Pierce’s disease of grapevine, alfalfa dwarf, phony peach disease, periwinkle wilt and leaf scorch of plum, and are also associated with diseases in mulberry, pear, almond, elm, sycamore, oak, maple, pecan and coffee4. The triply cloned X. fastidiosa 9a5c, sequenced here, was derived from the pathogenic culture 8.1b obtained in 1992 in Bordeaux (France) from CVC-affected Valencia sweet orange twigs collected in Macaubal (São Paulo, Brazil) on May 21, 1992 (ref. 2). Strain 9a5c produces typical CVC symptoms on inoculation into experimental citrus plants5, and into Nicotiana tabacum (S. A. Lopes, personal communication) and Catharanthus roseus (P. Brant-Monteiro, personal communication), two novel experimental hosts.

General features of the genome

The basic features of the genome are listed in Table 1 and a detailed map is shown in Fig. 1. The conserved origin of replication of the large chromosome has been identified in a region between the putative 50S ribosomal protein L34 and gyrB genes containing dnaA, dnaN and recF6. The Escherichia coli DnaA box consensus sequence TTATCCACA is found on both DNA strands close to dnaA. In addition, there are typical 13-nucleotide (ACCACCACCACCA) and 9-nucleotide (two TTTCATTGG and two TTTTATATT) sequences in other intergenic sequences of this region. This region is coincident with the calculated GC-skew signal inversion7. We have designated base 1 of the X. fastidiosa genome as the first T of the only TTTTAT sequence found between the ribosomal protein L34 gene and dnaA.
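The two lines of evidence used above to place the origin (DnaA boxes on both strands and a GC-skew inversion) can be illustrated with a minimal Python sketch. It is not the consortium's pipeline: the function names, the window size and the toy sequences are assumptions made for illustration, and only the motif TTATCCACA comes from the text. The sketch assumes that the inversion is located at the minimum of the cumulative skew, which is a common convention.

```python
from itertools import accumulate

DNAA_BOX = "TTATCCACA"  # E. coli DnaA box consensus quoted in the text


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_both_strands(seq, motif):
    """List (position, strand) for every exact motif match.

    Positions are 0-based on the forward strand; '-' means the
    reverse complement of the motif was found at that position.
    """
    patterns = [(motif, "+")]
    rc = reverse_complement(motif)
    if rc != motif:  # avoid double counting palindromic motifs
        patterns.append((rc, "-"))
    hits = []
    for pattern, strand in patterns:
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


def cumulative_gc_skew(seq, window):
    """Cumulative (G - C) / (G + C) over non-overlapping windows."""
    skews = []
    for i in range(0, len(seq) - window + 1, window):
        chunk = seq[i:i + window]
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return list(accumulate(skews))


def skew_inversion_coordinate(seq, window):
    """Coordinate just after the window where cumulative skew is lowest."""
    cum = cumulative_gc_skew(seq, window)
    lowest = min(range(len(cum)), key=cum.__getitem__)
    return (lowest + 1) * window


# Toy data (hypothetical, not X. fastidiosa sequence)
toy_motifs = "GG" + "TTATCCACA" + "CC" + "TGTGGATAA" + "AA"
toy_skew = "ACCC" * 5 + "AGGG" * 5
print(find_both_strands(toy_motifs, DNAA_BOX))
print(skew_inversion_coordinate(toy_skew, window=4))
```

On a real chromosome the window would be thousands of bases wide, and the sequence would be read from a FASTA file rather than written inline.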

The overall percentage of open reading frames (ORFs) for which a putative biological function could be assigned (47%) was slightly below that for other sequenced genomes such as Thermotoga maritima8 (54%), Deinococcus radiodurans9 (52.5%) and Neisseria meningitidis10 (53.7%). This may reflect the lack of previous complete genome sequences from phytopathogenic bacteria. Plasmid pXF1.3 contains only two ORFs, one of which encodes a replication-associated protein. Plasmid pXF51 contains 64 ORFs, of which 5 encode proteins involved in replication or plasmid stability and 20 encode proteins potentially involved in conjugative transfer. One ORF encodes a protein similar to the virulence-associated protein D (VapD), found in many other bacterial pathogens11. Four regions of pXF51 present significant DNA similarity to parts of transposons found in plasmids from other bacteria, suggesting interspecific horizontal exchange of genetic material.

The principal paralogous families are summarized in Table 2. The complete list of ORFs with assigned function is shown in Table 3. Seventy-five proteins present in the 21 completely sequenced genomes in the COG database12 (as of 15th March 2000) were also found in X. fastidiosa. Each of these sequences was used to generate a phylogenetic tree of the 22 organisms. In 69% of such trees, X. fastidiosa was grouped with Haemophilus influenzae and E. coli, consistent with a phylogenetic analysis undertaken with the 16S rRNA gene13.

Table 1 General features of the Xylella fastidiosa 9a5c genome

Main chromosome
  Length (bp)                                              2,679,305
  G+C ratio                                                52.7%
  Open reading frames (ORFs)                               2,782
  Coding region (% of chromosome size)                     8.0%
  Average ORF length (bp)                                  799
  ORFs with functional assignment                          1,283
  ORFs with matches to conserved hypothetical proteins     310
  ORFs without significant database match                  1,083
  Ribosomal RNA operons                                    2 (16SrRNA-Ala-TGC-tRNA-Ile-GAT-tRNA-23SrRNA-5SrRNA)
  tRNAs                                                    49 (46 different sequences corresponding to all 20 amino acids)
  tmRNA                                                    1

Plasmid pXF51
  Length (bp)                                              51,158
  G+C ratio                                                49.6%
  Open reading frames (ORFs)                               64
  Protein coding region (% of plasmid size)                86.9%
  ORFs with functional assignment                          30
  ORFs with matches to conserved hypothetical proteins     8
  ORFs without significant database match                  24

Plasmid pXF1.3
  Length (bp)                                              1,285
  G+C ratio                                                5.6%
  Open reading frames (ORFs)                               2
  ORFs with functional assignment                          1

© 2000 Macmillan Magazines Ltd

One ORF, a cytosine methyltransferase (XF1774), is interrupted by a Group II intron. The intron was identified on the basis of the presence of a reverse transcriptase-like gene (as in other Group II introns), conserved splice sites, conserved sequence in structure V and conserved elements of secondary structure14. Group II introns are rare in prokaryotes, but have been found in different evolutionary lineages including E. coli, cyanobacteria and proteobacteria15.

Transcription, translation and repair

The basic transcriptional and translational machinery of X. fastidiosa is similar to that of E. coli16. Recombinational repair, nucleotide and base-excision repair, and transcription-coupled repair are present with some noteworthy features. For example, no photolyase was found, indicating exclusively dark repair. Although the main genes of the SOS pathway, recA and lexA, are present, ORFs corresponding to the three DNA polymerases induced by SOS in E. coli (DNA polymerases II, IV and V)17 are missing, indicating that the mutational pathway itself may be distinct.

Energy metabolism

Even though X. fastidiosa is, as its name suggests, a fastidious organism, energy production is apparently efficient. In addition to all the genes for the glycolytic pathway, all genes for the tricarboxylic acid cycle and oxidative and electron transport chains are present. ATP synthesis is driven by the resulting chemiosmotic proton gradient and occurs by an F-type ATP synthase. Fructose, mannose and glycerol can be utilized in addition to glucose in the glycolytic pathway. There is a complete pathway for hydrolysis of cellulose to glucose, consisting of 1,4-β-cellobiosidase, endo-1,4-β-glucanase and β-glucosidase, suggesting that cellulose breakdown may supplement the often low concentrations of monosaccharides in the xylem18. Two lipases are encoded in the genome, but there is no β-oxidation pathway for the hydrolysis of fatty acids, presumably precluding their utilization as an alternative carbon and energy source. Likewise, although enzymes required for the breakdown of threonine, serine, glycine, alanine, aspartate and glutamate are present, pathways for the catabolism of the other naturally occurring amino acids are incomplete or absent.

The gluconeogenesis pathway appears to be incomplete. Phosphoenolpyruvate carboxykinase and the gluconeogenic enzyme fructose-1,6-bisphosphatase, which are required to bypass the irreversible steps in glycolysis, are not present. The absence of the first is compensated by the presence of phosphoenolpyruvate synthase and malate oxidoreductase, which together can generate phosphoenolpyruvate from malate. There appears, however, to be no known compensating pathway for the absence of fructose-1,6-bisphosphatase. It is possible that among the large number of unidentified X. fastidiosa genes there are non-homologous genes that compensate for steps in such critical pathways. Barring this possibility, however, the absence of a functional gluconeogenesis pathway implies a strict dependence on carbohydrates both as a source of energy and anabolic precursors. The glyoxylate cycle is absent and the pentose phosphate pathway is incomplete. In the latter pathway, genes for neither 6-phosphogluconic dehydrogenase nor transaldolase were identified.
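The reasoning in this paragraph amounts to a coverage check: each bypass step is satisfied if at least one complete set of alternative enzymes is encoded in the genome. The following Python sketch makes that logic explicit. The step labels, the data structure and the function name are assumptions made for illustration; the enzyme names and their presence or absence are taken from the text, and this is not a reconstruction of the annotation method actually used.

```python
# Each bypass step lists alternative enzyme sets; a step counts as covered
# when every enzyme of at least one alternative is present in the genome.
# Step labels are illustrative, enzyme names follow the text.
BYPASS_STEPS = {
    "formation of phosphoenolpyruvate": [
        {"phosphoenolpyruvate carboxykinase"},
        {"phosphoenolpyruvate synthase", "malate oxidoreductase"},
    ],
    "fructose-1,6-bisphosphate dephosphorylation": [
        {"fructose-1,6-bisphosphatase"},
    ],
}

# Enzymes reported as present in X. fastidiosa in the paragraph above
PRESENT = {"phosphoenolpyruvate synthase", "malate oxidoreductase"}


def uncovered_steps(steps, present):
    """Return the steps for which no alternative enzyme set is complete."""
    return [
        name
        for name, alternatives in steps.items()
        if not any(alternative <= present for alternative in alternatives)
    ]


print(uncovered_steps(BYPASS_STEPS, PRESENT))
```

With the enzymes named in the text, only the fructose-1,6-bisphosphatase step remains uncovered, which matches the conclusion that no known compensating pathway exists for it.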

Small molecule metabolism

X. fastidiosa exhibits extensive biosynthetic capabilities, presumably an absolute requirement for a xylem-dwelling bacterium. Most of the genes found in E. coli necessary for the synthesis of all amino acids from chorismate, pyruvate, 3-phosphoglycerate, glutamate and oxaloacetic acid16 were identified. However, some genes in X. fastidiosa are bi-functional, such as phosphoribosyl-AMP cyclohydrolase/phosphoribosyl-ATP pyrophosphatase (XF2213), aspartokinase/homoserine dehydrogenase I (XF2225), imidazoleglycerolphosphate dehydratase/histidinol-phosphate phosphatase (XF2217) and a new diaminopimelate decarboxylase/aspartate kinase (XF1116) that would catalyse the first and the last steps of lysine biosynthesis. In addition, the gene for acetylglutamate kinase (XF1001) has an acetyltransferase domain at its carboxy-terminal end that would compensate for the missing acetyltransferase in the arginine biosynthesis pathway. Other missing genes include phosphoserine phosphatase, cystathionine β-lyase, homoserine O-succinyltransferase and 2,4,5-methyltetrahydrofolate-homocysteine methyltransferase. The first two enzymes are also absent in the Bacillus subtilis genome, the third is absent in Haemophilus influenzae and the fourth is missing in both genomes12. We thus presume that alternative, unidentified enzymes complete the biosynthetic pathways in these organisms and in X. fastidiosa.

The pathways for the synthesis of purines, pyrimidines and nucleotides are all complete. X. fastidiosa is also apparently capable of both synthesizing and elongating fatty acids from acetate. Again, however, some E. coli enzymes were not found, such as holo-acyl-carrier-protein synthase (also absent in Synechocystis sp., H. influenzae and Mycoplasma genitalium) and enoyl-ACP reductase (NADPH) (FabI) (also absent from M. genitalium, Borrelia burgdorferi and Treponema pallidum)12.

X. fastidiosa appears to be capable of synthesizing an extensive variety of enzyme cofactors and prosthetic groups, including biotin, folic acid, pantothenate and coenzyme A, ubiquinone, glutathione, thioredoxin, glutaredoxin, riboflavin, FMN, FAD, pyrimidine nucleotides, porphyrin, thiamin, pyridoxal 5′-phosphate and lipoate. In a number of the synthetic pathways, one or more of the enzymes present in E. coli are absent, but this is also true for at least one other sequenced Gram-negative bacterial genome in each case12. We therefore again infer that the missing enzymes are either not essential or replaced by unknown proteins with novel structures.

Transport-related proteins

A total of 140 genes encoding transport-related proteins were identified, representing 4.8% of all ORFs. For comparison, E. coli, B. subtilis and M. genitalium have around 10% of genes encoding transport proteins, whereas Helicobacter pylori, Synechocystis sp. and Methanococcus jannaschii have 3.5–5.4% (ref. 19). Transport systems are central components of the host–pathogen relationship (Fig. 2). There are a number of ion transporters and transporters for the uptake of carbohydrates, amino acids, peptides, nitrate/nitrite, sulphate, phosphate and vitamin B12. Many different transport families are represented and include both small and large mechanosensitive conductance ion channels, a monovalent cation:proton antiporter (CAP-2) and a glycerol facilitator belonging to the major intrinsic protein (MIP) family. In addition, 23 ABC transport systems comprising 41 genes can be identified. X. fastidiosa appears to possess a phosphotransferase system (PTS) that typically mediates small carbohydrate uptake. There are both the enzyme I and HPr components of this system, as well as a gene supposedly involved in its regulation (pstK or hprK); however, there is no PTS permease, an essential component of the phosphotransferase complex. The functionality of the system therefore remains in question.

Table 2 Largest families of paralogous genes
(total number of families = 312; total number of genes = 853)

Family                                          Number of genes
ATP-binding subunits of ABC transporters        23
Reductases/dehydrogenases                       12
Two-component system, regulatory proteins       12
Hypothetical proteins                           10
Transcriptional regulators                      9
Fimbrial proteins                               9
Two-component system, sensor proteins           9

Figure 1 Linear representation of the main chromosome and plasmids pXF51 and pXF1.3 of the Xylella fastidiosa genome. Genes are coloured according to their biological role. Arrows indicate the direction of transcription. Genes with frameshift and point mutations are indicated with an X. Ribosomal RNA genes, the tmRNA, the principal repeats, prophages and the group II intron are indicated by coloured lines. Transfer RNAs are identified by a single letter identifying the amino acid. Pie chart represents the distribution of the number of genes according to biological role. The numbers below protein-producing genes correspond to gene IDs.

152 NATURE VOL 406 13 JULY 2000

There are five outer membrane receptors, including siderophores, ferrichrome-iron and haemin receptors, which are all associated with iron transport. The energizing complexes, TonB–ExbB–ExbD and the paralogous TolA–TolR–TolQ, essential for the functioning of the outer membrane receptors, are also present. In all, 67 genes encode proteins involved in iron metabolism. We propose that in X. fastidiosa the uptake of iron and possibly of other transition metal ions such as manganese causes a reduction in essential micronutrients in the plant xylem, contributing to the typical symptoms of leaf variegation.

The X. fastidiosa genome encodes a battery of proteins that mediate drug inactivation and detoxification, alteration of potential drug targets, prevention of drug entry and active extrusion of drugs and toxins. These include ABC transporters and transport processes driven by a proton gradient. Of the latter, eight belong to the hydrophobe/amphiphile efflux-1 (HAE1) family, which act as multidrug resistance factors.

Adhesion

X. fastidiosa is characteristically observed embedded in an extracellular translucent matrix in planta20. Clumps of bacteria form within the xylem vessels, leading to their blockage and symptoms of the disease such as water-stress leaf curling. We deduce, from our analysis of the complete genome sequence, that the matrix is composed of extracellular polysaccharides (EPSs) synthesized by enzymes closely related to those of Xanthomonas campestris pv campestris (Xcc) that produce what is commercially known as xanthan gum. In comparison with Xcc, however, we did not find gumI (encoding glycosyltransferase V, which incorporates the terminal mannose), gumL (encoding ketalase, which adds pyruvate to the polymer) or gumG (encoding acetyltransferase, which adds acetate), suggesting that Xylella gum may be less viscous than its Xanthomonas counterpart.

Positive regulation of the synthesis of extracellular enzymes and EPS in Xanthomonas is effected by proteins coded by the rpf (regulation of pathogenicity factors) gene cluster21. Mutations in any of these genes in Xanthomonas result in failure to synthesize the EPS. In consequence, the strain becomes non-pathogenic21. X. fastidiosa contains genes that encode RpfA, RpfB, RpfC and RpfF, suggesting that both bacteria may regulate the synthesis of pathogenic EPS factors through similar mechanisms.

Fimbria-like structures are readily apparent upon electron-microscopical observation of X. fastidiosa within both its plant and insect hosts22. Because of the high velocity of xylem sap passing through narrow portions of the insect foregut, fimbria-mediated attachment may be essential for insect colonization. Indeed, in the insect mouthparts the bacteria are attached in ordered arrays, indicating specific and polarized adhesion23. In addition, fimbriae are thought to be involved in both plant–bacterium and bacterium–bacterium interactions during colonization of the xylem itself. We identified 26 genes encoding proteins responsible for the biogenesis and function of Type 4 fimbria filaments. This type of fimbria is found at the poles of a wide range of bacterial pathogens, where it acts to mediate adhesion and translocation along epithelial surfaces24. The genes include pilS and pilR homologues, which encode a two-component system controlling transcription of fimbrial subunits, presumably in response to host cues, and pilG, H, I, J and chpA, which encode a chemotactic system transducing environmental signals to the pilus machinery.

In addition to the EPS and fimbriae, which are likely to have central roles in the clumping of bacteria and in adhesion to the xylem walls, we also identified outer membrane protein homologues for afimbrial adhesins. Although fimbrial adhesins are well characterized as crucial virulence factors in both plant and human pathogens25, afimbrial adhesins, which are directly associated with the bacterial cell surface, have been hitherto associated only with human and animal pathogens, where they promote adherence to epithelial tissue. Of the three putative adhesins of this kind identified in X. fastidiosa, two exhibit significant similarity to each other (XF1981, XF1529) and to the hsf and hia gene products of H. influenzae26. The third (XF1516) is similar to the uspA1 gene product of Moraxella catarrhalis27. All these proteins share the common C-terminal domain of the autotransporter family28. Direct experimentation will be required to establish whether these adhesins promote binding to plant cell structures or components of the insect vector foregut, or both. Nevertheless, their presence in the X. fastidiosa genome adds to the increasing evidence for the generality of mechanisms of bacterial pathogenicity, irrespective of the host organism29.

We also identified three different haemagglutinin-like genes. Again, similar genes have not previously been identified in plant pathogens. These genes (XF2775, XF2196, XF0889) are the largest in the genome and exhibit highest similarity to a Neisseria meningitidis putative secreted protein10.

Intervessel migration

Movement between individual xylem vessels is crucial for effective colonization by X. fastidiosa. For this to occur, degradation of the pit membrane of the xylem vessel is required. Of the known pectolytic enzymes capable of this function, a polygalacturonase precursor and a cellulase were identified, although the former contains an authentic frameshift. These genes exhibited highest similarity to orthologues in Ralstonia solanacearum, which causes wilt disease in tomatoes and in which the polygalacturonase genes are required for wild-type virulence.

Toxicity

We identified five haemolysin-like genes: haemolysin I (XF0175), which belongs to an uncharacterized protein family, and four others (XF0668, XF1011, XF2407, XF2759) which belong to the RTX toxin family that contains tandemly repeated glycine-rich nonapeptide motifs at the C-terminal domain. One of these ORFs is closely related to bacteriocin, an RTX toxin also found in the plant bacterium Rhizobium leguminosarum30. RTX or RTX-like proteins are important virulence factors widely distributed among Gram-negative pathogenic bacteria31.
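As an illustration of what "tandemly repeated glycine-rich nonapeptide motifs" means in practice, the Python sketch below scans a protein sequence for runs of consecutive nine-residue repeats. The regular expression encodes the consensus G-G-X-G-X-D-X-U-X (U being a large hydrophobic residue) that is commonly cited for RTX repeats; that consensus, the hydrophobic residue set, the function name and the toy sequence are assumptions not given in the text, and real annotation would normally rely on profile-based searches rather than a strict pattern.

```python
import re

# ASSUMPTION: RTX nonapeptide consensus G-G-X-G-X-D-X-U-X, where U is a
# large hydrophobic residue (taken here as L, I, V, F, W or Y).
RTX_NONAPEPTIDE = r"GG.G.D.[LIVFWY]."


def tandem_rtx_runs(protein, min_repeats=2):
    """Return (start, repeat_count) for each run of tandem nonapeptides.

    start is the 0-based position of the run in the protein sequence;
    runs shorter than min_repeats consecutive repeats are ignored.
    """
    pattern = re.compile(f"(?:{RTX_NONAPEPTIDE}){{{min_repeats},}}")
    return [(m.start(), len(m.group()) // 9) for m in pattern.finditer(protein)]


# Toy sequence (hypothetical): three tandem repeats, a spacer, one lone repeat
toy = "MKT" + "GGAGNDTLY" + "GGSGDDILR" + "GGDGNDVLT" + "AAAA" + "GGAGNDTLY"
print(tandem_rtx_runs(toy))
```

The lone repeat after the spacer is not reported because the default threshold requires at least two consecutive copies.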

There are two Colicin-V-like precursor proteins. Colicin V is an antibacterial polypeptide toxin produced by E. coli, which acts against closely related sensitive bacteria32. The precursors consist of 102-amino-acid peptides (XF0262, XF0263) that have the typical conserved leader 15-amino-acid motif, and have some similarity with Colicin V from E. coli at the remaining C-terminal portion. The necessary apparatus for Colicin biosynthesis and secretion is also present. Interestingly, in E. coli most of the genes necessary for biogenesis and export of Colicin V are in a gene cluster present in a plasmid, whereas in X. fastidiosa these genes are dispersed in the chromosome.

We found four genes that may function in polyketide biogenesis: polyketide synthase (PKS), pteridine-dependent deoxygenase, daunorubicin C-13 ketoreductase and a NonF-related protein. These genes belong to the synthesis pathways of frenolicin, rapamycin, daunorubicin and nonactin, respectively. These pathways include many more enzymes, which we did not find; however, some of the genes listed lie close to ORFs without significant database matches, suggesting that at least one (as yet undiscovered) polyketide pathway may be functional.

Figure 2 A comprehensive view of the biochemical processes involved in Xylella fastidiosa pathogenicity and survival in the host xylem. The principal functional categories are shown in bold, and the bacterial genes and gene products related to that function are arranged within the coloured section containing the bold heading. Transporters are indicated as follows: cylinders, channels; ovals, secondary carriers, including the MFS family; paired dumbbells, secondary carriers for drug extrusion; triple dumbbells, ABC transporters; bulb-like icon, F-type ATP synthase; squares, other transporters. Icons with two arrows represent symporters and antiporters (H+ or Na+ porters, unless noted otherwise). 2,5DDOL, 2,5-dichloro-2,5-cyclohexadiene-1,4-diol; EPS, exopolysaccharides; MATE, multi-antimicrobial extrusion family of transporters multidrug efflux gene (XF2686); MFS, major facilitator superfamily of transporters; Pbp, β-lactamase-like penicillin-binding protein (XF1621); RND, resistance-nodulation-cell division superfamily of transporters; ROS, reactive oxygen species.

Prophages

Bacteriophages can mediate the evolution and transfer of virulence factors and occasional acquisition of new traits by the bacterial host. Because as much as 7% of the X. fastidiosa genome sequenced corresponds to double-stranded (ds) DNA phage sequences, mostly from the Lambda group, we suspect that this route may have been of particular importance for this bacterium. It is noteworthy that a very high percentage of phage-related sequences has also been detected in a second vascular-restricted plant pathogen, Spiroplasma citri33. We identified four regions, with a high density of ORFs homologous to phage sequences, that we considered to be prophages, in addition to isolated phage sequences dispersed throughout the genome. Two of these prophages (each ~42 kbp, designated XfP1 and XfP2) are similar to each other, lie in opposite orientations in distinct regions and appear to belong to the dsDNA, tailed-phage group. Both appear to contain most of the genes responsible for particle assembly, although we know of no reports of phage particle release from X. fastidiosa cultures. In prophage XfP1, we found two ORFs between tail genes V and W that are similar to ORF118 and vapA from the virulence-associated region of the animal pathogen Dichelobacter nodosus, which by homology encode a killer and a suppressor protein34. Interestingly, in prophage XfP2, we found two other ORFs also between tail genes V and W that are similar to hypothetical ORFs of Ralstonia eutropha transposon Tn4371 (ref. 35). The other two identified prophages, XfP3 and XfP4, are also similar in sequence to each other and to the H. influenzae cryptic prophage φflu (ref. 36). They both contain a 14,317-bp exact repeat. Few particle-assembly genes were found in these regions, suggesting that these prophages are defective. An ORF similar to hicB from H. influenzae, a component of the major pilus gene cluster in some isolates, was found in XfP4 (ref. 37).
