Methods in Molecular Biology™, Volume 236
Plant Functional Genomics
Edited by Erich Grotewold

From: Methods in Molecular Biology, vol. 236: Plant Functional Genomics: Methods and Protocols. Edited by: E. Grotewold © Humana Press, Inc., Totowa, NJ

Plant BAC Library Construction

An Improved Method for Plant BAC Library Construction
Meizhong Luo and Rod A. Wing

Summary
Large genomic DNA insert-containing libraries are critical tools for physical mapping, positional cloning, and genome sequencing of complex genomes. The bacterial artificial chromosome (BAC) cloning system has become the dominant system for cloning large genomic DNA inserts. As the costs of positional cloning, physical mapping, and genome sequencing continuously decrease, there is an increasing demand for high-quality, deep-coverage, large-insert BAC libraries. In our laboratory, we have constructed many such libraries, including libraries for Arabidopsis, monocot and dicot crop plants, and plant pathogens. Here, we present the protocol used in our laboratory to construct BAC libraries.

Key Words: BAC; library; method; pCUGIBAC1; plant.

1. Introduction
Large genomic DNA insert-containing libraries are essential for physical mapping, positional cloning, and genome sequencing of complex genomes. There are two principal large-insert cloning systems, constructed as yeast or bacterial artificial chromosomes (YACs and BACs, respectively). The YAC cloning system (1) was first developed in 1987; it uses Saccharomyces cerevisiae as the host and maintains large inserts (up to Mb) as linear molecules with a pair of yeast telomeres and a centromere. Although used extensively in the late 1980s and early 1990s, this system has several disadvantages (2,3). The recombinant DNA in yeast can be unstable. DNA manipulation is difficult and inefficient. Most importantly, a high level of chimerism, the cloning of two or more unlinked DNA fragments in a single molecule, is inherent within 
the YAC cloning system. These disadvantages impede the utility of YAC libraries, and this system has consequently been gradually replaced by the BAC cloning system, introduced in 1992 (4).

The BAC cloning system uses a derivative of the Escherichia coli F-factor as the vector and E. coli as the host, making library construction and subsequent downstream procedures efficient and easy to perform. Recombinant DNA inserts up to 200 kb can be efficiently cloned and stably maintained in E. coli. Although the insert size cloning capacity is much lower than that of the YAC system, it is this limited cloning capacity that helps to prevent chimerism: inserts with sizes between 130 and 200 kb can be selected, while larger inserts, composed of two or more DNA fragments, are beyond the cloning capacity of the BAC system or are much less efficiently cloned.

In 1994, our laboratory was the first to construct a BAC library for plants, using Sorghum bicolor (5). Since then, we have constructed a substantial number of deep-coverage BAC libraries, including Arabidopsis (6), rice (7), melon (8), tomato (9), soybean (10), and barley (11), and have provided them to the community for genomics research (http://www.genome.arizona.edu and http://www.genome.clemson.edu).

The construction of a BAC library is quite different from that of a general plasmid or DNA library used to isolate genes or promoter sequences by positive screening. Megabase high molecular weight DNA is required for BAC library construction. Because individual clones of the BAC library will be picked, stored, arrayed on filters, and directly used for mapping and sequencing, a BAC library with a small average insert size and a high empty clone (no insert) rate will dramatically increase the cost and labor of subsequent work. Usually, a BAC library with an average insert size smaller than 130 kb and an empty clone rate higher than 5% is unacceptable. These strict requirements make BAC library construction much more difficult than the 
construction of a general DNA library.

As the costs of positional cloning, physical mapping, and genome sequencing continuously decrease, the demand for high-quality, deep-coverage, large-insert BAC libraries increases (12). In this chapter, we therefore describe how our laboratory constructs BAC libraries.

Several protocols have been published for the construction of high-quality plant and animal BAC libraries (13–18), including three from our laboratory (16–18). We improved on these methods in several ways (8). First, to make it easy to isolate large quantities of the single-copy BAC vector, pIndigoBAC536 (see Note 1) was cloned into a high-copy cloning vector, pGEM-4Z. This new vector, designated pCUGIBAC1 (Fig. 1), replicates as a high-copy vector and can be isolated in large quantity using standard plasmid DNA isolation methods. It retains all three unique cloning sites (HindIII, EcoRI, and BamHI), as well as the two NotI sites flanking the cloning sites, of the original pIndigoBAC536. Second, to improve the stability of megabase DNA and size-selected DNA fractions in agarose, as well as digested dephosphorylated BAC vectors, we determined that such material can be stored indefinitely in 70% ethanol at –20°C and in 40–50% glycerol at –80°C, respectively. The vector has been distributed to many users worldwide, and the high molecular weight DNA preservation method, established by Luo et al. (8), has been extensively used by colleagues and visitors and shown to be very efficient (18). These improvements and the protocols described here save resources, cost, and labor, and also relieve time constraints on BAC library construction.

Fig. 1. pCUGIBAC1. Not drawn to scale.

2. Materials, Supplies, and Equipment
2.1 For pCUGIBAC1 Plasmid DNA Preparation
1. pCUGIBAC1 (http://www.genome.clemson.edu).
2. LB medium: 10 g/L bacto-tryptone, g/L bacto-yeast extract, 10 g/L NaCl.
3. Ampicillin and chloramphenicol (Fisher Scientific).
4. Qiagen plasmid midi kit (Qiagen).
5. Thermostat shaker 
(Barnstead/Thermolyne).

2.2 For BAC Vector pIndigoBAC536 Preparation
2.2.1 For Method One
1. Restriction enzymes (New England Biolabs).
2. HK phosphatase, Tris-acetate (TA) buffer, 100 mM CaCl2, ATP, T4 DNA ligase (Epicentre).
3. Agarose and glycerol (Fisher Scientific).
4. 10× Tris-borate EDTA (TBE) and 50× Tris-acetate EDTA (TAE) buffer (Fisher Scientific).
5. kb DNA ladder (New England Biolabs).
6. Ethidium bromide (EtBr) (10 mg/mL).
7. DNA (Promega).
8. Water baths.
9. CHEF-DR III pulse field gel electrophoresis system (Bio-Rad).
10. Dialysis tubing (Spectra/Por2 tubing, 25 mm; Spectrum Laboratories).
11. Model 422 electro-eluter (Bio-Rad).
12. Minigel apparatus Horizon 58 (Whatman).
13. UV transilluminator.

2.2.2 For Method Two
1. Restriction enzymes and calf intestinal alkaline phosphatase (CIP) (New England Biolabs).
2. 0.5 M EDTA, pH 8.0.
3. Absolute ethanol, agarose, and glycerol (Fisher Scientific).
4. T4 DNA ligase (Promega).
5. 10× TBE and 50× TAE buffer (Fisher Scientific).
6. kb DNA ladder.
7. EtBr (10 mg/mL).
8. DNA.
9. Water baths.
10. CHEF-DR III pulse field gel electrophoresis system.
11. Dialysis tubing (Spectra/Por2 tubing, 25 mm).
12. Model 422 electro-eluter.
13. Minigel apparatus Horizon 58.
14. UV transilluminator.

2.3 For Preparation of Megabase Genomic DNA Plugs from Plants
1. Nuclei isolation buffer (NIB): 10 mM Tris-HCl, pH 8.0, 10 mM EDTA, pH 8.0, 100 mM KCl, 0.5 M sucrose, mM spermidine, mM spermine.
2. NIBT: NIB with 10% Triton® X-100.
3. NIBM: NIB with 0.1% β-mercaptoethanol (add just before use).
4. Low melting temperature agarose (FMC).
5. Proteinase K solution: 0.5 M EDTA, 1% N-lauroylsarcosine, adjust pH to 9.2 with NaOH; add proteinase K to mg/mL before use.
6. 50 mM phenylmethylsulfonyl fluoride (PMSF) (Sigma) stock solution (prepared in ethanol or isopropanol).
7. T10E10 (10 mM Tris-HCl and 10 mM EDTA, pH 8.0) and TE (10 mM Tris-HCl and mM EDTA, pH 8.0).
8. Mortars, pestles, liquid nitrogen, 1-L flasks, cheese cloth, small paintbrush, and Pasteur pipet bulbs.
9. 50-mL Falcon® tubes (Fisher Scientific) and 
miracloth (Calbiochem-Novabiochem).
10. Plug molds (Bio-Rad).
11. GS-6R centrifuge (Beckman).
12. Model 230300 Bambino hybridization oven (Boekel Scientific).

2.4 For Preparation of High Molecular Weight Genomic DNA Fragments
2.4.1 For Pilot Partial Digestions
1. Restriction enzymes and BSA (Promega).
2. 40 mM spermidine (Sigma) and 0.5 M EDTA, pH 8.0.
3. Ladder pulsed field gel (PFG) marker (New England Biolabs).
4. Agarose and 10× TBE.
5. EtBr (10 mg/mL).
6. Razor blades, microscope slides, and water baths.
7. CHEF-DR III pulse field gel electrophoresis system.
8. UV transilluminator.
9. EDAS 290 image system (Eastman Kodak).

2.4.2 For DNA Fragment Size Selection
1. Restriction enzymes and BSA.
2. 40 mM spermidine and 0.5 M EDTA, pH 8.0.
3. Ladder PFG marker.
4. Agarose and 10× TBE.
5. Low melting temperature agarose.
6. EtBr (10 mg/mL) and 70% ethanol.
7. Razor blades, microscope slides, water baths, and a ruler.
8. CHEF-DR III pulse field gel electrophoresis system.
9. UV transilluminator.
10. EDAS 290 image system.

2.5 For BAC Library Construction
2.5.1 For DNA Ligation
1. T4 DNA ligase and DNA.
2. Agarose and 1× TAE buffer.
3. EtBr (10 mg/mL).
4. Dialysis tubing (Spectra/Por2 tubing, 25 mm) or Model 422 electro-eluter.
5. Minigel apparatus Horizon 58.
6. UV transilluminator.
7. Water baths.
8. 0.1 M glucose/1% agarose cones: melt 0.1 M glucose and 1% agarose in water, dispense mL into each 1.5-mL microcentrifuge tube, and insert a 0.5-mL microcentrifuge tube into each 1.5-mL tube containing the 0.1 M glucose and 1% agarose; after solidification, pull out the 0.5-mL tubes.

2.5.2 For Test Transformation
1. DH10B T1 phage-resistant cells (Invitrogen).
2. SOC: 20 g/L bacto-tryptone, g/L bacto-yeast extract, 10 mM NaCl, 2.5 mM KCl; autoclave, and add filter-sterilized MgSO4 to 10 mM, MgCl2 to 10 mM, and glucose to 20 mM before use.
3. 100-mm diameter Petri dish agar plates containing LB with 12.5 µg/mL of chloramphenicol, 80 µg/mL of X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactoside or 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside), and 100 µg/mL of IPTG 
(isopropyl-β-D-thiogalactoside or isopropyl-β-D-thiogalactopyranoside).
4. 15-mL culture tubes.
5. Thermostat shaker.
6. Electroporator (cell porator; Life Technologies).
7. Electroporation cuvettes (Whatman).
8. 37°C incubator.

2.5.3 For Insert Size Estimation
2.5.3.1 For BAC DNA Isolation
1. LB with 12.5 µg/mL chloramphenicol.
2. Isopropanol and ethanol.
3. P1, P2, and P3 buffers from plasmid kits (Qiagen).
4. 15-mL culture tubes.
5. Thermostat shaker.
6. Microcentrifuge.

2.5.3.2 For BAC Insert Size Analysis
1. NotI (New England Biolabs).
2. DNA loading buffer: 0.25% (w/v) bromophenol blue and 40% (w/v) sucrose in TE, pH 8.0.
3. MidRange I PFG molecular weight marker (New England Biolabs).
4. Agarose, 0.5× TBE buffer, and EtBr (10 mg/mL).
5. 37°C water bath or incubator.
6. CHEF-DR III pulse field gel electrophoresis system.
7. UV transilluminator.
8. EDAS 290 image system.

2.5.4 For Bulk Transformation, Colony Array, and Library Characterization
1. Freezing media: 10 g/L bacto-tryptone, g/L bacto-yeast extract, 10 g/L NaCl, 36 mM K2HPO4, 13.2 mM KH2PO4, 1.7 mM Na-citrate, 6.8 mM (NH4)2SO4, 4.4% glycerol; autoclave, and add filter-sterilized MgSO4 stock solution to 0.4 mM.
2. 384-well plates and Q-trays (Genetix).
3. Toothpicks (hand picking) or Q-Bot (Genetix).

3. Methods
3.1 Preparing pCUGIBAC1 Plasmid DNA
1. Inoculate a single well-isolated E. coli clone harboring pCUGIBAC1 in LB containing 50 mg/L of ampicillin and 12.5 mg/L of chloramphenicol and grow at 37°C for about 20 h with continuous shaking.
2. Prepare pCUGIBAC1 plasmid DNA using the plasmid midi kit according to the manufacturer's instructions, except that after adding solution P2, incubate the sample at room temperature for not more than instead of (see acknowledgments). Each 100 mL of culture yields about 100 µg of plasmid DNA when using a midi column.

3.2 Preparing BAC Vector, pIndigoBAC536
3.2.1 Method One
1. Set up 4–6 restriction digestions, each digesting µg pCUGIBAC1 plasmid DNA (with HindIII, EcoRI, or BamHI, depending on which enzyme is selected for BAC library 
construction) in 150 µL 1× TA buffer at 37°C for h. Check µL on a 1% agarose minigel to determine whether the plasmid is digested.
2. Heat the digestions at 75°C for 15 min to inactivate the restriction enzyme.
3. Add µL of 100 mM CaCl2, 1.5 µL of 10× TA buffer, and µL of HK phosphatase, and incubate the samples at 30°C for h.
4. Heat the samples at 75°C for 30 min to inactivate the HK phosphatase.
5. Add 6.4 µL of 25 mM ATP, µL of U/µL T4 DNA ligase, and 1.3 µL of 10× TA buffer, and incubate at 16°C overnight for self-ligation.
6. Heat the self-ligations at 75°C for 15 min.
7. Combine the samples and run the combined sample in a single well, made by taping together several teeth of the comb according to the sample volume, on a 1% CHEF agarose gel at 1–40 s linear ramp, V/cm, 14°C in 0.5× TBE buffer for 16–18 h, with the kb ladder loaded as a marker into the wells on both sides of the gel.
8. Stain the two sides of the gel containing the marker and a small part of the sample with 0.5 µg/mL EtBr, and recover the gel fraction containing the 7.5-kb pIndigoBAC536 DNA band from the unstained center part of the gel by aligning it with the two stained sides. Undigested circular plasmid DNA and nondephosphorylated linear DNA that has recircularized or formed concatemers after self-ligation should be reduced to an acceptable level after this step. Figure 2 shows a gel restained with 0.5 µg/mL EtBr after the gel fraction containing the 7.5-kb pIndigoBAC536 vector had been recovered. The 2.8-kb band is the pGEM-4Z vector.

Fig. 2. Recovering linearized dephosphorylated 7.5-kb pIndigoBAC536 vector from a CHEF agarose gel. See text for details.

9. Electroelute pIndigoBAC536 from the agarose gel slice in 1× TAE buffer at 4°C. Either dialysis tubing (19) or the Model 422 electro-eluter (18) can be used.
10. Estimate the DNA concentration by running µL of its dilution along with µL of each of serial dilutions of DNA standards (1, 2, 4, and ng/µL) on a 1% agarose minigel containing 0.5 µg/mL EtBr (for 10 min) and comparing the images 
under UV light, or simply by spotting a 1-µL dilution along with µL of each of serial dilutions of DNA standards (1, 2, 4, and ng/µL) on a 1% agarose plate containing 0.5 µg/mL EtBr, incubating at room temperature for 10 min, and comparing the images under UV light.
11. Adjust the DNA concentration to ng/µL with glycerol (final glycerol concentration 40–50%), aliquot into microcentrifuge tubes, and store the aliquots at –80°C. Use each aliquot only once.
12. Test the vector quality by cloning DNA fragments digested with the same restriction enzyme as used for vector preparation. Prepare a sample without the DNA fragments as the self-ligation control. For ligation, transformation, and insert check, follow the protocols in Subheading 3.5 for BAC library construction, except that inserts are checked on a standard agarose gel instead of a CHEF gel. Colonies from the ligation with the DNA fragments should be at least 100 times more abundant than those from the self-ligation control. More than 95% of the white colonies from the ligation with the DNA fragments should contain inserts.

3.2.2 Method Two
Set up 4–6 digestions, each digesting µg pCUGIBAC1 plasmid DNA (with HindIII, EcoRI, or BamHI, depending on which enzyme is selected for BAC library construction) in 150 µL 1× restriction buffer at 37°C for h. Check µL on a 1% agarose minigel to see whether the plasmid is digested.
Add U of CIP and incubate the samples at 37°C for an additional h (see Note 2).
Add EDTA to mM and heat the samples at 75°C for 15 min.
Precipitate the DNA with ethanol, wash it with 70% ethanol, air-dry, and add 88 µL of water, 10 µL of 10× T4 DNA ligase buffer, and µL of U/µL T4 DNA ligase.
Incubate the samples at 16°C overnight for self-ligation.
Then follow steps 6–12 of Method One (Subheading 3.2.1.).
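The two acceptance criteria in step 12 of Method One can be written down as a simple check. The following is a minimal sketch, not part of the published protocol: the function name, parameter names, and default thresholds are hypothetical labels for the figures quoted in step 12, and it assumes that the two colony counts come from equal plated volumes so that they are directly comparable.

```python
def vector_passes_qc(insert_colonies, control_colonies,
                     white_checked, white_with_insert,
                     min_fold=100.0, min_insert_fraction=0.95):
    """Return True if a vector batch meets both step-12 criteria.

    All names here are illustrative; the thresholds restate step 12.
    """
    if white_checked <= 0:
        raise ValueError("at least one white colony must be checked")
    # Criterion 1: colonies from the ligation with DNA fragments are at
    # least `min_fold` times more abundant than self-ligation control
    # colonies. Cross-multiplying avoids division by zero when the
    # control plate has no colonies.
    fold_ok = (insert_colonies > 0
               and insert_colonies >= min_fold * control_colonies)
    # Criterion 2: more than `min_insert_fraction` of the white colonies
    # that were checked contain an insert.
    insert_ok = white_with_insert / white_checked > min_insert_fraction
    return fold_ok and insert_ok


if __name__ == "__main__":
    # Hypothetical counts: 2000 vs 10 colonies, 39 of 40 white colonies with insert.
    print(vector_passes_qc(2000, 10, 40, 39))
```

The counts in the usage line are invented for illustration only; a batch that fails either criterion would normally be gel-purified again or prepared afresh.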
3.3 Preparing Megabase Genomic DNA Plugs from Plants (see [18] for alternatives) (see Note 3)
Young seedlings of monocotyledon plants, such as rice and maize, and young leaves of dicotyledon plants, such as melon, are used fresh or collected and stored at –80°C.
Grind about 100 g of tissue in liquid N2 with a mortar and a pestle until only some small tissue chunks can still be seen (see Note 4).
Divide and transfer the ground tissue into two 1-L flasks, each containing 500 mL of ice-cold NIBM (1 g tissue/10 mL). Keep the flasks on ice for 15 min with frequent and gentle shaking.
Filter the homogenate through four layers of cheese cloth and one layer of miracloth. Squeeze the pellet to allow maximum recovery of nuclei-containing solution.
Filter the nuclei-containing solution again through one layer of miracloth.
Add 1:20 (in vol) of NIBT to the nuclei-containing solution and keep the mixture on ice for 15 min with frequent and gentle shaking.
Transfer the mixture into 50-mL Falcon tubes. Centrifuge the tubes at 2400g at 4°C for 15 min.
Gently resuspend the pellets in the residual buffer by tapping the tubes or with a small paintbrush.
10. Dilute the nucleus suspension with NIBM and combine it into two 50-mL Falcon tubes. Adjust the vol to 50 mL with NIBM in each tube and centrifuge the tubes at 2400g at 4°C for 15 min.
11. Resuspend the pellets as in step. Dilute the nucleus suspension with NIBM and combine it into one 50-mL Falcon tube. Adjust the vol to 50 mL with NIBM and centrifuge it at 2400g at 4°C for 15 min.
12. Remove the supernatant and gently resuspend the pellet in approx 1.5 mL of NIB.
13. Incubate the nucleus suspension at 45°C for. Gently mix it with an equal vol of 1% low melting temperature agarose, prepared in NIB and pre-incubated

Kjemtrup et al.

ent conditions can be problematic. This is especially true if the data are collected solely with reference to chronological age. In contrast, phenotypic data collected with reference to a commonly defined series of growth stages would 
provide a coherency not otherwise achieved by collection procedures based on chronological age alone.

Common growth stage definitions have been developed for a number of experimental organisms, including Caenorhabditis elegans and Drosophila (2,3). Similar scales have also been developed for many agronomically important plant species (e.g., a decimal code for the growth stages of cereals [4]). One of these is the BBCH growth stage scale, named for the consortium of agricultural companies that developed it (BASF, Bayer, Ciba-Geigy, and Hoechst). The BBCH scale provides a comprehensive growth stage description and can be adapted for most crop and weed species (5). Although the scale was originally developed as a means of communication among agriculturists, adaptation of such a universal scale at the laboratory level would allow for easier data comparisons between and within species. The use of growth stage definitions will greatly facilitate information sharing and increase the value of individual research projects.

The BBCH scale assigns a numerical value (0–9) to 10 principal developmental stages that occur throughout plant development: 0, germination, sprouting; 1, leaf development; 2, formation of side shoots; 3, stem elongation–rosette growth; 4, vegetative plant parts; 5, inflorescence emergence; 6, flowering; 7, fruit development; 8, ripening; 9, senescence. Each principal growth stage is subdivided into 10 more detailed morphological events germane to the principal stage. The resulting code provides a digital naming convention for nearly any developmental stage of a plant at any given time (see Table 1).

We have adapted a modified version of the BBCH scale for high-throughput phenotyping of Arabidopsis (6). This chapter describes a two-phase method for the collection of data for both quantitative and qualitative traits spread over the developmental timeline of the plant. In the first phase of the method, data are collected that enable a series of landmark growth stages to be defined. The second 
phase involves the collection of detailed data for additional traits that are of particular interest at any one of these given stages. Figure 1 illustrates the growth stages and phases we use for data collection. While we focus on the application of this method to Arabidopsis, a similar strategy can be applied to the collection of similar data from other plant species as well.

Growth Stage-Based Phenotypic Profiling of Plants

Table 1. BBCH Growth Scale (5)

Numeric code   Growth stage description

Principal growth stage 0: Germination; sprouting
00   Dry seed
01   Seed imbibition begins
03   Seed imbibition complete
05   Radicle emerged from seed
06   Elongation of radicle, formation of root hairs or lateral roots
07   Coleoptile emerged; hypocotyls with cotyledons broken through seed coat
08   Hypocotyl with cotyledons grow toward soil surface
09   Cotyledons or coleoptile breaks through soil surface

Principal growth stage 1: Leaf development (main shoot)
10   First true leaf emerged from coleoptile; or cotyledons completely unfolded
11   First true leaf
12   Two true leaves
13   Three true leaves, etc.
19   Nine or more true leaves (if tillering or shoot and stem elongation occur at an earlier stage or not at all, continue with either stage 21 or 31)

Principal growth stage 2: Formation of side shoots or tillering
21   First side shoot or tiller visible
22   Two or more side shoots or tillers visible
23   Three or more side shoots or tillers visible, etc., to 28
29   Nine or more side shoots or tillers visible

Principal growth stage 3: Stem elongation or rosette growth (main shoot; shoot development)
31   Stem (rosette) 10% of final length (diameter) or nodes detectable
32   Stem (rosette) 20% of final length (diameter) or nodes detectable
33   Stem (rosette) 30% of final length (diameter) or nodes detectable, etc., to 38
39   Maximum stem length or rosette diameter reached; or more nodes visible

Principal growth stage 4: Development of harvestable vegetative plant parts
41   Harvestable vegetative plant parts begin to develop or flag leaf sheath extending
43   Harvestable vegetative plant parts have reached 30% of final size; or flag leaf sheath just visibly swollen
45   Harvestable vegetative plant parts have reached 50% of final size; or flag leaf sheath swollen
47   Harvestable vegetative plant parts reach 70% of final size; or flag leaf sheath opening
49   Harvestable vegetative plant parts reach final size; or first awns visible

Principal growth stage 5: Inflorescence emergence (main shoot); ear or panicle emergence
51   Inflorescence or flower buds visible
55   First individual (closed) flowers visible
59   First flower petals visible; or inflorescence fully emerged

Principal growth stage 6: Flowering on main shoot
61   Beginning of flowering: 10% flowers open
63   30% Flowers open
65   50% Flowers open; first petals fallen or dry
67   Flowering finishing; majority of petals fallen or dry
69   End of flowering; fruit set visible

Principal growth stage 7: Development of fruit
71   Small fruits visible or fruit has reached 10% of final size
73   First fruits have reached final size or fruit has reached 30% of final size
75   50% Fruits have reached final size or fruit has reached 50% of normal size
77   70% Fruits have reached final size or fruit has reached 70% of normal size
79   Nearly all fruits have reached final size

Principal growth stage 8: Ripening or maturity of fruit and seed
81   Beginning of ripening or fruit coloration
85   Advanced ripening or fruit coloration
87   Fruit begins to soften
89   Fully ripe; beginning of fruit abscission

Principal growth stage 9: Senescence: beginning of dormancy
91   Shoot development completed; foliage still green
93   Leaves begin to change color or fall
95   50% Leaves discolored or fallen
97   Plant material dead or dormant
99   Harvested seed

Fig. 1. (A) Growth stage-based early analysis of Arabidopsis: growth stages and descriptions are indicated in the black box. Additional data collection steps are outlined in the boxes below each stage.

Fig. 1. (B) Growth stage-based analysis of soil-grown Arabidopsis: growth stages and descriptions are indicated in the black box. Additional data collection 
steps are outlined in the boxes below each stage.

2. Materials
2.1 Arabidopsis Growth and Maintenance
1. Growth chambers: Model TCR 480 (Conviron). Each of these chambers is outfitted with 20 growth racks. Each rack has four shelves of growth space, each of which is approx × ft in size and accommodates four standard greenhouse flats. Thus, the maximum capacity of an entire chamber is 320 flats.
2. Potting medium: e.g., Metro Mix 360 (Scotts).
3. Granular time-release fertilizer: e.g., Osmocote 18–6–12 (Scotts).
4. Mini-Mayer 2100 potting machine (Gro-May) modified to fill square pots (see Note 1).
5. Pipets or liquid handling robot (e.g., Genesis RSP-200; Tecan US) to sow seed (see Note 2).
6. 0.1% (w/v) agarose solution in water to suspend seeds for sowing.
7. Cold room or refrigerator to stratify seeds at 4°C.

2.2 Data Collection Instruments
1. 300-mm ruler (VWR Scientific).
2. Electronic or manual counters (VWR Scientific).
3. Electronic calipers (Mitutoyo America) connected to the computer via an RS232 port. Alternatively, manual calipers (VWR Scientific) will suffice.
4. Digital camera with USB or FireWire connection to the computer. We use Nikon® D1 series cameras (Nikon).
5. Balance or automated weighing station.

2.3 Data Collection Software
Options for data collection software include:
1. Electronic spreadsheets (e.g., Microsoft® Excel®).
2. Customized relational databases (e.g., Microsoft Access, ORACLE).
3. Commercial electronic laboratory notebooks (ELN). Some popular ELNs include LabTrack (http://www.labtrack.com/), which is by far the most popular ELN available. It acts as both a laboratory information management system (LIMS) and a notebook: it works not only as a word processor, but also provides the reporting and searching capabilities of a relational database. An ELN can also become a legally acceptable document with the addition of service subscriptions like First Use (http://www.firstuse.com/) or Surety (http://www.surety.com/).

2.4 Data Analysis Software
2.4.1 
Data Analysis: Commercial
Commercial options for data analysis software include:
1. Statistical analysis software by SAS®.
2. Microsoft Excel.

2.4.2 Image Analysis: Commercial
Commercial options for image analysis software include:
1. Image Pro Plus (http://www.mediacy.com/ippage.htm) and Optimas (http://www.optimas.com/optimas.htm), offered by Media Cybernetics.
2. IP Lab (http://www.scanalytics.com/product.html) for Macintosh® and Windows™ operating systems, offered by Scanalytics.

2.4.3 Image Analysis: Public Domain
There are also a number of public domain software packages available for image analysis, including:
1. National Institutes of Health (NIH) Image for Macintosh® (http://rsb.info.nih.gov/nih-image/) and ImageJ for any computer with Java 1.1 (http://rsb.info.nih.gov/ij/).
2. Scion Image (http://www.scioncorp.com/frames/fr_scion_products.htm), which is a Windows equivalent to NIH Image.
3. Image Tool (http://ddsdx.uthscsa.edu/dig/itdesc.html).

3. Methods
3.1 Arabidopsis Growth and Maintenance
3.1.1 Growth Conditions
Large-scale phenotyping efforts require consistent long-term reproducibility of environmental conditions for plant growth. Unlike greenhouses, growth chambers are subject to fewer outside environmental influences and are easier to control. Growth chambers have the added benefit of allowing plants to be grown at a higher density than is possible in most greenhouses. We maintain the following conditions in our chambers:
1. Lighting: the light intensity over each shelf is maintained at 175–200 µE by a fixture containing nine T8 cool white fluorescent tubes. The distance between the fluorescent tubes and the shelf below is approx 55 cm. To ensure consistent illumination over time, one-third of the fluorescent tubes are replaced every mo.
2. The day length is 16 h.
3. Daytime temperature is 22°C.
4. Nighttime temperature is 20°C.
5. Relative humidity is held constant at 65%.

3.1.2 Potting Medium Preparation
1. One bag (3 cu ft) of commercial potting medium is supplemented with 90 g of granular 
fertilizer and gallons water.
2. Components are mixed until evenly dispersed using a cement mixer or commercial soil mixer.
3. Pots are filled manually or, for high-throughput operation, with a potting machine (see Note 1).
4. A standard 10 in × 20 in greenhouse flat holds 32 2-in pots configured in a × grid.

3.1.3 Seed Sowing
1. Prior to sowing, seeds are suspended in a solution of 0.1% (w/v) agarose and placed at 4°C for d to synchronize germination.
2. Depending on throughput and application, sowing can be performed either manually using a pipet or, in a more automated fashion, using a liquid handling robot (see Note 2).
3. After sowing, the flats are watered, covered with a humidity dome, and placed in the growth room.
4. Following germination, the humidity dome is removed, and the flats are irrigated every d until mid-flowering, at which point watering is increased to every day until seed set is complete.
5. Flats are irrigated from below using an ebb and flood method. All irrigation water is purified by reverse osmosis prior to use.

3.2 Collection of Data to Define Growth Stages
1. Growth-stage scale development or determination: the BBCH scale is a ready-made template to aid in the definition of landmark growth stages. However, its generic nature requires that it be more inclusive than exclusive, so the challenge of adapting it to a particular plant species is one of detail reduction and focus. The scale we use for Arabidopsis is based on a version of the BBCH scale already developed for the related plant, Brassica (5). Principal growth stages 2 and 4 were removed from this version of the scale, as they reference tiller formation and harvestable seed production in monocots. The remaining principal growth stages of relevance are 0, 1, 3, 5, 6, 7, 8, and 9.
2. Scale refinement (if necessary): further refinement of growth stage definitions may be required for operational use, especially for those cases where growth stages are defined relative to the 
percentage of completion of that stage. An example of this is principal growth stage 6 (inflorescence development). The beginning (stage 6.0) and end of inflorescence development (stage 6.9) are easily identified in real time, as the time to first flower opening and flowering completion, respectively. In contrast, the point at which 50% of the inflorescence has been produced (stage 6.5) can only be defined in retrospect, after flowering is complete. Thus, using growth stages that are defined relatively as real-time data collection triggers is a practical impossibility. A useful strategy in these cases is to identify a trait that can be followed as a surrogate to determine when the growth stage of interest has been reached. For example, in a pilot experiment we counted flowers every other day between stages 6.0 and 6.9, and simultaneously measured stem height. We found that the rate of stem elongation plateaued concomitantly with stage 6.5 as defined by the flower count and could, therefore, serve as a surrogate trait. From this result, we developed a working definition of stage 6.5 as the day on which the rate of stem elongation decreased by more than 20% for two consecutive measurement cycles (6). The growth stages we use routinely in the analysis of Arabidopsis phenotypes are listed in Table 2. Note that each growth stage is defined by a simple observation or robust measurement that can be determined rapidly (see Note 3). It is important that measurement of growth stage-determining traits be rapid and simple, because additional data specific to a particular growth stage can thus be collected concurrent with the attainment of that stage (see Note 4).

Table 2. Arabidopsis Growth Stages and Measurements

Stage   Description

Principal growth stage 0: Seed germination
0.10   Seed imbibition
0.50   Radicle emergence
0.7    Hypocotyl and cotyledon emergence
R6     More than 50% of the seedlings have primary roots cm in length

Principal growth stage 1: Leaf development
1.0    Cotyledons fully opened
1.02   rosette leaves >1 mm in length
1.03   rosette leaves >1 mm in length, etc., to stage 1.14

Principal growth stage 3: Rosette growth
3.20   Rosette is 20% of final size
3.50   Rosette is 50% of final size
3.70   Rosette is 70% of final size
3.90   Rosette growth complete

Principal growth stage 5: Inflorescence emergence
5.10   First flower buds visible

Principal growth stage 6: Flower production
6.00   First flower open
6.10   10% Flowers to be produced have opened
6.30   30% Flowers to be produced have opened
6.50   50% Flowers to be produced have opened
6.90   Flowering complete

Principal growth stage 7: Silique filling

Principal growth stage 8: Silique ripening
8.00   First silique shattered

Principal growth stage 9: Senescence
9.70   Senescence complete; ready for seed harvest

Measurement or action (entries in the order given): Visual inspection; Visual inspection; Visual inspection; Caliper measurement of root; Visual inspection; Leaf count; Leaf count; Caliper measurement of longest leaf; Tissue harvest; Visual inspection; Visual inspection, tissue image; Ruler measurement of stem height for correlative determination; Image; Tissue dissection and dry weight; Visual inspection; Seed harvest.

3. Characterization of traits to measure at specific growth stages: besides growth stage measurements, additional traits can be assembled into modules to collect more extensive data for a particular stage of interest. Modules could include characterization of floral morphology at mid-flowering (stage 6.5), yield and seed-related traits at the conclusion of seed maturation (stage 9.7), or disease characterization during vegetative development (stage 1.10). The scope of the data collection in these modules can range from a broad survey of traits, in an attempt to uncover as many phenotypes as possible, to the analysis of a specific trait at a single stage of growth (see Note 5). Figure 1A,B illustrates a variety of Arabidopsis traits that can be collected for early analysis on plates and whole plant analysis on soil.
4. Evaluation of possible quantitative traits: traits can 
be quantitative or qualitative and can include processes such as harvesting tissue samples for subsequent extraction and analysis by methods including gene expression profiling and biochemical profiling. Examples of robust quantitative traits for the analysis of Arabidopsis include biomass of leaves, stems, siliques, and seeds, as well as the length of siliques, pedicels, etc. These traits can be assessed easily through the use of standard equipment including a balance, caliper, and ruler (see Note 6). With a greater investment in technology, a large number of metrics can also be extracted from digital images. Traits such as area, perimeter, major and minor axis, and shape (e.g., eccentricity, standard deviation of the radius) can be quantitated readily from an image of seeds, siliques, pollen grains, or an intact rosette. Image analysis can also be used to more precisely assess traits, such as flower size, that are challenging to measure by hand (see Note 7). Technology can also be applied to the analysis of other traits. For instance, abnormal leaf color can be indicative of any number of underlying metabolic or developmental defects. While color can be assessed qualitatively (see step 5), one can also use a color spectrophotometer to quantify the wavelengths of light reflected from the subject.
438 Kjemtrup et al
5. Evaluation of possible qualitative traits: clearly, not all visual phenotypes will be represented equally through an assessment of quantitative traits such as those described in step 4. Therefore, qualitative descriptors and images should also be included as part of the phenotyping process. While free text can be used as a means to capture descriptive data, we have developed an Arabidopsis Phenotype Taxonomy (APT) specifically for this purpose (see Note 8). The APT consists of a structural hierarchy (e.g., inflorescence::stem::flower::petal) that has modifying terms for every structure (e.g., inflorescence::height, stem::width, flower::male sterile,
petal::color). The APT contains descriptions of shape, size, color, dimension, etc., for each major feature of Arabidopsis, as well as descriptions for altered developmental timing and stress tolerance. Pleiotropic phenotypes can be described through the assignment of multiple APT entries. For maximum utility, the APT entries can be associated with a corresponding image of the plant.
6. Efficient measurement design for a population of plants: phenotypic profiling invariably involves the analysis of populations of plants. In some cases, the population size can be very large, and even with a computerized data collection system, it quickly becomes a logistical difficulty to track the development of each plant individually and to collect growth stage-specific data for each at the appropriate time. To address this problem, we exploited the fact that the time required for individual plants to reach a growth stage is distributed normally within the population. Data to determine growth stage are collected at the level of individual plants, and the population is considered to have reached a growth stage when 50% or more of the surveyed individuals have reached the growth stage of interest. This event triggers the collection of additional data specific for that growth stage from all of the individuals within the population. This method reduces the complexity of the data collection process by providing a mechanism to schedule growth stage-specific data collection only once during the development of each population. Some of the data collection processes result in destruction of the specimen. This should be taken into account when designing the phenotyping process. When a population reaches a growth stage of interest, a subset of plants can be harvested for analysis, while the remaining plants are allowed to continue to grow for later analysis. This strategy permits a complete set of developmental data to be collected from plants grown at the same time under the same conditions.
3.3 Sample Tracking and
Data Entry
Various types of software can be used to track samples and record data. For small-scale experiments, a spreadsheet may suffice. However, for larger studies, a relational database is preferable. Relational databases allow information to be stored and retrieved more efficiently than spreadsheets. They also support the development of graphical user interfaces that enable highly efficient entry of data as well as more sophisticated queries of the data (see Note 9).
3.4 Quality Control
Variability is inherent in the assessment of biological phenomena. While phenotypic variation resulting from a genetic difference is typically a desired outcome, experiments can be compromised as a result of uncontrolled variability from undesirable sources. Some of these sources include variation in environmental conditions, data collection technique, and data entry errors. Safeguards to minimize these and other undesirable sources of variability are essential. One or all of the following steps can be implemented to help control the quality of the data collection process:
Develop and adhere to standard operating procedures pertaining to those components described above to minimize phenotypic variation resulting from inconsistency in environmental conditions. For example, we routinely monitor the electroconductivity and pH of the irrigation water and utilize a plant growth bioassay to screen each lot of soil prior to use.
The intensive nature of high-throughput phenotyping often requires the efforts of many people in the data collection process. A thorough training program can help to reduce variation resulting from differences in the way different people collect the same types of data. Training program effectiveness can be monitored through the comparison of duplicate data sets collected by different individuals. The variation within the difference of the duplicate data sets can be taken as a representation of the variability
resulting from the measurement system.
High-throughput environments are prone to data entry errors. These can be minimized through the incorporation of high and low data limits at the level of the data collection interface. Additional measures can be added to ensure that values entered for a trait are consistent. For example, in recording the number of rosette leaves over time, an entry with a value lower than that of a previous entry would trigger an error message.
3.5 Data Analysis
Phenotypic data resulting from growth stage-based data collection are a mixture of both quantitative and qualitative measurements. The following generic data analysis method encompasses both types of data.
Be aware that, when collecting and analyzing data from a phenotypic platform, the sample size for a particular trait will vary depending on the subpopulation of plants that are sampled. There is a necessary balance between analyzing plants at high throughput and collecting data with sufficient replication to detect subtle phenotypic differences with high confidence. The requirements on both sides of this equation will vary with the application and must be evaluated prior to initiating a phenotyping platform.
A subpopulation of control plants is included within each flat of plants grown for phenotypic analysis (see Note 10).
Quantitative data are averaged within the control and mutant plant populations. The mutant and control means are compared to each other using a t-test (see Note 11).
Qualitative data are represented as a frequency of the noncontrol responses. For example, a mutant mean of 0.75 for the “seedling color” component would mean that 75% of the seedlings exhibited a color that differed from the control. The frequencies of the control and mutant populations are compared using a t-test to determine whether they differ significantly (see Note 11).
Notes
1. While pots can be filled by hand, we have found that the use of a potting machine improves process efficiency and
delivers more consistent soil compaction.
2. A liquid handling robot has the advantage of increasing sowing efficiency and also enables the location of controls and/or seed lines to be easily randomized within the flat.
3. During the collection of developmental data over time, it is most efficient to schedule observations with reference to data collected during the previous cycle of analysis. If the data collection is being driven by a computerized system, this can be accomplished through the definition of a series of rules that trigger data collection events. For example, the transition to flowering is first observed as the production of floral buds. This event usually occurs several days prior to the opening of the first flower. If one is interested in capturing the timing of both events, the occurrence of the floral buds can serve as the trigger to begin assessing flower opening. To continue the example, opening of the first flower is correlated with the completion of vegetative development. Therefore, opening of the first flower makes an ideal trigger to cease all observations related to vegetative development (e.g., number of leaves, rosette size, etc.).
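The rule-driven scheduling described in the preceding note can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the data collection system used in our laboratory: the names Plant, RULES, and record_event, the observation labels, and the choice to stop scoring floral buds once they have been seen are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Plant:
    """One tracked plant and the observations currently scheduled for it."""
    plant_id: str
    # Hypothetical starting schedule for a plant in vegetative development.
    active: set = field(
        default_factory=lambda: {"leaf_count", "rosette_size", "floral_buds"}
    )


# Hypothetical rule table: observed event -> (observations to start, observations to stop).
RULES = {
    # Floral buds appear: begin assessing flower opening (assumption: bud scoring stops).
    "floral_buds_visible": ({"first_flower_open"}, {"floral_buds"}),
    # First flower opens: cease observations related to vegetative development.
    "first_flower_open": (set(), {"leaf_count", "rosette_size", "first_flower_open"}),
}


def record_event(plant: Plant, event: str) -> set:
    """Apply the rule for an observed event and return the updated schedule."""
    start, stop = RULES.get(event, (set(), set()))
    plant.active = (plant.active | start) - stop
    return plant.active


if __name__ == "__main__":
    plant = Plant("demo-1")
    print(sorted(record_event(plant, "floral_buds_visible")))
    print(sorted(record_event(plant, "first_flower_open")))
```

Keeping the rules in a single table, separate from the function that applies them, means that further triggers (for example, rules that begin flowering-stage measurements) can be added without changing the scheduling logic.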
4. Planning a growth stage-based experiment is not as straightforward as planning for a calendar-based experiment; achieving a specific growth stage happens within a time frame, not necessarily on a specific day. While some of those time frames can be fairly tight (in the Col-0 ecotype of Arabidopsis, days to stage 1.02 has a standard deviation of 1.3 d [6]), others can be fairly broad (days to stage 6.50 in Col-0 has a standard deviation of 4.9 d [6]).
5. Certain measurements also require additional considerations. For example, the distance across an open flower can only be measured in the morning, because under our conditions, the flowers close in the afternoon.
6. Data collection can be adapted to a high-throughput environment through the use of balances and calipers that are connected directly to a computer for automated data entry. Even higher throughput weighing of seeds or other samples can be achieved using an automated balance workstation (e.g., Mettler-Toledo Bohdan Automation; http://www.bohdan.com/index.htm).
7. The success of digital image analysis relies on the ability to take high quality images with consistent magnification and lighting. Therefore, it is most efficient to develop one or more workstations that are dedicated to image capture. We use Nikon D-series cameras. They are available with a choice of resolution and are built on a standard 35-mm camera body, enabling an extensive array of macro and micro lens configurations.
8. The complete APT can be found on the Paradigm Genetics Web site (www.paradigmgenetics.com).
9. Relational databases can range in complexity from simple systems built in Microsoft Access or the open source database MySQL, to advanced systems built on a platform such as ORACLE. Additionally, there is a selection of electronic laboratory notebooks (ELNs) available that would also suffice. We use an ORACLE-based electronic laboratory information management system (LIMS) that tracks the location and status of all samples through unique 10-digit identifiers. This system also
provides an advanced data collection interface, which incorporates a rule set to evaluate data immediately upon entry, to automatically determine when a growth stage has been achieved. This event then triggers the collection of additional data that are specific to that growth stage.
10. Data obtained from the control plants are critical not only as a reference for assessing phenotypic variation, but also as a means to control and understand the consistency of growth conditions within and between locations in the growth rooms. A control data set can be developed from a large group of control plants that represent a random sample of sow dates and/or growth locations. This reference control can then be used to perform t-tests with the control data from any individual flat in a manner analogous to that used to compare mutants to the control. The reference control may be used to help refine the analysis of mutant data as well. For example, if the control data within an experimental flat differ significantly from the reference control population, then the reliability of the data from the mutant plants in that flat should also be called into question.
11. The t-test value may be interpreted as the number of standard errors between the mutant mean and the control mean. The value of the t-test statistic may be positive or negative, representing mutant variation that is greater or less than the control mean, respectively. For sample sizes greater than three, one can be at least 95% confident that t-test values greater than standard errors from the control are due to biological variation and not simply the result of chance.
References
1. Arabidopsis 2010 Project (http://www.nsf.gov/pubs/2001/nsf01162/nsf01162.html).
2. Hartenstein, V. (1993) Atlas of Drosophila Morphology and Development. CSH Laboratory Press, Cold Spring Harbor, NY.
3. Wilkins, A. (1993) Genetic Analysis of Animal Development. Wiley-Liss, New York.
4. Zadoks, J., Chang, T., and Konzak, C. (1974) A decimal code for the growth stages of
cereals. Weed Res. 14, 415–421.
5. Lancashire, P., Bleiholder, H., vd Boom, T P L., Stauss, R., Weber, E., and Witzenberger, A. (1991) A uniform decimal code for growth stages of crops and weeds. Ann. Appl. Biol. 119, 561–601.
6. Boyes, D. C., Zayed, A. M., Ascenzi, R., et al. (2001) Growth stage-based phenotypic analysis of Arabidopsis: a model for high throughput functional genomics in plants. Plant Cell 13, 1499–1510.
From: Methods in Molecular Biology, vol 236: Plant Functional Genomics: Methods and Protocols Edited by: E Grotewold © Humana Press, Inc., Totowa, NJ