Term 
         | 
        
        
        Definition 
        
        | Polyacrylamide, 6-20%. Bromphenol blue and xylene cyanol are used as loading dyes. Short fragments resolve in 1-2 hours, and long fragments (>150 bp) in 7-8 hours. |  
          | 
        
        
         | 
        
        
        Term 
        
        | Chemical Sequencing (Maxam-Gilbert) |  
          | 
        
        
        Definition 
        
        | Template DNA is radioactively labeled at one end and aliquoted into 4 separate tubes containing different chemicals, with or without high salt. Piperidine, a strong reducing agent, is added to break the DNA at specific nucleotides. The piperidine is evaporated and the fragments are resuspended in formamide to be loaded onto a denaturing gel (to prevent ssDNA from binding/folding). Gels are transferred to filter paper and dried before exposure to light-sensitive film. Hydrazine and piperidine are toxic and hard to prepare. |  
          | 
        
        
         | 
        
        
        Term 
        
        | Maxam-Gilbert Gel Interpretation |  
          | 
        
        
        Definition 
        
        | Single-base resolution is required in the gel, so only small fragments can be analyzed this way. The fastest migrating bands are the smallest fragments, truncated closest to the radioactive label. Each of the four reaction lanes indicates which nucleotide is present at the 3' end of the fragment. A band in the purine (A+G) lane is called G if it also appears in the G lane and A if it does not; likewise, a band in the pyrimidine (C+T) lane is called C if it also appears in the C lane and T if it does not. The sequence is read from the bottom of the gel (5') to the top (3'). |  
          | 
        
        
         | 
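The lane-comparison rule on the Maxam-Gilbert card can be sketched in a few lines of Python. This is only an illustration: the function name `call_maxam_gilbert`, the lane labels, and the example band positions are hypothetical, and band positions are simplified to integer fragment lengths.

```python
def call_maxam_gilbert(lanes):
    """Call bases from the four Maxam-Gilbert lanes.

    `lanes` maps a lane label ("G", "A+G", "C+T", "C") to the set of
    fragment lengths (band positions) seen in that lane. These labels
    and the data layout are assumptions made for this sketch.
    """
    # Shortest fragments migrate fastest, so sorting ascending reads
    # the gel from bottom to top.
    all_lengths = sorted(set().union(*lanes.values()))
    sequence = []
    for length in all_lengths:
        if length in lanes["A+G"]:
            # Purine band: G if it is also in the G lane, otherwise A.
            base = "G" if length in lanes["G"] else "A"
        elif length in lanes["C+T"]:
            # Pyrimidine band: C if it is also in the C lane, otherwise T.
            base = "C" if length in lanes["C"] else "T"
        else:
            base = "N"  # band pattern does not fit the rule
        sequence.append(base)
    return "".join(sequence)


# Hypothetical gel with five bands.
gel = {
    "G": {1, 4},
    "A+G": {1, 2, 4},
    "C+T": {3, 5},
    "C": {5},
}
print(call_maxam_gilbert(gel))  # GATGC
```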
        
        
        Term 
         | 
        
        
        Definition 
        
        | ddNTP. Lacks a hydroxyl (OH) group on the 3' carbon of the deoxyribose sugar, preventing the formation of a phosphodiester bond with the next nucleotide. This terminates DNA elongation, with the ddNTP at the 3' end. |  
          | 
        
        
         | 
        
        
        Term 
        
        | PCR Amplicon Requirements for Sequencing |  
          | 
        
        
        Definition 
        
        | PCR amplicons used for sequencing must have no carryover from the PCR reaction, specifically primers and dNTPs, which interfere with sequencing. PCR products can be cleaned using solid-phase matrices, alcohol precipitation, or AP digestion. Amplicons can also be run on a gel, excised, and eluted, which also confirms that the correct amplicon is being sequenced. |  
          | 
        
        
         | 
        
        
        Term 
        
        | Dideoxy Chain Termination Sequencing (Sanger) |  
          | 
        
        
        Definition 
        
        | Uses short primers that end just 5' to the region of interest. The primers may be labeled with a fluorescent dye or 32P, or radioactive nucleotides may be included in the reaction mix. A 1:1 mix of primer:target is added to four tubes. Enzymes, buffer, dNTPs, and one ddNTP are added to each tube. DNA polymerase is allowed to act for 20 minutes, and the reaction is stopped by addition of 20 mM EDTA. Formamide is added to denature the products, and loading dye is added before the products are loaded into 4 lanes of a denaturing polyacrylamide gel. The gel is dried and exposed to x-ray film to produce a sequencing ladder. |  
          | 
        
        
         | 
        
        
        Term 
        
        | Reading Sanger Sequencing Gels |  
          | 
        
        
        Definition 
        
        | The pattern formed in the 4 gel lanes (one for each ddNTP) is a sequencing ladder. The shortest fragment at the bottom represents the nucleotide closest to the primer. Sequence is read 5'-3', bottom to top. |  
          | 
        
        
         | 
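Reading a ladder bottom to top amounts to sorting the bands by fragment length and noting which ddNTP lane each band sits in. The sketch below assumes the bands have already been converted to fragment lengths; `read_sanger_ladder` and the example lengths are hypothetical.

```python
def read_sanger_ladder(lanes):
    """Read a four-lane Sanger ladder from bottom to top.

    `lanes` maps each ddNTP base to the fragment lengths observed in
    that lane (an assumed representation for this sketch).
    """
    bands = [
        (length, base)
        for base, lengths in lanes.items()
        for length in lengths
    ]
    # Shortest fragment = bottom of the gel = base closest to the primer.
    bands.sort()
    return "".join(base for _, base in bands)


# Hypothetical ladder: lengths include a 20-nt primer.
ladder = {"A": [21, 25], "C": [22], "G": [23, 26], "T": [24]}
print(read_sanger_ladder(ladder))  # ACGTAG
```

The string returned is the newly synthesized strand in the 5'-3' direction, so it is complementary to the template.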
        
        
        Term 
        
        | Optimizing Sanger Sequencing |  
          | 
        
        
        Definition 
        
        | Typically the fragment length limit is 300-400 bp. This can be extended to over 500 bp by loading the same ladder repeatedly at 2-6 hour intervals, so that the first loading runs for about 8 hours to resolve long bands while a set loaded 6 hours later runs for 1-2 hours to resolve short bands. Recombinant enzymes that are more processive (stay on the template longer) because they lack nuclease activity can be used. Altered nucleotides can be used to denature secondary structure. |  
          | 
        
        
         | 
        
        
        Term 
         | 
        
        
        Definition 
        
        | The ratio of ddNTPs to dNTPs must be carefully balanced for Sanger sequencing to work.
  If the ddNTP concentration is too high, there will be too many products with early termination. If the ddNTP concentration is too low, little or no termination occurs. |  
          | 
        
        
         | 
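Why the ratio matters can be illustrated with a simple probability model. The sketch below assumes, purely for illustration, that the polymerase incorporates a ddNTP and the matching dNTP with equal efficiency, so the chance of termination at each matching template position is ddNTP/(ddNTP + dNTP). Real enzymes discriminate between the two, and the function name `termination_fraction` is hypothetical.

```python
def termination_fraction(dd_to_d_ratio, position):
    """Fraction of chains that terminate exactly at the given position.

    `position` counts only template positions that match the ddNTP in
    the tube (1 = first matching base). Assumes equal incorporation
    efficiency for ddNTP and dNTP, which is a simplification.
    """
    p_stop = dd_to_d_ratio / (1.0 + dd_to_d_ratio)
    # Survive (position - 1) matching sites, then terminate.
    return (1.0 - p_stop) ** (position - 1) * p_stop


# Compare a high, moderate, and low ddNTP:dNTP ratio at the 100th site.
for ratio in (1.0, 0.01, 0.0001):
    print(ratio, termination_fraction(ratio, 100))
```

Under this model, a high ratio leaves almost no chains long enough to reach the 100th site, while a very low ratio yields very little product at any single site.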
        
        
        Term 
        
        | Automated Sequencing Fluorescent Dyes |  
          | 
        
        
        Definition 
        
        | Derivatives of fluorescein, rhodamine, and Bodipy (4,4-difluoro-4-bora-3a,4a-diaza-s-indacene) are used. The reader uses a laser to excite the dyes and detects the fluorescence at predetermined wavelengths. The 4 labeled nucleotides have distinct colors and therefore distinct peaks, so all reactions can be done in the same tube instead of four different tubes. |  
          | 
        
        
         | 
        
        
        Term 
        
        | Dye Primer Automated Sequencing |  
          | 
        
        
        Definition 
        
        | Four different preparations of the same primer are made, each with a different fluorescent dye attached to the 5' end. These are put in four different tubes, one for each ddNTP. Buffer, template DNA, and heat-stable polymerase are added to each tube and the reactions are cycled. The products of all four reactions are combined to be resolved in one gel lane or capillary. Each truncated strand is labeled at the 5' end with a dye unique to that ddNTP. |  
          | 
        
        
         | 
        
        
        Term 
        
        | Dye Terminator Automated Sequencing |  
          | 
        
        
        Definition 
        
        | The ddNTPs are fluorescently labeled; the primers are not. All 4 labeled ddNTPs are added to one reaction with template, buffer, primer, polymerase, and dNTPs. The cycling happens in that tube, producing fragments labeled at the 3' end. |  
          | 
        
        
         | 
        
        
        Term 
        
        | Sequencing of GC-Rich Templates |  
          | 
        
        
        Definition 
        
        | Intrastrand hybridization is more common in sequences high in GC content, making sequencing difficult. If 7-deaza-dGTP (2'-deoxy-7-deazaguanosine triphosphate) or deoxyinosine triphosphate is used in the cycling reaction instead of dGTP, band resolution will improve. GC band compressions (bunches of bands close together followed by some more separated bands) are avoided. |  
          | 
        
        
         | 
        
        
        Term 
        
        | Preparation of Sequencing Ladder |  
          | 
        
        
        Definition 
        
        | After cycling, PCR products must be cleaned of excess dye terminators by bead, column, or ethanol-precipitation purification. Alternatively, magnetic beads that bind to dye terminators can be used and the DNA can be taken from the supernatant. DNA is precipitated and resuspended in formamide. Secondary structure formation is avoided by using denaturing conditions (50-60 °C, formamide, urea denaturing gel). Ladders are heated to 95-98 °C for 2-5 minutes and placed on ice before loading. |  
          | 
        
        
         | 
        
        
        Term 
        
        | Automated Sequencing Gel Electrophoresis |  
          | 
        
        
        Definition 
        
        | The prepared sequencing ladder is loaded onto one gel lane or capillary. This eliminates lane-to-lane variation and increases throughput. When migrating fragments reach the detector, a laser excites the fluorophore and the detector records the emitted color, producing an electropherogram with peaks for each nucleotide in the sequence. Sequence quality is better near the primer and less reliable near the end. Products of at least 400-500 bases can be read using this method. |  
          | 
        
        
         | 
        
        
        Term 
        
        | Quality of Sequencing Results |  
          | 
        
        
        Definition 
        
        | The calling of bases in the final sequence depends on the electropherogram quality, which is determined by template quality, sequencing reaction efficiency, and sequencing ladder cleanliness. If the sequencing ladder isn't cleaned properly before loading, dye blobs (bright flashes of fluorescence) will obscure part of the sequence. If the DNA template is of poor quality, the sequence will be of poor quality and hard to read. |  
          | 
        
        
         | 
        
        
        Term 
        
        | Sequencing Interpretation |  
          | 
        
        
        Definition 
        
        | Software converts the electropherogram peaks into a text sequence and indicates the certainty of each base call. The sequence may be compared to a reference sequence to identify mutations. An experienced reader may be able to interpret a suboptimal sequence. Both complementary strands must be sequenced for confirmation. This makes mutations and polymorphisms easier to catch, especially if they involve only one base. A heterozygous mutation should show as two overlapping peaks of half height and different colors. Deletions or insertions cause frameshifts and are easier to spot. Somatic mutations may be hard to detect amid normal sequences. |  
          | 
        
        
         | 
        
        
        Term 
         | 
        
        
        Definition 
        
        | A sequencing-by-synthesis method; no sequencing ladder is required. The reaction contains ssDNA template, sequencing primer, sulfurylase and luciferase, and the substrates adenosine 5' phosphosulfate and luciferin. The 4 dNTPs are added one at a time in a set order. If the dNTP is complementary to the next template base beyond the 3' end of the primer, it is incorporated and pyrophosphate (PPi) is produced as a byproduct. Sulfurylase converts PPi to ATP, which drives the luciferase-catalyzed conversion of luciferin to oxyluciferin and generates a luminescent signal. This is repeated sequentially with the 4 dNTPs to produce a pyrogram with peaks associated with the sequence. A run of two identical nucleotides shows up as a peak of double height. Good for short sequence analysis, mutation detection, and typing. |  
          | 
        
        
         | 
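The dispensation logic of pyrosequencing can be sketched as a small simulation. It assumes an idealized reaction in which light output is exactly proportional to the number of nucleotides incorporated; the names `pyrogram` and `peaks_to_sequence`, the dispensation order, and the example template are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def pyrogram(template, dispensation_order="ACGT", cycles=1):
    """Simulate idealized pyrogram peaks.

    `template` is the template strand written in the order the
    polymerase reads it (starting just past the primer). Each peak is
    (dispensed nucleotide, number of bases incorporated).
    """
    peaks = []
    pos = 0
    for _ in range(cycles):
        for nt in dispensation_order:
            height = 0
            # Incorporate as long as the dispensed dNTP pairs with the
            # next template base; homopolymer runs give taller peaks.
            while pos < len(template) and COMPLEMENT[template[pos]] == nt:
                height += 1
                pos += 1
            peaks.append((nt, height))
    return peaks


def peaks_to_sequence(peaks):
    """Rebuild the synthesized strand from peak heights."""
    return "".join(nt * height for nt, height in peaks)


peaks = pyrogram("TGGCA")
print(peaks)                     # [('A', 1), ('C', 2), ('G', 1), ('T', 1)]
print(peaks_to_sequence(peaks))  # ACCGT
```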
        
        
        Term 
        
        | Nonsequencing DNA Methylation Detection |  
          | 
        
        
        Definition 
        
        | Methylation-sensitive restriction enzymes can be used, or ones that recognize sites destroyed when unmethylated Cs are changed to Us. PCR primers that will only bind to changed or unchanged sequences can also be used, so presence or absence of a product indicates methylation. These methods can't be used on unexplored sequences. |  
          | 
        
        
         | 
        
        
        Term 
         | 
        
        
        Definition 
        
        | Chain-termination sequencing technique used to detect methylated cytosines. |  
          | 
        
        
         | 
        
        
        Term 
        
        | Bisulfite DNA Sequencing Prep |  
          | 
        
        
        Definition 
        
        | 2-4 µg of DNA is cut with restriction enzymes at sites outside the region of interest. Desired fragments are purified from the gel and denatured at 97 °C for 5 minutes (fixed DNA can skip denaturation). Denatured strands are exposed to bisulfite solution (sodium bisulfite, NaOH, hydroquinone) for 16-20 hours. Buffer systems can be used to prevent DNA damage. Unmethylated cytosines are converted to uracils via deamination, but 5-methylcytosines remain the same. DNA is cleaved and amplified. Primers may be changed to accommodate C to U changes. |  
          | 
        
        
         | 
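The effect of bisulfite treatment on a sequence can be mimicked in silico. The sketch below assumes complete conversion and writes the resulting uracils as T, since that is how they read after amplification; `bisulfite_convert` and the example sequence are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions=()):
    """Return the sequence as read after bisulfite treatment and PCR.

    Unmethylated C becomes U (read as T); C at any index listed in
    `methylated_positions` (0-based) is protected and stays C.
    Assumes 100% conversion efficiency, which is an idealization.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


# Hypothetical fragment: the C at index 1 is methylated, the C at 4 is not.
print(bisulfite_convert("ACGTCG", methylated_positions=[1]))  # ACGTTG
```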
        
        
        Term 
        
        | Bisulfite DNA Sequencing Results |  
          | 
        
        
        Definition 
        
        | Treated sequence results are compared to normal sequence results run the same way. Unmethylated sequences will show up with no peaks for C and peaks of twice the height for T (U), whereas methylated Cs will show up as peaks that can be compared to the T peaks to calculate % methylation. |  
          | 
        
        
         | 
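One common way to express the C-versus-T comparison at a single site is C / (C + T). The sketch below assumes that the two peak heights are directly comparable, which in practice depends on the instrument and chemistry; `percent_methylation` and the example heights are hypothetical.

```python
def percent_methylation(c_peak, t_peak):
    """Estimate % methylation at one cytosine position.

    `c_peak` is the height of the remaining C peak (methylated copies)
    and `t_peak` the height of the T peak (converted, unmethylated
    copies) at the same position after bisulfite treatment.
    """
    total = c_peak + t_peak
    if total <= 0:
        raise ValueError("no signal at this position")
    return 100.0 * c_peak / total


# Hypothetical peak heights in arbitrary fluorescence units.
print(percent_methylation(30, 70))  # 30.0
```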
        
        
        Term 
         | 
        
        
        Definition 
        
        | Post-transcriptional editing of RNA means that the RNA sequence doesn't always match the DNA sequence. RNA can be used to synthesize cDNA for sequencing, but errors are likely. RNA is directly sequenced by immobilizing mRNA on polyT strands (non-mRNA strands are treated with polyA polymerase). The 3' RNA ends are blocked to prevent extension. Virtual terminator nucleotides with reversible dye labels are added and a picture is taken. C, T, G, and A are added sequentially, with imaging, cleaving, and rinsing between each. After 120 cycles, the images are aligned and used to build the sequence. |  
          | 
        
        
         | 
        
        
        Term 
         | 
        
        
        Definition 
        
        | Goal is to sequence the human genome at $1,000 per person, allowing genome sequencing to be incorporated into research and clinical analysis. Designed to sequence hundreds of templates simultaneously within hours. Massive computing power is needed to resolve all the images and data. Also used to investigate populations like microbes for diversity. |  
          | 
        
        
         | 
        
        
        Term 
         | 
        
        
        Definition 
        
        | Extent to which two sequences are the same. |  
          | 
        
        
         | 
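Identity is usually reported as a percentage over an alignment. The sketch below takes two already-aligned strings of equal length and divides the number of matching columns by the alignment length; other denominators (for example, the shorter sequence length) are also in use, so this choice is an assumption, as is the name `percent_identity`.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns where both sequences share a base.

    Both inputs must be pre-aligned and the same length; "-" marks a
    gap and never counts as a match.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    matches = sum(
        x == y and x != "-" for x, y in zip(aligned_a, aligned_b)
    )
    return 100.0 * matches / len(aligned_a)


print(round(percent_identity("GATTACA", "GACTACA"), 1))  # 85.7
```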
        
        
        Term 
         | 
        
        
        Definition 
        
        | Lining up of sequences to search for maximal regions of identity to determine relatedness or homology.
  Local alignment - alignment of the same portion in both sequences.
  Optimal alignment - alignment of sequences with the best identity.
  Multiple sequence alignment - three or more sequences aligned so that common residues are together (may require gaps). |  
          | 
        
        
         | 
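An optimal pairwise alignment with gaps can be found by dynamic programming. The sketch below uses the Needleman-Wunsch global alignment approach with an arbitrary scoring scheme (match +1, mismatch -1, gap -1); the scores, the function name `global_align`, and the example sequences are assumptions chosen for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, best = global_align("GATTACA", "GATACA")
print(aligned_a)
print(aligned_b)
print(best)  # 5
```

Several alignments can share the best score (here the gap can sit opposite either T), so only the score is fixed by the scoring scheme.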
        
        
        Term 
         | 
        
        
        Definition 
        
        | A space put into alignment to adjust for insertions or deletions in one of the sequences being compared. |  
          | 
        
        
         | 
        
        
        Term 
         | 
        
        
        Definition 
        
        | Relatedness of sequences (% identity or composition) |  
          | 
        
        
         | 
        
        
        Term 
        
        | Homology, Orthology, Paralogy |  
          | 
        
        
        Definition 
        
        | Homology - similarity attributed to a common ancestor.
  Orthology - homology in different species due to a common ancestral gene.
  Paralogy - homology within a species due to gene duplication. |  
          | 
        
        
         | 
        
        
        Term 
         | 
        
        
        Definition 
        
        | Description of functional structures, e.g., exons/introns in DNA or secondary structure/functional region in proteins. |  
          | 
        
        
         | 
        
        
        Term 
        
        | Composition of the Genome |  
          | 
        
        
        Definition 
        
        | 54% AT, 38% GC, 8% TBD.
  2.91 billion bp.
  30-40% is repeat sequences. |  
          | 
        
        
         | 
        
        
        Term 
         | 
        
        
        Definition 
        
        | Chromosome 19 is the most gene-rich, with 23 genes per Mbp. Chromosomes 13 and Y are the least gene-rich, with 5 genes per Mbp. |  
          | 
        
        
         | 
        
        
        Term 
         | 
        
        
        Definition 
        
        | 20,000-30,000 genes in the human genome. Average length of a gene is 27 kbp. Only 2% of DNA is genes. |  
          | 
        
        
         | 
        
        
        Term 
         | 
        
        
        Definition 
        
        | Chromosome 2 is the most GC-rich, 66%. Chromosome X is the least, 25%. |  
          | 
        
        
         |