Differential sequencing by mass spectroscopy

Related Terms

Alleles, biomarker, chromosomes, differential sequencing, DNA, electrospray ionization, epidemiology, ESI, FAB, fast atom bombardment, G2D, genetic sequencing, genes to diseases, HGP, Human Genome Project, ionization source, mass analyzer, mass spectrometry, m/z, MALDI-TOF, mass spectroscopy, mass-to-charge ratio, matrix-assisted laser desorption/ionization time-of-flight, methylation detection, mutation, nucleotide, oligonucleotides, pathogen identification, PCR, phenotype, polymerase chain reaction, proteomics, reference sequence, resequencing, single nucleotide polymorphisms, SNPs, TSP50 gene.

Background

General: Genes, which are made of deoxyribonucleic acid (DNA), are considered the building blocks of life because they provide instructions for cells in the body. Genes are located inside cells and control an organism's development and functions by instructing cells to make new molecules, usually proteins. Proteins are organic compounds composed of amino acids; the sequences of the amino acids in proteins are defined by genes. Proteins are required for the growth and maintenance of the body.
DNA is a long, thread-like (double-helix) molecule made up of large numbers of nucleotides. The sequence of bases (nitrogenous) in DNA serves as the carrier of genetic (hereditary) information. Nucleotides are building blocks of DNA and are made of nitrogen bases, sugars, and phosphate. Nitrogen bases are of two types: purines, such as adenine (A) and guanine (G), and pyrimidines, such as cytosine (C) and thymine (T). Long strands of nucleotides form nucleic acids.
Alleles are two or more alternative forms of a gene that may occur alternatively at a given site on a chromosome. Chromosomes carry hereditary information in the form of genes. Humans normally have 22 pairs of chromosomes (autosomes) and a pair of sex chromosomes (X and Y chromosomes).
A permanent variation in the DNA sequence of a gene is called a mutation. Genetic variations that occur in more than 1% of the general population are called polymorphisms. Some mutations and polymorphisms may influence the risk of developing certain disorders or diseases.
Differential sequencing: Differential or genetic sequencing is the method of investigating a DNA sample for gene variations with respect to a reference sequence. The Human Genome Project (HGP) provides a reference sequence (about three billion bases) of human DNA against which the sample DNA sequence is compared, thereby facilitating the identification of any variations. The HGP, an international research program that began in 1990, was designed to map out all of the genes that make up human beings (called the human genome). The project was completed in 2003. Researchers have identified the order and location of these genes in humans and other species. They have also developed linkage maps, which track the inheritance of genetic diseases over generations.
Uses: A sequence variation exists at defined positions within genomes and may be responsible for individual phenotypic characteristics (observable characteristics), including a person's tendency to develop complex disorders such as heart disease and cancer. The study of single-nucleotide polymorphisms (SNPs) is one of the approaches to use the genome-sequencing data to detect inter- and intra-species genetic variations. SNPs are DNA sequence variations that occur when a single nucleotide in the genome sequence is altered. For example, a SNP may change the DNA sequence 'AAGGCTAA' to 'ATGGCTAA.' Here, the adenine (A) base has been substituted with thymine (T).
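The SNP example above can be reproduced with a short comparison of a sample sequence against a reference sequence. This is only a minimal sketch: the function name find_substitutions is hypothetical, and it assumes both sequences are already aligned and of equal length, which real sequencing data would not guarantee.

```python
def find_substitutions(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        # Assumption of this sketch: sequences are pre-aligned and equal in length.
        raise ValueError("this sketch only compares sequences of equal length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != sample_base
    ]


# The example from the text: the second base changes from A to T.
print(find_substitutions("AAGGCTAA", "ATGGCTAA"))  # [(2, 'A', 'T')]
```

The result lists one substitution at position 2, where adenine (A) in the reference is replaced by thymine (T) in the sample.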
The analysis of SNPs in the human genome may have an impact on the identification of disease-prone genes and drug targets and may thus facilitate the development of new drugs and patient care strategies. For humans, it is projected that there are about 10 million SNPs, which characterize most of the individually inherited differences between humans by controlling the individual phenotypes.
Differential sequencing of infectious agents for fast and reliable identification is an important aspect of molecular diagnostics and epidemiology, including disease outbreak tracking and the classification of disease-causing organisms (pathogens). Molecular diagnostics may determine the relationship between genes and proteins in a cell and may thus help to discover whether this relationship changes, as it does in disease. Epidemiology is the study of the causes, distribution, and control of disease within populations.
Mass spectroscopy/spectrometry: A mutation or polymorphism is commonly associated with a change in the mass of the analyte (DNA sequence) in comparison with sequences that have no mutation. Molecular mass is the mass of a molecule relative to the mass of a standard atom, now carbon-12 (12C), whose mass is taken as exactly 12. Hence, precise mass determination of a DNA segment using mass spectroscopy/spectrometry helps to identify the mutation in the DNA sequence. This facilitates the determination of disease susceptibility genes and may lead to the development of new drugs and patient care strategies.
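The link between a base change and a mass change can be illustrated with a rough calculation. The sketch below assumes commonly tabulated approximate average masses for the four nucleotide residues inside a DNA strand; these values, and the names RESIDUE_MASS, strand_mass, and mass_shift, are illustrative assumptions and do not come from this article. Corrections for the ends of the strand are ignored because they cancel when two strands of equal length are compared.

```python
# Approximate average masses (in daltons) of nucleotide residues within a DNA strand.
# Assumed, commonly tabulated values; not taken from the article.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def strand_mass(sequence: str) -> float:
    """Sum the residue masses of a single strand (end-group corrections omitted)."""
    return sum(RESIDUE_MASS[base] for base in sequence)


def mass_shift(reference: str, sample: str) -> float:
    """Mass of the sample strand minus mass of the reference strand."""
    return strand_mass(sample) - strand_mass(reference)


# One A-to-T substitution makes the strand about 9 Da lighter.
print(f"{mass_shift('AAGGCTAA', 'ATGGCTAA'):+.2f} Da")  # -9.01 Da
```

Under these assumed values, replacing one adenine with one thymine lowers the mass by about 9 Da, which is the kind of difference a sufficiently precise mass measurement would need to resolve.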
Mass spectroscopy/spectrometry (MS) is an analytical technique used for measuring the molecular mass of a sample; it helps to identify the chemical composition of a compound present in the sample. The sample used for MS may include body fluids, such as blood, serum, saliva, or urine, or tissue samples, such as blood cells. MS is based on converting the sample into charged particles (ions), which are then separated based on their charge and mass. The separated particles are later identified using a detector.
Mass spectrometry has widespread applications and is used in the identification of proteins, peptides, and oligonucleotides (short segments of DNA/RNA). MS helps to discover previously unknown multifunctional proteins and to identify new functions of already known proteins. MS may help in understanding the drug metabolism, i.e., the modification or degradation of the drug, which in turn may help to discover new drugs. In clinical settings, MS may assist in testing the effectiveness of a drug in an individual.

Methods

General: Nucleotides are building blocks of DNA and are made of nitrogen bases, sugars, and phosphate. Nitrogen bases are of two types: purines, such as adenine (A) and guanine (G), and pyrimidines, such as cytosine (C) and thymine (T). Long strands of nucleotides form nucleic acids.
Differential or genetic sequencing is the method of investigating a DNA sample for gene variations with respect to a reference sequence that is provided by the Human Genome Project (HGP). The HGP is an international research program that was designed to map out all of the genes that make up human beings. Differential sequencing may be done by using mass spectrometry (MS), an analytical technique used for measuring the molecular mass of a sample using a mass spectrometer. Molecular mass is the mass of a molecule relative to the mass of a standard atom, now carbon-12 (12C), whose mass is taken as exactly 12. A mass spectrometer creates charged particles (ions) from molecules and analyzes those ions to provide information about the molecular weight/mass of the DNA sequence and its chemical structure. The sample from which the DNA is extracted for MS may include body fluids, such as blood, serum, or saliva, or tissue samples, such as blood cells.
Sample preparation: Sample preparation is performed to isolate the DNA sequences by removing potential contaminants such as proteins in the sample. The solution containing DNA sequences and contaminants is separated using certain chemicals and filtered, leaving the DNA completely free of other biological molecules such as proteins. Next, several copies of the extracted DNA are produced (amplification) using polymerase chain reaction (PCR) to help in the easy detection of any mutation in the DNA sequence. A permanent variation in a DNA sequence of a gene is called a mutation.
Polymerase chain reaction (PCR): PCR is an efficient and sensitive enzymatic laboratory technique used to amplify (by replication/multiplication) a specific sequence of DNA into billions of copies in the presence of sequence-specific oligonucleotide primers and DNA polymerase. An oligonucleotide primer is a sequence of nucleotides, usually of 20-50 bases, which is complementary to a specific DNA sequence and serves as a starting point for DNA replication. DNA polymerase is an enzyme that synthesizes new DNA strands using preexisting DNA strands as templates, thereby assisting in DNA replication. A radionucleotide is also incorporated into the PCR product during the amplification process (radiolabeling of PCR). This assists in visualizing the PCR products (DNA) at a later stage.
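The scale of PCR amplification can be estimated with a simple idealized model in which every cycle multiplies the number of copies by a fixed factor. The sketch below is an assumption-laden illustration, not a description of any specific protocol: the function name pcr_copies and the efficiency parameter are hypothetical, and real reactions fall short of perfect doubling in later cycles.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of PCR cycles.

    efficiency = 1.0 means every template is copied in every cycle (perfect doubling);
    lower values model incomplete copying. This is a simplifying assumption.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# One starting molecule after 30 cycles of perfect doubling: about one billion copies.
print(f"{pcr_copies(1, 30):.3g}")  # 1.07e+09
```

In this idealized model, about 30 cycles are enough to turn a single template into roughly a billion copies, which is consistent with the amplification scale described above.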
The mass spectrometer is divided into three distinct regions or steps, namely, the ionization source, the analyzer, and the detector.
Ionization source: The first step involves the introduction of the sample (DNA) into the ionization source of the mass spectrometer. The method of sample introduction depends on the ionization method employed and the type and complexity of the sample used. The ionization method in turn depends on the type of sample and the mass spectrometer used. The ionization source ionizes the sample molecules because ions are easier to manipulate than neutral molecules, thereby facilitating the identification of the target altered DNA sequence (analyte) in the sample. Some of the common ionization methods include matrix-assisted laser desorption/ionization (MALDI), electrospray ionization (ESI), and fast atom bombardment (FAB). The process of ionization differs in each method. In MALDI, a laser is used to ionize the sample. In FAB, a high-energy beam of neutral atoms is used. In ESI, the sample solution, in which the analyte is dissolved in a large amount of solvent (such as methanol or acetonitrile), flows through a source chamber to form droplets. The droplets carry charge, and when the solvent in the solution vaporizes, highly charged analyte molecules are formed.
Mass analyzer: The ions then move into the analyzer region of the mass spectrometer, where they are separated according to their mass-to-charge ratios. The mass-to-charge ratio (m/z) is calculated by dividing the mass of an ion by its charge. A number of mass analyzers are available, including time-of-flight (TOF) analyzers and quadrupoles. TOF analyzers accelerate ions in an electric field and separate them by the time they take to reach the detector, which depends on their mass-to-charge ratio. A quadrupole mass analyzer consists of four parallel cylindrical rods that filter sample ions based on their mass-to-charge ratio. The compatibility of analyzers varies with the ionization method. For example, MALDI is generally used with a TOF analyzer because of the large mass range of TOF analyzers.
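How a time-of-flight analyzer separates ions can be sketched with the textbook model of an ideal linear instrument: an ion of mass m and charge z gains kinetic energy zeV from an accelerating voltage V and then drifts a fixed distance, so heavier ions at the same charge arrive later. The accelerating voltage and drift length below are arbitrary illustrative values, and the function names are hypothetical; real instruments differ in many details.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per dalton


def mass_to_charge(mass_da: float, charge: int) -> float:
    """Mass-to-charge ratio: the mass of the ion divided by its number of charges."""
    return mass_da / charge


def flight_time_us(mass_da: float, charge: int,
                   accel_voltage: float = 20_000.0,  # volts (assumed value)
                   drift_length: float = 1.0) -> float:  # meters (assumed value)
    """Flight time in microseconds for an ideal linear TOF analyzer.

    Energy balance: z * e * V = (1/2) * m * v**2, then t = L / v.
    """
    mass_kg = mass_da * DALTON
    velocity = math.sqrt(2.0 * charge * ELEMENTARY_CHARGE * accel_voltage / mass_kg)
    return drift_length / velocity * 1e6


# Two singly charged ions that differ by about 9 Da arrive at slightly different times.
for mass in (2495.63, 2504.64):
    print(mass_to_charge(mass, 1), round(flight_time_us(mass, 1), 3))
```

In this model the flight time grows with the square root of m/z, so an ion with four times the mass-to-charge ratio takes twice as long to reach the detector.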
Detector: The detector in the mass spectrometer monitors the ion current and amplifies it. There are many types of detectors, and most of them produce an electronic signal when struck by an ion. The signal is transmitted to the data system and is stored in the form of mass spectra. A mass spectrum is commonly presented as a vertical bar graph, wherein each bar represents an ion having a specific mass-to-charge ratio (m/z) and the abundance of the ion is indicated by the length of the bar. The m/z value of an ion is equivalent to its mass when the ion carries a single charge, as most of the ions formed in the mass spectrometer do. Hence, this method facilitates the accurate identification of the target molecule (mutation in DNA sequence). Taking advantage of the accurate mass information provided by mass spectrometry, the sequence is compared to the reference standard provided by the HGP and analyzed. This facilitates the determination of the sequence and helps to identify the altered genes.

Research

General: Differential or genetic sequencing is the method of investigating a DNA sample for gene variations with respect to a reference sequence that is provided by the Human Genome Project (HGP). The HGP is an international research program that was designed to map out all of the genes that make up human beings. Differential sequencing may be done by using mass spectrometry (MS), an analytical technique used for measuring the molecular mass of a sample using a mass spectrometer. Several studies are being conducted using differential sequencing to identify alterations in genes (mutations) that lead to the development of various diseases.
Asthma: Studies have been conducted to identify the environmental and genetic predisposing factors for the development of asthma. Asthma is a chronic (long-term) inflammatory lung disease in which the air passages within the lungs are constantly swollen, restricting the amount of air allowed to pass through the trachea (windpipe). Scientists have used a computational tool called 'genes to diseases' (G2D, used for prioritizing genes that may be involved in the development of inherited diseases), along with differential sequencing, to identify the regions of genetic alteration that lead to the development of asthma. Some of the mutations in the genes LPA, NOX3, SNX9, VIL2, VIP, ADAM8, DOCK1, FANK1, GPR123, and PTPRE increase a person's risk of developing asthma.
Mycotic keratitis: Mycotic keratitis is an infection of the transparent covering of the front of the eye, called the cornea, caused by a fungus, which may lead to corneal blindness. Scientists examined the proteins present in tears taken from persons affected with mycotic keratitis using differential sequencing with mass spectrometry. The initial results showed that tear proteins may be used to analyze the immune responses in the infected individuals, which may help in refining or improving the existing treatment.
Cystic fibrosis: Cystic fibrosis (CF), also called mucoviscidosis, is an inherited life-threatening disorder that causes severe lung damage and nutritional deficiencies. CF causes the body to produce abnormally thick and sticky mucus, saliva, sweat, and digestive enzymes. In healthy individuals, these secretions serve as lubricants in the body. In CF patients, the secretions are so thick that they plug up tubes and passageways in the body. The lungs and pancreas are the most commonly affected organs in CF patients. Scientists are conducting studies using differential sequencing with mass spectrometry to identify the changes in genes (mutations) that may lead to the development of CF.

Implications

General: Differential or genetic sequencing is the method of investigating a DNA sample for variations with respect to a reference sequence that is provided by the Human Genome Project (HGP). The HGP is an international research program that was designed to map out all of the genes that make up human beings. Differential sequencing may be carried out with the help of mass spectrometry (MS), an analytical technique used for measuring the molecular mass of a sample that helps to identify the chemical composition of a compound present in the sample.
Disease susceptibility: A sequence variation exists at defined positions within genomes and is responsible for individual phenotypic characteristics (observable characteristics), including a person's tendency towards the development of complex disorders such as heart disease and cancer. Differential sequencing using mass spectrometry tracks the inheritance of disease genes within families and assists in the evaluation of an individual's risk of developing a disease.
Drug development: The analysis of mutations in the human genome helps in the identification of genes that are associated with an increased risk (susceptibility) of developing diseases, as well as drug targets. This in turn may facilitate the development of new drugs and patient care strategies to prevent or treat the diseases.
Epidemiology: Epidemiology is the study of the causes, distribution, and control of disease within populations. Differential sequencing of infectious agents helps in the fast and reliable identification of the disease-causing organisms (pathogens) for epidemiological purposes. For example, differential sequencing helps in tracking disease outbreaks, which involve the occurrence of disease in excess of what would normally be expected in a defined community, geographical area, or season. This may help the early initiation of treatment and may facilitate better treatment outcomes as well as the prevention of new cases.
Proteomics: Proteomics is the study of the functions and structure of proteins. The proteome is the entire collection of proteins produced by an organism, including the changes made to a particular set of proteins. The proteome may vary with time and with the distinct requirements of a cell or organism, or the stresses that it undergoes. Differential sequencing with mass spectrometry has had good results in the determination of the function/structure of proteins and has assisted in the discovery of previously unknown multifunctional proteins and the identification of new functions of already known proteins. All of this may also help in understanding the different protein expressions related to diseases, aiding in the development of drugs that target the protein pathway.

Limitations

One major drawback of mass spectrometry (MS) is that it may not distinguish molecules that have identical composition but differ in their DNA sequences. Since mass spectrometry analysis involves the determination of DNA sequences based on differences in mass, MS cannot differentiate molecules that have the same mass.
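This limitation follows directly from how mass is calculated: the mass of a strand depends only on how many of each base it contains, not on their order. The sketch below assumes commonly tabulated approximate average residue masses, which are illustrative values not taken from this article, and the name strand_mass is hypothetical.

```python
from collections import Counter

# Approximate average masses (in daltons) of nucleotide residues within a DNA strand.
# Assumed, commonly tabulated values; not taken from the article.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def strand_mass(sequence: str) -> float:
    """Mass computed from base composition only (end-group corrections omitted)."""
    counts = Counter(sequence)
    return sum(RESIDUE_MASS[base] * counts[base] for base in sorted(counts))


first = "AAGGCTAA"
second = "AGAGCTAA"  # same bases as `first`, with two neighboring bases swapped

print(first != second)                            # True: the sequences differ
print(strand_mass(first) == strand_mass(second))  # True: the masses are identical
```

Because both strands contain four A, two G, one C, and one T, a mass measurement alone cannot tell them apart, whereas a substitution that changes the composition does change the mass.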
One other major limitation associated with differential sequencing using MS is the difficulty in mass determination of polydisperse polymers, i.e., large molecules that are greater than 25 kDa (Da is the dalton, a unit approximately equal to the mass of one proton or one neutron). A polymer is referred to as polydisperse if the length of its chains varies, so that its molecules span a wide range of molecular masses. This limitation has been overcome by breaking the long strands of DNA into small segments during the polymerase chain reaction (PCR), thereby helping identification.
Moreover, in differential sequencing, protein identity may not be determined for those proteins that are not in the database because the genome from which the protein is derived may not have been identified or sequenced.

Future research

Biomarkers: Genetic biomarkers provide an estimate of how genetic variations, also called mutations or polymorphisms, make individuals susceptible to environmental agents. They may help predict disease susceptibility, outcome, treatment response, and toxicity. Researchers have found that methylation may regulate the expression of the TSP50 gene in different types of tissues. Methylation is the addition of a methyl group (a combination of one carbon atom and three hydrogen atoms) at a particular spot on a DNA strand during the development of an organism, which may result in the loss of gene function. Hence, detection and quantification (measuring the amount) of methylated genes is used as a diagnostic as well as a predictive biomarker for different types of cancer.
The combined knowledge of the past and present research on the characterization of genes using differential sequencing with mass spectrometry may help in the development of new diagnostic methods, therapeutics (treatment options), prognostics (predicting the outcome), and prevention methods for diseases caused by mutations in genes. The knowledge might also assist in the development or adoption of personalized medicine, suited for affected individuals or members within the same family.

Author information

This information has been edited and peer-reviewed by contributors to the Natural Standard Research Collaboration (www.naturalstandard.com).

Bibliography

Ananthi S, Chitra T, Bini R, et al. Comparative analysis of the tear protein profile in mycotic keratitis patients. Mol Vis. 2008 Mar 12;14:500-7.
Bantscheff M, Schirle M, Sweetman G, et al. Quantitative mass spectrometry in proteomics: a critical review. Anal Bioanal Chem. 2007 Oct;389(4):1017-31.
Graber JH, Smith CL, Cantor CR. Differential sequencing with mass spectrometry. Genet Anal. 1999 Feb;14(5-6):215-9.
Huang Y, Wang Y, Wang M, et al. Differential methylation of TSP50 and mTSP50 genes in different types of human tissues and mouse spermatic cells. Biochem Biophys Res Commun. 2008 Oct 3;374(4):658-61.
Kirpekar F, Nordhoff E, Larsen LK, et al. DNA sequence analysis by MALDI mass spectrometry. Nucleic Acids Res. 1998 Jun 1;26(11):2554-9.
National Cancer Institute.
Natural Standard: The Authority on Integrative Medicine.
Perez-Iratxeta C, Bork P, Andrade-Navarro MA. Update of the G2D tool for prioritization of gene candidates to inherited diseases. Nucleic Acids Res. 2007 Jul;35(Web Server issue):W212-6.
Tremblay K, Lemire M, Potvin C, et al. Genes to diseases (G2D) computational method to identify asthma candidate genes. PLoS ONE. 2008 Aug 6;3(8):e2907.
World Health Organization.