With the Human Genome Project now almost complete, the obvious question is how to make sense of the information. Does having the genome mean that we can discover new drugs and cure all diseases using genomic information? The answer is: not really. The genome is a stepping-stone. It provides only a snapshot of all the possible ways a cell might use its genes (1). The real answer lies in understanding how the cell reacts to its environment. The dynamics of the cell during its exposure to a disease-causing agent (which genes are turned on, the extent and degree of post-translational modifications, and their interactions with other genes) determine the susceptibility or resistance of the cell. A whole new field of study, proteomics, has emerged that promises to address these issues.
Proteomics is the study of total protein complements (proteomes), e.g. those of a given tissue or cell type. It relies on powerful analytical tools such as two-dimensional gel electrophoresis (2-DE) and ultrasensitive mass spectrometry (MS), coupled with high-throughput functional screening assays.
Proteomics can now be divided into classical and functional proteomics. Classical proteomics focuses on studying complete proteomes, e.g. from two differentially treated cell lines, whereas functional proteomics studies more limited protein sets. Classical proteome analyses are usually carried out by 2-DE for protein separation, followed by protein identification by MS and database searches. The functional proteomics approach uses a subset of proteins isolated from the starting material, e.g. with an affinity-based method. This protein subset can then be separated by standard SDS-PAGE or by 2-DE. Proteome analysis is complementary to DNA-microarray technology: with the proteomics approach it is possible to study not only changes in protein expression levels but also protein-protein interactions and post-translational modifications (2).
Two-dimensional gel electrophoresis
Proteins are separated in 2-DE according to their isoelectric point (pI) and molecular weight. The first step in 2-DE analysis is sample preparation: the proteins in the cells or tissues to be studied have to be solubilized, and DNA and other contaminants must be removed. The proteins are then separated by charge using isoelectric focusing. This step is usually carried out with commercially available immobilized pH-gradient (IPG) strips. The second dimension is a standard SDS-PAGE in which the focused IPG strip is used as the sample, thus separating the proteins by their molecular weight. After 2-DE separation, proteins can be visualized with common stains such as Coomassie or silver.
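To illustrate the first dimension, the pI of a protein can be roughly estimated from its sequence by finding the pH at which its net charge is zero. The sketch below is only an illustration, not part of any 2-DE software: the pKa values are approximate assumptions (published sets differ), and the function names `net_charge` and `isoelectric_point` are hypothetical.

```python
# Approximate pKa values (assumed for illustration; published sets differ).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA = 9.0
C_TERM_PKA = 3.1


def net_charge(sequence, ph):
    """Estimate the net charge of a protein at a given pH (Henderson-Hasselbalch)."""
    # Free N-terminus contributes positive charge, free C-terminus negative.
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    for residue in sequence:
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def isoelectric_point(sequence, tolerance=0.001):
    """Find the pH where the net charge crosses zero, by bisection on pH 0-14."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        # Net charge falls monotonically as pH rises.
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0
```

With these assumed values, a lysine-rich sequence gets a basic pI and an aspartate-rich one an acidic pI, which is the property isoelectric focusing exploits.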
Protein identification by mass spectrometry
A mass spectrometer consists of an ion source, a mass analyzer, an ion detector, and a data acquisition unit. First, molecules are ionized in the ion source. They are then separated according to their mass-to-charge ratio in the mass analyzer, and the separated ions are detected. Mass spectrometry has become a widely used method in protein analysis since the invention of the matrix-assisted laser desorption/ionization (MALDI) and electrospray ionization (ESI) methods. There are several options for the mass analyzer, the most common combinations being a time-of-flight (TOF) analyzer connected to MALDI, and a triple quadrupole, quadrupole-TOF, or ion trap analyzer coupled to ESI.
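The mass-to-charge ratio measured by the analyzer can be made concrete with a small calculation. The sketch below assumes positive-mode ionization by protonation, i.e. ions of the form [M + zH]z+; the function names are hypothetical and chosen only for illustration.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons


def mz(neutral_mass, charge):
    """m/z of an ion formed by adding `charge` protons to a neutral molecule."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def neutral_mass_from_mz(observed_mz, charge):
    """Recover the neutral mass from an observed m/z and a known charge state."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (observed_mz - PROTON_MASS)
```

For example, `mz(1000.0, 2)` gives the m/z at which a doubly protonated 1000 Da peptide would appear, and `neutral_mass_from_mz` inverts the calculation.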
In proteome analysis, electrophoretically separated proteins can be identified by mass spectrometry using two different approaches. The simpler one is a technique called peptide mass fingerprinting (PMF). In this approach the protein spot of interest is digested in the gel with a specific enzyme, the resulting peptides are extracted from the gel, and their molecular weights are measured. Database search programs create theoretical PMFs for all the proteins in the database and compare them with the measured one. In the second approach the peptides obtained by in-gel digestion are fragmented in the mass spectrometer, yielding partial amino acid sequences of the peptides (sequence tags). Database searches are then performed using both molecular weight and sequence information. PMF is usually carried out with MALDI-TOF, and sequence tags are obtained by nano-ESI tandem mass spectrometry (MS/MS). The sensitivity of protein identification by MS is in the femtomole range.
N-terminal amino acid sequencing is usually done by Edman degradation. In this method the N-terminal amino group is coupled with phenyl isothiocyanate, enabling the selective cleavage of the first peptide bond. The released first residue is then chemically converted into a stable PTH amino acid and analyzed by RP-HPLC. Each PTH amino acid has a characteristic retention time in RP chromatography, so the residue can be identified by comparison with a standard. This procedure is repeated for the desired number of cycles or until the signal disappears into the background. The maximum number of cycles (30-50 under optimum conditions) is limited by the amount and nature of the protein as well as by instrument performance.
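The peptide mass fingerprinting search described above can be sketched in a few lines. This is a simplified illustration, not the algorithm of any particular search program: it assumes trypsin as the specific enzyme (cleaving after K or R, but not before P), uses monoisotopic residue masses, ignores missed cleavages and modifications, and scores by a simple count of matching masses. All function names and the toy database are hypothetical.

```python
# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of a neutral peptide: residues plus one water."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER_MASS


def count_matches(observed, theoretical, tolerance=0.2):
    """Count observed masses lying within `tolerance` Da of any theoretical mass."""
    return sum(
        any(abs(o - t) <= tolerance for t in theoretical) for o in observed
    )


def best_match(observed, database, tolerance=0.2):
    """Score every database entry and return (best entry name, all scores)."""
    scores = {}
    for name, sequence in database.items():
        theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
        scores[name] = count_matches(observed, theoretical, tolerance)
    return max(scores, key=scores.get), scores


# Toy database with made-up sequences, for illustration only.
toy_database = {"protA": "AGKSVRTL", "protB": "GGKPPRWW"}
measured = [peptide_mass(p) for p in tryptic_digest(toy_database["protA"])]
best_name, all_scores = best_match(measured, toy_database)
```

Real search programs use more elaborate, probability-based scoring and allow for missed cleavages and modifications; the counting score here is only meant to show the principle of comparing measured and theoretical peptide masses.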
Proteomic analysis relies heavily on informatics tools at every level. The critical resource is an online database of protein expression profiles that is integrated with sequence and other databases. For a listing of databases, visit http://www.expasy.ch/. Software packages that facilitate the analysis of multiple expression profiles by scanning databases are also needed (4), and statistical tools for analysing the results are in great demand.
Proteomics companies are looking for professionals with backgrounds in bioinformatics, IT, molecular biology, and protein separation chemistry. There is also a demand for software engineers who can build instrument control and data acquisition software as well as data analysis tools, including database-searching tools. Finally, biostatisticians are very much needed to sort out all the data being generated (5).
1. Nature Biotechnology, 2000, Vol 18, Supplement 2000, IT45
4. Current Opinion in Biotechnology, 2000, 11: 176-179