Microarray Image and Data Analysis
Microarray technology allows the expression levels of many genes to be measured simultaneously on a common substrate. Typical applications include quantifying the expression profile of a system under different experimental conditions, or comparing the expression profiles of two systems under one or more conditions. Several types of microarrays have been developed to address different biological processes. cDNA microarrays monitor gene expression levels in order to study the effects of treatments, diseases, and developmental stages on gene expression. Microarray gene expression profiling can therefore be used to identify disease genes by comparing gene expression in diseased and normal cells. Because of the abundance of experimental data, techniques for the automated processing and analysis of microarray images and microarray data are required.
A cDNA microarray experiment typically starts by taking two biological tissues and extracting their mRNA. The mRNA samples are reverse transcribed into complementary DNA (cDNA) and labelled with fluorescent dyes, resulting in fluorescence-tagged cDNA. The most common dyes for tagging cDNA are the red fluorescent dye Cy5 (emission from 630-660 nm) and the green fluorescent dye Cy3 (emission from 510-550 nm). Next, the tagged cDNA copy, called the sample probe, is hybridized on a slide containing a grid or array of single-stranded cDNAs called probes. Probes are usually known genes of interest that have been printed on a glass microscope slide by a robotic arrayer. According to the principles of hybridization, a sample probe will hybridize only with its complementary probe. The probe-sample hybridization on a microarray typically takes several hours. All unhybridized sample probes are then washed off and the microarray is scanned twice, at the different wavelengths corresponding to the dyes used in the assay. The digital image scanner records the intensity level at each grid location, producing two greyscale images. The intensity level is correlated with the absolute amount of RNA in the original sample and thus with the expression level of the gene associated with this RNA.
The processing of microarray images provides the input for further analysis of the extracted microarray data. It includes the following stages:
- Spot addressing and gridding, which are the processes of assigning a location to each spot and fitting a grid to the image.
- Segmentation, which is the process of grouping pixels with similar features (this step separates foreground pixels from background pixels).
- Intensity extraction, which calculates the red and green foreground fluorescence intensity pairs and the background intensities.
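The three stages above can be sketched on a toy image. This is a minimal illustration only, not the method used in this work: it assumes a regular, axis-aligned grid of known size, uses a simple midpoint threshold for segmentation, and summarises each spot by the median of its pixels. All function names (`grid_cells`, `segment_cell`, `extract_intensities`, `synthetic_channel`) and all pixel values are hypothetical.

```python
from statistics import median

def grid_cells(height, width, n_rows, n_cols):
    """Gridding: split the image into n_rows x n_cols equal cells (assumes a regular grid)."""
    cell_h, cell_w = height // n_rows, width // n_cols
    return [
        (r * cell_h, (r + 1) * cell_h, c * cell_w, (c + 1) * cell_w)
        for r in range(n_rows) for c in range(n_cols)
    ]

def segment_cell(image, cell):
    """Segmentation: pixels brighter than the midpoint of the cell's min and max are foreground."""
    top, bottom, left, right = cell
    pixels = [image[y][x] for y in range(top, bottom) for x in range(left, right)]
    threshold = (min(pixels) + max(pixels)) / 2
    foreground = [p for p in pixels if p > threshold]
    background = [p for p in pixels if p <= threshold]
    return foreground, background

def extract_intensities(image, n_rows, n_cols):
    """Intensity extraction: (foreground median, background median) per spot, in row-major order."""
    results = []
    for cell in grid_cells(len(image), len(image[0]), n_rows, n_cols):
        fg, bg = segment_cell(image, cell)
        # A perfectly uniform cell has no foreground; report the background level for it.
        fg_median = median(fg) if fg else median(bg)
        results.append((fg_median, median(bg)))
    return results

def synthetic_channel(spot_values, cell=4, background=10):
    """Build a toy greyscale image with one 2x2-pixel spot centred in each cell."""
    n_rows, n_cols = len(spot_values), len(spot_values[0])
    image = [[background] * (n_cols * cell) for _ in range(n_rows * cell)]
    for r in range(n_rows):
        for c in range(n_cols):
            for dy in (1, 2):
                for dx in (1, 2):
                    image[r * cell + dy][c * cell + dx] = spot_values[r][c]
    return image

# Two scans of the same 2x2 array, one per dye (made-up intensities).
red = synthetic_channel([[200, 50], [120, 90]])
green = synthetic_channel([[100, 50], [60, 180]])
red_spots = extract_intensities(red, 2, 2)
green_spots = extract_intensities(green, 2, 2)
for (r_fg, r_bg), (g_fg, g_bg) in zip(red_spots, green_spots):
    print("red fg/bg:", r_fg, r_bg, "green fg/bg:", g_fg, g_bg)
```

Real microarray images have irregular spot positions, noise, and artefacts, so each stage would need a far more robust method than the fixed grid and midpoint threshold assumed here.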
Statistical analysis
The analysis of DNA microarrays poses a large number of statistical problems, including the normalization of the data. Several normalization methods have been published, some of which are platform specific. As in many other cases where authorities disagree, a sound, conservative approach is to compare the different normalization methods directly and determine how each affects the results obtained. This can be done, for example, by investigating the performance of the various methods on the same data.
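A direct comparison of this kind can be sketched as follows. The two methods shown, global median normalization and total intensity normalization, are chosen here only as familiar examples and are not named in the text above. The intensities, the twofold cutoff, and the function names are all hypothetical.

```python
from math import log2
from statistics import median

def log_ratios(red, green):
    """Per-gene log2 ratio of red to green intensity."""
    return [log2(r / g) for r, g in zip(red, green)]

def median_normalize(red, green):
    """Global median normalization: shift the log ratios so that their median is zero."""
    ratios = log_ratios(red, green)
    centre = median(ratios)
    return [value - centre for value in ratios]

def total_intensity_normalize(red, green):
    """Total intensity normalization: rescale green so both channels have the same sum."""
    scale = sum(red) / sum(green)
    return log_ratios(red, [g * scale for g in green])

def called_genes(normalized, cutoff=1.0):
    """Indices whose absolute log2 ratio exceeds the cutoff (a twofold change by default)."""
    return [i for i, value in enumerate(normalized) if abs(value) > cutoff]

# Made-up background-corrected intensities for six genes.
red = [400.0, 800.0, 1600.0, 200.0, 12800.0, 800.0]
green = [200.0, 400.0, 800.0, 100.0, 400.0, 100.0]

by_median = called_genes(median_normalize(red, green))
by_total = called_genes(total_intensity_normalize(red, green))
print("median normalization calls:", by_median)
print("total intensity normalization calls:", by_total)
```

On this toy data the two methods call different sets of genes, because one very bright spot dominates the channel totals. That kind of disagreement is what a side-by-side comparison is meant to expose.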
Experimenters must also account for multiple comparisons. Even if the P-value assigned to a gene indicates that its differential expression is extremely unlikely to be due to random rather than treatment effects, the very large number of genes on an array makes it likely that the results for some genes are false positives or false negatives. Statistical methods tailored to microarray analysis have become available that assess statistical power from the variation present in the data and the number of experimental replicates, and can help minimize type I and type II errors in the analyses.
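The effect of correcting for multiple comparisons can be illustrated with two standard procedures, the Bonferroni correction and the Benjamini-Hochberg false discovery rate procedure. Neither is named in the text above, so they are shown here only as an assumption about what such a correction might look like. The P-values are made up.

```python
def bonferroni(p_values, alpha=0.05):
    """Indices of tests that remain significant after dividing alpha by the number of tests."""
    m = len(p_values)
    return [i for i, p in enumerate(p_values) if p <= alpha / m]

def benjamini_hochberg(p_values, fdr=0.05):
    """Indices of tests rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, index in enumerate(order, start=1):
        # Keep the largest rank k whose ordered P-value satisfies p_(k) <= (k / m) * fdr.
        if p_values[index] <= rank / m * fdr:
            largest_rank = rank
    return sorted(order[:largest_rank])

# Made-up P-values for ten genes.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

uncorrected = [i for i, p in enumerate(p_values) if p <= 0.05]
print("uncorrected:", uncorrected)
print("Bonferroni:", bonferroni(p_values))
print("Benjamini-Hochberg:", benjamini_hochberg(p_values))
```

With only ten tests the difference is already visible: five genes pass the uncorrected 0.05 threshold, but fewer survive either correction. On an array with thousands of genes the gap is usually much larger.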
A basic difference between microarray data analysis and much traditional biomedical research is the dimensionality of the data. A large clinical study might collect 100 data items per patient for thousands of patients. A medium-size microarray study will obtain many thousands of numbers per sample for perhaps a hundred samples. Many analysis techniques treat each sample as a single point in a space with thousands of dimensions, then attempt by various techniques to reduce the dimensionality of the data to something humans can visualize.
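One widely used technique for this kind of reduction is principal component analysis (PCA). It is not named in the text above and is used here purely as an illustration. The sketch below finds the first principal component by power iteration, treating each sample as a point with one coordinate per gene. The data are synthetic and the function names are hypothetical.

```python
import random
from math import sqrt

def centre_columns(samples):
    """Subtract each gene's mean across samples."""
    n = len(samples)
    means = [sum(column) / n for column in zip(*samples)]
    return [[x - m for x, m in zip(row, means)] for row in samples]

def project(rows, direction):
    """Score of each sample along the given direction."""
    return [sum(a * b for a, b in zip(row, direction)) for row in rows]

def first_principal_component(samples, iterations=200, seed=0):
    """Power iteration on X^T X, applied as X^T (X v) so the gene-by-gene matrix is never built."""
    x = centre_columns(samples)
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in x[0]]
    for _ in range(iterations):
        scores = project(x, v)
        v = [sum(s * row[j] for s, row in zip(scores, x)) for j in range(len(v))]
        norm = sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
    return v, project(x, v)

# Synthetic study: 6 samples x 50 genes, two groups of three samples each.
rng = random.Random(1)

def make_sample(level):
    # The first 10 genes depend on the group; the other 40 are noise around a shared baseline.
    return [(level if g < 10 else 5.0) + rng.gauss(0, 0.3) for g in range(50)]

samples = [make_sample(8.0) for _ in range(3)] + [make_sample(2.0) for _ in range(3)]
component, scores = first_principal_component(samples)
print([round(s, 2) for s in scores])
```

Each 50-dimensional sample is reduced to a single score. In this toy data the two groups fall on opposite sides of zero, although the sign of a principal component is arbitrary. Real studies typically keep two or three components for plotting and rely on a numerical library rather than hand-written power iteration.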
People: N. Giannakeas, L. Maglaras