Scaling a chip means multiplying the signals (intensity measures) for all genes by a common scale factor. The reason to do this is that the total brightness differs significantly between the two channels. If the same total weight of RNA is hybridized in both channels, the differences between channels must be due to different uptake of label (dye bias) by the hybridized RNA. In fact, microarray technology can only measure relative levels of expression: signal per mg of RNA. For a two-color chip, we have two measures for each gene, one from each channel. For each chip we compute scale factors *C*_{red} and *C*_{green} by:

*C*_{red} = (∏_{i} *R*_{i})^{−1/*N*} and *C*_{green} = (∏_{i} *G*_{i})^{−1/*N*},

where *G*_{i} and *R*_{i} are the green and red intensities of gene *i*, and *N* is the number of genes on the chip.

This is equivalent to subtracting their average from the logarithms of all the expression ratios, which results in a mean log_{2}(ratio) equal to zero; equivalently, the (geometric) mean ratio is equal to 1.
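
As a small illustrative sketch (not from the original text; the function name and inputs are hypothetical), the centring step can be written directly in the log domain:

```python
import math

def normalize_log_ratios(red, green):
    """Centre the log2(R/G) values so their mean is zero
    (equivalently: scale so the geometric mean ratio is 1)."""
    log_ratios = [math.log2(r / g) for r, g in zip(red, green)]
    mean_lr = sum(log_ratios) / len(log_ratios)
    return [lr - mean_lr for lr in log_ratios]
```

Subtracting the mean log_{2}(ratio) is the log-domain form of dividing every ratio by the geometric mean ratio.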

In order to make individual channels more comparable across chips, the same constant is used for all chips. In practice there are often anomalies at the top end; for example, a number of probes may be saturated. One gets more consistent results by using a robust estimator, such as the median or the 1/3-trimmed mean: take the mean of the middle 2/3 of probes, and scale all probes so that those trimmed means agree between channels. (John Quackenbush suggested this originally, but TIGR now uses lowess – see below.)
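
A rough sketch of this robust variant (the function names and the exact trim boundaries are illustrative assumptions, not taken from a published implementation):

```python
def trimmed_mean(values, trim_fraction=1/3):
    """Mean of the central values after trimming trim_fraction
    overall (half from each end of the sorted list)."""
    s = sorted(values)
    k = round(len(s) * trim_fraction / 2)  # number dropped at each end
    core = s[k:len(s) - k] if k > 0 else s
    return sum(core) / len(core)

def channel_scale_factor(red, green, trim_fraction=1/3):
    """Factor to multiply the green channel by so the two channels'
    trimmed means agree."""
    return trimmed_mean(red, trim_fraction) / trimmed_mean(green, trim_fraction)
```

Because the top 1/6 of probes is discarded, saturated spots no longer distort the scale factor.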

Whereas normalization adjusts the mean of the log_{2}(ratio) measurements, it is also common to find that the variance of the measured log_{2}(ratio) values differs between arrays. One approach to dealing with this problem is to adjust the log_{2}(ratio) measures so that the variance is the same. This often works, in reducing variance, but sometimes works too well, in that the variance of individual measures is actually increased. Probably a partial adjustment is optimal, but it seems unprincipled.
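
A minimal sketch of such a variance adjustment, assuming a simple per-array rescaling; the damping parameter for the "partial adjustment" idea is an illustrative choice, not from the text:

```python
import statistics

def rescale_variance(log_ratios, damping=1.0):
    """Divide centred log2(ratio) values by sd**damping:
    damping=1 forces unit variance, damping=0 leaves values
    unchanged, and intermediate values give a partial adjustment."""
    sd = statistics.pstdev(log_ratios)
    factor = sd ** damping
    return [x / factor for x in log_ratios]
```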

Another two-parameter approach is a linear regression of one channel on the other. This doesn’t seem to do as well.

**Intensity-Dependent Normalization with Lowess**

With a little experience it becomes clear to a researcher that these approaches do not compensate for all the systematic differences between chips that obscure and bias analysis of real biological differences. Several statisticians have tried to identify variables that systematically bias expression ratios. For example, one commonly observes that the log_{2}(ratio) values have a systematic dependence on intensity – most commonly a deviation from zero for low-intensity spots. Under-expressed genes appear up-regulated in the red channel; moderately expressed genes appear up-regulated in the green channel. No known biological process would regulate genes that way – this must be an artefact. It appears that the explanation is chemical: dyes don't fluoresce equally at different intensity levels, because of different levels of 'quenching' – a phenomenon where dye molecules in close proximity re-absorb light from each other, thus diminishing the signal. Quenching acts at different levels for each dye.

The easiest way to visualize
intensity-dependent effects is to plot the measured log_{2}(R_{i}/G_{i})
for each element on the array as a function of the log_{2}(R_{i}*G_{i})
product intensities. This 'R-I' (for ratio-intensity) plot can reveal
intensity-specific artifacts in the log_{2}(ratio) measurements. Note
that Terry Speed’s group calls these variables ‘M’ and ‘A’, and the plot is an
‘MA plot’.
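
The plotted coordinates can be computed directly; a trivial illustrative helper, using the R-I convention above:

```python
import math

def ri_values(red, green):
    """Per-spot R-I plot coordinates:
    x = log2(R*G) (intensity), y = log2(R/G) (ratio)."""
    return [(math.log2(r * g), math.log2(r / g))
            for r, g in zip(red, green)]
```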

Figure 1. Ratio-Intensity plot showing characteristic ‘banana’
shape of cDNA ratios; log scale on both axes. (courtesy Terry Speed)

We would like a normalization method that can remove such intensity-dependent effects in the log_{2}(ratio) values. The functional form of this dependence is unknown, and must depend on many variables we don't measure. An ad-hoc statistical approach widely used in such situations is to fit some smooth curve through the points. One example of such a smooth curve is a __l__ocally __w__eighted linear regression (lowess) curve. Terry Speed's group at Berkeley used this approach.

To calculate a lowess curve fit to a group of points (x_{1},y_{1}), …, (x_{N},y_{N}), we calculate at each point x_{i} the locally weighted regression of y on x, using a weight function that down-weights data points that are more than 30% of the range away from x_{i}. We can think of the calculated value as a kind of local mean. For each observation i on a two-color chip, set x_{i} = log_{2}(R_{i}*G_{i}) and y_{i} = log_{2}(R_{i}/G_{i}). The lowess approach first estimates y(x_{k}), the mean value of the log_{2}(ratio) as a function of the log_{2}(intensity). Lowess normalization corrects systematic deviations in the R-I plot by carrying out a locally weighted linear regression as a function of the log_{2}(intensity) and subtracting the calculated best-fit average log_{2}(ratio) from the experimentally observed log_{2}(ratio) for each data point.

The normalized ratios r*_{i} are given by

log_{2}(r*_{i}) = log_{2}(R_{i}/G_{i}) − y(x_{i}),

where y(x_{i}) is the lowess-fitted mean log_{2}(ratio) at x_{i} = log_{2}(R_{i}*G_{i}).

The result is that the log_{2}(ratio) values at all intensities have a mean of 0, as seen below.

Figure 2. As in Figure 1, but
corrected by lowess normalization.
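
The procedure described above can be sketched in plain Python. This is an illustrative toy implementation only (in practice one would use a library lowess, e.g. the one in statsmodels); the tricube weight function and the 30%-of-range window follow the description above, while everything else is an assumption:

```python
import math

def lowess_fit(xs, ys, span=0.3):
    """Locally weighted linear regression of y on x, evaluated at each x.
    Points farther than span * range(x) from the target get zero weight."""
    width = span * (max(xs) - min(xs)) or 1.0  # avoid a zero-width window
    fitted = []
    for x0 in xs:
        # tricube weights inside the window, zero outside
        ws = [(1 - (abs(x - x0) / width) ** 3) ** 3 if abs(x - x0) < width else 0.0
              for x in xs]
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        my = sum(w * y for w, y in zip(ws, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted

def lowess_normalize(red, green, span=0.3):
    """log2(ratio) values with the intensity-dependent trend removed."""
    xs = [math.log2(r * g) for r, g in zip(red, green)]
    ys = [math.log2(r / g) for r, g in zip(red, green)]
    return [y - f for y, f in zip(ys, lowess_fit(xs, ys, span))]
```

For data with a purely intensity-dependent dye bias, the normalized values scatter around zero at every intensity, which is exactly the straightening of the 'banana' shape shown in the figures.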


**Global versus local normalization**

Most normalization algorithms, including
lowess, can be applied either globally (to the entire data set) or locally (to
some physical subset of the data). For spotted arrays, local normalization is
often applied to each group of array elements deposited by a single spotting
pen (sometimes referred to as a 'pen group' or 'subgrid'). Local normalization
has the advantage that it can help correct for systematic spatial variation in
the array, including inconsistencies among the spotting pens used to make the
array, variability in the slide surface, and slight local differences in
hybridisation conditions across the array. There is some controversy among
biotechnologists about how likely it is that a single print tip will cause a
systematic variation.

Another approach is to look for a
smooth correction to uneven hybridisation. The thinking behind this approach is
that most spatial variation is caused by uneven fluid flow. Flow is continuous,
and hence the correction should be continuous as well.

When a particular normalization
algorithm is applied locally, all the conditions and assumptions that underlie
the validity of the approach must be satisfied. For example, the elements in
any pen group should not be preferentially selected to represent differentially
expressed genes, and a sufficiently large number of elements should be included
in each pen group or spatial area for the approach to be valid.
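
Applying a normalization locally amounts to running the chosen correction within each pen group separately. A minimal sketch with simple mean-centring per group (the group labels are assumed inputs; a real pipeline would apply lowess per group instead):

```python
from collections import defaultdict

def normalize_per_pen_group(log_ratios, pen_groups):
    """Subtract each pen group's mean log2(ratio) from its own members."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lr, g in zip(log_ratios, pen_groups):
        sums[g] += lr
        counts[g] += 1
    means = {g: sums[g] / counts[g] for g in sums}
    return [lr - means[g] for lr, g in zip(log_ratios, pen_groups)]
```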

A good design will place all contrasts of interest directly on chips, but sometimes that is impossible, or just not done. In that case we may want to compare parallel measures, i.e. measures that are not directly contrasted on an array. We observe that variance is very high between parallel measures. We need a kind of normalisation that works across arrays as well as within arrays. It turns out that quantile normalization works quite well at reducing variance between arrays, while not losing any of the properties of lowess normalization.
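
Quantile normalization itself is easy to sketch: sort each array, average the sorted values across arrays rank by rank, then map each value back to the mean for its rank, so every array ends up with the same distribution. A toy version that ignores ties (an assumption made for simplicity):

```python
def quantile_normalize(arrays):
    """arrays: list of equal-length value lists (one per chip).
    Returns copies whose value distributions are identical."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # mean value at each rank, taken across all arrays
    rank_means = [sum(s[i] for s in sorted_arrays) / len(arrays)
                  for i in range(n)]
    out = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # indices in rank order
        result = [0.0] * n
        for rank, idx in enumerate(order):
            result[idx] = rank_means[rank]
        out.append(result)
    return out
```

After normalization, parallel measures from different arrays are compared on a common scale, which is what reduces the between-array variance noted above.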