Investigations where we varied these parameters or methodologies did not change our main conclusions, though the results for individual probes are naturally affected by the exact criteria used. In addition, the Affymetrix arrays are constructed in a specific layout, with each probe synthesized at a predefined location (2), while individual Illumina arrays must undergo a 'decoding' step in which the locations of each probe on the array are determined using a molecular address (1). We removed genes for which the 3′ ends of the probes were located further apart than 100 bases (similar in size to the average human exon) between the two platforms. This enrichment is highly significant (P = 0.0039 and P < 10^−15 for Illumina and Affymetrix, respectively). Other ties often involved closely related genes, probably reflecting duplications (e.g. As shown in Figure 6B, when Affymetrix and Illumina probes align to very close or overlapping locations in the genome, they have a tendency to agree more, whereas probes that hybridize to distinct locations, even along the same gene, tend to disagree more. While our annotation system tries quite hard to identify which transcript or set of transcripts a probe is likely to hybridize to, thereby identifying cases where we believe the platforms are measuring the same RNA, the resolution of this approach is limited. The authors concluded that the Agilent platform outperforms the Illumina and Affymetrix platforms due to its greater accuracy in fold change measurement and its accurate profiling of miRNAs that differ in GC content. Specifically, probes that are expected to hybridize to different parts of the same transcript might yield different signals. This could lead to different populations of transcripts being assayed in some cases.
At an FDR of 0.05, 37% of the cross-platform comparisons result in rejection of the null hypothesis of no correlation (the threshold correlation to achieve significance is ∼0.56). Black bars at the side indicate large clusters of genes that appear to show clear dilution effects in both platforms. The Illumina SNP chips include LD-based tagSNPs derived from over 2 million common SNPs (minor allele frequency greater than 0.05) in the HapMap data. This leaves a set of 940 pairs of probes for further study, or ∼3% of all comparisons. To analyze the ability of each platform to yield reproducible and accurate results, we used a dilution design, outlined in Figure 1 and detailed in Materials and Methods.
The online version of this article has been published under an open access model. Very similar results overall were obtained when using annotations provided by the manufacturers (Supplementary Data). This means that probes falling entirely within introns were given similarity scores of zero, and when there were two alternative 3′ ends for a gene, the one with the 3′ end nearest to the probe was selected as the targeted gene. These results shed light on the causes or failures of agreement across microarray platforms. Compared with the effect of expression level, the effect is small though still highly statistically significant, with a rank correlation of 0.18. The diversity of microarray platforms has made it challenging to compare data sets generated in different laboratories, hindering multi-institutional collaborations and reducing the usefulness of existing experimental data. One group reported high reproducibility of Affymetrix and long oligonucleotide arrays (which share similarities to the BeadArrays in the type of sequence used), but not of cDNA arrays (21), suggesting that there could be real differences in the reproducibility of different platforms, and that arrays based on long clones may have particular problems with specificity (16, 17). Finally, we removed pairs which showed good agreement across platforms (as these need no further explaining), setting a maximum correlation threshold of 0.5 (close to that which maintained an FDR of 0.05), and also required that at least one of the probes show a strong dilution effect (again using the threshold of 0.5, but as a lower limit). Most disagreements are more subtle. The Illumina data were extracted using software provided by the manufacturer. We also cannot eliminate the possibility that refinements of transcript assignment would resolve some cases of ‘disagreement’. The dilution step is shown as a graph at the top of the figure (Blood/Placenta).
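The two filtering criteria described above (drop pairs that already agree, and require a strong dilution effect on at least one platform) can be sketched as follows. This is a minimal illustration, not the authors' code: the `ProbePair` record, its field names, the example values, and the assumption that the dilution effect is summarised as a single score compared against 0.5 are all hypothetical, and whether the thresholds are inclusive is not stated in the text.

```python
from dataclasses import dataclass


@dataclass
class ProbePair:
    """Hypothetical record for one Affymetrix/Illumina probe comparison."""
    pair_id: str
    cross_platform_r: float    # correlation of the two dilution profiles
    affy_dilution: float       # assumed per-probe dilution-effect score
    illumina_dilution: float   # assumed per-probe dilution-effect score


def is_unexplained_candidate(pair, max_r=0.5, min_dilution=0.5):
    """Keep pairs that disagree across platforms although at least one
    probe shows a strong dilution effect (thresholds taken from the text)."""
    disagrees = pair.cross_platform_r < max_r
    strong_effect = max(pair.affy_dilution, pair.illumina_dilution) >= min_dilution
    return disagrees and strong_effect


# Made-up values, for illustration only.
example_pairs = [
    ProbePair("pair_a", 0.92, 0.95, 0.90),  # good agreement: removed
    ProbePair("pair_b", 0.10, 0.85, 0.20),  # disagreement, one strong effect: kept
    ProbePair("pair_c", 0.05, 0.10, 0.15),  # disagreement, no strong effect: removed
]
selected = [p.pair_id for p in example_pairs if is_unexplained_candidate(p)]
print(selected)
```

The other filters mentioned in the text (probe 3′-end distance and low expression on both platforms) would be additional predicates of the same shape.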
As mentioned, the level of expression would be an important factor in making a good comparison: if a gene is simply not present in the samples, the measurements will be just noise, and we do not expect noise to be similar across platforms (by definition). We designate all other potential transcripts ‘unassigned probes’. Second, we computed the distance of the 3′ end of the BLAT hit from the 3′ end of the annotated transcript (using the center of the BLAT hit made no difference in the conclusions; see Supplementary Data). For Illumina the input sequences were the 50 bp oligonucleotide sequences provided by the manufacturer. The dilution effect is more pronounced for probes targeted at ‘known’ genes. Michael Barnes, Johannes Freudenberg, Susan Thompson, Bruce Aronow, Paul Pavlidis, Experimental comparison and cross-validation of the Affymetrix and Illumina gene expression analysis platforms, Nucleic Acids Research, Volume 33, Issue 18, 1 October 2005, Pages 5914–5923. This is especially true once the factors of gene expression level and probe placement on the genome are considered. Hybridization to the Sentrix HumanRef-8 Expression BeadChip (Illumina, Inc., San Diego, CA), washing and scanning were performed according to the Illumina BeadStation 500× manual (revision C). We used a dilution design, where two different RNA samples are mixed at known proportions, and the same RNA is analyzed in duplicate on each platform (see Figure 1). However, it is also possible to identify clusters of probes which seem to show dilution effects on one platform but not on the other (Figure 2, light bars). In extreme cases, the GenBank accession number referenced by the manufacturer includes multiple genes.
Each sequence was compared with the genome sequence using BLAT (10) with minimum score set to 20 and an initial minimum identity set to 0.5 (all other parameters were left to the default setting). The results were visualized with matrix2png (12). Lockhart, D.J., Dong, H., Byrne, M.C., Follettie, M.T., Gallo, M.V., Chee, M.S., Mittmann, M., Wang, C., Kobayashi, M., Horton, H., et al. Pilot studies indicated that background subtraction had a negative impact on the Illumina data quality, so we used data that had not been background subtracted. There are ∼250 probes on each platform that have very high potential for cross-hybridization based on our sequence analysis (see Materials and Methods). To compare profiles across platforms, the Pearson correlation was used on non-log-transformed data (the RMA data were transformed back from log2), though the Spearman rank correlation yielded very similar results (Supplementary Data). Illumina also provided a table of annotations. Contrary to our general findings, a number of groups have found that concordance of results across expression analysis platforms is low (4, 5, 15–18). Comparisons between long and short oligonucleotide arrays have been carried out in the past for other array types (3–7). However, in contrast to spike-in studies, the identities of the genes expected to show differential expression are not definitely known ahead of time. Biotinylated cRNA was synthesized from total RNA (Enzo, Farmingdale, NY). Gentleman, R.C., Carey, V.J., Bates, D.M., Bolstad, B., Dettling, M., Dudoit, S., Ellis, B., Gautier, L., Ge, Y., Gentry, J., et al.
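The cross-platform profile comparison described above (Pearson correlation on non-log data, with log2-scale values transformed back first) can be illustrated with a short standard-library sketch. The function names and the six-point example profiles are hypothetical and are not taken from the study's data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # a constant profile has no defined correlation
    return sxy / math.sqrt(sxx * syy)


def from_log2(values):
    """Transform log2-scale intensities back to the linear scale."""
    return [2.0 ** v for v in values]


# Hypothetical profiles for one probe pair across a six-point dilution series.
affy_log2 = [6.0, 6.5, 7.2, 8.0, 8.9, 9.5]                 # log2 scale
illumina_raw = [110.0, 150.0, 240.0, 410.0, 700.0, 980.0]  # linear scale

r = pearson(from_log2(affy_log2), illumina_raw)
print(round(r, 3))
```

Working on the linear scale gives more weight to the high-intensity end of the series than a correlation computed on log values would. The text reports that the Spearman rank correlation, which is unaffected by this transformation, gave very similar results.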
At a second level of analysis, one can consider a finer level of stratification of probes based on their relative locations. Affymetrix expression level log2 > 7), the effect of location on agreement is enhanced slightly (rank correlation 0.25), indicating that the effect cannot be completely attributed to associated differences in expression level. Concordance was also improved when probes on the two platforms could be identified as being likely to target the same set of transcripts of a given gene. These probes appear wholly unremarkable based on the parameters we have focused on (expression level and location), compared with the 899 other probe pairs (the complete list of the 940 probe pairs is provided as Supplementary Data). Some questions still remain. Beyond this conclusion, two more specific findings we wish to highlight in the discussion are that expression level plays a major role in determining reproducibility across platforms, and that the precise location of the probe on the genome affects the measurements to a substantial degree, such that two probes which do not map to the same location cannot be assumed to be measuring the same thing. Note that in this figure, if a gene occurs multiple times on one platform, it is shown in all possible valid comparisons with matching probes on the other platform. This paper details results from an experiment comparing Affymetrix HG-U133 Plus 2.0 microarrays with the Illumina HumanRef-8 BeadArrays. Two BeadChips were used, each one containing eight arrays, so that each dilution series of six samples was run on an individual BeadChip.
This work was supported in part by the Children's Hospital Research Foundation of Cincinnati, the Schmidlapp Foundation, and National Institutes of Health Grants GM076880, AR47363, AR47784 and AR50688. As shown in Figure 5C, these genes show excellent agreement across the platforms, with many fewer disagreements than the data considered at large (Figure 5A). Resolving this will likely require additional data. We then filtered out probes which were expressed at low levels on both platforms (medians below the 25th percentile). These often represented alignments to sequences duplicated in the assembly (e.g. Affymetrix is a brand of DNA microarray products sold by Thermo Fisher Scientific that originated with an American biotechnology research and development and manufacturing company of the same name. We hypothesized that analyses focused on well-characterized genes would tend to yield better results. Indeed, if we first filter the data to remove comparisons between probes at least one of which does not show a significant dilution effect (FDR 0.05, removing 48% of the comparisons), rejection of 88% of the null hypotheses yields an FDR of <0.05 (i.e. None declared. Unlike the Affymetrix platform, each Illumina BeadArray slide contains multiple arrays, allowing us to analyze a complete dilution series on one slide. A potential remaining source of ‘disagreement’ could be differential cross-hybridization. As for the effect of expression level, the effect remains after removing probes which failed the dilution effect filter.
The OmniExpress chip, however, provides better coverage of Asian HapMap SNPs, although its coverage of HapMap SNPs is moderate. two probes targeting the same gene within a platform were more likely to yield concordant results if they exhibited stronger expression and were targeting nearby sites in the genome (see Supplementary Data for details). RNA-Seq technology produces discrete, digital sequencing read counts, and can quantify expression across a larger dynamic range (>10^5 for RNA-Seq vs. 10^3 for arrays). We believe this approach may be unsuitable for high-sensitivity comparisons across platforms, because of the coarseness of resolution of UniGene or GenBank IDs compared with the actual probes used on the arrays. This means that if a gene appeared twice on each platform, a total of four new expression vectors were constructed. parts of chromosome 1 and chromosome 1_random; ∼10% of cases). RNA mixtures (100:0, 95:5, 75:25, 50:50, 25:75, 0:100; PBMC:placenta) were prepared in single aliquots. This enrichment shows that when the dilution effect is considered, the agreement between the platforms rises substantially. The Affymetrix 6.0 and Illumina OmniExpress chips have similar genotyping accuracy and provide similar accuracy of imputed SNPs. CGB and CGB5). The dilution profile was described as a simple factor in a linear model used to fit each gene. Gunderson, K.L., Kruglyak, S., Graige, M.S., Garcia, F., Kermani, B.G., Zhao, C., Che, D., Dickinson, T., Wickham, E., Bierle, J., et al. For complete data see the Supplementary Data. Next, we counted the number of different transcripts predicted to be hybridized by each probe (assuming for the moment that all RNAs are equally likely to be detected, regardless of 3′ location of the probe).
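The counting mentioned above (a gene represented twice on each platform yields four cross-platform combinations) follows from pairing every probe for a gene on one platform with every probe for the same gene on the other. A minimal sketch, with hypothetical gene and probe identifiers and an assumed dictionary layout for the annotations:

```python
from itertools import product


def cross_platform_pairs(affy_by_gene, illumina_by_gene):
    """Enumerate every (gene, Affymetrix probe, Illumina probe) combination
    for genes that are represented on both platforms."""
    pairs = []
    shared_genes = sorted(affy_by_gene.keys() & illumina_by_gene.keys())
    for gene in shared_genes:
        for affy_probe, illumina_probe in product(
            affy_by_gene[gene], illumina_by_gene[gene]
        ):
            pairs.append((gene, affy_probe, illumina_probe))
    return pairs


# Hypothetical annotation tables: gene symbol -> probe identifiers.
affy = {"GENE_A": ["affy_1", "affy_2"], "GENE_B": ["affy_3"]}
illumina = {"GENE_A": ["ilmn_1", "ilmn_2"], "GENE_C": ["ilmn_3"]}

pairs = cross_platform_pairs(affy, illumina)
print(len(pairs))  # GENE_A contributes 2 x 2 = 4 combinations
```

Genes present on only one platform (GENE_B and GENE_C in this made-up example) contribute no comparisons.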
As shown in Figure 5A, there is a remarkable level of agreement for many probes by this measure (Pearson correlation was used for this analysis; similar results are obtained with the rank correlation, see Supplementary Data). We thank Kiran Keshav for assistance in preparing the manuscript. Our interpretation of this finding is that these probes are somehow inherently ‘poorly behaved’ and we predict that they will not yield biologically useful results. Therefore the failure of one platform to confirm a result on a rare transcript should be interpreted cautiously. Kuo, W.P., Jenssen, T.K., Butte, A.J., Ohno-Machado, L., Kohane, I.S. Illumina microarray technology (also known as BeadArray technology) uses silica microbeads. Following informed consent (approved by Cincinnati Children's Hospital Medical Center Internal Review Board), ∼50 ml whole blood was collected from 30 adult, apparently healthy, volunteers using Acid Citrate Dextrose as an anti-coagulant. Some additional insight into the reproducibility problem comes from looking at reproducibility within each platform. both platforms indicate a strong dilution effect for the gene, but in the opposite direction. For clustering only, the data matrices for each platform were adjusted so the probe expression vectors had a mean of zero and a variance of one. The study design is based on a dilution series of two human tissues (blood and placenta), tested in duplicate on each platform.
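The per-probe adjustment used for clustering (each expression vector rescaled to mean zero and variance one) can be sketched as below. This is an illustration only: the text does not say whether the sample (n − 1) or population (n) variance was used, so the sample variance here is an assumption, and the example vector is made up.

```python
import math


def standardize(vector):
    """Rescale a probe's expression vector to mean 0 and sample variance 1."""
    n = len(vector)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(vector) / n
    variance = sum((v - mean) ** 2 for v in vector) / (n - 1)  # assumed n - 1
    if variance == 0:
        raise ValueError("constant vector cannot be standardized")
    sd = math.sqrt(variance)
    return [(v - mean) / sd for v in vector]


# Made-up intensities for one probe across a six-point dilution series.
z = standardize([120.0, 180.0, 260.0, 400.0, 650.0, 900.0])
print([round(v, 2) for v in z])
```

After this step, clustering compares the shape of each profile across the dilution series, not its absolute intensity, which differs between platforms.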
Using a more stringent Bonferroni correction on this filtered data, 23.7% of the comparisons are considered significant at an alpha of 0.05, compared with 8.4% for the unfiltered data. Kothapalli, R., Yoder, S.J., Mane, S., Loughran, T.P., Jr. Jurata, L.W., Bukhman, Y.V., Charles, V., Capriglione, F., Bullard, J., Lemire, A.L., Mohammed, A., Pham, Q., Laeng, P., Brockman, J.A., et al. However, this conclusion is complicated by the fact that expression level is also affected by distance from the 3′ end (rank correlation −0.15), so the measure of probe location difference is not independent of the level of expression. a Mfg, Manufacturer's annotations; BLAT, our own annotations computed using BLAT alignments to the genomic sequence. The growth in popularity of RNA expression microarrays has been accompanied by concerns about the reliability of the data, especially when comparing between different platforms. A likely explanation for some of the effects we see has to do with differences in the technologies, such as differences in RNA labeling protocols, or the linker and ‘bar code’ sequences on the Illumina arrays compared with the direct attachment of the Affymetrix sequences to the substrate. This score differs slightly from the default in the GoldenPath genome browser in that the gap penalty is lower, but we found it gave us higher sensitivity when aligning shorter sequences and permits good gapped alignments of the collapsed Affymetrix sequences. We hypothesized that reproducibility within each platform (for those genes with multiple probes predicted to target them) would show the same trends as reproducibility across platforms.
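To illustrate why a Bonferroni correction rejects fewer null hypotheses than an FDR threshold at the same nominal level, the sketch below applies both to a small set of made-up P-values. The text does not state which FDR procedure was used; the Benjamini-Hochberg step-up procedure shown here is an assumption chosen for illustration, and the function names are hypothetical.

```python
def bonferroni_reject(pvalues, alpha=0.05):
    """Reject when p <= alpha / m, controlling the family-wise error rate."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg_reject(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (assumed FDR method)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if pvalues[index] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis whose rank is at or below that cutoff.
    reject = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= cutoff_rank:
            reject[index] = True
    return reject


# Made-up P-values, for illustration only.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
n_bonferroni = sum(bonferroni_reject(pvals))
n_bh = sum(benjamini_hochberg_reject(pvals))
print(n_bonferroni, n_bh)
```

With these values the Bonferroni threshold is 0.05 / 10 = 0.005, so only the smallest P-value passes, while the step-up procedure also accepts the second smallest.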
There are still fairly numerous probes which, based on dilution effect, location and expression level criteria, would be predicted to yield reproducible results, but do not. However, there was no overall difference between Affymetrix and Illumina in the number of transcripts assayed among the set of 940 (P ∼ 0.3, paired t-test). 88% of the remaining comparisons are significant). More generally, we expect higher expression levels to be associated with less noisy measurements, and therefore to yield better agreement across platforms. If we analyze only probes that have higher expression levels (e.g. The thresholds for stratification were determined by inspection or from the statistical testing, and alternative reasonable thresholds do not change our findings. b ‘Known genes’ are genes identified in the GoldenPath ‘refGene’ or ‘knownGene’ tables, including transcript information from the ‘all_mrna’ table to determine exon overlaps. For example, we note that Tan et al. The final ‘best’ match for a probe was the transcript closest to the probe's 3′ end and with the largest non-intronic overlap. We considered this plausible because Affymetrix probe sets assay more sequence and often include probes spread fairly widely (a median of 481 genomic bases from the 5′ end of the 5′-most probe to the 3′ end of the 3′-most probe) compared with the Illumina platform, which uses a single 50 bp probe that almost always maps to 50 contiguous bases in our analysis.
For Affymetrix, we merged and joined the individual probe sequences to form a ‘pseudo-target’ sequence; we found that aligning these to the genome was much more effective and efficient than attempting to align individual probes or using the Affymetrix ‘target’ sequences (the merging procedure is depicted in a Supplementary Figure). © The Author 2005. See Materials and Methods for details. The complete list of probes on both platforms, with their agreement statistics across platforms, is included as Supplementary Data. At least one group (24) has reported higher reproducibility than in a previous analysis of the same data (15), suggesting that data treatment and choice of comparison metric play a role. Genes that are expressed at low levels are not as likely to be reproducible across platforms. A difficulty with the analysis shown in Figure 5B and described above is that it relies on the arrays themselves to identify genes that might show a differential expression effect: an independent ‘gold standard’ would be desirable. That this is indeed the case is shown in Figure 6A; the rank correlation of expression level to measure cross-platform agreement is 0.37–0.43 (depending on whether the Illumina or Affymetrix expression levels, or their means, are used for evaluation). The key features of the design are the use of a single pair of RNA samples for all analyses, mixed together in varying proportions and analyzed in technical replicates on each platform. (who compared three platforms) (4) relied on GenBank or UniGene identifiers to match genes across platforms. Whole placenta was collected and immediately frozen in liquid nitrogen.
Arrays produced by Affymetrix are fabricated by in situ synthesis of 25mer oligonucleotides (2) while the Illumina process involves using standard oligonucleotide synthesis methods as are used for spotted long-oligonucleotide arrays. Our main conclusion from this study is that the Affymetrix and Illumina platforms yield highly comparable data, especially for genes predicted to be differentially expressed. a tie), one was arbitrarily chosen (247 cases for Affymetrix, 231 cases for Illumina). When these two factors are taken into account, the agreement of the results across platforms is very high, though still not perfect. Our results show that these two completely different microarray technologies yield, on the whole, very comparable results. To examine this in more detail, we sought to identify provisionally ‘unexplained’ cases of disagreement by filtering the full set of results, using partly arbitrary criteria. This is because noise will have a stronger influence on their measurement, making detecting a dilution effect difficult.
