Slide 1 - Asia University


Using ArrayExpress
ArrayExpress http://www.ebi.ac.uk/microarray-as/aer/index.html#ae-main[0]
ArrayExpress is an international public repository
for well-annotated microarray data, including gene
expression, comparative genomic hybridization
(CGH) and chromatin-immunoprecipitation (ChIP)
experiments.
ArrayExpress has three major goals:
1. Serve the scientific community as a repository for data
supporting publications.
2. Provide easy access to high-quality data in a standard format.
3. Facilitate the sharing of microarray designs and experimental
protocols.
ArrayExpress has two major components:
1. ArrayExpress experiment repository –
the main database containing complete data supporting
publications.
2. ArrayExpress gene expression profile data warehouse –
contains gene-indexed expression profiles from a curated
subset of experiments from the repository.
Options for sorting and filtering your results.
Search for experiments by entering ArrayExpress experiment accession
numbers or keywords (e.g. RNAi, breast cancer) in the query box on the
left-hand panel.
ID - the unique ArrayExpress accession number of the experiment.
Experiment accession numbers are in the format of E-XXXX-n, where XXXX is a code
for the source of the data.
Experiments and array designs in ArrayExpress are given unique accession numbers
in the format of
E-XXXX-n for experiments
A-XXXX-n for array designs
XXXX represents a four-letter code and n is a number, e.g. E-MEXP-568, A-UHNC-18.
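The accession format described above can be checked with a small pattern. The following is a minimal sketch, not part of ArrayExpress itself: the function name `parse_accession` is hypothetical, and it assumes the four-letter code is always written in uppercase letters, as in the two examples given.

```python
import re

# Pattern for the format described above: a type letter (E or A),
# a four-letter code, and a number, separated by hyphens.
# Assumption: the code consists of uppercase letters only.
ACCESSION_RE = re.compile(r"^(?P<kind>[EA])-(?P<code>[A-Z]{4})-(?P<number>\d+)$")

KINDS = {"E": "experiment", "A": "array design"}


def parse_accession(accession):
    """Return (kind, code, number), or None if the string does not match."""
    match = ACCESSION_RE.match(accession.strip())
    if match is None:
        return None
    return (
        KINDS[match.group("kind")],
        match.group("code"),
        int(match.group("number")),
    )


if __name__ == "__main__":
    for text in ("E-MEXP-568", "A-UHNC-18", "not-an-accession"):
        print(text, "->", parse_accession(text))
```

Returning `None` for a non-matching string lets a caller treat the input as a keyword query rather than an accession number.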
Title - the curated title for the experiment
Hybs - the total number of hybridizations in the experiment
Species - the species of the samples used (can be multiple)
Date - the date that the data were loaded into ArrayExpress
Processed –
a direct link to the processed data as a zip file (a brown icon indicates that processed data exist).
Raw –
a direct link to the raw data (a brown icon indicates that raw data exist,
a grey icon that they do not).
A wedge-shaped icon indicates Affymetrix .CEL files.
More –
a link to the ArrayExpress advanced interface where you can get subsets of each data
file by gene, hybridization and QuantitationTypes (columns in the data file).
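To make the result columns concrete, the sketch below models a result row and shows one way to filter and sort such rows offline. It is only an illustration: the class and function names are hypothetical, the rows are invented placeholders (the accessions use a made-up DEMO code), and nothing here calls the ArrayExpress interface.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ExperimentRow:
    accession: str   # ID column
    title: str       # Title column
    hybs: int        # Hybs column: total number of hybridizations
    species: tuple   # Species column (can be multiple)
    loaded: date     # Date column: when the data were loaded


def filter_by_species(rows, species):
    """Keep rows whose species list contains the given species."""
    return [row for row in rows if species in row.species]


def sort_by_date(rows, newest_first=True):
    """Order rows by the date the data were loaded."""
    return sorted(rows, key=lambda row: row.loaded, reverse=newest_first)


# Invented placeholder rows, for illustration only.
ROWS = [
    ExperimentRow("E-DEMO-1", "placeholder title 1", 12,
                  ("Homo sapiens",), date(2006, 3, 1)),
    ExperimentRow("E-DEMO-2", "placeholder title 2", 4,
                  ("Mus musculus",), date(2007, 1, 15)),
    ExperimentRow("E-DEMO-3", "placeholder title 3", 30,
                  ("Homo sapiens", "Mus musculus"), date(2005, 11, 20)),
]

if __name__ == "__main__":
    for row in sort_by_date(filter_by_species(ROWS, "Homo sapiens")):
        print(row.accession, row.loaded.isoformat(), row.hybs)
```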
Click anywhere on an experiment row and it will expand to let you see
more details about this experiment and where the term you searched
for appears.
Title - curated title of the experiment
MIAME score - a score indicating how close an experiment is to full MIAME compliance, with
a score of 5 being the highest. One point each is given for:
• sufficient annotation of the associated array design
• essential sample annotation, including at least one experimental factor and the
species of all samples
• raw data files for each hybridization
• final processed (normalized) data for the hybridizations in the experiment
• essential laboratory and data processing protocols
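The scoring rule above is a simple count of satisfied criteria. A minimal sketch follows, assuming each criterion can be reduced to a yes/no flag; the class and field names are hypothetical and are not taken from ArrayExpress.

```python
from dataclasses import dataclass, astuple


@dataclass
class MiameChecklist:
    # One flag per criterion in the list above (names are illustrative).
    array_design_annotated: bool
    samples_annotated: bool
    raw_data_present: bool
    processed_data_present: bool
    protocols_present: bool


def miame_score(checklist):
    """Give one point for each satisfied criterion, for a score of 0 to 5."""
    return sum(1 for satisfied in astuple(checklist) if satisfied)


if __name__ == "__main__":
    # An experiment that is missing raw data files only.
    example = MiameChecklist(True, True, False, True, True)
    print(miame_score(example))
```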
Sample annotation –
a link to .2columns.xls which is a file containing a list of the samples, the experimental
factor values associated with these samples and the corresponding data files
Array –
the ArrayExpress accession number(s) for the array design(s) used in the experiment.
Clicking on the accession number opens a new browser window showing more
information about the array design in the advanced query interface.
Downloads –
links to the FTP server directory containing data files and sample and hybridization
information for the experiment, and to the data retrieval page for the experiment in the
advanced user interface
Experiment design –
links to a diagram of the sample relationships in .png and .svg format.
Protocols –
a link to a page listing all the protocols used in the experiment.
Citation - details about any publications that relate to the data, including links to the
online article and to the PubMed entry where available
Detailed sample annotation - a link to .sdrf.xls which contains information about the
samples, the relationships between the samples, extracts, labeled extracts,
hybridizations and data files.
Contact - the name of the experiment submitter
Design types - terms describing design types of the experiment. These can include
biological, methodological and technology types e.g. disease state, strain or line,
compound treatment, in-vivo, dye swap, co-expression, binding site identification.
Description - the description of the experiment as supplied by the submitter
Factor values - a list of the experimental factor values in the experiment
The four-letter code in the accession number generally indicates the
source of the MAGE-ML file that was used to load the data into the
ArrayExpress database. Sources include our own submission tools
(MEXP for MIAMExpress and TABM for Tab2MAGE) as well as
MAGE-ML submitted from other organizations or microarray data
management tools. The four-letter code does not necessarily tell you
which organization performed the experiment or manufactured the
array design.
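As a rough illustration of the paragraph above, the sketch below looks up the submission route for the two codes named here. The function name and the fallback wording are hypothetical, and any code other than MEXP and TABM is simply reported as coming from another organization or tool, since the code alone does not identify who ran the experiment.

```python
# Only the two codes named in the text are listed; other codes exist
# but are not enumerated here.
KNOWN_SOURCES = {
    "MEXP": "MIAMExpress",
    "TABM": "Tab2MAGE",
}

OTHER_SOURCE = "MAGE-ML from another organization or data management tool"


def submission_source(accession):
    """Describe the source of the MAGE-ML file from the four-letter code."""
    parts = accession.split("-")
    if len(parts) != 3:
        raise ValueError(f"unexpected accession format: {accession!r}")
    code = parts[1]
    return KNOWN_SOURCES.get(code, OTHER_SOURCE)


if __name__ == "__main__":
    for accession in ("E-MEXP-568", "A-UHNC-18"):
        print(accession, "->", submission_source(accession))
```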
Some experiments have also been extracted from the Gene
Expression Omnibus (GEO) at the NCBI.
MIAME describes the Minimum Information About a
Microarray Experiment that is needed to enable the interpretation
of the results of the experiment unambiguously and potentially to
reproduce the experiment.