Transcript PPT - NIH LINCS Program
LINCS Fall Consortia Meeting
Broad Institute U54 Team
BASIC DISCOVERIES CONNECTIONS
PATHWAYS DISEASE STATES TOOL COMPOUNDS
THERAPEUTIC IMPACT DRUGS GENETIC GWAS TCGA RNAi CHEMICAL SCREENS NAT’L PRODUCTS
SLOW (SOME NEVER START) DOES NOT SCALE NO LEVERAGE
DIAGNOSTICS
LINCS as a Solution
• perturbations scalable to genome
• high information content read-outs (e.g. gene expression)
• inexpensive
• mechanism to query database
Toward a reduced representation of the transcriptome
gene expression is correlated
samples
Reduced Representation of Transcriptome
reduced representation transcriptome (‘landmarks’) → computational inference model → genome-wide expression profile
~100,000 profiles
[Figure: plot against number of landmarks measured. Credit: A. Subramanian, R. Narayan]
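The landmark idea on this slide can be illustrated with a minimal sketch: measure a small set of genes, then infer the rest with a model trained on existing profiles. The slides do not say which inference model is used, so ordinary least-squares regression is an assumption here, and the data, sizes, and function names are all synthetic and hypothetical.

```python
import numpy as np

def fit_inference_model(landmarks, targets):
    """Least-squares fit so that targets ~= [landmarks, 1] @ weights.

    landmarks: (n_samples, n_landmarks) measured landmark expression
    targets:   (n_samples, n_targets) expression of the genes to infer
    """
    design = np.hstack([landmarks, np.ones((landmarks.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights

def infer_expression(landmarks, weights):
    """Predict non-landmark expression from landmark measurements."""
    design = np.hstack([landmarks, np.ones((landmarks.shape[0], 1))])
    return design @ weights

# Synthetic stand-in for correlated expression data (an assumption):
# non-landmark genes are noisy linear mixtures of the landmark genes.
rng = np.random.default_rng(0)
n_train, n_test, n_landmarks, n_targets = 500, 50, 20, 100
mixing = rng.normal(size=(n_landmarks, n_targets))
train_lm = rng.normal(size=(n_train, n_landmarks))
train_tg = train_lm @ mixing + 0.1 * rng.normal(size=(n_train, n_targets))
test_lm = rng.normal(size=(n_test, n_landmarks))
test_tg = test_lm @ mixing + 0.1 * rng.normal(size=(n_test, n_targets))

weights = fit_inference_model(train_lm, train_tg)
predicted = infer_expression(test_lm, weights)

# Per-gene correlation between inferred and held-out values
mean_corr = np.mean(
    [np.corrcoef(predicted[:, g], test_tg[:, g])[0, 1] for g in range(n_targets)]
)
print(f"mean per-gene correlation on held-out samples: {mean_corr:.3f}")
```

Real expression data is far less cleanly linear than this toy example, so the accuracy printed here says nothing about the actual assay.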
1000-plex Luminex bead profiling
Assay steps: RT, ligation (5'-PO4), PCR, hybridization to Luminex beads (500 colors, 2 genes/color)
Reagent cost: $3/sample
Validation of L1000 approach

Gene-level validation
[Figure: scatter plot with Affymetrix axis, ticks 4 to 14]
92% R² > 0.6
Similar to AFFX vs ILMN

C-Map Connections
                             Published   Internal
Affymetrix ($500)            32          152
Affymetrix simulation        26 (80%)    121 (80%)
Luminex ($5), 1,000-plex     28 (86%)    142 (94%)
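A gene-level comparison like the one summarized above (fraction of genes with R² above 0.6 between two platforms) could be computed roughly as follows. This is a sketch under assumptions: the two matrices are synthetic, the function name is hypothetical, and R² is taken as the squared Pearson correlation per gene across matched samples, which the slide does not specify.

```python
import numpy as np

def fraction_genes_above_r2(platform_a, platform_b, threshold=0.6):
    """Per-gene squared Pearson correlation between two platforms.

    platform_a, platform_b: (n_samples, n_genes) arrays with the same
    samples in rows and the same genes in columns.
    Returns (r2 per gene, fraction of genes with r2 > threshold).
    """
    a = platform_a - platform_a.mean(axis=0)
    b = platform_b - platform_b.mean(axis=0)
    cov = (a * b).sum(axis=0)
    denom = np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))
    r2 = (cov / denom) ** 2
    return r2, float(np.mean(r2 > threshold))

# Synthetic example (an assumption): both platforms observe the same
# underlying expression with independent measurement noise.
rng = np.random.default_rng(1)
truth = rng.normal(8.0, 2.0, size=(40, 200))
platform_a = truth + rng.normal(0.0, 0.5, size=truth.shape)
platform_b = truth + rng.normal(0.0, 0.5, size=truth.shape)

r2, fraction = fraction_genes_above_r2(platform_a, platform_b)
print(f"fraction of genes with R^2 > 0.6: {fraction:.2f}")
```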
Putting it all together
Illustration: Bang Wong
Cell Types
GTEx
Primary hTERT-immortalized cells
Patient-derived iPS cells*
Banked primary cells* (T-cells, macrophages, hepatocytes, myocytes, adipocytes)
Cancer cell lines
* in assay optimization
2-3 weeks
Cell Repository (e.g. Coriell) somatic cell isolation
fibroblasts 4-6 weeks
Reprogramming [Oct4, Sox2, Klf4, Myc] Neural progenitors
3-4 weeks
Neural Differentiation: Astrocyte, Oligodendrocyte, Neuron
Perturbagens
Small-molecules (n=4,000)
Genes (n=3,000)
Automated Quality Control Measures
Overall failure rate ~ 8%
LINCS Proposal (~ 600,000 profiles)
4,000 compounds
• 1,300 off-patent FDA-approved drugs
• 700 bioactive tool compounds
• 2,000 screening hits (MLPCN + others)
2,000 genes (shRNA + cDNA)
• known targets of FDA-approved drugs (n=150)
• drug-target pathway members (n=750)
• candidate disease genes (n=600)
• community nominations (n=500)
20 cell lines
• emphasis on reproducibility and availability
• cancer and primary, non-cancer
• some ‘doubling down’ to assess intra-lineage diversity
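The counts on this slide can be checked with a few lines of arithmetic. The sub-category numbers and the ~600,000 total come from the slide; the average number of profiles per perturbagen and cell line pair is derived here, and what those profiles correspond to (for example doses, time points, or replicates) is not stated in the slides.

```python
# Sub-category counts as listed on the slide
compounds = 1300 + 700 + 2000          # FDA-approved + tool compounds + screening hits
genes = 150 + 750 + 600 + 500          # targets + pathway members + disease genes + nominations
cell_lines = 20
proposed_profiles = 600_000            # "~ 600,000 profiles" from the slide title

perturbagens = compounds + genes
pairs = perturbagens * cell_lines      # perturbagen x cell line combinations
profiles_per_pair = proposed_profiles / pairs

print(f"compounds: {compounds}, genes: {genes}")
print(f"perturbagen x cell line pairs: {pairs}")
print(f"average profiles per pair: {profiles_per_pair:.1f}")
```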
Progress to date
DATA RELEASE (BETA): http://www.broadinstitute.org/lincs_beta/
[Chart legend: proposed, actual, projected]
Signature of p53 ORF
p53 vs. empty vector
• p53 is NOT a Landmark Gene
• p53 pathway is #1 pathway of 512 in MSigDB
P < 0.001
Ramnik Xavier
Making connections in primary macrophages
LPS
NF-kB pathway genes (all INFERRED)
pathway rank: 1/512
pathways curated from literature (n=512)
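A pathway rank such as "1/512" comes from scoring every curated gene set against a signature and sorting. The slides do not say which enrichment statistic was used, so the sketch below swaps in a simple z-score of the set mean as an assumption. The gene names, set names, and data are synthetic and hypothetical.

```python
import numpy as np

def rank_gene_sets(signature, gene_sets):
    """Rank gene sets by how strongly their members are up-regulated.

    signature: dict mapping gene name -> differential expression score
    gene_sets: dict mapping set name -> set of gene names
    Returns a list of (set name, z-score), highest score first.
    """
    values = np.array(list(signature.values()))
    mu, sigma = values.mean(), values.std()
    scored = []
    for name, genes in gene_sets.items():
        members = [signature[g] for g in genes if g in signature]
        if not members:
            continue
        # z-score of the set mean relative to a random draw of the same size
        z = (np.mean(members) - mu) / (sigma / np.sqrt(len(members)))
        scored.append((name, float(z)))
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Synthetic signature with one planted pathway (an assumption)
rng = np.random.default_rng(2)
genes = [f"G{i}" for i in range(1000)]
scores = rng.normal(size=len(genes))
scores[:20] += 3.0                      # shift the planted pathway's genes up
signature = dict(zip(genes, scores))

gene_sets = {
    f"random_set_{k}": {genes[i] for i in rng.choice(len(genes), size=20, replace=False)}
    for k in range(50)
}
gene_sets["planted_pathway"] = set(genes[:20])

ranked = rank_gene_sets(signature, gene_sets)
top_name, top_score = ranked[0]
print(f"top-ranked set: {top_name} (rank 1/{len(ranked)})")
```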
Jens Lohr
Prioritizing human genetics candidates
Ramnik Xavier, MGH
Signatures of genetic variants connect to disease genesets
Ramnik Xavier, MGH
Disease variants connect to pathways
e.g. CD40 to ATG16L1 (both regulators of autophagy)
Ramnik Xavier, MGH
ERG transcription factor
important in hematopoietic stem cells, prostate cancer
ERG-BINDING SMALL-MOLECULES
Defining a gene expression signature of ERG activity
integrating experimental and clinical data
Gain of Function:
Primary prostate + hTERT +ST +AR +/-ERG
Loss of Function:
VCaP cells +/- ERG shRNA 120
Patient Samples:
Physician’s Health Study
3/69 ERG-binders inhibit ERG gene expression program
L1000 as primary small-molecule screen read-out
12,985 compounds screened for ERG signature
Rank  Name                      Library
1     BRD-K42581894-001-01-1    DOS
2     BRD-K42581894-001-01-1    DOS
3     BRD-K14408783-001-01-5    DOS
4     Wortmannin                Bioactives
5     BRD K78122587             ChemDiv
6     BRD-K91899208-001-01-8    DOS
7     BRD-K24750847-001-01-2    DOS
8     BRD-K18273607-001-02-1    DOS
9     BRD-K76892938-001-01-9    DOS
10    AZD2281 (Olaparib)        Bioactives
11    BRD-K86715531-001-01-1    DOS
12    BRD-K95688283-001-01-9    DOS
13    BRD-K99179945-001-01-5    DOS
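A ranked hit list like this one results from scoring each compound's expression profile against the query signature. The slides do not give the connectivity metric, so the sketch below assumes a simple one: mean change in the signature's up genes minus mean change in its down genes, with the most negative scores treated as candidates that reverse the signature. Compound names and data are synthetic and hypothetical.

```python
import numpy as np

def connectivity_scores(profiles, up_idx, down_idx):
    """Score each compound profile against a query signature.

    profiles: (n_compounds, n_genes) differential expression values
    up_idx, down_idx: column indices of the signature's up and down genes
    Positive scores mimic the signature, negative scores reverse it.
    """
    return profiles[:, up_idx].mean(axis=1) - profiles[:, down_idx].mean(axis=1)

def rank_reversers(names, scores):
    """Sort compounds so the strongest signature reversers come first."""
    order = np.argsort(scores)          # ascending: most negative first
    return [(names[i], float(scores[i])) for i in order]

# Synthetic screen with one planted reverser (an assumption)
rng = np.random.default_rng(3)
n_compounds, n_genes = 200, 300
names = [f"cmpd_{i:03d}" for i in range(n_compounds)]
profiles = rng.normal(size=(n_compounds, n_genes))
up_idx = np.arange(0, 25)
down_idx = np.arange(25, 50)
profiles[7, up_idx] -= 2.0              # planted compound lowers the up genes
profiles[7, down_idx] += 2.0            # and raises the down genes

scores = connectivity_scores(profiles, up_idx, down_idx)
ranked = rank_reversers(names, scores)
for rank, (name, score) in enumerate(ranked[:3], start=1):
    print(rank, name, round(score, 2))
```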
Analytical and software challenges
1. Infrastructure: data and compute server
2. Optimization of connectivity metrics and statistics
3. Optimization of inference models (context-aware)
4. UI: query tools and results visualization
5. Addressing off-target effects of perturbagens
Aravind Subramanian, Wendy Winckler, Justin Lamb
Computational: Rajiv Narayan, Josh Gould
Laboratory: Dave Peck, Willis Reed-Button, Xiaodong Lu
RNAi Platform
Chemical Biology Platform
Genetic Analysis Platform
Broad Program Scientists