Bioinformatics and Comparative Genome Analysis course

August 16 - August 29, 2009
HKU-Pasteur Research Centre - Hong Kong
http://www.pasteur.fr/~tekaia/BCGA2009.html

Participants self introduction

August 16, 2009


AU Chun Hang (Tommy)


Prof. Hoi-Shan KWAN (PI and supervisor)

Food Science Research Laboratory (Microbiology & Genetics)

The Chinese University of Hong Kong

Main interest of my lab

• Food-related bacterial and fungal genomics

Projects

• Lentinula edodes (Shiitake mushroom) genome project
• Salmonella local isolates resequencing
• Vibrio local isolates resequencing

Attend BCGA 2009 to

• Learn and practice comparative genome analysis
• Exchange bioinformatics experience
• Seek potential collaboration

(Image courtesy Science Photo Library)


Jessie, Bao

Post-doc fellow Genome Research Centre, University of Hong Kong


About our centre and my work

• Next-generation sequencing (Solexa platform)
• De novo DNA sequencing (bacterial, influenza)
• De novo transcriptome analysis for non-model organisms
• Methylation study in epigenetics

My expectation

• Know more about bioinformatic analysis
• Have a global view of bioinformatics and biology
• Exchange ideas with others

Luisa Berná

Stazione Zoologica A.Dohrn

INTERNATIONAL Ph.D. PROGRAMME OPEN UNIVERSITY

Laboratory of Animal Physiology and Evolution - Evolutionary Genomics “Comparative analysis of deuterostome genomes”

Supervision Team Director of Studies:

Prof. Giuseppe D'Onofrio

External Supervisors:

Dr. Pietro Liò (University of Cambridge, U.K.)
Fernando Alvarez-Valin (Universidad de la República, Uruguay)

Ph.D Student

Luisa Berná

Dehal et al. (2003); Delsuc et al. (2006)

Ciona

• effective experimental model organism
• unique position in evolution
• advantages for the investigation of developmental mechanisms

Expectations from the course

• Study in depth different and helpful algorithms in bioinformatics: sequence alignments and phylogenetic reconstructions, ontology predictions
• Interchange of ideas and discussions: discussing our own results with other scientists and students

Chen, Chien-Ming


Ph.D student.

Department of Computer Science and Engineering, National Taiwan Ocean University, Taiwan.

Current project: Detecting Functional Simple Sequence Repeat through Comparative Genomics.

Homologous genes

Gene ontology

Genetic diseases

Genome Bioinformatics & Computer Vision Laboratory

Advisor: Prof. Pai, Tun-Wen

Main interests:

Protein Structure Alignment

Epitope prediction

DNA sequence in-silico analysis

Systems biology

Expectations from the course

Advance my knowledge of evolution, homologous genes, genetic diseases, and systems biology.

Gang Chen

Computer Science Ph. D. Student

Central South University, Changsha, China

Our Lab: netlab.csu.edu.cn


Blog: www.gossipcoder.com


My Interests:

• Dynamic biological networks, especially protein interaction networks
• Sequence analysis for the study of human diseases
• Algorithm design and analysis
• Perl, R, C and OCaml programming languages
• Drupal framework
• TeX typesetting technology
• Digital devices
• Soccer
• Popular music and movies
• Science fiction

Current Projects:

• Combining gene expression data with protein interaction networks, and identifying functional modules from the dynamic protein networks
• Haplotype assembly for next-generation sequencing technologies
• Linkage analysis and copy number analysis for depression and drug abuse patients (in cooperation with XiangYa Medical School)

What I Want to Get from BCGA 2009:

• More knowledge about comparative genome analysis, and try to apply it to my projects
• Improve my skills in Perl, UNIX and related packages
• The opportunity to communicate with famous people in this field

Be friends with everyone!

(Image courtesy Cai et al. 2008)
(Image courtesy www.foodsafety.gov)
(Photo courtesy vollan.ca)
(Image courtesy www.vet.upenn.edu)

General Features of BALOs

Dr. Morena d’Avenia

PhD Student in Applied Biology

University Federico II of Naples - Italy
University of Salerno - Salerno - Italy

Department of Pharmaceutical Sciences – division of biomedicine

PhD Project: Characterization of an unmapped nucleotide sequence that modulates apoptosis (air: apoptosis-induced regulator)


Background: While looking for new genes involved in NF-κB-mediated regulation of cell apoptosis, differential analysis of gene expression identified a cDNA (3.4 kb long, 79% AT-rich) in Jurkat cells transfected with IκBα and stimulated with TNFα.

EST of 106 bases in neuroblastoma cells (U74659). Genotrace software (Berezikov E et al. 2002), using an "in silico genome walking" strategy, identified a closely related 4.4 kb sequence in the raw sequencing reads of the chimpanzee genome.
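The "79% AT-rich" figure above is a base-composition measure. As a minimal sketch only (the function name `at_content` and the example sequence are made up for illustration and do not come from the slides), AT content of a nucleotide sequence could be computed in Python like this:

```python
def at_content(seq: str) -> float:
    """Return the fraction of A/T bases among the unambiguous bases (A, C, G, T).

    Ambiguity codes such as N are ignored, which is an assumption of this sketch.
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return (seq.count("A") + seq.count("T")) / counted


if __name__ == "__main__":
    example = "ATTTAAGCATATTAATTGCA"  # hypothetical sequence, not the real cDNA
    print(f"AT content: {at_content(example):.0%}")
```

A real analysis would read the sequence from a FASTA file rather than a string literal; that step is left out here to keep the sketch self-contained.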

detection on human genomic DNA

X screening of a custom 10X BAC genomic library

These results, and the absence of the sequence from the assemblies, indicate that cloning problems could be due to specific sequence characteristics (Razin SV et al. 2001).

In vivo experiments: air expression induces apoptosis in vitro and in human melanoma in vivo

AIR sequence was patented by the University of Catanzaro jointly with the University of Naples “Federico II” (patent WO02055553; acc. no. AX538681; AX538682; AX538684)

What I expect from this EMBO course:

• Learn how to use new tools for analyzing high-throughput sequencing data, and learn how to design helpful algorithms in bioinformatics
• Improve my knowledge in consulting data banks
• Interchange of ideas and discussions: discussing our own results with other scientists and students

Ding Keyue, Ph.D

Soochow University, China

• Ph.D., 2004, Peking Union Medical College
• Postdoc, 2004~2007, Mayo Clinic
• Instructor of Medicine, Mayo Clinic

• Research interests
  – Patterns of variation in the human genome
  – Identification of genetic determinants for coronary heart disease
    • Evolutionary genetics of coronary heart disease
    • Mutation spectrum of coronary heart disease
• Expectation: How to detect signatures of natural selection in regulatory regions using a comparative genomic method?

Yanhui Fan (Nolan)

Department of Biochemistry, Faculty of Medicine,

The University of Hong Kong

21 Sassoon Road, Hong Kong, China.

Research at our laboratory is focused on understanding the molecular basis of human complex diseases.

My Project:

Genome-wide Association Study of Adolescent Idiopathic Scoliosis in Southern Chinese

Expectations:

Advanced methods used in genome analyses.

Perl Software


Marbella da Fon êca

• I am from Brazil
• Biologist – worked with phylogeny
• Graduated in Information Systems – programming
• Masters in Genetics and Molecular Biology – Molecular Modeling
• Second year of my PhD in Genetics at the Federal University of Rio de Janeiro
• I carry out my work at the National Laboratory for Scientific Computing (LNCC, in Portuguese) – http://www.lncc.br/
• Labinfo – responsible for many genome projects; developed the software SABIA, which assembles, annotates and compares the genomes of prokaryotes, ESTs and eukaryotes; developed several databases; 454 Genome Sequencer – http://www.labinfo.lncc.br/

My Work

• Identification and characterization of Mycoplasma hyopneumoniae 7448 proteins by threading, Molecular Modeling and Gene Expression
• Bacterium with complete genome – 716 ORFs
• 298 are hypothetical and conserved hypothetical proteins
• Bioinformatic function prediction by threading and Molecular Modeling
• Confirmation or refutation by gene expression
• Results – so far we have predicted the function of 32 proteins by threading. We have also predicted the 3D structure of eight proteins by Molecular Modeling and purified two proteins in the laboratory to start experiments
• Expectations – learn new tools to improve and boost my research.


Amel Ghouila

PhD Student in Computer science and Bioinformatics Pasteur Institute of Tunis

• Molecular biology for genetic orphan diseases
• Bioinformatics and statistics tools for linkage and genome-wide association studies
• Development and application of data mining algorithms for transcriptomic data analysis

MAB team, LIRMM Laboratory, Montpellier, France

• Development and application of data mining algorithms for post-genomic data with application to infectious pathogens

Current research work

• Development of clustering and classification tools for genomic, transcriptomic and proteomic data analysis
• Leishmania genome annotation based on transcriptomic data analysis
• Development of algorithms for protein domain detection in order to improve functional annotation
• Comparative genomics of different Leishmania species in order to enhance gene function determination, as these species show high conservation in gene content and genome organization

What do I expect from this course?

• Strengthen and update my knowledge in bioinformatics
• Learn more about sequence analysis and comparative genomic tools
• Become familiar with these tools thanks to the practical sessions
• Meet and discuss with scientists from the field

Yu-Nong Gong

Ph. D. student,

Infectious disease informatics laboratory

Chang Gung University, Tao-Yuan, Taiwan

EDUCATION

• Ph.D. student in Electrical Engineering, 2007 ~ present, Chang Gung University, Tao-Yuan, Taiwan.
• M.S. in Computer Science and Information Engineering, 2005 ~ 2007, Chang Gung University, Tao-Yuan, Taiwan.
• B.S. in Applied Mathematics, 2001 ~ 2005, Chinese Culture University, Taipei, Taiwan.

Research

• Computations in molecular virology and epidemiology
• Bioinformatics core in emerging infectious virus research
• Biological sequence database and analysis

Experiences

National Science Council Research Projects, Taiwan

• August 2008 ~ July 2011 – Mapping Viral Genotypes vs Phenotypes through Sequence Analysis of Emerging Viruses
• August 2006 ~ July 2007 – Identifying Molecular Targets for Influenza Virus Diagnosis and Classification

Host-pathogen conflict: the mutually disturbed transcriptomes of both human macrophages and Leishmania parasites upon infection.

Fatma Guerfali

PhD

Lab. of Immunology, Vaccinology and Molecular Genetics, WHO Collaborating Center, Pasteur Institute of Tunis, Tunisia.

IPT foundation in 1893

[Diagram: parasite, vector, hosts, host cells; diseases and themes shown: leishmaniasis, typhus, toxoplasmosis, tuberculosis, rabies, genetic diseases, diagnosis]

Model: human
Cell: macrophage
Infection: Leishmania
Analysis: transcriptome

● Use of a large-scale transcriptome analysis method to study macrophage infection with Leishmania
● What impact does the infection of human macrophages by Leishmania parasites have on the transcriptomes of both actors?

Expectations From The Course:

● Use of public databases to identify, at a large scale, the molecular targets of Leishmania infection
● Various types of queries to be run at a large scale, because hundreds of gene / protein targets have been identified

Cayetano Heredia University Lima - Peru

Frank Guzman

Research Assistant

Bachelor's Degree in Biology, focused on Cellular Biology and Genetics, San Marcos University (Lima, Peru)
Master's Degree in Molecular Biology, San Marcos University (Lima, Peru)

Genomics Research Unit

Plant Molecular Genetics - Plant Genomics

Current projects: Barley, Potato

This workshop: Sequence analysis and comparative genomics

Chia-Lang Hsu

PhD student

• Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan
• My advisor: Ueng-Cheng Yang (http://binfo.ym.edu.tw/labpage)

My research

• Focus on Functional Bioinformatics
• Research interests in my lab:
  – Analysis of pathways
  – Bioinformatics analysis of disease candidate genes and mechanisms
  – Clinical systems
  – Translational research
• My projects:
  – Hunting candidate genes of complex diseases via phenotypic and interactome information.
  – Presenting pathway relations based on GO information.

Khader Shameer

National Centre for Biological Sciences (TIFR), Bangalore - India

NCBS – TIFR

Main interest of our lab

• The mandate of NCBS is basic research in the frontier areas of biology
• The research interests of the faculty are in 6 broad areas ranging from the study of single molecules to systems biology
• Premier research institute in India with all the modern research and technology facilities
• Offers graduate programs leading to PhD degrees and MS by research
• http://www.ncbs.res.in

• Prof. R. Sowdhamini's Computational Biology lab at NCBS - TIFR is focussed on different aspects of computational approaches to protein science
• Lab interests are mainly in designing algorithms, protocols, tools, servers and databases for the better understanding of important protein families and their relationships in terms of structural and sequence homologies
• The lab hosts various widely used publicly accessible web resources (8 databases and 8 web servers)

My Projects

• Worked on 5 different projects
  – PURE: an approach for the prediction of domains in unassigned regions
  – HARMONY: a tool for the validation of protein structures based on propensity and substitution scores
  – STIF / STIFDB: algorithm / database for prediction of stress-responsive transcription factors
  – IWS: Integrated Web Server for protein sequence and structure analysis
• Currently working on:
  – Analysis of therapeutically important protein families
  – Protein structure analysis using machine learning approaches

What I am expecting from BCGA 2009

• To interact with an excellent team of international speakers
• I hope the course will help me to learn and discuss several aspects of advanced bioinformatics and comparative genome analysis
• I envisage that the course will be very useful for my current and future research work

Kuan-Ting Lin (Woody)

• Ph.D student
  – Institute of Biomedical Informatics
  – National Yang Ming University in Taiwan
• Interests
  – Alternatively spliced proteins / isoforms
  – Old drugs for new uses
• Project
  – To construct the alternatively spliced (AS) PPI network in HCC (liver cancer)
• What am I expecting from the course?
  – To deal with alignment problems in AS
  – AS protein functions
  – AS PPI in human phylome

Dr. Stephen B. Pointing

Maggie C.Y. Lau

Extremophiles Research Group School of Biological Sciences, The University of Hong Kong

Extremophiles are organisms able to live and proliferate in unusual environmental conditions. There are obligate and facultative extremophiles.

We study them for the following reasons:

• As an analogue to early life on Earth, when conditions were not favourable for life
• As an analogue to possible extraterrestrial life forms
• As a resource of biomolecules for medical, industrial and research development
• As a model system for ecological studies

[Photos: Taklimakan Desert, Xinjiang, China; McKelvey Valley, Antarctica; Shigatse, Tibet, China; Aerosphere, Hong Kong, China (© www.panoramio.com)]

Biodiversity & Ecology – what are they? how are they distributed? what are the driving forces behind this?

Phylogeny & Evolution – how are they phylogenetically related to their mesophilic counterparts? what is their evolutionary history?

Metabolism & Adaptation – how do they metabolically interact with each other? what are their adaptive features and capability?

[Figure: enzyme digestion on 129f-end; x-axis: size of tRF (bp)]

[Figure: tree of cbbL sequences (including cbbL 3 and cbbL 7 groups) with support values; scale bar 0.1]

My work includes …

• mainly the analysis of the above-mentioned projects
• evaluation and recommendation of experimental approaches
• miscellaneous

I expect …

• to enrich my knowledge in informatics and know more about its development to meet biologists' needs
• to learn different analysis tools (principles and utilities)
• to know about the limitations or caveats
• to make friends and become collaboration partners

Chieh Hua Lin

Ph.D student

National Tsing Hua University, Taiwan

• I work at the Division of Biostatistics and Bioinformatics and the Center for Nanomedicine Research at the National Health Research Institutes, Taiwan.
• I have been involved in the analysis of microarray data, protein-protein interactions and phylogenetics, in which we have combined theory with programming.
• The projects I am involved in include toxicogenomics of nanoparticles as biomaterials, by microarray analysis, and comparative genomics and interactomes.
• My research involves heterogeneous data sources that need to be integrated systematically. The topics scheduled in this course are exactly what I need. By becoming skilled in the programs of different fields, I hope to gain insight into systematic data integration for systems biology.

Mahfoudh Wijden, PhD student.

Department of Molecular Immuno-Oncology,

Faculty of Medicine, Monastir, Tunisia.

Genetic susceptibility to hereditary and sporadic breast cancer

* Breast cancer is a common malignancy affecting women around the world.
* It occurs in sporadic and hereditary forms.
* About 5-10% of all breast cancer is inherited as the result of highly penetrant germline mutations in cancer-predisposing genes, which lead to an autosomal dominant predisposition to the disease.
* At present, two major breast cancer susceptibility genes have been identified (BRCA1 and BRCA2).

Objective:

Determine the prevalence and the spectrum of BRCA1 mutations among Tunisians.

Methods:

Mutation screening of high-risk Tunisian breast cancer families for germline mutations in the entire BRCA1 coding region and exon-intron boundaries using direct DNA sequencing.

What I expect from the course:

Acquire the necessary skills for genome data analysis using public databases and web-based sequence analysis tools:
- multiple sequence alignment
- prediction of the functional effect of single nucleotide polymorphisms and splice site mutations.

Michael Torres

RESEARCH ASSISTANT Genomic Research Unit

Universidad Peruana Cayetano Heredia, Lima, Peru

POTATO GENOME SEQUENCING PROJECT – CHROMOSOME 3

Background

• Biologist with mention in genetics.
• Technician degree in "computer engineering" (hardware and software)
• Trained in bioinformatics:
  – DRUG DESIGN (GROMOS & GROMACS) at the Bioinformatic Unit, Universidad Peruana Cayetano Heredia
  – COMPARATIVE GENOMICS (fingerprint program, Perl programming, consed/phred/phrap & bambus) at Wageningen University and Michigan State University.
• Bioinformatician in the Potato Genome Sequencing Project, in charge of assembling 25% of Chromosome 3 (estimated size 30,000,000 nucleotides)

Fu-Jin Wei (Grass)

• Where I work
  – Inst. Plant & Microbial Biology, Academia Sinica, Taipei, Taiwan
• Main interests of my lab
  – Studies on rice functional genomics
  – Biological functions of soybean seed maturation proteins
• My research
  – Analysis of genome datasets
  – Maintain our websites: ASPGC, TRIM, LEA genes, Rice Anther
  – Maintain the informatics system in my lab
• Expect from the course
  – "Search for motifs", "Phylogenomics" and "Large scale genome comparisons"
  – Making friends

Emily Wong

PhD student, Kathy Belov's lab, Faculty of Veterinary Science, University of Sydney, Australia

My projects

• Genomic identification of divergent immune genes and gene families in marsupials and monotremes (Wong et al. 2006, Immunome Research; Wong et al. 2009, Immunogenetics; Wong et al., accepted, Aust. J. Zool.)
• Developed a lab sequence database application based on the concept of 'tags': GutenTag - http://tagbase.angis.org.au/tagbase/gutentag/
• Ancestral antimicrobial prediction and testing
• Examining expression using RNAseq of the two thymuses in a tammar wallaby

Thomas Wong Hong Kong

• A PhD student in the Computer Science department at the University of Hong Kong
• Area of research interest: non-coding RNAs
• In our lab there are many different projects going on, such as sequencing, secondary structure prediction, motif finding, ncRNA, etc.

My expectations from the course:

• Learn more useful and practical skills
• Explore different areas in bioinformatics
• Get to know more people, communicate more about research, and hopefully cooperate in the future.

Current Research Project

• Aim: to locate non-coding RNAs (ncRNAs) along the genome
• Background:
  – Infernal is the most successful prediction tool and exhibits high sensitivity.
  – However, when we used Infernal to search for ncRNAs along the human genome, we found many false positives.
• Our approach:
  – Infernal does not consider adjacent dependence in its model. We therefore incorporate adjacent dependence and build a new model to identify ncRNAs.
• Result:
  – With this new model, many of the false positives produced by Infernal can be filtered out.

Nazar Zaki (PhD)

Associate Professor Coordinator, Bioinformatics Laboratory

URL: http://faculty.uaeu.ac.ae/nzaki

College of Information Technology, United Arab Emirates University, UAE
http://cit.uaeu.ac.ae/

• Serves the computational biology community in the UAE.
• Provides students and researchers with access to software and technical support related to computational biology.

Research Focus

• Develop machine learning algorithms to solve specific bioinformatics problems, such as protein homology detection, identification of protein functional sites, protein function/structure prediction, protein interaction networks, and sequence classification.

Expectations

Learn more and collaborate.