CANB7640 COURSE WEBSITE

Course Director: Aik Choon Tan, Ph.D.

Course Instructors/Tutors:
Jihye Kim, Ph.D.

Class: Tuesday 1pm - 5pm
Venue: RC1N 1309 (P18 CTL-1309)


Course Syllabus

  1. INTRODUCTION
  2. DATA MINING CONCEPTS
  3. GENE EXPRESSION ANALYSIS I - CANDIDATE GENE APPROACH

    • [SLIDES]

    • [WORKSHOP SLIDES]

    • Link to CLASS03 Workshop materials and Assignment 3 [LINK]

      Reading materials:

    • SAM paper: Tusher, Tibshirani, Chu. (2001). Significance analysis of microarrays applied to the ionizing radiation response. PNAS 98(9):5116-5121. [PDF]
    • FDR calculation in high-throughput genomics data: Xie, Whitehurst, White (2007). A practical efficient approach in high throughput screening: using FDR and fold change. Nature Protocol Exchange. [Link]
    • Comparisons of various methods: Jeanmougin, de Reynies, Marisa, Paccard, Nuel, Guedj. (2010). Should we abandon the t-test in the analysis of gene expression microarray data: a comparison of variance modeling strategies. PLoS ONE. 5(9): e12336. [PDF]

      Related data:

    • ALL vs AML example in Excel [Download]

  4. GENE EXPRESSION ANALYSIS II - GENE SET ANALYSIS

    • [SLIDES]

    • [WORKSHOP SLIDES]

    • Link to CLASS04 Workshop materials and Assignment 4 [LINK]

      Reading materials:

    • Mootha et al. (2003). PGC-1α-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat. Genetics. [PDF]
    • GSEA paper: Subramanian et al. (2005). Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. PNAS. [PDF]
    • GSEA User Guide [Link]
    • Emmert-Streib, Glazko (2011). Pathway Analysis of Expression Data: Deciphering Functional Building Blocks of Complex Diseases. PLoS Comp. Biol. [PDF]
    • Khatri, Sirota, Butte (2012). Ten Years of Pathway Analysis: Current Approaches and Outstanding Challenges. PLoS Comp. Biol. [PDF]

  5. GENE LIST ENRICHMENT ANALYSIS III - TOOLS, DATA INTEGRATION AND VISUALIZATION

  6. GENE EXPRESSION ANALYSIS IV - PROCESSING, QUERYING AND VISUALIZING GENE EXPRESSION DATA

    • [SLIDES]

    • [WORKSHOP SLIDES]

    • Link to CLASS06 Workshop materials and Assignment 6 [LINK]

      Reading materials:

    • Irizarry et al. (2003). Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostat. [PDF]
    • Rung & Brazma. (2013). Reuse of public genome-wide gene expression data. Nat. Rev. Genetics. [PDF]
    • Brazma et al. (2001). Minimum information about a microarray experiment (MIAME) - toward standards for microarray data. Nat. Genetics. [PDF]
    • Pavlidis & Noble. (2003). Matrix2png: a utility for visualizing matrix data. Bioinformatics. [PDF]

  7. NEXT GENERATION SEQUENCING - INTRODUCTION, ALGORITHMS AND TOOLS

  8. MINING CANCER CELL LINES DATABASES

  9. MINING CANCER GENOMICS DATA

  10. CONNECTIVITY MAP

  11. FINAL PROJECT AND PRESENTATION

    Format: 10-minute presentation.

    • Introduction (motivation: what is the question? what are you trying to solve or find from the data?)
    • Features of your data sets (in-house and/or public data sets: what kind of data? how did you process the data?)
    • Analysis plan (how do you plan to tackle the problem bioinformatically? what is the workflow?)
    • Tools used in the analysis/mining of your data (list the tools and give some background on each)
    • Results & interpretation (what are the results? how do you present them? how do you interpret the data?)
    • Future plan & discussion (what is next? validation in the lab?)
    • Conclusions