Introduction to Biostatistics and Bioinformatics Fall 2014 (BMSC-GA 4451)
Lecturers:
Yindalon Aphinyanaphongs (yin.a@nyumc.org)
Stuart Brown (Stuart.Brown@nyumc.org)
David Fenyö (David@FenyoLab.org)
Judy Zhong (Judy.Zhong@nyumc.org)
Tutorial Instructors:
Pamela Wu (Pamela.Wu@nyumc.org)
Amanda Ernlund (Amanda.Ernlund@nyumc.org)
Course Overview
The goal of the Introduction to Biostatistics and Bioinformatics course is to provide an introduction to statistics and informatics methods for the analysis of data generated in biomedical research. Practical examples covering both small-scale lab experiments and high-throughput assays will be explored. The course covers a wide range of topics in a short time, so the focus will be on the basic concepts; in the practical programming exercises, students explore these basic concepts and common pitfalls. An introduction to basic Python and R programming will be given throughout the course, and many exercises will involve programming.
Learning objectives
The student will be introduced to entry-level methods in biostatistics and bioinformatics.
Course Assessment
- Readings and participation (10%): Students are required to attend class, complete reading assignments, participate in discussions, and engage in a healthy exchange of ideas. Each student is required to lead the discussion of at least one of the assigned weekly readings. This discussion lead will be graded.
- Assignments (40%): A programming assignment will be given at the end of each class, and the solutions should be e-mailed to Assignments@FenyoLab.org within a week.
- Exam (40%): There will be one exam in this class and it will cover the entire course material.
Missed Exams and Grade Appeals
Make-up examinations (for the final only) will be given only under special circumstances. Documentation will be required to verify a student’s claim. If a make-up exam is permitted, a different exam will be written for that student and may have a different format than the regular examination.
Assignments must be turned in on time; no late assignments will be accepted.
If you believe there is a mistake in the grading of an assignment or exam, you may appeal the grade within a week after you receive it. To do so, write a note describing the error, attach it to the original exam, and give it to me within a week of the return of your exam. I will review your argument and my initial grading, and then return your exam to you with a decision in a timely manner.
General Policies
- Late/missed work: You must adhere to the due dates for all required submissions. If you miss a deadline, you will not get credit for that assignment/post. Try to avoid last-minute submissions.
- Incompletes: No “Incompletes” will be assigned for this course unless we are at the very end of the course and you have an emergency.
- Responding to Messages: I will check e-mails daily during the week, and I will respond to course-related questions within 48 hours.
- Announcements: I will make announcements throughout the semester by e-mail. Make sure that your email address is updated; otherwise you may miss important emails from me.
- Safeguards: Always back up your work in a safe place (an electronic file with a backup is recommended) and make a hard copy. Do not wait until the last minute to do your work. Allow time for deadlines.
- Plagiarism: Plagiarism, the presentation of someone else's words or ideas as your own, is a serious offense and will not be tolerated in this class. The first time you plagiarize someone else's work, you will receive a zero for that assignment. The second time you plagiarize, you will fail the course with a notation of academic dishonesty on your official record.
Recommended Readings
Fundamentals of Biostatistics by Bernard Rosner
Understanding Bioinformatics by Marketa Zvelebil and Jeremy O. Baum
Think Python by Allen B. Downey
Lectures
Lecture 1 Exploring Data (September 9, 2014 TRB 120 2pm)
Lecturer: Fenyo (Video, Slides)
Tutorial Instructor: Wu (Video, Document, first.py, open.py, open.txt)
Homework (due date: September 18)
Reading List
Data visualization: A view of every Points of View column
Python Basics for Bioinformatics by Stuart Brown
Additional Reading
The Visual Display of Quantitative Information by Edward R. Tufte
Visualize This: The FlowingData Guide to Design, Visualization, and Statistics by Nathan Yau
Data Analysis with Open Source Tools by Philipp K. Janert
The Wall Street Journal Guide to Information Graphics: The Dos and Don'ts of Presenting Data, Facts, and Figures
Lecture 2 Descriptive Statistics (September 11, 2014 TRB 120 2pm)
Lecturer: Fenyo (Video, Slides)
Reading List
Think Stats by Allen B. Downey Chapter 2
Importance of being uncertain
Visualizing samples with box plots
Error bars
Tutorial Python Programming (September 13, 2014 TRB 120 3pm)
Tutorial Instructor: Wu
Lecture 3 Data types and Representations in Molecular Biology (September 16, 2014 TRB 120 2pm)
Lecturer: Brown (Video, Slides)
Tutorial Instructor: Wu (ecogene.fasta, seq_id.list)
Homework (due date: September 22)
Reading List
Understanding Bioinformatics Chapter 3
What is Bioinformatics
Entrez Help
Lecture 4 Probability (September 18, 2014 TRB 120 2pm)
Lecturer: Zhong (Video, Slides)
Reading List
Fundamentals of Biostatistics Chapter 3
Think Stats by Allen B. Downey Chapter 5
Lecture 5 Sequence Alignment Concepts (September 23, 2014 TRB 120 2pm)
Lecturer: Brown (Slides)
Tutorial Instructor: Wu (Document, Dmel-UniP.fasta, fly_test.fasta)
Homework (due date: September 29)
Reading List
Understanding Bioinformatics Chapters 4.1-4.5 and 5.1-5.4
Smith Waterman
FASTA
Emboss dotmatcher
Lecture 6 Sequence Database Searching (September 25, 2014 TRB 120 2pm)
Lecturer: Brown (Slides)
Reading List
BLAST Chapter 4
Altschul-BLAST
The BLAST Sequence Analysis Tool
Lecture 7 Distributions (September 30, 2014 TRB 120 2pm)
Lecturer: Zhong (Slides)
Tutorial Instructor: Wu (Document, central_limit.py, functions.py)
Homework (due date: October 6)
Reading List
Fundamentals of Biostatistics Chapters 4 & 5
Think Stats by Allen B. Downey Chapters 4 and 6
Lecture 8 Estimation I (October 2, 2014 TRB 120 2pm)
Lecturer: Zhong (Video, Slides)
Reading List
Fundamentals of Biostatistics Chapter 6
Think Stats by Allen B. Downey Chapter 8
Lecture 9 Estimation II (October 7, 2014 TRB 120 2pm)
Lecturer: Zhong (Video, Slides)
Tutorial Instructor: Wu (confInt.py)
Homework (due date: October 13)
Reading List
Fundamentals of Biostatistics Chapter 6
Additional Reading
Think Bayes by Allen B. Downey
Lecture 10 Hypothesis Testing I (October 9, 2014 TRB 120 2pm)
Lecturer: Zhong (Video, Slides)
Reading List
Fundamentals of Biostatistics Chapters 7 & 8
Significance, P values and t-tests
Comparing samples - part I
Think Stats by Allen B. Downey Chapter 7
Lecture 11 Hypothesis Testing II (October 14, 2014 TRB 120 2pm)
Lecturer: Zhong (Video, Slides)
Tutorial Instructor: Wu (hypTesting.py, hyptesting1.png, hyptesting2.png)
Homework (due date: October 27)
Reading List
Fundamentals of Biostatistics Chapters 7 & 8
Comparing samples - part II
Power and sample size
Lecture 12 Multiple Sequence Alignment (October 16, 2014 TRB 120 2pm)
Lecturer: Brown (Video, Slides)
Reading List
Understanding Bioinformatics Chapters 6.2-6.4
CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice
Using ClustalX for multiple sequence alignment
Hidden Markov Models
Lecture 13 Sequence Motifs (October 21, 2014 TRB 120 2pm)
Lecturer: Brown (Video, Slides)
Tutorial Instructor: Wu (Slides, script.py, MA0083.1.sites, random_fragments.fasta, srf_chip.fasta)
Homework (due date: November 3)
Reading List
Understanding Bioinformatics Chapters 4.8-4.10, 6.1, 6.6
Sequence Logos
Finding Candidate Binding Sites for Known Transcription Factors via Sequence Matching
Lecture 14 Phylogenetics (October 23, 2014 TRB 120 2pm)
Lecturer: Brown (Video, Slides)
Reading List
Understanding Bioinformatics Chapters 7 & 8
Building Phylogenetic Trees from Molecular Data with MEGA
MEGA - Molecular Evolutionary Genetics Analysis
Lecture 15 Analysis of Variance (October 28, 2014 TRB 120 2pm)
Lecturer: Zhong (Video, Slides)
Tutorial Instructor: Wu
Homework (due date: November 10)
Reading List
Fundamentals of Biostatistics Chapter 12
Lecture 16 Categorical Data Methods (October 30, 2014 TRB 120 2pm)
Lecturer: Zhong (Video, Slides)
Reading List
Fundamentals of Biostatistics Chapter 10
Lecture 17 Non-Parametric Methods (November 4, 2014 TRB 120 2pm)
Lecturer: Zhong (Video, Slides)
Tutorial Instructor: Wu (Slides)
Homework (due date: November 17)
Reading List
Fundamentals of Biostatistics Chapter 9
Nonparametric tests
Lecture 18 Regression and Correlation (November 6, 2014 TRB 120 2pm)
Lecturer: Zhong (Video, Slides)
Reading List
Fundamentals of Biostatistics Chapter 11
Lecture 19 Proteomics Informatics (November 11, 2014 TRB 120 2pm)
Lecturer: Fenyo (Video, Slides)
Tutorial Instructor: Wu (proteomics_no_replicate.py, proteomics_one_replicate.py, proteomics_three_replicates.py, NUP1-more-stringent-wash.mgf, NUP1-less-stringent-wash.mgf, two-sample-three-replicate-comparison.txt)
Homework (due date: November 24)
Reading List
Mass spectrometric protein identification using the global proteome machine
Protein quantitation using mass spectrometry
Lecture 20 Gene Expression (November 13, 2014 TRB 120 2pm)
Lecturer: Brown (Video, Slides)
Reading List
Understanding Bioinformatics Chapters 15.1, 16.1-16.5
Microarray data analysis: from disarray to consolidation and consensus
Using Bioconductor with Microarray Analysis
Lecture 21 Next Generation Sequencing Informatics I (November 18, 2014 TRB 120 2pm)
Lecturer: Brown (Video, Slides)
Tutorial Instructor: Wu (Document, R-Intro.doc, ArrayData.zip)
Homework (due date: December 1)
Reading List
Next Generation Sequencing-ChIPseq
Fast and accurate long-read alignment with Burrows–Wheeler transform
Lecture 22 Next Generation Sequencing Informatics II (November 20, 2014 TRB 120 2pm)
Lecturer: Brown (Video, Slides)
Lecture 23 Sequence Variation (November 25, 2014 TRB 120 2pm)
Lecturer: Brown (Video, Slides)
Tutorial Instructor: Wu (Chipseq.zip)
Homework (due date: December 8)
Lecture 24 Signal Processing (December 2, 2014 Skirball 3rd Floor Seminar Room 2pm)
Lecturer: Fenyo (Video, Slides)
Tutorial Instructor: Wu (peak-noise-smooth.py, peak-noise-smooth2.py)
Homework (due date: December 9)
Lecture 25 Bioimage Informatics (December 4, 2014 TRB 120 2pm)
Lecturer: Fenyo (Video, Slides)
Reading List
Introduction to the Quantitative Analysis of Two-Dimensional Fluorescence Microscopy Images for Cell-Based Screening
Lecture 26 Experimental Design (December 9, 2014 TRB 120 2pm)
Lecturer: Fenyo (Video, Slides)
Tutorial Instructor: Wu (basic_image_analysis.py, image_for_basic_image_analysis.tif)
Reading List
Designing comparative experiments
Analysis of variance and blocking
Replication
Bias as a threat to the validity of cancer molecular-marker research by David F. Ransohoff, Nat Rev Cancer 5 (2005) 142-149
Additional Reading
Design and Analysis of Experiments by Douglas C. Montgomery
Fundamentals of Biostatistics Chapter 13
Lecture 27 Machine Learning (December 11, 2014 TRB 120 2pm)
Lecturer: Aphinyanaphongs (Video, Slides)
Reading List
ROC Graphs: Notes and Practical Considerations for Researchers by Tom Fawcett
Additional Reading
An Introduction to Statistical Learning by Gareth James et al.
A Gentle Introduction to Support Vector Machines in Biomedicine: Theory and Methods (Volume 1) by Alexander Statnikov et al.
A Gentle Introduction to Support Vector Machines in Biomedicine: Case Studies and Benchmarks (Volume 2) by Alexander Statnikov et al.
Lecture 28 Modeling and Simulation (December 16, 2014 TRB 120 2pm)
Lecturer: Fenyo (Video, Slides)
Tutorial Instructor: Wu
Additional Reading
Modeling Complex Systems by Nino Boccara
Evolutionary Dynamics: Exploring the Equations of Life by Martin A. Nowak
Exam (December 18, 2014 TRB 120 2pm)