Computing for Research I
Spring 2012
Description: Students learn to use the primary
statistical software packages for data manipulation and analysis, including
(but not limited to): R, R Bioconductor, SAS,
SAS macro, and Stata. Additionally, students will
learn how to use the division's high-speed cluster-computing
environment, how to practice the principles of reproducible research using Sweave in R, and how to use LaTeX
and BibTeX for manuscript and presentation
development. This is a three-credit course.
Course Organization: This
course is given by the faculty members in the division. Instructors will take turns giving lectures
in their areas of expertise.
Textbooks: No
textbook. Reading material (primarily
found on the web) will be provided as necessary.
Prerequisites: Biometry 700
Grading:
Instructors will give short exercises, each to be completed and turned in to
the primary instructor by the Thursday of the week following the week in which it was
assigned (e.g., assignments given on Tuesday Feb 14 and Thursday Feb 16 are
both due on Thursday Feb 23). The assignments are weighted equally and
together count for 75% of the course grade. A final project accounts for
20% of the course grade, and class participation accounts for the
remaining 5%.
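The weighting described above can be sketched in a few lines of Python. This is only an illustration: the function name, the argument names, and the assumption that every component is scored on a 0-100 scale are hypothetical and are not stated in the syllabus.

```python
# Weights taken from the Grading paragraph above.
HOMEWORK_WEIGHT = 0.75
PROJECT_WEIGHT = 0.20
PARTICIPATION_WEIGHT = 0.05


def course_grade(homework_scores, project_score, participation_score):
    """Combine component scores into a course grade.

    Assumption (not from the syllabus): every score is on a 0-100 scale.
    """
    if not homework_scores:
        raise ValueError("at least one homework score is required")
    # Each assignment counts equally, so the homework component is a plain mean.
    homework_mean = sum(homework_scores) / len(homework_scores)
    return (HOMEWORK_WEIGHT * homework_mean
            + PROJECT_WEIGHT * project_score
            + PARTICIPATION_WEIGHT * participation_score)
```

For example, homework scores of 80 and 90, a project score of 70, and full participation credit would combine as 0.75 × 85 + 0.20 × 70 + 0.05 × 100.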
Homework Policy: Homework is
due by 5pm on the due date. All homework should be emailed to the primary instructor (garrettm@musc.edu) or turned in at lecture
time. Asking for extensions on homework is
strongly discouraged. However, extenuating circumstances may occasionally
arise. Therefore, each student may request an extension on
homework twice, and each extension is to be no more than 2 days.
You must notify the primary instructor that you are requesting an extension
before the assignment is due.
After two extensions have been used, no more will be granted except
with a medical note.
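The due-date rule from the Grading section, combined with the 5pm deadline and the 2-day extension limit above, can be sketched as follows. The function and constant names are hypothetical, and the sketch assumes that a week starts on Monday and that extension days are counted as calendar days; the syllabus does not specify either.

```python
from datetime import date, datetime, time, timedelta

THURSDAY = 3            # date.weekday() numbers Monday as 0, so Thursday is 3
MAX_EXTENSION_DAYS = 2  # limit stated in the homework policy


def homework_due(assigned, extension_days=0):
    """Return the deadline for an assignment given on the date `assigned`.

    The deadline is 5pm on the Thursday of the week after the assignment
    was given, plus any granted extension (assumed to be calendar days).
    """
    if not 0 <= extension_days <= MAX_EXTENSION_DAYS:
        raise ValueError("an extension may be at most 2 days")
    # Assumption: weeks start on Monday.
    monday = assigned - timedelta(days=assigned.weekday())
    due_day = monday + timedelta(days=7 + THURSDAY + extension_days)
    return datetime.combine(due_day, time(17, 0))


# The example from the Grading section: both assignments are due Thursday Feb 23.
print(homework_due(date(2012, 2, 14)))
print(homework_due(date(2012, 2, 16)))
```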
Office Hours: The
primary instructor will have office hours by appointment. However, given the nature of the course, the
primary instructor may not be knowledgeable about all of the topics
covered. As a result, additional help from the lecturers
may be needed to complete assignments. Be considerate and responsible in scheduling
time with course instructors, and recognize that they all have busy schedules.
Course Objectives: Upon successful completion of
the course, the student will be able to
1. Import data, perform simple analyses, and produce graphical displays in Stata, SAS, and R
2. Create new functions or commands in each of R, Stata, and SAS
3. Generate professional-quality scientific manuscripts and presentations using LaTeX along with statistical software
4. Perform standard power and sample size calculations using available software and simulations
5. Operate the division’s cluster computer using batch computing
Primary Instructor: Elizabeth Garrett-Mayer
Website:
Contact Info: Hollings Cancer Center, Rm 118G; garrettm@musc.edu (preferred mode of contact is email); 792-7764
Time: Tuesdays and Thursdays, 2:00-3:30
Location: Cannon 301, Room 305V
Office Hours: By appointment. Contact via email.
Lectures:
Date | Lecturer | Topic | Lecture notes, links, etc. | Homework assignment |
Th Jan 5 | E. Garrett-Mayer | Introduction; Overview and Principles | | |
Tu Jan 10 | E. Garrett-Mayer | R: introduction to object-oriented programming | | |
Th Jan 12 | Caitlyn Ellerbe | R: downloading packages/libraries; data input & output | | |
Tu Jan 17 | Cody Chiuzan | R: graphics | | |
Th Jan 19 | Georgiana Onicescu | R: basic language structure (ifelse, where, looping) | | |
Tu Jan 24 | Andrew Lawson | R: exploratory data analysis; writing commands | | |
Th Jan 26 | Bethany Wolf | R: Bioconductor | | |
Tu Jan 31 | Yanqui Weng | R: simulations; random number generation; sampling from distributions | | |
Th Feb 2 | Stacia DeStantis | R: regression commands | | |
Tu Feb 7 | Amy Wahlquist | Data management: REDCap | | |
Th Feb 9 | Annie Simpson | Data management principles & Excel | | |
Tu Feb 14 | E. Garrett-Mayer | Stata: introduction, “immediate” commands | http://www.ats.ucla.edu/stat/stata/sk/default.htm http://www.cpc.unc.edu/research/tools/data_analysis/statatutorial/index.html http://data.princeton.edu/stata/ | |
Th Feb 16 | E. Garrett-Mayer | Stata: graphical displays | | |
Tu Feb 21 | E. Garrett-Mayer | Stata: exploratory data analysis | | |
Th Feb 23 | E. Garrett-Mayer | Stata: regression commands | | |
Tu Feb 28 | E. Garrett-Mayer | Stata: programming and do files | | |
Th Mar 1 | Kyra Robinson | SAS: introduction | | |
Tu Mar 6 | Renee Martin | SAS: macros | | |
Th Mar 8 | Ramesh | SAS: IML | | |
Tu Mar 20 | Valerie Durkalski | SAS: proc tabulate and proc report | | |
Th Mar 22 | Nate Baker | SAS: Gplot | | |
Tu Mar 27 | Katherine Nicholas | SAS: ODS | | |
Th Mar 29 | Jordan Elm | SAS: array processing | | |
Tu Apr 3 | Adrian Nida | Batch processing (using R) and cluster computing | Code Examples: | |
Th Apr 5 | Mulugeta Gebregziabher | LaTeX and BibTeX: manuscript production | ex1.tex | |
Tu Apr 10 | Betsy Hill | Reproducible Research: Sweave | Sweave.intro.student.notes.2012.pdf; Sweave.sty (style file needed to run Sweave; save it in the folder that contains the .Rnw file you will be running); carter.cls (the class file used in the presentation Sweave_intro.tex); PPRCarter.sty (the style file used in the presentation Sweave_intro.tex). Links: Sweave homepage (Leisch is the originator of the package) http://www.stat.uni-muenchen.de/~leisch/Sweave/; The Cancer Letter link http://www.bcm.edu/cancercenter/index.cfm?pmid=12886; Annals of Applied Statistics paper (scroll all the way to the bottom of the page to see the link to the Baggerly paper) http://www.imstat.org/aoas/supplements/issue_3_4.html; SASweave paper in Journal of Statistical Software http://www.jstatsoft.org/v19/i08/; StatWeave user’s manual by Russ Lenth http://www.stat.uiowa.edu/~rlenth/StatWeave/StatWeave-manual.pdf | |
Th Apr 12 | Emily Kistner-Griffin | LaTeX and BibTeX: presentations | | |
Tu Apr 17 | Sybil Prince-Nelson | Designing your own website | | |
Th Apr 19 | Paul Nietert | Sample size calculation software packages | | |
Tu Apr 24 | | | | |
FINAL PROJECT | | DUE APRIL 30, 5PM | | |
Computing:
Downloads and Websites:
WinEdt: http://www.winedt.com/index.html
Stata website: http://www.stata.com/
Tutorials:
R tutorial: R-intro.pdf