Computing for Research I

Spring 2012

 

Description:    Students learn to use the primary statistical software packages for data manipulation and analysis, including (but not limited to):  R, R Bioconductor, SAS, SAS macro, and Stata. Additionally, students will learn how to use the division's high-speed cluster-computing environment, how to practice the principles of reproducible research using Sweave in R, and how to use LaTeX and BibTeX for manuscript and presentation development.  This is a three-credit course.

 

Course Organization:  This course is given by the faculty members in the division.  Instructors will take turns giving lectures in their areas of expertise. 

 

Textbooks:  No textbook.  Reading material (primarily found on the web) will be provided as necessary.

 

Prerequisites:  Biometry 700

 

Grading:  Instructors will give short exercises, which are to be completed and turned in to the primary instructor by the Thursday of the week after they are assigned (e.g., assignments given on Tuesday Feb 14 and Thursday Feb 16 are both due on Thursday Feb 23).  Homework assignments together account for 75% of the course grade, with each assignment weighted equally.  A final project accounts for 20% of the course grade.  The remaining 5% reflects class participation.
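As a rough illustration of how the three grading components combine, the sketch below computes a course grade in Python. It assumes every component is scored on a 0-100 scale; the function name course_grade and the example scores are hypothetical and are not part of the course materials.

```python
def course_grade(homework_scores, final_project, participation):
    """Combine component scores (each assumed to be on a 0-100 scale)."""
    if not homework_scores:
        raise ValueError("at least one homework score is required")
    # Each assignment counts equally, so the homework component is a plain mean.
    homework_mean = sum(homework_scores) / len(homework_scores)
    # Weights from the grading policy: 75% homework, 20% final project,
    # 5% class participation.
    return 0.75 * homework_mean + 0.20 * final_project + 0.05 * participation


# Hypothetical scores for three homework assignments, the project, and participation.
print(course_grade([90, 80, 100], final_project=85, participation=100))
```

Because the homework component is a mean over all assignments, a single missed assignment lowers the grade by 75% divided by the number of assignments, under the scoring assumption above.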

 

Homework Policy:   Homework is due by 5pm on the due date.  All homework should be emailed to the primary instructor (garrettm@musc.edu) or turned in at lecture time.  Requesting extensions is strongly discouraged, but extenuating circumstances may occasionally arise.  Each student may therefore request an extension twice, and each extension may be no more than 2 days.  You must notify the primary instructor that you are requesting an extension before the assignment is due.  After two extensions have been used, no further extensions will be granted without a medical note.
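The due-date rule (Thursday of the week after the assignment, at 5pm, with at most a 2-day extension) can be sketched in Python as follows. This is only an illustration under stated assumptions: weeks are taken to start on Monday, the 2-day extension is counted in calendar days, and the function names homework_due and latest_extended_due are hypothetical.

```python
from datetime import date, datetime, time, timedelta

THURSDAY = 3  # date.weekday() numbers Monday as 0, so Thursday is 3


def homework_due(assigned: date) -> datetime:
    """Return 5pm on the Thursday of the week after `assigned`."""
    # Monday of the week containing the assignment date (assumed week start).
    week_start = assigned - timedelta(days=assigned.weekday())
    # Thursday of the following week.
    due_day = week_start + timedelta(days=7 + THURSDAY)
    return datetime.combine(due_day, time(17, 0))


def latest_extended_due(assigned: date) -> datetime:
    """Latest possible deadline if one 2-day extension is granted."""
    # Assumption: the extension is measured in calendar days.
    return homework_due(assigned) + timedelta(days=2)


# The example from the grading policy: assignments given on Tuesday Feb 14
# and Thursday Feb 16 share the same due date.
print(homework_due(date(2012, 2, 14)))
print(homework_due(date(2012, 2, 16)))
print(latest_extended_due(date(2012, 2, 14)))
```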

 

Office Hours:  The primary instructor will hold office hours by appointment.  However, given the nature of the course, the primary instructor may not be knowledgeable about every topic covered.  As a result, you may need additional help from the lecturers to complete assignments.  Be considerate and responsible in scheduling time with course instructors, and recognize that they all have busy schedules.

 

Course Objectives:  Upon successful completion of the course, the student will be able to

1.     Import data and perform simple analyses and produce graphical displays in Stata, SAS and R

2.     Create new functions or commands in each of R, Stata and SAS

3.     Generate professional-quality scientific manuscripts and presentations using LaTeX along with statistical software

4.     Perform standard power and sample size calculations using available software and simulations

5.     Operate the division’s cluster computer with batch computing

 

Primary Instructor:  Elizabeth Garrett-Mayer

Website:  http://people.musc.edu/~elg26/teaching/statcomputing.2010/statcomputingI.2012_files/statcomputingI.2012.htm

Contact Info:  Hollings Cancer Center, Rm 118G

garrettm@musc.edu (preferred mode of contact is email)

792-7764

Time:  Tuesdays and Thursdays, 2:00-3:30

Location:  Cannon 301, Room 305V

Office Hours:  By appointment. Contact via email.

 

Lectures:

 

Date

Lecturer

Topic

Lecture notes, links, etc.

Homework assignment

Th Jan 5

E. Garrett-Mayer

Introduction; Overview and Principles

Lecture1.intro.pptx

 

Tu Jan 10

E. Garrett-Mayer

R: introduction to object-oriented programming

Rlecture1-2.pptx

Rlect1.Jan10.R

SCBC2004.small.csv

Homework1.R.docx

IschemicHeartDisease.csv

IschemicHeartDisease.pdf

 

Th Jan 12

Caitlyn Ellerbe

R: downloading packages/libraries; data input & output

Jan12_PackagesData.ppt

LectureCode_12JAN2012.R

test.RData

Homework.doc

LabRecord.csv

PatientData.csv

Tu Jan 17

Cody Chiuzan

R: graphics

R graphics.ppt

Graphics Code.txt

Prostate.csv

3D Code.txt

framestnc.csv

esoph.csv

ComputingResearchHW.pdf

Th Jan 19

Georgiana Onicescu

R: basic language structure (ifelse, where, looping)

Rpresentation.pdf

Rcode_presentation.txt

hwk.pdf

Tu Jan 24

Andrew Lawson

R: exploratory data analysis; writing commands

R5.lecture.pptx

Rcommands.R

Final-3-3-2011.csv

R5homework.docx

MethylationData.csv

Th Jan 26

Bethany Wolf

R: Bioconductor

Bioconductor_lec.pdf

Bioconductor_HW.pdf

 

Tu Jan 31

Yanqui Weng

R: simulations; random number generation; sampling from distributions

Simulation.presentation.ppt

In.class.code.R

HW.FOR.SIMULATION.doc

Th Feb 2

Stacia DeStantis

 

R: regression commands

Rlecture.pptx

HASSLES.txt

CrossValidatingLogisticRegression.txt

LVH_4markers.csv

 

Tu Feb 7

Amy Wahlquist

Data management:  REDCap

REDCap2012.pptx

 

REDCapHomework2012.docx

QBdata.xlsx

Th Feb 9

Annie Simpson

Data management principles &  Excel

DataCollectionPresentationANS 2_9_2012.ppt

 

Tu Feb 14

E. Garrett-Mayer

Stata: introduction, “immediate” commands

Stata1.pptx

SCBC2004.dta

SCBC2004.v9.dta

Statalecture1.do

http://www.ats.ucla.edu/stat/stata/sk/default.htm

http://www.cpc.unc.edu/research/tools/data_analysis/statatutorial/index.html

http://data.princeton.edu/stata/

 

Stata1.homework.docx

Ohiosmall.dta

Ohiosmall9.dta

 

Th Feb 16

E. Garrett-Mayer

Stata:  graphical displays

Stata.graphics.pptx

Statalecture2.do

Ceramide.csv

Ptdata.GemDox.csv

IschemicHeartDisease.csv

IschemicHeartDisease.pdf

Stata2.homework2.docx

 

Tu Feb 21

E. Garrett-Mayer

Stata: exploratory data analysis

StataEDAandHT.pptx

lecture14.do

Ceramide.csv

Ptdata.GemDox.dta

ceramide.alldata.dta

SCBC2004.v9.dta

Ddata.csv

Homework.StataEDA.docx

 

Th Feb 23

E. Garrett-Mayer

Stata: regression commands

stata4.regression.pptx

stata4.regression.do

sleep.csv

stata4.homework.docx

Tu Feb 28

E. Garrett-Mayer

Stata: programming and do files

stata.programming.pptx

stata.programming.do

 

Th Mar 1

Kyra Robinson

SAS: introduction

Introduction to SAS.pptx

SAS_intro.sas

Transpose vs Arrays.pdf

SAS_intro_hw.pdf

Tu Mar 6

Renee Martin

SAS: macros

SAS.macro.presentation.pptx

SAShandout.pdf

 

Th Mar 8

Ramesh

SAS: IML

SAScode.txt

 

Tu Mar 20

Valerie Durkalski

SAS: proc tabulate and proc report

Proc_Tabulate_howto.pdf

SASPres_20MAR12.pdf

macros.tabulate.HW.docx

vitals.xls

Th Mar 22

Nate Baker

SAS: Gplot

SAS.GPLOT.slides.3.22.2012.ppt

SAS.Gplot.HW.Description.doc

hw_gplot_3_22_12.sas7bdat

Tu Mar 27

Katherine Nicholas

SAS: ODS

ODS.Demo.sas

ODS.Lecture.pptx

ODS.HW.docx

Th Mar 29

Jordan Elm

SAS: array processing

SASArrayProcessing.ppt

HANDOUT242-30.pdf

HR ARRAY statements1.doc

Tu Apr 3

Adrian Nida

Batch processing (using R) and cluster computing

Handout.pdf

Presentation.pdf

Code Examples:

   create.batchfile.R

   mineAminos.batch.R

   mineAminos.R

   mpi.R

   pi.mpi.R

   pi.R

Assignment.pdf

mineAminos.batch.R

mineAminos.R

create.batchfile.R

genome.txt

Th Apr 5

Mulugeta Gebregziabher

LaTeX and BibTeX:  manuscript production

IntrotoLateX.ppt

ex1.tex

ex2.tex

ex3.tex

ex4.tex

ex5.tex

latexshortintro.pdf

latexshorterintro.pdf

Latex Homework.docx

Tu Apr 10

Betsy Hill

Reproducible Research:  Sweave

Sweave.intro.student.notes.2012.pdf

Sweave.example.Rnw

Sweave.example.pdf

Sweave.sty (style file needed to run Sweave.  Save this in the folder that contains the .Rnw file you will be running)

carter.cls (the class file used in the presentation Sweave_intro.tex)

PPRCarter.sty (the style file used in the presentation Sweave_intro.tex)

 

Links:

·        Sweave homepage – Leisch is the originator of the package

http://www.stat.uni-muenchen.de/~leisch/Sweave/

 

·        The Cancer Letter link

http://www.bcm.edu/cancercenter/index.cfm?pmid=12886

 

·        Annals of Applied Statistics paper – scroll all the way to the bottom of the page to see link to the Baggerly paper

http://www.imstat.org/aoas/supplements/issue_3_4.html

 

·        SASweave paper in Journal of Statistical Software

http://www.jstatsoft.org/v19/i08/

 

·        StatWeave user’s manual by Russ Lenth

http://www.stat.uiowa.edu/~rlenth/StatWeave/StatWeave-manual.pdf

 

 

Th Apr 12

Emily Kistner-Griffin

LaTeX and BibTeX:  presentations

statcomputing2012.pdf

statcomputing2012.tex

beameruserguide.pdf

conference-ornate-20min.en.tex

DNA.png

teatime2010.pics.tex

HWbeamer.pdf

Tu Apr 17

Sybil Prince-Nelson

Designing your own website

Build.Website.Presentation.ppt

 

Th Apr 19

Paul Nietert

Sample size calculation software packages

Simulate2.sas

Simulation.Power.2GroupComparison.sas

SampleSizeProblem.docx

Tu Apr 24

 

 

 

 

FINAL PROJECT

 

DUE APRIL 30, 5PM

finalproject.docx

icu.codebook.docx

icu.csv

 

 

 

 

Computing: 

            Downloads and Websites:

                        R:  http://cran.r-project.org/

                        WinEdt:  http://www.winedt.com/index.html

                        Stata website:  http://www.stata.com/

 

            Tutorials:

                        R tutorial:  R-intro.pdf