Computing for Research I

Spring 2013

 

Description:    Students learn to use the primary statistical software packages for data manipulation and analysis, including (but not limited to) R, R Bioconductor, SAS, SAS macro, and Stata. Additionally, students will learn how to use the division's high-speed cluster-computing environment, how to practice the principles of reproducible research using Sweave in R, and how to use LaTeX and BibTeX for manuscript and presentation development.  This is a three-credit course.

 

Course Organization:  This course is given by the faculty members in the division.  Instructors will take turns giving lectures in their areas of expertise. 

 

Textbooks:  No textbook.  Reading material (primarily found on the web) will be provided as necessary.

 

Prerequisites:  Biometry 700

 

Grading:  Instructors will give short exercises, to be completed and turned in to the primary instructor by the Thursday of the week after they are assigned (e.g., assignments given on Tuesday Feb 5 and Thursday Feb 7 are both due on Thursday Feb 14).  Together the assignments account for 75% of the course grade, with each assignment weighted equally.  A final project accounts for 20% of the course grade.  The remaining 5% reflects class participation.
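As an illustration of the grade weights, here is a minimal Python sketch of the calculation. It assumes every component is scored on a 0-100 scale, which the syllabus does not specify. The function name `course_grade` and the example scores are hypothetical.

```python
def course_grade(homework_scores, final_project, participation):
    """Return a weighted course grade on an assumed 0-100 scale.

    homework_scores: list of homework scores, each weighted equally
    final_project:   final project score
    participation:   class participation score
    """
    if not homework_scores:
        raise ValueError("at least one homework score is required")
    # Equal weighting of assignments is the plain average of their scores.
    homework_avg = sum(homework_scores) / len(homework_scores)
    # Weights from the syllabus: 75% homework, 20% final project, 5% participation.
    return 0.75 * homework_avg + 0.20 * final_project + 0.05 * participation


# Hypothetical student: three homeworks, a final project, and participation.
example = course_grade([90, 80, 100], final_project=85, participation=100)
print(round(example, 2))
```

Because the assignments are weighted equally, adding more assignments lowers the weight of each one while the homework total stays at 75%.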

 

Homework Policy:   Homework is due by 5pm on the due date.  All homework should be emailed to the primary instructor (garrettm@musc.edu) or turned in at lecture time.  Asking for extensions is strongly discouraged.  However, extenuating circumstances may occasionally arise.  Therefore, each student may request an extension twice, and each extension is to be no more than 2 days.   You must notify the primary instructor that you are requesting an extension before the assignment is due.  After two extensions have been used, no more will be granted except with a medical note.

 

Office Hours:  The primary instructor will have office hours by appointment.  However, given the nature of the course, the primary instructor may not be knowledgeable about all of the topics covered.  As a result, you may need to seek additional help from the lecturers to complete assignments.  Be considerate and responsible in scheduling time with course instructors, and recognize that they all have busy schedules.

 

Course Objectives:  Upon successful completion of the course, the student will be able to

1.     Import data, perform simple analyses, and produce graphical displays in Stata, SAS, and R

2.     Create new functions or commands in each of R, Stata, and SAS

3.     Generate professional-quality scientific manuscripts and presentations using LaTeX along with statistical software

4.     Perform standard power and sample size calculations using available software and simulations

5.     Operate the division’s cluster computer using batch computing

 

Primary Instructor:  Elizabeth Garrett-Mayer

Website:  http://people.musc.edu/~elg26/teaching/statcomputing.2013/statcomputingI.2013.htm

Contact Info:  Hollings Cancer Center, Rm 118G; garrettm@musc.edu (preferred mode of contact is email); 792-7764

Time:  Tuesdays and Thursdays, 2:00-3:30

Location:  Cannon 301

Office Hours:  By appointment. Contact via email.

TA Office Hours:  TBA

 

Lectures:

 

Date

Lecturer

Topic

Lecture notes, links, etc.

Homework assignment

Tu Jan 8

EGM

Introduction; Overview and Principles

Lecture1.Intro.pptx

 

Th Jan 10

Katherine Nicholas

SAS: introduction

Intro Lecture.pptx

intro_demo.sas

intro_data1.csv

intro_data2.csv

HWdat1.csv

HWdat2.xls

Tu Jan 15

Katherine Nicholas

SAS: ODS

ODS Demo.sas

ODS Lecture.pptx

ODS HW.docx

Th Jan 17

Ramesh

SAS: IML

SAS IMLa.pptx

Homework.docx

Tu Jan 22

Valerie Durkalski

SAS: proc tabulate and proc report

SASPres_22JAN13.pdf

Proc_Tabulate_how_to_-_version_2.0.pdf

Th Jan 24

Nate Baker

SAS: Gplot

SAS GPLOT slides 1 24 2013.pdf

Data for Examples.zip

SAS Gplot HW Description.doc

Hw_gplot_1_24_13.sas7bdat

Tu Jan 29

Renee Martin

SAS: macros

SAS macro presentation2013.pptx

Macros.tabulate.HW.docx

Vitals.xls

Th Jan 31

Jordan Elm

SAS: array processing

SASArrayProcessing.ppt

HANDOUT242-30.pdf

HARRAYstatements1.doc

Tu Feb 5

Sybil Prince-Nelson

Designing your own website

Spn.buildwebsite.ppt

https://www.musc.edu/infoservices/web/html1/publish.html

Website.homework.docx

Th Feb 7

EGM

Stata: introduction, “immediate” commands

Stata1.pptx

SCBC2004.dta

Statalecture1.do

http://www.ats.ucla.edu/stat/stata/sk/default.htm

http://www.cpc.unc.edu/research/tools/data_analysis/statatutorial/index.html

http://data.princeton.edu/stata/ 

Stata1.homework.docx

Ohiosmall.dta

 

Tu Feb 12

EGM

Stata: graphical displays

Stata.graphics.pptx

Statalecture2.do

Ceramide.csv

Ptdata.GemDox.csv

IschemicHeartDisease.csv

IschemicHeartDisease.pdf

Stata2.homework2.docx

 

Th Feb 14

EGM

Stata: exploratory data analysis

Lecture12.do

StataEDAandHT.pptx

Ceramide.csv

Ptdata.GemDox.csv

SCBC2004.dta

Ceramide.alldata.dta

Ddata.csv

Homework.StataEDA.docx

Tu Feb 19

EGM

Stata: regression commands

Stata.regression.pptx

Sleep.csv

Stata.regression.do

Stata.regression.homework.docx

Th Feb 21

EGM

Stata: programming and do files

Stata.programming.do

Stata.programming.pptx

Stata.programming.hw.docx

Tu Feb 26

EGM

Data management: principles &  Excel

Data.management.pptx

Data.ClinicalTools.pdf

 

Th Feb 28

Amy Wahlquist

Data management: REDCap

REDCap2013.pptx

REDCap Homework 2013.docx

QBData.xlsx

Tu Mar 5

EGM

R: introduction to object-oriented programming

Rlecture1-2.pptx

Rintro.R

SCBC2004.small.csv

Homework.R1.docx

IschemicHeartDisease.csv

IschemicHeartDisease.pdf

Th Mar 7

Cody Chiuzan

R: downloading packages/libraries; data input & output

R lib. Data Input&Output.ppt

R pack.data in_out.R

 

Tu Mar 19

Delia Voronca

R: graphics

R Graphics.pptx

RGraphics.code.r

HW – Graphics.docx

Pleural.xls

Th Mar 21

Georgiana Onicescu

R: basic language structure (ifelse, where, looping)

Rpresentation.pdf

Rcode_presentation.txt

Hwk.pdf

Tu Mar 26

EGM

R: exploratory data analysis; writing commands

Rcommands.pptx

Rcommands.R

Final-3-3-2011.csv

Rhomework.docx

MethylationData.csv

 

Th Mar 28

EGM

R: regression commands

Rlecture.pptx

 

Tu Apr 2

Yanqui Weng

R: simulations; random number generation; sampling from distributions

Simulations.presentation.ppt

InClassCode.R

HW.FOR.SIMULATION.doc

Th Apr 4

Beth Wolf

R: Bioconductor

An Introduction to Bioconductor_2013.pptx

Bioconductor_R_Code_4_4_13.R

Bioconductor_HW.pdf

 

Tu Apr 9

EGM

Sample size calculation software packages

SampleSize&Power.pptx

Interaction.sample.size.R

SampleSizeProblem.docx

Th Apr 11

Adrian Nida

Cluster computing, etc.

Presentation.pdf

Article.pdf

Code Examples:

   create.batchfile.R

   mineAminos.batch.R

   mineAminos.R

   mpi.R

   pi.mpi.R

   pi.R

Assignment.pdf

mineAminos.batch.R

mineAminos.R

create.batchfile.R

genome.txt

Tu Apr 16

Cody Chiuzan

LaTeX and BibTeX:  manuscript production

Tex.instructions.docx

Greenberg Intro

Latex Presentation

Practice1

Practice2

HW

Senn 7Myths

Th Apr 18

Emily Kistner-Griffin

LaTeX and BibTeX:  presentations

Statcomputing2013.pdf

Statcomputing2013.tex

Beameruserguide.pdf

Conference-ornate-20min.en.tex

DNA.png

Teatime.2010.pics.tex

HWbeamer.pdf

Tu Apr 23

Betsy Hill

Reproducible Research:  Sweave

Sweave.intro.student.notes.2012.pdf

Sweave.example.Rnw

Sweave.example.pdf

Sweave.sty (style file needed to run Sweave.  Save this in the folder that contains the .Rnw file you will be running)

carter.cls (the class file used in the presentation Sweave_intro.tex)

PPRCarter.sty (the style file used in the presentation Sweave_intro.tex)

 

Links:

·        Sweave homepage – Leisch is the originator of the package

http://www.stat.uni-muenchen.de/~leisch/Sweave/

 

·        The Cancer Letter link

http://www.bcm.edu/cancercenter/index.cfm?pmid=12886

 

·        Annals of Applied Statistics paper – scroll all the way to the bottom of the page to see link to the Baggerly paper

http://www.imstat.org/aoas/supplements/issue_3_4.html

 

·        SASweave paper in Journal of Statistical Software

http://www.jstatsoft.org/v19/i08/

 

·        StatWeave user’s manual by Russ Lenth

http://www.stat.uiowa.edu/~rlenth/StatWeave/StatWeave-manual.pdf

 

Th Apr 25

Caitlyn Ellerbe

Mendeley

Mendeley_Spring2013.pptx

FINAL PROJECT

DUE MAY 3

Finalproject.docx

Finalprojectdata.csv

Computing:

            Downloads and Websites:

                        R:  http://cran.r-project.org/

                        Stata website:  http://www.stata.com/

            Tutorials:

                        R tutorial:  R-intro.pdf