Undergraduate Summer Internship Program in Applied Statistics - App Deadline: (3/15)

The Department of Statistics seeks applications to its undergraduate summer internship program in applied statistics from Columbia College, Engineering School, General Studies, and Barnard College students who will be enrolled as undergraduates in the Fall semester of 2014. Interns undertake mentored research and data analysis projects under the direction of Statistics Department faculty. The project areas for the summer of 2014 are outlined below. Interns receive a stipend and housing for the six-week period of the internship, which starts May 27th and ends July 3rd.

Applicants should deliver their application materials to Ms. Dood Kalicharan in the Statistics Department:

Ms. Dood Kalicharan
Department of Statistics
1255 Amsterdam Avenue,
Room 1005, School of Social Work Building
New York, NY 10027

Please submit a transcript and a statement of interest. The statement should indicate which project area is of primary interest and describe the applicant's experience and coursework in probability and statistics, as well as the applicant's programming skills. Review of applications begins March 15th.

Project 1: Case Studies in Statistical Reproducibility. There is significant interest within the computational sciences in understanding why reproducibility fails. In this project we will examine the statistical reasons an empirical finding may not replicate in an independent study (for example, insufficient power, misuse of p-values, or poor reporting practices), and build a library of case studies as exemplars. See for example

Project 2: Understanding Model and Data Stability. The phrase ‘big data’ is often heard in the context of transformative data-driven research. In this project we investigate the reliability of findings drawn from the big data context, including the sensitivity of findings to model stability and to perturbations in the underlying data.

Project 3: Linking Code and Data to Findings. As scientists increasingly rely on computation as a key part of their research toolbox, they are using and generating more code than ever before. In combination with ‘big data’ and the increased amounts of data sharing in the sciences, linking the data and code to published findings seems a natural step in ensuring reproducibility of computational results. This project will accelerate an open-source collaborative research pilot called that seeks to solve this problem.

Project 4: Clinical Epidemiology. Data from large-scale clinical epidemiological studies of cardiovascular and pulmonary disease are available for analysis. The data include genetic, demographic, socioeconomic, physiologic, and anatomic variables. A student undertaking this project would gain exposure to data management, study design, and data analysis in biomedical research.

