REU Research Intern Positions Available Fall 2013 and Spring 2014

Undergraduate Research Intern positions are available on campus for the academic year (fall 2013 and spring 2014). These positions are funded under the NSF Research Experiences for Undergraduates (REU) program and provide an $8,000 stipend paid over the academic year. Undergraduate students from information science, the social sciences, and computer science who are interested in participating in an interdisciplinary research team are encouraged to apply by September 9, 2013.

Research Project Description

The goal of the SoCQA research project, Social Computing for Qualitative Analysis, is to develop and test an innovative Natural Language Processing (NLP) and Machine Learning (ML) based research tool that supports a computer-human partnership for qualitative social science. The innovation of the project is the integration of human processing with computational information extraction and active learning in a tool to support a commonly applied data analysis approach in the social sciences: content analysis, the extraction of structured research data from unstructured sources. The project is funded by the NSF under the SOCS program for three years and features interdisciplinary research between the social sciences and information/computer science. The project is led by Nancy McCracken and Kevin Crowston in the School of Information Studies.

A working prototype research support system has been developed during the first two years of this research project. During the third year, starting Fall 2013, the system will be tested on one or more pilot projects for content analysis. The first pilot project, already underway, conducts research on the dynamics of distributed groups. During the third year, the project will fund two undergraduate research interns to assist with testing and development of the system.

Two Student Research Intern Positions:

Student 1: Content Analysis Intern tasks and responsibilities:

One undergraduate intern will be funded for the content analysis research with the pilot project(s). This student will work with a postdoctoral researcher and a graduate student on the content analysis of emails, as part of a study of leadership in open source software groups. The student will be trained in the methodology of content analysis, which is applicable to other information and social science research. The intern will
• annotate email text with the leadership codes; this involves reading the email in an annotation tool, choosing phrases which show aspects of leadership according to a codebook developed for this task and assigning the correct code,
• use the prototype software tool to correct and to analyze the correctness of the automatic coding,
• and contribute to the analysis and reports of leadership in the software teams.

The outcome of the content analysis of the pilot project will be a contribution to research in distributed groups. The student will learn about content analysis, a research methodology in information science and social sciences.

Student 2: Machine Learning Intern tasks and responsibilities:

The second student will be engaged in testing from the Natural Language Processing (NLP) and Machine Learning (ML) perspective, carrying out quantitative experiments that vary ML features and parameters to see which types of codes are learned most effectively. The student will work with a doctoral student, and the tasks may include
• conducting experiments with the machine learning part of the system, written as a Java application, in order to set optimal features and parameters for different types of data, and keeping records of results from the machine learning software output,
• assisting in the analysis of the email data that has been assigned leadership codes in the pilot project and experimenting with the textual features that can improve the performance of Machine Learning algorithms, and
• helping analyze and report on the results of the machine learning.

The outcome of the machine learning part of the project will be a contribution to research in machine learning applications in computational linguistics. The student will gain experience in an application of machine learning, a technique widely used in data mining and other computer science applications.

Expectations of students:
Each REU intern for the project will be expected to work approximately 10-15 hours per week. This will include a weekly project meeting, scheduled during the fall semester for every Monday from 2:15 to 3:15. At this meeting, the entire research team will discuss progress, and we expect that attendance at these meetings will make a strong contribution to the research experience. We also share milestone progress reports on different aspects of the project with the entire group, and we will expect the undergraduates to contribute to those reports as well. Overall project supervision is provided by Dr. Nancy McCracken, who will take the lead in setting requirements and performing evaluation. Training will be provided in the background needed to understand the research and in using the project's software tools.

Student stipend:
Each intern will receive an $8,000 stipend to be paid in installments over the course of the academic year.

Skills required:

Students should have excellent academic credentials and references for good work experience, including reliability and dedication. A general comfort level with using software and analyzing data is required of both students. Some programming skill would be a plus for the machine learning intern.

Application Process

To apply for either of these positions, please send a cover letter describing your background and your interest in one of the intern positions, together with a résumé, references, and an (unofficial) transcript, to Prof. Nancy McCracken by September 9, 2013, to start in the fall semester.