Goals

Course projects give you the opportunity to work on a larger project on a topic of your choosing. If you are a graduate student or an undergraduate researcher, we encourage you to choose a project that is directly related to your research. The goals are both to learn how to work on bigger data-related computing projects beyond the confines of an exercise, and to learn the specific computing tools you will need for your research while you have access to experienced teachers who can help you figure out how to make them work.

What does a project involve?

Projects can involve programming, databases, or both. They should be on something you are excited about.

As a rough guideline, projects should represent ~30-40 hours of work. Some class time will be provided for working on projects.

Project Proposal

The course schedule should include a deadline for submitting project proposals (Version Control Projects week).

These proposals should be no more than 1 page long and describe:

  • what you are planning on working on,
  • what tools you are thinking about using,
  • and any questions you have about how to best proceed.

Submit the project proposal as directed by the Project Proposal exercise.

Your instructor should provide feedback on these proposals about whether the work is sufficient for the project, whether it is reasonable to accomplish it in ~30-40 hours, and what tools or approaches you should look into for the project.

Final Project Submission

Your final submission should include:

  1. The code and/or data involved in the project, in your class project GitHub repo.
  2. A 1-2 page description of what you did and how the resulting project works, as a .txt file in your class project GitHub repo. For example, if the project is primarily based on code, explain how the instructor should run the code and what output to expect. If the project involved taking a bunch of poorly structured data, tidying it, and putting it into a database: describe the original state of the data; describe the tables in the new database and how they relate to one another; and provide an example of a query that extracts useful information from the database.
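
As an illustration of the kind of query example a database project description could include, here is a minimal sketch using Python's built-in sqlite3 module. The table names (sites, surveys), columns, and values are hypothetical and made up purely for illustration; your own description would use your project's actual tables and data.

```python
import sqlite3

# Hypothetical schema for illustration only: two tables linked by site_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sites (
        site_id INTEGER PRIMARY KEY,
        site_name TEXT NOT NULL
    );
    CREATE TABLE surveys (
        survey_id INTEGER PRIMARY KEY,
        site_id INTEGER NOT NULL REFERENCES sites(site_id),
        species TEXT NOT NULL,
        individual_count INTEGER NOT NULL
    );
""")

# Made-up example rows, not real data.
conn.executemany(
    "INSERT INTO sites VALUES (?, ?)",
    [(1, "North plot"), (2, "South plot")],
)
conn.executemany(
    "INSERT INTO surveys VALUES (?, ?, ?, ?)",
    [(1, 1, "DM", 4), (2, 1, "PP", 2), (3, 2, "DM", 7)],
)

# Example query: total individuals counted at each site,
# combining the two tables through their shared site_id column.
query = """
    SELECT sites.site_name, SUM(surveys.individual_count) AS total
    FROM surveys
    JOIN sites ON surveys.site_id = sites.site_id
    GROUP BY sites.site_name
    ORDER BY sites.site_name;
"""
totals = conn.execute(query).fetchall()
for site_name, total in totals:
    print(site_name, total)

conn.close()
```

In a written description, a sentence or two explaining which columns link the tables (here, site_id) and what question the query answers is usually enough for the instructor to follow along.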

Examples of previous projects

  • Cleaned four datasets on Burmese python diets, which had varying lengths, structures, and creators, so that they could be used together for future analyses
  • Using spatial occurrences of species from two openly available sources, created spatial maps of occurrences and extracted corresponding climate variables
  • Used a Bioconductor package to compare transcriptomes of a bird species that underwent two experimental toxin treatments
  • Created a web application for users to explore data associated with tagged fish using the Shiny package