Final Report

Due: December 18, 2025

Final Report Instructions

As a group, you will write a report expanding on what you showed in the poster. Because space on a poster is limited, the report is where you can go into detail about your analysis, results, methodology, and decisions as a group.

You do not need to explicitly describe your individual contributions in the report. Your individual contribution will be assessed separately through your:

  • Teamwork Survey
  • Individual Reflection

These two documents are where you should speak to your own role, work, challenges, and contributions to the project. The main report should remain focused on the group’s work and final analysis.

Purpose of the Report

Your final report should serve as a professional, clear, and well-organized record of your data science project. It should demonstrate your ability to apply the tools and concepts from the course, such as programming with R, data wrangling, visualization, analysis, simulation, modeling, and statistics, to investigate a meaningful question using real data.

A reader should be able to:

  • Understand the research question(s), why they matter, and what you found.
  • Follow your reasoning, design choices, and the structure of your analysis.
  • Interpret the figures and results without needing to see your code.
  • See evidence that the group can conduct and communicate data science effectively.

You should feel comfortable sharing this report with a family member, friend, faculty member, or future employer to demonstrate your ability to analyze and communicate with data.

This report represents the group’s work. Your individual contributions will be evaluated separately through the Teamwork Survey and Individual Reflection.

Required Structure and Content

Your report should be written professionally and clearly. It must include the following sections:

1. Project Overview (1–2 paragraphs + 1 figure)

Briefly summarize:

  • The question(s) your group studied
  • Why the question is interesting or important
  • What dataset(s) you used
  • The general approach you took

Include one figure that helps situate the reader. This could be:

  • A simple plot showing the structure of the data
  • A conceptual diagram describing your analysis workflow
  • A schema showing key variables and their relationships

Choose a visual that makes your project easier to understand from the start.

2. Data and Methods

Explain:

  • How your data was obtained and cleaned
  • What variables you focused on and why
  • The analytical methods you used (visualizations, modeling, bootstrapping, hypothesis testing, etc.)
  • Any important decisions, assumptions, or transformations you made

Focus on the reasoning behind your choices, not the code itself.

3. Results

Present the most important results from your analysis. For each major figure or table:

  • Explain what the figure shows
  • Interpret the findings clearly
  • Connect back to your research question

Use high-quality plots with thoughtful design and clear labels.

4. Discussion & Conclusion

Summarize:

  • The key takeaways of your project
  • What your results suggest, or what story the data tells
  • Any reasonable limitations of your analysis
  • Ethical considerations (privacy, bias, fairness, representation, etc.) when relevant

This section should give the reader a clear sense of what your group learned.

What Not to Include in the Report

To keep the main report concise and focused, do not include:

  • Long code blocks (move code into the code/ folder)
  • Exploratory plots or drafts that are not essential to your argument
  • Large tables showing the entire dataset
  • A detailed list of who worked on which parts
  • Unfinished ideas, partial methods, or features you considered but didn’t use
  • Long discussions of future work (a brief remark in the conclusion is fine)

Your goal is to present a polished, coherent narrative—not a full lab notebook.

File Name and Length

  • Your main report must be named report.pdf.
  • Keep the report to 5–6 double‑spaced pages.
  • Brevity is intentional. This should be something you might want to share, and most readers will not read past 2–3 pages.
  • Include a short reproducibility note (1–3 sentences) describing how your code is organized and how someone could rerun your analysis.

Figures and Code

  • Use space wisely. Consider facets or the patchwork package to combine figures.
  • Select figures carefully.
  • If a figure cannot be included, summarize what it would show rather than pasting it in.
  • Do not include large chunks of code or long data‑overview sections in the report.
  • Small tables are acceptable.
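
As a rough illustration of the facets and patchwork suggestion above, here is a minimal sketch. It assumes the ggplot2 and patchwork packages are installed, and it uses the built-in mtcars dataset only as a stand-in for your own data. The object names (p_scatter, p_hist, combined, faceted) are made up for this example.

```r
library(ggplot2)
library(patchwork)

# mtcars is a placeholder dataset (assumption); substitute your own data.
p_scatter <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

p_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10) +
  labs(x = "Miles per gallon", y = "Count")

# patchwork: `+` places the two plots side by side in one figure,
# and plot_annotation() tags the panels A, B so the text can cite them.
combined <- p_scatter + p_hist + plot_annotation(tag_levels = "A")

# Facets: a single plot call split into one panel per cylinder count.
faceted <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_wrap(~cyl) +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")
```

Either object can then be written to a file with ggsave() and placed in the report as one figure. Facets suit the same plot repeated across groups, while patchwork suits combining different kinds of plots.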

Extra Materials

  • Additional figures may be placed in an appendix document (for example, appendix.pdf) and referenced in the main report.
  • Extra code used for analysis should go in a directory named code/.
  • Processed data (<100MB) may be included in data/.
  • For large datasets, include:
    • A link to the dataset online, or
    • Documentation for the API used.
    If the URL is unavailable or breaks in the Gradescope submission, include enough information in the text so the dataset can still be located (for example, the platform name, dataset title, author/organization, or API endpoint).

Acknowledgements (Separate Document)

Write acknowledgements in a separate document: acknowledgements.pdf. No formal style is required. Simply acknowledge the resources and tools that supported your project.

Examples include:

  • My lecture notes
  • Any of the datasets in the course’s data sources
  • YouTube videos or tutorials
  • StackExchange or Wikipedia
  • R or tidyverse documentation
  • AI tools such as ChatGPT

This section is not about where you “sourced” the project—it is about what tools, references, and materials helped you build it. Your project should reflect your own analysis and thinking. Resources may support your work, but the project should not replicate someone else’s report or workflow.

Submission Structure for Gradescope

When you submit your final project to Gradescope, you will upload a single .zip file containing your report, acknowledgements, and supporting materials. A recommended structure is:

final_project_submission/
├── report.pdf                 # Main 5–6 page double-spaced report
├── appendix.pdf               # Optional: extra figures (if used)
├── acknowledgements.pdf       # Required separate document
├── code/                      # Folder with all analysis code
│   ├── cleaning.R             # Example file names only;
│   ├── analysis.R             # yours do not need to match
│   ├── report.qmd
│   └── visualization.R
├── data/                      # Optional: processed data (<100MB)
│   └── cleaned_data.csv
└── README.txt                 # Optional: brief description of structure

Notes for submission:

  • report.pdf must exist and be named exactly that.
  • appendix.pdf is optional, and only needed if you use an appendix for extra figures.
  • acknowledgements.pdf is required and should not be merged into the main report.
  • The code/ folder should contain the code (.qmd or .R) used for your analysis.
  • The data/ folder is only for processed data under 100MB; for larger datasets, include links or API documentation in the report or acknowledgements. If the URL is unavailable or breaks in the Gradescope submission, include enough information in the text so the dataset can still be located (for example, the platform name, dataset title, author/organization, or API endpoint).
  • Compress the contents of the final_project_submission/ folder into a single .zip file before uploading to Gradescope.
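
Before compressing, it may help to check the folder against the notes above. The following is a minimal base R sketch, not a course-provided tool: check_submission() is a hypothetical helper, and it assumes the 100MB limit applies to the combined size of everything in data/, counted as 100 * 1000^2 bytes.

```r
# Hypothetical helper (not part of the course materials): returns a character
# vector describing problems found in a submission folder; empty if none.
check_submission <- function(dir, max_data_bytes = 100 * 1000^2) {
  problems <- character(0)

  # report.pdf and acknowledgements.pdf are the two required documents.
  for (f in c("report.pdf", "acknowledgements.pdf")) {
    if (!file.exists(file.path(dir, f))) {
      problems <- c(problems, paste("missing required file:", f))
    }
  }

  # code/ should contain at least one .R or .qmd file.
  code_files <- list.files(file.path(dir, "code"),
                           pattern = "\\.(R|qmd)$", ignore.case = TRUE)
  if (length(code_files) == 0) {
    problems <- c(problems, "code/ has no .R or .qmd files")
  }

  # data/ is optional; if present, its total size should stay under the limit
  # (assumption: the limit is on the folder total, not on each file).
  data_files <- list.files(file.path(dir, "data"),
                           full.names = TRUE, recursive = TRUE)
  if (sum(file.size(data_files)) >= max_data_bytes) {
    problems <- c(problems, "data/ is 100MB or larger; link to the dataset instead")
  }

  problems
}
```

Calling check_submission("final_project_submission") returns an empty character vector when nothing is flagged. The folder can then be compressed with your operating system's zip tool, or with utils::zip() in R if a zip program is available on your system.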