Final Project
The goal of the Final Project is to prove or disprove a hypothesis using skills learned in this class, and demonstrate understanding of those techniques through explaining them to others. It’s open-ended — you decide what you’re investigating. We’re looking for you to be creative, and just the right amount of ambitious.
Proposal
Once you start
Create a new notebook to do the actual analysis; that is what you’ll turn in.
Go back and find any information that’s available around the data, to get a better understanding of what it contains and means.
Might include a data dictionary
Might involve poking around a government agency’s web site to understand their processes
Understand what all the different columns and values represent
If you answer your initial research question easily and haven’t yet met the requirements below, ask and answer follow-up question(s).
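As a rough illustration of getting to know a dataset, the sketch below inspects column types, missing values, and the distinct values in a column. The DataFrame and its column names (`agency`, `complaint_type`, `days_to_close`) are made up for this example; in your project the data would come from your real source.

```python
import pandas as pd

# Hypothetical stand-in data; replace with your own dataset.
df = pd.DataFrame(
    {
        "agency": ["DOT", "DOT", "DSNY", None],
        "complaint_type": ["Pothole", "Pothole", "Missed Collection", "Noise"],
        "days_to_close": [3.0, 10.0, None, 1.0],
    }
)

# What columns exist, and what type of data does each hold?
column_types = df.dtypes

# How many values are missing in each column?
missing_counts = df.isna().sum()

# Which distinct values appear in a categorical column, and how often?
type_counts = df["complaint_type"].value_counts()

# What is the range and spread of a numeric column?
closing_summary = df["days_to_close"].describe()

print(column_types)
print(missing_counts)
print(type_counts)
print(closing_summary)
```

Comparing this output against the data dictionary (if one exists) is a quick way to spot columns or values you don’t yet understand.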
Analysis requirements
In addition to the applicable general assignment requirements, your submission should:
Read like a blog post - 35 points
Pretend you’re explaining to a peer who hasn’t taken this class. You don’t need to teach them to code, but they should be able to follow what’s going on.
Re-state the question, hypothesis, and data source(s) with link(s)
Walk the reader through what you’re doing in every step and what they should be taking away from it.
You are more than welcome to inject personality in there; it doesn’t need to be dry.
Use text cells with Markdown for formatting.
You’ll need to change the cell type to Markdown.
If you hit any dead ends in your analysis, leave them in.
For example, include charts that you generate that may not show anything interesting and explain what you are choosing to look at instead.
You should still clean up unused/broken code to keep your notebook readable.
You may need to tweak your research question as you go. Show and explain why.
Have a conclusion that speaks to your question and hypothesis.
Use pandas - 15 points
Not be trivial - 35 points - requiring:
At least 40 lines of code to come to a conclusion
That code should be relevant to answering your question. In other words, having 40 lines of
print("hello world")
wouldn’t count.
If you meet all the other requirements, you will likely be well over this number.
Transforming data through grouping, merging, and/or reshaping of DataFrames
Operations that aren’t easily done in a spreadsheet.
Have a visualization (chart or map) of some kind - 15 points
Follow best practices
If you answer the first question easily, that’s fine; dig into / build off of it. Go deep, not broad.
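To make the "grouping, merging, and/or reshaping" requirement above concrete, here is a minimal sketch that does all three in sequence. The data, the column names, and the per-1,000-residents metric are invented for illustration and are not part of the assignment.

```python
import pandas as pd

# Hypothetical data: complaint counts and population by borough.
requests = pd.DataFrame(
    {
        "borough": ["Bronx", "Bronx", "Queens", "Queens", "Queens"],
        "year": [2022, 2023, 2022, 2023, 2023],
        "complaints": [10, 14, 20, 5, 7],
    }
)
population = pd.DataFrame(
    {
        "borough": ["Bronx", "Queens"],
        "residents": [1000, 2000],
    }
)

# Grouping: total complaints per borough per year.
totals = requests.groupby(["borough", "year"], as_index=False)["complaints"].sum()

# Merging: attach each borough's population to its totals.
merged = totals.merge(population, on="borough", how="left")
merged["per_1000"] = merged["complaints"] * 1000 / merged["residents"]

# Reshaping: one row per borough, one column per year.
rates = merged.pivot(index="borough", columns="year", values="per_1000")

print(rates)
```

Normalizing by population after a merge is one example of a step that is awkward in a spreadsheet but short in pandas.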
Examples
Submission
DO NOT WAIT UNTIL THE LAST MINUTE TO SUBMIT. Leave yourself time to fix any issues that come up while submitting, such as your computer crashing.
Please try to preserve anonymity.
Keep your name/username out of the notebook title, text cells, file paths, etc.
Hold off on responding to comments on your notebook until you get your Project grade.
Don’t leave any sensitive information in the notebook, such as:
API keys
Personally-identifiable information (PII)
Because it’s the end of the course and your peers are doing the reviews, there will be no extensions.
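On the point above about not leaving API keys in the notebook: one common approach is to read the key from an environment variable instead of pasting it into a cell. This is only a sketch; the variable name `MY_API_KEY` and the helper `get_api_key` are hypothetical, and your data source may document its own preferred method.

```python
import os


def get_api_key(name="MY_API_KEY"):
    """Return the key stored in the environment variable `name`.

    Raises a RuntimeError with a clear message if the variable is unset,
    so the key itself never has to appear in the notebook.
    """
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(
            f"Set the {name} environment variable before running this notebook."
        )
    return key
```

A cell would then call `get_api_key()` and pass the result to the request. Avoid printing the key, since cell output is saved in the notebook too.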
Confirming you meet the requirements
The instructor and Reader don’t have bandwidth to review everyone’s full notebooks. Therefore, to be fair to everyone, we will deny any requests to have notebooks reviewed end to end, aside from appeals to the peer grade. In other words, please don’t ask us “I think I’m done — can you make sure my Final Project is ok?” That said, we are more than happy to answer specific questions and help troubleshoot specific sections.
To confirm you meet the requirements prior to submitting, you can:
Take a pass through your own notebook, pretending you are grading someone else
Ask someone else in the class to do so
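As a supplement to the self-review above, a short script can give a rough count of the code lines in a notebook. The sketch below relies on the fact that an `.ipynb` file is JSON with a list of `cells`, each having a `cell_type` and a `source`. The function names and the file name in the usage note are hypothetical, and the count ignores whether the code is relevant to your question, so treat it as a sanity check only.

```python
import json


def load_notebook(path):
    """Parse an .ipynb file (which is JSON) into a dictionary."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def count_code_lines(notebook):
    """Count non-blank, non-comment lines across all code cells."""
    total = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        # The source may be stored as one string or as a list of strings.
        text = source if isinstance(source, str) else "".join(source)
        for line in text.splitlines():
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                total += 1
    return total


# Small in-memory example with one Markdown cell and one code cell.
example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# My question\n"]},
        {
            "cell_type": "code",
            "source": [
                "import pandas as pd\n",
                "\n",
                "# load data\n",
                "df = pd.DataFrame()\n",
            ],
        },
    ]
}

print(count_code_lines(example))  # prints 2
```

For a real notebook, something like `count_code_lines(load_notebook("final_project.ipynb"))` would work, with the file name swapped for your own.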