Homework 1#
General assignment information. Note that this isn’t a template notebook, hence there’s no 🚀 above. You will create a blank notebook for this one.
Tutorials#
-
Beginning up to “GroupBy object attributes”
“Aggregation” up to “The aggregate() method”
Coding#
You’ll do the following in a notebook. Make it read like a blog post. Pretend you’re explaining to a peer who hasn’t taken this class. You don’t need to teach them to code, but they should be able to follow what’s going on.
Steps#
-
It must have at least one numeric column.
Don’t spend too long on this step.
If there’s more than one numeric column, pick one.
Create a new notebook.
Using pandas:
Read in the data.
Compute:
The mean
The median
The mode
Do a groupby() with an aggregation.
Do the same thing, but with pure Python (without pandas).
Write a conclusion, covering both:
The takeaways of the analysis
Your reflections on the process
Did you use an external source, including generative AI? Please explain, or say that you didn’t.
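The steps above can be sketched roughly as follows. This is a minimal illustration, not a model answer: the `city` and `rating` columns and the inline `CSV_TEXT` data are made up for the example, and your own dataset would normally be read from a file instead. It computes the same statistics twice, once with pandas and once with only the standard library, so the two approaches can be compared.

```python
import csv
import io
import statistics
from collections import defaultdict

import pandas as pd

# Hypothetical stand-in data (not part of the assignment). With a real
# dataset you would pass a file path to pd.read_csv() and open() instead.
CSV_TEXT = """city,rating
Austin,4
Austin,5
Boston,3
Boston,4
Boston,4
"""

# --- With pandas ---
df = pd.read_csv(io.StringIO(CSV_TEXT))
pandas_mean = df["rating"].mean()
pandas_median = df["rating"].median()
# mode() returns a Series because ties can produce more than one value.
pandas_mode = df["rating"].mode().tolist()
# groupby() with an aggregation: the mean rating within each city.
pandas_group_means = df.groupby("city")["rating"].mean().to_dict()

# --- With pure Python (no pandas) ---
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
ratings = [int(row["rating"]) for row in rows]
python_mean = statistics.mean(ratings)
python_median = statistics.median(ratings)
# multimode() also returns every tied value, like pandas' mode().
python_mode = statistics.multimode(ratings)

# Group the ratings by city by hand, then aggregate each group.
ratings_by_city = defaultdict(list)
for row in rows:
    ratings_by_city[row["city"]].append(int(row["rating"]))
python_group_means = {
    city: statistics.mean(values) for city, values in ratings_by_city.items()
}

print(pandas_mean, pandas_median, pandas_mode, pandas_group_means)
print(python_mean, python_median, python_mode, python_group_means)
```

The pure Python half uses the `statistics` module for brevity. Writing the mean, median, and mode by hand with loops is another reasonable reading of "pure Python".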
Tutorials, continued#
Read The Joys (and Woes) of the Craft of Software Engineering
Note that not everything in there applies to data analysis.
Filtering/indexing DataFrames
Learn about functions
Coding Style Guides - Please skim these; I don’t expect you to understand and follow everything in them. The most important guidelines to pay attention to are indentation and keeping each statement on its own line.
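As a rough illustration of those two guidelines, here is a small sketch. It assumes PEP 8-style conventions (four spaces per indentation level), which may differ in detail from the guides linked above, and the `mean_of` function is a made-up example.

```python
# Harder to read: several statements crammed onto one line.
#     total = 0; count = 0
#     for value in values: total += value; count += 1

# Easier to read: one statement per line, with consistent indentation.
def mean_of(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    total = 0
    for value in values:
        total += value
    return total / len(values)


average = mean_of([3, 4, 4])
print(average)
```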
Optional#
Glance through pandas’ comparisons with other tools, for any tools you are already familiar with.
Participation#
Reminder about the between-class participation requirement.