Homework 2#
General assignment information
Coding#
The goal here is to practice joining datasets through pandas. Hint: The instructions here are intentionally incomplete.
Step 1#
Find an NYC dataset with a borough column.
Use Scout to filter by column name.
Don’t spend too long on this step.
Keep the dataset small (under 500,000-ish rows) to make it easier to work with.
What’s the URL of your dataset?
YOUR RESPONSE HERE
Step 2#
# your code here
Step 3#
Open the Population by Borough dataset and load it into Jupyter.
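If you haven't loaded a CSV with pandas before, here is a minimal sketch of the general pattern. It does not use the real Population by Borough file: the CSV text, column names, and numbers below are all made-up placeholders, so check what the actual dataset looks like before relying on any of them.

```python
import io

import pandas as pd

# Made-up CSV text standing in for the downloaded file.
# Column names and numbers are placeholders, not real figures.
csv_text = """Borough,Population
Bronx,1000
Brooklyn,2000
"""

# read_csv() also accepts a local file path or a URL in place of this buffer.
population = pd.read_csv(io.StringIO(csv_text))

# Look at the first rows and the column types before doing anything else.
print(population.head())
print(population.dtypes)
```

Checking the column names and dtypes right after loading tends to save time later, since the join in the next step depends on them.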
# your code here
Step 4#
Use merge() to combine the two, and output the resulting table.
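As a rough illustration of the mechanics (not a solution for your particular dataset), the sketch below merges two tiny made-up tables. The table names, the `borough_key` helper column, and every value are assumptions for the example. One thing it tries to show is that borough names are often spelled or capitalized differently across datasets, so the join key may need cleaning first.

```python
import pandas as pd

# Hypothetical stand-in for the dataset chosen in Step 1 (made-up rows).
complaints = pd.DataFrame({
    "borough": ["BROOKLYN", "QUEENS", "BROOKLYN", "BRONX"],
    "complaint_type": ["Noise", "Rodent", "Heat", "Noise"],
})

# Hypothetical stand-in for the population table (placeholder numbers, not real figures).
population = pd.DataFrame({
    "Borough": ["Brooklyn", "Queens", "Bronx"],
    "population": [1000, 2000, 3000],
})

# Build a key with matching spelling and capitalization in both tables.
complaints["borough_key"] = complaints["borough"].str.strip().str.title()
population["borough_key"] = population["Borough"].str.strip().str.title()

# Left join: keep every complaint row and attach its borough's population.
# validate= raises an error if a borough appears more than once on the right side.
merged = complaints.merge(
    population, on="borough_key", how="left", validate="many_to_one"
)
print(merged)
```

A left join is only one reasonable choice here; which `how=` value makes sense depends on whether you want to keep rows that have no matching borough.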
# your code here
Bonus#
5 points
Using the two datasets above, use pandas to produce an aggregate per-capita statistic by borough.
The dataset you chose before may not work for this. That’s fine, pick another.
Hint#
You’re creating a “number of [thing] per capita by borough” table.
Do a groupby() on the original dataset.
Join with the populations by borough.
Compute the per-capita values as a new column.
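The three steps in the hint can be sketched roughly as follows. This uses tiny made-up tables, not the real datasets: the `incidents` table, the column names, and the population numbers are all placeholders chosen for the example.

```python
import pandas as pd

# Hypothetical incident-level data: one row per incident (made-up rows).
incidents = pd.DataFrame({
    "borough": ["Bronx", "Bronx", "Queens", "Queens", "Queens", "Manhattan"],
})

# Hypothetical population table (placeholder numbers, not real figures).
population = pd.DataFrame({
    "borough": ["Bronx", "Queens", "Manhattan"],
    "population": [1000, 1000, 500],
})

# 1. Count incidents per borough.
counts = incidents.groupby("borough").size().reset_index(name="incident_count")

# 2. Join the counts with the populations by borough.
per_capita = counts.merge(population, on="borough", how="inner")

# 3. Compute the per-capita value as a new column.
per_capita["incidents_per_capita"] = (
    per_capita["incident_count"] / per_capita["population"]
)
print(per_capita)
```

Grouping before joining keeps the merge at one row per borough, which is usually easier to check by eye than joining first and aggregating afterward.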
# your code here
Tutorials#
“You’re Not Mapping Rats, You’re Mapping Gentrification”—article about bias in 311 data
Intro to Plotly Express. You don’t have to work through every one of these examples; just review to get familiar with what types of charts are possible.
Optional#
Participation#
Reminder about the between-class participation requirement.