Homework 3#
General assignment information
Coding#
We are going to look at the population count of different community districts over time.
Step 0#
Read the data from the New York City Population By Community Districts data set into a DataFrame called pop_by_cd. To get the URL:
Visit the page linked above.
Click Export.
Right-click CSV.
Click Copy Link Address (or Location, depending on your browser).
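As a rough sketch of the loading step: pandas.read_csv() accepts either a URL or a file-like object, so the same call works for the copied link and for a small in-memory sample. The CSV_URL value below is a hypothetical placeholder, not the real link, and the sample rows are made up for illustration. The column names are assumed from the hints in Step 1.

```python
import io

import pandas as pd

# Hypothetical placeholder: paste the link you copied from the Export menu.
CSV_URL = "https://example.com/path/to/export.csv"

# With the real link, the load would look like this:
# pop_by_cd = pd.read_csv(CSV_URL)

# Made-up sample in the assumed layout, so the sketch runs without network access.
sample_csv = io.StringIO(
    "Borough,CD Number,CD Name,1970 Population,1980 Population\n"
    "Manhattan,1,Example District A,100,110\n"
    "Bronx,1,Example District B,200,190\n"
)
pop_by_cd = pd.read_csv(sample_csv)

# Check the column names and types before reshaping.
print(pop_by_cd.columns.tolist())
print(pop_by_cd.dtypes)
```

Inspecting the columns first is worthwhile because the reshaping step depends on their exact names.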
# your code here
Step 1#
Prepare the data. Reshape the DataFrame to have one row per community district per Census year.
Hints
You’ll want the id_vars to be Borough, CD Number, and CD Name.
You’ll want to use .str.replace() to get rid of the Population suffixes.
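To illustrate the hints on toy data, here is one possible wide-to-long reshape using DataFrame.melt(). The rows are made up, the year columns are assumed to look like "1970 Population", and the names tidy, Year, and Population are illustrative choices rather than anything the assignment requires.

```python
import pandas as pd

# Made-up wide table: one row per district, one column per Census year.
wide = pd.DataFrame(
    {
        "Borough": ["Manhattan", "Bronx"],
        "CD Number": [1, 1],
        "CD Name": ["Example District A", "Example District B"],
        "1970 Population": [100, 200],
        "1980 Population": [110, 190],
    }
)

# Keep the identifying columns fixed; every other column becomes a row.
tidy = wide.melt(
    id_vars=["Borough", "CD Number", "CD Name"],
    var_name="Year",
    value_name="Population",
)

# Strip the assumed " Population" suffix so only the year remains, then
# convert to integers so the years sort and plot numerically.
tidy["Year"] = tidy["Year"].str.replace(" Population", "", regex=False).astype(int)

print(tidy)
```

Passing regex=False treats the suffix as literal text, which avoids surprises if a column name ever contains regex metacharacters.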
# your code here
Step 2#
Create a line chart of the population over time for each community district in Manhattan. There should be one line for each.
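One way to get a line per district, sketched on made-up data: filter to the borough, pivot so each district becomes a column, and plot. This uses pandas' built-in matplotlib plotting, which is an assumption. The course may expect a different charting library, and the column names Year and Population are illustrative.

```python
import pandas as pd

# Made-up long-format data: one row per district per Census year.
tidy = pd.DataFrame(
    {
        "Borough": ["Manhattan"] * 4 + ["Bronx"] * 2,
        "CD Name": ["District A", "District A", "District B", "District B",
                    "District C", "District C"],
        "Year": [1970, 1980, 1970, 1980, 1970, 1980],
        "Population": [100, 110, 150, 140, 200, 190],
    }
)

# Keep only the Manhattan rows.
manhattan = tidy[tidy["Borough"] == "Manhattan"]

# One column per district, indexed by year, so each column plots as its own line.
by_year = manhattan.pivot(index="Year", columns="CD Name", values="Population")

ax = by_year.plot(kind="line", marker="o", title="Population by community district")
ax.set_ylabel("Population")
```

In a notebook the chart renders inline. In a script, call matplotlib.pyplot.show() or ax.figure.savefig() to see it.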
# your code here
Tutorials#
Go through the first third of Time Series Analysis with Pandas, up until the “Visualizing time series data” section.
Read the Data Design Standards.
Optional#
Watch this talk on audification/sonification. We won’t be doing this in class, but hopefully it will provide some inspiration about the different ways data can be represented.
Read about other tools and techniques for visualization in Python.
Final Project Proposal#
Please submit your proposal.
Participation#
Reminder about the between-class participation requirement.