Lecture 1 in-class exercise
We’ll be doing calculations on a subset of 2021 Yellow Taxi Trips data.
Step 0
The data needs to be available on the machine where Python is running in order to be processed, so let’s download it directly from the NYC Open Data site:
!wget -O 2021_yellow_taxi_trips.csv --no-verbose https://data.cityofnewyork.us/resource/m6nq-qud6.csv
2025-06-29 23:01:23 URL:https://data.cityofnewyork.us/resource/m6nq-qud6.csv [136065] -> "2021_yellow_taxi_trips.csv" [1]
This uses a command-line program (wget) to do the downloading. We’ll download data ourselves manually later.
Look at the files in the sidebar. You should see 2021_yellow_taxi_trips.csv there. (You may need to refresh that list.)
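For comparison, the same download can be sketched in Python using only the standard library. This is a minimal sketch, not part of the exercise: the helper name download is hypothetical, and the URL and filename are copied from the wget command above.

```python
import shutil
import urllib.request

# URL copied from the wget command above.
DATA_URL = "https://data.cityofnewyork.us/resource/m6nq-qud6.csv"


def download(url, destination):
    """Stream the response body at `url` into the file `destination`.

    `download` is a hypothetical helper written for this sketch.
    """
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out_file:
        shutil.copyfileobj(response, out_file)


# Uncomment to fetch the dataset (requires network access):
# download(DATA_URL, "2021_yellow_taxi_trips.csv")
```

The call is left commented out so the cell does nothing until you choose to run it.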
Your turn!
Do the following steps:
Use pure Python, not pandas or other packages.
In other words, import csv should probably be the only import.
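As a reminder of how the csv module reads rows, here is a minimal sketch on made-up data. The column names (city, temperature) are hypothetical and are not the taxi dataset's columns; io.StringIO is used only to keep the example self-contained, and with a real file you would pass the result of open(filename, newline="") to the reader instead.

```python
import csv
import io

# Made-up sample standing in for a real file; these column names are
# hypothetical and do not come from the taxi dataset.
sample = io.StringIO("city,temperature\nAlbany,61.5\nBuffalo,58.0\n")

# DictReader uses the first line as the header and yields one dict per row.
reader = csv.DictReader(sample)
rows = list(reader)

for row in rows:
    # Every value arrives as a string, so convert before doing arithmetic.
    print(row["city"], float(row["temperature"]))
```

Note that the values come back as strings, which matters as soon as you need to add or average them.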
Step 1
Print out the trip distances.
# your code here
Step 2
Calculate the average ride distance.
# your code here
Step 3
Calculate the percent of trips that were paid for by credit card.
The data dictionary will be helpful - see the Attachment on the dataset page.
Pair with a neighbor.
# your code here