Lecture 1 in-class exercise#
We’ll be doing calculations on 2021 Yellow Taxi Trips data. We’ll live-code this together.
Step 0#
The data needs to be available on the machine where Python is running in order to process it, so let’s download it directly from the NYC Open Data site:
!wget -O 2021_yellow_taxi_trips.csv --no-verbose https://data.cityofnewyork.us/resource/m6nq-qud6.csv
2024-11-07 00:35:32 URL:https://data.cityofnewyork.us/resource/m6nq-qud6.csv [136065] -> "2021_yellow_taxi_trips.csv" [1]
This uses a command-line program (wget) to do the downloading. We’ll download data manually later. Look at the files in the current directory:
!ls
2021_yellow_taxi_trips.csv lecture_1.ipynb
LICENSE.md lecture_1.slides.html
_build lecture_1_exercise.ipynb
_config.yml lecture_1_exercise_solution.ipynb
_static lecture_2.ipynb
_toc.yml lecture_2.slides.html
assignments.md lecture_2_exercise.ipynb
conf.py lecture_3.ipynb
curve.ipynb lecture_3.slides.html
data lecture_3_exercise.ipynb
extras lecture_3_exercise_solution.ipynb
final_project lecture_4.ipynb
final_project.md lecture_5.ipynb
home.md lecture_5_exercise_solution.ipynb
hw_0.ipynb lecture_6.ipynb
hw_1.ipynb meta
hw_2.ipynb meta.md
hw_3.ipynb nbdime_config.json
hw_4.ipynb pyproject.toml
index.md resources.md
joining_late.md solutions
lecture_0.ipynb syllabus.md
lecture_0.slides.html tmp
You should see 2021_yellow_taxi_trips.csv there.
Step 1#
Print out the trip distances.
# your code here
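One possible starting point, as a minimal sketch rather than the version we'll write in class: read the rows with the standard library's csv.DictReader and pull out one column. The column name trip_distance is an assumption about the downloaded file's header row, and SAMPLE_CSV is a made-up three-row stand-in so the sketch runs without the download.

```python
import csv
import io

# Hypothetical sample standing in for 2021_yellow_taxi_trips.csv.
# The column name `trip_distance` is an assumption; check the real header row.
SAMPLE_CSV = """trip_distance,payment_type
1.5,1
2.0,2
4.0,1
"""


def read_trip_distances(csv_file):
    """Return the trip_distance column of an open CSV file as a list of floats."""
    reader = csv.DictReader(csv_file)
    return [float(row["trip_distance"]) for row in reader]


distances = read_trip_distances(io.StringIO(SAMPLE_CSV))
for distance in distances:
    print(distance)
```

To try it on the real data, pass a file opened with open("2021_yellow_taxi_trips.csv", newline="") instead of the io.StringIO sample.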
Step 2#
Calculate the average ride distance.
# your code here
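A minimal sketch of one way to compute the average, assuming the distances live in a column named trip_distance (an assumption about the file's header row) and that every row has a numeric value there. SAMPLE_CSV is a made-up stand-in for the downloaded file.

```python
import csv
import io

# Hypothetical sample standing in for 2021_yellow_taxi_trips.csv.
# The column name `trip_distance` is an assumption; check the real header row.
SAMPLE_CSV = """trip_distance,payment_type
1.5,1
2.0,2
4.0,1
"""


def average_trip_distance(csv_file):
    """Return the mean of the trip_distance column of an open CSV file."""
    total = 0.0
    count = 0
    for row in csv.DictReader(csv_file):
        total += float(row["trip_distance"])
        count += 1
    if count == 0:
        raise ValueError("no rows to average")
    return total / count


average = average_trip_distance(io.StringIO(SAMPLE_CSV))
print(average)
```

Keeping a running total and a count means the rows never need to be held in memory all at once. With a list of distances already in hand, sum(distances) / len(distances) gives the same result.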
Step 3#
Your turn! Calculate the percent of trips that were paid for by credit card. The data dictionary will be helpful - see the Attachment on the dataset page.
# your code here
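As a hint rather than a solution, here is a sketch of a generic percentage calculation: count the matching values, divide by the total, and multiply by 100. The helper name percent_matching and the example values are hypothetical. Which column holds the payment method, and which value means credit card, are left for you to find in the data dictionary.

```python
def percent_matching(values, target):
    """Return the percentage of items in `values` that equal `target`."""
    values = list(values)
    if not values:
        raise ValueError("no values to summarize")
    matches = sum(1 for value in values if value == target)
    return 100 * matches / len(values)


# Made-up example values, not taken from the taxi data.
colors = ["red", "blue", "red", "red"]
share = percent_matching(colors, "red")
print(share)
```

Note that values read from a CSV file are strings, so the target you compare against needs to be a string too (or the values need converting first).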