Lecture 6: The Bigger Picture

Please:

sign attendance sheet
put away devices

Guest speaker#

Michael Freedman is a Brooklyn-based visual artist and creative technologist whose work moves between observation and systemic inquiry. For over 30 years, he’s painted and drawn the world around him: portraits, landscapes, gestural studio paintings, and, more recently, quick, meditative ink drawings made while walking through the city. He’s drawn to stillness and structure, but also to the chaotic energy of people and streets. He is also the creator of CrashCountNYC, a data-driven project documenting traffic violence in New York City.

Questions?#

Final Project #

How did it go?

Peer grading #

Version control#

Open the repository.
Make your own Markdown file under people/, based on Aidan’s.
- You’ll need to create or sign in to your GitHub account.
Commit and send a pull request.
- You’ll be asked if you want to Fork.
  - Why?
  - Click yes.
Review your neighbor’s pull request.

Ask Me Anything (AMA)#

I have slides on “Python beyond data analysis” as backup, but I’d rather talk about whatever you want to hear about.

Data warehousing #

Python beyond data analysis#

We’ve been focusing on using Python and pandas for data analysis. What else is Python used for?

Data engineering#

Automation / recurring processes
Copying/moving/processing/publishing data, especially Big Data
Monitoring/alerting
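As a rough illustration of the items above, here is a minimal sketch of a recurring job that copies and cleans a CSV file, then logs a warning if the output looks suspiciously small. Everything specific here is an assumption made up for the example: the function name `process_file`, the `borough` column, and the `min_rows` threshold. A real pipeline would usually run on a schedule and send alerts somewhere more visible than a log.

```python
import csv
import logging
from pathlib import Path

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pipeline")


def process_file(source: Path, destination: Path, min_rows: int = 1) -> int:
    """Copy rows that have a non-empty `borough` value to a new CSV file.

    Returns the number of rows written. Logs a warning (a stand-in for a
    real alert) when fewer than `min_rows` rows survive the filter.
    The `borough` column is a hypothetical example, not a course dataset.
    """
    # Processing: read the source file and drop incomplete rows.
    with source.open(newline="") as infile:
        reader = csv.DictReader(infile)
        fieldnames = reader.fieldnames or []
        rows = [row for row in reader if (row.get("borough") or "").strip()]

    # Copying/publishing: write the cleaned rows to the destination.
    with destination.open("w", newline="") as outfile:
        writer = csv.DictWriter(outfile, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)

    # Monitoring/alerting: flag output that is smaller than expected.
    if len(rows) < min_rows:
        logger.warning("Only %d rows written to %s", len(rows), destination)
    return len(rows)
```

Calling `process_file(Path("in.csv"), Path("out.csv"), min_rows=100)` from a scheduled script would cover all three bullets in a small way.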

Web development#

Building web sites that are interactive (more than just content)
Forms
Presenting data
Workflows, such as:
- Signing up for things
- Paying for things
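To make the "forms" and "signing up" items concrete, here is a minimal sketch of a sign-up form using only Python's standard library (the WSGI interface). The page contents, the `name` field, and the function name `app` are assumptions for illustration; real sites typically use a web framework rather than raw WSGI.

```python
from html import escape
from urllib.parse import parse_qs

# Hypothetical sign-up form with a single "name" field.
FORM_PAGE = """<form method="post">
  <label>Name <input name="name"></label>
  <button type="submit">Sign up</button>
</form>"""


def app(environ, start_response):
    """Show a sign-up form on GET; respond to the submitted name on POST."""
    if environ["REQUEST_METHOD"] == "POST":
        # Read and decode the submitted form data.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        form = parse_qs(environ["wsgi.input"].read(length).decode("utf-8"))
        name = form.get("name", [""])[0].strip()
        if name:
            # Escape user input before placing it in HTML.
            body = f"<p>Thanks for signing up, {escape(name)}!</p>"
        else:
            body = "<p>Please enter a name.</p>" + FORM_PAGE
    else:
        body = FORM_PAGE

    payload = body.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]
```

To try it locally, pass `app` to `wsgiref.simple_server.make_server("localhost", 8000, app)`, call `serve_forever()` on the result, and open http://localhost:8000 in a browser.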

Machine learning#

Statistics, but fancy
Building models
- Examples
Finding patterns
Recommendations
Detection

When people say “artificial intelligence,” they usually mean “machine learning.”

Figure: diagram showing what type of machine learning may be useful, if at all (source, with more thorough explanation).

The process#

High-level

Create a model
1. Gather a bunch of data for training
2. For supervised machine learning, label it (give it the right answers)
3. Split it into training and test data
4. Train the model against the training dataset (have it identify patterns)
5. Test the model against the test dataset
Run against new data
If reinforcement learning, model refines itself
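The supervised version of the steps above can be sketched in plain Python. This is a toy, not how it is usually done in practice: the data is synthetic, the labels `"a"` and `"b"` are made up, and the "model" is a simple nearest-centroid classifier chosen only because it fits in a few lines. Real projects would normally use a library such as scikit-learn for splitting and training.

```python
import random


def make_labeled_data(n_per_class, rng):
    """Steps 1-2: gather data and label it (synthetic 2-D points here)."""
    data = []
    for label, center in (("a", 0.0), ("b", 10.0)):
        for _ in range(n_per_class):
            point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
            data.append((point, label))
    return data


def train(training_data):
    """Step 4: learn one centroid (average point) per label."""
    sums = {}
    for (x, y), label in training_data:
        total_x, total_y, count = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (total_x + x, total_y + y, count + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}


def predict(model, point):
    """Return the label whose centroid is closest to the point."""
    x, y = point

    def squared_distance(label):
        cx, cy = model[label]
        return (cx - x) ** 2 + (cy - y) ** 2

    return min(model, key=squared_distance)


rng = random.Random(0)  # fixed seed so the run is repeatable
data = make_labeled_data(50, rng)

# Step 3: shuffle, then hold out 20% of the data for testing.
rng.shuffle(data)
split = int(len(data) * 0.8)
training_data, test_data = data[:split], data[split:]

# Step 4: train against the training dataset only.
model = train(training_data)

# Step 5: test against data the model has not seen.
correct = sum(predict(model, point) == label for point, label in test_data)
accuracy = correct / len(test_data)
print(f"Test accuracy: {accuracy:.0%}")

# Run against new data.
print(predict(model, (9.2, 10.5)))
```

Holding the test data out of training is the point of step 3: accuracy measured on data the model has already seen says little about how it will do on new data.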

You have a head start: The fundamentals are applicable anywhere you’re using code.

Resources #

Post-Google Colab#

Your Google Colab instance (and your copy of the files) will be deleted
- Download things you want to keep, particularly edited notebooks
Class materials will remain available on the site and through GitHub
Running notebooks elsewhere

Course evaluations #

They are:

Totally anonymous
Not visible to me until grades are released
A big help. Some things I took from past evaluations:
- Making assignments more rigorous
- Students are hungry for more
- People like the in-class exercises

More info. Please complete yours now if you haven’t already.

Thank you!#

Keep in touch.