NYC Data Science Corporate blog

What I Learned From 100,000 Open Datasets Across 100 Open Data Portals

Written by Jun Zhao | Dec 3, 2013 12:43:33 PM

 

Many thanks go to Thomas Levine for giving such a great workshop!

 

Slides:

http://thomaslevine.com/!/data-about-open-data-talk-december-2-2013/

----------------------------------

Meetup Announcement:

Thomas Levine has downloaded 100,000 datasets from 100 open data portals, and this is what he learned: http://thomaslevine.com/open-data

He talked about all aspects of how he did this, and downloading was, of course, a big part of that. Here are two repositories you can look at if you like, though they lack comprehensible documentation.
https://github.com/tlevine/socrata-download
https://github.com/tlevine/socrata-analysis

Speaker:

Playing with computers since he was young, Thomas Levine eventually developed back and wrist pain, so he started studying ergonomics and conducting quantitative ergonomics research. Then he realized that he’d accidentally become a data scientist. And his back and wrists now hurt less. He also has a band called CSV Soundsystem that makes music from spreadsheets.

Outline:

In the first half of the session, he talked about what he did and what he learned.

After that, he talked in more detail about how to conduct an analysis like this. The specifics depended on what interested participants, but topics could include:

- Planning complicated data workflows/pipelines
- Storing data
- Tricks for making things run faster

He also talked a bit about brainstorming and the six thinking hats. Then people did a couple of exercises:

- Choose an open data catalog. Diagram how a person could manually download all of the datasets. Then change the labels in the diagram so that it describes a computer program that downloads the datasets.
- Select a guideline from one of these lists, and brainstorm ways of testing it. 
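
For the first exercise above, the relabeled diagram might translate into something like the following. This is only a minimal sketch, not the code from the talk or from the repositories linked earlier: the function `download_all` and its three callables (`list_page`, `fetch_dataset`, `save`) are hypothetical stand-ins for "open a catalog page", "click the download link", and "save the file", and the catalog is assumed to be paginated.

```python
from typing import Callable, Dict, List


def download_all(
    list_page: Callable[[int], List[str]],
    fetch_dataset: Callable[[str], bytes],
    save: Callable[[str, bytes], None],
) -> Dict[str, List[str]]:
    """Walk a paginated catalog and download every dataset it lists.

    All three callables are hypothetical stand-ins:
      list_page(n)      -> dataset identifiers on catalog page n
                           (an empty list once past the last page)
      fetch_dataset(id) -> raw bytes of one dataset
      save(id, data)    -> store the bytes somewhere
    """
    done: List[str] = []
    failed: List[str] = []
    page = 1
    while True:
        # "Open the next page of the catalog"
        ids = list_page(page)
        if not ids:
            # "No more datasets are listed, so stop"
            break
        for dataset_id in ids:
            # "Click the download link and save the file"
            try:
                save(dataset_id, fetch_dataset(dataset_id))
                done.append(dataset_id)
            except Exception:
                # Note the failure and move on, as a person would
                failed.append(dataset_id)
        page += 1
    return {"done": done, "failed": failed}


if __name__ == "__main__":
    # A tiny in-memory "catalog" so the sketch runs without a network.
    pages = {1: ["a", "b"], 2: ["c"]}
    store: Dict[str, bytes] = {}
    result = download_all(
        list_page=lambda n: pages.get(n, []),
        fetch_dataset=lambda i: f"rows for {i}".encode(),
        save=store.__setitem__,
    )
    print(result)
```

Passing the three steps in as functions keeps the loop the same whether the steps are done by hand, by HTTP requests, or by a test double; a real portal would need its own versions of each.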

----------------------------------

Other Useful Links:

To get a better feel for the workshop, you can try one of the exercises yourself:

http://thomaslevine.com/%21/data-about-open-data-talk-december-2-2013/#exercises
