The Examples

This site is not intended as a reference for best coding practices or universally useful scripts, but rather as a collection of examples of the kinds of small, mundane tasks that we can try to write programs for, and of the (often sloppy) design decisions and compromises that go into such one-off work.

The source code of the scripts is included – because, why not? – but they aren’t guaranteed to work as-is, so you copy-paste-execute at your own risk.

Examples

Convert Excel to CSV

Working with data that comes from an Excel spreadsheet.

Combining text files

In many real-world situations, data is not only dirty, it's not even put together in a single file.
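
As a rough illustration of the idea, here is a minimal sketch that joins a folder of text files into one. The function name `combine_text_files` and its arguments are hypothetical, and it assumes the files are UTF-8 and that sorted filename order is the order you want.

```python
from pathlib import Path


def combine_text_files(src_dir, pattern, dest_path):
    """Concatenate every file in src_dir matching pattern into dest_path.

    Files are joined in sorted filename order. Returns the number of files read.
    (Hypothetical helper; assumes UTF-8 text.)
    """
    paths = sorted(Path(src_dir).glob(pattern))
    with open(dest_path, "w", encoding="utf-8") as out:
        for path in paths:
            text = path.read_text(encoding="utf-8")
            out.write(text)
            # Keep one file's last line from running into the next file's first line
            if text and not text.endswith("\n"):
                out.write("\n")
    return len(paths)
```

Writing the combined file somewhere outside the source folder avoids picking up an earlier output file on a later run.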

Get coordinate data from Google Geocoding

Get latitude and longitude, and other Google geodata, for any given location.

Get HTTP status code

Get the HTTP status code for any given link.

Extract Twitter accounts from text

Find the text pattern that indicates a Twitter handle and feed it to the API.
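
The pattern-matching half of this can be sketched with a regular expression. This is only a sketch, assuming a handle is an `@` followed by 1 to 15 letters, digits, or underscores; `HANDLE_RE` and `extract_handles` are hypothetical names, and the step of feeding the results to the API is left out.

```python
import re

# Assumption: a handle is "@" plus 1-15 letters, digits, or underscores.
# The lookbehind skips email addresses such as me@example.com.
HANDLE_RE = re.compile(r"(?<![A-Za-z0-9_@])@([A-Za-z0-9_]{1,15})(?![A-Za-z0-9_])")


def extract_handles(text):
    """Return unique handles (without the "@") in order of first appearance."""
    seen = set()
    handles = []
    for name in HANDLE_RE.findall(text):
        key = name.lower()  # treat @NYTimes and @nytimes as the same account
        if key not in seen:
            seen.add(key)
            handles.append(name)
    return handles
```

For example, `extract_handles("Thanks @nytimes and @ProPublica!")` returns `["nytimes", "ProPublica"]`.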

Timestamp and organize downloaded files

Not quite as good as git, but better than hand-naming files yourself.
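
One possible naming scheme, sketched under assumptions: the helper `timestamped_path` and the `name_YYYY-MM-DD_HHMMSS.ext` format are hypothetical choices, not necessarily what the script itself uses.

```python
from datetime import datetime
from pathlib import Path


def timestamped_path(filename, dest_dir=".", when=None):
    """Build a destination path with a timestamp inserted before the extension.

    `when` defaults to the current local time. (Hypothetical helper.)
    """
    when = when or datetime.now()
    src = Path(filename)
    stamp = when.strftime("%Y-%m-%d_%H%M%S")
    return Path(dest_dir) / f"{src.stem}_{stamp}{src.suffix}"
```

Because the stamp sorts alphabetically in time order, a plain directory listing shows the versions of a file from oldest to newest.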

Convert Word doc generated HTML to better HTML

A workflow for writers who use Microsoft Word but want to post to the Web.

Archive a single webpage and its visual elements

Take a basic snapshot of a webpage and the stylesheets and images used in its visual presentation.

Name and optimize a screengrab file

Control what happens to an image file after a screenshot is taken.

Automated webpage screenshots with PhantomJS

The problem with non-browser tools is that, well, they don't act like browsers. "Headless" programs provide some of the functionality of a full-fledged web browser for automated systems (such as testing, or mass screenshot grabbing).

Get metadata about a Socrata dataset

Get the most important information about a Socrata dataset.

Get data about an Instagram photo

Another example of doing something you've already done -- finding the information about a photo -- but doing it via API.

Finding text within a group of images

You remember documenting a website by taking screenshots. But how do you search by text?

Dummy text detector

Do a last-second sweep for embarrassing words and phrases.
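
A sweep like this can be sketched as a scan for known placeholder patterns. The pattern list below is a made-up starting point, and `DUMMY_PATTERNS` and `find_dummy_text` are hypothetical names.

```python
import re

# Hypothetical placeholder patterns; extend with whatever your newsroom uses.
DUMMY_PATTERNS = [
    r"lorem ipsum",
    r"\bTK\b",
    r"\bXXX+\b",
    r"\bTODO\b",
    r"headline goes here",
]


def find_dummy_text(text, patterns=DUMMY_PATTERNS):
    """Return (line_number, matched_text) pairs for every placeholder found."""
    regex = re.compile("|".join(f"(?:{p})" for p in patterns), re.IGNORECASE)
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in regex.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

An empty list means none of the listed patterns were found, which is not the same as the text being clean.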

Fetch latest updates on Texas death row

An exercise in scraping and parsing HTML about criminal justice events.

Follow all Twitter accounts listed on a page

Find the text pattern that indicates a Twitter handle and feed it to the API.

Extract data about a Flickr photo

A command-line tool for getting the URL and metadata given a Flickr URL.

Create a Google Static Map with Markers

Put some markers on a map and customize it.

Extract useful metadata from a web article

Just how do they make those Flipboard/Twitter/Facebook previews anyway?

Quickly crop and create image previews and thumbnails

Use Python's Pillow library to calculate the proper dimensions for a resized image.
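
The dimension arithmetic itself needs no imaging library. Here is a minimal sketch, assuming the goal is to fit an image inside a bounding box while keeping its aspect ratio and never enlarging it; `fit_within` is a hypothetical helper whose result could be passed to Pillow's `Image.resize`.

```python
def fit_within(width, height, max_width, max_height):
    """Return (new_width, new_height) that fits inside the box.

    Preserves the aspect ratio and never scales up. (Hypothetical helper.)
    """
    # Use the tighter of the two constraints; cap at 1.0 so small images stay as-is
    scale = min(max_width / width, max_height / height, 1.0)
    # Guard against a dimension rounding down to zero for extreme ratios
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a 4000x3000 photo fitted into an 800x800 box comes out as 800x600.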

Count up your Gmail messages with Python

Gather up and filter your emails programmatically.

Chicago metadata as a package

Make it easy to use information about Chicago and its geographies. Because sometimes you just need to loop through a list of Chicago neighborhoods.

Archive tweets

Archive tweets with a certain hashtag to a database.

Projects

Counting SSA Baby Names

Mostly a lot of combining text files together.

Gather and visualize U.S. minimum wage data

A canonical single-page web-scraping example, focused on wrangling not-quite-ideal HTML into tidy data.

California test scores

A mini-project to collect California school data, using a variety of mundane programming scripts and libraries.

California Salaries Tracker

A demonstration of working with multiple CSV files programmatically.

White House Salaries Tracker

A demonstration of working with multiple CSV files programmatically.