
requests and responses (or, how computers talk to each other)

people create web APIs in order to make some or all of their data accessible to others. in order to talk about web APIs, we need to make sure we all have an understanding of how information travels on the web. here's a very simplified version of how a website gets to your phone or computer:

to summarize, clients make requests; servers serve responses.

requests and responses both include addresses so they know where to go. when you type an address into a browser's address bar, you’re typing the address of the file you want on the server where it's stored.

the response the server sends back can be a website, an image, a video. in this workshop we’ll be talking about responses that are JSON data.

APIs use different addresses to serve different parts of a data set.
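to make "clients make requests; servers serve responses" a little more concrete, here's a minimal sketch in python with both sides running on one computer. everything in it is made up for illustration: the ToyAPI server, the two addresses, and the placeholder names are assumptions, not part of any real API. the point is only that asking for a different address gets a different part of the data set back, as JSON.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener

# a made-up data set; each address (path) serves a different part of it
DATA = {
    "/members/house": [{"name": "placeholder representative"}],
    "/members/senate": [{"name": "placeholder senator"}],
}


class ToyAPI(BaseHTTPRequestHandler):
    """the server side: receives a request, sends back a response."""

    def do_GET(self):
        part = DATA.get(self.path)
        status = 200 if part is not None else 404
        payload = part if part is not None else {"error": "not found"}
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


# talk to the local server directly, ignoring any proxy settings
opener = build_opener(ProxyHandler({}))


def ask(base, path):
    """the client side: requests an address, reads the JSON response."""
    with opener.open(base + path) as response:
        return json.loads(response.read().decode("utf-8"))


# port 0 asks the operating system for any free port
server = HTTPServer(("127.0.0.1", 0), ToyAPI)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

house = ask(base, "/members/house")
senate = ask(base, "/members/senate")
print(house)
print(senate)

server.shutdown()
server.server_close()
```

a real API works the same way in outline, except the server is someone else's computer and the data set is much bigger.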

a close reading: the propublica congress API

let's take a close look at an API. the propublica congress API is well-documented and not evil, so we’ll use it to build as much knowledge as we can, from scratch, about APIs.

what does the documentation say?

information about what an API does is referred to as its documentation. it’s the best place to get your bearings when you’re exploring a new API. this one says:

"Using the Congress API, you can retrieve legislative data from the House of Representatives, the Senate and the Library of Congress. The API, which originated at The New York Times in 2009, includes details about members, votes, bills, nominations and other aspects of congressional activity. This document describes the requests that users can make of the API and the responses that it returns."

from the description, we learn some things about APIs in general:

with this API in particular, we'd make requests to get information about:

building a query, making a request

the propublica congress API provides access to many data sets. we'll focus on the members API to keep things simple. the members API has information about how to build the address to access the data we need.

URI stands for "uniform resource identifier," and it's another name for the data set's address on propublica's server.

looks like a regular web address, right? the only differences are:

  1. the propublica API requires a key, so you won't be able to access the data by just typing the address above into a regular browser; you'll get a message saying "forbidden". we'll talk more about keys later, but for now just think of it as a password.
  2. those curly brackets, which tell you that there can be multiple possible values there. you have to decide which values you want to put in; putting in different values will return different data. this is what i meant when i said above, “APIs use different addresses to serve different parts of a data set.”

for example, these three links will return different data sets because they’re (slightly) different addresses:

see the differences? the places where you have to decide the values are called parameters. programmers plug values for those parameters into the address to get the response (or, data set) they need.
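here's a rough sketch in python of what "plugging in values" can look like. the template address, the parameter names (congress, chamber), and the header name used for the key are all assumptions for illustration; the real ones come from the API's documentation. nothing is sent over the network here, the code only builds the requests.

```python
from urllib.request import Request

# hypothetical template: the real address comes from the API's documentation
TEMPLATE = "https://api.example.com/v1/{congress}/{chamber}/members.json"


def build_request(congress, chamber, api_key):
    """fill in the curly-bracket parameters, then attach the key as a header."""
    url = TEMPLATE.format(congress=congress, chamber=chamber)
    # the header name "X-API-Key" is an assumption; check the API's docs
    return Request(url, headers={"X-API-Key": api_key})


# three slightly different sets of values give three different addresses
requests_to_make = [
    build_request(115, "senate", "YOUR_KEY_HERE"),
    build_request(115, "house", "YOUR_KEY_HERE"),
    build_request(114, "house", "YOUR_KEY_HERE"),
]
for request in requests_to_make:
    print(request.full_url)
```

keeping the template in one place and filling it in with a function means you only have to change the values, not retype the whole address, each time you want a different part of the data set.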

once you build your request address and enter it in the command line,

the response looks like this:

a close reading: the dronestream API

let's look at another API: a covert drone strike API created by data artist josh begley.

what does the documentation say?

there is no documentation! sometimes, this happens with very simple APIs.

making a request

unlike the propublica congress API, which has lots of different endpoints (addresses), the dronestream API has just one endpoint. this means you don’t have to deal with any parameters to build the address to get the data set you want; there's only one data set. the dronestream API also doesn’t require a key, so anyone can access it straight from the browser. give it a try:

this endpoint returns raw JSON data, just like the propublica congress API. begley created this data set by going through articles from the bureau of investigative journalism and making a JSON object for each covert drone strike launched by the u.s. before begley made the dronestream API, this data set existed in articles, but not in a format that could be accessed and used in an application or data visualization.
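"raw JSON" is just text until a program parses it. as a small sketch of that step in python: the sample text below is invented, and its field names and values are placeholders, not the real dronestream format.

```python
import json

# invented sample standing in for a raw JSON response;
# field names and values are placeholders, not real data
raw = '{"records": [{"number": 1, "place": "placeholder"}, {"number": 2, "place": "placeholder"}]}'

data = json.loads(raw)      # text -> python dictionaries and lists
records = data["records"]   # one JSON object per record

print(len(records))
for record in records:
    print(record["number"], record["place"])
```

once the data is in this shape, an application or data visualization can loop over it, filter it, or count it, which is exactly what wasn't possible while the same information only lived in articles.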

corporate APIs

we've looked at two examples of APIs that journalists and activists might make and use. but many corporations also release APIs. google, facebook, amazon, uber, twitter—they may not all be profitable, but they all have APIs. why?

to encourage outside software developers to create new products and services from data these companies collect. each company decides what to make accessible to developers via its APIs, and companies can remove features or revoke access at any time, ending the viability of products and services built on those deprecated or inaccessible APIs.

discretionary API access

companies have complete control over the data they collect and release in the form of an API. this power asymmetry can present a problem for developers.

as a case study, let's look at the uber API. here's how the company talks about its API in the API mission:

Source: Uber Developers, "Terms of Use"

examples of prohibited uses of the uber API include aggregating uber with competitors and storing uber's data, except as expressly permitted by uber:

Source: Uber Developers, "Terms of Use"

the urbanhail story

urbanhail was an app that aggregated the prices of rideshare options so users could choose the cheapest one. to do this, it relied on APIs, including uber’s.

uber revoked urbanhail’s API access. urbanhail’s now-defunct website informed visitors that

"Uber terminated urbanhail's API access of May 31st [2016]. We had previously been using this API access to display Uber ride options on our app's results page."

a few months after urbanhail folded, uber pricing was integrated into google maps, along with lyft and other ride-hailing platforms. uber permits price comparisons within google maps, but not in apps built by other companies.

discussion questions💡

  1. if you're uber, can you imagine why you would have these kinds of restrictions?
  2. why would a company prohibit storing or aggregating its data?
  3. can you imagine a situation when an API's terms of use would get in the way of certain kinds of applications getting built?

speculate 💖

  1. are wage theft apps or strike apps possible with discretionary API access? why or why not?
  2. what would it take to connect isolated gig economy workers with each other without API access?

next steps: technical tutorials

now that we've read some examples of API documentation and have a sense of what APIs do, who makes them, why we should care about who has access to them, and what data they're made from, it's only appropriate to share the official and mostly useless definition of what an API is:

the u.s. government uses this metaphor, among others, to describe them:

"APIs are like the engine of a car. You don’t have to know how it works but rather just turn the key in the ignition and it handles everything underneath."

Source: API Resources for Federal Agencies

web APIs are difficult to talk about, but important to understand since they're such a fundamental part of the web applications we use every day. you can (and should!) also make your own, with data you think could be used in applications or data visualizations. these are two of the best technical API tutorials i know of—one in python, one in javascript:

many platforms that offer APIs also have useful tutorials and documentation: