This post was originally published at ODSC. Check out their curated blog posts!
An API is a way for one piece of software to interact with another application. API stands for “application programming interface,” and it allows your application or R script to interact with an outside service. The cool thing about APIs is that they let you grab functionality or data without having to create it yourself. For example, when you use Google Maps to navigate, your phone sends a request to the navigation service and gets back the needed information. Your phone doesn’t have to locally store all the maps of the world or calculate an efficient route. It just retrieves the information for the particular request through an API and then displays it to you.
Many APIs are “REST” APIs, meaning the services interact with data abstractions through a URL with parameters. Database operations have a similar concept, “CRUD,” which stands for Create, Read, Update and Delete. A REST API service can perform these operations, sometimes with additional functionality. A short explanation of the relationship between CRUD and REST is here.
In R there are various methods to interact with APIs. In some cases dedicated packages exist, like ‘twitteR’ or ‘ggmap’, for interacting with particular services. In other cases more generalized packages like ‘jsonlite’, ‘xml2’ or ‘httr’ can be used to access APIs. In this post I will show you how to interact with services in a couple of ways by performing GET and POST requests. GET requests obtain information, like rows of a customer database, using parameter(s) you specify, such as customers in Ohio. POST requests allow you to post a file to a service, either for storage or so that the application can act on it and return some information. An example POST request in this article is posting a photo to a Microsoft cognitive service so the service can analyze the emotion shown in the photo. That way I can check my Valentine’s reaction to my gift!
APIs with Packages
One of the easiest ways to start interacting with APIs in R is with the ggmap package. It is a wrapper for interacting with Google Maps using the familiar ggplot2 syntax. If you have spatial information, I suggest working with ggmap before trying the more dynamic leaflet package.
In this code you load the library and create an object bos which is a string representing a location.
library(ggmap)
bos<-'Boston, MA'
Next, pass the string location to the get_map function. This function is a wrapper for getting map tiles that can be used as the base layer in a ggplot. The second line calls ggmap on the map tile that was retrieved so it can be plotted in your R session.
bos.map <- get_map(bos)
ggmap(bos.map)
The base map tile from the google map API.
A nice output of the get_map function is that it will print the URLs where the information was retrieved. In this example the URLs include:
In the first link address you should identify “Boston,+MA” which is the center of this map. If you change this to another city such as “Cleveland,+OH” the map center will correctly change. The second link also has the URL encoded Boston MA as “Boston,%20MA”. Again you can change this part of the URL to retrieve different information from the maps service.
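To see where the “%20” comes from, here is a minimal sketch using base R's URLencode function. The object names bos.encoded and cle.encoded are only illustrative and are not part of the ggmap code above.

```r
# URL-encode a location string the way it appears inside a request URL.
# Spaces are not allowed in a URL, so they become %20; the comma is left alone.
bos <- 'Boston, MA'
bos.encoded <- utils::URLencode(bos)
bos.encoded

# The same idea works for any other city you want to request.
cle.encoded <- utils::URLencode('Cleveland, OH')
cle.encoded
```

Swapping the encoded string inside the URL is all that is needed to ask the service about a different place.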
If you click on the second link you will navigate to a small text file with structured information from the service. In this case the page is JSON formatted. JSON, short for “JavaScript Object Notation,” is a lightweight text format for passing structured data between a service and a client such as a browser or an R session. The screen shot below shows the JSON results of the API query, which include “Boston”, “United States” and the beginning of the latitude and longitude information.
A portion of the JSON response for Boston Ma after using get_map.
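If you want to get a feel for how JSON maps onto R objects, the sketch below parses a small hand-written JSON string with jsonlite. The string only loosely imitates a geocoding response; the field names and coordinates are illustrative assumptions, not the actual output of the maps service.

```r
library(jsonlite)

# Hand-written JSON for illustration only (not a real API response).
geo.json <- '{
  "results": [
    {"formatted_address": "Boston, MA, USA",
     "geometry": {"location": {"lat": 42.36, "lng": -71.06}}}
  ],
  "status": "OK"
}'

# fromJSON turns JSON objects into lists and arrays of objects into data frames.
geo <- fromJSON(geo.json)

geo$status                             # top-level field
geo$results$formatted_address          # column of the results data frame
geo$results$geometry$location$lat      # nested objects become nested data frames
```

Nested JSON objects end up as nested lists or data frames, so the `$` operator is usually enough to drill down to the value you need.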
True to ggplot’s nature, you can call another API service for information and add a layer to the base plot. To start, let’s define the starting point as “Westford, MA” and drive to the “Boston Common.”
west<-'Westford MA'
common<-'Boston Common'
After we have our starting and ending points, call the route function from ggmap and pass in the two string objects. This performs an API request to Google.
route.df<-route(west, common)
The API will return a data frame of trip segments. You can examine the information by calling route.df in your console as is shown below. The trip is divided into segments with lat, lon, time to travel and distance information.
A portion of the route from Westford to the Common returned by the Google Maps Routing API.
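Once the segments are in a data frame, ordinary data frame tools apply. The sketch below uses a made-up stand-in for route.df so it runs without calling the service. The coordinate column names follow the ones used in this post, while the km and minutes columns and every value are assumptions for illustration.

```r
# Made-up stand-in for route.df: one row per trip segment (values are invented).
route.sample <- data.frame(
  km       = c(0.5, 12.3, 30.1, 2.4),
  minutes  = c(2, 11, 24, 8),
  startLon = c(-71.44, -71.43, -71.35, -71.08),
  startLat = c(42.58, 42.57, 42.50, 42.36),
  endLon   = c(-71.43, -71.35, -71.08, -71.07),
  endLat   = c(42.57, 42.50, 42.36, 42.35)
)

# Summarise the whole trip from its segments.
total.km      <- sum(route.sample$km)
total.minutes <- sum(route.sample$minutes)
n.segments    <- nrow(route.sample)

total.km
total.minutes
```

Check the column names of your own route.df with names(route.df) first, since they can differ between ggmap versions.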
Now that we have our route data frame, we can add that information to our original ggplot map of Boston. Using the “+”, add a layer of geom_leg segments. Within that layer, specify the starting longitude and latitude as x and y. Then specify endLon and endLat for the end of each segment. Lastly, pass in route.df as the data source. I add another layer to remove the legend, but this is optional. The resulting map, which has called a few API services, is below!
ggmap(bos.map)+
geom_leg(aes(x = startLon, y = startLat, xend = endLon, yend = endLat, color='red'), data =route.df) + theme(legend.position="none")
The Boston map that has routing information as an additional layer.
APIs without a Package
A simple way to start learning about APIs without packages is to look at http://newsapi.org. This is a simple service where the GET requests are straightforward. If you look at the webpage you will see a live example with a URL “https://newsapi.org/v1/articles?source=techcrunch&apiKey={API_KEY}” and its JSON formatted response.
Deconstructing this URL, you will notice “source=” followed by “techcrunch” and then an API key. The value following the source parameter is a string that represents a news service, like “cnn.” This tells the service which of its many news sources to return. The only other parameter is the API key, which is a unique identifier letting the administrator know who is accessing the service. In the earlier example the Google Maps service did not ask for a key, but most API services do. I suggest signing up at www.newsapi.org for a key and then adding it into the code below.
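The same URL can be assembled from its parts, which makes it easier to swap the source or the key. This is a minimal sketch in base R; build.url is a hypothetical helper written for this illustration, not a function from any package.

```r
# Hypothetical helper: assemble a GET URL from a base address and named parameters.
build.url <- function(base, params) {
  # URL-encode each value, then join the pairs as name=value separated by "&".
  values <- vapply(params, utils::URLencode, character(1), reserved = TRUE)
  pairs  <- paste0(names(params), "=", values)
  paste0(base, "?", paste(pairs, collapse = "&"))
}

# Replace the Xs with your own key after signing up.
example.url <- build.url("https://newsapi.org/v1/articles",
                         list(source = "cnn", apiKey = "XXXXXXXXXX"))
example.url
```

Changing `source = "cnn"` to another source name changes which news service is requested.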
Since there is no R package for the News API service, you will need to parse the response yourself. If you look at the documentation you will see the response is given in JSON. A great package for JSON responses is jsonlite. This will let you access and organize data from a JSON webpage. The documentation also shows an example URL, which I stored as an object called news.url. In this example the source is CNN, but that can be changed to another. Remember to change the Xs to your key after signing up!
library(jsonlite)
news.url<-'http://newsapi.org/v1/articles?source=cnn&apiKey=XXXXXXXXXX'
Using the fromJSON function along with the web address will access the webpage and return a list containing the API’s response.
cnn<-fromJSON(news.url)
I like to examine the response because no two services are exactly alike. Usually I use str to understand the structure of the response. The screen shot below shows what was returned. First was an “ok” or status 200 from the service. Next, a header parameter letting you know that the news source was CNN followed by how the articles were sorted. The most interesting part of the API response is the $articles element. This represents a data frame with $author, $title, $description and $url columns.
The NewsAPI.org response using fromJSON.
Thus, to extract the data frame from within the list, use the code below. This will let you save the output or start to do some text analysis.
cnn.df<-cnn$articles
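If you do not have a key yet, you can still practice the parsing step. The sketch below feeds fromJSON a hand-written string shaped like the structure described above (a status, the source, the sort order and an articles array). The authors, titles and URLs are invented placeholders, not real News API output.

```r
library(jsonlite)

# Made-up response text for illustration only.
sample.json <- '{
  "status": "ok",
  "source": "cnn",
  "sortBy": "top",
  "articles": [
    {"author": "A. Writer", "title": "First headline",
     "description": "Short summary one.", "url": "https://example.com/1"},
    {"author": "B. Writer", "title": "Second headline",
     "description": "Short summary two.", "url": "https://example.com/2"}
  ]
}'

sample.response <- fromJSON(sample.json)

# Look at the top-level structure before digging in.
str(sample.response, max.level = 1)

# Only pull out the articles when the service reported success.
if (identical(sample.response$status, "ok")) {
  sample.df <- sample.response$articles
  print(sample.df$title)
}
```

Checking the status element first is a cheap safeguard, because an error response usually has no articles element to extract.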
She loves, she loves me not…using Microsoft’s Emotion API.
Microsoft’s cognitive services offer many interesting APIs. From optical character recognition to text analysis and machine vision, there is a lot that can be explored. In this example I want to check my Valentine’s emotional reaction to my gift by passing their picture to the API. We will need to POST our picture to the service, let the service analyze it and then receive the results in the response. If you are coding along, sign up at https://www.microsoft.com/cognitive-services/en-us/. The UX is horrible: the service doesn’t automatically send a verification email. Instead, there is a link at the top, next to your email address, that you must click to send it before you can get keys. Without verification the service just returns a 401 (unauthorized).
There is an R package called “Roxford” which is used for accessing the Microsoft services. However, to show you what’s going on under the hood, we will first code a POST request without it.
The libraries I will use are httr and rvest. The first provides tools for working with URLs and HTTP. The second has a function I like called pluck for working with lists.
library(httr)
library(rvest)
Let’s define where the photo will be “POST”ed. Check the cognitive services documentation for each service because each one has a unique URL. This link is for the emotion recognition service.
msft.url <- 'https://westus.api.cognitive.microsoft.com/emotion/v1.0/recognize'
Next, create a string for the image location and store it in an object called img.url. You could use a local path with some minor code changes, but for now just find a photo online. This photo is of Ice Cube, who is usually known for being angry!
Ice Cube does not look happy!
The last input is your API key. Microsoft Cognitive Services requires a separate key for each type of service. For example, the Computer Vision service uses one key string and this emotion detection service uses another.
emo.key<-'REPLACE_with_your_key'
Within the POST function you need to specify the image location as a URL. It’s easier to do this as an object shown below as img.loc referring to the img.url string. The img.loc is then used in the body= parameter of the POST code.
img.loc <- list(url = img.url)
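It can help to see what this list looks like once it is turned into JSON, since that is what travels in the request body. The sketch below uses jsonlite directly as an approximation of what a JSON-encoded body contains. The example.com address and the object names are placeholders, not the photo used in this post.

```r
library(jsonlite)

# Placeholder image address for illustration only.
sample.img.url <- 'https://example.com/photo.jpg'
sample.img.loc <- list(url = sample.img.url)

# auto_unbox = TRUE writes a length-one vector as a plain value instead of an array.
sample.body <- toJSON(sample.img.loc, auto_unbox = TRUE)
sample.body

# Reading the JSON back gives the original list element.
fromJSON(sample.body)$url
```

The service only needs a small JSON object with a url field, which is why a one-element named list is enough here.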
Now it’s time to POST the photo to the service. Using POST, specify where the request should go. The next parameters describe the request: the content type declares that the body is JSON, and a header carries the API key authorizing you to access the service. The body itself, img.loc, is encoded as JSON. The resulting object msft.response is the response returned from the Emotion service.
msft.response <- POST(
url = msft.url,
content_type('application/json'),
add_headers(.headers = c('Ocp-Apim-Subscription-Key' = emo.key)),
body = img.loc,
encode = 'json')
This will direct your R session to pass the photo to the emotion API and collect the response. Using the content function on the response, msft.response, will extract the API content for analysis. The resulting list has a bounding box for each face (in case there is more than one) and numeric scores for 8 emotions. The service scores each face on “anger,” “contempt,” “disgust,” “fear,” “happiness,” “neutral,” “sadness,” and “surprise.”
emo.tags <- content(msft.response)
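Because an unverified or mistyped key comes back as a 401, it can be worth checking the status of a response before working with its content. This is a minimal sketch using httr's http_error and status_code functions; check.response is a hypothetical helper name introduced for this illustration.

```r
library(httr)

# Hypothetical helper: stop early with a readable message when the service
# reports an HTTP error such as 401 (unauthorized); otherwise pass the
# response through unchanged.
check.response <- function(resp) {
  if (http_error(resp)) {
    stop("Request failed with HTTP status ", status_code(resp), call. = FALSE)
  }
  invisible(resp)
}
```

With the objects from this post you would call check.response(msft.response) before extracting the content.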
Since I care about the emotions I next use pluck to extract the parts of the list that contain emotional scores. The API response is passed in followed by the name of the list element “scores.”
emo.tags<-pluck(emo.tags,'scores')
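Depending on your rvest version, pluck may not be available, and the purrr function of the same name behaves differently. The sketch below shows a base R way to pull the same named element out of every item in a list. The faces object is a made-up stand-in for the parsed API content, with invented numbers and only two emotions to keep it short.

```r
# Made-up stand-in for the parsed API content: one list element per face.
faces <- list(
  list(faceRectangle = list(left = 10, top = 20, width = 50, height = 50),
       scores = list(anger = 0.70, happiness = 0.05)),
  list(faceRectangle = list(left = 90, top = 25, width = 48, height = 48),
       scores = list(anger = 0.02, happiness = 0.90))
)

# For every face, keep only the element named "scores".
face.scores <- lapply(faces, `[[`, "scores")
str(face.scores)
```

The result is still a list with one element per face, each holding that face's emotion scores.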
The 8 emotion scores for each face are still stored as a list. Thus I organize them into a single table. To do so, do.call is passed rbind (row bind) and then the list. The end result is a table with 8 columns (strictly speaking a matrix rather than a data frame), shown below the code. Each row represents a face in the photo.
emo.df<-do.call(rbind,emo.tags)
To quickly find the highest value I like to use max.col. This function will return the column number with the highest value. This is used to index the colnames of the data frame. In this example, the first column contains the highest value. So colnames(emo.df)[1] would return the name “anger.” Multiple emotions will be returned if there is more than 1 face in the photo.
colnames(emo.df)[max.col(emo.df)]
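Here is a self-contained sketch of the row bind and max.col steps on invented scores for two faces, so you can try the idea without a key. It flattens each face with unlist first so the result is a plain numeric matrix, and it sets ties.method = "first" because the default breaks ties at random. All numbers and object names are made up for illustration.

```r
# Invented emotion scores for two faces (not real API output).
sample.scores <- list(
  list(anger = 0.70, contempt = 0.05, disgust = 0.05, fear = 0.02,
       happiness = 0.03, neutral = 0.10, sadness = 0.03, surprise = 0.02),
  list(anger = 0.01, contempt = 0.01, disgust = 0.01, fear = 0.01,
       happiness = 0.90, neutral = 0.04, sadness = 0.01, surprise = 0.01)
)

# One row per face, one named column per emotion.
sample.mat <- do.call(rbind, lapply(sample.scores, unlist))

# Column index of the largest score in each row, then the matching name.
top.emotion <- colnames(sample.mat)[max.col(sample.mat, ties.method = "first")]
top.emotion
```

Each element of top.emotion corresponds to one face, in the same order as the rows of the matrix.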
Emotion as a Web App
Now that you have seen some example APIs and specifically how a POST occurs let’s build a simple Shiny app that calls on these APIs for analysis. This will make it easier for me to see if my Valentine is happy with my gift because I can quickly snap a pic, host it online and post to multiple services. For education’s sake I added the cognitive face and vision APIs to the emotion API. While I could write another post specifically for Shiny apps I will just do a simple walk through of this code while using the library(Roxford) functions which are convenient wrappers for the POST code above.
Shiny apps need a UI for defining the user interface and a back end server for creating the objects and information to be rendered. The code below represents the ui.R file that creates the basic user interface. It begins as a fluidPage. A fluid page layout consists of rows, which in turn include columns. In each of these cells the different objects are added. After adding a page title I declare a fluidRow inside the page. Its first column has a width of 3 and contains a text box input for an image URL and a simple submit button. The next column in the row has a width of 8. The total width of the columns in any row must not exceed 12. Nested within this column is another row of the same width. The tabPanel will show the image that was sent to the API service. Below that are some text outputs representing the API responses.
The key takeaway is the naming of objects such as user.gen, user.age, user.caption, and user.emotion. Additionally the user will input a URL picture defined as user.url.pic. These are objects that will be defined in the server code and rendered back to the front end of the application.
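To make the layout description concrete, here is a rough sketch of what such a ui.R could look like. It is an approximation built from the description above, not the original file. The page title, the labels, the "user.image" output id and the choice of htmlOutput for the picture are assumptions, while the remaining ids follow the names mentioned in this post.

```r
library(shiny)

ui <- fluidPage(
  titlePanel("Valentine Reaction Checker"),    # assumed title
  fluidRow(
    # Left column (width 3): image URL input and a submit button
    column(3,
      textInput("user.url.pic", "Image URL"),
      submitButton("Submit")
    ),
    # Right column (width 8): nested row holding the image and the text outputs
    column(8,
      fluidRow(
        column(12,
          tabsetPanel(
            tabPanel("Image", htmlOutput("user.image"))   # assumed output id
          ),
          textOutput("user.gen"),
          textOutput("user.age"),
          textOutput("user.caption"),
          textOutput("user.emotion")
        )
      )
    )
  )
)
```

Each id used here has to be matched on the server side: input$user.url.pic is read there, and the outputs are filled through output$user.gen, output$user.age and so on.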
Now that you have the data set, check out the rest of this post at ODSC!