RStudio for R Statistical Computing Cookbook

Over 50 practical and useful recipes to help you perform data analysis with R by unleashing every native RStudio feature

Product type: Paperback
Published in: Apr 2016
ISBN-13: 9781784391034
Length: 246 pages
Edition: 1st Edition
Author: Andrea Cirillo
Table of Contents

Preface
1. Acquiring Data for Your Project
2. Preparing for Analysis – Data Cleansing and Manipulation
3. Basic Visualization Techniques
4. Advanced and Interactive Visualization
5. Power Programming with R
6. Domain-specific Applications
7. Developing Static Reports
8. Dynamic Reporting and Web Application Development
Index

Accessing an API with R

As we mentioned before, an ever-increasing proportion of our data resides on the Web and is made available through web APIs.

Note

In computer programming, API stands for application programming interface: a group of procedures, protocols, and tools used for building software applications. APIs expose software in terms of input, output, and processes.

Web APIs are developed as an interface between web applications and third parties.

A typical web API consists of a set of HTTP request messages whose responses have a predefined structure, usually in the XML or JSON format.
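As a minimal sketch of what such a predefined structure looks like from R, the snippet below parses a hand-written JSON payload with the jsonlite package. The payload and its field names are invented for illustration and do not come from any particular API.

```r
library(jsonlite)

# Hand-written payload standing in for the body of an API response
# (field names are illustrative assumptions, not a real API's schema)
payload <- '{"user": "jdoe", "visits": 42, "tags": ["web", "mobile"]}'

# fromJSON() turns the JSON object into a named R list
parsed <- fromJSON(payload)

parsed$user          # a character value
parsed$visits        # a number
length(parsed$tags)  # the JSON array becomes a character vector
```

Because the structure is known in advance, fields can be accessed by name as soon as the response has been parsed.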

Typical use cases for API data involve web and mobile applications, for instance, Google Analytics data or data regarding social networking activities.

The successful web application If This Then That (IFTTT), for instance, lets you link together different applications, making them share data with each other and building powerful and customizable workflows.

This useful job is done by leveraging the application's API (if you don't know IFTTT, just navigate to https://ifttt.com, and I will see you there).

Using R, it is possible to authenticate with and get data from any API that adheres to the OAuth 1.0 or OAuth 2.0 standards, which are nowadays the most popular ones (even though opinions about these protocols are changing; refer to this popular post by Eran Hammer, former lead author of the OAuth 2.0 specification, at http://hueniverse.com/2012/07/26/oauth-2-0-and-the-road-to-hell/). Moreover, dedicated packages have been developed for a lot of APIs.

This recipe shows how to access custom APIs and leverage packages developed for specific APIs.

In the There's more... section, suggestions are given on how to develop custom functions for frequently used APIs.

Getting ready

The httr package, once again a product of our benefactor Hadley Wickham, provides a complete set of functionalities for sending and receiving data through the HTTP protocol on the Web. Take a look at the quick-start guide hosted on GitHub to get a feeling for httr functionalities (https://github.com/r-lib/httr).

Among those functionalities, functions for dealing with APIs are provided as well.

Both OAuth 1.0 and OAuth 2.0 interfaces are implemented, making this package really useful when working with APIs.

Let's look at how to get data from the GitHub API. By changing small sections, I will point out how you can apply it to whatever API you are interested in.

Let's now actually install the httr package:

install.packages("httr")
library(httr)

How to do it…

  1. The first step to connect with the API is to define the API endpoint. Specifications for the endpoint are usually given within the API documentation. For instance, GitHub gives this kind of information at http://developer.github.com/v3/oauth/.

    In order to set the endpoint information, we are going to use the oauth_endpoint() function, which requires us to set the following arguments:

    • request: This is the URL that is required for the initial unauthenticated token. It is not used in OAuth 2.0, so you can leave it NULL in this case, since the GitHub API is based on this protocol.
    • authorize: This is the URL where it is possible to gain authorization for the given client.
    • access: This is the URL where the exchange for an authenticated token is made.
    • base_url: This is an optional base URL. When it is set, the request, authorize, and access URLs can be given as paths relative to it.

      In the GitHub example, this will translate to the following line of code:

      github_api <- oauth_endpoint(request   = NULL,
                                   authorize = "https://github.com/login/oauth/authorize",
                                   access    = "https://github.com/login/oauth/access_token",
                                   base_url  = "https://github.com/login/oauth")
  2. Create an application to get a key and secret token. Moving on with our GitHub example, in order to create an application, you will have to navigate to https://github.com/settings/applications/new (assuming that you are already authenticated on GitHub).

    Be aware that no particular URL is needed as the homepage URL, but a specific URL is required as the authorization callback URL.

    This is the URL that the API will redirect to after the method invocation is done.

    As you would expect, since we want to establish a connection from GitHub to our local PC, you will have to redirect the API to your machine, setting the Authorization callback URL to http://localhost:1410.

    After creating your application, you can get back to your R session to establish a connection with it and get your data.

  3. After getting back to your R session, you now have to set your OAuth credentials through the oauth_app() and oauth2.0_token() functions and establish a connection with the API, as shown in the following code snippet:
    app <- oauth_app("your_app_name",
                     key    = "your_app_key",
                     secret = "your_app_secret")
    API_token <- oauth2.0_token(github_api, app)
  4. This is where you actually use the API to get data from your web-based software. Continuing on with our GitHub-based example, let's request some information about API rate limits:
    request <- GET("https://api.github.com/rate_limit", config(token = API_token))
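Once the request in step 4 has returned, the response still has to be checked and parsed. The following is a minimal sketch of that last step, not part of the original recipe: rate_limit_summary() is a hypothetical helper, and it assumes that the parsed response nests its figures under resources$core, as the GitHub documentation describes. Since a live token is needed for the real call, a hand-built list with made-up values stands in for the parsed response here.

```r
# Hypothetical helper: pull the core rate-limit figures out of a parsed response
rate_limit_summary <- function(parsed) {
  core <- parsed$resources$core
  list(limit     = core$limit,
       remaining = core$remaining,
       # reset is assumed to be expressed in seconds since the Unix epoch
       resets_at = as.POSIXct(core$reset, origin = "1970-01-01", tz = "UTC"))
}

# With a live token from step 3, the parsed response would come from httr:
#   stop_for_status(request)                  # fail early on HTTP errors
#   parsed <- content(request, as = "parsed") # JSON body as an R list

# Illustrative stand-in with made-up values
parsed <- list(resources = list(core = list(limit = 5000,
                                            remaining = 4999,
                                            reset = 1372700873)))

limits <- rate_limit_summary(parsed)
limits$remaining
```

Keeping the parsing logic in a small function of its own makes it possible to test it without touching the network.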

How it works...

Be aware that defining the endpoint, as done in step 1, is required for both OAuth 1.0 and OAuth 2.0 APIs. As we noted earlier, the only difference at this stage is that OAuth 2.0 does not need a request URL.

Note

Endpoints for popular APIs

The httr package comes with a set of endpoints that are already implemented for popular APIs, and specifically for the following websites:

  • LinkedIn
  • Twitter
  • Vimeo
  • Google
  • Facebook
  • GitHub

For these APIs, you can substitute the call to oauth_endpoint() with a call to the oauth_endpoints() function, for instance:

oauth_endpoints("github")
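As a small sketch of what these predefined endpoints contain, the snippet below compares the GitHub endpoint with the Twitter one. It assumes that the installed httr version ships both entries and that the Twitter entry is still defined as an OAuth 1.0 endpoint.

```r
library(httr)

github_ep  <- oauth_endpoints("github")   # OAuth 2.0 endpoint
twitter_ep <- oauth_endpoints("twitter")  # assumed to be an OAuth 1.0 endpoint

# OAuth 2.0 needs no request URL, so this element is NULL for GitHub
is.null(github_ep$request)

# OAuth 1.0 requires a request URL, so it is filled in for Twitter
is.null(twitter_ep$request)

# The authorize and access URLs come already assembled
github_ep$authorize
github_ep$access
```

The returned object can be passed to oauth2.0_token() in place of the endpoint defined by hand in step 1.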

The core feature of the OAuth protocol is secure authentication. On the client side, this relies on a key and a secret token, both of which must be kept private.

The typical way to get a key and a secret token to access an API involves creating an app within the service providing the API.
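One common way to keep the key and secret out of your scripts is to read them from environment variables. The sketch below assumes two hypothetical variable names, GITHUB_APP_KEY and GITHUB_APP_SECRET; they are set inline only so that the example runs on its own, whereas in practice they would be defined outside the script, for instance in a .Renviron file.

```r
library(httr)

# For demonstration only: normally these are set outside the script.
# Both variable names are hypothetical.
Sys.setenv(GITHUB_APP_KEY    = "your_app_key",
           GITHUB_APP_SECRET = "your_app_secret")

# Build the app from the environment instead of hardcoding credentials
app <- oauth_app("your_app_name",
                 key    = Sys.getenv("GITHUB_APP_KEY"),
                 secret = Sys.getenv("GITHUB_APP_SECRET"))

app$appname
```

With this arrangement, the script can be shared or committed to version control without exposing the credentials.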

The callback URL

Within the web API domain, a callback URL is the URL that is called by the API after the answer is given to the request. A typical example of a callback URL is the URL of the page navigated to after completing an online purchase.

In this example, when we finish at the checkout of the online store, an API call is made to the payment provider.

After completing the payment operation, the API will navigate again to the online store at the callback URL, usually to a thank you page.
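In the recipe above, the callback URL is the address on your own machine that httr listens on while the authorization takes place. As a quick check, and assuming default settings, the oauth_callback() function reports the URL that httr will use, which should match the one registered for the application in step 2.

```r
library(httr)

# URL that httr uses as the OAuth callback; with default settings this is
# expected to point to port 1410 on the local machine
callback <- oauth_callback()
callback
```

If the two URLs differ, the service will usually refuse to complete the authorization.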

There's more...

You can also write custom functions to handle APIs. When frequently dealing with a particular API, it can be useful to define a set of custom functions in order to make it easier to interact with.

Basically, the interaction with an API can be summarized with the following three categories:

  • Authentication
  • Getting content from the API
  • Posting content to the API

Authentication can be handled by leveraging the httr package's authenticate() function and writing a function as follows:

api_auth <- function(path = "api_path", password){
  # path doubles as the user name in this sketch
  authenticate(user = path, password = password)
}

You can get the content from the API through the GET() function of the httr package:

api_get <- function(path = "api_path", password){
  auth <- api_auth(path, password)
  request <- GET("https://api.com", path = path, auth)
  # return the parsed body of the response
  content(request)
}

Posting content will be done in a similar way through the POST() function:

api_post <- function(post_body, path = "api_path", password, ...){
  auth <- api_auth(path, password)
  # the body must be a list so that it can be converted to JSON
  stopifnot(is.list(post_body))
  body_json <- jsonlite::toJSON(post_body)
  request <- POST("https://api.application.com", path = path, body = body_json, auth, ...)
  request
}
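When building the JSON body, it is worth checking what toJSON() actually produces. As a small sketch with an invented post_body, the snippet below shows that length-one values are serialized as arrays by default and as plain values when auto_unbox = TRUE is set; which form is correct depends on the API you are posting to.

```r
library(jsonlite)

# Invented body, used only for illustration
post_body <- list(title = "Found a bug", labels = c("bug", "urgent"))

# Default behavior: the single title is wrapped in a JSON array
boxed <- toJSON(post_body)

# With auto_unbox = TRUE, length-one vectors become plain JSON values
unboxed <- toJSON(post_body, auto_unbox = TRUE)

boxed
unboxed
```

Many APIs expect the unboxed form, so it may be necessary to pass auto_unbox = TRUE inside a function such as api_post().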