Pentesting APIs
A practical guide to discovering, fingerprinting, and exploiting APIs

Author: Maurício Harley
Product type: Paperback
Published: Sep 2024
Publisher: Packt
ISBN-13: 9781837633166
Length: 290 pages
Edition: 1st Edition

What is an API?

There are a few definitions. For example, Red Hat says that APIs are “a set of definitions and protocols for building and integrating application software,” whereas Amazon Web Services (AWS) states that “APIs are mechanisms that enable two software components to communicate with each other using a set of definitions and protocols.” APIs are certainly not limited to two software components, but both definitions share one phrase: “definitions and protocols.” Let’s craft our own definition by making a comparison with the analog world.

An API is a bridge (a communication path) between two distinct parts (pieces of code), whether or not they belong to the same city (the same program). By following a set of pre-established traffic rules (protocols) and conventions (definitions), vehicles (requests and responses) can flow freely between both sides. Sometimes, APIs also have speed controls (throttling mechanisms) that are enforced as needed.

As with all kinds of communication, definitions need to be established first. This rule is not limited to the digital world. I can’t ask you to sell me a car if you have no idea what selling is or that a car is a type of vehicle. Protocols are also paramount. Unless you are donating a product, a sale starts with me paying you for the product I want and you handing it over to me, which includes giving me change if necessary.

In terms of APIs, definitions relate to which types and lengths of data are acceptable between the communicating partners. A requester cannot send certain data as a string when the receiver is expecting a number, for example. Negative numbers may also pose an additional challenge to badly written APIs. When dealing with data lengths, minimum and especially maximum sizes apply. You will learn later how important it is to block data chunks that are bigger than what your API is able to handle.
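
To make this concrete, here is a minimal sketch in Python of the kind of definition checking just described. The field names, the limits, and the validate_order function are all hypothetical, chosen only to illustrate type, sign, and length validation:

MAX_NAME_LENGTH = 64  # hypothetical maximum length for this field

def validate_order(payload: dict) -> list[str]:
    """Return a list of validation errors for an incoming order payload."""
    errors = []

    quantity = payload.get("quantity")
    # Type check: reject a string where a number is expected.
    if not isinstance(quantity, int) or isinstance(quantity, bool):
        errors.append("quantity must be an integer")
    # Sign check: negative numbers often trip up badly written APIs.
    elif quantity < 1:
        errors.append("quantity must be a positive integer")

    name = payload.get("product_name")
    # Length check: enforce a maximum size on incoming data.
    if not isinstance(name, str):
        errors.append("product_name must be a string")
    elif len(name) > MAX_NAME_LENGTH:
        errors.append(f"product_name exceeds {MAX_NAME_LENGTH} characters")

    return errors

print(validate_order({"quantity": "3", "product_name": "x" * 100}))
# ['quantity must be an integer', 'product_name exceeds 64 characters']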

Protocols are the second component of an API. Like their counterparts in the networking arena, they are responsible for guaranteeing that independently written software can communicate effectively. Even though you might be reading this book primarily because of web-bound APIs and ways to explore their security flaws, even inside your computer there are APIs working between your operating system (OS) and your Wi-Fi card, with definitions and protocols just like their more famous web cousins. If you are familiar with the Transmission Control Protocol/Internet Protocol (TCP/IP) stack, the following figure will look familiar. Communication over TCP/IP can only happen because each small rectangle has its own lower-level protocols implemented in a way that allows the same Network Interface Card (NIC) to be used in different OSs, and those different OSs can communicate with each other:

Figure 1.1 – Communication with TCP/IP
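
As a down-to-earth illustration of independently written software communicating through a shared protocol stack, the following sketch uses Python’s standard socket module to run a tiny TCP exchange. The host, port, and the uppercase-echo “protocol” are arbitrary placeholders:

import socket

def run_server(host: str = "127.0.0.1", port: int = 9000) -> None:
    with socket.create_server((host, port)) as server:
        conn, _ = server.accept()       # wait for one client to connect
        with conn:
            data = conn.recv(1024)      # read up to 1024 bytes
            conn.sendall(data.upper())  # reply according to our tiny protocol

def run_client(host: str = "127.0.0.1", port: int = 9000) -> None:
    with socket.create_connection((host, port)) as client:
        client.sendall(b"hello over tcp/ip")
        print(client.recv(1024))        # b'HELLO OVER TCP/IP'

# Run run_server() in one process and run_client() in another; the two
# programs interoperate only because both follow the same protocol stack.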

Every API should be well documented so that anyone who wants to use it does not have to request information from its creators or maintainers. Can you imagine the avalanche of NIC manufacturers sending inquiries to Defense Advanced Research Projects Agency (DARPA) scientists to understand how the data link layer should be developed, which data structures should be in place, and which sizes and types of data should be considered, every time a new product was going to be released?

When documenting an API, at the very least the definitions of data types and the protocol(s) adopted need to be made explicit. Well-documented APIs usually also include examples of their usage, along with the exceptions that may be raised when something goes wrong, such as bad data manipulation or unexpected behavior.
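
The sketch below shows one way such documentation can live alongside the code itself, as a Python docstring. The create_user function, its fields, and its limits are hypothetical; the point is that the types, lengths, protocol, and failure modes are all stated explicitly for the consumer:

def create_user(username: str, age: int) -> dict:
    """Create a user record.

    Protocol: intended to back an HTTP POST endpoint that returns JSON.

    Args:
        username: a string of 3 to 32 characters.
        age: an integer between 0 and 150.

    Returns:
        A dict such as {"username": "alice", "age": 30}.

    Raises:
        ValueError: if a field has the wrong type, length, or range.
    """
    if not isinstance(username, str) or not 3 <= len(username) <= 32:
        raise ValueError("username must be a 3-32 character string")
    if not isinstance(age, int) or not 0 <= age <= 150:
        raise ValueError("age must be an integer between 0 and 150")
    return {"username": username, "age": age}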

A brief history of APIs

You will read a lot about web APIs in this book. However, as you saw in the TCP/IP example, APIs were not created along with the web. The idea was born many decades ago, in 1951, when Maurice Wilkes, David Wheeler, and Stanley Gill, three British computer scientists, proposed the concept while building the Electronic Delay Storage Automatic Calculator (EDSAC), one of the very first computers. Their book, The Preparation of Programs for an Electronic Digital Computer, focused primarily on explaining the library they built and its subroutines (should you need to develop a program to run on the EDSAC). Note how, starting with the title, the book is concerned with explaining how the computer could be used. It became the first API documentation we have on record.

Moving on to the 1960s and 70s, the usage of computers grew, driven by improvements in electric and electronic circuitry. Computers also started to shrink, although they were still the size of entire rooms. APIs were now tied to the need for developers not to worry about the details of how displays and other peripherals worked. This was the era of mainframes, and the advent of new ways to interact with a computer, such as terminals and printers, posed additional challenges to program developers. In 1975, Christopher Date and Edgar Codd, a British mathematician and computer scientist respectively, released a paper titled The relational and network approaches: Comparison of the application programming interfaces. In this work, APIs were proposed for databases, an approach that is still in use today.

In the 1980s, we started seeing the commercial exploration of consumer networks. In 1984, The Electronic Mall, an online shopping service sold by CompuServe, was offered to the company’s subscribers, who could buy products from other merchants through its Consumer Information Service network. You may ask yourself where the API is in all of this. With the growing usage of computer networks, developers needed to make their code more sophisticated, and requirements to access code and libraries located on remote computers began to show up. It was in 1981 that the term Remote Procedure Call (RPC) was coined by the American computer scientist Bruce Nelson. The concept is as simple as a client sending a request to a network server, which then processes the request (executes some computation) and returns a result to the client. RPC is therefore what we know as a message passing mechanism, in which some channel (usually a computer network) allows communication between different elements through message exchanges.
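
To make the idea tangible, here is a minimal RPC sketch using Python’s standard xmlrpc modules (one of many RPC implementations, not necessarily the one Nelson described). The host, port, and the add function are illustrative placeholders:

from xmlrpc.server import SimpleXMLRPCServer

def add(a: int, b: int) -> int:
    """The computation the server performs on behalf of remote clients."""
    return a + b

def run_server() -> None:
    server = SimpleXMLRPCServer(("localhost", 8000))
    server.register_function(add)  # expose add() as a remote procedure
    server.serve_forever()         # receive requests, compute, reply

# In another process, the client calls the procedure as if it were local,
# while the request and result actually travel over the network:
#   import xmlrpc.client
#   proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
#   print(proxy.add(2, 3))  # 5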

In the 1990s, that is, more than 40 years after the idea of APIs was first used, the internet came into general use around the world (in the USA, this had happened nearly a decade earlier). Previously restricted to research institutions and government agencies, commercial use of the network was now fully possible. This increased the adoption of APIs even further, and they became the de facto way of exchanging information between programs. New websites came up, new consumer products and services became commercially accessible through the internet, and it was clear that software needed standards to communicate. Java, a programming language created by Sun Microsystems (now part of Oracle Inc.), played a vital role. In 1984, John Gage, employee #21 at Sun Microsystems, coined the phrase “The network is the computer.” In his own words, “We based our vision of an interconnected world on open and shared standards.” Eleven years later, James Gosling, another Sun Microsystems employee, created the Java programming language, which would later evolve into Java 2 and become the seed of notable APIs released as part of the Java 2 Enterprise Edition (J2EE, now Jakarta EE) and Java 2 Micro Edition (J2ME).

In the 2000s, the internet had pretty much consolidated. The ever-growing number of companies joining the network, along with massive numbers of developers creating new web solutions, demanded a quick and effective way to establish a communication path between clients (at the time, mostly browsers) and web servers. In 2000, a PhD dissertation entitled Architectural Styles and the Design of Network-based Software Architectures, by Roy Fielding, proposed a structured way for clients and servers to exchange messages on the internet. Fielding proposed Representational State Transfer (REST), which became one of the most popular architectural styles for APIs in the world. This decade also saw the explosion of cloud computing offerings, both private and public, which mostly implemented REST. It also saw the arrival of Web 2.0 in 2004, which framed a new, more user-centered way of using the internet, as well as the birth of applications such as Facebook, X (previously Twitter), Reddit, and many more.
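
In practice, a REST interaction maps resources to URLs and actions to HTTP methods (GET, POST, PUT, DELETE). The following sketch uses only Python’s standard library; the URL is a hypothetical placeholder:

import json
import urllib.request

def get_resource(url: str) -> dict:
    """GET a resource representation and decode its JSON body."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read())

# Hypothetical usage: fetch the representation of user 42.
# print(get_resource("https://api.example.com/users/42"))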

Ten years later, in the 2010s, web protocols had evolved even further. This was the decade of social media and apps, with millions of requests per minute. To give you an idea, in 2013, each minute on the internet carried, among other traffic, 461,805 Facebook logins, 38,000 photos uploaded to Instagram, and 347,000 tweets. This was also the decade when containers and microservice-based applications saw their most significant adoption. The release of Kubernetes, an open source container orchestrator, expanded the possibilities for dynamic applications on the internet. It was in the 2010s that the term Web 3.0 was coined, with its focus primarily on blockchain. APIs became fundamental for companies creating and delivering their products to the public.

As Tears for Fears’ 1985 hit song Head Over Heels puts it, it’s “funny how time flies.” Time really flew, and we arrived in the 2020s. Applications keep modernizing, but systems now run in an even more distributed fashion. The advent of concepts such as edge computing and the Internet of Things (IoT) increased the complexity of the whole scenario and demanded that APIs evolve to encompass such changes. Web 3.0 was, in fact, only put into practice in 2021. We currently have applications being designed and developed around an API, and not the opposite, as was the case in the early stages of the technology.
