LLVM Cookbook: Over 80 engaging recipes that will help you build a compiler frontend, optimizer, and code generator using LLVM

By Mayur Pandey and Suyog Sarda

Chapter 2. Steps in Writing a Frontend

In this chapter, we will cover the following recipes:

  • Defining a TOY language
  • Implementing a lexer
  • Defining Abstract Syntax Tree
  • Implementing a parser
  • Parsing simple expressions
  • Parsing binary expressions
  • Invoking a driver for parsing
  • Running lexer and parser on our TOY language
  • Defining IR code generation methods for each AST class
  • Generating IR code for expressions
  • Generating IR code for functions
  • Adding IR optimization support

Introduction


In this chapter, you will learn how to write a frontend for a language. Using a custom-defined TOY language, the recipes show how to write a lexer and a parser, and how to generate IR code from the Abstract Syntax Tree (AST) produced by the frontend.

Defining a TOY language


Before implementing a lexer and parser, the syntax and grammar of the language must be determined. In this chapter, a TOY language is used to demonstrate how a lexer and a parser can be implemented. The purpose of this recipe is to show how a language is skimmed through. For this purpose, the TOY language is simple but meaningful.

A language typically has some variables, some function calls, some constants, and so on. To keep things simple, our TOY language has only numeric constants of 32-bit integer type and variables that need not declare their type (as in Python, and in contrast to C/C++/Java, which require a type declaration).

How to do it…

The grammar can be defined as follows (the production rules are defined below, with non-terminals on the left-hand side (LHS) and a combination of terminals and non-terminals on the right-hand side (RHS); when an LHS is encountered, it yields the appropriate RHS defined in the production rule...

Implementing a lexer


The lexer is part of the first phase of compiling a program. It tokenizes the program's input stream, and the parser then consumes these tokens to construct an AST. The language to tokenize is generally a context-free language. A token is a string of one or more characters that are significant as a group, and the process of forming tokens from an input stream of characters is called tokenization. Certain delimiters are used to identify groups of words as tokens. Tools such as LEX can automate lexical analysis, but the TOY lexer demonstrated in the following procedure is handwritten in C++.
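
To make tokenization concrete, here is a minimal sketch of a handwritten lexer. It is not the recipe's own code: the token kinds, the `Token` struct, the `tokenize` function, and the `def` keyword are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical token kinds; the recipe's own token enum is not shown in this excerpt.
enum TokenKind { TOK_EOF, TOK_DEF, TOK_IDENTIFIER, TOK_NUMERIC, TOK_CHAR };

struct Token {
  TokenKind kind;
  std::string text; // spelling of the token as it appeared in the input
  int value;        // numeric value for TOK_NUMERIC, otherwise 0
};

static bool isSpace(char C) { return std::isspace(static_cast<unsigned char>(C)) != 0; }
static bool isAlpha(char C) { return std::isalpha(static_cast<unsigned char>(C)) != 0; }
static bool isAlnum(char C) { return std::isalnum(static_cast<unsigned char>(C)) != 0; }
static bool isDigit(char C) { return std::isdigit(static_cast<unsigned char>(C)) != 0; }

// Splits Input into tokens: words, integer literals, and single characters.
// Whitespace acts as a delimiter and is discarded.
std::vector<Token> tokenize(const std::string &Input) {
  std::vector<Token> Tokens;
  std::size_t Pos = 0;
  while (Pos < Input.size()) {
    if (isSpace(Input[Pos])) {
      ++Pos;
      continue;
    }
    std::size_t Start = Pos;
    if (isAlpha(Input[Pos])) {
      // A word is either the (assumed) keyword "def" or an identifier.
      while (Pos < Input.size() && isAlnum(Input[Pos]))
        ++Pos;
      std::string Word = Input.substr(Start, Pos - Start);
      Tokens.push_back({Word == "def" ? TOK_DEF : TOK_IDENTIFIER, Word, 0});
    } else if (isDigit(Input[Pos])) {
      // std::stoi throws if the literal does not fit in an int.
      while (Pos < Input.size() && isDigit(Input[Pos]))
        ++Pos;
      std::string Digits = Input.substr(Start, Pos - Start);
      Tokens.push_back({TOK_NUMERIC, Digits, std::stoi(Digits)});
    } else {
      // Anything else (operators, parentheses) becomes a one-character token.
      Tokens.push_back({TOK_CHAR, std::string(1, Input[Pos]), 0});
      ++Pos;
    }
  }
  assert(Pos == Input.size() && "the whole input must be consumed");
  Tokens.push_back({TOK_EOF, "", 0});
  return Tokens;
}
```

Calling `tokenize("def foo(x) x + 42")` under these assumptions yields nine tokens, ending with the numeric token for 42 and an end-of-file marker. A parser would then consume this sequence one token at a time.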

Getting ready

We must have a basic understanding of the TOY language defined in the previous recipe. Create a file named toy.cpp as follows:

$ vim toy.cpp

This file will hold all the lexer, parser, and code generation logic that follows.

How to do it…

While implementing a lexer, types of tokens are defined to categorize streams of input strings (similar to the states of an automaton...

Defining Abstract Syntax Tree


An AST is a tree representation of the abstract syntactic structure of a program's source code. The ASTs of programming constructs, such as expressions and flow control statements, are grouped into operators and operands. ASTs represent the relationships between programming constructs, not the ways they are generated by the grammar. They ignore unimportant elements such as punctuation and delimiters. AST nodes generally carry additional properties that are useful in later compilation phases. Source location is one such property: it can be used to report the line number when the source code is found not to conform to the grammar (in the Clang frontend for C++, the location, line number, column number, and other related properties are stored in an object of the SourceManager class).

The AST is used intensively during semantic analysis, where...
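
As an illustration of operators and operands grouped into a tree, the following is a minimal sketch of an AST class hierarchy. The names `BaseAST` and `NumericAST` appear in later recipes; `VariableAST`, `BinaryAST`, the `dump` method, and the use of `std::unique_ptr` (the recipes use raw pointers) are assumptions made for this example only.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Common base class so that any node can be held through a BaseAST pointer.
class BaseAST {
public:
  virtual ~BaseAST() = default;
  // Hypothetical helper: renders the subtree as a fully parenthesized string.
  virtual std::string dump() const = 0;
};

// Leaf node for a numeric constant.
class NumericAST : public BaseAST {
  int NumericVal;

public:
  explicit NumericAST(int Val) : NumericVal(Val) {}
  std::string dump() const override { return std::to_string(NumericVal); }
};

// Leaf node for a variable reference (assumed name).
class VariableAST : public BaseAST {
  std::string Name;

public:
  explicit VariableAST(std::string N) : Name(std::move(N)) {}
  std::string dump() const override { return Name; }
};

// Interior node: one operator with its two operand subtrees (assumed name).
class BinaryAST : public BaseAST {
  char Op;
  std::unique_ptr<BaseAST> LHS, RHS;

public:
  BinaryAST(char O, std::unique_ptr<BaseAST> L, std::unique_ptr<BaseAST> R)
      : Op(O), LHS(std::move(L)), RHS(std::move(R)) {
    assert(LHS && RHS && "a binary node needs both operands");
  }
  std::string dump() const override {
    return "(" + LHS->dump() + " " + Op + " " + RHS->dump() + ")";
  }
};
```

The tree stores only the relationship between the operator and its operands. Parentheses and whitespace from the source text are not kept as nodes; `dump` regenerates them from the tree shape.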

Implementing a parser


The parser analyzes code syntactically according to the rules of the language's grammar. The parsing phase determines whether the input code can be used to form a string of tokens according to the defined grammar. A parse tree is constructed in this phase. The parser defines functions that organize the language into a data structure called an AST. The parser defined in this recipe uses the recursive descent technique, a top-down approach in which mutually recursive functions build the AST.
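
The idea of mutually recursive parsing functions can be sketched on a much smaller grammar than TOY. The grammar, the `MiniParser` class, and the `evaluateOrDie` helper below are hypothetical, and for brevity the sketch computes an integer value instead of building an AST.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <utility>

// Hypothetical grammar used only for this sketch:
//   expression := primary ('+' primary)*
//   primary    := digit+ | '(' expression ')'
class MiniParser {
  std::string Input;
  std::size_t Pos = 0;

  bool atDigit() const {
    return Pos < Input.size() &&
           std::isdigit(static_cast<unsigned char>(Input[Pos]));
  }

  void skipSpaces() {
    while (Pos < Input.size() &&
           std::isspace(static_cast<unsigned char>(Input[Pos])))
      ++Pos;
  }

  // primary calls back into expression for a parenthesized sub-expression.
  bool parsePrimary(int &Value) {
    skipSpaces();
    if (Pos < Input.size() && Input[Pos] == '(') {
      ++Pos;
      if (!parseExpression(Value))
        return false;
      skipSpaces();
      if (Pos >= Input.size() || Input[Pos] != ')')
        return false; // missing closing parenthesis
      ++Pos;
      return true;
    }
    if (!atDigit())
      return false; // neither a number nor a parenthesized expression
    Value = 0;
    while (atDigit())
      Value = Value * 10 + (Input[Pos++] - '0');
    return true;
  }

  // expression calls primary for each operand: the two functions are
  // mutually recursive, mirroring the two grammar rules.
  bool parseExpression(int &Value) {
    if (!parsePrimary(Value))
      return false;
    skipSpaces();
    while (Pos < Input.size() && Input[Pos] == '+') {
      ++Pos;
      int RHS = 0;
      if (!parsePrimary(RHS))
        return false;
      Value += RHS;
      skipSpaces();
    }
    return true;
  }

public:
  explicit MiniParser(std::string Text) : Input(std::move(Text)) {}

  // Returns true only if the whole input matches the grammar.
  bool parse(int &Value) {
    Pos = 0;
    bool Ok = parseExpression(Value);
    skipSpaces();
    return Ok && Pos == Input.size();
  }
};

// Convenience wrapper that asserts the text is syntactically valid.
int evaluateOrDie(const std::string &Text) {
  int Value = 0;
  bool Ok = MiniParser(Text).parse(Value);
  assert(Ok && "input does not match the grammar");
  (void)Ok;
  return Value;
}
```

Each grammar rule maps to one function, which is what makes the technique top-down: parsing starts at the outermost rule and descends into the rules it references. Inputs such as `"1 + "` or `"(1 + 2"` are rejected because a required primary or closing parenthesis is missing.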

Getting ready

We must have the custom-defined language, that is, the TOY language in this case, and also a stream of tokens generated by the lexer.

How to do it…

Define some basic value holders in our TOY parser as shown in the following:

  1. Open the toy.cpp file as follows:

    $ vi toy.cpp
  2. Define a global static variable to hold the current token from the lexer as follows:

    static int Current_token;
  3. Define a function to get the next token from the input stream from the lexer as follows:

    static void next_token...

Parsing simple expressions


In this recipe, you will learn how to parse a simple expression. A simple expression may consist of numeric values, identifiers, function calls, function declarations, and function definitions. Each type of expression needs its own parser logic.

Getting ready

We must have the custom-defined language, that is, the TOY language in this case, and also the stream of tokens generated by the lexer. We have already defined the ASTs above. Next, we parse the expression and invoke the AST constructor for each type of expression.

How to do it…

To parse simple expressions, proceed with the following code flow:

  1. Open the toy.cpp file as follows:

    $ vi toy.cpp

    We already have lexer logic present in the toy.cpp file. Whatever code follows needs to be appended after the lexer code in the toy.cpp file.

  2. Define the parser function for numeric expression as follows:

    static BaseAST *numeric_parser() {
      BaseAST *Result = new NumericAST(Numeric_Val);
      next_token();
      return...

Parsing binary expressions


In this recipe, you will learn how to parse a binary expression.

Getting ready

We must have the custom-defined language, that is, the TOY language in this case, and also the stream of tokens generated by the lexer. The binary expression parser requires the precedence of binary operators to determine the order of the LHS and RHS. An STL map can be used to define the precedence of binary operators.

How to do it…

To parse a binary expression, proceed with the following code flow:

  1. Open the toy.cpp file as follows:

    $ vi toy.cpp
  2. Declare a map for operator precedence to store the precedence at global scope in the toy.cpp file as follows:

    static std::map<char, int> Operator_Precedence;

    The TOY language used for demonstration has four operators, with precedence defined as - < + < / < *.

  3. A function to initialize the precedence, that is, to store the precedence values in the map, can be defined at global scope in the toy.cpp file as follows:

    static void init_precedence() {
      Operator_Precedence['-'...
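
The following sketch shows one common way a precedence map can decide what becomes the LHS and RHS of each operator (a technique often called precedence climbing). It is not the recipe's truncated code: the function names are hypothetical, and the numeric values 1 to 4 are assumptions chosen only to respect the stated order - < + < / < *.

```cpp
#include <cassert>
#include <map>
#include <string>

// Assumed values; only their relative order (- < + < / < *) comes from the text.
static std::map<char, int> makePrecedenceTable() {
  std::map<char, int> Table;
  Table['-'] = 1;
  Table['+'] = 2;
  Table['/'] = 3;
  Table['*'] = 4;
  return Table;
}

// Returns the precedence of Op, or -1 if Op is not a known binary operator.
static int precedenceOf(const std::map<char, int> &Table, char Op) {
  std::map<char, int>::const_iterator It = Table.find(Op);
  return It == Table.end() ? -1 : It->second;
}

// Groups single-character operands and operators (no spaces) by precedence,
// returning a fully parenthesized string that shows each LHS/RHS pairing.
static std::string group(const std::string &Expr, std::size_t &Pos,
                         int MinPrec, const std::map<char, int> &Table) {
  assert(Pos < Expr.size() && "expected an operand");
  std::string LHS(1, Expr[Pos++]);
  while (Pos < Expr.size()) {
    char Op = Expr[Pos];
    int Prec = precedenceOf(Table, Op);
    if (Prec < MinPrec)
      break; // Op binds too loosely; let the caller handle it
    ++Pos;
    // Operators that bind tighter than Op are folded into the RHS first.
    std::string RHS = group(Expr, Pos, Prec + 1, Table);
    LHS = "(" + LHS + Op + RHS + ")";
  }
  return LHS;
}

// Entry point: Expr must be non-empty and alternate operands and operators.
std::string groupByPrecedence(const std::string &Expr) {
  std::size_t Pos = 0;
  return group(Expr, Pos, 0, makePrecedenceTable());
}
```

With this ordering, `groupByPrecedence("a+b*c")` gives `(a+(b*c))`. Because + ranks above - in the TOY language, `groupByPrecedence("a-b+c")` gives `(a-(b+c))`, which differs from the usual arithmetic convention.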

Description

The book is for compiler programmers who are familiar with concepts of compilers and want to indulge in understanding, exploring, and using LLVM infrastructure in a meaningful way in their work. This book is also for programmers who are not directly involved in compiler projects but are often involved in development phases where they write thousands of lines of code. With knowledge of how compilers work, they will be able to code in an optimal way and improve performance with clean code.

What you will learn

  • Introduction to LLVM modular design and LLVM tools
  • Write a frontend for a language
  • Add JIT support and use frontends for different languages
  • Learn about the LLVM Pass infrastructure and the LLVM Pass Manager
  • Create analyses and transform optimization passes
  • Build a LLVM TOY backend from scratch
  • Optimize the code at SelectionDAG level and allocate registers to variables

Product Details

Publication date: May 30, 2015
Length: 296 pages
Edition: 1st
Language: English
ISBN-13: 9781785285981
Vendor: LLVM


Table of Contents

10 Chapters
1. LLVM Design and Use
2. Steps in Writing a Frontend
3. Extending the Frontend and Adding JIT Support
4. Preparing Optimizations
5. Implementing Optimizations
6. Target-independent Code Generator
7. Optimizing the Machine Code
8. Writing an LLVM Backend
9. Using LLVM for Various Useful Projects
Index

Customer reviews

Rating: 2 out of 5 (9 Ratings)
5 star 22.2%
4 star 0%
3 star 0%
2 star 11.1%
1 star 66.7%

A. Mcb Hill, Sep 17, 2015, 5 out of 5 (Amazon verified review)
Thorough and about an excellent topic. Unfortunately, I have not finished it yet.

David Baker, Oct 14, 2019, 5 out of 5 (Amazon verified review)
Help me get a practical handle on the LLVM internal representation from the first chapter. Showed how each pass is transformational on the IR. Made the concepts easy to grasp with actual hands on executions.

f1sdz, Sep 03, 2016, 2 out of 5 (Amazon verified review)
This book is a simple copy-and-paste of examples publicly available on the LLVM website. I do not recommend buying this book at all.

Ryan Patrick Nicholl, Dec 08, 2015, 1 out of 5 (Amazon verified review)
Basically a repeat of information you can easily find on the online LLVM docs for free.

je_2014, May 12, 2016, 1 out of 5 (Amazon verified review)
The worst tech/programming book I've ever read. It is a disorganized attempt to copy and paste text from llvm.org. I've also bought "LLVM Essentials" by the same two authors, I've not had a chance to look at this book thoroughly but it looks like another similar compilation of llvm.org text snippets. I've focused mainly on the backend chapters and tried to create the example "toy" backend. It is thoroughly incomplete and all of the "recipes" that I've tried do not even compile. Additionally, I've downloaded the example code from the publishers website and nothing compiled using LLVM 3.8. They don't even mention how to integrate/register the target with Clang or do a complete job with LLVM. I contacted the publisher about this and their response was that the code was developed for an older version of LLVM. Fair enough, but I've found several syntactical errors in the downloaded code recipes that even show in the published book. This means that the authors didn't even bother to build/compile their own examples! Absolutely ridiculous!