Hands-On SAS for Data Analysis: A practical guide to performing effective queries, data visualization, and reporting techniques

Product type Paperback

Published in Sep 2019

Publisher Packt

ISBN-13 9781788839822

Length 346 pages

Edition 1st Edition

Languages

SQL

Tools

SAS

Concepts

Data Analysis

Author (1):

Harish Gulati

Table of Contents (17) Chapters

Preface

1. Section 1: SAS Basics

2. Introduction to SAS Programming

3. Data Manipulation and Transformation

4. Section 2: Merging, Optimizing, and Descriptive Statistics

5. Combining, Indexing, Encryption, and Compression Techniques Simplified

6. Power of Statistics, Reporting, Transforming Procedures, and Functions

7. Section 3: Advanced Programming

8. Advanced Programming Techniques - SAS Macros

9. Powerful Functions, Options, and Automatic Variables Simplified

10. Section 4: SQL in SAS

11. Advanced Programming Techniques Using PROC SQL

12. Deep Dive into PROC SQL

13. Section 5: Data Visualization and Reporting

14. Data Visualization

15. Reporting and Output Delivery System

16. Other Books You May Enjoy

Identifying duplicates using Proc SQL

The simplest way to remove duplicates in Proc SQL is to use the Distinct keyword in the Select clause. We will use it on the Dealership_Looped dataset, from which the i column, used as a looping counter, has been dropped:

Proc Sql;
  Create Table Distinct_Dealership_Looped As
    Select Distinct *
      From Dealership_Looped
  ;
Quit;
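Before relying on the de-duplicated table, it can help to see how many rows are actually duplicated. The following is a minimal sketch, assuming the same Dealership_Looped dataset as above; the column aliases Total_Rows and Distinct_Rows are illustrative names, not taken from the book:

```sas
Proc Sql;
  /* Total rows, duplicates included */
  Select Count(*) As Total_Rows
    From Dealership_Looped
  ;
  /* Rows that remain once exact duplicates are collapsed
     (Distinct * compares every column of the row) */
  Select Count(*) As Distinct_Rows
    From (Select Distinct * From Dealership_Looped)
  ;
Quit;
```

The difference between the two counts is the number of duplicate rows that Select Distinct * would drop.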

Using the Distinct keyword, we have removed the duplicates that were created by the DO loop. We are now left with the original 36 records. This can be confirmed in the following log:

NOTE: Table WORK.DISTINCT_DEALERSHIP_LOOPED created, with 36 rows and 6 columns.
 
NOTE: PROCEDURE SQL used (Total process time):
       real time 1:56.01
       cpu time 1:05.78
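The row count can also be checked in code rather than read from the log. This is a sketch only: the macro variable name Distinct_Rows is a hypothetical choice, and the Trimmed option assumes SAS 9.3 or later:

```sas
Proc Sql NoPrint;
  /* Store the row count of the de-duplicated table in a macro variable */
  Select Count(*) Into :Distinct_Rows Trimmed
    From Distinct_Dealership_Looped
  ;
Quit;

/* Write the count to the log so it can be compared with the NOTE above */
%Put NOTE: Distinct_Dealership_Looped has &Distinct_Rows rows.;
```

If the de-duplication worked as described, the value written to the log should agree with the 36 rows reported in the NOTE.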

Let's find out how we would have fared in terms of runtime if we had used PROC SORT. After all, PROC SORT is the most popular...
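For orientation, one common way to remove exact duplicate rows with PROC SORT is sketched below. It is an assumption about what such a comparison could look like, not necessarily the form used in the chapter; the output name Sorted_Dealership_Looped is hypothetical:

```sas
/* Assumption: a row counts as a duplicate only if every column matches,
   which mirrors the behavior of Select Distinct * */
Proc Sort Data=Dealership_Looped
          Out=Sorted_Dealership_Looped  /* hypothetical output name */
          NoDupKey;
  By _All_;  /* sort on, and compare, all variables */
Run;
```

The log reports real time and cpu time for PROCEDURE SORT in the same format as for PROCEDURE SQL, so the two runs can be compared directly.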

Authors (1)

Gulati

Harish Gulati is a consultant, analyst, modeler, and trainer based in London. He has 16 years of financial, consulting, and project management experience across leading banks, management consultancies, and media hubs. He enjoys demystifying his complex line of work in his spare time. This has led him to be an author and orator at analytical forums. His published books include SAS for Finance by Packt and Role of a Data Analyst, published by the British Chartered Institute of IT (BCS). He has an MBA in brand communications and a degree in psychology.