Hands-On SAS for Data Analysis

A practical guide to performing effective queries, data visualization, and reporting techniques

Product type Paperback
Published in Sep 2019
Publisher Packt
ISBN-13 9781788839822
Length 346 pages
Edition 1st Edition
Author (1):
Harish Gulati

Table of Contents (17 chapters)

Preface
1. Section 1: SAS Basics
2. Introduction to SAS Programming
3. Data Manipulation and Transformation
4. Section 2: Merging, Optimizing, and Descriptive Statistics
5. Combining, Indexing, Encryption, and Compression Techniques Simplified
6. Power of Statistics, Reporting, Transforming Procedures, and Functions
7. Section 3: Advanced Programming
8. Advanced Programming Techniques - SAS Macros
9. Powerful Functions, Options, and Automatic Variables Simplified
10. Section 4: SQL in SAS
11. Advanced Programming Techniques Using PROC SQL
12. Deep Dive into PROC SQL
13. Section 5: Data Visualization and Reporting
14. Data Visualization
15. Reporting and Output Delivery System
16. Other Books You May Enjoy

Identifying duplicates using Proc SQL

The simplest way to remove duplicates in Proc SQL is to use the Distinct keyword in the Select clause. We will use it on the Dealership_Looped dataset, from which the i column, used as a looping counter, has been dropped:

Proc Sql;
  /* Select Distinct * compares every column and keeps one copy of each row */
  Create Table Distinct_Dealership_Looped As
  Select Distinct *
  From Dealership_Looped
  ;
Quit;

Using the Distinct keyword, we have removed the duplicates that we created with the DO loops. We are now left with the original 36 records. This can be confirmed by looking at the following log:

NOTE: Table WORK.DISTINCT_DEALERSHIP_LOOPED created, with 36 rows and 6 columns.

NOTE: PROCEDURE SQL used (Total process time):
real time 1:56.01
cpu time 1:05.78
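
Besides reading the log, the effect of Select Distinct can be checked by comparing row counts before and after. The following is a minimal sketch that uses only the two tables named above; the column aliases Total_Rows and Distinct_Rows are illustrative names, not taken from the book.

```sas
Proc Sql;
  /* Rows in the looped table, duplicates included */
  Select Count(*) As Total_Rows
  From Dealership_Looped;

  /* Rows that remain after Select Distinct */
  Select Count(*) As Distinct_Rows
  From Distinct_Dealership_Looped;
Quit;
```

The second count should match the 36 rows reported in the log, and the difference between the two counts is the number of duplicate rows that were removed.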

Let's find out how we would have fared in terms of runtime if we had used PROC SORT. After all, PROC SORT is the most popular...
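
For orientation, one common way to remove fully duplicated rows with PROC SORT is the NODUPKEY option combined with BY _ALL_. This is a sketch under assumptions, not necessarily the approach the book takes: the output dataset name Sorted_Dealership_Looped is hypothetical, and no runtime figures are implied.

```sas
/* Sketch only: Sorted_Dealership_Looped is a hypothetical output name */
Proc Sort Data=Dealership_Looped
          Out=Sorted_Dealership_Looped
          Nodupkey;
  By _All_;  /* treat every column as part of the key, so only exact duplicate rows are dropped */
Run;
```

Unlike Select Distinct, this step also writes a log note that states how many observations with duplicate key values were deleted, which makes the duplicate count visible without a separate query.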
