Data Analysis with STATA

Explore the big data field and learn how to perform data analytics and predictive modelling in STATA

Product type Paperback
Published in Oct 2015
Publisher Packt
ISBN-13 9781782173175
Length 176 pages
Edition 1st Edition

Table of Contents

Preface
1. Introduction to Stata and Data Analytics
2. Stata Programming and Data Management
3. Data Visualization
4. Important Statistical Tests in Stata
5. Linear Regression in Stata
6. Logistic Regression in Stata
7. Survey Analysis in Stata
8. Time Series Analysis in Stata
9. Survival Analysis in Stata
Index

Introducing data analytics

We analyze data every day for various reasons. Predicting an event or forecasting key indicators, such as the revenue of a given organization, is fast becoming a major requirement in the industry. Various techniques and tools can be leveraged to analyze data. Here are the techniques that this book covers, using Stata as the tool:

  • Stata programming and data management: Before predicting anything, we need to manage and massage the data until it is clean enough to yield insights. The programming aspect helps in creating new variables and transforming the data so that finding patterns in historical data or predicting the outcome of a given event becomes much easier.
  • Data visualization: After the data preparation, we need to visualize the data for the following reasons:
    • To view what patterns in the data look like
    • To check whether there are any outliers in the data
    • To understand the data better
    • To draw preliminary insights from the data
  • Important statistical tests in Stata: After visualizing the data, we can form hypotheses about it based on what we observe. We then need to test these hypotheses on the datasets to check whether the results are statistically significant and whether we can rely on them in future situations.
  • Linear regression in Stata: Once the hypothesis testing is done, there is often a business need to predict one of the variables, such as what the revenue of a financial organization will be under specific conditions. Predictions of continuous variables, such as revenue, the default amount on a credit card, or the number of items sold in a given store, come through linear regression. Linear regression is the most basic and one of the most widely used prediction methods. We will go into the details of linear regression in a later chapter.
  • Logistic regression in Stata: When you need to predict the outcome of a particular event along with its probability, logistic regression is a standard and widely used method. Predicting which team will win a football or cricket match, or whether a customer will default on a loan payment, can be done using the probabilities given by logistic regression.
  • Survey analysis in Stata: Understanding customer sentiment and consumer experience is one of the biggest requirements of the retail industry. The research industry also needs data about people's opinions in order to assess the effect of a certain event or the sentiments of the affected people. All of this can be achieved by conducting surveys and analyzing the resulting datasets. Survey analysis can involve various subtechniques, such as factor analysis, principal component analysis, panel data analysis, and so on.
  • Time series analysis in Stata: When you try to forecast a time-dependent variable with reasonably regular cyclic or seasonal behavior, time series analysis comes in handy. There are many time series techniques, but we will talk about a couple of them: Autoregressive Integrated Moving Average (ARIMA) models and the Box-Jenkins method. Forecasting the amount of rainfall based on the amount of rainfall in the past 5 years is a classic time series analysis problem.
  • Survival analysis in Stata: Customers frequently attrite from telecom plans, healthcare plans, and so on, and join competitors. When you need to develop a churn or attrition model to check who is likely to attrite, survival analysis is a well-suited approach.
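
As a rough illustration of how the techniques listed above map onto Stata commands, the following sketch uses example datasets that ship with Stata (auto, uslifeexp, and cancer). The choice of datasets and variables, the derived variable price_k, and the ARIMA order are assumptions made for illustration and are not taken from the book's own examples. Survey analysis is not shown because it needs a dataset with sampling weights and design variables.

```stata
* --- Data management: load a bundled dataset and derive a new variable ---
sysuse auto, clear
generate price_k = price / 1000            // hypothetical helper variable
label variable price_k "Price (thousands of USD)"

* --- Data visualization: look for patterns and outliers ---
scatter price_k mpg
graph box price_k, over(foreign)

* --- Statistical test: does mean mileage differ by car origin? ---
ttest mpg, by(foreign)

* --- Linear regression: predict a continuous outcome ---
regress price_k mpg weight

* --- Logistic regression: predict a binary outcome and its probability ---
logit foreign mpg weight
predict p_foreign, pr                      // predicted probability of foreign == 1

* --- Time series: declare the time variable, then fit an ARIMA model ---
sysuse uslifeexp, clear
tsset year
tsline le
arima le, arima(1,1,0)                     // order chosen only for illustration

* --- Survival analysis: declare survival-time data, then model it ---
sysuse cancer, clear
stset studytime, failure(died)
sts graph, by(drug)                        // Kaplan-Meier curves per treatment
stcox age i.drug                           // Cox proportional hazards model
```

Each block is independent, so any one of them can be run on its own in a do-file or typed interactively to see the kind of output the corresponding chapter deals with.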