Augmented Analytics Algorithms and Techniques: Learning for Citizen Data Scientists

This article summarizes our recent article series on the definition, meaning and use of the various algorithms and analytical methods and techniques used in predictive analytics for business users, and in augmented data preparation and augmented data discovery tools.


The article series is designed to help business users better understand the analytical techniques so that the average user can feel more confident in adopting, embracing and sharing these tools.

This 24-article series includes:

Naïve Bayes Classification: What is Naïve Bayes Classification and How is it Used for Enterprise Analysis?

Use Case(s): Weather Forecasting, Fraud Analysis and more.

Frequent Pattern Mining (Association): What is Frequent Pattern Mining (Association) and How Does it Support Business Analysis?

Use Case(s): Market Basket Analysis, Frequently Bundled Products and more.

KNN Classification: What is KNN Classification and How Can This Analysis Help an Enterprise?

Use Case(s): Predicting Loan Default, Predicting Success of Medical Treatment and more.

Multiple Linear Regression: What is Multiple Linear Regression and How Can it be Helpful for Business Analysis?

Use Case(s): Impact of product pricing and promotion on sales, impact of rainfall and humidity on crop yield and more.

Independent Samples T Test: What is the Independent Samples T Test Method of Analysis and How Can it Benefit an Organization?

Use Case(s): Are men more satisfied with their jobs than women? Does customer group A spend more on products than customer group B? And more.

Simple Random Sampling and Stratified Random Sampling: What Are Simple Random Sampling and Stratified Random Sampling Analytical Techniques?

Use Case(s): Estimate the average value of all cars in the U.S. based on a sample, sample by age, gender, religion, race, educational attainment, socioeconomic status and nationality, and more.

Spearman’s Rank Correlation: What is Spearman’s Rank Correlation and How is it Useful for Business Analysis?

Use Case(s): Cluster survey respondents into groups based on rank correlation, assess student ratings given by department chairs and by faculty members and more.

Binary Logistic Regression Classification: What is Binary Logistic Regression Classification and How is it Used in Analysis?

Use Case(s): Predict whether a loan will default based on attributes of the applicant; predict the likelihood of successful treatment of a new patient based on patient attributes and more.

Paired Sample T Test: What is the Paired Sample T Test and How is it Beneficial to Business Analysis?

Use Case(s): A manufacturing unit manager analyzes the statistical significance of the difference in cycle time before and after a process change, determine whether sales increased following a particular campaign and more.

Simple Linear Regression: What is Simple Linear Regression and How Can an Enterprise Use this Technique to Analyze Data?

Use Case(s): Measure the impact of product price on product sales, measure the impact of temperature on crop yield and more.

ARIMAX Forecasting: What is ARIMAX Forecasting and How is it Used for Enterprise Analysis?

Use Case(s): Forecast product line growth from the past 30 years of data, using yearly consumer inflation rate and yearly GDP data as additional inputs, forecast target variables for user-specified time periods to clearly illustrate results for planning, production, sales and other factors and more.

Karl Pearson Correlation Analysis: What is Karl Pearson Correlation Analysis and How Can it be Used for Enterprise Analysis Needs?

Use Case(s): Correlation between income and credit card delinquency rate, identify negative, positive and neutral correlations between the age of a consumer and the color of shirt they might purchase and more.

Hierarchical Clustering: What is Hierarchical Clustering and How Can an Organization Use it to Analyze Data?

Use Case(s): Group loan applicants into high/medium/low risk based on attributes such as loan amount, installments, or employment tenure, organize customers into groups/segments based on similar traits, product preferences and expectations and more.

SVM Classification Analysis: What is SVM Classification Analysis and How Can It Benefit Business Analytics?

Use Case(s): Predict treatment success based on attributes of a patient, improve weather forecasting results and more.

Outlier Analysis: What is Outlier Analysis and How Can It Improve Analysis?

Use Case(s): Outliers are sometimes discounted; in other cases, they indicate that the organization should focus solely on those outliers. Identify when a person recovered from a particular disease even though most other patients did not survive, and more.

Decision Tree Analysis: What is the Decision Tree Analysis and How Does it Help a Business to Analyze Data?

Use Case(s): Classify customers into those that will default and those that will not, and assess the characteristics of customers that are likely to default; predict the future purchases of customers based on customer attributes and past online shopping behavior, and more.

Chi Square Test of Association: What is the Chi Square Test of Association and How Can it be Used for Analysis?

Use Case(s): Determine if a product sells better in certain locations, verify if gender has an influence on purchasing decisions, identify if demographic factors influence banking channel/product/service preference or selection of a type of term insurance plan and more.

FP Growth Analysis: What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining to Analyze Data?

Use Case(s): Select items in a business catalog to complement each other so that buying one item will lead to buying another, analyze the association of purchased items in a single basket or single purchase and more.

ARIMA Forecasting: What is ARIMA Forecasting and How Can it Be Used for Enterprise Analysis?

Use Case(s): Predict sales of a drug for the next 2 months based on drug sales from the past 12 months. ARIMA is suitable for forecasting with stationary or non-stationary data and can produce dependable forecasts when planning for short-term business results, and more.

Multinomial-Logistic Regression Classification: What is the Multinomial-Logistic Regression Classification Algorithm and How Does One Use it for Analysis?

Use Case(s): Based on the attributes of a respondent (e.g., demographics, marital status, gender, income, age, qualification), check the likely level of satisfaction with life/job/product/services; given a list of symptoms, predict if a patient is likely to be diagnosed with the initial/intermediate/serious stage of a particular disease, and more.

KMeans Clustering Algorithm: What is the KMeans Clustering Algorithm and How Does an Enterprise Use it to Analyze Data?

Use Case(s): Group loan applicants as low, medium and high risk based on applicant age, annual income and employment tenure; a movie ticket booking website can group users into frequent, moderate and occasional ticket buyers based on past movie ticket purchases, and more.

Descriptive Statistics: What is Descriptive Statistics and How Do You Choose the Right One for Enterprise Analysis?

Use Case(s): Average age and income for a particular type of product category purchased, identify the most popular dish served in a restaurant, find the most frequent rating given by customers for a given movie/restaurant or the most frequent size or category of a sold product and more.

Holt-Winters Forecasting: What is the Holt-Winters Forecasting Algorithm and How Can it be Used for Enterprise Analysis?

Use Case(s): Forecast the number of viewers by day for a particular game show for the next two months, using daily viewer count data from the last six months as input; an insurance claim manager can forecast policy sales for next month based on data from the past 12 months, and more.

Trends and Patterns: What Are Data Trends and Patterns, and How Do They Impact Business Decisions?

Use Case(s): Identify a seasonal pattern, where fluctuations repeat over fixed periods of time and patterns do not extend beyond one year; analyze a stationary time series, whose statistical properties such as variance are constant over time; or identify a cyclical pattern, where fluctuations do not repeat over fixed periods of time, are unpredictable and extend beyond a year, and more.
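To make one of the use cases above more concrete, here is a minimal Python sketch of Simple Linear Regression and Karl Pearson Correlation applied to product price and product sales. The price and sales figures are invented for illustration, and the function names (fit_line, pearson_r) are hypothetical; this is a sketch of the underlying arithmetic, not the implementation used by any particular analytics tool.

```python
from math import sqrt
from statistics import mean

# Hypothetical data: unit price and units sold across five pricing periods.
prices = [10, 12, 14, 16, 18]
units_sold = [200, 180, 170, 150, 130]


def fit_line(x, y):
    """Least-squares fit; returns (slope, intercept) for y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


slope, intercept = fit_line(prices, units_sold)
r = pearson_r(prices, units_sold)
print(f"Estimated change in units sold per 1-unit price increase: {slope:.1f}")
print(f"Pearson correlation between price and sales: {r:.3f}")
```

With this invented data the fitted slope is negative, which suggests that higher prices go with lower sales, and the correlation coefficient indicates how strong that linear relationship is.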

Each of these techniques, methods and algorithms has a unique value in advanced analytics. Augmented Data Discovery tools allow business users to gather and analyze data using these techniques within a sophisticated, intuitive navigation that is designed to guide users through the process of selecting the appropriate algorithm or analytical technique based on the type of data selected.
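As a rough illustration of what selecting a technique based on the type of data can mean, the following Python sketch maps a simple description of the data to a family of techniques from the series. The function name suggest_technique, its parameters and its rules are all assumptions made for this example; real augmented analytics tools are likely to use far richer logic, and this is not a description of how any specific product works.

```python
def suggest_technique(target_type=None, has_time_index=False):
    """Toy rule-based suggestion (hypothetical rules, for illustration only).

    target_type: 'numeric', 'binary', 'multiclass', or None when the data
                 has no column to predict.
    has_time_index: True when observations are ordered in time.
    """
    if target_type == "numeric" and has_time_index:
        return "forecasting (ARIMA, ARIMAX, Holt-Winters)"
    if target_type == "numeric":
        return "regression (Simple or Multiple Linear Regression)"
    if target_type == "binary":
        return "classification (Binary Logistic Regression, Naive Bayes, KNN, SVM, Decision Tree)"
    if target_type == "multiclass":
        return "classification (Multinomial-Logistic Regression, Decision Tree)"
    if target_type is None:
        return "clustering (KMeans, Hierarchical Clustering)"
    raise ValueError(f"Unknown target type: {target_type!r}")


# Example: monthly sales figures ordered in time.
print(suggest_technique("numeric", has_time_index=True))
# Example: customer attributes with no column to predict.
print(suggest_technique())
```

The point of the sketch is only that the kind of target column, and whether the data is ordered in time, narrows the choice of technique considerably before any modeling begins.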

This article series will help business users understand the concepts and the benefits of each technique, as well as the logic behind the application of these techniques, and the value-added auto-recommendations and suggestions provided by comprehensive augmented analytics tools.

You can find more educational resources by browsing our Augmented Analytics Learning and Augmented Analytics Videos pages.

About Smarten

The Smarten approach to business intelligence and business analytics focuses on the business user and provides Advanced Data Discovery so users can perform early prototyping and test hypotheses without the skills of a data scientist. Smarten Augmented Analytics tools include plug n’ play predictive analytics, assisted predictive modeling, smart data visualization, self-serve data preparation and clickless analytics for search analytics with natural language processing (NLP). All of these tools are designed for business users with average skills and require no special skills or knowledge of statistical analysis or support from IT or data scientists.

The Smarten approach to data discovery is powered by ElegantJ BI Business Intelligence Solutions, a representative vendor in multiple Gartner reports including the Gartner Modern BI and Analytics Platform report and the Gartner Magic Quadrant for Business Intelligence and Analytics Platforms Report.
