This article briefly explains what outliers are and how outlier analysis is used.
What is Outlier Analysis?
An outlier is an element of a data set that distinctly stands out from the rest of the data. In other words, outliers are data points that lie outside the overall pattern of the distribution, as shown in the figure below.
The easiest way to detect outliers is to create a graph. Plots such as box plots, scatterplots and histograms can help to reveal them. Alternatively, we can use the mean and standard deviation to list the outliers, typically by flagging values that lie more than a chosen number of standard deviations from the mean. Quartiles and the interquartile range (IQR) can also be used, commonly by flagging values that fall more than 1.5 times the IQR below the first quartile or above the third quartile.
Here is another illustration of an outlier. If you look at the histogram below, you will see that one value lies far to the left of all other data. This data point is an outlier.
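The two statistical rules mentioned above can be sketched in a few lines of Python. This is only a minimal illustration, not the method used by any particular product: the function names, the sample data, and the cutoffs (3 standard deviations, 1.5 times the IQR) are assumptions chosen for the example, and the right thresholds depend on the dataset.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return values lying more than `threshold` sample standard
    deviations from the mean (threshold is an assumed default)."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    if stdev == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]


def iqr_outliers(values, k=1.5):
    """Return values below Q1 - k*IQR or above Q3 + k*IQR
    (k = 1.5 is a common convention, assumed here)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical sample: one value lies far to the left of the rest
data = [52, 55, 51, 54, 53, 56, 50, 55, 54, 52, 53, 5]

print(zscore_outliers(data))  # [5]
print(iqr_outliers(data))     # [5]
```

Note that an extreme value inflates the mean and standard deviation it is measured against, so in small samples the standard deviation rule can miss outliers that the IQR rule catches. Comparing the output with a box plot or histogram is a useful sanity check.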
How Can Outlier Detection Improve Business Analysis?
Outlier data points can represent either a) items so far outside the norm that they need not be considered, or b) a unique category or variable that is worth exploring, either to capitalize on a niche or to find an area where an organization can offer a unique focus.
When considering the use of outlier analysis, a business should first think about why it wants to find the outliers and what it will do with that data. That focus will help the business select the right method of analysis, graphing or plotting to reveal the results it needs to see and understand.
It is also important to recognize that the right response depends on the dataset. In some cases, the results will indicate that outliers should be discounted. In other cases, they will indicate that the organization should focus solely on those outliers. For example, if an outlier indicates a risk or a mistake, that outlier should be identified and the risk or mistake should be addressed. If an outlier indicates an exceptional result, such as a person who recovered from a particular disease even though most other patients did not survive, the organization will want to perform further analysis on the outlier to identify the unique aspects that may be responsible for the patient's recovery.
When a business uses outlier analysis, it should test the results and examine the overall dataset and environment, because the presence of outliers may indicate that the dataset is more complex than anticipated and requires a different form of analysis.
The Smarten approach to business intelligence and business analytics focuses on the business user and provides Advanced Data Discovery so users can perform early prototyping and test hypotheses without the skills of a data scientist. Smarten Augmented Analytics tools include assisted predictive modeling, smart data visualization, self-serve data preparation and clickless analytics with natural language processing (NLP) for search analytics. All of these tools are designed for business users with average skills and require no knowledge of statistical analysis or support from IT or data scientists.
The Smarten approach to data discovery is powered by ElegantJ BI Business Intelligence Solutions, which has been named in multiple Gartner reports: as a Representative Vendor in the Gartner Research Market Guide to Self-Service Data Preparation; as a Niche BI and Analytics Vendor in the Gartner Report, Competitive Landscape in the BI Platforms and Analytics Software, Asia/Pacific; as a Representative Vendor in the Gartner Market Guide for Enterprise-Reporting-Based Platforms; and as a Listed Vendor in the Other Vendors to Consider for Modern BI and Analytics, Gartner Report.
Original Post: What is Outlier Analysis and How Can It Improve Analysis?