Maximizing the Return on OLAP and Data Mining Analysts
In this way you can explore a cube created from an OLAP source, with both a dimension and a cube that are optimized for data mining relationships. Both OLAP and data mining are important analytical technologies in the business intelligence family. OLAP is good at aggregating large amounts of transaction data. Online analytical processing (OLAP) and data mining are often treated as if they meant the same thing. Here we attempt to single out what each term means.
Data summarization and presentation to display trends among the data elements.
Data analysis tasks and related outcomes
The classification system indicates whether each task is more closely aligned with data mining or OLAP, and presents the outcome for each task, expressed as a ratio to indicate whether the outcome is primarily to provide insight about a business problem or opportunity, or to perform the analysis efficiently.
The classification system aligns the use of the data mining and OLAP analysts with the needs of a manager making a request, and helps managers and analysts work together more effectively.
The outcome from a data analysis is defined as a trade-off between a new insight and analysis efficiency. Applying the classification system, the assignment of a data exploration task means that the manager and data mining analyst understand that the primary outcome should be insight.
However, if the manager assigns an in-depth explanation task, the analyst should be more sensitive to performing the task efficiently to meet a deadline.
If it is a basic explanation task, the manager should be able to name the variables to analyze and suggest the potential relationships among them. The analyst is expected to carry out the analyses efficiently. With visualization tasks, the manager identifies specific variables for display.
For the analyst, this is a simple, straightforward task that can be completed quickly. Table 2 provides banking examples for the general tasks and expected outcomes.
OLAP and Data Mining
The manager gives the analyst sufficient time to explore data, as the goal is new insight.
In-Depth Explanation: Analysis of multiple variables not predefined by business logic that may have a significant statistical relationship with loan-default behavior. The decision makers should give the analyst ample time to complete the analysis; however, a deadline is likely as decision makers are hoping for new insight to support a strategic business decision.
Basic Explanation: Grounded in business logic that age and education influence loan-default behavior, this analysis involves a statistical test of significance between age, education, and loan default. Efficiency is key here and decision makers only expect new insight to be generated by confirming, or invalidating, the logic they present.
Visualization: Determine the percentage of positive responses, negative responses, and non-responses to a particular campaign and display the results in a pie chart. Fast turnaround is required as the decision makers are seeking ways to present what they know, not generate new insights.
Such a plan helps with assigning tasks, developing analyst skills, and motivating and retaining analysts.
One way to develop a plan is to use a Data Mining Maturity Model, which describes job progression paths for both data mining and OLAP analysts based on their competence in three areas: technical skills, analytical skills, and domain knowledge. The competency scale ranges from 1 (limited experience) to 4 (very experienced). Depending on their technical and analytical skills, domain knowledge, interests, and experience, analysts may start by primarily performing visualization, basic explanation, in-depth explanation, or data exploration tasks.
Most typically, they start by working on basic data-analysis tasks such as data visualization. Over time, as their skills, knowledge, and experience grow, they move to more advanced tasks. Some analysts may work on specific tasks on a regular or full-time basis, such as designing campaigns in marketing.
Understanding the Relationship Between OLAP and Data Mining
They become highly proficient in performing specific analyses using data and software that are appropriate for the task. They may be placed in the organizational unit where the specialized work resides. Figure 1 shows the career progression plan a company developed using the data mining maturity model. Analysts can be initially placed anywhere in the model if they have the requisite technical and analytical skills and domain knowledge, but most analysts begin lower in the model and move up as they become more skilled and experienced.
Following are several specific career progression paths that were defined.
A sample career progression plan for research analysts using the data mining maturity model
Career Path 1: OLAP analysts examine hypotheses based on predefined business logic.
Over time, OLAP analysts are expected to improve their technical, analytical, and domain competence.
How should I segment the customer base?
While most OLAP techniques come from the database family, data mining techniques come from three academic fields: statistics, machine learning, and databases. One of the fundamental processes of data mining is to analyze correlations among attributes and their values.
Statisticians have been working on this issue for centuries. There are many profound statistical theories that we can still apply today. Machine learning has introduced many new concepts for information discovery, and some of these can be applied to data mining. The most common ones are decision trees and neural networks. Other algorithms, such as genetic algorithms and fuzzy logic, are also included in some data mining packages.
For example, many popular association algorithms for analyzing product associations of large transaction tables were proposed by database researchers. OLAP can help data mining tasks with the data transformation step thanks to its data aggregation engine.
In many cases, patterns can be found only in aggregated data. It is difficult to discover patterns directly from the fact table. For example, analyzing the sales of snow tires at the city level can be challenging for many data mining algorithms, because there are too many cities. There are often millions of members in a dimension and tens of millions of aggregated values in a cube.
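The aggregation step described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular OLAP engine works: the fact rows, the `roll_up` helper, and the sales figures are all hypothetical. It shows the same measure summed at the city level and at the state level, so that a mining algorithm sees a handful of states instead of one member per city.

```python
from collections import defaultdict

# Hypothetical fact-table rows: (city, state, product, units_sold).
FACT_ROWS = [
    ("Seattle", "WA", "snow tires", 120),
    ("Spokane", "WA", "snow tires", 200),
    ("Tacoma", "WA", "snow tires", 80),
    ("Portland", "OR", "snow tires", 60),
    ("Bend", "OR", "snow tires", 140),
    ("San Diego", "CA", "snow tires", 5),
]


def roll_up(rows, level_index, measure_index=3):
    """Sum the measure column, grouped by the chosen hierarchy level."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[level_index]] += row[measure_index]
    return dict(totals)


by_city = roll_up(FACT_ROWS, level_index=0)   # one member per city
by_state = roll_up(FACT_ROWS, level_index=1)  # far fewer members per state

print(len(by_city), "city members;", len(by_state), "state members")
print(by_state)
```

With real data the gap between the two levels is much larger, which is why the aggregated level is often the more practical input for pattern discovery.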
Like any relational database, a cube contains hidden patterns such as sales trends, product associations, customer segments, and so on. An OLAP cube needs data mining techniques to uncover this hidden information.
Market basket analysis about products: Market basket analysis of product associations is a frequent marketing problem.
Store managers want to know which products sell together in order to do promotional cross-selling. Store managers also like to group customers into segments using customer demographic information as well as aggregated measures, for example, monthly spending at the store. Segmentation can be done on dimensions other than the customer. For example, the marketing department of a retail chain may want to cluster its stores based on store attributes and sales.
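The cross-selling question above (which products sell together) can be sketched with simple pair counting. This is an illustrative simplification under stated assumptions, not the algorithm used by any specific data mining package: the baskets, the `pair_rules` helper, and the `min_support` threshold are hypothetical. Support is the share of baskets containing both items; confidence is the share of baskets containing the first item that also contain the second.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each basket is the set of products on one receipt.
BASKETS = [
    {"beer", "chips", "salsa"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"beer", "bread", "chips"},
    {"milk", "chips"},
]


def pair_rules(baskets, min_support=0.4):
    """Return (antecedent, consequent, support, confidence) for frequent pairs."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # pair is too rare to be worth a promotion
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    # Highest confidence first; ties broken alphabetically for stable output.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


rules = pair_rules(BASKETS)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data only the beer and chips pair clears the threshold. The two directions of the rule have different confidence, which matters when deciding which product to promote alongside which.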
Based on the customer attributes in the customer dimension and measures, it is possible to build a classification type mining model to analyze the customer information. For example, a store manager might want to know the profile of customers who are interested in applying for a golden membership card. Based on historical product sales, a store manager might like to know projected future sales amounts.
For example, what are the potential sales of all beverages in all stores in Washington state next month? Suppose that a store ordered a product — for example, a new kind of beer. The store manager wants to know which customers are most interested in buying this product. He can apply data mining techniques to discover the profile of customers who are interested in buying beer and send mailings to those people with similar profiles. The OLAP mining model and relational mining model use the same set of data mining algorithms.