Article published in DM Direct Newsletter
March 31, 2006 Issue
In 1988, an obscure mathematician-philosopher named John Allen Paulos published Innumeracy, a book that described the mathematical illiteracy so many of us share. Dr. Paulos didn't remain obscure for long. Innumeracy soon showed up on the New York Times best-seller list, where it stayed for five months. He appeared on the David Letterman show, Larry King Live, The MacNeil/Lehrer NewsHour and several other interview programs. His book has been translated into 13 languages, including Finnish, Chinese and Hebrew. Clearly, Dr. Paulos touched a nerve, not just here but around the world.
Most people don't find pure numbers an eloquent medium of communication. Nevertheless, business decisions are, more than ever before, based on data analysis and interpretation. More and more marketers are mathematically sophisticated, but even numerate practitioners will struggle to find useful patterns in a large data file. The technical challenges facing marketers fall into several different categories: data collection problems, data interpretation problems and mathematics and science problems. The most pressing and most tractable of these problems is data interpretation, the focus of this article.
What's a numerically challenged marketer to do? Hire a math coach? Go back to school and revisit statistics? Or, better yet, find a way to translate marketing data into graphic language that's easier to read and understand?
Graphic representation of data is certainly nothing new. But creating an accurate picture of a group of 50,000 customers and predicting what they're likely to do in the future is a more complex problem than simply creating a chart that tracks the price of IBM stock over the past decade. However, if such a picture of customers can be built, then it immediately leads to a visual understanding of ways to approach customers of all types and lifecycle stages, without getting outside of your numeracy comfort zone. We're going to show you how this is possible, using a technique we call behavior mapping.
Even an innumerate marketer can understand a behavior map. Just as a geographical map locates every city and town, a behavior map has a data point for every customer, and each customer is individually identifiable. Where a city appears on a geographical map depends on its latitude and longitude. Where a customer appears on a behavior map depends on the values of the metrics attached to that customer. And just as cities are often shown in different colors according to their size, the data dots associated with customers are shown in different colors according to the value of some analytic metric associated with each customer. What's different about behavior maps compared to geographic maps is that a behavior map locates a customer not in the physical space of the real world, but in a multidimensional marketing space that defines and explains the customers, analogous to how Boston is defined by being in Massachusetts, in New England and in the United States. Let's look at some insights a behavior map can offer about your customers, and how you might put one to work in your business.
The first step is to ask the questions you want your data to answer. To make this approach concrete, we'll illustrate it with a question we commonly hear - how do I spot customers with changing behaviors? The second step is to identify some possible metrics that could bear on the problem. One obvious metric for this question is revenue - how much is the customer spending? Another useful metric for this problem is the change in loyalty score, where loyalty score measures the likelihood of a customer buying in the next period.
Figure 1 is a behavior map that shows changing behaviors of a group of real customers. Each dot is an identifiable customer, and each customer has three values attached to him or her. One value is measured along the horizontal axis: the change in their spending from the previous quarter to the current quarter. The vertical axis measures the change in a customer's loyalty score. The third value, a customer's loyalty group, corresponds to the color of that customer's dot. A loyalty group is a collection of customers with similar loyalty scores. The highest ranked customers are in loyalty group 6 (dark red) and the lowest are in loyalty group 1 (darkest blue).
Figure 1: Behavior Map of Customer Velocity
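For readers who want to see how such a picture might be drawn, here is a minimal sketch in Python. It assumes a hypothetical per-customer file, customers.csv, with columns spend_change, loyalty_change and loyalty_group; the file name and column names are invented for illustration and are not the authors' tooling.

```python
# A minimal sketch of a behavior map, assuming a hypothetical customer table
# with columns: customer_id, spend_change, loyalty_change, loyalty_group.
import pandas as pd
import matplotlib.pyplot as plt

customers = pd.read_csv("customers.csv")  # one row per customer

fig, ax = plt.subplots(figsize=(8, 6))
scatter = ax.scatter(
    customers["spend_change"],      # horizontal axis: change in quarterly spending
    customers["loyalty_change"],    # vertical axis: change in loyalty score
    c=customers["loyalty_group"],   # color: loyalty group (1 = lowest, 6 = highest)
    cmap="coolwarm",                # blue for low groups, red for high groups
    s=10,
)
ax.axhline(0, linewidth=0.5)        # the two "no change" lines through (0, 0)
ax.axvline(0, linewidth=0.5)
ax.set_xlabel("Change in spending (previous quarter to current quarter)")
ax.set_ylabel("Change in loyalty score")
ax.set_title("Behavior map of customer velocity")
fig.colorbar(scatter, label="Loyalty group")
plt.show()
```

Any plotting library would do; the point is simply that each customer becomes one colored dot positioned by its two change metrics.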
Step three is to interpret the rich structure in this map. The accelerating customers are in quadrant II and the laggards are in quadrant IV. Customers in quadrant II spent more money in the recent quarter than in the previous one, and their loyalty scores increased too. Dots near (0,0), the intersection of the two "no change" lines, are the "steady eddies," customers whose behavior is not changing greatly.
Since loyalty is influenced by longevity, many customers with low scores have great potential but simply haven't been around that long. Customers in lower loyalty groups who show big changes in their loyalty score are usually new customers who are showing rapid growth. There is a significant group of such customers toward the top of quadrant II, the dark blue dots pluming out to the right. These are the "up and comers" who should become strong customers in the future. Similarly, there are many high-value customers (the orange and dark red dots) in quadrant IV whose behavior is deteriorating and who should definitely be classified as "at risk."
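To make the quadrant reading concrete, a sketch like the following could tag each customer with one of the segments described above. The loyalty-group cutoffs and the thresholds for "little change" are purely illustrative, the column names reuse the hypothetical table from the previous sketch, and the quadrant labels follow the article's Figure 1 (quadrant II = both changes positive, quadrant IV = both negative).

```python
# Sketch: turn the quadrant reading into named segments (illustrative thresholds).
import pandas as pd

customers = pd.read_csv("customers.csv")   # same hypothetical table as above

def segment(row):
    rising = row["spend_change"] > 0 and row["loyalty_change"] > 0    # quadrant II
    falling = row["spend_change"] < 0 and row["loyalty_change"] < 0   # quadrant IV
    if rising and row["loyalty_group"] <= 2:
        return "up and comer"      # low loyalty group, accelerating
    if falling and row["loyalty_group"] >= 5:
        return "at risk"           # high-value customer, deteriorating
    if abs(row["spend_change"]) < 50 and abs(row["loyalty_change"]) < 0.05:
        return "steady eddie"      # near (0, 0): behavior not changing much
    return "other"

customers["segment"] = customers.apply(segment, axis=1)
print(customers["segment"].value_counts())
```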
Step four is not strictly data interpretation; it is implementation - translating what the chart is telling us into marketing actions. Customers whose behavior is changing, whether for the better or the worse, need immediate attention. Too often, companies are quicker to focus on potential attritors than they are to recognize increased purchase velocity. Behavior maps indicate both the size and membership of these populations. That the different populations appear in different regions of the map strongly implies that different tactics are needed for each group. It's then up to the marketing department to take the appropriate actions.
In our experience, some of the most useful behavior maps are those that give insight into a customer's future behavior, for example, finding up-sell or cross-sell candidates. The way to do this is to plot customer metrics that relate to the future (predictive analytics) rather than the past. For example, instead of plotting the revenue already received from a customer, plot the revenue expected in the next year or two (customer equity). Instead of plotting when a customer last purchased, use a metric that relates to their propensity to purchase again in the next period. Generally, a future-oriented behavior map is more easily tied to a marketing campaign. Visualizations like these communicate quickly and even the numerate MBA marketer who knows statistics and devours spreadsheets for breakfast will find they're a rich source of ideas.
A century ago, the science-fiction writer H. G. Wells predicted that in modern technological societies, statistical thinking would be as necessary as the ability to read and write. We would simply add that in the past 100 years, it's become clear that we humans are genetically programmed for pattern recognition - which means that the kind of mapping described here will continue to be the quickest way to understand the knowledge that lies hidden in the records of your databases.
Mark Klein is CEO of Loyalty Builders, where he analyzes customer behavior for major corporations. He has more than thirty years of software and management experience.
Claudia wishes to thank Frank Cullen from Blackstone and Cullen Consulting for his insightful input into this column.
Most farmers are very organized. They manage their environments with great care and exacting precision. A field is tilled and earmarked for a specific crop. The seeds for that crop are planted at precise intervals along the rows of dirt; the crops are harvested and segregated so they aren't mixed up. This maximizes the level of quality and efficiency - both mandatory features for a farm to survive today.
A farmer would never even think of mixing his seeds together in a bag and simply winging them out into the freshly plowed field. Chaos would result; his very livelihood would be in jeopardy. He would end up with a stalk of corn growing next to alfalfa, which would be growing next to wheat, etc. Harvesting would become a monumental task in which different plants would have to be harvested at different times by hand! I think there is a lesson here that we can learn from these well-organized and competent people.
Think of the Corporate Information Factory (CIF) as a well-run farm. Those of you familiar with this architecture (Figure 1) are aware that it is easily split into two halves (or sets of fields, if you will) - each with its respective large processes and data stores.
Figure 1: The Corporate Information Factory
One half deals with "getting data in" and consists of the operational systems, the data warehouse and/or operational data store, and the complex process of data acquisition. Much has been written about these components, especially the extract, transform and load (ETL) part of data acquisition. The ultimate deliverable for this part of the CIF is a repository of integrated, enterprise-wide data for either strategic (data warehouse) or tactical (operational data store) decision making.
The other half of the CIF deserves some more attention. It is summarized as "getting information out" and consists of the data delivery process, the variety of marts (data and oper) available for the business community's usage and the decision support interface (DSI) or technologies that access the marts and perform the various analytics or reporting needed by the business community. The ultimate deliverable for this half of the CIF is an easily used and understood environment in which to perform analyses and make decisions. Most of the highly touted business intelligence (BI) benefits are derived from getting information out - data consistency, accessibility to critical data, improved decision making, etc.
However, is this really being achieved? I don't think so. Unfortunately, many corporations have not followed the architecture as closely as they should have. They have created an environment that is all too similar to a farmer putting all his seeds into a bag and blasting them out - willy-nilly - into his fields. Let's look at what we have created in more detail.
The construction of the data warehouse is now well documented and has eliminated much of the chaos in terms of getting consistent data from our operational systems. We are now able to clean up the data as well, improving its quality significantly. We place this data in easily accessed database technologies with the idea that data marts can be quickly built from this resource.
Unfortunately, we have not paid as much attention to the creation of marts as perhaps we should have. With the warehouse in place, it becomes very easy to create cube after cube, star schema after star schema, data set after data set, seed after seed, from this repository - but with minimal management or control over these constructs. Redundancy and inconsistency have crept into this half of the architecture, significantly threatening the promised benefits. Figure 2 shows what is happening here.
Figure 2: Chaos in Data Delivery
Many companies have more than one ETL tool used for the delivery of data into the marts. Some are used to create the data warehouse and marts; others come with the DSI tool of choice or even the data mart database. We have also used hand-coded data delivery programs. All of these now run rampant through the warehouse, extracting data and winging it out to marts at will. This by itself would not be a problem if it were a managed process. The difficulty is that many organizations do not have the discipline in place to ensure that these processes are efficient and administered. What I see more often today are the following situations:
The resulting situation is not pretty: multiple tools mean multiple skill sets are required, reusability of data delivery code may be constrained or limited, meta data becomes encapsulated within individual tools and cannot be shared across them, inconsistency of the data used is highly probable, and the overall environment is more costly to maintain and sustain.
What is needed is a new paradigm, a return to the principles of the architecture, a shift in our thinking about data delivery. It cannot be an unmanaged, uncoordinated set of processes as represented in Figure 2. We must create a consistent, documented and managed environment that starts with a request coordinator process. Figure 3 illustrates such a managed data delivery function in the CIF.
Figure 3: Data Delivery with a Request Coordinator Function
The request coordinator is like the farmer who plans his next season carefully, determining what seeds will be planted, which fields will have what crops, where efficiency of scale, market value and time to market (harvest schedule) play a role, etc.
In the CIF, the request coordinator first captures the business user requests, prioritizes them and then profiles them to fully understand the request. Meta data plays an important part in this step - it is used to determine whether a new mart is warranted or an existing one can be enhanced to accommodate the request. If a data mart that can satisfy the request already exists, then the function simply gives the users access to the mart, perhaps adding a bit of new data, a new report or creating a view specifically for that set of users. If a mart does not exist, then the coordinator must begin the process of filtering the right data from the warehouse, formatting it to the correct technological format, and delivering that data to the new mart per the requested schedule.
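As a rough illustration of this flow (not a description of any particular product), a request coordinator could be sketched along the following lines. All names, structures and the 80 percent overlap rule are invented for the example; a real implementation would sit on top of the metadata repository and whatever ETL or data delivery tooling is already in place.

```python
# Schematic, runnable sketch of the request coordinator logic described above.
from dataclasses import dataclass, field

@dataclass
class DataMart:
    name: str
    elements: set             # warehouse data elements the mart carries
    users: set = field(default_factory=set)
    schedule: str = "weekly"

@dataclass
class RequestCoordinator:
    marts: list = field(default_factory=list)

    def handle(self, requester, needed_elements, schedule):
        """Profile a request, then either extend an existing mart or create a new one."""
        needed = set(needed_elements)
        # Step 1: look for an existing mart that already covers most of the request.
        for mart in self.marts:
            if len(needed & mart.elements) >= 0.8 * len(needed):
                mart.users.add(requester)        # grant access to the existing mart
                mart.elements |= needed          # add the small amount of new data
                return mart
        # Step 2: otherwise, stand up a new mart fed from the warehouse on the
        # requested schedule (the actual filter/format/deliver job is out of scope here).
        new_mart = DataMart(name=f"mart_{len(self.marts) + 1}",
                            elements=needed, users={requester}, schedule=schedule)
        self.marts.append(new_mart)
        return new_mart

coordinator = RequestCoordinator()
coordinator.handle("finance", {"revenue", "cost", "region"}, "daily")
coordinator.handle("marketing", {"revenue", "region", "campaign"}, "weekly")
```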
In researching this column, I looked at a number of technologies that could help with this data delivery management problem. Certainly it is possible for you to use your existing ETL tools or even the many bulk data movement technologies (IBM, Microsoft, iWay and other EII capabilities). However, you still need to create the request coordinator function and manage the meta data associated with the data delivery process.
I also found a new technology offered by Certive that manages not only the creation of marts but also the meta data and business rules for each mart creation. This new technology is a bright spot in our industry and deserves consideration.
In any case, regaining control over the data delivery process requires a shift in your existing architecture; however, the benefits of this shift outweigh any disruption to "business as usual." These include reusable delivery code, managed meta data and business rules, potential usage of virtual marts and lowered overall data delivery costs.
Claudia Imhoff, Ph.D., is the president and founder of Intelligent Solutions (www.intelsols.com), a leading consultancy on CRM and business intelligence technologies and strategies. She is a popular speaker and internationally recognized expert and serves as an advisor to many corporations, universities and leading technology companies. She has coauthored five books and more than 50 articles on these topics. Imhoff may be reached at cimhoff@intelsols.com.
Increasingly, large enterprises are recognizing the value of an enterprise data warehouse (EDW) in their information and knowledge strategies. The potential benefits include cost-effective consolidation of data for a single view of the business and creation of a powerful platform for everything from predictive analysis to near real-time strategic and tactical decision support throughout the organization.
Yet many EDW initiatives fall short of this promise. Too often, a company's struggles with designing and implementing a logical data model (LDM) are the source of the problem. The LDM is of vital importance, largely because other key components of enterprise data management rely on it. It is especially disturbing that the frustrations around LDM implementations are often preventable, having more to do with people, process and organizational concerns than with the technology itself. If companies can better understand what is tripping them up when they create and implement LDMs, they will realize the full potential of their data warehousing projects much more quickly.
An LDM is a representation of business concepts laid out in a visual format that clearly shows these concepts and their various relationships. It is independent of the underlying database implementation. Designing an LDM that fits the needs of the business is crucial, not just because it reflects the commitment to treat data as a true enterprise asset but also because it enables efficient and effective storage of data that businesses can readily access to create various information and knowledge products. Conversely, a poorly designed LDM negatively affects many EDW components, making rework quite expensive. The LDM affects:
LDMs also serve as a foundation for data quality. Models that don't follow first normal form or have the wrong relationships often store duplicate data, resulting in loss of data quality. Proper modeling of items such as domains and data types helps validate data quality checks. LDMs also must comply with data governance guidelines and any overall data standards in the enterprise. The amount of history stored in a warehouse depends on the design of LDMs and physical data models (PDMs), thus influencing the storage strategy.
Building an EDW environment is not unlike building a house, which often involves a three-step modeling process that begins with an architect's sketch showing the various rooms. This initial plan does not include construction-specific information such as the type of materials needed, but it does include details such as room dimensions and the general layout of plumbing and electrical systems. Think of this first sketch as the LDM.
When building a house, the architect hands the sketch over to a construction engineer, who creates detailed plans to construct the house. The construction engineer must apply his knowledge of materials and other factors to narrow down the design choices. The end product is an engineer's blueprint that specifies the various materials. For example, if he is using copper tubing for hot water plumbing, this drawing would specify the gauge and diameter of the pipe, among other things.
Finally, there may be many people living in the house with their own expectations about features and functionalities. Also, the same underlying feature may have different applications across the house. For example, even though water is used throughout the house, in the kitchen you need a standard sink and faucet with hot and cold water, in the bathroom you need a shower and sink, and in the swimming pool you only need cold water with an additional filtration/chlorination system. Just as you don't build a plumbing system to satisfy one room or person, you don't build a data infrastructure to satisfy the needs of just one business user. The semantic layer acts as the interface to the specific uses and needs of a business and uses the language/semantics that are understood by specific business areas.
Logical data models: In an ideal scenario, the development of an LDM begins with a broad set of data requirements for the EDW. Using these requirements, along with interviews with subject matter experts (SMEs), you create a conceptual model showing the relationships between key entities.
The next stage is detailed attribution, where you assign primary keys and verify the cardinality, optionality and identifiability of the various relationships. Detailed attribution is created based on the data available in the enterprise for specific applications. This involves mapping data elements in the source environment to the LDM. Further, with the help of SMEs, you also develop metadata (including domains, data types, definitions, comments and notes). Following this, peers and SMEs typically review the model. After review and, if necessary, revision, the model becomes available for the next step in the modeling continuum: the PDM.
Physical data models: For PDMs, data architects create index structures, perform selected denormalizations, create aggregations/summary tables and conduct performance tuning. They also handle staging area tables as part of physical data modeling. Because architects optimize PDMs for the underlying database architecture, PDMs are not easily interchangeable across database platforms.
Semantic data models (SDMs): Designed for specific end-user applications, SDMs give business users a view into the database that reflects specific business area semantics or terminology. SDMs are a layer on top of the PDM that uses database views as well as additional summary or other minor tables. If companies use reporting or OLAP tools, the semantic layer sometimes reflects those tools' characteristics.
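To make the boundary between the three models concrete, here is a small illustrative sketch using SQLAlchemy (version 1.4 or later assumed). The Customer and Account entities, their columns, the index and the view are all invented for the example and are not drawn from the article; the intent is only to show where logical, physical and semantic concerns sit relative to one another.

```python
# Illustrative sketch of the logical/physical/semantic boundary (SQLAlchemy 1.4+).
from sqlalchemy import (Column, Integer, String, Numeric, ForeignKey, Index,
                        create_engine, text)
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

# Logical layer: entities, attributes, keys and relationships,
# independent of any particular database engine.
class Customer(Base):
    __tablename__ = "customer"
    customer_id = Column(Integer, primary_key=True)
    name = Column(String(100), nullable=False)
    accounts = relationship("Account", back_populates="customer")

class Account(Base):
    __tablename__ = "account"
    account_id = Column(Integer, primary_key=True)
    customer_id = Column(Integer, ForeignKey("customer.customer_id"), nullable=False)
    balance = Column(Numeric(18, 2))
    customer = relationship("Customer", back_populates="accounts")

# Physical layer concerns: indexes, denormalizations and engine-specific tuning
# (here just one index and an in-memory SQLite target for the demo).
Index("ix_account_customer_id", Account.customer_id)
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

# Semantic layer: a business-facing view in the terminology of one business area.
with engine.begin() as conn:
    conn.execute(text(
        "CREATE VIEW customer_balances AS "
        "SELECT c.name AS customer_name, SUM(a.balance) AS total_balance "
        "FROM customer c JOIN account a ON a.customer_id = c.customer_id "
        "GROUP BY c.name"
    ))
```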
While creating these models may seem straightforward in theory, in practice a number of roadblocks can emerge that complicate this critical stage of an EDW implementation. The five most significant roadblocks are:
Before proposing any specific solution for overcoming these challenges, organizations should conduct an assessment of existing tools, processes and organizational behaviors; the assessment can be formal if you have the necessary backing or informal if you don't. Either way, use the assessment to discover if any of the following solutions apply to your specific LDM challenges.
Develop a common vision for the enterprise around the purpose and usage of the data warehouse. Along with the vision, a roadmap that shows the implementation schedule for various subject areas will be useful for all stakeholders.
Ensure there is a business and IT champion who can support the LDM development process. This champion should believe in the value of an EDW and have enough authority to overcome roadblocks.
Identify key stakeholders from the business and technology sides who are directly affected by the LDM development and identify their positions (support, neutral, oppose). Use the business champion to defuse opposition, create buy-in and align all key stakeholders. Create a process for keeping everyone informed of all key developments.
Use the business champion to ensure that only the relevant business and technology people are involved in the modeling effort and that SMEs are available to support the modeling effort.
Train key stakeholders. Bring the modeling team up to speed with logical data modeling methodology, especially those who have a data mart or application-based modeling approach. Modelers with domain expertise are ideal, but depending on team skills, training should include sections on topics such as normalization, subtype/supertype, standard modeling patterns and model management. Ensure business users and IT managers understand the value of having an enterprise logical model. Use samples to demonstrate how the LDM can help answer cross-business and ad hoc questions.
Develop a standardized logical modeling methodology with reusable design patterns. Because logical modeling is a fairly mature process, there are a number of white papers, presentations and textbooks that present reusable design patterns.
Develop a model management process. When multiple modelers work on the same model, there should be a process for change control, revisions, authentication and commitment of work.
Develop accelerators to speed up the JAD (joint application design) sessions and reduce business user time. In addition to making sure SMEs are available for validating the model, accelerators such as a straw model with key concepts and definitions help engender discussion and focus the JAD session. You can construct the accelerators based on prior experience, industry standards or other reference models.
Create a process for addressing and validating the lifecycle of an LDM, including requirements, design, development and testing. Design metrics to evaluate the efficacy of the lifecycle and the overall process. These metrics can address budget, usability, completeness and efficiency.
Accept the fact that in many cases, requirements are not available. If domain expertise is available in the modeling team, use that to come up with a set of skeleton requirements and validate these with the business users. You can also use source system design as a starting point to create a straw-man model that can serve as an accelerator.
Develop periodic status reports for all stakeholders to inform them of the progress and list any concerns.
There are a select number of tools available to develop logical data models, and it is essential to translate some of the processes described here into tool-specific templates. This could include such things as a formalization of the modeling conventions, naming standards and model management procedures.
Use a data profiling tool to arrive at standard domains and valid values, among other things. Store metadata required for the enterprise (such as definitions, data steward information, domain values, valid range, allowable values, data type information and relationship phrases) in the tool; make sure that other data management components, such as data quality and data movement, can extract and use that metadata. Most tools will also allow you to separate logical and physical modeling information.
Data warehousing is, at least in part, recognition that data is an extremely valuable enterprise asset that requires rigorous management. Logical data modeling is a key data management discipline, the success of which is critical to ensuring efficient and effective EDW implementation. Understanding and effectively facing the challenges in creating LDMs by using a combination of people, process and technology solutions will help enterprises successfully develop large EDWs.
Sreedhar Srikant is a senior data warehouse consultant in the Teradata Financial Services Risk Center of Excellence. He has been part of several data warehouse implementations. His key interests include data management practices such as metadata, data quality, master data management, data modeling and analytic modeling. He may be reached at sreedhar.srikant@teradata-ncr.com.
This article is an excerpt from the book Performance Dashboards: Measuring, Monitoring, and Managing Your Business by Wayne W. Eckerson, director of research and services at The Data Warehousing Institute, a worldwide association of data warehousing and business intelligence professionals. The book was published in October 2005 and can be ordered at online book dealers.
This summer I found my 11-year-old son Harry and his best pal Jake kneeling side by side in our driveway, peering intensely at the pavement. As I walked over to inspect this curious sight, I saw little puffs of smoke rising from their huddle. Each had a magnifying glass and was using it to set fire to clumps of dry grass as well as a few unfortunate ants that had wandered into their makeshift science experiment.
In this boyhood rite of passage, Harry and Jake learned an important lesson that escapes the attention of many organizations today: the power of focus. Light rays normally radiate harmlessly in all directions, bouncing off objects in the atmosphere and the earth's surface. The boys had discovered, however, that if they focused light rays onto a single point using a magnifying glass, they could generate enough energy to burn just about anything, and keep themselves entertained for hours!
By the time Harry and Jake enter the business world (if they do), they will probably have forgotten this simple lesson. They will have become steeped in corporate cultures that excel at losing focus and dissipating energy far and wide. Most organizations have multiple business units, divisions and departments, each with their own products, strategies, initiatives, applications and systems to support them. Good portions of these activities are redundant at best and conflicting at worst. The organization as a whole spins off in multiple directions at once without a clear strategy. Changes in leadership, mergers, acquisitions and reorganizations amplify the chaos.
To rectify this problem, companies need an "organizational magnifying glass" - something that focuses the work of employees so everyone is going in the same direction. Strong leaders do this. However, even the voice of a charismatic executive is sometimes drowned out by organizational inertia.
Strong leaders need more than just the force of their personality and experience to focus an organization. They need an information system that helps them clearly and concisely communicate key strategies and goals to all employees on a personal basis every day. The system should focus workers on tasks and activities that best advance the organization's strategies and goals. It should measure performance, reward positive contributions and align efforts so that workers in every group and level of the organization are marching together toward the same destination.
In short, what organizations really need is a performance dashboard that translates the organization's strategy into objectives, metrics, initiatives and tasks customized to each group and individual in the organization. A performance dashboard is really a performance management system. It communicates strategic objectives and enables businesspeople to measure, monitor and manage the key activities and processes needed to achieve their goals.
To work this magic, a performance dashboard provides three main sets of functionality, which I will describe in more detail later. Briefly, a performance dashboard lets business people:
A performance dashboard is a powerful agent of organizational change. When deployed properly, it can transform an underperforming organization into a high flyer. Like a magnifying glass, a performance dashboard can focus organizations on the key things it needs to do to succeed. It provides executives, managers and workers with timely and relevant information so they can measure, monitor and manage their progress toward achieving key strategic objectives.
One of the more popular types of performance dashboards today is the balanced scorecard, which adheres to a specific methodology for aligning organizations with corporate strategy. A balanced scorecard is a strategic application, but as we shall soon see, there are other types of performance dashboards that optimize operational and tactical processes that drive organizations on a weekly, daily or even hourly basis.
Executive Dashboards and Cockpits
Although dashboards have long been a fixture in automobiles and other vehicles, business, government and nonprofit organizations have only recently adopted the concept. The trend started among executives who became enamored with the idea of having an "executive dashboard" or "executive cockpit" with which to drive their companies from their boardroom perches. These executive information systems (EISs) actually date back to the 1980s, but they never gained much traction because the systems were geared to so few people in each company and were built on mainframes or minicomputers that made them costly to customize and maintain.
In the past 20 years, information technology has advanced at a rapid clip. Mainframes and minicomputers largely gave way to client/server systems, which in turn were supplanted by the Web as the preeminent platform for running applications and delivering information. Along the way, the economy turned global, squeezing revenues and profits and increasing competition for ever more demanding customers. Executives responded by reengineering processes, improving quality and cutting costs, but these efforts have only provided short-term relief, not lasting value.
Convergence
During the 1990s, organizations began experimenting with ways to give business users direct and timely access to critical information, an emerging field known as business intelligence. At the same time, executives started turning to new performance management disciplines such as balanced scorecards, Six Sigma, economic value add and activity-based costing to harness the power of information to optimize performance and deliver greater value to the business.
These initiatives convinced many executives that they could gain lasting competitive advantage by empowering employees to work proactively and make better decisions by giving them relevant, actionable information. Essentially, executives recognized that the EIS of the 1980s was a good idea but too narrowly focused; everyone, not just executives, needed an EIS. Fortunately, executives did not have to wait long for a solution. At the dawn of the 21st century, business intelligence converged with performance management to create the performance dashboard.
Market Trends
This convergence has created a flood of interest in performance dashboards since the year 2000. A study by The Data Warehousing Institute (TDWI) in 2004 showed that most organizations (51%) already use a dashboard or scorecard and that another 17% are currently developing one. The same study showed that almost one-third of organizations that already have a dashboard or scorecard use it as their primary application for reporting and analysis of data.
Benefits
The reason so many organizations are implementing performance dashboards is a practical one: they offer a panoply of benefits to everyone in an organization, from executives to managers to staff. Here is a condensed list of benefits:
In short, performance dashboards deliver the right information to the right users at the right time to optimize decisions, enhance efficiency and accelerate bottom-line results.
Although many organizations have implemented dashboards and scorecards, not all have succeeded. In most cases, organizations have been tantalized by glitzy graphical interfaces and have failed to build a solid foundation by applying sound performance management principles and implementing appropriate business intelligence and data integration technologies and processes. Here are the common symptoms of less-than-successful solutions:
In the end, performance dashboards are only as effective as the organizations they seek to measure. Organizations without central control or coordination will deploy a haphazard jumble of nonintegrated performance dashboards. However, organizations that have a clear strategy, a positive culture and a strong information infrastructure can deliver performance management systems that make a dramatic impact on performance.
This chapter "What are Performance Dashboards"
is excerpted from Performance Dashboards:
Measuring, Monitoring, and Managing Your Business (October 2005) with
permission from the publisher John Wiley & Sons. You may not make any other use
or authorize others to make any other use of this excerpt in any print or
nonprint format, including electronic or multimedia.
Wayne Eckerson is director of research at The Data Warehousing Institute, the industry's premier provider of in-depth, high-quality training and education in the data warehousing and business intelligence fields. He can be reached at weckerson@tdwi.org.
As the business dynamics of the competitive marketplace accelerate and the velocity of information increases, more and more companies are seeking opportunities to compress and streamline management decision making. Actionable information is now the required mantra for superior performance. This has elevated the importance of key performance indicators (KPIs) and their ability to measure, predict and manage the business health of a company in real (or near real) time. Since the mid-'90s, KPIs have morphed from static siloed measures to dynamic real-time enterprise metrics. Statistical modeling and data mining under the guise of predictive analytics have become critical building blocks in setting the new KPI standard for leading indicators. This journey to KPI maturity and its salient features are illustrated in Figure 1.
Figure 1: The Three Waves of KPI Development
During the first wave of KPI development - The BI Enterprise - KPIs focused on what had happened historically to individual product lines and strategic business units (SBUs). Some of the source data resided in separate data marts, while other data existed in legacy systems and Excel spreadmarts. If one were lucky, an OLAP data mart provided manual drill down, ad hoc query and navigation. Operational managers and business executives were data slaves rather than decision-makers. The ability to calculate, track and react to KPIs was severely limited by data availability, infrastructure integration and software capabilities. Leading indicator KPIs existed in name only - they were retrospective, not prospective, and certainly not actionable.
The second wave of KPI development - The Business Performance Management (BPM) Enterprise - moved the focus from a siloed product/SBU mind-set to an enterprise perspective. Data from product lines and SBUs were integrated into enterprise data warehouses. Business performance management frameworks provided alignment between company strategies and business initiatives, while BPM software suites streamlined KPI tracking and management. The intensive initial focus on financial measures was expanded to incorporate the customer, process and learning perspectives. KPIs, however, tended to be captured as basic metrics. More importantly, most of the KPIs still measured either the past or current health of the company rather than predicting future directions. Collaborative decision making, employee empowerment and information democratization were key features of this wave.
At present, several leading-edge companies are now embarking on the third wave of KPI development - The Predictive Enterprise - and the bar has ratcheted up another notch. The focus has shifted to real-time KPI monitoring, causal analysis and predictive analytics. KPIs are now forecasted using mathematical models to predict future behavior based on current and historical data. The extensive portfolio of statistical techniques includes data mining, segmentation, clustering, regression modeling, market-basket analysis and decision trees. The real challenge in developing effective KPIs is deciding which statistical technique to mate with a specific business problem, as there are significant tradeoffs and assumptions associated with each.
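As a toy illustration of the simplest case the paragraph mentions - forecasting a KPI from its own history with a regression model - a sketch along the following lines could be used. The figures are invented, and a production model would add causal drivers such as price, promotions or seasonality rather than relying on lagged values alone.

```python
# Toy sketch: forecast next period's KPI from its recent history with a
# lagged linear regression (invented data, illustration only).
import numpy as np
from sklearn.linear_model import LinearRegression

kpi_history = np.array([102, 98, 105, 110, 108, 115, 118, 121, 119, 125,
                        128, 131], dtype=float)   # e.g., 12 months of a revenue KPI

lags = 3
X = np.array([kpi_history[i:i + lags] for i in range(len(kpi_history) - lags)])
y = kpi_history[lags:]                            # the value that followed each window

model = LinearRegression().fit(X, y)
next_value = model.predict(kpi_history[-lags:].reshape(1, -1))[0]
print(f"Forecast for next period: {next_value:.1f}")
```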
Kent Bauer is the managing director, Performance Management Practice at GRT Corporation in Stamford, CT. He has more than 20 years of experience in managing and developing CRM, database marketing, data mining and data warehousing solutions for the financial, information services, healthcare and CPG industries. Bauer has an MBA in Statistics and an APC in Finance from the Stern Graduate School of Business, New York University. A published author and industry speaker, his recent articles and workshops have focused on KPI development, BI visioning and predictive analytics. Please contact Bauer at kent.bauer@grtcorp.com.
With the exponential explosion of business data and the accelerated market dynamics, more decisions must be made in compressed time frames. Data mining and its sibling - predictive analytics - now provide a potential avenue to meet this pressurized demand for real-time decision making. Although data mining has been a defined solution space since the 1990s, it is only during the last two years that the data mining process has been enhanced to create embedded predictive analytics. Predictive analytics builds on the data mining multistep process and statistical modeling techniques to add a layer of automation and self-directed built-in intelligence. Business users (and not just Ph.D. statisticians) can now analyze large amounts of customer, supplier, employee and product data for patterns and trends.
Depending on who you talk to, the time of day and the product space being promoted, the predictive analytics definition can be as narrow as "understanding why a variance occurs in real time and what to do about it moving forward" or as broad as "delivering the right insight to the right people in real time for making decisions." In our situation, we are focused on the relationship of the dependent key performance indicators (KPIs) and the associated independent causal variables. It is all about the ability to automatically discover KPI variances and determine the root causes so that improved predictors of a company's business performance can be developed, measured and managed. Predictive analytics is the automated process of sifting through large amounts of data using statistical algorithms and neural nets to identify data relationships between KPIs and critical measures that facilitate prediction of critical success factors.
From a simplistic perspective, predictive analytics automates the data mining process and adds enhanced capabilities such as real-time data capture on the front end and automatic alerting on the back end. Let us first discuss the two major data mining constructs - SEMMA (from SAS) and CRISP-DM (from other data mining industry leaders) - and show how they can be integrated into the predictive analytic framework.
The SEMMA construct starts out with sampling the population data to create a manageable set of data for analysis and then exploring the data visually to determine what types of patterns and trends can be found. Next, the data is modified to ensure data completeness and quality. In some cases the data is bucketized into meaningful classes and enriched with demographic, attitudinal and behavioral data. Modeling uses appropriate statistical techniques such as regression analysis, neural nets, tree-based reasoning and time-series methods to uncover the root causes of the patterns and trends. The final step is assessment, where the models developed using the initial training data are compared against a holdout sample to determine the effectiveness of the forecasting models.
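A compressed sketch of the SEMMA flow might look like the following. The synthetic data, features and model choice are invented for illustration, and the explore and modify steps are reduced to a single line each; the point is only the ordering of sample, explore, modify, model and assess, with the assessment done against a holdout.

```python
# Compressed sketch of the SEMMA flow (Sample, Explore, Modify, Model, Assess)
# on a synthetic customer table; every step is reduced to its simplest form.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
population = pd.DataFrame({
    "tenure_months": rng.integers(1, 120, 10_000),
    "monthly_spend": rng.normal(100, 30, 10_000),
    "churned": rng.integers(0, 2, 10_000),
})

sample = population.sample(n=2_000, random_state=0)           # Sample
print(sample.describe())                                      # Explore (summary only)
sample = sample.assign(                                        # Modify: bucketize tenure
    tenure_band=pd.cut(sample["tenure_months"], bins=[0, 12, 36, 120], labels=False)
)
features = ["tenure_band", "monthly_spend"]
train, holdout = train_test_split(sample, test_size=0.3, random_state=0)
model = DecisionTreeClassifier(max_depth=4).fit(train[features], train["churned"])  # Model
print("Holdout accuracy:",                                     # Assess against holdout
      accuracy_score(holdout["churned"], model.predict(holdout[features])))
```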
The CRISP-DM reference model is broader in context and starts with a business understanding that includes a focus on business goals/objectives and a project plan. Both data collection and exploration have been combined into a module called data understanding. The next step - data preparation - incorporates activities such as data formatting, data cleansing and data integration. The final two modules focus on evaluation of model results and deployment of the models into production.
The bottom line is that both the SEMMA construct and the CRISP-DM reference model converge on key activities such as data capture, modeling, analysis, evaluation and deployment. These provide the core elements for a predictive analytics framework. The trick with predictive analytics is to create a seamless environment so that everything from data collection through model development and deployment is self-directed and untouched by human hands. This includes the six steps illustrated in Figure 1. After the initial source data mapping, the data collection process can proceed with current data incorporated into the updated KPI forecasting models. The KPI business models can be developed with tuned coefficients, and the results automatically evaluated to quantify the "miss" versus the "actual." Simulations and what-if analysis can gauge the impact of alternative business scenarios. The results are communicated in real time via e-mail alerts and scorecard/dashboard beacons. Success of the process lies in the ability of the models to forecast the future rather than just predict the past. Leading rather than lagging KPI forecasting models are the mantra for success.
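The back-end alerting step can be sketched as a simple comparison of forecasts with actuals. The 5 percent tolerance and the print-based notification below are placeholders, not recommendations; a real deployment would route the alerts to e-mail or dashboard beacons and tune the threshold per KPI.

```python
# Minimal sketch of the alerting step: compare actuals with forecasts and flag
# KPI variances beyond a tolerance (threshold and notify() hook are placeholders).
def check_kpi_variances(forecasts, actuals, tolerance=0.05):
    alerts = []
    for kpi, forecast in forecasts.items():
        actual = actuals.get(kpi)
        if actual is None or forecast == 0:
            continue
        variance = (actual - forecast) / forecast          # the "miss" versus the forecast
        if abs(variance) > tolerance:
            alerts.append((kpi, forecast, actual, variance))
    return alerts

def notify(alerts):
    for kpi, forecast, actual, variance in alerts:
        print(f"ALERT {kpi}: forecast {forecast:.1f}, actual {actual:.1f}, "
              f"variance {variance:+.1%}")                  # stand-in for e-mail/beacon

notify(check_kpi_variances(
    forecasts={"revenue": 1_250_000, "churn_rate": 0.031},
    actuals={"revenue": 1_120_000, "churn_rate": 0.030},
))
```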
Figure 1: The KPI Predictive Analytics Process
Although more of the software vendors are offering embedded predictive analytics with extensive "under the covers" modeling solutions for market basket analysis, fraud detection and affinity analysis, the first-cut models need to be reviewed by individuals steeped in the art and science of statistics. There are just too many assumptions and caveats associated with data mining techniques that could lead a business analyst down the slippery slope of misinterpretation. Once the initial models are validated, self-directed and automated updates using embedded predictive analytics can certainly enhance the agility and effectiveness of the decision-making process.
Kent Bauer is the managing director, Performance Management Practice at GRT Corporation in Stamford, CT. He has more than 20 years of experience in managing and developing CRM, database marketing, data mining and data warehousing solutions for the financial, information services, healthcare and CPG industries. Bauer has an MBA in Statistics and an APC in Finance from the Stern Graduate School of Business, New York University. A published author and industry speaker, his recent articles and workshops have focused on KPI development, BI visioning and predictive analytics. Please contact Bauer at kent.bauer@grtcorp.com.