Data Scientist Lesson: Micro-to-Macro Analytics Modeling

By Bill Schmarzo, September 9, 2013

Recently, I was working with one of our data scientists (Wei Lin, pictured below with his son Lucas on a trip to the Grand Canyon…yes, we do occasionally let our data scientists have a life) on a couple of Vision Workshops.  In the Vision Workshop process, we employ a data scientist to take a small sliver of client data (~5GB) and build some illustrative examples of what the client could do with their detailed transactional data, coupled with new sources of unstructured data and advanced analytic techniques, to uncover new insights about their customers, products and operations.  The purpose of this exercise is to help the client’s business stakeholders start envisioning how they might leverage these new data sources and advanced analytics to power their key business initiatives.

Photo: Wei Lin and his son Lucas

As part of that process, the data scientist will build some very detailed predictive models of the behaviors and tendencies of customers, merchants, wind turbines, students, etc.  For the purposes of the Vision Workshop (and to keep the Vision Workshop process nimble and agile), we typically work with the data from only a couple of customers, partners, products, turbines, students, etc.  The question naturally arises: how do we scale the detailed data prep and analysis work we did on that single customer/partner/turbine to the millions of customers/partners/turbines that the client might have?  To be honest, I had the same question.

At this point, Wei introduced the “micro-to-macro analytics modeling” process.  This process truly required me to think differently about the analytics process and to throw aside my traditional thinking about how to develop customer profile, segmentation and targeting models.  Let me walk through this “micro-to-macro analytics modeling” process and show how this line of thinking can create an order-of-magnitude change in the level of insights that are uncovered, and in the resulting “actionability” of those insights.

Micro-to-Macro Analytics Modeling

Figure 1 provides a high-level overview of the micro-to-macro analytics modeling process.  I’ll explain each of the steps in more detail.


Figure 1: Micro-to-Macro Analytics Process

Step 1:  Data Prep and Model Development

In step 1, the data scientist takes the detailed transactional data for that single customer, partner or product, and builds a predictive model on that individual customer, partner or product.  For our example, we have access to multiple years of daily credit card transactions, and have coupled the daily credit card transaction data with customer-specific internal CRM data (consumer comments, email threads) as well as social media data.  We then used the association rule learning methodology to develop a predictive model about that particular customer and their merchant usage propensities (see Figure 2: Customer Predictive Model below).


Figure 2: Customer Predictive Model

This association rule model yields two types of insights about a customer’s merchant engagement propensities that we can convert into rules:

  1. Market basket or association analysis, which shows which merchants the customer is likely to visit in combination on the same day.  For example, “Customer X visits CVS and Trader Joe’s together 83% of the time on the same day”.  Note:  if we had the exact time of each credit card transaction, then we would be able to analyze and segment the data into multiple “trips” that occur within the same day.  This could yield much more detailed, more actionable insights such as “Customer X visits CVS and Trader Joe’s 55% of the time on the same trip.”
  2. Sequencing of transactions, which shows which merchants the customer tends to visit before visiting a given merchant.  For example, “Customer X visits Safeway 41% of the time after visiting Chipotle and Starbucks on the same day.”   Again, having the exact time of the transactions would yield more granular, more actionable insights.
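To make these two kinds of insight concrete, here is a minimal Python sketch. It is not the model Wei built; it simply counts same-day co-visits and same-day visit order for a single customer. The transaction list, the function names, and the assumption that transactions arrive in visit order within each day are all hypothetical.

```python
from collections import defaultdict

# Hypothetical transactions for one customer: (date, merchant), listed in visit order.
transactions = [
    ("2013-01-01", "CVS"), ("2013-01-01", "Trader Joe's"),
    ("2013-01-02", "CVS"), ("2013-01-02", "Trader Joe's"),
    ("2013-01-03", "CVS"),
    ("2013-01-04", "Chipotle"), ("2013-01-04", "Starbucks"), ("2013-01-04", "Safeway"),
    ("2013-01-05", "Chipotle"), ("2013-01-05", "Starbucks"),
]

def daily_baskets(txns):
    """Group merchants by day, keeping the first-visit order within each day."""
    baskets = defaultdict(list)
    for day, merchant in txns:
        if merchant not in baskets[day]:
            baskets[day].append(merchant)
    return baskets

def co_visit_rate(baskets, a, b):
    """Share of the days with a visit to `a` that also include a visit to `b`."""
    days_with_a = [m for m in baskets.values() if a in m]
    if not days_with_a:
        return 0.0
    return sum(b in m for m in days_with_a) / len(days_with_a)

def follow_rate(baskets, before, after):
    """Share of the days containing every merchant in `before` on which
    `after` is visited later the same day than all of them."""
    days = [m for m in baskets.values() if all(b in m for b in before)]
    if not days:
        return 0.0
    hits = sum(
        after in m and all(m.index(b) < m.index(after) for b in before)
        for m in days
    )
    return hits / len(days)

baskets = daily_baskets(transactions)
print(co_visit_rate(baskets, "CVS", "Trader Joe's"))
print(follow_rate(baskets, ["Chipotle", "Starbucks"], "Safeway"))
```

With exact timestamps, the same functions could be applied to “trips” instead of days by changing only the grouping key in `daily_baskets`.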

In another example, we were dealing with a wind turbine’s series of error codes.  Using the association rule model, we could develop rules such as “When error code 10123 appears, followed within 30 minutes by error code 9874 for Wind Turbine 41, then there is an 80% probability that there is a potentially fatal problem with the turbine gear box.”
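A rule like this one can be checked against an event log with a few lines of Python. The sketch below is illustrative only: the event log, the failure timestamps, the function names, and the 24-hour horizon used to decide whether a failure “followed” the pattern are assumptions, not details from the workshop.

```python
from datetime import datetime, timedelta

def find_sequences(events, first_code, second_code, window=timedelta(minutes=30)):
    """Return (t_first, t_second) pairs where second_code follows first_code
    within `window`. `events` is a list of (timestamp, code) sorted by time."""
    matches = []
    for i, (t1, c1) in enumerate(events):
        if c1 != first_code:
            continue
        for t2, c2 in events[i + 1:]:
            if t2 - t1 > window:
                break  # later events are outside the window
            if c2 == second_code:
                matches.append((t1, t2))
                break
    return matches

def rule_confidence(matches, failure_times, horizon=timedelta(hours=24)):
    """Share of matched sequences followed by a logged failure within `horizon`
    (the horizon is an assumption made for this sketch)."""
    if not matches:
        return 0.0
    hits = sum(
        any(timedelta(0) <= f - t2 <= horizon for f in failure_times)
        for _, t2 in matches
    )
    return hits / len(matches)

# Hypothetical error log for one turbine.
events = [
    (datetime(2013, 1, 1, 8, 0), 10123),
    (datetime(2013, 1, 1, 8, 20), 9874),   # within 30 minutes: a match
    (datetime(2013, 1, 1, 12, 0), 10123),
    (datetime(2013, 1, 1, 13, 0), 9874),   # 60 minutes later: not a match
]
failures = [datetime(2013, 1, 1, 15, 0)]   # hypothetical gear box failure

matches = find_sequences(events, 10123, 9874)
print(len(matches), rule_confidence(matches, failures))
```

The probability quoted in a rule corresponds to `rule_confidence` computed over a turbine’s full history rather than over a toy log like this one.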

The ability to leverage the association rule model on detailed customer transactions in order to quantify the sequencing of customer activities forms one of the foundations for enabling the delivery of real-time, location-based marketing offers.

Running the above association rule model against customer 200214 yields the following business insights or rules (see Figure 3:  Customer 200214 Purchase Behaviors Insights below).


Figure 3: Customer 200214 Purchase Behaviors Insights

The above chart shows us, for Customer 200214, the merchants they tend to visit, in what combinations, which merchant they are likely to visit next, and the associated lift and confidence in the rule.  The rule reads “{A | B} –> {C}”, which means that if the events on the left side of the rule happen, then the rule predicts the probability that the event(s) on the right will happen next.  For example, the rule {onions, ketchup, mustard} –> {hamburger} found in the sales data of a supermarket would indicate that if a customer buys onions, ketchup and mustard together, he or she is likely to also buy hamburger.
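For readers who have not worked with lift and confidence, the sketch below computes the standard support, confidence and lift of a rule over a list of baskets. The baskets are made up for illustration, and the function treats a rule as plain co-occurrence within a basket rather than as an ordered sequence.

```python
def rule_metrics(baskets, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    support    = share of baskets containing both sides of the rule
    confidence = share of baskets with the antecedent that also hold the consequent
    lift       = confidence divided by the overall share of baskets with the consequent
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(baskets)
    n_a = sum(antecedent <= b for b in baskets)
    n_c = sum(consequent <= b for b in baskets)
    n_ac = sum((antecedent | consequent) <= b for b in baskets)
    support = n_ac / n if n else 0.0
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift

# Hypothetical supermarket baskets.
baskets = [
    {"onions", "ketchup", "mustard", "hamburger"},
    {"onions", "ketchup", "mustard", "hamburger"},
    {"onions", "ketchup", "mustard"},
    {"hamburger"},
    {"milk"},
]

support, confidence, lift = rule_metrics(
    baskets, {"onions", "ketchup", "mustard"}, {"hamburger"}
)
print(support, confidence, lift)
```

A lift above 1 means the left side makes the right side more likely than it is across all baskets; a lift near 1 means the rule adds little, however high its confidence.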

Steps 2 & 3: Create Millions of Business Rules

The next step is to run the customer predictive model against ALL your customers, all 30M+ of them (Step 2). This is where Wei challenged me to think differently: using today’s big data technologies, you can run this model against millions of individual customers to yield these detailed analytic rules or insights.  The output of this process will be a bevy of business rules (much like the business rules that we uncovered when we ran this predictive model against customer 200214, only multiplied 30M+ times).  As a result, you will uncover potentially tens or hundreds of millions of insights or business rules (Step 3).
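The key property that makes this step scale is that each customer’s rules are mined independently of every other customer’s. The sketch below shows that shape in plain Python, using simple one-merchant-to-one-merchant rules and a confidence cutoff. The data, the function names and the 0.5 threshold are assumptions for illustration; customer 200215 is invented. At 30M+ customers the same per-customer function would be handed to a distributed platform rather than looped over on one machine.

```python
from collections import defaultdict
from itertools import permutations

def mine_pair_rules(baskets, min_confidence=0.5):
    """Mine rules {a} -> {b} from one customer's daily baskets.
    Returns {(a, b): confidence} for rules at or above min_confidence."""
    count = defaultdict(int)
    pair_count = defaultdict(int)
    for basket in baskets:
        for merchant in basket:
            count[merchant] += 1
        for a, b in permutations(basket, 2):
            pair_count[(a, b)] += 1
    return {
        (a, b): n / count[a]
        for (a, b), n in pair_count.items()
        if n / count[a] >= min_confidence
    }

def mine_all_customers(baskets_by_customer, min_confidence=0.5):
    """Apply the per-customer model to every customer independently."""
    return {
        customer_id: mine_pair_rules(baskets, min_confidence)
        for customer_id, baskets in baskets_by_customer.items()
    }

# Hypothetical daily baskets (sets of merchants) per customer.
baskets_by_customer = {
    "200214": [{"Starbucks", "Chipotle"}, {"Starbucks", "Chipotle"}, {"Starbucks"}],
    "200215": [{"Foot Locker", "Best Buy"}, {"Best Buy"}],
}

rules_by_customer = mine_all_customers(baskets_by_customer)
for customer_id, rules in rules_by_customer.items():
    print(customer_id, len(rules), "rules")
```

Because `mine_pair_rules` only ever sees one customer’s data, the outer loop can be split across as many workers as are available without changing the results.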

Step 4:  Aggregate Business Insights And Rules Into Common Segments

Once you have these tens or hundreds of millions of insights or rules, you are now in a position to (1) aggregate common or similar rules into the same customer or merchant usage segments, (2) ascertain the strength (lift plus confidence level) and actionability of the resulting segments, and (3) determine if enough customers fall into that segment to make it worth acting on.

For example, you could aggregate all of your customers who have the strong {Starbucks, Chipotle} propensity into the same (very cool) behavioral segment.  What you will find in this process is that customers don’t just fall into one segment, but might actually fall into multiple segments – {Starbucks, Chipotle}, {Foot Locker, Best Buy}, etc.
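Here is a minimal sketch of that aggregation, assuming each customer’s rules have already been mined into a dictionary of (merchant, merchant) pairs with a confidence value. The customer IDs, thresholds and function name are hypothetical, and for brevity the strength test uses confidence alone; a fuller version would also filter on lift.

```python
from collections import defaultdict

def build_segments(rules_by_customer, min_confidence=0.6, min_customers=2):
    """Group customers by the merchant pairs in their strong rules.

    A customer joins the segment for every pair whose rule meets
    min_confidence, so one customer can land in several segments.
    Segments with fewer than min_customers members are dropped."""
    segments = defaultdict(set)
    for customer_id, rules in rules_by_customer.items():
        for (a, b), confidence in rules.items():
            if confidence >= min_confidence:
                segments[frozenset((a, b))].add(customer_id)
    return {
        pair: members
        for pair, members in segments.items()
        if len(members) >= min_customers
    }

# Hypothetical per-customer rules: {(merchant_a, merchant_b): confidence}.
rules_by_customer = {
    "c1": {("Starbucks", "Chipotle"): 0.8, ("Foot Locker", "Best Buy"): 0.7},
    "c2": {("Starbucks", "Chipotle"): 0.9},
    "c3": {("Foot Locker", "Best Buy"): 0.65, ("CVS", "Safeway"): 0.9},
}

segments = build_segments(rules_by_customer)
for pair, members in segments.items():
    print(sorted(pair), sorted(members))
```

In this toy data, customer `c1` appears in both the {Starbucks, Chipotle} and {Foot Locker, Best Buy} segments, while the {CVS, Safeway} pair is dropped because only one customer shows it.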

Step 5:  Create More Personalized Customer Offers

Having these new, more tightly focused, more relevant customer segments enables clients to dramatically improve the profiling, segmentation and targeting of their customer base.  Instead of trying to force-fit customers into a few (10 to 50) highly generalized customer segments, we could instead create thousands of customer segments that allow marketing offers and campaigns to be more focused, more relevant and, ultimately, more profitable.  The same approach applies to products: it enables us to group products (e.g., wind turbines) into multiple maintenance categories based upon the detailed rules that we create about each individual product’s performance (e.g., error codes, vibrations, sensor data) in order to improve maintenance scheduling and optimize product performance.


I love this “micro-to-macro analytics approach” because it fuels the creative juices of the organization. It pushes us to look differently at how we (1) analyze each individual customer, partner and/or product in order to (2) create detailed behavioral or performance insights or rules that we can (3) group into new, more focused, more relevant and ultimately more profitable segments.

It’s this sort of data scientist thinking that holds the potential to transform how data powers big business.
