Best Practices for Analytics Profiles

By Bill Schmarzo, July 8, 2014

In our Big Data engagements, we talk about the importance of building detailed “profiles” of our most important entities, such as customers, products, devices, machines, employees, partners, stores, wind turbines, cars, ATMs, etc. As part of our data science process, we build a profile on each individual entity that:

1)     Captures that entity’s tendencies, propensities, patterns, trends, behaviors, relationships, associations, affiliations (plus, in the case of humans, interests and passions)

2)     Compares that entity’s current state and recent transactions, activities, and interactions to their individual profile in order to flag “unusual” activities and behaviors that might be indicative of a problem or monetization opportunity

But what do we mean by the word “profile,” and what elements might comprise a profile?

Defining and Building a Profile

A profile is a combination of metrics, key performance indicators, scores, business rules, and analytic insights that together describe the tendencies, behaviors, and propensities of an individual entity (customer, device, partner, machine). The profile could include:

  • Key demographic data such as age, gender, education level, home location, marital status, income level, wealth level, make and model of car, age of car, age of children, gender of children, and other data. For a machine, it might include model type, physical location, manufacturer, manufacturer location, purchase date, last maintenance date, technician who performed the last maintenance, etc.
  • Key transactional metrics such as number of purchases, purchase amounts, returns, frequency of visits, recency of visits, payments, claims, calls, social posts, etc. For a machine, that might include miles and/or hours of usage, most recent usage time and date, type of usage, usage load, who operated the product, route of product usage (for something like a truck, car, airplane, or train)
  • Scores (combinations of multiple metrics) that measure customer satisfaction level, financial risk tolerance, retirement readiness, FICO, advocacy grade, likelihood to recommend (LTR), and other data. For a machine, that might include performance scores, reliability scores, availability scores, capacity utilization scores, and optimal performance ranges, among other things
  • Business rules inferred using association analysis; for example, if CUST_101 visits a certain Starbucks and a certain Walgreens, we can predict (with a 90% confidence level) that there is an 85% likelihood that this customer will visit a certain Chipotle within 60 minutes
  • Group or network relationships (number, strength, direction, sequencing, and clustering of relationships) that capture interests, passions, associations, and affiliations gained from using graph analysis
  • Coefficients, found through regression analysis, that predict certain outcomes or responses from a set of independent variables; for example, a machine’s likelihood of breaking down given a number of interrelated variables such as usage loads since last maintenance, the technician who performed the maintenance, the machine manufacturer, temperatures, humidity, elevation, traffic, idle time, etc.
  • Behavioral groupings of like or similar machines or people based upon usage transactions (purchases, returns, payments, web clicks, call detail records, credit card payments, claims, etc.) using clustering, K-nearest neighbor (KNN), and segmentation analysis
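
To make the business-rule item above a little more concrete, here is a minimal sketch of how the support and confidence of a rule like "Starbucks and Walgreens, then Chipotle" might be computed. The visit data and the `rule_stats` helper are hypothetical, and each set simply lists the places a customer visited within one 60-minute window, so the ordering of visits is ignored. The 85% likelihood in the example above plays roughly the role that association analysis calls the confidence of the rule.

```python
from typing import Iterable, List, Set, Tuple


def rule_stats(
    baskets: List[Set[str]], antecedent: Iterable[str], consequent: str
) -> Tuple[float, float]:
    """Return (support, confidence) for the rule antecedent -> consequent."""
    antecedent = set(antecedent)
    total = len(baskets)
    # Windows that contain every place in the antecedent.
    with_antecedent = [b for b in baskets if antecedent <= b]
    # Of those, windows that also contain the consequent.
    with_both = [b for b in with_antecedent if consequent in b]
    support = len(with_both) / total if total else 0.0
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


# Hypothetical data: places one customer visited in each 60-minute window.
visits = [
    {"Starbucks", "Walgreens", "Chipotle"},
    {"Starbucks", "Walgreens", "Chipotle"},
    {"Starbucks", "Walgreens"},
    {"Starbucks", "Chipotle"},
    {"Walgreens"},
]

support, confidence = rule_stats(visits, {"Starbucks", "Walgreens"}, "Chipotle")
print(f"support={support:.2f} confidence={confidence:.2f}")
```

With this toy data, the rule holds in two of the three windows that contain both Starbucks and Walgreens. A production version would more likely use a dedicated association-mining library and respect the time order of visits.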

Example Customer Profile

A profile could be made up of hundreds, if not thousands, of different metrics and scores. Used in combination against a specific business initiative (customer retention, up-sell, or reference; predictive maintenance; supplier quality; on-time shipments), they can improve the predictive capabilities of the model.

Let’s review in the table below what a profile might look like for a particular customer. Note that I have grossly oversimplified the profile to facilitate the explanation and because I can’t process anything more complex myself. My data science team is probably rolling over laughing in their Python, R, Mahout and SAS toolsets as they read this.

| Profile Variable | Historical Score | Variance σ | 4-week Score | Unusual Flag? |
| --- | --- | --- | --- | --- |
| Demographics (Age, Gender, Income, Education) | | | | |
| Retirement Planning | 90 | 1.25 | 92 | |
| Retirement Readiness | 65 | 1.75 | 66 | |
| Disposable Income | 95 | 1.50 | 94 | |
| Insurance Risk | 45 | 1.10 | 45 | |
| Financial Risk Tolerance | 50 | 1.25 | 52 | |
| Pregnancy Likelihood | 0 | 1.00 | 0 | |
| Divorce Likelihood | 2 | 1.00 | 2 | |
| Health Score | 94 | 1.05 | 94 | |
| Exercise Frequency | 81 | 1.45 | 78 | |
| Preferences (based upon Purchases, Web Browsing, Search, Mobile Apps, GPS) | | | | |
| Starbucks Score | 95 | 1.25 | 92 | |
| Chipotle Score | 88 | 1.60 | 85 | |
| Air Travel Score | 82 | 1.90 | 80 | |
| United Airlines Score | 70 | 2.25 | 50 | X |
| SWA Airlines Score | 45 | 3.10 | 45 | |
| Virgin Airlines Score | 25 | 4.50 | 50 | X |
| Automobiles | 20 | 2.20 | 85 | XX |
| Rules: A\|B -> C (based upon Purchase transactions, GPS tracking, Mobile Apps) | | | | |
| Stanford Starbucks -> Stanford Shell Station | 55 | 2.50 | 50 | |
| Stanford Starbucks \| Oregon Ave Chipotle -> Middlefield Walgreens | 60 | 3.25 | 55 | |
| United ORD -> Chicago Uber + Schaumburg Renaissance | 45 | 1.55 | 55 | |
| EPA Starbucks -> EPA Sports Authority | 45 | 2.55 | 15 | XX |
| Relationships (Emails, Texts, Social Media, Phone Calls) | | | | |
| Carolyn Doe | 98 | 1.05 | 98 | |
| Amelia Doe | 98 | 1.01 | 99 | |
| Wei Lin | 55 | 2.25 | 99 | X |
| John Smith | 85 | 1.56 | 25 | XX |
| Associations (Social Media, Email, Web Browsing, Search) | | | | |
| Chicago Cubs | 85 | 1.75 | 75 | |
| Baltimore Orioles | 82 | 2.25 | 10 | XX |
| Golden State Warriors | 78 | 2.35 | 84 | |
| EMC | 86 | 1.45 | 88 | |
| Kool Big Data Startup | 35 | 3.75 | 80 | XX |
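
The comparison of recent activity against the historical profile can be sketched in a few lines. The post does not say how the "Unusual" flags were derived, so the rule below is an assumption: it looks only at the point gap between the 4-week score and the historical score, ignores the variance column, and uses two hypothetical cut-offs (`UNUSUAL_GAP` and `VERY_UNUSUAL_GAP`). The `ProfileScore` class and `unusual_flag` function are illustrative names, not part of any real toolset.

```python
from dataclasses import dataclass


@dataclass
class ProfileScore:
    name: str
    historical: float  # long-run score held in the profile
    recent: float      # 4-week score


# Hypothetical cut-offs, in score points; the real rule is not stated in the post.
UNUSUAL_GAP = 20
VERY_UNUSUAL_GAP = 30


def unusual_flag(score: ProfileScore) -> str:
    """Return 'XX', 'X', or '' depending on how far recent drifts from historical."""
    gap = abs(score.recent - score.historical)
    if gap >= VERY_UNUSUAL_GAP:
        return "XX"
    if gap >= UNUSUAL_GAP:
        return "X"
    return ""


# Three rows taken from the table.
rows = [
    ProfileScore("Starbucks Score", 95, 92),
    ProfileScore("United Airlines Score", 70, 50),
    ProfileScore("Automobiles", 20, 85),
]
for row in rows:
    print(f"{row.name}: {unusual_flag(row) or '-'}")
```

These cut-offs happen to agree with most of the flags in the table but not all of them: Wei Lin is marked X in the table and would come out as XX here. A rule that scales the gap by each variable's variance would be a natural alternative.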

Some metrics and scores are more important than others, depending upon the business initiative being addressed. For a financial services firm focused on customer acquisition, certain data (disposable income, retirement readiness, life stage, age, education level, and number of family members) may be the most important predictive metrics. For customer retention, however, metrics such as advocacy, customer satisfaction, risk comfort score, social network associations, and select social media relationships may be the most important predictive metrics.
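
One simple way to picture this is a lookup from business initiative to the profile metrics that feed its model. The sketch below is only an illustration: the metric lists follow the paragraph above, while the snake_case keys, the `features_for` helper, and the sample profile values for advocacy and age are hypothetical.

```python
# Metric names follow the paragraph above; the snake_case keys are hypothetical.
INITIATIVE_METRICS = {
    "customer_acquisition": [
        "disposable_income", "retirement_readiness", "life_stage",
        "age", "education_level", "family_members",
    ],
    "customer_retention": [
        "advocacy", "customer_satisfaction", "risk_comfort_score",
        "social_network_associations", "social_media_relationships",
    ],
}


def features_for(profile: dict, initiative: str) -> dict:
    """Pick the profile metrics relevant to one initiative, skipping missing ones."""
    wanted = INITIATIVE_METRICS[initiative]
    return {name: profile[name] for name in wanted if name in profile}


# Hypothetical profile fragment for one customer.
profile = {"disposable_income": 95, "retirement_readiness": 65, "advocacy": 80, "age": 47}
print(features_for(profile, "customer_acquisition"))
```

The same profile can then serve several initiatives, with each model reading only the subset of metrics it needs.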

In my next blog, I’ll take a look at how to use these profiles in a customer retention example.
