Graph Analytics 101
I’m intrigued by graph analytics. It is a powerful tool for uncovering insights about customer, product, and device/node relationships buried inside social media, telecommunications, healthcare, and computer networks. I wanted to learn more about it and explore specific use cases where graph analytics can lead to new customer, product, campaign, and operational insights. This two-part blog series pulls from a number of very useful sources, which I reference at the end. I hope you learn as much about graph analytics as I have!
Graph Analytics Definition
Let’s start with a definition of graph analytics.
Graph analytics leverages graph structures to understand, codify, and visualize the relationships that exist between people or devices in a network. Built on the mathematics of graph theory, it models pairwise relationships between people, objects, or nodes in a network, and it can uncover insights about the strength and direction of those relationships.
- Strength of relationship: how often do nodes or individuals communicate with each other? What other nodes or individuals tend to join that conversation? How much “weight” should be given to a type of node, based on the analysis you’re conducting?
- Direction of relationship: who typically initiates the conversation? Is it a two-way conversation, or does one always lead? How often and in what situations does the conversation get forwarded to others?
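The two questions above can be made concrete with a small sketch. It assumes a hypothetical log of (sender, receiver) message records; the names, the data, and the helper functions `strength` and `initiator_share` are illustrative and not taken from any particular product. Strength is approximated by how many messages flow between a pair, and direction by who sends the larger share.

```python
from collections import Counter

# Hypothetical message log: (sender, receiver) pairs. Illustrative data only.
messages = [
    ("alice", "bob"), ("alice", "bob"), ("bob", "alice"),
    ("alice", "carol"), ("alice", "carol"), ("alice", "carol"),
    ("dave", "bob"),
]

# Directed edge weights: how often each sender contacts each receiver.
directed = Counter(messages)


def strength(a, b):
    """Total messages exchanged between a and b, in either direction."""
    return directed[(a, b)] + directed[(b, a)]


def initiator_share(a, b):
    """Fraction of the a-b traffic that a sent (1.0 means a always leads)."""
    total = strength(a, b)
    return directed[(a, b)] / total if total else 0.0


print(strength("alice", "bob"))           # 3
print(initiator_share("alice", "carol"))  # 1.0
```

In a real analysis the raw counts would likely be normalized or windowed by time, but even this simple tally separates a two-way conversation (alice and bob) from a one-way one (alice and carol).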
Nodes represent entities such as people, businesses, accounts, devices, ATMs, or any other item you might want to track as part of a network. Properties are pertinent pieces of information that relate to nodes. For instance, if “Wikipedia” were one of the nodes, it might be tied to properties such as “website,” “reference material,” or “word that starts with the letter ‘w’,” depending on which aspects of “Wikipedia” are pertinent to the particular database.
Edges are the lines that connect nodes to nodes or nodes to properties, and they represent the strength and “direction” of the relationship between the two things they connect. Most of the important information is really stored in the edges. Meaningful patterns emerge when one examines the connections and interconnections of nodes, properties, and edges.
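To make the vocabulary concrete, here is a minimal sketch of nodes, properties, and edges in plain Python. The class names (`Node`, `Edge`, `PropertyGraph`) and the sample data are assumptions made for illustration; real graph databases use their own storage models and query languages.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    # Properties: pertinent information attached to the node.
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str          # direction: the edge runs source -> target
    target: str
    label: str           # what kind of relationship this is
    weight: float = 1.0  # strength of the relationship


class PropertyGraph:
    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, name, **properties):
        self.nodes[name] = Node(name, properties)

    def add_edge(self, source, target, label, weight=1.0):
        self.edges.append(Edge(source, target, label, weight))

    def outgoing(self, name):
        """All edges that start at the given node."""
        return [e for e in self.edges if e.source == name]


g = PropertyGraph()
g.add_node("Wikipedia", kind="website", category="reference material")
g.add_node("Alice", kind="person")
g.add_edge("Alice", "Wikipedia", label="edits", weight=12)

for edge in g.outgoing("Alice"):
    print(edge.source, edge.label, edge.target, edge.weight)
```

Note that the edge carries the label, the direction, and the weight, which is one way to see why most of the interesting information lives in the edges.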
Essentially, graphs provide a way of organizing data to specifically highlight relationships between people or devices on or across a network. On such a foundation, it is possible to apply a number of simple to complex analytical techniques to understand groups of similar, related entities, to identify the central influencer in a social network, or to identify complex patterns of behavior indicative of attrition, advocacy, and/or fraud.
As an example, a key to the success of Google’s search engine is a specific graph analytics technique called PageRank. Rather than focusing only on the prevalence of keywords in a web page, Google looked at the links between web pages and prioritized results from highly authoritative sites, which produced far more relevant results for keyword searches.
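The idea behind PageRank can be sketched in a few lines. What follows is a simplified textbook version computed by power iteration, not Google’s production algorithm; the damping factor of 0.85, the iteration count, and the tiny four-page link graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly across its links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page (no outgoing links): spread rank evenly.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical link graph: page -> pages it links to.
links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],
}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph, "home" ends up with the highest score because every other page links to it, while "orphan" scores lowest because nothing links to it. Keywords never enter the calculation; only the link structure does.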
A discussion of graph analytics must include a discussion on graph databases, those technologies that are optimized for performing graph analytics.
A graph database uses graph structures with nodes, edges, and properties to represent and store data. Compared with relational databases, graph databases are often faster for associative data sets, and map more directly to the structure of object-oriented applications. They can scale more naturally to large data sets as they do not typically require expensive join operations. As they depend less on a rigid schema, they are more suitable to manage ad hoc and changing data with evolving schemas.
Power of Graph Analytics: Understanding Relationships
Social media networks such as Facebook and LinkedIn are driven by a fundamental focus on relationships and connections. For example, Facebook users can now use the service’s Graph Search to find friends of friends who live in the same city or like the same baseball team, and the site frequently suggests “people you may know” based on the mutual connections that two unconnected individuals have established. LinkedIn focuses on helping business professionals grow their social networks by helping them find key contacts or prospects that are connected to existing friends or colleagues, and allowing users to leverage those existing relationships to form new connections.
Likewise, the ability to comprehend and assess such relationships is a key component driving the world of business analytics. For example, business managers frequently want to know the answers to questions such as:
- Who are the influencers with the most power to shape the perspectives of others?
- Who are the social drivers and who are their typical followers based upon the topic of discussion?
- What are all the ways in which a person of interest in a crime database may be related to another person of interest?
- Based on known patterns of suspicious behavior in a corporate network, how can we identify malicious hacking attacks before they have a financial impact on our company?
- Which of an organization’s partners have a financial exposure to the failure of another company?
Take the question of how two people might be connected on social media. It seems simple, but on closer inspection it is not so clear. Consider how two people may be connected on Facebook. They can be friends, a direct connection that is hard to miss. Or they might be friends of friends, which starts getting a little murkier. The connections can be even more distant and difficult to pinpoint. For instance, Person A may be married to someone whose brother is a friend of Person B. Or perhaps they have a shared affiliation, such as attending the same school, working at the same organization, or attending the same church.
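Finding how two people are connected is a classic graph search problem. The sketch below uses breadth-first search to find the shortest chain of relationships between two people. The `connections` data mirrors the Person A / Person B example above and is hypothetical, as is the function name `shortest_connection`.

```python
from collections import deque

# Hypothetical undirected connections (each link is listed in both directions).
connections = {
    "Person A": ["Spouse"],
    "Spouse": ["Person A", "Brother"],
    "Brother": ["Spouse", "Person B"],
    "Person B": ["Brother"],
    "Stranger": [],
}


def shortest_connection(graph, start, goal):
    """Return the shortest chain of people linking start to goal, or None."""
    if start == goal:
        return [start]
    visited = {start}
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        for neighbor in graph.get(path[-1], []):
            if neighbor in visited:
                continue
            if neighbor == goal:
                return path + [neighbor]
            visited.add(neighbor)
            queue.append(path + [neighbor])
    return None


print(shortest_connection(connections, "Person A", "Person B"))
# ['Person A', 'Spouse', 'Brother', 'Person B']
print(shortest_connection(connections, "Person A", "Stranger"))
# None
```

Breadth-first search explores direct friends first, then friends of friends, and so on, so the first path it finds to the goal is also the shortest one.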
In some cases, two individuals’ only connection may be a few shared “Likes.” These shared “affinities” may be valuable information to a business if, for example, those “Likes” relate to something your organization offers. In that case, you may want to drill down to those specific people out of the billion users on Facebook, so that you can target your online advertising directly to them.
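Shared affinities are straightforward to express once each person’s “Likes” are treated as a set. The following is a minimal sketch with made-up users and interests; the helper names `shared_likes` and `audience_for` are hypothetical.

```python
# Hypothetical "Likes" per user. Illustrative data only.
likes = {
    "user1": {"baseball", "hiking", "jazz"},
    "user2": {"baseball", "cooking"},
    "user3": {"jazz", "baseball", "chess"},
    "user4": {"cooking"},
}


def shared_likes(a, b):
    """Interests that both users have liked."""
    return likes[a] & likes[b]


def audience_for(topic):
    """Users who liked the given topic, e.g. as an ad-targeting list."""
    return sorted(user for user, liked in likes.items() if topic in liked)


print(sorted(shared_likes("user1", "user3")))  # ['baseball', 'jazz']
print(audience_for("baseball"))                # ['user1', 'user2', 'user3']
```

At the scale of a real social network this lookup would be handled by an index or a graph database rather than an in-memory dictionary, but the underlying question is the same set intersection.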
What do you do to solve problems that involve complex relationship patterns and require detailed link analysis? Enter graph analytics. Graph analytics can help to mine this wealth of relationship data to uncover consumers’ interests, passions, affiliations and associations.
I hope this discussion of such a powerful analytic capability has been useful. Part II of this series on graph analytics will dive into some specific graph analytics use cases.