Building a data-driven channel attribution model using Markov chains
By ARBEN KQIKU
Using Markov chains for optimal channel attribution
Markov chains are one of several modeling options for multi-channel attribution. They are a powerful tool for estimating probabilities from a series of sequential events. Their range of application is enormous and has long fueled research in fields as far apart as predictive inventory management, airport queuing, and the design of search-engine page-ranking algorithms. In digital marketing, running a Markovian analysis on a dataset of customer journeys and their conversion paths lets us weight the effectiveness of each channel in relation to the desired outcome, a conversion. We do this in two stages: first, by calculating the transition probabilities between channels and, second, by removing each channel from the journeys in turn to measure the impact of its absence and, therefore, its relative importance to the conversion. Make sense? Let’s have a look at this in a little more detail.
Transition probabilities
Each customer journey has a number of steps that can involve any number of channels. Imagine that within a campaign you monitor three such channels: C1) Social Media, C2) Paid Search, and C3) Organic Search. Before converting, a customer will pass through any number of these three channels, creating what is in effect a unique journey with a not-so-unique statistical footprint. By aggregating the data from all these journeys, we can calculate what are called transition probabilities: how likely a customer is to go from C1 to C2, C3 to C1, C2 to C1, and so on, and how likely each channel is to be followed by a conversion or by the customer dropping out.
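To make the idea concrete, here is a minimal sketch in Python (not the R code used in our analysis) that counts channel-to-channel moves and turns them into transition probabilities. The four journeys are made up for illustration, and the state names "(start)", "(conversion)", and "(null)" are labels we assume here for the beginning of a journey, a conversion, and a drop-out.

```python
from collections import Counter, defaultdict

# Hypothetical journeys: the ordered channels a customer touched,
# plus a flag for whether the journey ended in a conversion.
journeys = [
    (["Social Media", "Paid Search"], True),
    (["Organic Search"], False),
    (["Paid Search", "Organic Search"], True),
    (["Social Media"], False),
]


def transition_probabilities(journeys):
    """Estimate first-order transition probabilities from observed journeys."""
    counts = defaultdict(Counter)
    for channels, converted in journeys:
        # Wrap each journey in a start state and an absorbing end state.
        end = "(conversion)" if converted else "(null)"
        states = ["(start)"] + channels + [end]
        # Count every observed move from one state to the next.
        for current, nxt in zip(states, states[1:]):
            counts[current][nxt] += 1
    # Divide each count by the total number of moves out of that state.
    return {
        state: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for state, nexts in counts.items()
    }


probs = transition_probabilities(journeys)
for state, nexts in probs.items():
    print(state, nexts)
```

In this toy dataset, Social Media is followed by Paid Search in one journey and by a drop-out in the other, so each of those transitions gets a probability of 0.5.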
Removal effects
In the second stage of this process, we calculate what is called the removal effect which, unsurprisingly, shows us the overall effect of removing a channel from the customer journeys. How many conversions do we lose when we remove, in turn, the Social Media, Paid Search, or Organic Search channels? The larger the share of conversions lost, the more important the channel. Taken together, these numbers give us a comprehensive picture of the customer journeys and a properly weighted value for each channel’s impact on conversions.
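The removal step can be sketched in a few lines of Python as well. This is an illustration under stated assumptions, not our production code: the transition probabilities below are made up, the state names "(start)", "(conversion)", and "(null)" are our own labels, and normalizing the removal effects so they sum to one is one common convention for turning them into attribution shares.

```python
# Hypothetical transition probabilities between states.
probs = {
    "(start)": {"Social Media": 0.5, "Paid Search": 0.25, "Organic Search": 0.25},
    "Social Media": {"Paid Search": 0.5, "(null)": 0.5},
    "Paid Search": {"Organic Search": 0.5, "(conversion)": 0.5},
    "Organic Search": {"(conversion)": 0.5, "(null)": 0.5},
}
CHANNELS = ["Social Media", "Paid Search", "Organic Search"]


def conversion_probability(probs, removed=None, sweeps=200):
    """Chance of eventually converting when starting from "(start)".

    A removed channel keeps a conversion chance of zero, so every
    journey that would have passed through it counts as lost.
    """
    reach = {state: 0.0 for state in probs}
    reach["(conversion)"] = 1.0
    reach["(null)"] = 0.0
    # Repeated sweeps settle on the answer for chains in which every
    # journey eventually ends; a linear solve is the exact alternative.
    for _ in range(sweeps):
        for state, nexts in probs.items():
            if state != removed:
                reach[state] = sum(p * reach[nxt] for nxt, p in nexts.items())
    return reach["(start)"]


baseline = conversion_probability(probs)

# Removal effect: the share of conversions lost without the channel.
effects = {
    channel: 1 - conversion_probability(probs, removed=channel) / baseline
    for channel in CHANNELS
}

# Normalize the effects so the attribution shares sum to one.
total = sum(effects.values())
shares = {channel: effect / total for channel, effect in effects.items()}

for channel in CHANNELS:
    print(f"{channel}: removal effect {effects[channel]:.3f}, share {shares[channel]:.3f}")
```

With these made-up numbers, half of all journeys convert, and removing Paid Search loses 75% of those conversions, which makes it the most important of the three channels.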
Our instruments and methodologies
We recently performed a Markovian analysis in this way on a dataset from our portfolio that included over 40,000 unique customer journeys. The desired conversion for this particular client was a reservation of a logistics service. We built a pipeline carrying the raw data from Google Analytics 360 to Google BigQuery, where we created a table showing each unique client ID, the date on which each client visited the website, and whether they converted. Using R, a programming language for statistical computing, we calculated and piped in the transition probabilities and importance values. Finally, we plotted all this data together using R’s ggplot2 data visualization package.
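Before any modeling, a visit-level table like this has to be turned into one ordered path per client. The sketch below shows one way to do that in Python rather than R. The rows are invented, and the channel column is an assumption on our part: the table described above would need some such field to say which channel each visit came through.

```python
from collections import defaultdict

# Hypothetical rows: (client_id, visit_date, channel, converted).
# The channel field is an assumption made for this illustration.
rows = [
    ("a1", "2021-03-02", "Paid Search", True),
    ("a1", "2021-03-01", "Social Media", False),
    ("b2", "2021-03-01", "Organic Search", False),
]


def build_journeys(rows):
    """Group visits by client and order them by date into journeys."""
    visits = defaultdict(list)
    for client_id, visit_date, channel, converted in rows:
        visits[client_id].append((visit_date, channel, converted))

    journeys = {}
    for client_id, client_visits in visits.items():
        # ISO-formatted dates sort chronologically as plain strings.
        client_visits.sort(key=lambda visit: visit[0])
        path = [channel for _, channel, _ in client_visits]
        # A journey counts as converted if any of its visits converted.
        did_convert = any(converted for _, _, converted in client_visits)
        journeys[client_id] = (path, did_convert)
    return journeys


journeys = build_journeys(rows)
for client_id, (path, did_convert) in journeys.items():
    print(client_id, " > ".join(path), did_convert)
```

Each resulting journey is a list of channels in the order they were visited, together with a conversion flag, which is the shape of input the two modeling stages work from.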
What you can expect as results
With the results, we were able to show in detail how effective each channel had been. In this case, 35.1%, 38.5%, and 23.6% of conversions could be attributed to the Organic Search, Paid Search, and Direct (which includes unclassifiable sources and direct entry of the website’s URL) channels, respectively.
Within these numbers we can further illustrate, for example, that Paid Search reinforced other top-performing channels by contributing directly to 13.39% of Organic Search conversions, and 7.29% of Direct channel conversions.
Get a broader perspective
So, when you’re deciding on how to attribute channels and resources within your digital marketing strategy, be aware that there are hidden stories within the data that your current model may not be picking up on. Markov chains are just one of the methods you can use to get a broader perspective.
Would you too like to identify what journey your customers take before converting on your site? Contact us today for a free strategy session.