This is the second post in our series, “Why your main goal should not be to build a unified customer journey”. The aim of this post is to answer the question: if one should NOT strive to build a unified customer journey, what should one strive for instead?
We have chosen to divide the solution into three different approaches, all related to creating a customer or marketing data warehouse and activation hub. We’ll go deeper into each of the areas, but to begin with, here is a quick overview:
3 approaches summarised
1. The cookie-less marketing data warehouse
What is possible with this approach? Get the optimal top level and per channel budget allocation. Understand what each channel or campaign is worth.
What is not possible with this approach? Evaluate marketing performance based on path-based, data-driven attribution methods or through incrementality (lift) studies; audience segmentation/activation, CLV calculations, and improved bidding signals.
Data needed: Aggregated marketing performance and cost data (daily) and aggregated sales data (no cookie, user, or customer IDs)
Activation possibilities: Cross channel marketing performance reporting, cross channel budget allocation, intra channel (across campaigns) budget allocation.
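To make the first approach concrete, here is a minimal sketch of how per-channel contribution could be estimated from nothing but aggregated daily spend and sales (no cookie, user, or customer IDs). The channel names, synthetic numbers, and the simple linear model are illustrative assumptions, not a recommended production model.

```python
import numpy as np

# Synthetic aggregated daily data: spend per channel and total sales.
# No cookie or user IDs are involved -- only daily aggregates.
rng = np.random.default_rng(42)
days = 90
spend = {
    "search": rng.uniform(100, 500, days),
    "social": rng.uniform(50, 300, days),
    "display": rng.uniform(20, 200, days),
}
baseline = 1000.0
true_roas = {"search": 2.0, "social": 1.2, "display": 0.4}
sales = baseline + sum(true_roas[c] * spend[c] for c in spend)
sales = sales + rng.normal(0, 50, days)  # noise

# Simple linear marketing-mix regression: sales ~ intercept + spend per channel.
X = np.column_stack([np.ones(days)] + [spend[c] for c in spend])
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)

for name, beta in zip(["baseline"] + list(spend), coef):
    print(f"{name}: {beta:.2f}")
```

The recovered coefficients approximate each channel's marginal contribution, which is exactly the kind of output that drives top-level and per-channel budget allocation.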
2. The customised data-driven attribution warehouse and activation hub
What is possible with this approach? Understand what each channel or campaign is worth based on data-driven attribution methods, set the correct target in the channels (and per campaign), and apply data-driven attribution bidding.
What is not possible with this approach? Audience segmentation/activation, CLV calculations, marketing mix modeling, cookie-less marketing evaluation.
Data needed: Purchase channel-paths (cookie and/or user based).
Activation possibilities: Cross channel performance reporting, target setting, bidding.
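As a flavour of what path-based attribution looks like in practice, here is a deliberately simplified removal-effect heuristic over conversion paths (a stripped-down cousin of Markov-chain attribution, not the full model). The channel names and paths are hypothetical.

```python
def removal_effects(paths):
    """Simplified path-based attribution: for each channel, measure the share
    of conversions lost if every converting path containing that channel is
    removed, then normalise the effects into credit shares."""
    converted = [p for p, conv in paths if conv]
    total = len(converted)
    channels = {c for p, _ in paths for c in p}
    effects = {}
    for ch in channels:
        surviving = sum(1 for p in converted if ch not in p)
        effects[ch] = (total - surviving) / total
    norm = sum(effects.values())
    return {ch: e / norm for ch, e in effects.items()}

# Hypothetical conversion paths: (ordered touchpoints, converted?)
paths = [
    (["search", "social"], True),
    (["search"], True),
    (["social", "display"], False),
    (["display", "search"], True),
]
print(removal_effects(paths))
```

The normalised credit per channel is what would feed reporting, target setting, and bidding in this approach.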
3. The modern customer data warehouse
What is possible with this approach? Supply the bidding algorithms with better conversion signals (smarter bidding), apply smarter targeting and communications in digital marketing channels, CLV calculations, clustering, and more.
What is not possible with this approach? Evaluate marketing performance.
Data needed: Cross channel (web, CRM etc) cookie and customer data.
Activation possibilities: Better bidding signals, audiences, and segments adapted for each output channel.
It is of course possible to pursue all of the above approaches, since each of them fills a unique purpose. However, our recommendation is to start with one of them. Otherwise, the risk of getting stuck in the data collection phase increases significantly. Remember that 80% fail in the activation phase.
How do I know what suits my company?
In our experience, to succeed with these types of projects it’s crucial to place as much focus on the organisational aspects as on the technical ones. This includes everything from getting internal buy-in, keeping momentum during the implementation phase, as well as clarifying ownership and responsibilities. One important aspect is to define which disciplines should be handled by an external party (consultants, SaaS platforms, etc) and what disciplines need to be handled by internal stakeholders. In relation to this, it is important to define the competencies that are required in-house. This is our recommended process to follow to define the optimal approach (1,2 or 3), taking into consideration the organisational aspects:
- Each of the three approaches could be divided into four disciplines (data collection, data stitching/cleaning, analysis/segmentation, and activation).
- Map each of the four disciplines (e.g. analysis/segmentation) and define the most suitable decision makers and owners for implementation, taking into account the level of sophistication required (for example rule-based vs predictive modeling). In the example below, on the left we see that if the implementation owner were a digital marketer without deep technical knowledge, the most suitable method for segmentation is most likely rule-based audiences built through a user interface. In the example on the far right, we see that if the preferred implementation owner were a data engineer or scientist, the method can be far more advanced. In this scenario the digital marketer could be the decision maker but not the implementation owner. We should start by deciding on the preferred method and let that decision guide the selection of implementation owners.
- Based on the above requirements, continue the mapping exercise by defining the most suitable approach for each discipline. In the example below, the desired level of sophistication for the analysis/segmentation phase is “advanced” (predictive modeling), hence a user interface is not needed (guiding the choice of technology). In contrast, the activation phase requires manual execution to be set up (and frequently modified), hence the requirement for a user interface.
To sum up, we recommend considering where on the automation scale (from 100% manual to 100% automated) and the sophistication scale (from 100% rule-based, decided by a human, to 100% algorithm-based) the sweet spot for your business lies. This should guide competency planning and in-house vs. consultancy needs.
Our hope is that our blog post series will take you one step further in finding the most suitable approach as well as the ideal method for the chosen approach, also taking organisational considerations into account.
Apart from technical planning, there is another critical aspect to have in mind from the start. That is privacy. The third post in the series will cover this topic.
Cloud vs On-premise
Before looking into our recommendations on how to build each of the three marketing/customer data warehouses, we will briefly stop to consider whether your technology should be cloud-based or on-premise. Our opinion is that this is not a choice you should spend too much time on, since we believe a cloud-based approach is the only way to go. We have listed the most important arguments (which you could bring to your CTO) here:
- Scalability – Easily scale infrastructure up and down (to zero) which can yield significant cost savings vs fixed cost architecture (e.g. Google Cloud Functions, Google Cloud Run, Cloud Dataflow).
- Reduce DevOps/Infrastructure Burden – The proliferation of Managed Services in Cloud Environments allows users to readily deploy business logic without worrying about provisioning and maintaining infrastructure (e.g. Google App Engine, Google Kubernetes Engine).
- Availability of Global Scale Products On Demand – Ready access to the “backbone” systems that some of the largest companies in the world use to manage their services, allowing for “unlimited” scalability (e.g. Pub/Sub, BigQuery, Cloud Storage, BigTable, Global Load Balancer).
- Secure Access To All Resources from the Browser – You can easily track all resource deployments from a single “pane” and manage IAM/RBAC policies as well as “hardware” from a single place (IAM, IAP, GCP Monitoring, Cloud Logging, gcloud CLI, GCP Web Application).
What Cloud provider to consider?
Our preference is Google Cloud Platform. Like other cloud providers, Google Cloud Platform offers a comprehensive portfolio of compute services (e.g. VMs, managed Kubernetes, managed databases, etc), but its primary differentiator is its focus on data at scale. Solutions like BigQuery, Pub/Sub, and Dataflow provide highly scalable serverless methods for the storage, transmission, and management of large volumes of streaming/batch data. BigQuery ML, AI Platform, Data Studio, and Dataproc (managed Hadoop/Spark) provide seamless integrations between data and modelling resources to gain insights.
In addition, Google Cloud Platform has many integrations with the Google marketing products, making the ingestion and actionability of marketing data more efficient. Good examples include Google Marketing Platform connectors to GCP (e.g. Google Analytics 360, Data Transfer for Google Ads, etc) and the upcoming Ads Data Hub product.
To make this blog post as useful as possible we have listed our views on the most important considerations/choices for each approach.
1. The cookie-less marketing data warehouse
The SaaS black box vs. a custom setup
The first decision is whether a custom setup suits you better than a software license, or vice versa. To start with, let’s recap the software available. There are a few really strong options, including OptiMine and Aiden. A major benefit of choosing an existing SaaS platform is the relatively short time needed to get a functioning tool up and running. A wide range of connectors are available and the models are quite advanced. However, it is important to keep in mind that many of these solutions are a bit of a black box. This goes not only for the models themselves, but also for the fact that your data is collected in someone else’s cloud. In most cases you only have access to the output itself (e.g. in a user interface) and not the underlying data and models. This means it is much harder to re-use the raw data for other purposes. Also, the potential for customisation can be limited. Last but not least, don’t expect to only pay for the license itself. In many cases you’ll need consultants for implementation or ongoing services.
The benefits of a custom setup are, conversely, more control over the data and the model and a far larger potential for customisation. Unless you have a really strong in-house team you will most likely pay more for consultants, but less for licenses.
How to fetch data?
If you go for a custom solution you need to make a few important decisions when it comes to data fetching. Generally speaking there are three different ways to go. You can either build your own connectors, buy a license for a third party tool (such as Supermetrics or Funnel.io), or rely on connectors built by someone else (for example your agency).
Let’s walk through them one by one, at a high level:
Develop your own connectors – The most time-consuming option, but also the most flexible. This allows for maximal customisation. In addition, in the long run this could also be the cheapest alternative, especially if you have developers in-house. When considering this option, think about the overall number of connectors you will need to build and, importantly, how you will handle the ongoing maintenance as APIs get updated across all the platforms you integrate with.
Funnel.io – If your purpose for Funnel.io is mostly the connectors, it is an expensive solution. Funnel’s strengths lie in its data blending capabilities, though it is important to note that the blending happens behind the scenes. The output is accessible in many different formats (for example pushed to a BigQuery project), but you lose a lot of insight into how the data is joined. In addition, the data cannot be re-used for other purposes such as audience segmentation or activation, since it is not provided at a granular enough level.
Supermetrics – Similar to Funnel.io, but more cost-effective as it does not offer the blending possibilities. If the main purpose is to pull data and do the blending yourself it could definitely be a good start or a complement to connectors that you build yourself.
Rely on connectors built by someone else (for example your agency) – Important questions to consider here are where the data is stored, how much access you have to it, what SLAs are in place, and what the total long-term cost is.
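If you do build your own connectors, a common pattern is to normalise each platform's API response into one shared daily schema before loading it into the warehouse. The payload shapes and platform names below are hypothetical, purely to illustrate the pattern.

```python
def normalise_daily_costs(platform, payload):
    """Map a platform-specific API response (shapes here are hypothetical)
    into one shared schema: date, platform, campaign, cost, clicks."""
    rows = []
    if platform == "ads_platform_a":
        for r in payload["results"]:
            rows.append({
                "date": r["day"],
                "platform": platform,
                "campaign": r["campaign_name"],
                "cost": r["cost_micros"] / 1e6,  # micros -> currency units
                "clicks": r["clicks"],
            })
    elif platform == "ads_platform_b":
        for r in payload["data"]:
            rows.append({
                "date": r["date_start"],
                "platform": platform,
                "campaign": r["campaign"],
                "cost": float(r["spend"]),
                "clicks": int(r["clicks"]),
            })
    return rows

sample = {"results": [{"day": "2020-06-01", "campaign_name": "brand",
                       "cost_micros": 12_500_000, "clicks": 240}]}
print(normalise_daily_costs("ads_platform_a", sample))
```

Keeping the normalisation logic in your own code (rather than inside a blending tool) is precisely what preserves the raw, granular data for other use cases later.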
What model to choose?
The final, but perhaps most important, decision is which model to use or build and how to activate the insights.
More about this in blog post 5 in this series.
2. The customised data-driven attribution warehouse and activation hub
There are a few acceptable out-of-the-box attribution software solutions (relying on paths) out there. However, given the significantly increased need for customisation (due, for example, to privacy regulations and trends), our recommendation is not to rely completely on an out-of-the-box solution. Instead, build your own solution (or have someone help you build one) using a mixture of your own components and existing ones. Before digging into the biggest decisions, it is worth mentioning that path-based attribution has its challenges: no matter how much effort you put in, your solution won’t perfectly reflect the truth. That said, conversion signals are the foundation of all the big performance advertising platforms, so the competitive advantage of feeding the bidding algorithms the smartest possible signals, based on robust attribution, will only grow going forward.
What data source/collection method to rely on?
The underlying data (paths) is absolutely crucial and you should spend as much time as possible getting it right. There are a few different possibilities available when fetching the data/paths. We have listed the most common ones.
Google Analytics 360 – Comes with a native integration to BigQuery, which is a huge benefit. Bear in mind that the path data (which you can find in the Multi Channel Funnel reports) is not available in the BQ export. This means you will need to create your own paths based on the raw export. This opens up a lot of possibilities for customisation but also requires some work and automation.
Google Analytics (standard) – To access the raw data you would need to use the Reporting API. This has its drawbacks. Just to reach a point where you have all raw data available (for example in a BigQuery table) is a rather complex task. You might run into sampling limitations, for example, or other limitations with getting all of the dimensions and quantity of data that you need.
Google Analytics Web + App property – The standard version comes with a native BigQuery integration, which is very promising for advertisers who didn’t have the budget for Analytics 360. However, the product is still in development and there are still a few important limitations in building robust datasets for data-driven attribution. The roadmap is developing quickly, so watch this space!
Google Analytics Multi Channel Funnel reports – One straightforward approach is to use the Multi Channel Funnel API. By doing this you don’t need to invest time building the paths yourself. However, opportunities for customisation are more limited. In addition, only converting paths (not non-converting ones) are available, which is a major drawback.
Facebook attribution – Not currently a viable option, given that it is not possible to automate the export of conversion paths. If Facebook offers that possibility in the future, these paths will be extremely interesting given that they are cross-device out of the box. A deep dive on Facebook attribution can be found here.
Own tracking solution – Very time consuming and often not a viable option. However offers a lot of flexibility.
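For the options above that require building paths yourself (such as the Google Analytics 360 raw export), the core step can be sketched as grouping session-level rows per user and ordering them by time. The field names and sample rows are illustrative, not the actual export schema.

```python
from collections import defaultdict

def build_paths(sessions):
    """Group raw session rows (user_id, timestamp, channel, converted) into
    ordered channel paths per user -- roughly what you would otherwise do in
    SQL on top of a raw analytics export. Field names are illustrative."""
    by_user = defaultdict(list)
    for s in sessions:
        by_user[s["user_id"]].append(s)
    paths = []
    for user, rows in by_user.items():
        rows.sort(key=lambda r: r["timestamp"])  # chronological order
        paths.append({
            "user_id": user,
            "path": [r["channel"] for r in rows],
            "converted": any(r["converted"] for r in rows),
        })
    return paths

sessions = [
    {"user_id": "u1", "timestamp": 2, "channel": "social", "converted": True},
    {"user_id": "u1", "timestamp": 1, "channel": "search", "converted": False},
    {"user_id": "u2", "timestamp": 1, "channel": "display", "converted": False},
]
print(build_paths(sessions))
```

Doing this yourself (rather than using pre-built funnel reports) is what lets you keep non-converting paths and customise the channel definitions.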
The other important area relating to the underlying data is what actions your company takes to improve the quality of the paths themselves. The actions with most impact are cross device measurement and ITP mitigation strategies.
Cross device – Implementing User ID tracking in Google Analytics can improve the quality of the paths drastically. However, there are two things that are often forgotten when speaking about User ID: 1) Privacy (read more in this blog post); 2) The tactics needed to get as much cross device data as possible. Often the technical implementation is owned by the Ecommerce or CRM departments. However, once the implementation is complete, no further actions are taken.
Once User ID collection is in place, your team should aim to make a compelling case for users to opt in to sharing their data or to log in to the service. We are not talking about something simple like changing the colour of the login button, but about giving the customer real incentives.
Another tactic is to use email and SMS as a tool to increase the cross device match rate. If you have a large customer base, you could send emails to your customers at different hours of the day to encourage the use of the website or service on different devices. Tactics like this can lead to a huge difference in cross device match rates for User ID.
ITP – Browser technology such as ITP has had a large impact on attribution path data. We recommend that you familiarise yourself and your team as much as possible on the effects that it is expected to have. In terms of actions, there are a couple of options to consider. Firstly, server cookies (HTTP only cookies) are generally considered to be more secure and are less affected by browser based privacy technology. Migrating your device identifiers to use server cookies can be a good start to avoid the deletion and erosion of cookies set by client-side scripts (e.g. Google Analytics). Another option, adopted already by large tech players such as Google, is to model data for browsers such as Safari on browsers where path data is more robust such as Chrome.
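The server-cookie option above boils down to having your backend set the identifier with `HttpOnly` and `Secure` attributes, so it is not exposed to client-side scripts. A minimal sketch (the cookie name, lifetime, and attribute choices are illustrative assumptions):

```python
def set_cookie_header(name, value, max_age_days=390):
    """Build a Set-Cookie header for a server-set identifier. HttpOnly and
    Secure server cookies are less exposed to client-side deletion and
    capping than cookies written by JavaScript."""
    max_age = max_age_days * 24 * 3600
    return (f"{name}={value}; Max-Age={max_age}; Path=/; "
            "Secure; HttpOnly; SameSite=Lax")

# Example: what the server would send alongside a response.
print(set_cookie_header("server_id", "abc123"))
```

The same header shape applies regardless of backend framework; most frameworks expose these attributes as parameters on their cookie-setting helpers.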
Activation
The final considerations around activation are explored in more depth in this blog post.
3. The modern customer data warehouse
This is definitely the area where the most software exists and fights for attention. Personalisation tools, marketing automation software, customer data hubs: the names are many, but the capabilities do not differ a lot.
SaaS and a user interface vs. custom solutions
This decision is related to the organisational considerations discussed earlier. Generally speaking, a custom solution offers more sophisticated machine learning possibilities (Google Cloud Platform, for example, offers BigQuery ML and AutoML, to mention a few). If an out-of-the-box platform has machine learning capabilities, they can be hard to customise, yet customisation is often a must to make them suit your business. Custom solutions also often offer more possibilities when it comes to data stitching. Our strong recommendation is to strive for as much control and customisation potential as possible when it comes to data stitching and analysis (the “engine”).
For the activation piece, a user interface is often needed. In our opinion this is the biggest requirement in a marketing automation platform. In most cases, a human needs to be in control of the actual orchestration, especially when it comes to 1-1 communication (email, sms, push notifications, etc). However the logic/segments should be fed from the modern customer data warehouse. There are clearly some advantages to having everything in one out-of-the box platform (collection, stitching, logic, orchestration) but this comes with several limitations that we should be aware of.
Segments vs. Signals
When building a modern customer data warehouse, our recommendation is to treat bidding signals as equally important as audience segments when feeding data to the channels. Bidding signals are often forgotten, and data sharing to the channels relies too heavily on audiences. It is also very common for all output channels to be treated in the same way. An example is customer lifetime value models: having some kind of CLV signal in place is important for most businesses, but the outcome doesn’t always have to be sharing customer value segments with the channels. In many cases, sending enhanced conversion signals to the platform based on the future expected value of a user can have a positive impact on bidding efficiency and the acquisition of high-value customers. More about this in part 4.
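The difference between sharing a segment and sending a value-based signal can be sketched as follows: instead of pushing a "high value" audience to the channel, compute a conversion value scaled by the customer's predicted future value. The multiplier and margin below are illustrative assumptions, not a recommended model.

```python
def enhanced_conversion_value(order_value, expected_repeat_purchases, margin=0.3):
    """Sketch of a value-based bidding signal: rather than sending the raw
    order value (or a segment label) to the ad platform, send a conversion
    value scaled by the customer's predicted lifetime value. The simple
    repeat-purchase multiplier and margin are illustrative."""
    predicted_ltv = order_value * (1 + expected_repeat_purchases)
    return round(predicted_ltv * margin, 2)

# A 100-unit first order from a user predicted to buy twice more:
print(enhanced_conversion_value(100.0, 2))
```

The platform's bidding algorithm then optimises towards this adjusted value instead of the raw order value, steering acquisition towards high-value customers.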