Josh West is a Senior Partner at Analytics Demystified, the leading American digital analytics consultancy. In his work, he focuses primarily on Adobe SiteCatalyst, Test&Target, Google Analytics, and tag management implementation.
He also enjoys custom development, especially in using the APIs of CRM, web, and social analytics tools to solve business problems.
When you get right down to it, data is the basic building block of any company’s digital marketing efforts. Whether it’s processing a transaction online, deploying a tag on your website, or pulling data from your analytics tool into your data lake, your business can only be as good as the data you collect. That’s the reason I try to get the companies I work with to focus on their data - not the vendors they need to send it to or the format in which they want it.
I often work with companies whose data layers contain data that is way more complex than it needs to be. My favorite example is having your developers write your page name logic into your data layer, so that every page has a data layer element written by developers to identify the page. This is a common need for tools like Adobe Analytics - but the problem is, as new pages go live, or you decide you want to change how a particular page is named, you have to go back and have developers make a change. Instead, I encourage my clients to start with the building blocks - page name is usually derived from what type of page it is, what section it’s in, a product name, etc. So why not just ask developers to expose those building blocks, and then rely on your tag management system (TMS) to piece those elements together into a more meaningful whole? Then, if you do have to make a change, you do it in your TMS, outside of development release cycles. Breaking your data down into basic elements lets everyone involved focus on what they know best - your analytics developers can write logic within the TMS without requiring your back-end developers to do the same.
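To make the idea concrete, here is a minimal sketch of the kind of logic that could live in a TMS rather than in back-end code. The field names (`pageType`, `siteSection`, `productName`), the `buildPageName` helper, and the colon-separated format are all assumptions chosen for illustration, not a prescribed standard.

```typescript
// Hypothetical building blocks that back-end developers expose in the data layer.
interface PageData {
  pageType: string;      // e.g. "product", "category", "home"
  siteSection?: string;  // e.g. "Shoes"
  productName?: string;  // e.g. "Trail Runner"
}

// Naming logic owned by the analytics team inside the TMS.
// Changing the format here does not require a development release.
function buildPageName(page: PageData): string {
  return [page.siteSection, page.pageType, page.productName]
    // Skip any building block that is missing or blank on this page.
    .filter((part): part is string => typeof part === "string" && part.trim() !== "")
    .map((part) => part.trim().toLowerCase())
    .join(":");
}

const productPageName = buildPageName({
  pageType: "product",
  siteSection: "Shoes",
  productName: "Trail Runner",
});
console.log(productPageName); // "shoes:product:trail runner"
```

If the naming scheme changes later (say, the page type should come first), only this function changes; the building blocks the developers expose stay exactly as they are.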
To summarize this point, the data contained in your data layer should represent the data in your internal systems supported by your developers. It does not need to match the final state in which it is sent to your analytics tool. One of the most obvious benefits of a TMS is that it allows you to translate the data into the final state desired by each vendor - but too many companies fail to take advantage of this benefit.
Another thing that makes me cringe is when I find duplicate elements in a data layer, with the only difference being the name of the vendor receiving the data. For example, say that you send both Google Analytics and Facebook your product ID on each product page. Why would you create two data elements - “ga_product_id” and “fb_product_id” - when a single element called “product_id” would be sufficient? Anyone who later has to figure out what data is in your data layer will be confused about why two elements have the same value but different names. Your data layer should be absolutely clear about what kind of data each element contains - and no more complex than is necessary.
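As a rough sketch of the alternative, the data layer can hold one vendor-neutral `product_id`, with the per-vendor translation handled in the TMS. The vendor labels and the parameter names `item_id` and `content_id` below are placeholders for illustration only, not the actual parameter names either vendor requires.

```typescript
// One vendor-neutral element in the data layer.
interface DataLayer {
  product_id: string;
}

type VendorPayload = Record<string, string>;

interface VendorMapping {
  vendor: string;
  toPayload: (dataLayer: DataLayer) => VendorPayload;
}

// Hypothetical mappings maintained in the TMS. The parameter names are
// placeholders; check each vendor's documentation for the real ones.
const vendorMappings: VendorMapping[] = [
  { vendor: "analytics", toPayload: (dl) => ({ item_id: dl.product_id }) },
  { vendor: "social", toPayload: (dl) => ({ content_id: dl.product_id }) },
];

// Translate the single data layer value into one payload per vendor.
function buildVendorPayloads(dataLayer: DataLayer): { vendor: string; payload: VendorPayload }[] {
  return vendorMappings.map((mapping) => ({
    vendor: mapping.vendor,
    payload: mapping.toPayload(dataLayer),
  }));
}

const payloads = buildVendorPayloads({ product_id: "SKU-123" });
console.log(JSON.stringify(payloads));
```

Adding a third vendor then means adding one mapping in the TMS, not a third copy of the same value in the data layer.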
Your data layer taxonomy will make a lot more sense if you identify a set of governing principles around it and then make efforts to maintain those principles. I tell my clients I’m less interested in whether they use camel-case or underscore separators for multi-word elements (productId and product_id are the same to me) than I am in whether they use the same rules for all their data elements. I also don’t have a strong preference on whether you nest customer data under a “customer” object or prefix a set of data elements with “customer_” - but do it the same way for each different category of data. Consistently apply the rules you set for your data layer, and it will be easier for anyone who needs to access that data.
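One way to keep yourself honest about consistency is a small automated check over your data layer's element names. The sketch below is only one possible approach, and the helper names (`classifyKey`, `usesOneConvention`) and the exact regular expressions are my own assumptions about what counts as camel-case or underscore-separated.

```typescript
type NamingStyle = "camelCase" | "snake_case" | "neutral" | "other";

// Classify a single data layer element name.
// Single lowercase words ("section") fit either convention, so they are "neutral".
function classifyKey(key: string): NamingStyle {
  if (/^[a-z][a-z0-9]*$/.test(key)) return "neutral";
  if (/^[a-z][a-z0-9]*(_[a-z0-9]+)+$/.test(key)) return "snake_case";
  if (/^[a-z][a-zA-Z0-9]*$/.test(key)) return "camelCase";
  return "other";
}

// True when every multi-word name follows the same recognized convention.
function usesOneConvention(keys: string[]): boolean {
  const styles = keys.map(classifyKey).filter((style) => style !== "neutral");
  return styles.every((style) => style !== "other" && style === styles[0]);
}

console.log(usesOneConvention(["productId", "pageType", "section"])); // true
console.log(usesOneConvention(["productId", "product_name"]));        // false
```

The check does not care which convention you picked, only that you picked one, which is the point of the principle above.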
Perhaps the most important thing you will do with your data layer is to document it. We create a spreadsheet for each of our clients that we refer to as their “Data Dictionary” to identify the following information about each data element and event that they plan to track:
This data dictionary should be accessible to both developers and analysts within your organization, and it should be easily understood by both groups as well. As your organization experiences turnover, your developers should be able to quickly familiarize themselves with this document and identify whether changes they are making to your website - that might first appear to have no impact on analytics or the data layer - will impact your tracking efforts in any way.
Regardless of how hard you work to build a good data layer and a process to keep it that way, things will go wrong. But some companies refuse to adapt - if a data element breaks, they’ll wait until it can be fixed and live with bad data until that can happen. Or they’ll refuse to make any implementation changes until the data they need is in the data layer. At the opposite extreme is the company that is always bending over backward - too many companies lean on their TMS to paper over data layer issues until the TMS is so messy and fragile that the implementation collapses on itself.
I always advise my clients to lean on their TMS when they have to - if something breaks, it’s okay to add temporary fixes to keep your data quality high. Just make sure you go back and fix underlying problems and then revert your TMS hacks when they’re no longer needed. Over the long term, you should follow your processes religiously - but you should be flexible enough in the short term to keep getting things done.
I have told many of my clients that “The perfect is the enemy of the good,” because it relates perfectly to the work we do every day. Outside of your first release date (maybe), you’ll never have a perfect data layer, TMS, or analytics implementation. But you can make it a little better every day, and that should be your goal. Build a data layer based on consistent, logical building blocks - but know that sometimes you’ll have to make some short-term tradeoffs to keep moving in the direction of your long-term goals.