https://www.ucda.com/should-you-be-worried/

Should you be worried?

Universal Analytics Data to be Removed in 2024

By Stu Eddins
Images: iStockphoto

On July 1, 2023, Google’s Universal Analytics will stop collecting website data and Google Analytics 4 will take over. Universal Analytics accounts will remain open with access to all historic data through June 30, 2024.

That sounds straightforward: one of your tools is being replaced with a newer version, and the older tool will still hang out in your toolbox for about a year.

Until you give it a beat and it sinks in: all the historic website performance data collected by Universal Analytics since 2005 gets taken away from you on July 1, 2024.

All. Data. Gone.

One Problem, Two Levels
We should start by acknowledging that losing access to historic site data is a double whammy. First, there is the obvious loss of access to information on historic site performance, a loss that is likely to affect forecasting in the near term. Then there is the less obvious blow, the emotional side of the change: we tend to think of that data as “mine” or “ours.” When someone threatens to take away what we hold to be a possession or a right, we get defensive. This emotional undercurrent might be less obvious, but it tends to live just below the surface and color our thought processes and planning.

Loss of Historic Data
Anyone who has looked at a chart that plots website session counts by day, and from that chart deduced that their website is far less busy on weekends, has made an analysis.

Analysts use website data collection to learn the behaviors of site visitors, to measure the impact of marketing, and to forecast a path toward achieving an organization’s online objectives. You don’t have to be a full-time Analyst to read and understand most of the charts and graphs about website performance. However, an Analyst goes one important step further by taking that same data and preparing forecasts. “Historically, we did X amount of marketing and got Y new students, so we forecast that if we do X+1 marketing we would earn much more than Y+1 new students.”

It would be fair to say that Analysts look backward so that they can see forward. All things being equal, the greater the amount of time in that lookback timeframe, the more likely the resulting forecast will be accurate. By removing access to a large block of historic data, Analysts will need to discover other signals in current day data to help them generate forecasts.

I just referenced “all things being equal.” Hold that thought.
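The idea of looking backward to see forward can be sketched in a few lines of Python. This is only an illustration, not how any particular Analyst or Google Analytics itself produces forecasts: it fits a straight line to the most recent periods and extends it by one. The function names and the monthly session counts are hypothetical.

```python
def fit_line(values):
    """Least-squares straight line through values plotted at x = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast_next(values, lookback):
    """Forecast the next period using only the last `lookback` observations."""
    window = values[-lookback:]
    slope, intercept = fit_line(window)
    return intercept + slope * len(window)


# Hypothetical monthly session counts, oldest first (made-up numbers).
monthly_sessions = [1000, 1100, 1200, 1300, 1400, 1500]

print(forecast_next(monthly_sessions, lookback=6))  # uses all six months
print(forecast_next(monthly_sessions, lookback=3))  # uses only the last three
```

On a perfectly steady series like this one, both lookback windows give the same forecast. Once visitor behavior shifts partway through the history, the long and short windows start to disagree, which is where the “all things being equal” caveat matters.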

It’s My Data
Many of us have spent years collecting website data, working to ensure that the data is clean and error-free, and that it is accessible to whoever needs it. We have an ownership-like investment in the data, and that is an emotional tie.

“It’s my data, how can Google take it away from me?”

That’s right, it is your data, generated from your website by your site’s visitors. But that isn’t the whole story.

The data has been collected by a free software platform, using free digital storage, offering free access to anyone with the right set of free credentials. Yet it costs Google hard currency to own and operate the tens of thousands of physical servers used by Google Analytics, and in turn support the infrastructure required to maintain them.

It’s still your data. Google simply won’t continue to store the old Universal Analytics data, and they will discontinue the software required to access it. If an organization wants to download and preserve their historic website performance data they can, but that storage and access solution won’t be free. Google can move your data to their BigQuery cloud database for a fee. Accessing information stored in BigQuery requires SQL (BigQuery uses its own SQL dialect, not MySQL), or BigQuery can be connected to Looker Studio.
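For a sense of what “accessing the data with SQL” means in practice, here is a minimal sketch. It uses Python’s built-in SQLite database purely as a local stand-in, not BigQuery, and the table name, column names, and numbers are all hypothetical rather than the real export schema. The point is only the shape of the work: someone on the team writes a query to get an answer out of the archive.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a cloud data warehouse.
conn = sqlite3.connect(":memory:")

# Hypothetical archive table; a real export would have a different schema.
conn.execute(
    "CREATE TABLE daily_sessions (day TEXT, channel TEXT, sessions INTEGER)"
)
conn.executemany(
    "INSERT INTO daily_sessions VALUES (?, ?, ?)",
    [
        ("2019-03-01", "Organic Search", 120),
        ("2019-03-01", "Direct", 45),
        ("2019-03-02", "Organic Search", 95),
    ],
)

# The kind of question an analyst would ask: total sessions per channel.
query = """
    SELECT channel, SUM(sessions) AS total_sessions
    FROM daily_sessions
    GROUP BY channel
    ORDER BY total_sessions DESC
"""
rows = conn.execute(query).fetchall()
conn.close()

for channel, total in rows:
    print(channel, total)
```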

Another option is to download most of the data to a platform like Domo which also offers storage and access for a monthly fee.

Yes, it is still your data, but after July 1, 2024, keeping it will incur cost.

Some Needed Context
Perhaps something has been overlooked, buried beneath the threat of data loss, questions about data ownership, and the future costs of maintaining storage and access to the data.

Why would you want to keep all that data in the first place?

Fear of loss can lead to making quick decisions that in the long run don’t benefit the organization. So before committing to preserving historical website data, here are a few items to consider:

The data collected in today’s Google Analytics 4 is fundamentally different than the historical data in Universal Analytics. A few of the differences: 

  • The way sessions are counted. The definition of a bounced session. Even time on a page or on the site. None are measured the same way between the two versions of Analytics.
  • The method used to combine site visits scattered across days and weeks and attribute that activity to the same user is very different between UA and GA4.

If these most basic data points are collected and counted in very different ways, how accurate will it be to compare historic site behaviors to today’s site behaviors? If we don’t have consistency in the data, all we’re left with is trends, not scorecard metrics. Maybe we don’t need the details (data), maybe we just need to preserve the trends.
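One way to picture “trends, not scorecard metrics” is to index each series to its own starting point, so that only the direction and rate of change are compared. The sketch below is an illustration under assumptions: the session counts are made up, and the helper name is hypothetical.

```python
def index_to_baseline(values, base=100.0):
    """Rescale a series so its first value equals `base`."""
    first = values[0]
    if first == 0:
        raise ValueError("first value must be non-zero to index against it")
    return [base * v / first for v in values]


# Hypothetical monthly sessions for the same site, counted two different ways.
ua_sessions = [2000, 2100, 2200]   # as counted by Universal Analytics
ga4_sessions = [1500, 1575, 1650]  # as counted by Google Analytics 4

ua_trend = index_to_baseline(ua_sessions)
ga4_trend = index_to_baseline(ga4_sessions)

print(ua_trend)
print(ga4_trend)
```

In this made-up case the raw numbers never match, so comparing them month by month as a scorecard would be misleading, yet both indexed series show the same 5 percent step each month. Preserving that shape requires far less than preserving every underlying data point.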

When was the last time your organization used website performance data from two years ago to create a strategic plan? 

  • The way people use the web is constantly changing; we shouldn’t assume that today’s users behave the same way as our users from two years ago.
  • One example is that search engines have been around for over 20 years, yet every day 1 in 5 of the searches performed have never been seen before. If today’s users search differently compared to two years ago, they access your site differently too.

Because the way people use the web changes almost daily, as performance data ages it risks becoming less relevant as a forecasting tool. While it might be nice to know that you have data on how Organic Traffic performed in 2017, it isn’t terribly useful in today’s decision-making process.

In both subtle and extreme ways, COVID changed everything. 

  • Earlier in this article I stated, “All things being equal, the greater the amount of time in the lookback timeframe, the more likely the resulting forecast will be accurate.” Since February 2020, few things online have remained equal.
  • During COVID, people were forced to rely on the web more than at any time before. Casual users became frequent users, and power users became even more enmeshed with the web. Increased and more intensive use of the web caused an increase in overall web-savvy behaviors.
  • Today, more than three years after the pandemic started, few higher education websites have achieved the same volume of traffic they had before COVID. Yet many are experiencing more goal completions per user than ever before. Less traffic, but more effective traffic.

Simply put, indications are that people changed how they use the web, becoming more adept at finding what they need, faster. If the pandemic acted as an incubator period by promoting more advanced usage of the web, then the data on performance during and before COVID becomes far less relevant in assessing and forecasting today’s behaviors.

Conclusion
Google will allow us continued access to our Universal Analytics data for one year after the platform stops collecting data. That provides some breathing room to make a considered decision on how–or if–the data should be preserved. We don’t need to rush toward a solution, but then we shouldn’t simply kick the can further down the road either.

Before committing to the effort and cost of preserving historic Universal Analytics data, an informed internal conversation is needed. Discussion points should include details of exactly what is to be archived, an estimate of the cost to warehouse the data, and any gaps in personnel that need to be filled (by training or new hire) to successfully access and use the stored data.

The second part of the conversation is the more important one to have: this is where the “Why?” question is asked and answered. How is older data used today? Is it simply to create more impressive-looking charts? Or is there a supportable business need that mandates access to what happened on the website back in 2018?

“This is how much it will cost each month/year to download and store Universal Analytics data. We will need to train two or more people on our team in how to use SQL to access and use the data. We need to do this because the historic Analytics data acts as a check and balance to our CRM solution.”

  • Most colleges probably won’t need to go with a full data storage and access solution. Downloading important reports as spreadsheets could be enough. Depending on what data is identified as important, this download process could take weeks or months, so making decisions and creating a download plan shouldn’t be delayed.
  • Others may elect to go with a short-term solution such as a platform like Domo that can both store the data and create charts and graphs from it. Cost may be slightly higher than Google’s BigQuery option, but platforms like Domo have data visualization built in.
  • A very small segment might have a compelling need for the warehousing and access provided by cloud storage paired with staff trained in using SQL or R to extract information. Those in this final group are probably very aware of their situation, if for no other reason than they already have staff or vendors focused on data extraction, transformation, and visualization.
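For the first group, working with a downloaded report can be quite simple. The sketch below assumes a spreadsheet saved as CSV with hypothetical “date” and “sessions” columns (a real export may use different headers and date formats) and rolls daily rows up into monthly totals, which may be all the history a team needs to keep.

```python
import csv
import io
from collections import defaultdict


def sessions_by_month(csv_text):
    """Sum a daily sessions report into monthly totals keyed by 'YYYY-MM'."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        month = row["date"][:7]  # assumes dates look like 'YYYY-MM-DD'
        totals[month] += int(row["sessions"])
    return dict(totals)


# Stand-in for the contents of a downloaded report (made-up numbers).
sample_report = """date,sessions
2023-05-30,120
2023-05-31,80
2023-06-01,100
"""

print(sessions_by_month(sample_report))
```

With a real file, the same function could be fed the text read from disk instead of the inline sample.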

When it comes to determining the right path and method, there is no blanket recommendation we can make. That said, the important first step remains: “Start the internal discussion as soon as possible.”