What Do You Overlook if You Use Just Client-Side Web-Analytics?

In the previous article, which compared the JAMstack services of two popular providers, I mentioned that one of my incentives for moving to Cloudflare was the basic server-side analytics it provides even to free-tier users. Extended analytics is available on both Cloudflare and Netlify as a paid option: on Cloudflare you have to subscribe to one of the paid plans (the cheapest is the “Pro” plan, which costs 20 US dollars per month); on Netlify you can either subscribe to the “Business” plan for 99 US dollars per member per month, or enable this feature for each of your sites for just 9 US dollars a month. If you need accurate web analytics data, I definitely recommend choosing one of these options because, as my analysis in this article shows, client-side analytics solutions (e.g., Google Analytics, Yandex Metrica or Microsoft Clarity) overlook a large portion of visitors' interactions due to various anti-tracking solutions (e.g., I personally use the uBlock Origin plugin in my web browser). In this article, I show how much data you may overlook.


My Setup and Limitations

As a client-side tracking solution, I have used Google Analytics since the first version of my website (which appeared sometime in 2013). Even when I migrated to the current setup three years ago, I continued to use Google Analytics. After I moved my website to Cloudflare Pages, I started to use the basic analytics provided by this platform as my source of server-side analytics.

Unfortunately, this setup has one big issue: to the best of my knowledge, the two platforms offer no equivalent metrics that can be compared directly (let me know if I am wrong). Therefore, I have chosen the two that are closest from my point of view: “Users” on Google Analytics and “Unique Visitors” on Cloudflare.

According to the Google Analytics documentation, Users are those “who have initiated at least one session during the date range. In order for Google Analytics to determine which traffic belongs to which user, a unique identifier associated with each user is sent with each hit.” This means that each visitor is assigned a unique identifier, stored in a cookie, during the first visit to a website. On subsequent visits, this identifier is presented, and Google Analytics knows that this is a returning visitor. The number of unique identifiers seen during a time period is what Google Analytics reports as Users. According to the Cloudflare documentation, the Unique Visitors metric shows the number of unique IP addresses querying a website during a time period.

These two metrics are not equivalent. For instance, a user may open a webpage at work and then use the same device to read the same webpage at home. In this case, Google Analytics will report that only one User has visited the webpage (the same unique identifier is presented in both cases), while Cloudflare will show two Unique Visitors (two IP addresses, one at home and one at work). Conversely, when two users behind a NAT (thus sharing the same IP address) check the same website, Google Analytics will report two Users, while Cloudflare will show only one Unique Visitor. Thus, although these metrics are not totally equivalent, from my point of view they can still be compared to assess the amount of traffic missed due to anti-tracking solutions.
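The two counting rules described above can be illustrated with a minimal sketch. The `Hit` record, the identifiers and the IP addresses are hypothetical (the addresses come from ranges reserved for documentation); neither platform exposes raw hits in this form, and the real metrics involve more logic than a plain distinct count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One page request (hypothetical record, for illustration only)."""
    client_id: str  # cookie-based identifier, as seen by a client-side tool
    ip: str         # source IP address, as seen by a server-side tool


def count_users(hits):
    """Count distinct cookie identifiers (the 'Users' style of counting)."""
    return len({h.client_id for h in hits})


def count_unique_visitors(hits):
    """Count distinct IP addresses (the 'Unique Visitors' style of counting)."""
    return len({h.ip for h in hits})


# One person reading on the same device at work and at home.
same_device = [Hit("cid-1", "198.51.100.7"), Hit("cid-1", "203.0.113.20")]

# Two people sharing one public IP address behind a NAT.
behind_nat = [Hit("cid-2", "192.0.2.10"), Hit("cid-3", "192.0.2.10")]

print(count_users(same_device), count_unique_visitors(same_device))  # 1 2
print(count_users(behind_nat), count_unique_visitors(behind_nat))    # 2 1
```

The two scenarios pull the metrics in opposite directions, which is why neither count can be treated as a strict upper or lower bound for the other.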

Note that my blog targets a tech-savvy audience that tends to use various anti-tracking solutions. Thus, I expect the share of missed traffic in my case to be higher than on a general-audience website.

Analysis

I moved my website to the Cloudflare platform on the 7th of July, 2021. However, the data for this date is incomplete, so my dataset starts from the 8th of July. Currently, the dataset is not that big (less than half a month of data); however, since I have all the code ready, I will probably update this post from time to time to see how the data evolves.

Below, you can see the interactive graph for my website with two lines: the lower one shows the number of Users from Google Analytics, while the upper one represents the number of Unique Visitors from Cloudflare.

As you can see, Cloudflare consistently shows higher numbers. The number of Unique Visitors is about 3.7 times the number of Google Analytics Users. This ratio allows me to estimate the amount of traffic I miss. If you rely only on client-side web analytics data, e.g., Google Analytics, to show your advertisers the traffic to your website, then it is possible that you undervalue your website (by a factor of up to 3.7 in my case).
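To make the arithmetic behind such a ratio concrete, here is a minimal sketch. It assumes the ratio is computed from totals over the whole period, which may differ from the calculation in the accompanying notebook, and the daily counts are made-up placeholder values, not my real data. The function name `undercount_ratio` is also just an illustrative choice.

```python
def undercount_ratio(ga_users, cf_visitors):
    """Return total Cloudflare Unique Visitors divided by total GA Users.

    Both arguments are sequences of daily counts covering the same days.
    """
    if len(ga_users) != len(cf_visitors):
        raise ValueError("both series must cover the same days")
    total_ga = sum(ga_users)
    if total_ga == 0:
        raise ValueError("no Google Analytics users recorded")
    return sum(cf_visitors) / total_ga


# Placeholder daily counts (NOT the real dataset).
ga = [100, 120, 90]
cf = [380, 430, 330]

ratio = undercount_ratio(ga, cf)
share_missed = 1 - 1 / ratio  # fraction of visitors the client-side tool misses

print(f"ratio: {ratio:.2f}")                # ratio: 3.68
print(f"share missed: {share_missed:.0%}")  # share missed: 73%
```

Read this way, a ratio of about 3.7 means the client-side tool sees roughly 27% of what the server side counts. Since the two metrics are not equivalent, this is a rough estimate rather than an exact share.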

The only exception to this consistent picture is the data from the 14th of July, 2021. On this day, Google Analytics reports 496 Users, while Cloudflare shows 455 Unique Visitors. The reason is that on this day bots (from bottraffic921.xyz) attacked my website and spoiled the statistics. Bot traffic is another problem for client-side statistics. Indeed, it is quite easy to automate visits to a particular website and inflate the client-side visit statistics; however, it is quite difficult to obtain many different IP addresses to fake the number of Unique Visitors. Note that in Google Analytics you can create filters so that such traffic is excluded in the future. If you want cleaner statistics, I recommend spending some time on this.
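A simple sanity check for spotting such days is to flag every date on which the client-side count exceeds the server-side one. The sketch below is only a heuristic under that assumption. The figures for the 14th of July are the ones quoted above, while the other two rows are placeholders, and `suspicious_days` is a hypothetical helper name.

```python
def suspicious_days(daily):
    """Return the dates on which GA Users exceed Cloudflare Unique Visitors.

    `daily` maps an ISO date string to a (ga_users, cf_unique_visitors) pair.
    """
    return [day for day, (ga, cf) in sorted(daily.items()) if ga > cf]


daily = {
    "2021-07-13": (100, 380),  # placeholder values
    "2021-07-14": (496, 455),  # figures quoted in the article
    "2021-07-15": (110, 400),  # placeholder values
}

print(suspicious_days(daily))  # ['2021-07-14']
```

A flagged day does not prove bot traffic by itself, but it is a good hint to look at the referrers for that date.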

As evidence that the Users and Unique Visitors metrics are related, I calculated the Pearson correlation coefficient. Currently, it is about 0.33, which indicates a positive but fairly weak correlation between the two metrics.
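For reference, the Pearson coefficient can be computed with the standard library alone. The sketch below implements the textbook formula (covariance divided by the product of the standard deviations); it is not necessarily the code from the accompanying notebook, and the input series are placeholder values.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance times n).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations (std. deviations times sqrt(n)).
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (dev_x * dev_y)


# Placeholder daily counts (NOT the real dataset).
ga_users = [100, 120, 90, 110]
cf_visitors = [380, 430, 330, 400]

print(f"r = {pearson(ga_users, cf_visitors):.2f}")
```

With only a couple of weeks of data, a single unusual day (such as one hit by bot traffic) can move the coefficient noticeably, so it is worth recomputing as the dataset grows.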

Conclusion

The conclusion is simple: if you earn money from your JAMstack website and want to see accurate data, consider subscribing to a server-side analytics service. This may raise the valuation of your website significantly (up to 3.7 times in my case).

If you have a similar setup and want to repeat the analysis I have done in this article, you can find the Jupyter Notebook and the data in the accompanying repository.

Yury Zhauniarovich
R&D Engineer
Lead Data Scientist
