The current metric structure in Google Analytics:
User > Visits > Pageviews > On-Page Events
The whole thing produces numbers that are more and more distant from reality, which is why I did a thought exercise on which of these metrics can still be trusted.
Historically, it worked quite well before mobile applications, single-page applications, PWAs and background services appeared in the world. Further nails in the coffin are ITP/ETP and other flavors of user privacy protection, which skew the metrics a lot of people build their analytics on, despite all the tricks for keeping the GA cookie alive a little longer than normal.
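To illustrate one such trick, here is a minimal sketch: mirror the GA client ID into localStorage and restore it when ITP/ETP expires the JavaScript-set cookie early. The key name and expiry are my own assumptions, not any official GA mechanism, and newer ITP versions cap script-writable storage too, so this only buys time.

```typescript
// Sketch of the "keep the GA cookie alive" trick: back up the
// client ID to localStorage and restore it if the cookie is purged.

const GA_COOKIE = "_ga";
const BACKUP_KEY = "ga_client_id_backup"; // hypothetical key name

function readCookie(name: string): string | null {
  const match = document.cookie.match(new RegExp(`(?:^|; )${name}=([^;]*)`));
  return match ? decodeURIComponent(match[1]) : null;
}

function persistGaClientId(): void {
  const current = readCookie(GA_COOKIE);
  if (current) {
    // Cookie still alive: refresh the backup.
    localStorage.setItem(BACKUP_KEY, current);
  } else {
    // Cookie was purged: restore it so GA reuses the old client ID
    // instead of minting a brand-new "user".
    const backup = localStorage.getItem(BACKUP_KEY);
    if (backup) {
      const twoYears = 60 * 60 * 24 * 730; // seconds
      document.cookie = `${GA_COOKIE}=${encodeURIComponent(backup)}; max-age=${twoYears}; path=/`;
    }
  }
}

persistGaClientId();
```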
What does the number of users in Google Analytics tell us (for users who are not logged in)?
It is the number of browsers with a unique cookie; it is definitely no longer a real unique user. I tried to analyze how many GA cookies there are per real user, and on a large service it comes out to about 2 cookies per month. Yes, many people carry permanent long-term cookies, but there are also plenty who visit the web 10 times a day and show up with a different cookie every time, and even special users who simply delete the cookie right after it is created. Identifying such a user on the web and treating them differently is quite difficult. The real number of users is many times smaller. Logged-in users with analytics enabled bring some light into these waters: for them I don't really care about cookies, because I know their user ID. Then I can say I have X logged-in users and Y low-quality (cookie-only) users, which I can, give or take, divide by two to get closer to reality.
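To make that "divide by two, give or take" correction concrete, here is a back-of-the-envelope sketch. The 2-cookies-per-user ratio is my rough figure from above; treat it as a site-specific assumption, not a constant, and all names are illustrative.

```typescript
// Back-of-the-envelope estimate of real users from GA numbers.
// loggedInUsers come from known user IDs and need no correction;
// cookieOnlyUsers are really "unique cookies", which over-count.

function estimateRealUsers(
  loggedInUsers: number,
  cookieOnlyUsers: number,
  cookiesPerRealUser = 2 // my observed monthly ratio on a large service
): number {
  return loggedInUsers + cookieOnlyUsers / cookiesPerRealUser;
}

// Example: 50k logged-in IDs plus 400k anonymous cookies reports
// as ~450k "users", but is closer to ~250k actual people.
console.log(estimateRealUsers(50_000, 400_000)); // 250000
```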
In reality, this ratio between real people and the number of cookies per user will only keep growing. Services without a persistent login will be practically flying blind, and the user-count metric will lose value from the business client's point of view.
As an extra bonus, there are also content scrapers that render the page as if they were a real user and thus artificially create super-low-quality traffic and users.
Another impact hits the modern "data-driven" approach with its heavy use of machine learning. "Garbage in, garbage out": if I cannot hold on to the user's identity, I feed clutter into the models and the result is an equally unusable mess. In general, almost all attribution goes out the window if users cannot be measured from the very first moment, because the gold in attribution models is a complete user path with as few errors as possible.
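A small sketch of why broken identity wrecks attribution: one real person with three cookies looks like three separate paths, and first-touch attribution credits the wrong channel. All data and names below are invented for illustration.

```typescript
// One person, three cookies: cookie-based identity fragments the
// path, user-ID identity stitches it back together.

interface Touch {
  cookieId: string;
  userId?: string; // known only for logged-in users
  channel: string;
  timestamp: number;
}

const touches: Touch[] = [
  { cookieId: "c1", userId: "u42", channel: "paid-search", timestamp: 1 },
  { cookieId: "c2", userId: "u42", channel: "direct", timestamp: 2 },
  { cookieId: "c3", userId: "u42", channel: "direct", timestamp: 3 }, // converts here
];

// First-touch channel per identity, for a chosen identity key.
function firstTouchByIdentity(key: (t: Touch) => string): Map<string, string> {
  const firstTouch = new Map<string, string>();
  for (const t of [...touches].sort((a, b) => a.timestamp - b.timestamp)) {
    const id = key(t);
    if (!firstTouch.has(id)) firstTouch.set(id, t.channel);
  }
  return firstTouch;
}

// Cookie identity: three "users", and paid-search looks like it never converts.
console.log(firstTouchByIdentity((t) => t.cookieId));
// Map { "c1" => "paid-search", "c2" => "direct", "c3" => "direct" }

// Stitched identity: one path, paid-search correctly gets first-touch credit.
console.log(firstTouchByIdentity((t) => t.userId ?? t.cookieId));
// Map { "u42" => "paid-search" }
```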
Of course, I'm still exaggerating a bit, but for me it's important to prepare for the worst; then the analyst can only be pleasantly surprised. GA will still be a pretty good indicator of trends 😉. The Privacy Sandbox APIs will certainly find their use. I don't believe we will lose all the data, but we will have to learn to work with it piece by piece, separating the parts that are good from the parts that are chopped up. Knowing which is which will be essential.
The Google Analytics visit
It's not much better here. The classic Google Analytics visit worked well as long as the site loaded a new page at every step and not much else was measured. Add SPAs, PWAs, mobile applications and the like to that paradigm, and it starts creaking again.

I have clients who run services where the user logs into the system in the morning and works in it, with half-hour breaks, from morning to evening. What does that look like in terms of visits? Thirteen visits per day? Each visit begins with a click on an already-open page, so practically 80% of visits are direct, without source or medium, and classically they don't even have a landing page, because they start with a measured click. It behaves much like opening a mobile app: I know the app was opened, but with few exceptions I don't know why.

We have already started replacing visits with the number of users per day and largely stopped watching traffic, except for the acquisition of new clients, because what matters are the metrics of the site or application itself, not how a user happened to launch it, especially when they work in it five days a week until evening. What I'm actually interested in is whether the user is new, what their status is from a business perspective and where I want to move them. Then I may care where they came from, but not about an interrupted visit. An event carrying the external source is all I need, and even then I'm more interested in the user, which is worth nothing if they don't log in.

What's funny is that even though single-page applications cause various problems by not actually reloading the page, they can also do some magic: they can mark interrupted visits, which can then be excluded or merged when working with attribution models, as the sketch below shows.
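A minimal sketch of that merging, under my own assumptions: a session that starts without an external source and follows a previous session by the same cookie within a chosen gap gets glued back onto it. The 4-hour gap and field names are illustrative, not any GA feature.

```typescript
// Merge "interrupted" visits: direct, source-less sessions that
// resume shortly after the previous one are treated as one session.

interface Session {
  cookieId: string;
  startedAt: number;     // epoch seconds
  endedAt: number;
  source: string | null; // null = direct, no external referrer
}

const MAX_GAP_SECONDS = 4 * 60 * 60; // merge breaks up to 4h long (assumption)

function mergeInterrupted(sessions: Session[]): Session[] {
  const merged: Session[] = [];
  const sorted = [...sessions].sort((a, b) => a.startedAt - b.startedAt);
  for (const s of sorted) {
    const prev = merged[merged.length - 1];
    const isContinuation =
      prev !== undefined &&
      prev.cookieId === s.cookieId &&
      s.source === null &&                         // no external entry
      s.startedAt - prev.endedAt <= MAX_GAP_SECONDS;
    if (isContinuation) {
      prev.endedAt = s.endedAt; // glue it onto the previous session
    } else {
      merged.push({ ...s });
    }
  }
  return merged;
}

// A morning login plus two "direct" returns with half-hour breaks
// collapses into a single working-day session instead of three visits.
```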
What about those famous pageviews?
Is it dead yet? Paradoxically, I think the classic pageview will survive the longest. Yes and no. With further tightening of security and restrictions on the analyst's side, we may end up with something like "site usage analytics": from user-centric analytics back to the beginning, practically to server-log analysis, storing no information about the user at all. That makes both classic cookies/users and visits disappear, and once visits stop working, the unique-pageview metric no longer applies either, because it is simply tied to a visit.
This gets us to a place where we analyze the flow of users through the site: how a visitor arrived at this page and where they went from it. Separate views of things, no longer glued together with cookies. Paradoxically, this step should lead to more measurement, not less. The moment I can't rely on a long visit, I will always want to know exactly what the person viewing the page saw and, more importantly, what they did next. Did they leave? Did they give any indication of what their next step would be?
The resulting report will then look something like this (a minimal sketch of such a hit follows the list):
- Entries, external/internal,
- What happened on the page? Were the page's goals met?
- Exit actions.
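Here is one possible shape for such a cookieless, page-centric hit: no user or session ID, just what this one pageview can tell us by itself. The field names are my own illustration, not any real GA schema.

```typescript
// Minimal cookieless, page-centric measurement record.

interface PageHit {
  page: string;                    // the page being measured
  entry: {
    kind: "external" | "internal"; // where the view came from
    referrer: string | null;       // previous internal page or outside URL
  };
  goalsMet: string[];              // on-page goals completed
  exitAction: string | null;       // the next step, or null if unknown
}

const hit: PageHit = {
  page: "/blog/ga-metrics",
  entry: { kind: "external", referrer: "https://news.example" },
  goalsMet: ["scrolled-75", "newsletter-signup"],
  exitAction: "click:/pricing",
};
```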
Measurement without cookies and other storage is super harsh and probably isn't waiting for us right away, but I believe some companies and large publishers will turn to it to escape privacy problems. Paradoxically, neither Google nor Facebook will mind: advertising will still work, it will just be harder to measure. Worse measurement will hurt the quality of ad targeting, but it will also shrink the data available to advertisers, who won't be able to verify much and will be even more dependent on the ad platform.