Google Analytics Spam

Over the past few months I have noticed an increased amount of Google Analytics spam traffic (particularly referrals) on the websites I manage.

Most recently, I launched a site and checked it a week later to find it had already had 500 hits, even though I hadn’t promoted it yet. I decided to investigate further.

I discovered that almost all of this traffic was spam, rendering the real traffic data practically useless. I’ve done some research into how to clean it up to restore the accurate data and thought it would be useful to share my findings.

  • In this first blog I’ll explain how to identify the spam traffic.
  • In part 2 I’ll explain how to set up a segment to filter out the spam data from the traffic that’s already been collected.
  • Then finally in part 3 I’ll explain how to set up a new filter to block spam data from being logged in future tracked data.

Part 1: Spotting the Spam Traffic

First off I had to find out where all the traffic was coming from so I checked the Acquisition tab.

Google Analytics acquisition data. Note the large volume of direct and referral traffic, as well as the high bounce rates.

Clicking into the referral data reveals the source of the referrals: a whole host of spammy websites!

Google Analytics spam referrals (showing 10 of around 70 sources). Note the shady domain names, high bounce rate and minimal session durations!

So what are they and why are they linking to my website? Well, I discovered that most aren’t even visiting the site at all and the ones that do are robots.

Delving a little deeper by adding a secondary dimension of Hostname, I found that the spam referrals split into two main categories.

Google Analytics spam referrals with the Hostname displayed. If the Hostname is (not set), or anything other than my website, it’s probably a spam hit.

Ghost Referrals

These ‘visitors’ haven’t actually visited the site at all. The Hostname column shows where the visit took place. In my case, real traffic should have a hostname of PaulJardine.co.uk because that’s the only place I have my Analytics code. (You might also have it somewhere else, like a PayPal checkout if you have a shop.)

The Hostname data shows where the Analytics code tracked each hit, and for these hits it’s not set to my website address. That makes them spam, because they don’t even take place on my website! Seemingly, these hits show up here because a website in a dark corner of the internet has a matching tracking code.

The vast majority of ghost referrals can easily be spotted as they have a hostname of ‘(not set)’. Others may be set to Apple or Amazon to try and disguise themselves. Either way, unless the hostname shows a valid source it’s false and needs removing.
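To make the hostname check concrete, here’s a minimal Python sketch of the rule described above. It assumes you’ve exported the report with Hostname as a secondary dimension and loaded it as a list of rows. The row layout, the spam source names, the session counts and the `is_ghost_hit` helper are all made up for illustration, and I’ve assumed the `www.` version of my domain counts as valid too.

```python
# Hostnames where the tracking code genuinely lives.
# Assumption: both the bare domain and the www. variant are valid.
VALID_HOSTNAMES = {"pauljardine.co.uk", "www.pauljardine.co.uk"}


def is_ghost_hit(hostname, valid_hostnames=VALID_HOSTNAMES):
    """Return True when the hit was not recorded on one of my own hostnames.

    This covers '(not set)' as well as disguised values such as
    'apple.com' or 'amazon.com', because neither is in the valid set.
    """
    return hostname.strip().lower() not in valid_hostnames


# Hypothetical rows shaped like an exported referral report.
rows = [
    {"source": "spam-one.invalid", "hostname": "(not set)", "sessions": 40},
    {"source": "spam-two.invalid", "hostname": "apple.com", "sessions": 12},
    {"source": "google.co.uk", "hostname": "PaulJardine.co.uk", "sessions": 9},
]

ghost_rows = [row for row in rows if is_ghost_hit(row["hostname"])]
ghost_sessions = sum(row["sessions"] for row in ghost_rows)

for row in ghost_rows:
    print(f"Ghost referral: {row['source']} (hostname: {row['hostname']})")
print(f"Sessions from ghost referrals: {ghost_sessions}")
```

Checking against a list of valid hostnames, rather than a list of known spam names, means a new disguise gets caught without having to update anything.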

Bot Crawlers

Bot crawlers such as Semalt scour the web for purposes unknown, visiting your website and instantly leaving again. These are easily identifiable as the results with a session duration of 0.0 seconds and a 100% bounce rate.

Since the bot crawlers do actually visit my site the Hostname does say my website. However, the name of the source should give away whether it’s shady or not.
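The same idea can be sketched for bot crawlers. This is only a rough illustration of the signals above: a valid hostname, a session duration of zero and a 100% bounce rate. The field names, the sample rows and the `looks_like_bot_crawler` helper are my own inventions, not anything Google Analytics provides. A real visitor can also bounce instantly, so treat a match as a candidate to check by source name rather than as proof.

```python
# Assumption: both the bare domain and the www. variant are valid.
VALID_HOSTNAMES = {"pauljardine.co.uk", "www.pauljardine.co.uk"}


def looks_like_bot_crawler(row, valid_hostnames=VALID_HOSTNAMES):
    """Flag a referral row that matches the bot crawler signals.

    The hit really happened on my site (valid hostname), but the
    session lasted 0 seconds and the bounce rate is 100%.
    """
    on_my_site = row["hostname"].strip().lower() in valid_hostnames
    zero_duration = row["avg_session_duration_seconds"] == 0
    full_bounce = row["bounce_rate_percent"] == 100
    return on_my_site and zero_duration and full_bounce


# Hypothetical rows shaped like an exported referral report.
rows = [
    {"source": "crawler-example.invalid", "hostname": "pauljardine.co.uk",
     "avg_session_duration_seconds": 0.0, "bounce_rate_percent": 100.0},
    {"source": "google.co.uk", "hostname": "pauljardine.co.uk",
     "avg_session_duration_seconds": 95.0, "bounce_rate_percent": 40.0},
    {"source": "spam-one.invalid", "hostname": "(not set)",
     "avg_session_duration_seconds": 0.0, "bounce_rate_percent": 100.0},
]

suspects = [row["source"] for row in rows if looks_like_bot_crawler(row)]
print("Sources to check by name:", suspects)
```

The third row isn’t flagged here because its hostname is (not set), which makes it a ghost referral rather than a crawler.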

IMPORTANT: If you want to investigate a referral address, DO NOT VISIT THE SITE as it could have a virus or something nasty waiting for you. Do a Google search instead as this will tell you all you need to know about the site without having to actually visit it.

When Life Sends You Spam, Make Spam Fritters

Looking at the direct and search traffic revealed similar results. A massive chunk of the recorded traffic appeared to be ghost hits, clearly identifiable by the dodgy hostnames.

OK, so we’ve identified that there’s clearly a problem here! But fear not as next I’ll explain how I’ve restored the stats to their former glory!

In part 2, I’ll explain how to clean up your Google Analytics data using a filtered segment and in part 3 I’ll show you how to create a new filter to block the spam data from your future statistics.

Thanks for reading! As ever you can get in touch with me on Facebook and Twitter.
