Google Analytics spam traffic vs real data

Boo-urns! I’m not as popular as I thought I was!

In my previous blog post I covered how to identify fake referral traffic in Google Analytics data. Now it’s time to clean it up!

To clean up the data that had already been collected, I created a segment that shows the data with the spam traffic filtered out. A segment only changes how existing data is displayed; setting up a filter is what stops spam from affecting future data (which I will cover next time).

Creating a new Google Analytics Segment

To begin, click + Add Segment towards the top of the Google Analytics screen. Give the segment a name and then open the Conditions tab in the left column. Two filters are required: one to remove the Ghost Referrals and one to remove the Bot Crawler traffic.

Google Analytics new segment

The Google Analytics segment options. The circle on the right updates to show how much data your filters remove.

Remove Ghost Referrals

As I discovered earlier, ghost referrals (as well as fake direct and search traffic) all share a common element: they all have a Hostname value that is not that of my website. Therefore the most effective way to remove them is to only include sessions whose Hostname value contains PaulJardine.co.uk.

Google Analytics segment hostname filter

Only show hits that actually occurred on my website to filter out the ghost hits.

Note that if you use your tracking code on other sites (such as an external PayPal checkout if you have a shop) you will need to list those sites too, combining all the hostnames into a single regex.
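To make the hostname rule concrete, here is a minimal Python sketch of the same check done outside Google Analytics. It is an illustration only: the second hostname (checkout.example.com) is a made-up placeholder for an external checkout, and the sketch assumes a "contains" style partial match, as in the segment condition described above.

```python
import re

# Hostnames that legitimately carry the tracking code.
# The first is the site from this article; the second is a
# hypothetical external checkout domain used purely as an example.
VALID_HOSTNAMES = ["pauljardine.co.uk", "checkout.example.com"]

# Escape each name so "." means a literal dot, then join with "|" (OR).
HOSTNAME_PATTERN = re.compile(
    "|".join(re.escape(name) for name in VALID_HOSTNAMES),
    re.IGNORECASE,
)


def is_valid_hostname(hostname: str) -> bool:
    """Return True if the hostname contains one of the valid names."""
    return HOSTNAME_PATTERN.search(hostname) is not None


print(is_valid_hostname("www.PaulJardine.co.uk"))  # a real hit on the site
print(is_valid_hostname("(not set)"))              # typical of a ghost hit
```

Sessions for which the check fails are the ones the hostname filter would drop from the segment.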

Applying this filter removed over 80% of the overall traffic. That’s just how much Ghost Referral traffic I was getting on this particular site!

Remove Bot Crawlers

Having applied the Hostname filter, next I wanted to remove the remaining bot crawlers. With the hostname filter applied to my new segment, I returned to the Referrals data in the Acquisition tab to review what was left.

Google Analytics bot crawler referrals

Google Analytics referral data with the Ghost Referrals removed but still showing the bot crawlers. Rows 6 & 7 are real websites, which I’ve hidden.

I copied and pasted the spam results into a text document like so…

buttons-for-your-website.com
buttons-for-website.com
semalt.com

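I then needed to convert this list into a regex, placing a | (pipe, or vertical line) symbol between each domain to mean "or" (there shouldn’t be a pipe after the last one). Strictly speaking, an unescaped . in a regex matches any character, so the pattern is slightly looser than the literal domain names, but it still catches every domain in the list. The finished regex looks like this…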

buttons-for-your-website.com|buttons-for-website.com|semalt.com
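If the list of spam domains grows, joining it by hand gets tedious. The following Python sketch shows one way the pattern could be generated from the text list. It is not part of Google Analytics; the function name build_source_regex is my own. Note that re.escape escapes dots and hyphens, so the output is a stricter variant of the pattern above rather than a character-for-character copy.

```python
import re

# Spam domains copied from the referral report, one per entry.
spam_domains = [
    "buttons-for-your-website.com",
    "buttons-for-website.com",
    "semalt.com",
]


def build_source_regex(domains):
    """Join domains with "|" (no trailing pipe), escaping regex metacharacters."""
    cleaned = (d.strip() for d in domains)
    return "|".join(re.escape(d) for d in cleaned if d)


source_regex = build_source_regex(spam_domains)
spam_pattern = re.compile(source_regex)

print(source_regex)
print(bool(spam_pattern.search("semalt.com")))  # listed domain
print(bool(spam_pattern.search("semaltXcom")))  # only matches if "." is unescaped
```

The printed pattern can be pasted into the segment condition in place of the hand-built one.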

I then created a second filter to exclude sessions where the Source matches this regex.

Google Analytics segment source filter

A second filter to remove any hits that come from the bot crawler sites

This filter may need to be updated from time to time if/when new bot crawlers start visiting the website.
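Taken together, the two conditions behave roughly like the Python sketch below: keep a session only if its hostname is mine and its source is not on the spam list. The sample sessions are invented for illustration (they are not data from my account), and the sketch assumes the regex conditions match anywhere in the value.

```python
import re

# Include condition: hostname must contain the real site name.
HOSTNAME_INCLUDE = re.compile(r"pauljardine\.co\.uk", re.IGNORECASE)

# Exclude condition: source must not match any known bot crawler.
SOURCE_EXCLUDE = re.compile(
    r"buttons-for-your-website\.com|buttons-for-website\.com|semalt\.com"
)


def keep_session(session):
    """Mirror the segment: include by hostname, then exclude by source."""
    hostname_ok = HOSTNAME_INCLUDE.search(session["hostname"]) is not None
    source_is_spam = SOURCE_EXCLUDE.search(session["source"]) is not None
    return hostname_ok and not source_is_spam


# Invented sample rows, for illustration only.
sessions = [
    {"hostname": "www.pauljardine.co.uk", "source": "google"},
    {"hostname": "www.pauljardine.co.uk", "source": "semalt.com"},
    {"hostname": "(not set)", "source": "ghost-referrer.example"},
    {"hostname": "www.pauljardine.co.uk", "source": "(direct)"},
]

kept = [s for s in sessions if keep_session(s)]
removed_share = 1 - len(kept) / len(sessions)
print(f"kept {len(kept)} of {len(sessions)} sessions")
```

The removed_share value plays the same role as the circle in the segment editor: it shows how much of the data the two filters take out.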

Results

Applying the filtered segment revealed that the spam referrals to this particular website accounted for approximately 70% of the overall data collected! With the data cleaned up, the Analytics stats are useful again and can help inform design decisions on future updates to the website.

Google Analytics spam referral traffic vs real

The cleaned up data (blue) compared against the original (green) with a more accurate bounce rate and session duration

Next time, I’ll explain how to create a filter to stop future spam data from being collected in the first place!

I hope you found this helpful for cleaning up your Google Analytics stats! As ever, you can get in touch with me on Facebook and Twitter.
