Category | Google Analytics

The importance of the Google Analytics hostname report

The hostname report is one of those underused reports in Google Analytics that can massively help the accuracy and data quality of your reports.

Some of the things the hostname report can help you with are:

  • Identifying and preventing ghost spam (this is a big one),
  • Excluding development and staging sites from your main reports,
  • Identifying sites that are scraping your content.

What is a hostname in Google Analytics?

A hostname is any domain, tool, or service where your GA tracking code is present. Hostnames might be controlled by you or by an external service. An example of a hostname controlled by you is your website, where you inserted the code yourself. An example of a hostname not controlled by you is Google Translate.

Hostname vs. source

These 2 are often confused:

  • The source is where your visit comes from, and there can be any number of them: for example, Facebook, Google, Twitter, YouTube, links from other sites to your site, etc.
  • The hostname, on the other hand, is the site where the visitor arrives. Your main hostname will be your domain and, depending on the configuration of your site, there may be others.
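
To make the distinction concrete, here is a minimal Python sketch. Both URLs are made-up examples, not real GA data, and the variable names are only for illustration. It shows that the hostname is taken from the page where the tracking code fired, while the source is derived from the page the visitor came from (GA applies its own normalization and campaign rules on top of this).

```python
from urllib.parse import urlsplit

# Hypothetical example values, not real GA data.
page_url = "https://services.carloseo.com/pricing"  # page where the tracking code fired
referrer = "https://www.facebook.com/some-post"     # page the visitor came from

hostname = urlsplit(page_url).hostname       # where the visitor arrived
referrer_host = urlsplit(referrer).hostname  # where the visitor came from (basis for the source)

print("Hostname:", hostname)
print("Referrer host:", referrer_host)
```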

Where is the hostname report in GA?

One of the reasons this report is usually overlooked is that it isn't in the main list of reports. If you type hostname in the search box, it won't appear.

To find the hostname report:

  1. Go to the reporting section, select a year or more on the calendar, then select the Audience reports in the sidebar.
  2. Expand Technology and click on Network.
  3. At the top of the report (just below the graph), select Hostname as the primary dimension (by default, Service Provider is selected).

Here you will see the list of hostnames, real or fake (spam). The most important one is your domain; the rest will vary from site to site depending on the size, age, and configuration of your site.

Types of hostnames

Some of the hostnames you may find in your Analytics are:

Type | Example
Your domain | carloseo.com
Subdomains | services.carloseo.com
Staging/development sites | staging.carloseo.com
IPs | localhost (might also come from a scraping site)
Payment gateways/systems | Shopify
Tools connected to your Analytics | YouTube, Mailchimp, etc.
Translate services | Google Translate
Cache services | Bing cache
Speed services | Google Weblight
Archive services | web.archive.org
Scraping sites | Sites that copy pages and post them exactly as they are
Spam | They may show the spammer URL or the URL of a known site to try to fool you, like mashable.com, google.com, or apple.com
(not set) hostname | Traffic coming from spam or a code issue (more information below)

The following screenshot shows some of the hostnames stored in the analytics property of this site.

Screenshot legend:

  • Relevancy: relevant hostname, or irrelevant/spammy hostname.
  • Control: fully controlled by you, partially controlled (third-party tools), controlled by an external service, or spam/code issue.

How to identify valid hostnames in Google Analytics

You may think: well, all that is interesting, but how can I use it to improve the quality of my Analytics?

You can use this information to create a filter that allows only traffic to the hostnames you consider relevant. That way, any traffic that has an invalid hostname and doesn't add any value to your Analytics will be left out.

So now that you know how to find and identify your valid hostnames, make a list of all the ones you want to include in your filter. Following the example above, these are the hostnames I consider valid.

My valid hostnames:

  • carloseo.com
  • www.ohow.co
  • services.carloseo.com
  • .list-manage.com
  • www.youtube.com
  • carloseo.googleweblight.com
  • ohow.us10.list-manage.com
  • m.youtube.com

Building a hostname expression for your filter

You can only use one Include hostname filter, so you need to fit all your hostnames into the same regular expression. The simplest way to do it is to paste them one after another, separated by a pipe character "|", like this:

Simple expression: carloseo.com|services.carloseo.com|www.ohow.co|www.youtube.com|carloseo.googleweblight.com|ohow.us10.list-manage.com|m.youtube.com|.list-manage.com

However, you can simplify it a lot more. GA uses regex (a special text string for finding patterns) for custom filters, so you don't have to match each hostname exactly; a partial match is enough. So the expression above can be shortened like this:

Optimized expression: carloseo|ohow|list-manage|youtube

Of course, if you have a development environment, like "staging1.carloseo.com" in my case, the above expression would also match that hostname. This is where you will have to get a bit crafty and use a more advanced regex.

Polished expression: ^(services\.)?carloseo|ohow|list-manage|youtube

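The caret, parentheses, and question mark are special characters. Together they say that the hostname has to start with either "services.carloseo" or "carloseo"; any other subdomain, such as "staging1.carloseo.com", won't be counted. The caret applies only to that first alternative, so ohow, list-manage, and youtube still match anywhere in the hostname.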
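
The difference between the optimized and the polished expression can be checked with a short Python sketch. It assumes that Python's re.search (which looks for the pattern anywhere in the string) behaves like GA's partial matching for patterns this simple; the helper name matches is made up for this example.

```python
import re

# Expressions copied from the text above.
optimized = r"carloseo|ohow|list-manage|youtube"
polished = r"^(services\.)?carloseo|ohow|list-manage|youtube"

hostnames = [
    "carloseo.com",
    "services.carloseo.com",
    "staging1.carloseo.com",  # development site that should be left out
    "carloseo.googleweblight.com",
    "ohow.us10.list-manage.com",
    "m.youtube.com",
]

def matches(pattern, hostname):
    # re.search finds the pattern anywhere in the string (partial match).
    return re.search(pattern, hostname) is not None

for host in hostnames:
    print(f"{host:30} optimized={matches(optimized, host)!s:5} polished={matches(polished, host)}")
```

Only staging1.carloseo.com differs between the two columns: the optimized expression includes it, the polished one does not.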

Here are some basic tips to help you build your expression:

  • To match your hostnames exactly, add a caret ^ at the beginning and a dollar sign $ at the end of each hostname, like this: ^carloseo.com$|^services.carloseo.com$|^www.youtube.com$
  • To separate each hostname, use a bar or pipe character |, which works as OR. If you can't find it on your keyboard, hold Alt and type 124 on the numeric keypad.
  • The dot . is a special character in regex (it matches any character), and the hyphen - is special only inside square brackets, so normally you would escape the dot with a backslash \. However, in most cases this is not needed for this type of filter.
  • Try to find a good way to match as many hostnames as you can. For example, if you want to match blog.carloseo.com, es.carloseo.com, and www.carloseo.com, you don't need to add all of them to the expression; entering carloseo will be enough to match all of them.
  • Domains don't contain spaces, so don't leave any in your expression.
  • IMPORTANT! The regex in GA has a limit of 255 characters. If your expression exceeds this limit, try to optimize it to keep everything in one expression, because you can only have one Include hostname filter.
  • IMPORTANT! Don't add a pipe/bar | at the beginning or the end of the expression.
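
The tips above can be sketched as a small Python helper. This is only an illustration: the function name build_hostname_expression and the exact flag are made up for this example, and the 255-character limit is the one quoted in the tips.

```python
GA_PATTERN_LIMIT = 255  # character limit quoted in the tips above

def build_hostname_expression(hostnames, exact=False):
    """Join hostnames (or fragments) into one pipe-separated expression."""
    parts = []
    for name in hostnames:
        name = name.strip()             # domains don't contain spaces
        if not name:
            continue                    # skipping empties avoids stray pipes
        part = name.replace(".", r"\.")  # escape dots so they match literally
        if exact:
            part = f"^{part}$"          # anchor both ends for an exact match
        if part not in parts:
            parts.append(part)          # drop duplicates to save characters
    expression = "|".join(parts)        # join never adds a leading/trailing pipe
    if len(expression) > GA_PATTERN_LIMIT:
        raise ValueError(
            f"Expression is {len(expression)} characters; the limit is {GA_PATTERN_LIMIT}"
        )
    return expression

print(build_hostname_expression(["carloseo", "ohow", "list-manage", "youtube"]))
print(build_hostname_expression(["carloseo.com", "services.carloseo.com"], exact=True))
```

The first call reproduces the optimized expression; the second produces an exact-match expression with escaped dots.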

In this post, you can find more about regular expressions.

How to test your hostname expression

It's important that you add all your relevant hostnames, or you will lose valid data, so to make sure your filter will work as expected you can test it using one of the following:

  • Using a quick segment in GA. This will let you see live how your filter will behave directly on your reports.
  • Using a regex test tool like regex101.com, here is an example using the latest expression I created.
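
If you prefer to script the check, the same idea can be sketched in Python. The function name check_expression and the list of unwanted hostnames are assumptions made for this example, and Python's re module is used as a stand-in for GA's regex engine, which should agree for patterns this simple.

```python
import re

def check_expression(pattern, should_match, should_not_match):
    """Return (missing, leaking): valid hostnames the pattern misses,
    and unwanted hostnames it lets through."""
    compiled = re.compile(pattern)
    missing = [h for h in should_match if not compiled.search(h)]
    leaking = [h for h in should_not_match if compiled.search(h)]
    return missing, leaking

# Valid hostnames from the list earlier in this guide.
valid = [
    "carloseo.com",
    "services.carloseo.com",
    "www.ohow.co",
    "www.youtube.com",
    "carloseo.googleweblight.com",
    "ohow.us10.list-manage.com",
    "m.youtube.com",
]

# Hostnames that should be filtered out (illustrative picks).
unwanted = ["staging1.carloseo.com", "(not set)", "localhost", "mashable.com"]

missing, leaking = check_expression(
    r"^(services\.)?carloseo|ohow|list-manage|youtube", valid, unwanted
)
print("Valid hostnames not matched:", missing)
print("Unwanted hostnames matched:", leaking)
```

Both lists should be empty before you use the expression in a filter. A non-empty first list means you would lose valid data; a non-empty second list means unwanted traffic would still get through.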


Creating a filter to include only valid hostnames

You are almost there! All this reading and work will soon be rewarded.

Once you have your expression fully tested, it's time to create the "include hostname filter" that will help you get rid of all the toxic traffic that skews your reports.

How to create a valid hostname filter for ghost spam and dev sites

On your Google Analytics:

  1. Go to the Admin tab, and select the view where you want to apply the filter. If you follow the naming above, this will be the Master view or Test view.
  2. Select Filters under the View column, and select + Add Filter
  3. Enter as a name for the filter Include Valid Hostnames.
  4. Configure the filter as follows:
    • Filter Type: Custom > Include
    • Filter Field: Hostname
  5. In the Filter Pattern box, paste the hostname expression that you built before.
  6. [optional] You can click on Verify this filter for a quick glance of how the filter will work. You should only see spam or irrelevant hostnames on the left side of the preview table.

    If you get this message: "This filter would not have changed your data. Either the filter configuration is incorrect, or the set of sampled data is too small"

    It is probably because of the limited data used by this feature. Try verifying it with a quick segment (if you haven't done so yet).

  7. After making sure your filter is ok, Save the filter.

IMPORTANT: This filter doesn't require regular updates, but it's essential to update the expression whenever you add the tracking ID to a new service, tool, or domain.

Wrapping it up

The hostname report can greatly help you to increase the quality of your data. With the information given there, you can create a solid filter that will only allow valuable data to pass.

Depending on the configuration and size of your site, the filter might be more or less difficult to configure. However, the results are worth the time invested in preparing one of the most important filters you could add to your Analytics.

What else can I do to improve my Google Analytics data

Adding the hostname filter will have a great impact on your Analytics data. Here are other guides that can help you even more:

Do you have any questions or feedback?

I've tried to cover all the important information in this guide, however, if there is any part of the guide where you had difficulties, please let me know in the comments section below.

If this article helped you, consider sharing it or leaving a comment below on your experience, it may help other people!