Google Analytics vs Server Logs


Mar 6, 2014 by Paul White

Every media property measures its stats in one way or another. Properties use their stats to set advertising rates and to measure their growth. Unfortunately, some properties still prefer stats that show them numbers they like rather than the truth. Here I will break down the two types of stats tracking (analytics and server logs) and how each should be used.

What are Stats based on Server Logs?

Every request to your website is recorded in your server logs. The log files are typically plain ASCII text, with one line per log entry. It doesn't matter what is requested from your server (photo, web page, style sheet, video, etc.); each request is logged. A stats server then reads the server logs and organizes the information into a format that is easier for users to view (charts, tables). It also tells you how many visitors and how many impressions (page views) your website receives.
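To make the idea concrete, here is a minimal sketch of what a stats server does with those log lines. It assumes the logs use an Apache/NGINX "combined"-style layout, which may not match your server, and the sample lines, the `summarize` function, and the list of asset extensions are all made up for illustration.

```python
import re

# Assumed layout: Apache/NGINX "combined"-style access log lines.
# Your server may log in a different format (for example W3C extended).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

# Extensions treated as supporting files rather than pages (an assumption).
ASSET_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".gif", ".ico", ".mp4")

def summarize(lines):
    """Return (unique_ips, page_views, total_requests) for an iterable of log lines."""
    ips = set()
    page_views = 0
    total = 0
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        total += 1
        ips.add(match.group("ip"))
        # Drop any query string before checking the file extension.
        path = match.group("path").split("?", 1)[0].lower()
        if not path.endswith(ASSET_EXTENSIONS):
            page_views += 1
    return len(ips), page_views, total

# Hypothetical sample entries using reserved documentation IP addresses.
SAMPLE_LOG = [
    '192.0.2.10 - - [03/Feb/2014:10:00:01 -0500] "GET /index.html HTTP/1.1" 200 5120',
    '192.0.2.10 - - [03/Feb/2014:10:00:02 -0500] "GET /style.css HTTP/1.1" 200 830',
    '192.0.2.10 - - [03/Feb/2014:10:00:02 -0500] "GET /logo.png HTTP/1.1" 200 20480',
    '198.51.100.7 - - [03/Feb/2014:10:05:44 -0500] "GET /wp-login.php HTTP/1.1" 404 210',
    '203.0.113.25 - - [03/Feb/2014:10:07:13 -0500] "GET /articles/stats HTTP/1.1" 200 7300',
]

unique_ips, page_views, total_requests = summarize(SAMPLE_LOG)
print(unique_ips, page_views, total_requests)  # prints: 3 3 5
```

Note that the request for `/wp-login.php` counts as a visitor and a page view here, even though it looks like a probe rather than a person. The log line alone gives the script no reliable way to tell the difference.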

What is wrong with Stats based on Server Logs?

The problem is that not every request to your website comes from an organic visitor (a real person). Your website is also visited by search engines and various bots (many of them with malicious intent). Since the only visitors your advertisers care about are the real ones, it's important to measure only your organic visitors. But stats servers based on your server logs don't have that ability. This is where an analytics program like Google Analytics comes in.

How does Google Analytics work?

Google has you add a small piece of JavaScript to every page of your website. When real visitors land on your website, the JavaScript gets executed. The JavaScript essentially pings Google with some basic information about the user. Search engines and bots usually don't run the JavaScript, since they are often just analyzing the HTML code and images. This means that the visitors to your website that run the JavaScript are, for the most part, real visitors.
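As a rough illustration of the receiving end of that ping, the sketch below tallies visitors and page views from beacon URLs. This is not Google's actual API: the `BeaconCounter` class, the `stats.example.com` endpoint, and the `visitor` and `page` parameter names are hypothetical, chosen only to show the idea.

```python
from urllib.parse import urlsplit, parse_qs

class BeaconCounter:
    """Tallies page views and unique visitors from tracking-beacon URLs."""

    def __init__(self):
        self.page_views = 0
        self.visitor_ids = set()

    def record(self, beacon_url):
        """Record one beacon request; ignore requests missing a visitor id."""
        params = parse_qs(urlsplit(beacon_url).query)
        visitor_id = params.get("visitor", [None])[0]
        if visitor_id is None:
            return False
        self.page_views += 1
        self.visitor_ids.add(visitor_id)
        return True

counter = BeaconCounter()
# Two page views from one browser and one from another (hypothetical URLs).
counter.record("https://stats.example.com/collect?visitor=a1&page=/index.html")
counter.record("https://stats.example.com/collect?visitor=a1&page=/about.html")
counter.record("https://stats.example.com/collect?visitor=b2&page=/index.html")
print(len(counter.visitor_ids), counter.page_views)  # prints: 2 3
```

A client that fetches the HTML but never runs the script never sends a beacon, so it never shows up in these counts.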

How big of a gap is there in the data?

You are probably wondering how much of a difference there is between your server logs and the data that Google Analytics reports. Years ago there wasn't much of a difference, but as more spiders and bots have come online, things have changed.

The following charts are for the same website over the same period (Feb 1st - Feb 28th, 2014).

SmarterStats (based on Server Logs)
[Chart: website stats with SmarterStats]

Google Analytics
[Chart: website stats with Google Analytics]

Comparing the data

Let's take Feb 3rd, 2014.
Google says I had 743 unique visitors to my website.
Server logs say I had 2159 unique visitors to my website.

That's a pretty big gap. Most media property owners would say that there is something wrong with Google Analytics. But I can assure you there is nothing wrong with this picture. Google can't track users who disable JavaScript (some ad blockers and privacy plugins may also disable Google's JavaScript), but this group is going to be a small fraction of your visitors (<1%).

When setting advertising rates I would want to use the 743 number from Google Analytics. The 2159 number is not useless, though. It indicates the number of unique IP addresses that visited the website that day. Search engines often use dozens of IPs when scanning your site. Rogue bots tend to use even more. The number above is likely the result of rogue bots scanning the site for security weaknesses and/or attempting to post comment spam through the submit forms.
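For a sense of scale, the short calculation below works out the gap between the two Feb 3rd figures. It is only a rough comparison, since the two tools are not counting exactly the same thing (unique IP addresses versus visitors whose browser ran the script), and the variable names are mine.

```python
# Figures quoted above for Feb 3rd, 2014.
analytics_visitors = 743   # unique visitors reported by Google Analytics
log_unique_ips = 2159      # unique IP addresses counted from the server logs

gap = log_unique_ips - analytics_visitors
analytics_share = analytics_visitors / log_unique_ips
gap_share = gap / log_unique_ips

print(f"Gap: {gap} unique IPs")                                # Gap: 1416 unique IPs
print(f"Analytics share of log total: {analytics_share:.1%}")  # 34.4%
print(f"Share not seen by analytics: {gap_share:.1%}")         # 65.6%
```

In other words, roughly two out of every three unique IPs in the logs that day never showed up in Google Analytics.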

How to reduce the number of Rogue Bots to your site

If your website is built on a common framework (WordPress, Joomla) or uses common web tools to manage the data (phpMyAdmin), it is likely attacked daily for known weaknesses in these frameworks. However, these rogue bots often operate from IPs that are outside of the USA. The easiest way to eliminate these bots is to firewall them from your server. If you don't want to firewall every country outside the USA, you can also just firewall specific countries that tend to host these IPs.

I firewall the following countries from my server, which has cut down the bots and spam by 95%.
  • Russia
  • China
  • India
  • Vietnam
  • Brazil
  • Iran
  • Ukraine
  • Malaysia
  • Romania
  • Pakistan
  • Philippines
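
Country blocking comes down to checking each incoming address against a list of IP ranges assigned to that country. The sketch below shows only that lookup, under some loud assumptions: the ranges are reserved documentation blocks standing in for real per-country lists (which you would get from a GeoIP or registry data source), and the `is_blocked` and `split_traffic` helpers are hypothetical. In practice the ranges would be loaded into the firewall itself rather than checked in application code.

```python
import ipaddress

# Hypothetical blocklist. These are reserved documentation ranges (RFC 5737),
# standing in for real per-country CIDR lists from a GeoIP data source.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]

def is_blocked(ip_text, networks=BLOCKED_NETWORKS):
    """Return True if the address falls inside any blocked network."""
    address = ipaddress.ip_address(ip_text)
    return any(address in network for network in networks)

def split_traffic(ip_list):
    """Partition addresses into (allowed, blocked) lists, preserving order."""
    allowed, blocked = [], []
    for ip_text in ip_list:
        (blocked if is_blocked(ip_text) else allowed).append(ip_text)
    return allowed, blocked

allowed, blocked = split_traffic(["192.0.2.10", "198.51.100.7", "203.0.113.25"])
print("allowed:", allowed)  # allowed: ['192.0.2.10']
print("blocked:", blocked)  # blocked: ['198.51.100.7', '203.0.113.25']
```

Running the unique IPs from your server logs through a check like this before changing any firewall rules is one way to estimate how much traffic a given country list would actually remove.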

Summary

When discussing your website's traffic with potential advertisers, always use your Google Analytics data. But don't ignore your server logs, as they can often alert you to security and performance issues within your website's architecture.
