What is behind all those web traffic reports?

Feb 10, 2006, 09:19

Fernando Maciá

Not all website statistical reports are created equal. Server activity analyses provide adequate measurements to assess the performance of your Internet presence, while real-time statistics offer more accurate data, like the exact number of unique visitors, their access method, or pages visited.

Commercial applications like Urchin or WebTrends belong to the first category: they generate server analysis reports, typically accessible from a control panel. Services in the second category detect each website visit from within the web pages themselves, collecting and registering valuable information that is later presented in graphical form. These two types of reports are not mutually exclusive but complementary, and it pays to learn how to interpret the information each one conveys. Together, they provide a complete picture that is invaluable when making marketing decisions.

Server Activity Analysis

Internet servers keep records of all traffic and information requests in log files. These log files include information on errors, processing time, bandwidth used, the visitor’s IP address, and where the visitor came from (the referrer), along with the operating system and Internet browser used. Applications that use this information are typically installed behind the server’s firewall and analyze the log files periodically (weekly, monthly…). After interpreting the data, these applications generate a report with tables and graphs that present the information in a readable, user-friendly format.
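As a rough illustration of what a log analyzer reads, the sketch below parses a single line in the widely used "combined" access-log format. The article does not say which log format a given server writes, so the pattern, the `parse_log_line` helper, and the sample line are all assumptions made for the example.

```python
import re

# Pattern for the common "combined" access-log format (an assumption;
# real servers can be configured to write other formats).
COMBINED_LOG = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMBINED_LOG.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A "-" in the size field means no body was sent.
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    return entry


# Made-up sample line, for illustration only.
sample = (
    '203.0.113.7 - - [10/Feb/2006:09:19:00 +0000] '
    '"GET /index.html HTTP/1.1" 200 5120 '
    '"http://www.example.com/search?q=web+stats" "Mozilla/5.0"'
)

entry = parse_log_line(sample)
print(entry["ip"], entry["path"], entry["status"], entry["referrer"])
```

Each parsed entry carries the fields mentioned above (IP address, referrer, browser, bytes transferred), which a reporting tool would then aggregate over a week or a month.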

Real-time Statistics

Another method of analyzing website activity consists of updating a database each time a visitor comes to the website. This method requires a small piece of JavaScript code in each of the pages to be tracked. The code is invisible to website visitors. The first time a visitor reaches the website, the JavaScript code places a cookie on their computer so that they can be recognized as the same unique visitor thereafter. Very shortly after the code is inserted in each web page, information about visitors is collected and securely registered in a database, where it is available for immediate retrieval. The stored information could, for example, accurately track a given marketing campaign (e.g. a banner or a pay-per-click program), since it records how the visitor was referred, or it could track the dollar value of Internet orders. Because this information is saved in real time, there is no need to wait for a periodic report.
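The collecting side of such a service can be pictured with a minimal sketch like the one below. It is not how any particular product works: the `RealTimeTracker` class, its in-memory list standing in for the database, and the campaign labels are all hypothetical, and the JavaScript snippet and the cookie are reduced to a plain function argument.

```python
import uuid
from datetime import datetime, timezone


class RealTimeTracker:
    """In-memory stand-in for the tracking database (hypothetical)."""

    def __init__(self):
        self.page_views = []  # one record per tracked page view

    def record_view(self, page, referrer, visitor_cookie=None):
        """Store one page view and return the visitor ID to keep in the cookie."""
        is_new = visitor_cookie is None
        # A first-time visitor has no cookie yet, so a new ID is issued.
        visitor_id = uuid.uuid4().hex if is_new else visitor_cookie
        self.page_views.append({
            "visitor_id": visitor_id,
            "page": page,
            "referrer": referrer,
            "new_visitor": is_new,
            "time": datetime.now(timezone.utc),
        })
        return visitor_id

    def unique_visitors(self):
        """Count distinct visitor IDs seen so far."""
        return len({view["visitor_id"] for view in self.page_views})

    def views_by_referrer(self):
        """Count page views per referral source, e.g. per campaign."""
        counts = {}
        for view in self.page_views:
            counts[view["referrer"]] = counts.get(view["referrer"], 0) + 1
        return counts


tracker = RealTimeTracker()
cookie = tracker.record_view("/index.html", "banner-campaign")
tracker.record_view("/products.html", "banner-campaign", visitor_cookie=cookie)
tracker.record_view("/index.html", "search-engine")

print(tracker.unique_visitors())    # 2
print(tracker.views_by_referrer())  # {'banner-campaign': 2, 'search-engine': 1}
```

Because every page view is written as it happens, both figures can be queried at any moment rather than at the end of a reporting period.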

Can a log file analyzer and a real-time statistical report produce different results?

It is indeed very possible since each one uses a different type of data. That is why it is essential to gain an understanding of what each report is trying to convey. Some of the major differences are outlined below:

Counting Method:

A log file analyzer counts all hits registered by a web server. The server registers one hit for each item requested by the visitor: one hit for each page accessed, one additional hit for each image contained within that page, plus one hit for each script executed. If frames are used, additional hits are registered for the page inside each frame. The number of hits reported by real-time statistics, on the other hand, corresponds to pages visited, regardless of the number of elements each page contains.
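The gap between the two counts is easy to see with a small worked example. The request list and the set of extensions treated as pages are assumptions chosen for illustration.

```python
import posixpath

# Extensions treated as "pages" (an assumption for this example).
PAGE_EXTENSIONS = {".html", ".htm"}

# Requests a server might log while one visitor views two pages (made up).
requests = [
    "/index.html",
    "/images/logo.gif",
    "/images/banner.jpg",
    "/scripts/menu.js",
    "/products.html",
    "/images/product.jpg",
]


def is_page(path):
    """True if the requested path looks like an HTML page."""
    extension = posixpath.splitext(path)[1].lower()
    return extension in PAGE_EXTENSIONS


hits = len(requests)                                    # what the server log counts
page_views = sum(1 for path in requests if is_page(path))  # what page tracking counts

print(hits, page_views)  # 6 2
```

Two pages viewed produce six hits in the log, which is why hit totals and page-view totals should never be compared directly.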

Unique Visitor Identification:

A log analyzer treats each distinct IP address as a unique visitor. However, since most accesses are made from Internet accounts with dynamic IP addressing, it is impossible to determine whether multiple visits from the same account on one day come from different visitors or from the same person who has been assigned a different IP address each time. A similar problem occurs when visitors connect through a caching proxy. In that case, the visitor’s true IP address is hidden behind the Internet Service Provider’s proxy IP address. These limitations make the number of unique visitors reported by a server analysis very unreliable. Since real-time statistics store a cookie on each visitor’s computer, visitors can be uniquely identified on each subsequent visit, independent of their IP address. The number of unique visitors in these reports is therefore much more accurate.
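A small example with invented data shows how the two identification methods can disagree. Here one person is assigned three different dynamic IP addresses, while two other people share a single proxy address.

```python
# Invented visit records: the IP address seen by the server and the
# visitor ID stored in the tracking cookie.
visits = [
    {"ip": "198.51.100.10", "cookie": "visitor-a"},
    {"ip": "198.51.100.27", "cookie": "visitor-a"},  # same person, new dynamic IP
    {"ip": "198.51.100.44", "cookie": "visitor-a"},  # same person, new dynamic IP
    {"ip": "203.0.113.5", "cookie": "visitor-b"},    # behind an ISP proxy
    {"ip": "203.0.113.5", "cookie": "visitor-c"},    # different person, same proxy
]

unique_by_ip = len({visit["ip"] for visit in visits})
unique_by_cookie = len({visit["cookie"] for visit in visits})

print(unique_by_ip, unique_by_cookie)  # 4 3
```

Counting by IP address reports four visitors where there were three, and the error can go in either direction: dynamic addressing inflates the figure, while shared proxies deflate it.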

Reporting Method:

Log analysis reports are generated periodically (e.g. weekly or monthly). Unfortunately, there is no way to know what might be taking place between reports. The data from the real-time statistics is available almost immediately and continuously. A report can be retrieved from any browser, anytime, anywhere. The reports typically present current and historical information in formats designed to quickly view trends and reach conclusions.

Referral Information:

A log analyzer tracks and registers the search keywords used to reach the website, along with the search engine used by each visitor. Real-time statistics provide valuable information about how particular search keywords evolve over time and how effective they are in directing traffic to the website. This information allows certain pages to be fine-tuned to obtain better rankings.
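Search keywords are usually recovered from the referrer URL, as in the sketch below. The `search_terms` helper is hypothetical, and the query parameter name `q` is an assumption: search engines differ in which parameter carries the search phrase.

```python
from urllib.parse import urlparse, parse_qs


def search_terms(referrer, query_param="q"):
    """Return (search engine host, keywords), or None if no keywords are found."""
    parsed = urlparse(referrer)
    values = parse_qs(parsed.query).get(query_param)
    if not values:
        return None
    return parsed.netloc, values[0]


# Made-up referrer URLs, for illustration only.
print(search_terms("http://www.example-search.com/search?q=web+traffic+reports"))
print(search_terms("http://www.example.com/links.html"))
```

The first call yields the host name and the decoded phrase "web traffic reports"; the second returns `None` because the referrer carries no search query.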

Crawlers/Spiders:

A log analyzer registers visits from crawlers and spiders as hits. Real-time statistics do not account for their visits unless they have viewed an HTML file.

Proxy Server Page Caching:

A log analyzer does not account for visits to pages served from a proxy server’s cache, since these pages have not been re-requested from the web server and so never appear in its log. Real-time statistics count all pages visited.

Non-HTML Files:

A log analyzer reports accesses to non-HTML files like graphics, images, or Flash as pages visited. Real-time statistics do not count these accesses unless the file has been specifically marked for tracking.

Error Pages:

A log analyzer can detect and register error pages and store this information in a separate log file. This is very useful for correcting defective pages. Real-time statistics, on the other hand, do not register errors as such; a visit to an error page is counted only if the JavaScript code on that page executes successfully.

Conclusion

The information provided by both types of reports is complementary. Certain aspects, like errors, average session times, visit referrals, and crawler/spider activity, are better tracked and reported by a server activity analysis. On the other hand, the information registered by real-time statistics can, for example, immediately report how many visits are being generated by a banner that was placed yesterday on a particular search engine. Furthermore, real-time reports can present important trends in website activity in a very intuitive, graphical form while providing a more accurate view of the number of unique visitors and of the website’s ability to attract returning visitors. In a future article, we will address the magic figures: which information is most important and must be considered before judging the performance of a website, drawing conclusions, or proposing changes.