More and more organizations are implementing an alternative technique for collecting web site traffic information rather than relying on web server log files. This technique is called client-side data collection, or data tagging for short. Webtrends’ implementation is referred to as the SDC (SmartSource Data Collector). Data tagging solves many problems associated with web server log file analysis.
Implementing data tagging requires some development work to ensure that data tags are inserted and maintained on web pages.
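To make the development work concrete: a data tag is typically a small snippet on each page that requests a tiny image from the collection server, passing page details as query parameters. The sketch below is a generic illustration only, not actual Webtrends SDC tag code; the collector URL and the parameter names (`uri`, `dom`, `ref`, `ts`) are hypothetical.

```javascript
// Generic sketch of a client-side data tag (hypothetical names, not the
// real Webtrends SDC syntax). The page script builds a beacon URL that
// carries page metadata as query parameters; requesting that URL causes
// the collection server to log one hit in the central data file.
function buildBeaconUrl(collector, page) {
  const params = new URLSearchParams({
    uri: page.path,            // page path being viewed
    dom: page.domain,          // domain passed explicitly, so one
                               // collector can serve multiple sites
    ref: page.referrer,        // referring URL
    ts: String(page.timestamp) // client-side timestamp
  });
  return collector + "?" + params.toString();
}

// In a browser the tag would fire the hit with: new Image().src = url;
const url = buildBeaconUrl("https://collector.example.com/dcs.gif", {
  path: "/products/index.html",
  domain: "www.example.com",
  referrer: "https://search.example.net/",
  timestamp: 1700000000000
});
console.log(url);
```

Because the request is made by the visitor's browser rather than read from a server log, the hit is recorded even when a proxy or local cache serves the page itself.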
With data tagging, web traffic data is more accurate because traffic normally hidden by cache or proxy servers is tracked. IT administration is simplified because data collection is centralized in one location, rather than site data being dispersed among several log files from multiple web servers that may also be geographically dispersed. Web data can also be collected from specialized applications, such as application servers and browser applications (e.g. Macromedia Flash).
In short, the SDC has the following advantages:
- Generates logs optimized for Webtrends that are as much as 90% smaller than traditional access logs.
- Produces a single centralized SmartSource file rather than separate log files for each web server. This essentially eliminates the administrative headaches associated with gathering logs from multiple, geographically dispersed web servers. The SmartSource file can even contain hits from multiple domains (the domain name can also be passed as a query parameter), allowing visitor behavior to be analyzed across an organization’s sites or even partner sites provided they permit your tags to be included on their pages.
- Provides information that is difficult or impossible to obtain with log files. For example, data tags linked to your SDC can be included in your banner ads placed on other sites. SmartSource tags can also be inserted into Flash applications, permitting a hit to be entered into the SmartSource file for each event fired in the program. This means visitor activity within Flash applications can be analyzed just like visitor interactions with HTML-based pages.
- Improves accuracy because traffic normally hidden by cache or proxy servers is tracked. In many cases, web server log files do not accurately represent the actual interactions visitors have with a web site. Proxy servers are one of several examples of how analysis results can be distorted by web server log file data collection: they deflect page views from web servers by caching the most frequently requested pages. Local caches have a similar effect, handling browser requests from locally cached pages rather than making repeated requests to the web server. As a result, these page views are never recorded in the web server log files.
- Creates a cookie for more accurate reporting. Cookies ensure visitors are tracked as they navigate and return (if using a persistent cookie) to your site. This enables the most sophisticated features of Webtrends such as Scenario Analysis, SmartView, Path Analysis, and Custom Reporting.
- Acts as a filter: you tag only the pages you need reporting on.
- Eliminates the need to filter or scrub bots and spiders from the logs, resulting in more accurate reporting and lower CPU processing requirements.
- Enhances reporting capabilities through META tagging (e.g. tracking revenue, or correlating paid search terms with conversions).
- Tracks PDF downloads.
- Tracks dynamic / Web 2.0 events on pages (e.g. DHTML or any browser-supported event).
- Tracks events within Flash movies.
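The last few advantages above all come down to the same mechanism: each in-page event (a DHTML click, a Flash event firing, a PDF download link) sends its own beacon so that it appears as a hit in the collected data. The following is a minimal sketch of that idea; `trackEvent` and its parameter names are illustrative assumptions, not the actual Webtrends API.

```javascript
// Hedged sketch of event-level tagging: one beacon per in-page event.
// The collector URL and parameter names are hypothetical.
function trackEvent(collector, eventName, extra = {}) {
  const params = new URLSearchParams({ event: eventName, ...extra });
  // In a browser, the hit would be fired with: new Image().src = url;
  return collector + "?" + params.toString();
}

// e.g. wired to a download link's onclick handler:
const hit = trackEvent("https://collector.example.com/event.gif",
                       "pdf_download", { file: "whitepaper.pdf" });
console.log(hit);
```

Because the event handler, not the web server, reports the activity, interactions that never generate a page request (such as actions inside a Flash movie) can still be counted.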
Client-side data collection is quickly growing in popularity as the superior approach to collecting web visitor behavior information, offering greater reporting accuracy and lower administrative overhead. Organizations should carefully analyze the costs and benefits of data tagging versus web server log file analysis, and determine which method will best meet their insight needs.