Web analytics is the measurement, collection, analysis and reporting of web data for the purposes of understanding and optimizing web usage. Web analytics can be used to improve the effectiveness of websites and as a tool for market and business research. It can also help organisations measure the results of traditional print or broadcast advertising campaigns and judge whether a campaign has been successful. Web analytics provides information such as the number of visitors to a website and the number of page views, and helps gauge popularity trends, all of which are useful for market research.
The web servers that host the millions of websites across the globe generate large amounts of log data on a daily basis. Most of this log data is quite cryptic to an average user, but there are tools that make it easier to understand. Some of them are:
AWStats - This free log analysis tool is flexible and enjoys a large community base. You can not only analyze regular web server log files with it, but also parse FTP and mail log files, making it an all-in-one log file analysis solution. It not only generates reports in a streamlined format, but also lets you export the vital analyzed log data in different file formats for offline analysis. You can run this log analysis tool on almost every popular platform.
Analog - This is one of the lightest yet most powerful log file analysis tools, and it runs on almost every operating system. This multilingual tool can prepare reports in over two dozen popular languages. It also supports plugins that enrich the reports, giving you more detailed information for analyzing micro events happening on your web server. It has a rich set of reporting options, providing both graphical and statistical views of important traffic trends on the website. Since it is open source, you can easily modify this log analyzer to suit your business requirements.
Deep Log Analyzer - This powerful log analysis tool can easily parse log files generated by IIS as well as the Apache web server. Its interface offers tons of customizable options for generating custom reports from the raw data. Apart from a rich set of reporting modules, it also lets you export the parsed data in both HTML and Excel formats, and it supports script execution for automated report generation. Webmasters can also generate reports that focus primarily on search engine optimization. Its clean interface makes this tool quite user friendly.
Log Parser - This unique log analysis tool not only parses regular web server log files, but also analyzes several other types of event log files generated by the Windows operating system. In fact, with the help of additional helper software, you can make this flexible tool parse almost any kind of log file data. This tool is best for offsite log analysis where the amount of raw data is huge. The application itself has a small memory footprint and runs smoothly without glitches. It can easily parse and analyze different types of XML and CSV files containing raw server log data.
BareTail - This flexible log file analysis engine can parse and present information in real time as events occur on your web server. You can connect to a remote web server and see the parsed log reports as they're generated on the server. You can also upload huge amounts of raw data for parsing within this powerful analysis engine, and skipping to a specific point within a very large report is instantaneous. If required, you can also view multiple log files from different web servers simultaneously, in real time. I'd highly recommend this tool.
SmarterStats - The free edition of this unique log analysis software is powerful enough to analyze the web server log files of small to medium websites. Apart from regular log file parsing, it also offers several SEO features to help you attain a better presence on major search engines. The interface is clean and intuitive, with helpful prompts and hints. It can seamlessly parse log files from both Windows and Unix/Linux based web servers. You can also integrate several third-party programs with this tool for additional features, and you can use it on smartphones and tablets.
WebLog Expert - This handy log file analysis software can generate reports in HTML, PDF and CSV formats; the last format can be used to archive large volumes of report data. Its parsing engine is quite fast and generates long reports quickly without consuming many system resources. This tool can parse compressed log files generated by popular web servers. It also lets you create website profiles to segregate reporting data on a per-site basis. Its minimal, clean interface keeps things simple and helps you customize the reports to your needs.
Webalizer - This lightning fast log file analyzer can give bulky tools a run for their money. It supports parsing of different formats of popular log files. Power users can use its command line directives to generate reports without any delay. Due to its multilingual support, users from across the globe can generate the log reports in their preferred language. There's no limit on the size of raw log file you can use with this tool. You can even rotate the raw data to generate similar reports in a round robin fashion across multiple days. It has a DNS look-up capability to include relevant data.
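All of the tools above start from the same raw material: plain-text log lines. As an illustration, the Common Log Format that most of these analyzers accept can be parsed with a few lines of Python (a minimal sketch; real formats vary with server configuration):

```python
import re

# Common Log Format (CLF), the baseline format most analyzers accept:
# host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)

def parse_clf_line(line):
    """Parse one CLF line into a dict of fields, or None if it does not match."""
    m = CLF_PATTERN.match(line)
    return m.groupdict() if m else None

line = '203.0.113.4 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
entry = parse_clf_line(line)
print(entry["host"], entry["path"], entry["status"])  # 203.0.113.4 /index.html 200
```

A full analyzer aggregates thousands of such parsed entries into the hit, page-view, and visitor reports described above.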
Most web analytics processes come down to four essential stages or steps:
Collection of data: This stage is the collection of the basic, elementary data. Usually, this data is counts of things. The objective of this stage is to gather the data.
Processing of data into information: This stage usually takes counts and turns them into ratios, although some counts may remain. The objective of this stage is to take the data and conform it into information, specifically metrics.
Developing KPI: This stage focuses on using the ratios (and counts) and infusing them with business strategies, referred to as Key Performance Indicators (KPI). Many times, KPIs deal with conversion aspects, but not always. It depends on the organization.
Formulating online strategy: This stage is concerned with the online goals, objectives, and standards for the organization or business. These strategies are usually related to making money, saving money, or increasing marketshare.
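The four stages above can be sketched with hypothetical numbers (the metric names and the 5% target below are illustrative assumptions, not prescriptions from any particular organization):

```python
# Stage 1: collection -- raw counts of things.
counts = {"visits": 5000, "orders": 150, "single_page_visits": 2100}

# Stage 2: processing -- turn counts into ratios (metrics).
metrics = {
    "conversion_rate": counts["orders"] / counts["visits"],
    "bounce_rate": counts["single_page_visits"] / counts["visits"],
}

# Stage 3: KPIs -- a metric infused with a business target.
kpis = {"conversion_rate": {"value": metrics["conversion_rate"], "target": 0.05}}

# Stage 4: strategy -- act when a KPI misses its target.
for name, kpi in kpis.items():
    status = "on target" if kpi["value"] >= kpi["target"] else "needs action"
    print(f"{name}: {kpi['value']:.1%} ({status})")
```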
Each stage impacts or can impact (i.e., drives) the stage preceding or following it. So, sometimes the data that is available for collection impacts the online strategy. Other times, the online strategy affects the data collected.
Web servers record some of their transactions in a log file. It was soon realized that these log files could be read by a program to provide data on the popularity of the website. Thus arose web log analysis software.
In the early 1990s, website statistics consisted primarily of counting the number of client requests (or hits) made to the web server. This was a reasonable method initially, since each website often consisted of a single HTML file. However, with the introduction of images in HTML, and websites that spanned multiple HTML files, this count became less useful. The first true commercial Log Analyzer was released by IPRO in 1994.
Two units of measure were introduced in the mid-1990s to gauge more accurately the amount of human activity on web servers. These were page views and visits (or sessions). A page view was defined as a request made to the web server for a page, as opposed to a graphic, while a visit was defined as a sequence of requests from a uniquely identified client that expired after a certain amount of inactivity, usually 30 minutes. The page views and visits are still commonly displayed metrics, but are now considered rather rudimentary.
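The visit definition above can be sketched in Python, assuming a list of request timestamps from one uniquely identified client and the usual 30-minute inactivity cutoff:

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # the common inactivity cutoff

def count_visits(timestamps):
    """Count visits: a new visit begins whenever the gap since the
    previous request from the same client exceeds the timeout."""
    if not timestamps:
        return 0
    timestamps = sorted(timestamps)
    visits = 1
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > SESSION_TIMEOUT:
            visits += 1
    return visits

# Requests from one client: three close together, then one 2.5 hours later.
reqs = [datetime(2023, 10, 10, 13, 0),
        datetime(2023, 10, 10, 13, 5),
        datetime(2023, 10, 10, 13, 20),
        datetime(2023, 10, 10, 15, 30)]
print(count_visits(reqs))  # 2
```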
The emergence of search engine spiders and robots in the late 1990s, along with web proxies and dynamically assigned IP addresses for large companies and ISPs, made it more difficult to identify unique human visitors to a website. Log analyzers responded by tracking visits by cookies, and by ignoring requests from known spiders.
The extensive use of web caches also presented a problem for log file analysis. If a person revisits a page, the second request will often be retrieved from the browser's cache, and so no request will be received by the web server. This means that the person's path through the site is lost. Caching can be defeated by configuring the web server, but this can result in degraded performance for the visitor and a bigger load on the servers.
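Defeating caching typically means having the server send no-cache response headers so that every page view reaches the server log. A minimal, framework-agnostic WSGI sketch (the header values are the standard HTTP/1.1 directives; the page body is a placeholder):

```python
# A tiny WSGI application whose responses tell browsers and proxies
# not to reuse a stored copy, so every page view hits the server log.
def app(environ, start_response):
    headers = [
        ("Content-Type", "text/html"),
        # Disable caching at every layer between server and browser:
        ("Cache-Control", "no-store, no-cache, must-revalidate"),
        ("Pragma", "no-cache"),  # for legacy HTTP/1.0 clients
        ("Expires", "0"),
    ]
    start_response("200 OK", headers)
    return [b"<html><body>Hello</body></html>"]
```

The trade-off is exactly the one noted above: every revisit becomes a full round trip to the server.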
Concerns about the accuracy of log file analysis in the presence of caching, and the desire to be able to perform web analytics as an outsourced service, led to the second data collection method, page tagging or 'Web bugs'.
The web analytics service also manages the process of assigning a cookie to the user, which can uniquely identify them during their visit and in subsequent visits. Cookie acceptance rates vary significantly between websites and may affect the quality of data collected and reported.
Collecting website data using a third-party data collection server (or even an in-house one) requires an additional DNS look-up by the user's computer to determine the IP address of the collection server. On occasion, delays in completing a successful or failed DNS look-up may result in data not being collected.
With the increasing popularity of Ajax-based solutions, an alternative to the use of an invisible image is to implement a call back to the server from the rendered page. In this case, when the page is rendered on the web browser, a piece of Ajax code calls back to the server and passes information about the client that can then be aggregated by a web analytics company. This approach is somewhat limited by browser restrictions on which servers can be contacted with XMLHttpRequest objects. Also, this method can lead to slightly lower reported traffic levels, since the visitor may stop the page from loading mid-response, before the Ajax call is made.
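On the server side, such a tag or Ajax callback typically arrives as a small request carrying the client details as query parameters. A hypothetical Python sketch of the collection end (the `/collect` endpoint and the `page`/`sw`/`sh` parameter names are invented for illustration, not a real service's API):

```python
from urllib.parse import urlparse, parse_qs

hits = []  # in-memory store standing in for the analytics database

def handle_beacon(request_url):
    """Extract tag parameters from a beacon URL and record the hit."""
    params = parse_qs(urlparse(request_url).query)
    hit = {k: v[0] for k, v in params.items()}  # keep first value per key
    hits.append(hit)
    return hit

# A page-tag script might request something like this on page load:
handle_beacon("/collect?page=/index.html&sw=1920&sh=1080")
print(hits[0]["page"], hits[0]["sw"])  # /index.html 1920
```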
Logfile analysis vs page tagging
Both logfile analysis programs and page tagging solutions are readily available to companies that wish to perform web analytics. In some cases, the same web analytics company will offer both approaches. The question then arises of which method a company should choose. There are advantages and disadvantages to each approach.
The main advantages of log file analysis over page tagging are as follows:
Logfiles require no additional DNS lookups or TCP slow starts. Thus there are no external server calls which can slow page load speeds, or result in uncounted page views.
The web server reliably records every transaction it makes, e.g. serving PDF documents and content generated by scripts, and does not rely on the visitors' browsers cooperating.
Advantages of page tagging
The main advantages of page tagging over log file analysis are as follows:
Counting is activated by opening the page (provided the web client runs the tag scripts), not by requesting it from the server. If a page is cached, it will not be counted by server-based log analysis. Cached pages can account for up to one-third of all page views, so not counting them seriously skews many site metrics. It is for this reason that server-based log analysis is not considered suitable for analysis of human activity on websites.
The script may have access to additional information on the web client or on the user, not sent in the query, such as visitors' screen sizes and the price of the goods they purchased.
Page tagging can report on events which do not involve a request to the web server, such as interactions within Flash movies, partial form completion, and mouse events such as onClick, onMouseOver, onFocus, and onBlur.
The page tagging service manages the process of assigning cookies to visitors; with log file analysis, the server has to be configured to do this.
Page tagging is available to companies who do not have access to their own web servers.
Lately, page tagging has become a standard in web analytics.