
The web server administrator needs to set certain properties to ensure that logging is activated.


Academic year: 2021


Access Logs

As before, we are going to use the Microsoft Virtual Labs for this exercise. Go to http://technet.microsoft.com/en-us/bb467605.aspx, then under Server Technologies click on Internet Information Services (IIS). Then, under the heading “Step into the IIS7 Virtual Lab for Free”, click on “TechNet Virtual Lab: Working with the IIS Manager”. We are not going to use the lab manual they provide for this lab, though; we are just going to use their virtual lab to work in. Once you get into the lab, if a popup box comes up, close it so that you see the actual desktop of the server.

Web site visitors request pages from the server. A web server administrator can configure the server to log these requests. Typically, the server can collect information about who is accessing which part of the site, and when. The “who” part is typically limited to the IP address, unless the site requires registration. Registration is more typical of e-commerce sites, where visitors often also fill a shopping cart as they purchase merchandise. In those cases, one may also track shopping cart completion rates, purchase amounts, visitor buying habits, and many other facets. During this tutorial, we will just cover the fundamentals.

The web server administrator needs to set certain properties to ensure that logging is activated.

Go to Start, then IIS Manager.


It is recommended to use the W3C Format, which is the most commonly used format. Note that you can customize the fields that are logged by clicking the “Select Fields” button next to the Format selection dropdown. Go ahead and click the “Select Fields” button so you can see the field options that are available for logging, but I would just leave the default selections.

Other options that you might change include the directory where the log files are stored and the schedule for creating a new log file (I would leave it at Daily). Alternatively, you can set a maximum file size for the log rather than a schedule for how often it is replaced. This is a text file, so it does not take up a lot of space. Additionally, you have limited control over the individual file names (note the checkbox for using local time for file naming and rollover).

Before exiting the panel, make a note of the directory where the log files are stored, then be sure to click “Apply” in the right hand pane to save any changes, and then hit the back arrow at the top of the logging screen to go back to the main IIS panel.

Go to Start, then Computer, and navigate to the directory where the log files are stored (usually C:\inetpub\logs\LogFiles), and then double-click on whatever folders are in the LogFiles folder to access the logs. Drill down until you see files that end with .log.
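If you would rather not drill down by hand, the same search can be sketched in a few lines of Python. This is only an illustration: the function name find_log_files is made up for this example, and the directory is the usual default mentioned above, which may differ on your server.

```python
from pathlib import Path


def find_log_files(root):
    """Return every file ending in .log under root, searching subfolders too."""
    return sorted(Path(root).rglob("*.log"))


# Usual default IIS log location (an assumption; use the directory you noted).
for log_file in find_log_files(r"C:\inetpub\logs\LogFiles"):
    print(log_file)
```

If the directory does not exist on the machine where you run this, the loop simply prints nothing.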


Access logs contain a wealth of information (assuming the server administrator has activated them and is tracking appropriate information). Some example lines from an old access log from an actual site are shown below:

#Fields: date time c-ip cs-method cs-uri-stem cs-uri-query sc-status sc-win32-status sc-bytes cs-bytes time-taken cs(User-Agent) cs(Cookie) cs(Referer)

2001-01-02 02:28:42 128.6.236.60 GET /robots.txt - 200 121 625 132 949218 ru-robot/0.1+[[email protected]] - -
2001-01-02 03:57:39 210.117.216.5 GET /98ASW_Classes.htm - 200 0 39112 139 1969 RaBot/1.0+Agent-admin/[email protected] - -
2001-01-02 08:49:01 152.163.188.167 GET /index.htm - 200 0 16359 134 781 - - -
2001-01-02 09:33:17 199.172.149.215 GET /Calendar+of+Events.htm - 200 0 15644 126 328 ArchitextSpider - -
2001-01-02 13:00:10 199.172.149.206 GET /Board_Staff_Bio.htm - 200 0 23146 119 469 ArchitextSpider - -
2001-01-02 13:28:57 199.172.149.165 GET /Living+Upstream.htm - 200 0 14152 121 343 ArchitextSpider - -
2001-01-02 14:59:27 64.210.248.135 GET /98ASW_Sponsors.htm - 200 0 5131 387 281 Mozilla/4.75+[en]+(Win98;+U) - -

In this case, you have the date, time, client IP address, access method, the file they requested, the query string (if any), the standard server status code and the Windows status code, the number of bytes sent and received, the time taken to serve the request, what user agent was accessing the server (i.e., what browser and operating system), any cookie, and the referer (the page that contained the link to that file).
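To make the field list concrete, here is a minimal Python sketch that pairs each value in one of the sample entries with its name from the #Fields line. The function names parse_fields and parse_entry are made up for this illustration, and the sketch assumes every entry has exactly one value per field, separated by spaces, as in the samples above.

```python
FIELDS_LINE = (
    "#Fields: date time c-ip cs-method cs-uri-stem cs-uri-query "
    "sc-status sc-win32-status sc-bytes cs-bytes time-taken "
    "cs(User-Agent) cs(Cookie) cs(Referer)"
)
ENTRY = (
    "2001-01-02 14:59:27 64.210.248.135 GET /98ASW_Sponsors.htm - 200 0 "
    "5131 387 281 Mozilla/4.75+[en]+(Win98;+U) - -"
)


def parse_fields(directive):
    """Return the field names listed after '#Fields:'."""
    return directive.split()[1:]


def parse_entry(field_names, line):
    """Map each field name to its value; a '-' means the value was empty."""
    return dict(zip(field_names, line.split()))


record = parse_entry(parse_fields(FIELDS_LINE), ENTRY)
print(record["c-ip"], record["cs-uri-stem"], record["sc-status"])
```

Splitting on spaces works here because the log writes spaces inside a value as + signs, as you can see in the user agent.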

As you can see, looking at the raw access logs is rather interesting, but we need to compile the information into usable data. There are many tools out there to accomplish this.
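As a small taste of what such tools do, the sketch below tallies user agents and total bytes sent for five of the sample entries. It assumes the field order of the sample log (user agent in the twelfth position, sc-bytes in the ninth); a real tool would read the positions from the #Fields line instead.

```python
from collections import Counter

LOG_LINES = [
    "2001-01-02 08:49:01 152.163.188.167 GET /index.htm - 200 0 16359 134 781 - - -",
    "2001-01-02 09:33:17 199.172.149.215 GET /Calendar+of+Events.htm - 200 0 15644 126 328 ArchitextSpider - -",
    "2001-01-02 13:00:10 199.172.149.206 GET /Board_Staff_Bio.htm - 200 0 23146 119 469 ArchitextSpider - -",
    "2001-01-02 13:28:57 199.172.149.165 GET /Living+Upstream.htm - 200 0 14152 121 343 ArchitextSpider - -",
    "2001-01-02 14:59:27 64.210.248.135 GET /98ASW_Sponsors.htm - 200 0 5131 387 281 Mozilla/4.75+[en]+(Win98;+U) - -",
]

BYTES_SENT_INDEX = 8   # position of sc-bytes in the sample field order
USER_AGENT_INDEX = 11  # position of cs(User-Agent) in the sample field order

# Count how many requests each user agent made.
agent_counts = Counter(line.split()[USER_AGENT_INDEX] for line in LOG_LINES)

# Add up the bytes the server sent for all requests.
total_bytes_sent = sum(int(line.split()[BYTES_SENT_INDEX]) for line in LOG_LINES)

print(agent_counts.most_common())
print(total_bytes_sent)
```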

From this point on, go ahead and exit the Virtual Lab. Because the Virtual Lab does not allow you to access the internet through the virtual browser, we will go back to working on your local machine. This needs to be one of the following Operating Systems as we are going to be installing some software:

Windows 7, Vista, XP, Server 2003/2008 32- and 64-bit editions

A search on “weblog analysis” brings up a lot of choices. One such tool is called “Deep Log Analyzer”. It has a free version or a for-purchase version for around $200. A comparison of the feature differences between the two tools can be found at:

http://www.deep-software.com/compare.asp

We are going to look at how to use this tool. Open your favorite browser and go to http://www.deep-software.com/. On the top right hand side of the page, you will see two options; one says “Download Now It’s Free” and the other says “Buy Now Professional Edition”. Click on the “Download Now It’s Free” link.


also offer a free edition” link, and save the file to your computer. I saved mine to my desktop; feel free to save it wherever you’d like, just remember where you put it.

Now locate the file you downloaded; it should be called “dlafree.exe”. Double-click on the file, and follow the prompts to install it. On the last screen, it will ask if you want to open Deep Log Analyzer and the Sample Project. Leave the box checked so that the program will open with the Sample File loaded.

It should look something like this:

If your Sample Project didn’t load, click on “Sample Project” under “Open a Project” in the right hand menu.

Take some time to examine the reports offered in the left hand menu. For the lab, answer the following questions and put your answers in the “Notes” section of the lab (or put them into a text document and upload it):

1. What was the top page for the sample site?

2. What is the most popular day of the week for the website?

3. What was the most popular browser? Click on the + sign next to the top browser to see the specific version of that browser that was the most popular (looking at it, this obviously is older data!)

4. Tell me what the top search phrase was.

For the next part of the lab, we need some of our own log files to experiment with. Since we can’t access the log files in the virtual server, I’ve provided a set of old log files from an actual site for us to use. Minimize Deep Log and open your browser and go to:


Download this file to your computer (remember where you put it; the desktop is fine). Unzip the files to a folder and make a note of the location.

Get back into Deep Log and click on “Create New Project” in the right hand menu. It will warn that this will close the current project; click OK.

The new project window will now pop up. Type “CMWEB290” for the Project Name in the box, then hit “Next”.

It looks something like this:


C:\inetpub\logs\LogFiles\MyServer\*.log

This will grab all of the log files stored in that directory.

In our case, click on the little folder icon next to “Web server log files list”. This will open up a blank line for us to type in. Type in the path to wherever you unzipped the files, and end it with ex*.log, since all of the log file names begin with ex. So for example, my files reside at C:\Users\sstripp\Desktop\unzip\sampleLogFiles\ex*.log.
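If the wildcard is unfamiliar, this short Python sketch shows which names ex*.log and *.log would pick up. The file names are invented for the illustration, and the helper matching is not part of Deep Log Analyzer; it only mimics the usual meaning of * as "any run of characters".

```python
from fnmatch import fnmatchcase

# Hypothetical file names in the unzipped folder.
FILE_NAMES = ["ex010102.log", "ex010103.log", "notes.log", "ex010102.txt"]


def matching(names, pattern):
    """Return the names that fit the wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


print(matching(FILE_NAMES, "ex*.log"))  # only the logs that start with ex
print(matching(FILE_NAMES, "*.log"))    # every .log file
```

Note that fnmatchcase is case-sensitive, whereas file name matching on Windows generally is not.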

Now click next. Now we have to give it some information about our site. These log files are from a site called the Sun Foundation. See the settings to fill in below:

For the website URL, put http://www.sunfoundation.org.
For the domain names, put sunfoundation.org.
For the default page, put index.htm.
Leave “Remove Self Referrals” checked.
Leave the other fields as default.


On the next screen, select the IP Address + User Agent method, because otherwise you have to add DeepTracker code to every page on your website. This code calls a JavaScript script that collects extra data such as screen resolution, Flash version, JavaScript on/off, and system language. If you would like to have those stats, you will need to add the code that the program will give you on the last project setup screen to every page on your site. For our purposes, we’ll just use the IP Address + User Agent method, which doesn’t require page modification.
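To get a feel for what identifying visitors by IP address plus user agent could mean, here is a rough Python sketch that treats each distinct pair as one visitor. This is an assumption made for illustration, not Deep Log Analyzer's actual algorithm, and the request data is invented.

```python
# Hypothetical (client IP, user agent) pairs pulled from log entries.
REQUESTS = [
    ("199.172.149.215", "ArchitextSpider"),
    ("199.172.149.215", "ArchitextSpider"),  # same visitor, second page
    ("64.210.248.135", "Mozilla/4.75+[en]+(Win98;+U)"),
    ("64.210.248.135", "Mozilla/4.0+(compatible;+MSIE+5.5)"),  # same IP, other browser
]


def count_visitors(requests):
    """Count distinct (IP address, user agent) pairs."""
    return len(set(requests))


print(count_visitors(REQUESTS))
```

The last two requests share an IP address but count as two visitors because the browsers differ, which is the kind of case the user agent helps separate (for example, several people behind one office network).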


On the next screen, leave the defaults. It will automatically detect your own IP address and fill it into the “Do Not Import from these Addresses” box; this allows you to exclude internal company traffic when analyzing the data so you don’t falsely inflate your numbers due to employees visiting the site. If you want to exclude your company traffic, you would want to list any external IP addresses that your company is using. Of course, the one it picked up for me is an internal IP address, so I would need to find out what the external IP address for my network is if I wanted to exclude myself.
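The difference between an internal and an external address, and the effect of an exclusion list, can be sketched in Python with the standard ipaddress module. The helper names and the choice of 152.163.188.167 as "the company's external address" are purely hypothetical.

```python
import ipaddress


def is_internal(address):
    """True for private-range addresses such as 192.168.x.x or 10.x.x.x."""
    return ipaddress.ip_address(address).is_private


def drop_excluded(client_ips, excluded):
    """Keep only the requests whose client IP is not on the exclusion list."""
    excluded = set(excluded)
    return [ip for ip in client_ips if ip not in excluded]


print(is_internal("192.168.1.10"))    # an address a home or office router hands out
print(is_internal("64.210.248.135"))  # an address seen in the sample log

# Pretend 152.163.188.167 is our company's external address (hypothetical).
client_ips = ["152.163.188.167", "64.210.248.135", "152.163.188.167"]
print(drop_excluded(client_ips, ["152.163.188.167"]))
```

An internal address never appears in the logs of a public web server, which is why excluding it would have no effect on the reports.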


The next screen allows you to limit the size and scope of your reports. I would recommend keeping the data for at least 12 months, and you will probably want to make it a practice to save a copy of the reports for the year on Dec 31st so that you have them.

Corporations and nonprofit organizations sometimes need website data to put in reports, but they may not think to ask you for them until sometime the following year!

In our case, uncheck the box, then hit Finish & Save.


Now your screen should look something like the one below, with the Web server logs imported dates being between 01/01/2001 and 01/05/2002.

Examine several of the reports once again to see how they look. For the lab, tell me what the most popular hour of the day was for people to visit the site.
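To see roughly how an hour-of-day report could be computed, the Python sketch below counts requests per hour using only the date and time fields of the seven sample entries shown earlier. It illustrates the idea and is not how Deep Log Analyzer itself works; the function name hits_per_hour is made up.

```python
from collections import Counter

# Date and time fields of the seven sample log entries.
TIMESTAMPS = [
    "2001-01-02 02:28:42",
    "2001-01-02 03:57:39",
    "2001-01-02 08:49:01",
    "2001-01-02 09:33:17",
    "2001-01-02 13:00:10",
    "2001-01-02 13:28:57",
    "2001-01-02 14:59:27",
]


def hits_per_hour(timestamps):
    """Count requests by the two-digit hour at the start of the time field."""
    return Counter(stamp.split()[1][:2] for stamp in timestamps)


busiest_hour, hits = hits_per_hour(TIMESTAMPS).most_common(1)[0]
print(busiest_hour, hits)  # prints: 13 2
```

Keep in mind that the hour is whatever time zone the log was written in, so it may need shifting before you compare it with your visitors' local time.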

Summary - We have reviewed some of the fundamentals of access log analysis. The information gained from these reports can give valuable insight on what is and is not working on a website, what your most popular areas are, what browsers you need to concentrate on when testing the site, etc. The Page Not Found error report is an important one to look at on a regular basis so that if there is a problem on the site, you can find it and correct it. Some companies even track the success of advertising campaigns by including a link to a certain page on the website as part of the advertisement, and then looking to see if that page shows up in the top hits.
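A Page Not Found report boils down to counting the requests whose status code is 404, grouped by the file requested. Here is a minimal Python sketch of that idea; the request data and the function name not_found_report are invented for the example.

```python
from collections import Counter

# Hypothetical (cs-uri-stem, sc-status) pairs taken from log entries.
REQUESTS = [
    ("/index.htm", "200"),
    ("/old_page.htm", "404"),
    ("/old_page.htm", "404"),
    ("/missing.gif", "404"),
    ("/Calendar+of+Events.htm", "200"),
]


def not_found_report(requests):
    """List the missing files, most frequently requested first."""
    missing = Counter(uri for uri, status in requests if status == "404")
    return missing.most_common()


for uri, count in not_found_report(REQUESTS):
    print(count, uri)
```

The same counting approach, filtered on a campaign landing page instead of on status 404, would show how many hits an advertisement brought in.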

Also, as I mentioned above, sometimes corporations need to know how many visitors they’ve had over a certain period and other statistics to include in reports. For example, some organizations include this data in their annual reports.
