Displaying access log analysis

I'm doing some work to analyse the access logs from a Catalyst web application. The data is from the load balancers in front of the web farm and totals about 35 GB per day. It's stored in a Hadoop HDFS filesystem and I use MapReduce (via Dumbo, which is great) to crunch the numbers.

The purpose of the analysis is to establish a usage profile (which actions are used most, what the average response time for each action is, whether the response was served from a backend or cache) for capacity planning, optimisation and setting thresholds for monitoring systems. Traditional tools like Analog will give me the most-requested URL or most-used browser, but none of that's useful for me. I don't need to know that /controller/foo?id=1984 is the most popular URL; I need to know what the hit rate and response time are across all hits to /controller/foo, so I can see if there's room for optimisation or caching and try to estimate what might happen if hits for this action suddenly double.
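To make the URL-versus-action distinction concrete, here is a minimal sketch of collapsing request URLs to an action. It assumes, purely for illustration, that the first two path segments identify the action; `action_for` and its `depth` parameter are hypothetical names, and real Catalyst dispatch rules may need something smarter.

```python
from urllib.parse import urlsplit


def action_for(url, depth=2):
    """Collapse a request URL to its leading path segments.

    Assumption: the first `depth` segments name the controller and action.
    """
    path = urlsplit(url).path  # drops the query string and fragment
    segments = [s for s in path.split("/") if s]
    return "/" + "/".join(segments[:depth])


# Both requests count towards the same action.
print(action_for("/controller/foo?id=1984"))
print(action_for("/controller/foo/1984"))
```

Grouping on this value rather than on the raw URL is what lets hit rate and response time be aggregated per action.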

I can easily break the data down into requests per action per period via MapReduce. The problem is displaying it in a digestible form and picking out important trends or anomalies. My output is of the form:

('2009-12-08T08:30', '/ctrl_a/action_a') (2440, 895)
('2009-12-08T08:30', '/ctrl_a/action_b') (2369, 1549)
('2009-12-08T08:30', '/ctrl_b/action_a') (2167, 0)
('2009-12-08T08:30', '/ctrl_b/action_b') (1713, 1184)
('2009-12-08T08:31', '/ctrl_a/action_a') (2317, 790)
('2009-12-08T08:31', '/ctrl_a/action_b') (2254, 1497)
('2009-12-08T08:31', '/ctrl_b/action_a') (2112, 0)
('2009-12-08T08:31', '/ctrl_b/action_b') (1644, 1089)

i.e., the keys are (time period, action) pairs and the values are (hits, cache hits) tuples for that action in that period. (I don't have to stick with this; it's just what I have so far.)
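For reference, a minimal sketch of a job that would produce records of this shape. The input layout (tab-separated timestamp, action and cache status) is a hypothetical stand-in for the real load balancer format, and `run_locally` is only a test harness imitating the shuffle step. The mapper and reducer are plain generator functions, which I believe is the shape Dumbo expects, but treat that as an assumption.

```python
from collections import defaultdict


def mapper(_key, line):
    # Hypothetical layout: ISO timestamp, action, HIT/MISS, tab-separated.
    timestamp, action, cache = line.rstrip("\n").split("\t")
    minute = timestamp[:16]  # truncate "2009-12-08T08:30:12" to the minute
    yield (minute, action), (1, 1 if cache == "HIT" else 0)


def reducer(key, values):
    hits = cache_hits = 0
    for h, c in values:
        hits += h
        cache_hits += c
    yield key, (hits, cache_hits)


def run_locally(lines):
    """Group mapper output by key and reduce it, without a cluster."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(None, line):
            grouped[key].append(value)
    result = {}
    for key in sorted(grouped):
        for k, v in reducer(key, grouped[key]):
            result[k] = v
    return result


sample = [
    "2009-12-08T08:30:01\t/ctrl_a/action_a\tHIT",
    "2009-12-08T08:30:40\t/ctrl_a/action_a\tMISS",
    "2009-12-08T08:31:02\t/ctrl_b/action_a\tMISS",
]
for key, value in run_locally(sample).items():
    print(key, value)
```

Adding response time to the value tuple (for example a running sum alongside the hit count) would follow the same pattern.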

There are about 250 actions. They could be combined into a smaller number of groups, but plotting the number of requests (or response time, etc.) for each action over time on the same graph probably won't work. Firstly, it'll be way too noisy, and secondly, the absolute numbers don't matter too much: a 100 req/min rise in requests for an often-used, lightweight, cacheable response is much less important than a 100 req/min rise in a seldom-used but expensive (maybe hits the DB) uncacheable response. On the same graph we wouldn't see the changes in requests for the little-used action.
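One way to put busy and quiet actions on a comparable scale is to express each series relative to its own baseline. The sketch below uses the series median as that baseline, which is just one possible choice; `relative_change` and the two sample series are made up for illustration.

```python
from statistics import median


def relative_change(series):
    """Express each value as a fractional deviation from the series median."""
    base = median(series)
    if base == 0:
        return [0.0 for _ in series]  # no meaningful baseline to compare against
    return [(v - base) / base for v in series]


# Both series end with a rise of 100 req/min over the previous level.
busy = [2440, 2317, 2400, 2540]  # often-used, cacheable action
quiet = [20, 22, 21, 121]        # seldom-used, expensive action

print(relative_change(busy)[-1])   # a rise of roughly 5%
print(relative_change(quiet)[-1])  # more than a fourfold rise
```

On a shared axis of fractional change, the jump in the quiet action dominates even though the absolute increase is the same.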

A static report isn't much good: a huge table of numbers is hard to digest. If I aggregate by the hour, we might miss important minute-by-minute changes.

Any suggestions? How are you handling this problem? I guess one way would be to somehow highlight significant changes in the rate of requests or response time per action. A rolling average and standard deviation might show this, but could I do something better?
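A minimal sketch of the rolling average and standard deviation idea, applied to one action's per-minute request counts. The window length and threshold are arbitrary assumptions, and `flag_anomalies` is a hypothetical name.

```python
from collections import deque
from statistics import mean, pstdev


def flag_anomalies(values, window=30, threshold=3.0):
    """Yield (index, value, z) for points far from the trailing window's mean."""
    history = deque(maxlen=window)
    for i, v in enumerate(values):
        if len(history) == window:  # wait until the window is full
            mu = mean(history)
            sigma = pstdev(history)
            if sigma > 0:
                z = (v - mu) / sigma
                if abs(z) > threshold:
                    yield i, v, z
        history.append(v)


# Thirty steady minutes followed by a sudden spike.
counts = [100, 102, 98, 101, 99] * 6 + [180]
for index, value, z in flag_anomalies(counts):
    print(index, value, round(z, 1))
```

Note that flagged points still enter the window here, so a sustained shift stops being reported once it fills the history. Whether that is desirable depends on whether the aim is to catch sudden changes or persistent ones.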

What other metrics or figures could I generate?
