Thursday, January 22, 2004

Is web traffic normally distributed?

I've broken out the MBA textbooks and software to apply some serious analysis to web traffic. My opening data set is about 4 months of traffic from my old employer's website, and to add some punch to my analysis I can rely on the excellent Web Abacus tool to manipulate the data.

So, is it normally distributed? This is the first question I need to answer if I'm going to apply sensible analysis to web traffic. So far the answer is irritatingly negative. Despite drawing two graphs that look pretty normal to me (for visits and visitors) and one that doesn't (for page impressions), the p-value for all three remains stubbornly at 0.00. The p-value tells me how likely I'd be to see data this far from normal if the traffic really were normally distributed, so a value that low means I have to reject normality.
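For anyone who wants to try this at home, here is a rough sketch of one way to get a normality p-value from a list of daily counts. I haven't said which test my software runs, so treat the choice of the Jarque-Bera test here as an assumption made purely for illustration; the function name and the sample numbers are made up too. The test looks at skewness and kurtosis, and it is only approximate for a few months of daily data.

```python
import math


def jarque_bera(samples):
    """Return (JB statistic, p-value) for the Jarque-Bera normality test.

    Assumption: this is an illustrative stand-in, not necessarily the
    test used to produce the p-values quoted in the post.
    """
    n = len(samples)
    mean = sum(samples) / n
    # Central moments in population form (divide by n), as the JB statistic uses.
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a normal distribution
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Under the null hypothesis JB is asymptotically chi-square with 2 degrees
    # of freedom, whose survival function is exactly exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


if __name__ == "__main__":
    # Made-up daily visit counts with a few spike days, for illustration only.
    daily_visits = [1200, 1180, 1250, 1190, 1210, 5400, 1175, 1230, 6100, 1205]
    statistic, p = jarque_bera(daily_visits)
    print(f"JB = {statistic:.2f}, p = {p:.4f}")
```

A small p-value (below 0.05, say) means rejecting the idea that the counts are normally distributed; a large one only means the test found no evidence against it.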

Next up I'm going to try the following tests.

What happens if I ignore weekends?

Update: This makes very little difference. Neither weekend nor weekday traffic is normally distributed.

What happens if I do each day separately?

Update: Friday traffic might have been normally distributed, but Saturday and Thursday traffic wasn't.

What happens if I strip out traffic from spiders and other robots?

Update: Now we're getting somewhere! Once we also threw away weekend traffic, we could conclude that weekday traffic in November was normally distributed.
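The filtering itself can be sketched roughly as below. This is not how Web Abacus does it; the record layout (a date plus a user-agent string per visit), the function names, and the list of robot markers are all assumptions for illustration, and real robot detection needs a far better list than four substrings.

```python
from collections import defaultdict
from datetime import date

# Hypothetical markers; a real robot filter needs a maintained list.
ROBOT_MARKERS = ("bot", "spider", "crawler", "slurp")


def is_robot(user_agent):
    """Crude check: does the user-agent string contain a known robot marker?"""
    ua = user_agent.lower()
    return any(marker in ua for marker in ROBOT_MARKERS)


def weekday_human_visits(records, year, month):
    """Count visits per day for one month, keeping Monday-Friday and non-robots.

    `records` is an iterable of (date, user_agent) pairs, one per visit.
    """
    daily = defaultdict(int)
    for day, user_agent in records:
        if day.year != year or day.month != month:
            continue
        if day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
            continue
        if is_robot(user_agent):
            continue
        daily[day] += 1
    return dict(daily)


if __name__ == "__main__":
    # Made-up visits; the year is an assumption for the example.
    sample = [
        (date(2003, 11, 3), "Mozilla/4.0"),
        (date(2003, 11, 3), "Googlebot/2.1"),
        (date(2003, 11, 1), "Mozilla/4.0"),
    ]
    print(weekday_human_visits(sample, 2003, 11))
```

The resulting daily counts are what would then go into the normality test.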

Right, time for lunch and when I come back I'm going to try and turn these results into something more useful and better supported...

