Tag Archives: data analysis

Data Wallowing

It’s been snowing in the Seattle area and I now live on a very steep hill, so today I worked from home and have been wallowing in website analytics between a couple of sledding breaks.

My data wallows look at everything the analytics reports can serve me: visitors, page views, countries, browser versions, page paths, etc. The more esoteric the data, the more I like it, actually. I tend to find the signals in the extremes: the most popular and least popular stuff. They tell you what to focus on and what to chuck overboard.

I look at the last month, quarter, half-year, and year to get a feel for the trends over time and see how major site updates affected traffic. I look at the top stats for each bucket and also look deep in the long tail to see what’s hiding. I sanity-check the data against my expectations, like the percentage of non-U.S. visitors to the U.S. site (it’s always higher than I expect) and the referring pages (the top entry is really bookmarks instead of search?).

I look at clickmaps of the most and least trafficked pages to get a sense of how the page layout may be influencing clickthroughs.

Then, if I have access to it (at Microsoft I do), I look at the data of the referring sites themselves to see where the site I’m analyzing ranks among their outbound links. I look for customer satisfaction data, customer feedback, planning and marketing data, as well as industry trends for the segment the site’s in.

I search social media and look for positive and negative things about the site in question. I also see what they’re saying about the competition.

Then I spend time thinking about instrumentation gaps and how I can triangulate across or re-query the data sets I do have access to in order to guesstimate the gaps. Examples of gaps I’ve run into in the past include sites that weren’t instrumented by content type or site section.

When I have all this data loaded into my head and spreadsheets, I can finally begin analysis by creating an empirical top task list based on what the data says and comparing that to the expected or desired top task list. Further analysis is a topic for another day.
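To make that comparison concrete, here is a minimal sketch in Python. It assumes page views have already been mapped to task names; the task names, counts, and the helper functions `top_tasks` and `compare_task_lists` are all made up for illustration and don't come from any real analytics tool.

```python
from collections import Counter


def top_tasks(page_views, n=3):
    """Return the n most frequent tasks, most popular first."""
    counts = Counter(page_views)
    return [task for task, _ in counts.most_common(n)]


def compare_task_lists(empirical, expected):
    """Split tasks into ones the data confirms, surprises, and no-shows."""
    return {
        "confirmed": [t for t in empirical if t in expected],
        "surprises": [t for t in empirical if t not in expected],
        "missing": [t for t in expected if t not in empirical],
    }


# Hypothetical data: one entry per page view, already labeled by task.
views = ["download"] * 5 + ["support"] * 3 + ["pricing"] * 2 + ["blog"]
# Hypothetical list of what the team expected visitors to do.
expected = ["download", "pricing", "signup"]

empirical = top_tasks(views)
report = compare_task_lists(empirical, expected)
print(report)
```

The "surprises" and "missing" buckets are usually where the interesting conversations start.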

Data Acquisition and Analysis, and Startups

I had dinner at my in-laws’ over the Christmas break, and my brother-in-law’s father is a retired Boeing engineer in his 80s who used to work in their wind tunnels. He shared some great stories of how moving from analog to computerized methods solved many problems for him and his team, especially around data acquisition and analysis, and it made me think of startups.

One area in particular that got much easier for them was recording air pressure over airfoils at various speeds and angles of attack. They used to have air pressure sensors attached to dozens of needle gages that were mounted on a nearby board. They’d crank up the fan to the desired speed, adjust the angle accordingly, and snap a few pictures of the gages. Any change of speed or angle would have them taking more photographs of the gage board.

The reason they’d take a few pictures per setting was that the needle gages had a lot of noise – they’d bounce around the scale and they needed multiple pictures to average/guess what the “real” reading was across all the photographed readings per wind speed and airfoil angle setting.
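Today that averaging step is a few lines of code. Here is a minimal sketch, assuming each reading is stored as a (speed, angle, pressure) tuple; the numbers and the `average_readings` helper are invented for illustration and have nothing to do with real wind tunnel data.

```python
from collections import defaultdict
from statistics import mean, stdev


def average_readings(readings):
    """Group noisy samples by (speed, angle) and return (mean, spread)."""
    by_setting = defaultdict(list)
    for speed, angle, pressure in readings:
        by_setting[(speed, angle)].append(pressure)
    return {
        # stdev needs at least two samples, so fall back to 0.0 for one.
        setting: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for setting, values in by_setting.items()
    }


# Hypothetical samples: (fan speed, angle of attack, gage reading).
samples = [
    (100, 2, 10.0), (100, 2, 12.0), (100, 2, 11.0),
    (100, 4, 20.0), (100, 4, 22.0),
]

summary = average_readings(samples)
for setting, (avg, spread) in summary.items():
    print(setting, avg, spread)
```

The spread is worth keeping alongside the average, since it tells you how much the needle was bouncing at each setting.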

The time from taking a photo to having it developed, examined, and compared to others in the series, and then recording the results so the tabulated data could be checked against the calculated model, could be weeks. You can also imagine all sorts of ways that errors could creep into the dataset using this method.

In engineering today, most of this type of data is either simulated entirely inside the computer or collected by the computer in massive amounts for further analysis. Data is the new coin of the realm, and the bigger your dataset, the larger the opportunity you have to exploit it for financial gain and hit your target.

Disk is cheap and missing the signal in the noise is expensive.

If your startup isn’t already generating and analyzing datasets, or at least considering which datasets it could create or already has, you might as well be taking photos of physical gages in your quest to build a rocket to the moon.