How do Alexa and Google Analytics track demographics?

How are services like Alexa and Google Analytics able to track visitors' age, gender, college education, and so forth?

http://www.alexa.com/siteinfo/stackoverflow.com

Best answer

Alexa definitely gets its traffic info from its toolbar users. Since that is a relatively small and self-selecting group of people, this inevitably leads to a biased sample (which is why Alexa traffic doesn't match measured traffic on the sites I run). Even with the best statistical techniques for reducing bias, you can never get rid of it entirely when the sample is not drawn uniformly from the population.

It is unclear how Google does it, although it might involve tracking cookies.

A project I have been working on recently has bearing on this question.

Another way to do this (that also has biases, but different ones) would be to use an IP to location service to find the approximate latitude and longitude of each visitor to your site. Then use my project (full disclosure: I run that site and it is commercial):

http://askgeo.com

That gives you demographic information for that location. AskGeo actually provides demographic information on several geographic levels: state, county, county subdivision, city, ZIP code, census tract (a few thousand people), and census block group (about a thousand people). You'd presumably want to use the lowest level (i.e., census block group) for a given latitude and longitude.

The site returns a huge number of demographic variables. The idea would be to use soft counts from the demographic variables provided on the block group level. For example, if you are trying to track the age distribution of your users, you'd use the age ranges provided in the AskGeo response. For each visitor, you'd add a fractional soft count to each range, equal to the percentage of that block group's population that falls in that age range. Take my neighborhood in San Francisco. It has the following age distribution:

  • CensusAgePercent0To4: 7.3%
  • CensusAgePercent5To9: 3.5%
  • CensusAgePercent10To: 3.2%

... (skipping a bit, as you probably get the idea) ...

  • CensusAgePercentOver85: 1.5%

If you got an IP address that you tracked to that census block group, you'd add each of those percentages (as a fraction from 0 to 1) to your (soft) counters for those age ranges. (A soft counter is just a counter that allows for non-integer counts.)
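The soft-count idea can be sketched in a few lines of Python. This is only an illustration: the function names are made up, the dictionaries stand in for whatever the AskGeo response actually looks like, and the second block group's numbers are invented. Only the three age fields quoted above come from the real example.

```python
from collections import defaultdict


def add_soft_counts(counters, block_group_percent):
    """Add one visit's fractional counts; percentages are on a 0-100 scale."""
    for category, percent in block_group_percent.items():
        counters[category] += percent / 100.0
    return counters


def normalize(counters):
    """Turn accumulated soft counts into shares that sum to 1."""
    total = sum(counters.values())
    return {k: v / total for k, v in counters.items()} if total else {}


# Three of the age fields quoted above; a real response has many more.
sf_block_group = {
    "CensusAgePercent0To4": 7.3,
    "CensusAgePercent5To9": 3.5,
    "CensusAgePercentOver85": 1.5,
}
# Invented values for a second, hypothetical block group.
other_block_group = {
    "CensusAgePercent0To4": 5.0,
    "CensusAgePercent5To9": 6.0,
    "CensusAgePercentOver85": 3.0,
}

# Two visits from the first block group, one from the second.
age_counters = defaultdict(float)
for visit in (sf_block_group, sf_block_group, other_block_group):
    add_soft_counts(age_counters, visit)

age_shares = normalize(age_counters)
```

Each visit contributes a total weight equal to the sum of its fractions, so with the full set of age ranges every visitor adds exactly 1 spread across the counters.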

You could do the same with race, gender, income level, house values, etc.

This method also has biases, for sure, since it assumes that all the people in a given block group are equally likely to visit your site. But it is something that you can do on your own site, not just Google and Alexa, and it would still give you a relative sense of who is visiting your site if your soft counts in a given category are higher than the national average in that category.
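One way to do the comparison against the national average is a simple ratio per category. The sketch below assumes you already have soft counts and a visit total; `relative_index` is a made-up name and the baseline percentages are placeholders, not real census figures.

```python
def relative_index(soft_counts, visits, baseline_percent):
    """Ratio of the site's average share to a baseline share, per category.

    A value above 1 suggests the category is over-represented among
    visitors relative to the baseline; below 1, under-represented.
    """
    index = {}
    for category, count in soft_counts.items():
        site_share = count / visits  # average fraction per visit
        baseline = baseline_percent[category] / 100.0
        index[category] = site_share / baseline
    return index


# Hypothetical soft counts accumulated over 3 visits.
soft_counts = {"CensusAgePercent0To4": 0.196, "CensusAgePercentOver85": 0.06}
# Placeholder national percentages, for illustration only.
national = {"CensusAgePercent0To4": 6.0, "CensusAgePercentOver85": 2.0}

index = relative_index(soft_counts, visits=3, baseline_percent=national)
```

The ratio inherits the bias described above, so it is best read as a rough relative signal rather than a measurement.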

It is also possible that a more sophisticated technique than simple direct counts could lead to a much richer result.

Other answers

I did some research, and apparently these demographics are tracked the same way TV audience demographics are tracked. There are people who browse with Alexa's toolbar, which keeps track of the sites visited. These people willingly (?) supply information like age, gender, etc., and Alexa extrapolates the general demographics from this sample. This of course leaves room for bias, but that's a problem with statistics.

Alexa gets its information from browser toolbars that you install on purpose or as part of a bundle with some other software. The toolbar asks questions to learn your demographic parameters and also tracks the sites that you visit. If you know that 80% of a site's visitors are women and a new visitor arrives at that site, you can assume there is a high probability that this person is a woman. If you know many of the sites this person visits, you can guess a lot.

But as http://netberry.co.uk/alexa-rank-explained.htm says, you can rely only on Alexa's information for the top 100,000 sites, because only there does Alexa have enough data from the small number of toolbar users who visit them. They say "millions" of users, but that is a small share of the total.
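The 80% reasoning in this answer can be sketched as a naive Bayes style update. To be clear, this is an illustration and not Alexa's confirmed method: it assumes site visits are independent given gender, that each site's share of women is known from the toolbar panel, and `probability_woman` is a made-up name.

```python
def probability_woman(site_shares, prior=0.5):
    """Combine per-site shares of women into one probability for a visitor.

    site_shares: fraction of women among each visited site's audience,
                 each strictly between 0 and 1.
    prior:       fraction of women in the overall population (assumed 0.5).
    """
    prior_odds = prior / (1.0 - prior)
    odds = prior_odds
    for share in site_shares:
        site_odds = share / (1.0 - share)
        # Likelihood ratio this site contributes, relative to the prior.
        odds *= site_odds / prior_odds
    return odds / (1.0 + odds)


one_site = probability_woman([0.8])
two_sites = probability_woman([0.8, 0.8])
mixed = probability_woman([0.8, 0.2])
```

With a single site and an even prior the result is just that site's share; each additional site that leans the same way pushes the estimate further, and sites that lean opposite ways cancel out.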




