Freelance Project Manager at Chris Baker FPM
31 March 2004 17:10pm
I've recently come across sites publishing statistics involving "number of distinct hosts", but have found it hard to find a good definition of how this is measured.
I gather that it is a count of IP addresses, perhaps with some jiggery-pokery to deal with cached pages and folks who all have the same IP address.
If anyone could give me the lowdown on how this measurement is made, and in what ways it is likely to over- or under-count the number of actual people viewing the site, I'd be most grateful.
Thanks
Technical Director at Box UK
31 March 2004 17:27pm
From: http://www.mangonet.com/mango/faq/stats_info.html
"This is the total number of distinct machines which have requested files from the server. This is not the total number of distinct people who have visited the site. A large ISP may have hundreds of people who log into one machine. If all these people access a given web site, then only one distinct host will be reported by this statistic since all the request came from one machine."
In other words, it is the number of distinct IP addresses that have requested files from your site in a given period. This will potentially be an under-count of the total number of unique visitors. For example, in a shared facility (university, library, etc.), each computer will probably have its own IP address, so if 20 people visit your site from the same machine, they will register as only 1 distinct host.
Also, depending on the kind of network address translation (NAT) or proxying in use, a shared internet connection can send out every request from the same IP address. For example, a large corporation's office in London could have one fat pipe of bandwidth running into it. For proxying reasons, or otherwise, all requests from all internal computers go through an internet connection sharing system. This will, potentially, forward each request on from its own IP address, get the response back, then pass it back to the original requestor. In this case, the same IP address (the same distinct host) will again be registered for all requests, causing an under-count in your stats.
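As a rough illustration of the count described above, here is a minimal Python sketch. It assumes the server writes Common Log Format lines with the client address as the first field; the sample log, the addresses and the function name count_distinct_hosts are all made up for the example, and real stats packages may well do more (for instance, handling cached pages).

```python
from collections import defaultdict

# Hypothetical access-log lines in Common Log Format. The addresses come
# from the ranges reserved for documentation; they are not real traffic.
SAMPLE_LOG = """\
192.0.2.10 - - [31/Mar/2004:17:10:00 +0000] "GET /index.html HTTP/1.0" 200 1043
192.0.2.10 - - [31/Mar/2004:17:11:30 +0000] "GET /about.html HTTP/1.0" 200 2210
198.51.100.7 - - [31/Mar/2004:17:12:05 +0000] "GET /index.html HTTP/1.0" 200 1043
203.0.113.5 - - [31/Mar/2004:17:15:42 +0000] "GET /index.html HTTP/1.0" 200 1043
203.0.113.5 - - [31/Mar/2004:17:16:01 +0000] "GET /index.html HTTP/1.0" 200 1043
"""


def count_distinct_hosts(log_text):
    """Return (number of distinct hosts, requests per host) for a log."""
    requests_per_host = defaultdict(int)
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # The first whitespace-separated field is the requesting address.
        host = line.split(maxsplit=1)[0]
        requests_per_host[host] += 1
    return len(requests_per_host), dict(requests_per_host)


distinct_hosts, per_host = count_distinct_hosts(SAMPLE_LOG)
print("distinct hosts:", distinct_hosts)
print("requests per host:", per_host)
```

Note that the two requests from 203.0.113.5 could have come from one person or from two people behind the same proxy; the log alone cannot tell them apart, which is exactly where the under-count comes from.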
Freelance Project Manager at Chris Baker FPM
01 April 2004 09:17am
Many thanks, that's really helpful. And I notice the link you give links on to a good, short, geek-free explanation of why Mango's (or anyone's) statistics on this point break down: http://www.mangonet.com/mango/faq/stats_accuracy.html
I have seen the statement that distinct hosts will be an undercount by 25%-50% because of the factors you explain, but I guess that this will vary hugely: a site that draws customers from big corporations or from AOL will be more affected than a site with lots of home broadband users.
I recently saw the results of a user survey done by the medical journal publisher bmj.com. For one week every year, "access to the site entails completion of a questionnaire". Among the calculations they can do is the number of distinct hosts among their questionnaire respondents, from which they arrive at a figure of 1.4 individuals per distinct host. Now I know what they mean!
Their survey results are worth a look for anyone wanting to do this kind of research. If you're interested in journal publishing or in content sites that are moving from free access to premium content (BMJ.com is doing this in January 2005), there are additional reasons to look.
Links to their survey data (2003 back to 1997) are at the foot of this page http://bmj.bmjjournals.com/aboutsite/visitorstats.shtml
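For anyone wondering how a figure like "1.4 individuals per distinct host" might be worked out, here is a minimal Python sketch. It assumes each questionnaire response can be paired with the IP address the server saw; the response data and the function name individuals_per_host are invented for illustration and are not taken from the BMJ survey.

```python
# Hypothetical survey responses: (respondent id, IP address seen by the server).
RESPONSES = [
    ("r1", "192.0.2.10"),
    ("r2", "192.0.2.10"),   # shares one proxy/NAT address with r1 and r3
    ("r3", "192.0.2.10"),
    ("r4", "198.51.100.7"),
    ("r5", "203.0.113.5"),
    ("r6", "203.0.113.21"),
    ("r7", "203.0.113.40"),
]


def individuals_per_host(responses):
    """Divide the number of distinct respondents by the number of distinct hosts."""
    people = {respondent for respondent, _ in responses}
    hosts = {ip for _, ip in responses}
    if not hosts:
        raise ValueError("no responses to analyse")
    return len(people) / len(hosts)


ratio = individuals_per_host(RESPONSES)
print(f"{ratio:.1f} individuals per distinct host")
```

A ratio measured this way could then be multiplied by a site's distinct-host count to get a rough estimate of individuals, on the assumption that survey respondents share addresses about as often as visitors in general do.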
On 17:27:51 31 March 2004 Dan Zambonini wrote:
>From: http://www.mangonet.com/mango/faq/stats_info.html
>
>"This is the total number of distinct machines which
>have requested files from the server. This is not the
>total number of distinct people who have visited the site.
>A large ISP may have hundreds of people who log into one
>machine. If all these people access a given web site, then
>only one distinct host will be reported by this statistic
>since all the request came from one machine."
>
>In other words, distinct IP addresses that have visited
>your site in a given time. This will be an under-count
>(potentially) of the total number of unique visitors. For
>example, in a shared facility (university, library, etc),
>each computer will probably have a distinct IP address.
>If 20 people visit your site from that machine, it will
>only register as 1 distinct host.
>
>Also, depending on the type of 'network translation'
>protocol being used, a shared internet connection can
>sometimes send out requests from the same IP address. For
>example, a large corporation's office in London could have
>one fat pipe of bandwidth running into it. For proxying
>reasons, or otherwise, all requests from all internal
>computers go through an Internet Connection Sharing
>System. This will - potentially - forward on the requests
>- from its own IP address - get the response back, then
>give it back to the original requestor. In this case,
>again the same IP address (same distinct host) will be
>registered for all requests, causing an under-count in
>your stats.
Founder at TagMan
16 April 2004 08:41am
It's probably worth mentioning our study into the accuracy of IP- and cookie-based analytics, The RedEye Report. We found that IP-based stats could overestimate unique users by 7.6 times and cookie-based stats could overestimate by 2.3 times. As far as metric definitions go, they are decided by JICWEBS (http://www.jicwebs.org/), and a good list of approved definitions can be found on ABCe's site (http://www.abce.org.uk/cgi-bin/gen5?runprog=abce/abce&type=page&p=definitions.html&menuid=rulesaregs|definitions).
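To put those two factors in concrete terms, here is a small Python sketch. It treats 7.6 and 2.3 as "could be as much as" figures, as the post above puts it; the reported count of 10,000 and the function name adjust_for_overestimate are assumptions made for the example, not part of the study.

```python
def adjust_for_overestimate(reported_uniques, overestimate_factor):
    """Scale a reported unique-user count down by an overestimate factor."""
    if overestimate_factor <= 0:
        raise ValueError("overestimate_factor must be positive")
    return reported_uniques / overestimate_factor


reported = 10_000  # hypothetical "unique users" figure from a stats package

# Factors quoted above: up to 7.6x for IP-based, up to 2.3x for cookie-based.
ip_based_floor = adjust_for_overestimate(reported, 7.6)
cookie_based_floor = adjust_for_overestimate(reported, 2.3)

print("IP-based, if overestimated 7.6 times:", round(ip_based_floor))
print("Cookie-based, if overestimated 2.3 times:", round(cookie_based_floor))
```

These are lower bounds under the quoted factors rather than corrected figures, since the actual overestimate for any given site may be much smaller.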