You are viewing the historical archive of The Philosophy Forum.

A few forum stats

SophistiCat August 10, 2020 at 12:24 4700 views 16 comments
Here are a few stats that I scraped from the public member list.

  • Date: 10-08-2020
  • Total posts: 432,641
  • Registered users: 7,603
  • Users with no posts: 64%
  • Posters (>0 posts) with fewer than 10 posts: 63%
  • Posters with fewer than 626 posts: 95%
  • Posters with a single post ("drive-by"): 25%


Note that these stats include banned and inactive users.
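
For anyone who wants to reproduce figures of this kind, here is a minimal sketch of how they could be computed from a list of per-user post counts. The function name `summarize`, the `big_cutoff` parameter, and the toy data are all made up for illustration; this is not the script that produced the numbers above.

```python
from typing import Dict, Sequence


def share(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1) if whole else 0.0


def summarize(post_counts: Sequence[int], big_cutoff: int = 626) -> Dict[str, float]:
    """Summarize post counts, one entry per registered user (hypothetical input)."""
    users = len(post_counts)
    # "Posters" are the users with at least one post.
    posters = [n for n in post_counts if n > 0]
    return {
        "registered_users": users,
        "total_posts": sum(post_counts),
        # Base for this percentage: all registered users.
        "pct_users_no_posts": share(users - len(posters), users),
        # Base for the remaining percentages: posters only.
        "pct_posters_under_10": share(sum(n < 10 for n in posters), len(posters)),
        "pct_posters_under_cutoff": share(sum(n < big_cutoff for n in posters), len(posters)),
        "pct_posters_single_post": share(sum(n == 1 for n in posters), len(posters)),
    }


# Toy data: ten users, six of whom never posted.
toy_counts = [0, 0, 0, 0, 0, 0, 1, 3, 40, 900]
toy_stats = summarize(toy_counts)
for name, value in toy_stats.items():
    print(f"{name}: {value}")
```

Note that the first percentage uses all registered users as its base while the others use posters only, and that the poster categories overlap, so the figures are not meant to sum to 100%.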

(Click on the links to see graphs)

Number of posts by join date scatter

Number of posts pie chart

Top posters

Comments (16)

Sir2u August 11, 2020 at 00:39 #441861
Reply to SophistiCat I am not great at math nor statistics, but I don't think there are 222% of users here in the forum. :smirk:
Caldwell August 11, 2020 at 03:06 #441882
Quoting Sir2u
Reply to SophistiCat I am not great at math nor statistics, but I don't think there are 222% of users here in the forum. :smirk:


:grin: No, SophistiCat's stats did not add up to 222%. Please read it again.
SophistiCat August 11, 2020 at 12:04 #441980
Corrected one figure: 95% of posters have < 626 posts (was 130). Added a stat for drive-bys.

Reply to Sir2u Note that "posters" are a subset of "users," and posters with < 10 posts are a subset of posters with < 626 posts.
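
As a rough illustration of that nesting, the overlapping figures can be unrolled into non-overlapping groups that do sum to 100%. This is only a sketch: the inputs are the rounded percentages from the opening post, the band labels and variable names are invented here, and the results inherit the rounding of the source figures.

```python
# Figures as posted in the thread (already rounded by the original poster).
no_posts_pct = 64       # share of all registered users
single_post_pct = 25    # share of posters
under_10_pct = 63       # share of posters
under_626_pct = 95      # share of posters

# Posters are the registered users who are not in the "no posts" group.
posters_pct = 100 - no_posts_pct

# Subtract each nested group from the next larger one to get disjoint bands.
bands_of_posters = {
    "1 post": single_post_pct,
    "2-9 posts": under_10_pct - single_post_pct,
    "10-625 posts": under_626_pct - under_10_pct,
    "626+ posts": 100 - under_626_pct,
}

# Re-express every band as a share of all registered users.
bands_of_users = {"0 posts": float(no_posts_pct)}
for label, pct in bands_of_posters.items():
    bands_of_users[label] = posters_pct * pct / 100

for label, pct in bands_of_users.items():
    print(f"{label:>12}: {pct:5.2f}% of registered users")
print(f"{'total':>12}: {sum(bands_of_users.values()):5.2f}%")
```

With these inputs the poster bands come to roughly 9%, 13.7%, 11.5%, and 1.8% of registered users, which together with the 64% who never posted account for everyone.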
Sir2u August 12, 2020 at 01:10 #442170
Quoting Caldwell
:grin: No, SophistiCat's stats did not add up to 222%. Please read it again.


But it appears to.

Quoting SophistiCat
Note that "posters" are a subset of "users," and posters with < 10 posts are a subset of posters with < 626 posts.


As I said, I am not that great at statistics, and the manner of presentation is, I think, unusual as well. In the little studying I ever did of statistics (not that much in engineering), it was always made clear to us that the percentages should always add up to 100%.

One would expect to see something like the following to give a true representation of the data.

Registered users: 7,603
Users with no posts: 64% = 4,865
Etc, etc. until reaching 100%

Including subsets within subsets blurs the reality of the information. One would, I think, like to see statistics that give a clear picture of the quantity of each group, and the total does make more sense when it adds up to 100%
Pantagruel August 12, 2020 at 10:18 #442311
Quoting Sir2u
Including subsets within subsets blurs the reality of the information. One would, I think, like to see statistics that give a clear picture of the quantity of each group, and the total does make more sense when it adds up to 100%


Percentage of people who like green eggs: 71%
Percentage of people who like ham: 85%
Percentage of people who like Dr. Seuss: 93%
etc., etc., etc..

The stats weren't presented as an analysis of the collective, but of the properties of individuals. No wonder I have so much difficulty offering systems theoretical arguments.
fdrake August 12, 2020 at 18:34 #442399
Statistics with fdrake (readers may or may not receive a free car upon reading the post).

Quoting Sir2u
But it appears to.


Quoting SophistiCat
Posters (>0 posts) with fewer than 10 posts: 63%
Posters with fewer than 626 posts: 95%
Posters with a single post ("drive-by"): 25%


These categories aren't mutually exclusive or exhaustive. If you have 1 post, you have <10 posts. If you have <10 posts, you have <626 posts. A list of percentages is only guaranteed to add up to 100% when the percentages represent a mutually exclusive and exhaustive collection of properties of the same population. Exhaustive means everything is counted; mutually exclusive means things are only counted once.

Example - population of people in each continent except Antarctica.

List 1

1 Asia 59.69%
2 Africa 16.36%
3 Europe 9.94%
4 North America 7.79%
5 South America 5.68%
6 Oceania 0.54%

These add to 100% because people who live somewhere in a populated continent live in one of the continents. The list's items are exhaustive: they cover all the population (of people in the populated continents).

And if you currently are in Asia, you can't also currently be in Africa or Europe or North America and so on. The list's items are mutually exclusive, you can only ever be in one list item.

If you added Afro-Eurasia to the list - it has 85.99% of the world population (59.69% + 16.36% + 9.94%). But then it would read:

List 2
1 Asia 59.69%
2 Africa 16.36%
3 Europe 9.94%
4 North America 7.79%
5 South America 5.68%
6 Oceania 0.54%
7 Afro-Eurasia 85.99%

Now they add up to 185.99%. But if you live in Afro-Eurasia, you can live anywhere in Europe or Asia or... It breaks the mutually exclusive thing, since Asians, Africans, and Europeans are counted in their own list entries but also in Afro-Eurasia.

If instead you delete Europe from the original list:

List 3
1 Asia 59.69%
2 Africa 16.36%
[s]3 Europe 9.94%[/s]
4 North America 7.79%
5 South America 5.68%
6 Oceania 0.54%

It adds up to 90.06%. This breaks the exhaustive thing - you can live in Europe but not be on the list.

The reason it's generally expected that lists of %s sum to 100% is that lists of %s are generally used to represent a mutually exclusive and exhaustive collection of properties (like List 1). SophistiCat's collection of properties doesn't have that property, since:

If you have 1 post, you have <10 posts. If you have <10 posts, you have <626 posts.

Having <626 posts behaves like "being in Afro-Eurasia" in List 2: it contains all the 1-post items (since 1 is less than 626) and all the <10 items (since anything less than 10 is also less than 626).
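
The two conditions can also be checked mechanically. Below is a minimal sketch on a made-up population of ten people (the IDs `p1` to `p10`, the helper names, and the group sizes are all hypothetical and chosen only to keep the arithmetic easy; they are not the real continent figures). It mirrors Lists 1, 2, and 3 by testing whether the groups overlap and whether they cover everyone.

```python
from itertools import combinations
from typing import Dict, Set, Tuple

Groups = Dict[str, Set[str]]


def check_partition(population: Set[str], groups: Groups) -> Tuple[bool, bool]:
    """Return (mutually_exclusive, exhaustive) for the groups over the population."""
    # Mutually exclusive: no two groups share a member.
    exclusive = all(a.isdisjoint(b) for a, b in combinations(groups.values(), 2))
    # Exhaustive: every member of the population is in at least one group.
    exhaustive = set().union(*groups.values()) >= population
    return exclusive, exhaustive


def percent_total(population: Set[str], groups: Groups) -> float:
    """Sum of each group's percentage share of the population."""
    return sum(100 * len(members) / len(population) for members in groups.values())


# Hypothetical toy population, not real demographic data.
residents: Groups = {
    "Asia": {"p1", "p2", "p3", "p4", "p5", "p6"},
    "Africa": {"p7", "p8"},
    "Europe": {"p9"},
    "Americas and Oceania": {"p10"},
}
population = set().union(*residents.values())

list1 = dict(residents)
# Like List 2: add a group that overlaps three existing ones.
list2 = {**residents, "Afro-Eurasia": residents["Asia"] | residents["Africa"] | residents["Europe"]}
# Like List 3: drop one group so that someone is left uncounted.
list3 = {name: members for name, members in residents.items() if name != "Europe"}

for label, groups in [("List 1", list1), ("List 2", list2), ("List 3", list3)]:
    exclusive, exhaustive = check_partition(population, groups)
    total = percent_total(population, groups)
    print(f"{label}: exclusive={exclusive}, exhaustive={exhaustive}, total={total:.1f}%")
```

Only the first toy list passes both checks, and it is also the only one whose percentages total 100%. The overlapping list overshoots and the incomplete list falls short.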





SophistiCat August 13, 2020 at 06:58 #442569
Reply to Sir2u Would it help to put it this way?

2/3 of registered users have never posted
Of those who have posted, 2/3 have under 10 posts

Or if you like pies (who doesn't?) Edit: added to the OP
Michael August 13, 2020 at 09:50 #442592
Can you do a growth chart based on user sign up date?

Also it might just be me but all I see is this:

User image

A direct link to the image gives a 403 "Your client does not have permission to get URL..."
Jamal August 13, 2020 at 12:33 #442643
Quoting Michael
Also it might just be me but all I see is this:


Quoting Michael
A direct link to the image gives a 403 "Your client does not have permission to get URL..."


Me too. Maybe only you can see them @SophistiCat.
Sir2u August 14, 2020 at 00:15 #442824
Quoting Pantagruel
Percentage of people who like green eggs: 71%
Percentage of people who like ham: 85%
Percentage of people who like Dr. Seuss: 93%
etc., etc., etc..


Now this is much clearer, nothing like the other set. It is obvious that we have a hundred percent of people and that a percentage of them like certain things. No subsets involved, just different objects to like or dislike.

Quoting Pantagruel
The stats weren't presented as an analysis of the collective, but of the properties of individuals. No wonder I have so much difficulty offering systems theoretical arguments.


But as an individual I might fall into several of those groups.

Posters (>0 posts) with fewer than 10 posts: 63% - if I have made 1 post I am in this group
Posters with fewer than 626 posts: 95% - if I have made 1 post I am in this group
Posters with a single post ("drive-by"): 25% - if I have made 1 post I am in this group

As I said, the data is not really clear.
Sir2u August 14, 2020 at 00:20 #442825
Quoting SophistiCat
Would it help to put it this way?

2/3 of registered users have never posted
Of those who have posted, 2/3 have under 10 posts


Here again you are creating sub groups to try to explain data.

Quoting SophistiCat
Or if you like pies (who doesn't?)


Pie charts are much easier to understand, because they are based on 100% and split into parts equivalent to the percentages. Have you ever tried to create a pie where subgroups appear in several places in the chart?
Caldwell August 14, 2020 at 01:06 #442830
Quoting Sir2u
Pie charts are much easier to understand, because they are based on 100% and split into parts equivalent to the percentages. Have you ever tried to create a pie where subgroups appear in several places in the chart?


A Venn diagram will work here, to present the overlapping counts visually.
Sir2u August 14, 2020 at 02:32 #442844
Quoting Caldwell
A Venn diagram will work here, to present the overlapping counts visually.


But they are not that good at representing percentages of a whole.
Caldwell August 14, 2020 at 02:47 #442847
Quoting Sir2u
Including subsets within subsets blurs the reality of the information. One would, I think, like to see statistics that give a clear picture of the quantity of each group, and the total does make more sense when it adds up to 100%


Admittedly, yes. If you read polls by Gallup, for example, they try to make it as basic as possible for the general public. If you want to count individuals under more than one attribute, keep adding header titles that highlight the different attributes.
SophistiCat August 14, 2020 at 06:17 #442906
Reply to Sir2u Don't get hung up on this; there's more than one way to present data, depending on what you want to highlight. Posters are a distinct category, because they are who you actually see and interact with on the forum, so it made sense to me to split that category instead of the overall number of registered users. Under 10 posts seemed to me like a representative group before I did the actual count, and so it turned out to be. 95% is kind of a magic number that statisticians like to use. And single-posters are an interesting outlier in themselves; I expected to see a lot of these, but not quite as many.

Reply to Michael Reply to jamalrob Shouldn't try to cheat Google :) Added links instead.
Sir2u August 15, 2020 at 01:57 #443131
Quoting SophistiCat
Don't get hung up on this;


I ain't. :up: