A few forum stats
Here are a few stats that I scraped from the public member list.
Note that these stats include banned and inactive users.
(Click on the links to see graphs)
Number of posts by join date scatter
Number of posts pie chart
Top posters
- Date: 10-08-2020
- Total posts: 432,641
- Registered users: 7,603
- Users with no posts: 64%
- Posters (>0 posts) with fewer than 10 posts: 63%
- Posters with fewer than 626 posts: 95%
- Posters with a single post ("drive-by"): 25%
Comments (16)
:grin: No, SophistiCat's stats did not add up to 222%. Please read it again.
Note that "posters" are a subset of "users," and posters with < 10 posts are a subset of posters with < 626 posts.
But it appears to.
Quoting SophistiCat
As I said, I am not that great at statistics, and the manner of presentation is, I think, unusual as well. In the little studying I ever did of statistics (not that much in engineering), it was always made clear to us that the percentages should always add up to 100%.
One would expect to see something like the following to give a true representation of the data.
Registered users: 7,603
Users with no posts: 64% = 4,865
Etc, etc. until reaching 100%
Including subsets within subsets blurs the reality of the information. One would, I think, like to see statistics that give a clear picture of the quantity of each group, and the total does make more sense when it adds up to 100%.
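For what it's worth, a breakdown of that kind can be roughly reconstructed from the posted figures. This is only a sketch: it assumes the three "posters" percentages (63%, 95%, 25%) are all shares of users with at least one post, and because the published percentages are rounded, the head counts are approximate. The band labels are made up for the illustration.

```python
# Figures from the opening post. The percentages there are rounded,
# so the head counts printed below are approximations.
registered = 7603
no_posts = 0.64    # share of all registered users
under_10 = 0.63    # share of posters (>0 posts); includes single-post users
under_626 = 0.95   # share of posters; includes everyone under 10
single = 0.25      # share of posters (assumed, like the two above)

posters = 1 - no_posts  # share of all users who have posted at least once

# Turn the nested groups into non-overlapping bands by subtracting
# each smaller group from the next larger one.
bands = {
    "0 posts": no_posts,
    "1 post": posters * single,
    "2-9 posts": posters * (under_10 - single),
    "10-625 posts": posters * (under_626 - under_10),
    "626+ posts": posters * (1 - under_626),
}

for label, share in bands.items():
    print(f"{label:>13}: {share:6.2%}  ~{round(share * registered):,} users")

total = sum(bands.values())
print(f"{'total':>13}: {total:6.2%}")
```

Each user falls into exactly one band, which is why this version totals 100% while the original list does not.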
Percentage of people who like green eggs: 71%
Percentage of people who like ham: 85%
Percentage of people who like Dr. Seuss: 93%
etc., etc., etc.
The stats weren't presented as an analysis of the collective, but of the properties of individuals. No wonder I have so much difficulty offering systems theoretical arguments.
Quoting Sir2u
Quoting SophistiCat
These categories aren't mutually exclusive or exhaustive. If you have 1 post, you have <10 posts. If you have <10 posts, you have <626 posts. A list of percentages is only ensured to add up to 100% when they represent a mutually exclusive and exhaustive collection of properties regarding the same population. Exhaustive means everything is counted, mutually exclusive means things are only counted once.
Example - population of people in each continent except Antarctica.
List 1
1 Asia 59.69%
2 Africa 16.36%
3 Europe 9.94%
4 North America 7.79%
5 South America 5.68%
6 Oceania 0.54%
These add to 100% because people who live somewhere in a populated continent live in one of the continents. The list's items are exhaustive: the list covers all the population (of people in the populated continents).
And if you currently are in Asia, you can't also currently be in Africa or Europe or North America and so on. The list's items are mutually exclusive, you can only ever be in one list item.
If you added Afro-Eurasia to the list - it has 85.99% of the world population (the sum of the Asia, Africa and Europe entries). But then it would read:
List 2
1 Asia 59.69%
2 Africa 16.36%
3 Europe 9.94%
4 North America 7.79%
5 South America 5.68%
6 Oceania 0.54%
7 Afro-Eurasia 85.99%
Now they add up to 185.99%. But if you live in Afro-Eurasia, you can live anywhere in Europe or Asia or... It breaks the mutually exclusive thing, since Asians, Africans and Europeans are counted in their own list entries but also in Afro-Eurasia.
If instead you delete Europe from the original list:
List 3
1 Asia 59.69%
2 Africa 16.36%
[s]3 Europe 9.94%[/s]
4 North America 7.79%
5 South America 5.68%
6 Oceania 0.54%
It adds up to 90.06%. This breaks the exhaustive thing - you can live in Europe but not be on the list.
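The three lists can be checked with a few lines of Python. This is just a sketch using the continent shares quoted above; the Afro-Eurasia share is computed here from the Asia, Africa and Europe rows, so it may differ slightly from a separately quoted figure. The helper name `total` is only for illustration.

```python
# Shares of world population (percent), copied from List 1.
list_1 = {
    "Asia": 59.69, "Africa": 16.36, "Europe": 9.94,
    "North America": 7.79, "South America": 5.68, "Oceania": 0.54,
}

def total(shares):
    """Sum a dict of percentages, rounded to two decimals."""
    return round(sum(shares.values()), 2)

# List 2: add a category that overlaps three existing ones.
# Its share is derived from the three rows it contains.
afro_eurasia = round(list_1["Asia"] + list_1["Africa"] + list_1["Europe"], 2)
list_2 = {**list_1, "Afro-Eurasia": afro_eurasia}

# List 3: drop one category, so part of the population goes uncounted.
list_3 = {name: pct for name, pct in list_1.items() if name != "Europe"}

print("List 1:", total(list_1))  # exclusive and exhaustive
print("List 2:", total(list_2))  # not exclusive: overshoots by the overlap
print("List 3:", total(list_3))  # not exhaustive: undershoots by the gap
```

The overshoot in List 2 equals the overlapping share, and the shortfall in List 3 equals the share of the category that was dropped.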
The reason it's generally expected that lists of %s sum to 100% is that lists of % are generally used to represent a mutually exclusive and exhaustive collection of properties (like List 1). SophistiCat's collection of properties doesn't have that property, since:
If you have 1 post, you have <10 posts. If you have <10 posts, you have <626 posts.
Having <626 posts behaves like "being in Afro-Eurasia" in List 2: it contains all the <1 items (since 0 is less than 626) and all the <10 items (since 10 is less than 626).
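To make the two conditions concrete, here is a small sketch that tests a set of categories against a population. The post counts are invented for illustration and are not taken from the forum data; the category tests mirror the three "posters" properties from the opening post.

```python
# Hypothetical post counts for seven posters (illustration only).
post_counts = [1, 1, 3, 8, 40, 300, 900]

# The three overlapping properties from the opening post.
categories = {
    "single post": lambda n: n == 1,
    "fewer than 10": lambda n: n < 10,
    "fewer than 626": lambda n: n < 626,
}

def memberships(n):
    """Names of every category that a poster with n posts falls into."""
    return [name for name, test in categories.items() if test(n)]

# Mutually exclusive: nobody is counted more than once.
mutually_exclusive = all(len(memberships(n)) <= 1 for n in post_counts)
# Exhaustive: everybody is counted at least once.
exhaustive = all(len(memberships(n)) >= 1 for n in post_counts)

# Share of the population in each category; these only have to sum
# to 1 when both conditions above hold.
shares = {
    name: sum(test(n) for n in post_counts) / len(post_counts)
    for name, test in categories.items()
}

print(memberships(1))
print("mutually exclusive:", mutually_exclusive)
print("exhaustive:", exhaustive)
print("sum of shares:", round(sum(shares.values()), 2))
```

A poster with one post lands in all three categories and the poster with 900 posts lands in none, so both conditions fail for this made-up sample and the shares have no reason to sum to 100%.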
2/3 registered users have never posted
Of those who have posted, 2/3 have under 10 posts
Or if you like pies (who doesn't?) Edit: added to the OP
Also it might just be me but all I see is this:
A direct link to the image gives a 403 "Your client does not have permission to get URL..."
Quoting Michael
Me too. Maybe only you can see them @SophistiCat.
Now this is much clearer, nothing like the other set. It is obvious that we have a hundred percent of people and that a percentage of them like certain things. No subsets involved, just different objects to like or dislike.
Quoting Pantagruel
But as an individual I might fall into several of those groups.
Posters (>0 posts) with fewer than 10 posts: 63% - if I have made 1 post I am in this group
Posters with fewer than 626 posts: 95% - if I have made 1 post I am in this group
Posters with a single post ("drive-by"): 25% - if I have made 1 post I am in this group
As I said, the data is not really clear.
Here again you are creating sub groups to try to explain data.
Quoting SophistiCat
Pie charts are much easier to understand, because they are based on 100% and split into parts equivalent to the percentages. Have you ever tried to create a pie where subgroups appear in several places in the chart?
A Venn diagram would work here, to present the visual overlap of counts.
But they are not that good at representing percentages of a whole.
Admittedly, yes. If you read polls by Gallup, for example, they try to make it as basic as possible for the general public. If you want to count individuals under more than one attribute, keep making header titles that highlight the different attributes.
Shouldn't try to cheat Google :) Added links instead.
I ain't. :up: