Baseball wins seem not to follow a normal distribution

There is a great R Package dataset of football results by James Curley here. This engsoccerdata has a function to generate soccer league tables of many countries over a long time period.

` ````
league<-maketable_all(df=england[,])
```

team GP W D L gf ga gd Pts Pos

1 Manchester United 962 604 209 149 1856 847 1009 2021 1

and create a new column for the percentage of wins

` ````
league<-league %>%
mutate(PercentW = W / GP)
p<-ggplot(data=league, aes(league$PercentW)) + geom_histogram()
#binwidth=20
p<-p + ggtitle("Percentage wins\n in English football league") + xlab("Percentage Wins") + ylab("Number of Teams")
p<-p+theme_update(plot.title = element_text(hjust = 0.5))
p<-p + theme_bw()
```

` ````
library(fitdistrplus)
library(logspline)
x<-league$PercentW
fit.norm <- fitdist(x, "norm")
plot(fit.norm)
```

` ``shapiro.test(x) `

Shapiro-Wilk normality test
data: x W = 0.96276, p-value = 0.0006663 Which means English football wins really do not have a normal distribution.

Goals per game are also not normally distributed. But I dont think anyone expectes them to be

` ````
league<-league %>%
mutate(GoalsPgame = gf / GP)
shapiro.test(league$GoalsPgame)
```

Shapiro-Wilk normality test data: x W = 0.92134, p-value = 4.818e-07

And for France

Shapiro-Wilk normality test

data: leagueF$PercentW
W = 0.98522, p-value = 0.4699 so French football wins ~~do not have~~ might have (thanks for Paulfor the correction in the comments) a normal distribution. I must check the other leagues in the dataset as behaviour this different is odd.