# What are typical football results II?

This post continues "What are typical football results?". There, the notions of weaker and stronger were not made precise. Is it true that a team with, say, 5 Elo points more than another team is really stronger? What might be an appropriate threshold? A glance at the current Elo ranking suggests that teams within 50 points of each other may be considered equally strong. But is this true? At which threshold do the probabilities of a win, draw, or loss change?
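As a rough reference point, the textbook Elo formula maps a rating difference to an expected score, where a win counts 1 and a draw 0.5. The sketch below assumes the usual scale of 400; the function name `elo_expected` is made up for illustration. The formula says nothing about how the expected score splits into wins, draws and losses, which is why the empirical frequencies are examined afterwards.

```r
# Expected score of the higher-rated team under the standard Elo formula.
# elo_diff: rating difference (stronger minus weaker).
# scale = 400 is the usual convention and an assumption here.
elo_expected <- function(elo_diff, scale = 400) {
  1 / (1 + 10^(-elo_diff / scale))
}

# Expected scores for a few rating differences
diffs <- c(5, 50, 100)
print(round(elo_expected(diffs), 3))
```

Under this formula, a 5-point gap gives an expected score of about 0.51, barely distinguishable from a coin flip, while 50 points gives roughly 0.57.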

```r
library(dplyr)
library(ggplot2)
library(reshape2)  # assumed source of melt()

# DataEloNeutral2 is assumed to be the match data from the previous post.
# One row per threshold: percentage of wins, draws and losses of the stronger team.
result <- data.frame(Win = rep(0, 40),
                     Draw = rep(0, 40),
                     Lose = rep(0, 40))
for (i in 1:40) {
  # Matches with an Elo difference above i*10; 3 = win, 1 = draw, 0 = loss
  DataEloNeutral2 %>%
    filter(EloStrongerBefore - EloWeakerBefore > i * 10) %>%
    mutate(Result = 3 * (GoalStronger > GoalWeaker) + (GoalStronger == GoalWeaker)) %>%
    count(Result) %>%
    arrange(desc(Result)) %>%
    transmute(Result = as.character(Result), Freq = round(n / sum(n), 2) * 100) -> df
  result[i, ] <- df$Freq
}
result$Diff <- 1:40 * 10
# Long format for plotting: one row per threshold and outcome
resultMelted <- melt(result, id = "Diff")
resultMelted <- transmute(resultMelted, Diff = Diff, Result = variable, Freq = value)
ggplot(data = resultMelted, aes(x = Diff, y = Freq, col = Result)) +
  geom_line() +
  geom_point() +
  labs(x = "Elo difference", y = "Percentage (%)")
```

Unfortunately, there is no clear change point. The frequency of draws stays more or less constant up to an Elo difference of 100 points, while the probability of a win starts to change around an Elo difference of 50. Since this last observation matches our first intuition, let us take 50 Elo points as the threshold. This might not be very scientific, but it will do for the moment.
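The loop above assigns `df$Freq` to a row with three columns, which presumably only works as long as every threshold leaves at least one win, one draw and one loss in the data. A minimal base-R sketch of a helper that always returns all three categories is shown below; `outcome_shares` and the toy scores are invented for illustration and are not part of the original analysis.

```r
# Percentage of wins, draws and losses from the stronger team's point of view.
# A factor with fixed levels keeps categories with zero matches in the table.
outcome_shares <- function(goal_stronger, goal_weaker) {
  outcome <- ifelse(goal_stronger > goal_weaker, "Win",
                    ifelse(goal_stronger == goal_weaker, "Draw", "Lose"))
  outcome <- factor(outcome, levels = c("Win", "Draw", "Lose"))
  round(100 * prop.table(table(outcome)))
}

# Made-up scores for illustration only: the stronger team never loses here.
toy_stronger <- c(2, 1, 3, 0, 1)
toy_weaker   <- c(0, 1, 1, 0, 0)
shares <- outcome_shares(toy_stronger, toy_weaker)
print(shares)
```

With the toy scores, the "Lose" category still appears in the result with a share of zero, so a row assignment of length three would not fail.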

## Result of teams of same strength

Let us see what the typical results are when two equally strong teams are playing against each other.

```r
# Matches between teams at most 50 Elo points apart
DataEloNeutral2 %>%
  filter(EloStrongerBefore - EloWeakerBefore < 51) %>%
  select(GoalStronger, GoalWeaker) -> df
# Relative frequency of each scoreline
tdf <- round(table(df) / nrow(df), 3)
df2 <- as.data.frame(tdf)
# Keep scorelines that occur in more than 5% of matches, as percentages
df3 <- df2 %>%
  filter(Freq > 0.05) %>%
  arrange(desc(Freq)) %>%
  transmute(Result = paste(GoalStronger, GoalWeaker, sep = ":"), Freq = Freq * 100)
ggplot(data = df3, aes(x = reorder(Result, Freq), y = Freq)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  geom_text(aes(label = Freq), hjust = -0.3, size = 3.5, color = "#993333") +
  labs(x = "Result", y = "Percentage (%)", caption = "Based on all matches of the participants of 2018 FIFA World Cup (plus Italy, Netherlands \n and Austria)  against teams with at least 1600 Elo points between 1/1/2000 and 31/12/2017. \n Showing only results with frequency above 5% ") +
  ggtitle("Probabilities of football match outcomes \n  Stronger vs. almost as strong") +
  theme(plot.title = element_text(hjust = 0.5),
        plot.caption = element_text(hjust = 0.5)) +
  coord_flip(ylim = c(0, 12.5))
```

This is somewhat surprising: the slightly weaker team seems to have an advantage here! Let us check the probabilities of a win, draw, or loss.

```r
# Share of wins (3), draws (1) and losses (0) of the stronger team, in percent
DataEloNeutral2 %>%
  filter(EloStrongerBefore - EloWeakerBefore < 51) %>%
  mutate(Result = 3 * (GoalStronger > GoalWeaker) + (GoalStronger == GoalWeaker)) %>%
  count(Result) %>%
  arrange(desc(Result)) %>%
  transmute(Result = as.character(Result), Freq = round(n / sum(n), 2) * 100) -> df
# Bar order on the x-axis: win (3), loss (0), draw (1)
df$Result <- factor(df$Result, levels = c(3, 0, 1))

ggplot(data = df, aes(x = Result, y = Freq)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  geom_text(aes(label = Freq), vjust = -0.3, size = 3.5, color = "#993333") +
  labs(x = "Win, draw or lose", y = "Percentage (%)", caption = "Based on all matches of the participants of 2018 FIFA World Cup (plus Italy, Netherlands\n and Austria)  against teams with at least 1600 Elo points between 1/1/2000 and 31/12/2017. ") +
  ggtitle("Result of football match \n Stronger vs. almost as strong") +
  theme(plot.title = element_text(hjust = 0.5),
        plot.caption = element_text(hjust = 0.5))
```

The stronger team still has a higher chance of winning, even though the most likely outcomes are 1:1 and 0:1.
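The two observations are not contradictory: a single scoreline can be the most frequent one while the wins, summed over all their scorelines, still outweigh the losses. The base-R sketch below illustrates this with made-up scoreline shares. The numbers are hypothetical and are not the values from the data.

```r
# Hypothetical scoreline shares in percent (stronger:weaker); not taken from the data.
scorelines <- data.frame(
  goal_stronger = c(1, 0, 1, 2, 2, 0, 3, 1, 0),
  goal_weaker   = c(1, 1, 0, 0, 1, 0, 1, 2, 2),
  share         = c(14, 13, 12, 12, 12, 11, 10, 8, 8)
)

# Classify each scoreline from the stronger team's point of view
outcome <- with(scorelines,
                ifelse(goal_stronger > goal_weaker, "Win",
                       ifelse(goal_stronger == goal_weaker, "Draw", "Lose")))

# Sum the shares per outcome
totals <- tapply(scorelines$share, outcome, sum)
print(totals[c("Win", "Draw", "Lose")])

# The two most frequent individual scorelines
top <- scorelines[order(-scorelines$share)[1:2], ]
print(paste(top$goal_stronger, top$goal_weaker, sep = ":"))
```

In this toy table the two most frequent scorelines are a draw and a narrow loss, yet the win share is the largest of the three totals, because wins are spread over more distinct scorelines.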

## Result of teams of different strength

Let us see what the typical results are when a stronger team plays a clearly weaker one, that is, when the Elo difference exceeds 50 points.

```r
# Matches between teams more than 50 Elo points apart
DataEloNeutral2 %>%
  filter(EloStrongerBefore - EloWeakerBefore > 50) %>%
  select(GoalStronger, GoalWeaker) -> df
# Relative frequency of each scoreline
tdf <- round(table(df) / nrow(df), 3)
df2 <- as.data.frame(tdf)
# Keep scorelines that occur in more than 5% of matches, as percentages
df3 <- df2 %>%
  filter(Freq > 0.05) %>%
  arrange(desc(Freq)) %>%
  transmute(Result = paste(GoalStronger, GoalWeaker, sep = ":"), Freq = Freq * 100)
ggplot(data = df3, aes(x = reorder(Result, Freq), y = Freq)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  geom_text(aes(label = Freq), hjust = -0.3, size = 3.5, color = "#993333") +
  labs(x = "Result", y = "Percentage (%)", caption = "Based on all matches of the participants of 2018 FIFA World Cup (plus Italy, Netherlands \n and Austria)  against teams with at least 1600 Elo points between 1/1/2000 and 31/12/2017. \n Showing only results with frequency above 5% ") +
  ggtitle("Probabilities of football match outcomes \n  Stronger vs. weaker") +
  theme(plot.title = element_text(hjust = 0.5),
        plot.caption = element_text(hjust = 0.5)) +
  coord_flip(ylim = c(0, 13))
```

This is more or less as expected. Let us check the probabilities of a win, draw, or loss.

```r
# Share of wins (3), draws (1) and losses (0) of the stronger team, in percent
DataEloNeutral2 %>%
  filter(EloStrongerBefore - EloWeakerBefore > 50) %>%
  mutate(Result = 3 * (GoalStronger > GoalWeaker) + (GoalStronger == GoalWeaker)) %>%
  count(Result) %>%
  arrange(desc(Result)) %>%
  transmute(Result = as.character(Result), Freq = round(n / sum(n), 2) * 100) -> df
# Bar order on the x-axis: win (3), loss (0), draw (1)
df$Result <- factor(df$Result, levels = c(3, 0, 1))

ggplot(data = df, aes(x = Result, y = Freq)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  geom_text(aes(label = Freq), vjust = -0.3, size = 3.5, color = "#993333") +
  labs(x = "Win, draw or lose", y = "Percentage (%)", caption = "Based on all matches of the participants of 2018 FIFA World Cup (plus Italy, Netherlands\n and Austria)  against teams with at least 1600 Elo points between 1/1/2000 and 31/12/2017. ") +
  ggtitle("Result of football match \n Stronger vs. weaker") +
  theme(plot.title = element_text(hjust = 0.5),
        plot.caption = element_text(hjust = 0.5))
```

While the stronger team wins every second game, the weaker team still wins every fourth. Is this the reason why football is such a successful sport?
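Read as rough shares, "every second game" and "every fourth game" leave about a quarter of the matches as draws. Assuming the usual three points for a win and one for a draw, the small sketch below converts these rounded shares into expected points per match. The figures 0.50 and 0.25 are a simplification of the statement above, not exact values read from the chart.

```r
# Rounded outcome shares from the stronger team's point of view (simplified assumption)
p_win  <- 0.50
p_lose <- 0.25
p_draw <- 1 - p_win - p_lose

# Expected points per match with 3 points for a win and 1 for a draw
points_stronger <- 3 * p_win + 1 * p_draw
points_weaker   <- 3 * p_lose + 1 * p_draw
print(c(stronger = points_stronger, weaker = points_weaker))
```

Under these assumptions the stronger team collects 1.75 points per match on average and the weaker team 1 point, so the underdog is far from empty-handed.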