BBO Discussion Forums: Volleyball Math Problem - BBO Discussion Forums


Volleyball Math Problem: Need help from Markov Mavens

#1 User is offline   mgoetze 

  • Group: Advanced Members
  • Posts: 4,942
  • Joined: 2005-January-28
  • Gender:Male
  • Location:Cologne, Germany
  • Interests:Sleeping, Eating

Posted 2015-February-12, 09:03

So, I implemented a rating system for volleyball. The rating system works such that Team 1 with rating r1 is expected to score r1/r2 times as many points as Team 2 with rating r2.

I now want to run Monte Carlo simulations of a volleyball matchup to predict the outcome. This would seem to be quite easy - a given point is won by Team 1 with probability r1/(r1+r2) and by Team 2 with probability r2/(r1+r2). Repeat until you get to 25 points 3 times (or to 15 points, or to 2 points more than the opponent... you get the idea).

However, as you probably know, in volleyball, the team that won the previous point serves next. As you may or may not know, the serving team will only win approximately s = 38% of the points.

So for an even matchup it's easy to write a Monte Carlo simulation, the serving team wins the point with probability s and the other team wins with probability 1-s.
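The even-matchup simulation described above could be sketched roughly as follows. This is only an illustration: the function name `simulate_set`, the choice that team A serves first, and the defaults (sets to 25, win by 2) are assumptions made for the example, not anything fixed in the post.

```python
import random

def simulate_set(s=0.38, target=25, rng=None):
    """Simulate one rally-scoring set between two evenly matched teams.

    s is the probability that the serving team wins the rally.
    Returns the final score (points_a, points_b). Team A serves first
    (an assumption made for this sketch).
    """
    rng = rng or random.Random()
    score = [0, 0]
    server = 0  # index of the team currently serving
    while True:
        # The serving team wins the rally with probability s.
        winner = server if rng.random() < s else 1 - server
        score[winner] += 1
        server = winner  # the rally winner serves next
        # A set ends at `target` points with a lead of at least 2.
        if max(score) >= target and abs(score[0] - score[1]) >= 2:
            return tuple(score)

rng = random.Random(1)
results = [simulate_set(rng=rng) for _ in range(2000)]
print(sum(a > b for a, b in results) / len(results))  # share of sets won by team A
```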

What should the probabilities be for an uneven matchup?
"One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision"
    -- Bertrand Russell

#2 User is offline   helene_t 

  • The Abbess
  • Group: Advanced Members
  • Posts: 17,199
  • Joined: 2004-April-22
  • Gender:Female
  • Location:Copenhagen, Denmark
  • Interests:History, languages

Posted 2015-February-12, 09:54

Suppose team 1 and team 2 have probabilities p1 and p2, respectively, of winning a point served by themselves.

Let's look at the event that team 1 (which serves first) wins the first set during its k'th serving turn. This means that
a ) team 1 makes 15 points in k serving turns while
b ) team 1 does not make 15 points in k-1 serving turns while
c ) team 2 does not make 15 points in k-1 serving turns

That team 1 makes (at least) 15 points in k-1 serving turns is a negative binomial event since it boils down to team 1 winning 15 of their own serves before they lose k-1 of their own serves.
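As a small illustration of the negative binomial event just described (side-out scoring, where a team only scores on its own serve), the probability of winning n serves before losing m of them can be computed directly. The function name and arguments are made up for this sketch; with m = k-1 and n = 15 it corresponds to "at least 15 points in k-1 serving turns", assuming each serve is won independently with probability p.

```python
from math import comb

def prob_n_wins_before_m_losses(p, n=15, m=1):
    """P(a team wins n of its own serves before losing m of them).

    Each serve is won independently with probability p. The number of
    losses j seen before the n-th win follows a negative binomial law,
    so we sum its probabilities over j = 0 .. m-1.
    """
    return sum(comb(n - 1 + j, j) * p**n * (1 - p)**j for j in range(m))

# Example: chance of 15 points within 10 serving turns when p = 0.38.
print(prob_n_wins_before_m_losses(0.38, n=15, m=10))
```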

Obviously a and b are not independent. I think the easiest way to solve this is to break it down into the following sub-events:
- team 1 scores exactly 14 points in k-1 serving turns and at least 1 in the k'th turn
- team 1 scores exactly 13 points in k-1 serving turns and at least 2 in the k'th turn
- etc, down to maybe 9 or 10

The above is just a mathematical or computational problem. Then there is the sports modelling problem of expressing p1 and p2 as functions of a single strength ratio parameter. There has been some work on this, for example: http://www.ingentaco...000001/art00017
They model the data at a more detailed level, introducing parameters for successful blocks, aces and the like. That may be overkill for your purpose.
The world would be such a happy place, if only everyone played Acol :) --- TramTicket

#3 User is offline   mgoetze 


Posted 2015-February-12, 11:38

Helene, I think you are describing pre-1999 volleyball. Nowadays teams also get a point when they win the rally on the other team's serve, and sets go to 25 (except the 5th set). A complete description of volleyball as a Markov chain can be found in this paper: http://strategicgame...au/article9.pdf . I figure they are doing too much work; as I said, I just want to use Monte Carlo instead of doing all that math.

My problem is choosing p1 and p2 in such a manner that they are consistent with my rating system, which predicts that Team 1 will score r1/r2 times as many points as Team 2. So if Team 1 serves s1 times and Team 2 serves s2 times, I want

r2*E[p1*s1 + (1-p2)*s2] = r1*E[p2*s2 + (1-p1)*s1]

to hold true. This is difficult because s1 and s2 are dependent on p1 and p2.
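Because s1 and s2 depend on p1 and p2, one way to check a candidate pair (p1, p2) against the target ratio r1/r2 is to estimate the point ratio by simulation. The sketch below is only a rough illustration: the function name, the seed handling and the simplification of ignoring set boundaries (one long run of rallies, team 1 serving first) are all assumptions of the example.

```python
import random

def simulate_point_ratio(p1, p2, rallies=200_000, seed=0):
    """Estimate (points won by team 1) / (points won by team 2).

    p1, p2: probability that team 1 / team 2 wins a rally on its own serve.
    Set boundaries are ignored; this is one long sequence of rallies.
    """
    rng = random.Random(seed)
    points = [0, 0]
    p_serve = (p1, p2)
    server = 0  # team 1 (index 0) serves first
    for _ in range(rallies):
        winner = server if rng.random() < p_serve[server] else 1 - server
        points[winner] += 1
        server = winner  # the rally winner serves next
    return points[0] / points[1]

# Compare the estimate with the rating ratio r1/r2 and adjust p1, p2 accordingly.
print(simulate_point_ratio(0.45, 0.31))
```

In a long run of rallies the ratio should approach (1-p2)/(1-p1), since each team's points equal the number of serves it gets, so the simulation mainly serves as a sanity check on whatever formula is chosen.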

(Meanwhile, the paper you linked to seems to be focused on identifying which skills increase your chances of winning ... I don't entirely believe the results given in the abstract, and would note that the same study done for women's volleyball produces different results. But anyway it's not what I'm looking into right now.)

#4 User is offline   Fluffy 

  • World International Master without a clue
  • Group: Advanced Members
  • Posts: 17,404
  • Joined: 2003-November-13
  • Gender:Male
  • Location:madrid

Posted 2015-February-12, 19:14

I always find trial and error to be faster than finding the perfect mathematical formula for this few variables.

#5 User is offline   mgoetze 


Posted 2015-February-12, 19:30

 Fluffy, on 2015-February-12, 19:14, said:

I always find trial and error to be faster than finding the perfect mathematical formula for this few variables.

OK so, uh, what would be the first thing you try?

#6 User is offline   helene_t 


Posted 2015-February-13, 05:53

By the way, it is probably not the case that p=r1/(r1+r2) where r1 and r2 are fixed throughout the season. There probably is some overdispersion, so log(r1/r2) = E(log(r1/r2)) + noise, where the noise reflects things like injured players.

As for expressing the two probabilities p1 and p2 (team 1 winning a point when serving, and team 2 winning a point when serving), it is a bit similar to the football (soccer) model we made at Buzz Sports, where the expected numbers of goals scored by team 1 and by team 2 were functions of a single strength difference parameter. My first guess at a model for volleyball would be

p1= 1/(1+exp(-z1))
p2= 1/(1+exp(-z2))

where (z1,z2) are bivariate normal distributed with expectation
E(z1) = log(0.38/(1-0.38)) + log(r1/r2)
E(z2) = log(0.38/(1-0.38)) + log(r2/r1)

and then you just need to estimate the covariance matrix from the data.
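A minimal sketch of the mean part of this model, ignoring the noise term: it plugs the expected values of z1 and z2 into the logistic function. The function names are invented for the example, and evaluating the logistic at E(z) is a simplification (it is not the same as the expected value of p once the noise is included).

```python
from math import exp, log

def logistic(z):
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + exp(-z))

def mean_serve_win_probs(r1, r2, s=0.38):
    """Serve-win probabilities at the expected log-odds, without noise.

    r1, r2: team ratings; s: serve-win rate for evenly matched teams.
    """
    base = log(s / (1 - s))     # log-odds of winning a serve when r1 == r2
    z1 = base + log(r1 / r2)    # E(z1)
    z2 = base + log(r2 / r1)    # E(z2)
    return logistic(z1), logistic(z2)

print(mean_serve_win_probs(1.0, 1.0))
print(mean_serve_win_probs(2.0, 1.0))
```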

But this basically assumes that the dominant team has an advantage to the same degree (in some sense) regardless of whether it is serving or not. I can imagine that that isn't always the case. It could, for example, be that all teams are approximately equally good at serving and that they differ mainly in how well they return the serve.

#7 User is offline   mgoetze 


Posted 2015-February-13, 06:27

 helene_t, on 2015-February-13, 05:53, said:

But this basically assumes that the dominant team has an advantage to the same degree (in some sense) regardless of whether it is serving or not. I can imagine that that isn't always the case. It could, for example, be that all teams are approximately equally good at serving and that they differ mainly in how well they return the serve.

I actually wanted to develop a rating system where teams would have separate ratings for serving and receiving, but the necessary data for that is not readily available. However, I think it is reasonable to assume that stronger teams are stronger at both serving and receiving, because you can clearly see the game transition into states that could have arisen from either situation.

I've been slowly recording data of my own, primarily to discover which individual players are best at serving / receiving, but perhaps someday I'll have enough data to estimate the spread at a team level...

#8 User is offline   helene_t 


Posted 2015-February-13, 06:35

Yes if team 1 wins more of their serves than team 2 does then it could be that team 1 is better at serving or team 1 is better at receiving and you can't distinguish those. But if it is so that the serve changes more often when two strong teams meet then it means that it is more about receiving than serving.

In our tennis model we modelled individual player characteristics. Things like this are more difficult with team sports.

#9 User is offline   kenberg 

  • Group: Advanced Members
  • Posts: 11,225
  • Joined: 2004-September-22
  • Location:Northern Maryland

Posted 2015-February-13, 06:48

I am having some trouble with "the serving team will only win approximately s = 38% of the points."

Let P(i,j) be the probability that team i will score on a serve against team j. Consider a specific (i,j). If P(i,j)=0.38 then, for this 38% rule to work, we would need P(j,i) to also be 0.38, is that right? And more generally, whatever P(i,j) is, the value of P(j,i) would be forced (never mind the exact formula) by the 38% rule?

Or am I misunderstanding the 38% rule? I realize (or at least I assume) it is just something that has been observed and so is only an approximation but if, for the purpose of a model, we take it as literally true then it seems the free choices are P(i,j) where i<j, with the P(j,i) following from this rule.

I guess one specific question could be: Are we saying that for any specific pair of teams if those teams play many matches against each other then we will observe the 38% rule, or are we saying that if we look at a large number of games played by a large number of teams, that is where the rule applies?


Or am I just misunderstanding the whole thing? I have played volleyball rarely and poorly and long ago.
Ken

#10 User is offline   gwnn 

  • Csaba the Hutt
  • Group: Advanced Members
  • Posts: 13,027
  • Joined: 2006-June-16
  • Gender:Male
  • Interests:bye

Posted 2015-February-13, 07:18

I would use an Elo system where each team has two Elo numbers, one for serving and one for receiving. Most teams will have a higher Elo for receiving than serving, and as a first approximation you can fix them to be at such a difference (85 points, it turns out) that if a team were playing itself, it would win 38% of its own serve points. But I don't think you need to fix anything like that, except maybe at the start of the simulations. I guess lower-level teams would have similar ratings for their serving and receiving because they suck at serving and smashing and everything, so it's all a toss-up.
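The 85-point figure can be checked with the standard Elo expected-score formula, in which a rating difference d gives an expected score of 1/(1+10^(-d/400)). The helper names below are invented for this quick check.

```python
from math import log10

def elo_expected_score(diff):
    """Standard Elo expected score for a player rated `diff` points higher."""
    return 1.0 / (1.0 + 10 ** (-diff / 400))

def elo_gap_for_serve_rate(s):
    """Receiving-minus-serving rating gap at which a serve is won with probability s.

    Solves 1 / (1 + 10**(gap / 400)) = s for gap.
    """
    return 400 * log10((1 - s) / s)

gap = elo_gap_for_serve_rate(0.38)
print(round(gap, 1), elo_expected_score(-gap))
```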
... and I can prove it with my usual, flawless logic.
      George Carlin

#11 User is offline   helene_t 


Posted 2015-February-13, 07:20

Obviously it is more than 38% for the stronger team and less than 38% for the weaker team. For teams of very unequal strength, should it be weighted by the number of points they serve? If p1=0.9 and p2=0.1 then the overall fraction will be 82% because team 1 will serve more often.
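The 82% figure follows from the long-run share of serves each team gets. As a sketch (the function name is made up, and the calculation assumes a long sequence of rallies so that the serve share settles to its stationary value):

```python
def serving_team_win_fraction(p1, p2):
    """Long-run fraction of rallies won by whichever team is serving.

    p1, p2: probability that team 1 / team 2 wins a rally on its own serve.
    The serve passes from team 1 to team 2 with probability 1-p1 and back
    with probability 1-p2, so the stationary share pi1 of serves taken by
    team 1 satisfies pi1*(1-p1) = (1-pi1)*(1-p2).
    Not defined when p1 == p2 == 1 (the serve never changes hands).
    """
    pi1 = (1 - p2) / ((1 - p1) + (1 - p2))
    return pi1 * p1 + (1 - pi1) * p2

print(serving_team_win_fraction(0.9, 0.1))
```

With p1=0.9 and p2=0.1, team 1 serves 90% of the rallies, giving 0.9*0.9 + 0.1*0.1 = 0.82.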

#12 User is offline   mgoetze 


Posted 2015-February-13, 07:33

 kenberg, on 2015-February-13, 06:48, said:

Or am I misunderstanding the 38% rule? I realize (or at least I assume) it is just something that has been observed and so is only an approximation but if, for the purpose of a model, we take it as literally true then it seems the free choices are P(i,j) where i<j, with the P(j,i) following from this rule.

I guess one specific question could be: Are we saying that for any specific pair of teams if those teams play many matches against each other then we will observe the 38% rule, or are we saying that if we look at a large number of games played by a large number of teams, that is where the rule applies?

You are right, it is only an observation. The value of s will also depend on the level of play. For instance, if you have young children, it could be that s<0.5 merely due to service errors. Once they learn to serve the ball into the other team's court with some consistency it will probably be right around 0.5, and then once you get into higher levels teams will get better and better at immediately organizing an effective attack and s will go down. The observation s=0.38 is from a so far very small sample of international women's volleyball. For men's volleyball, s is smaller. ;)

Anyway, back to the math, it is indeed probably not true in reality but I was, as you surmised, looking to create a model where

Quote

the free choices are P(i,j) where i<j, with the P(j,i) following from this rule.

so that s=0.38 holds true in every individual game.

#13 User is offline   mgoetze 


Posted 2015-February-13, 07:38

 gwnn, on 2015-February-13, 07:18, said:

I would use an Elo system where each team has two Elo numbers, one for serving and one for receiving.

I'm using a Bradley-Terry rating system, which as explained here is more or less equivalent to Elo: http://angrystatisti...hology-and.html (however, unlike Elo, the order in which the matches were played doesn't matter.)

Anyway, as I said, I don't have the data for the dual ratings, else I would have done that.

#14 User is offline   kenberg 


Posted 2015-February-13, 08:05

No, I am wrong. I dumped some stuff that I said, but try this:

I was worried that we would get into trouble, with only the ratings, always trying to satisfy the 38% rule. But I think it is OK. We have ratings r1, r2, ..., rn. We need a formula, arrived at somehow, that does the following. Given (i,j), we apply the formula to ri and rj to compute P(i,j), the probability that i scores when serving against j, but we do this only if the rating ri is lower than the rating rj. We then use the 38% rule to determine the value of P(j,i).

I assume that we need the P(i,j) to run the simulation and I was thinking the 38% rule would cause problems if we were to calculate the P(i,j) solely from the ratings. But we can avoid the problem in the above manner.
Ken

#15 User is offline   gwnn 


Posted 2015-February-13, 09:33

Oh right, I see that they are equivalent (I kind of thought they would be a linearisation but no, they are equivalent). I will think about how to extrapolate the 38% to other matchups. I missed the post where you suggested and dismissed the two-parameter rating system, sorry.

#16 User is offline   gwnn 


Posted 2015-February-13, 16:28

It seems like there's an analytic formula for all this if you accept A=38% to be accurate in all volleyball matches.

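Here p1 and p2 denote the total points scored by team 1 and team 2 (not the serve-win probabilities used earlier in the thread), and s1 and s2 the number of serves by each team. We assume that p1 and p2 are both very large so plus or minus 1 point doesn't matter. In that case, p1=s1 and p2=s2, because whoever wins a point serves next. Furthermore, let's say there's an a1 chance that team 1 wins a point on its serve (a1=A for well-matched teams), and likewise a2 for team 2.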

We then have:

p1=p1*a1+p2*(1-a2) or

p1/p2=(1-a2)/(1-a1)

But we also know that
(p1*a1+p2*a2)/(p1+p2)=A

Eliminating p1 and p2 for the moment, we get that for a given a2,

a1=(a2(A+1)-2*A)/(2*a2-(A+1)). We can then go back and get the relevant ratio p1/p2 or r1/r2.

The limiting case is if a team always hits the net from their serve (a2=0); in that case, the other team wins 55.07% of their serve (analytically, a1=2*A/(1+A)). In that case, team 1 wins 2.226 times as many points as team 2.

For any higher success rate than 55.07%, it is impossible to satisfy the 38% requirement.
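The relations above are easy to evaluate numerically. The following sketch simply codes the formula for a1 and the ratio p1/p2=(1-a2)/(1-a1); the function names are invented for the example and A defaults to 0.38.

```python
def a1_from_a2(a2, A=0.38):
    """Serve-win rate of team 1 implied by team 2's rate a2 and overall rate A."""
    return (a2 * (A + 1) - 2 * A) / (2 * a2 - (A + 1))

def point_ratio(a1, a2):
    """Long-run ratio p1/p2 of points scored by team 1 to points by team 2."""
    return (1 - a2) / (1 - a1)

for a2 in (0.38, 0.2, 0.0):
    a1 = a1_from_a2(a2)
    print(a2, round(a1, 4), round(point_ratio(a1, a2), 3))
```

For a2=0.38 this returns a1=0.38 and a ratio of 1, and for a2=0 it reproduces the limiting values a1=0.5507 and ratio 2.226.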

#17 User is offline   cherdano 

  • 5555
  • Group: Advanced Members
  • Posts: 9,519
  • Joined: 2003-September-04
  • Gender:Male

Posted 2015-February-13, 16:40

Why would you want to assume that the 38% holds for every individual game?

It's obvious that if team A is much stronger than team B, then the percentage of points won by the serving team is higher than normal. Just imagine the Italian national team playing against your group of friends.
The easiest way to count losers is to line up the people who talk about loser count, and count them. -Kieran Dyke

#18 User is offline   gwnn 


Posted 2015-February-13, 16:47

 cherdano, on 2015-February-13, 16:40, said:

Why would you want to assume that the 38% holds for every individual game?

It's obvious that if team A is much stronger than team B, then the percentage of points won by the serving team is higher than normal. Just imagine the Italian national team playing against your group of friends.

It's obvious for me too, but mgoetze said that he wants to assume that. If we can't assume that, I'd go either for the two-value model or a two-value model with a fixed ratio between the two skills, i.e. a one-value model.

#19 User is offline   mgoetze 


Posted 2015-February-13, 16:50

 cherdano, on 2015-February-13, 16:40, said:

Why would you want to assume that the 38% holds for every individual game?

It's obvious that if team A is much stronger than team B, then the percentage of points won by the serving team is higher than normal. Just imagine the Italian national team playing against your group of friends.

Well, because I don't have any better data. Really I just want to translate my ratings into a prediction like "there is a 60% chance that team A will win in 4 sets, 20% they win in 3 sets, 10% they win in 5 sets, 10% they lose" via Monte Carlo simulation. The Italian national team has a rating in my system, my friends don't. ;)
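A prediction of that kind could be produced by a Monte Carlo sketch along the following lines, once serve-win probabilities p1 and p2 have been chosen. Everything here is illustrative: the function names are invented, the first serve is assumed to alternate from set to set (team 0 starting), and sets go to 25 with the fifth to 15, each with a two-point margin.

```python
import random
from collections import Counter

def play_set(p_serve, first_server, target, rng):
    """Return the index (0 or 1) of the team that wins one set."""
    score = [0, 0]
    server = first_server
    while True:
        winner = server if rng.random() < p_serve[server] else 1 - server
        score[winner] += 1
        server = winner  # the rally winner serves next
        if score[winner] >= target and score[winner] - score[1 - winner] >= 2:
            return winner

def play_match(p_serve, rng):
    """Best of five sets; returns (sets won by team 0, sets won by team 1)."""
    sets = [0, 0]
    while max(sets) < 3:
        set_no = sum(sets)
        target = 15 if set_no == 4 else 25  # deciding set is shorter
        first_server = set_no % 2           # assumption: first serve alternates
        sets[play_set(p_serve, first_server, target, rng)] += 1
    return tuple(sets)

def outcome_distribution(p1, p2, n=10_000, seed=0):
    """Estimated probability of each final set score, e.g. (3, 1)."""
    rng = random.Random(seed)
    counts = Counter(play_match((p1, p2), rng) for _ in range(n))
    return {result: c / n for result, c in sorted(counts.items())}

print(outcome_distribution(0.42, 0.34, n=2000))
```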

#20 User is offline   gwnn 


Posted 2015-February-13, 16:59

Also, an easy formula in terms of either a1 or a2, plus two plots (sorry, I was too lazy to use gnuplot..):

p1/p2=(A+1-2*a2)/(1-A), or equivalently p2/p1=(A+1-2*a1)/(1-A)

Posted Image

Posted Image
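For the original question (turning a rating ratio r1/r2 into serve-win probabilities), the two conditions from post #16, p1*(1-a1)=p2*(1-a2) and (p1*a1+p2*a2)/(p1+p2)=A, can be solved for a1 and a2 given R=p1/p2. Doing so gives a1=(A+1-(1-A)/R)/2 and a2=(A+1-(1-A)*R)/2. The sketch below codes that rearrangement; the function name and the error handling are additions for the example, and it assumes, as in the thread, that A holds exactly in every matchup.

```python
def serve_rates_from_ratio(R, A=0.38):
    """Serve-win rates (a1, a2) that give point ratio R = r1/r2 and overall rate A.

    Raises ValueError when R is so lopsided that a rate would fall below 0,
    i.e. outside (1-A)/(1+A) <= R <= (1+A)/(1-A).
    """
    a1 = (A + 1 - (1 - A) / R) / 2
    a2 = (A + 1 - (1 - A) * R) / 2
    if a1 < 0 or a2 < 0:
        raise ValueError("no valid serve rates for this ratio under the fixed-A assumption")
    return a1, a2

print(serve_rates_from_ratio(1.0))
print(serve_rates_from_ratio(1.5))
```

With A=0.38 the admissible ratios run from about 0.449 to 2.226, matching the limiting case found earlier.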