Comparing butler scores to score against minimax


#1 bluecalm

Group: Advanced Members
Posts: 2,555
Joined: 2007-January-22

Posted 2010-January-12, 09:28

I think that comparing a partnership's scores to minimax scores is a better measure of how well the pair plays than Butler scores (although it probably has more variance). Do you know if anybody has done statistical work comparing top partnerships from top events using score against minimax as the measure?

I am very interested to see whether the Butler winners would also be at the top, and how big the difference would be.

#2 Free

mmm Duvel

Group: Advanced Members
Posts: 10,728
Joined: 2003-July-30
Gender:Male
Location:Belgium
Interests:Duvel, Whisky

Posted 2010-January-12, 09:37

With "minimax scores", do you mean "double dummy results"?

"It may be rude to leave to go to the bathroom, but it's downright stupid to sit there and piss yourself" - blackshoe

#3 bluecalm

Group: Advanced Members
Posts: 2,555
Joined: 2007-January-22

Posted 2010-January-12, 09:55

Quote

do you mean "double dummy results"?

Yes.

#4 Gerardo

Group: Admin
Posts: 2,498
Joined: 2003-February-12
Gender:Male
Location:Dartmouth, NS, Canada

Posted 2010-January-12, 15:22

BBO uses Cross IMPs everywhere it uses IMPs. Butler is not used anywhere.

#5 Jeroen71

Group: Members
Posts: 10
Joined: 2009-August-10
Location:Maastricht

Posted 2010-January-13, 05:38

bluecalm, on Jan 12 2010, 05:28 PM, said:

Before you can start doing some statistical analysis, you would have to define what it means for one measure to be better than another.
Not a trivial task....

#6 bluecalm

Group: Advanced Members
Posts: 2,555
Joined: 2007-January-22

Posted 2010-January-13, 06:57

Quote

Before you can start doing some statistical analysis, you would have to define what it means for one measure to be better than another.
Not a trivial task....

That's what I am suggesting: take the scores at the table and compare them to the minimax scores. The pair that beats minimax by more points/IMPs than another pair in the long run is the better pair.
Of course this measure can have a lot of variance, and you need to play a lot of hands for it to be reliable. I have no idea how many. It would be nice to see how, for example, Lauria-Versace fare against Meckwell using this measure.
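
A minimal sketch of the measure described here, assuming the difference between table score and minimax score is converted with the standard IMP scale and then averaged per board. The function names and the sample boards are hypothetical; nothing here comes from an actual tool mentioned in the thread.

```python
from bisect import bisect_right

# Lower bounds of the standard IMP scale: a difference of 20-40 points
# is 1 IMP, 50-80 is 2 IMPs, ..., and 4000 or more is 24 IMPs.
IMP_THRESHOLDS = [20, 50, 90, 130, 170, 220, 270, 320, 370, 430, 500, 600,
                  750, 900, 1100, 1300, 1500, 1750, 2000, 2250, 2500, 3000,
                  3500, 4000]

def imps(diff):
    """Convert a point difference to signed IMPs."""
    magnitude = bisect_right(IMP_THRESHOLDS, abs(diff))
    return magnitude if diff >= 0 else -magnitude

def mean_imps_vs_minimax(boards):
    """boards: list of (table_score, minimax_score), both from the pair's side."""
    return sum(imps(table - par) for table, par in boards) / len(boards)

# Hypothetical boards, scored from the pair's point of view.
sample = [(420, 420), (-50, 140), (620, 170), (-100, -620)]
print(mean_imps_vs_minimax(sample))
```

The same function works with total points instead of IMPs if `imps` is replaced by the raw difference; IMPs are used here only because the thread compares against Butler, which is IMP based.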

#7 helene_t

The Abbess

Group: Advanced Members
Posts: 17,232
Joined: 2004-April-22
Gender:Female
Location:Copenhagen, Denmark
Interests:History, languages

Posted 2010-January-13, 07:08

Jeroen71, on Jan 13 2010, 12:38 PM, said:

bluecalm, on Jan 12 2010, 05:28 PM, said:

Before you can start doing some statistical analysis, you would have to define what it means for one measure to be better than another.
Not a trivial task....

Not so difficult. Well, not so difficult to define it. Maybe more difficult to argue that it's the most reasonable definition.

Here is what I would do: The score when board b is played by pair NSi against EWi is modeled as something like
( strength[NSi] - strength[EWi] ) * a[b] + eps[i,b]
where a[b] is log-normally distributed across boards and eps[i,b] is normally distributed with variance sigma[b]^2.

The best scoring is the one that leads to the lowest estimate of
E(sigma[...]^2) / ( var(strength[...]) * E(a[...]) )
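
To make the model concrete, here is a rough simulation sketch. It only draws scores from the model and evaluates the criterion at the known simulated parameters; it does not estimate anything from real data, which would need a proper fitting procedure. All distribution parameters (the spread of strengths, the log-normal parameters, the range of sigma) are made up for illustration.

```python
import random
from statistics import mean, pvariance

def simulate(n_pairs=10, n_boards=200, seed=1):
    """Draw scores from the model; all distribution parameters are made up."""
    rng = random.Random(seed)
    strength = [rng.gauss(0.0, 1.0) for _ in range(n_pairs)]
    a, sigma, rows = [], [], []
    for b in range(n_boards):
        a.append(rng.lognormvariate(0.0, 0.5))       # a[b], log-normal across boards
        sigma.append(rng.uniform(3.0, 7.0))          # sigma[b], arbitrary choice
        order = rng.sample(range(n_pairs), n_pairs)  # random seating for this board
        for t in range(0, n_pairs - 1, 2):
            ns, ew = order[t], order[t + 1]
            eps = rng.gauss(0.0, sigma[b])
            score = (strength[ns] - strength[ew]) * a[b] + eps
            rows.append((b, ns, ew, score))
    return strength, a, sigma, rows

def criterion(strength, a, sigma):
    """E(sigma^2) / (var(strength) * E(a)); lower means less noise per unit of skill."""
    return mean(s * s for s in sigma) / (pvariance(strength) * mean(a))

strength, a, sigma, rows = simulate()
print(len(rows), round(criterion(strength, a, sigma), 2))
```

Comparing two scoring methods would mean fitting this model to the same deals scored both ways and comparing the estimated criterion, which is beyond this sketch.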

The world would be such a happy place, if only everyone played Acol :) --- TramTicket

#8 Jeroen71

Group: Members
Posts: 10
Joined: 2009-August-10
Location:Maastricht

Posted 2010-January-14, 07:38

helene_t, on Jan 13 2010, 03:08 PM, said:

Not so difficult. Well, not so difficult to define it. Maybe more difficult to argue that it's the most reasonable definition

Would you care to elaborate a bit on your choice of model + criterion?

#9 helene_t

The Abbess

Group: Advanced Members
Posts: 17,232
Joined: 2004-April-22
Gender:Female
Location:Copenhagen, Denmark
Interests:History, languages

Posted 2010-January-14, 07:52

The criterion is that the variance in scores that is due to strength difference between pairs must be as large as possible compared to the random variance in scores.

The model says that the score on a particular board for two particular pairs playing that board is normal distributed with an expectation which is proportional* to the strength difference between the two pairs. Cascade, Gerben and I have all used that model for different IMP data sets and found that it fits well. Of course a radically different scoring could have different statistical characteristics, but as long as it is some kind of IMP scoring we are probably OK.

*I have assumed that the proportionality factor is log-normal distributed across boards. I felt that that is more realistic than assuming that it is constant. But I have no evidence for any particular distribution of this factor. Maybe Gerben has an informed opinion about it.

I have not provided any model for the distribution of sigma[...] across boards. It may be useful to model it, especially for long tournaments with small fields. Usually an inverse-gamma model is used for such purposes but again I have no evidence in support of any particular model. Again, maybe Gerben can say more.

The world would be such a happy place, if only everyone played Acol :) --- TramTicket

#10 paulg

Group: Advanced Members
Posts: 5,101
Joined: 2003-April-26
Gender:Male
Location:Scottish Borders

Posted 2010-January-14, 08:01

Butlers and minimax for the first Camrose weekend

           Butler  Minimax   Butler    Minimax
                             position  position
Pair 1       1.39     1.10       1         2
Pair 2       1.14     1.15       2         1
Pair 3       1.03     1.01       3         3
Pair 4       0.72     0.02       4        10
Pair 5       0.25     0.55       5         5
Pair 6       0.24     0.09       6         9
Pair 7       0.15     0.24       7         7
Pair 8       0.08    -0.08       8        12
Pair 9       0.07    -0.70       9        16
Pair 10      0.03     0.17      10         8
Pair 11     -0.16     0.71      11         4
Pair 12     -0.18    -0.09      12        13
Pair 13     -0.32    -0.50      13        15
Pair 14     -0.39     0.00      14        11
Pair 15     -0.55    -1.07      15        17
Pair 16     -0.62     0.31      16         6
Pair 17     -0.90    -1.55      17        19
Pair 18     -0.90    -0.33      18        14
Pair 19     -1.76    -1.42      19        18
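
One way to summarize how closely the two rankings agree is a Spearman rank correlation on the position columns. The sketch below uses the positions exactly as listed in the table (Pairs 17 and 18 are tied on Butler score but are listed as 17 and 18, so the no-ties formula is applied); it is an illustration added for the reader, not part of the original spreadsheet.

```python
# Butler and minimax positions for Pair 1 .. Pair 19, copied from the table.
butler_pos = list(range(1, 20))
minimax_pos = [2, 1, 3, 10, 5, 9, 7, 12, 16, 8, 4, 13, 15, 11, 17, 6, 19, 14, 18]

def spearman_rho(x, y):
    """Spearman rank correlation for two rankings without ties."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d2 / (n * (n * n - 1))

print(round(spearman_rho(butler_pos, minimax_pos), 2))  # 0.73
```

With only 19 pairs and one weekend of boards this is a rough indication at best.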

Paul

The Beer Card

I don't work for BBO and any advice is based on my BBO experience over the decades

#11 Fluffy

World International Master without a clue

Group: Advanced Members
Posts: 17,404
Joined: 2003-November-13
Gender:Male
Location:madrid

Posted 2010-January-14, 08:04

I don't understand this very well: if minimax is double dummy and you can make 7NT on 2 deep finesses and a drop, are you giving every pair who gets those cards -17 or something?

BridgeGod: my personal website with interactive problems and articles

#12 helene_t

The Abbess

Group: Advanced Members
Posts: 17,232
Joined: 2004-April-22
Gender:Female
Location:Copenhagen, Denmark
Interests:History, languages

Posted 2010-January-14, 08:07

Fluffy, on Jan 14 2010, 03:04 PM, said:

I don't understand this very well: if minimax is double dummy and you can make 7NT on 2 deep finesses and a drop, are you giving every pair who gets those cards -17 or something?

Exactly. Why do you say you don't understand it very well?

The world would be such a happy place, if only everyone played Acol :) --- TramTicket

#13 bluecalm

Group: Advanced Members
Posts: 2,555
Joined: 2007-January-22

Posted 2010-January-14, 08:50

Quote

           Butler  Minimax   Butler    Minimax
                             position  position
Pair 1       1.39     1.10       1         2
Pair 2       1.14     1.15       2         1
Pair 3       1.03     1.01       3         3
Pair 4       0.72     0.02       4        10
Pair 5       0.25     0.55       5         5
Pair 6       0.24     0.09       6         9
Pair 7       0.15     0.24       7         7
Pair 8       0.08    -0.08       8        12
Pair 9       0.07    -0.70       9        16
Pair 10      0.03     0.17      10         8
Pair 11     -0.16     0.71      11         4
Pair 12     -0.18    -0.09      12        13
Pair 13     -0.32    -0.50      13        15
Pair 14     -0.39     0.00      14        11
Pair 15     -0.55    -1.07      15        17
Pair 16     -0.62     0.31      16         6
Pair 17     -0.90    -1.55      17        19
Pair 18     -0.90    -0.33      18        14
Pair 19     -1.76    -1.42      19        18

Looks encouraging (for me :) ). Would it be possible for you to provide the script/tools (even if they are commercial)? I would love to check some matchups like Meckwell versus Lauria-Versace. From that sample minimax looks quite reasonable, and the good thing about it is that you don't need any other pairs to see who played better (just a lot of hands).

Quote

I don't understand this very well: if minimax is double dummy and you can make 7NT on 2 deep finesses and a drop, are you giving every pair who gets those cards -17 or something?

Yes. It seems unjust, but...
The most obvious measure of who is better at bridge is simply:
score = total points won.

Unfortunately, this way you would need thousands (tens of thousands?) of hands to have something you can rely on.
This is where modern scoring comes in. Instead of just counting total points won, you compare them to what other people won on the same board. This is much more informative, but you still need many hands to see who is better. Butler scoring is based on this idea.
Unfortunately, when major tournaments reach the playoff stage, Butler is no longer reliable because there are not enough scores from other tables.
My idea is that comparing to minimax may not be that far from Butler scores, and so can be used as a reliable measure of who plays better when you don't have any other (or not enough) scores to compare against.
Of course there are "unjust" deals if you use this measure, but there are also unjust deals if you play for total points. Even at IMPs there are many unjust deals (if one pair plays your 7NT they get +17 IMPs, and the other pair gets -17 IMPs).

It's all about variance (how many hands you need to get close to the expected value).
Both total points and comparison against minimax are objective (the pair with the better expected value in either of them is better at bridge), and the pair that scores better in the long run is simply the better pair.
My hope is that the variance in "compare to minimax" is much less than in "count total points won".
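
The "how many hands" question can be framed with a back-of-the-envelope calculation. Assuming board results are independent with a per-board standard deviation `sigma`, the difference between two pairs' averages over n boards each has standard error sigma * sqrt(2 / n). The sketch below solves for n; the values of `sigma` and `delta` are placeholders, not measured figures for any scoring method.

```python
import math

def boards_needed(sigma, delta, z=1.96):
    """Boards each pair must play so that a true gap of `delta` per board
    between two pairs is about `z` standard errors wide."""
    return math.ceil(2 * (z * sigma / delta) ** 2)

# Hypothetical per-board standard deviations (in IMPs) and a 0.5 IMP/board gap.
for sigma in (4.0, 6.0, 8.0):
    print(sigma, boards_needed(sigma, delta=0.5))
```

Since the required number of boards grows with the square of sigma, a scoring method with half the per-board noise needs a quarter of the boards, which is why the variance comparison matters.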

#14 Fluffy

World International Master without a clue

Group: Advanced Members
Posts: 17,404
Joined: 2003-November-13
Gender:Male
Location:madrid

Posted 2010-January-14, 08:57

No, the pair that plays 7NT goes 2 down because they won't see ♦Jxxx with LHO and ♣Q10xxx with RHO, although they will be forced to guess ♥KJxx.

No biggie, they will lose 19 when the others lose 17 or so.

BTW, for years I got a lot of help from BBF posters to improve my English. I am glad, for the first time, to be the one who helps: the word is "unfair"; "unjust" has no meaning in English.

BridgeGod: my personal website with interactive problems and articles

#15 bluecalm

Group: Advanced Members
Posts: 2,555
Joined: 2007-January-22

Posted 2010-January-14, 09:00

Quote

I am glad, for the first time, to be the one who helps: the word is "unfair"; "unjust" has no meaning in English.

Thanks much

English is not much native language. I have just learnt it from the Internet

Quote

No biggie, they will lose 19 when the others lose 17 or so.

Well... what about a hand where someone bids a hopeless game and makes it on 3 finesses? +13, other pair -13. Unfair!
What about a grand slam on a pure guess for a queen? You guess right, +14. The other table played a reasonable 6NT and got -14 (or whatever). Unfair!
There are many examples of that; I am sure you have seen more of them than I have.

In the long run we hope luck will even out... (it won't, but it will matter less and less per board as more hands are played, even though the accumulated luck will on average be bigger in absolute terms). This is our hope with both duplicate scoring and scoring against minimax...

#16 Fluffy

World International Master without a clue

Group: Advanced Members
Posts: 17,404
Joined: 2003-November-13
Gender:Male
Location:madrid

Posted 2010-January-14, 09:06

I agree with you, but my point is that you add a lot of randomness through who your opponents are and how they play; you pay a big price when they become inspired against you.

Bad contracts that make, and contracts that make on a finesse, randomize even more.

If the opponents play well, there are times when you have to make a decision that leads to a game that is anywhere from 20% to 90% to make. The fact that you landed in the 20% game and made it doesn't mean you were lucky: you made a good decision earlier that might land you in a bad contract, and maybe it was a decision nobody else had to face.

There are enough random factors around already, but my point is: if you add to these which seats you hold on certain deals, you are randomizing the results even more.

BridgeGod: my personal website with interactive problems and articles

#17 paulg

Group: Advanced Members
Posts: 5,101
Joined: 2003-April-26
Gender:Male
Location:Scottish Borders

Posted 2010-January-14, 10:42

I use the BBO records plugged into Double Dummy Solver from Bridge Captain to generate the minimax score.

The rest is a manual input into a complex spreadsheet.

Paul

The Beer Card

I don't work for BBO and any advice is based on my BBO experience over the decades

#18 655321

Group: Advanced Members
Posts: 2,502
Joined: 2007-December-22

Posted 2010-January-14, 19:06

Fluffy, on Jan 14 2010, 09:57 AM, said:

BTW, for years I got a lot of help from BBF posters to improve my English. I am glad, for the first time, to be the one who helps: the word is "unfair"; "unjust" has no meaning in English.

Fluffy's English is way better than my Spanish, but anyway:

http://dictionary.re...m/browse/unjust

That's impossible. No one can give more than one hundred percent. By definition that is the most anyone can give.

#19 Fluffy

World International Master without a clue

Group: Advanced Members
Posts: 17,404
Joined: 2003-November-13
Gender:Male
Location:madrid

Posted 2010-January-15, 02:34

*****

BridgeGod: my personal website with interactive problems and articles

#20 helene_t

The Abbess

Group: Advanced Members
Posts: 17,232
Joined: 2004-April-22
Gender:Female
Location:Copenhagen, Denmark
Interests:History, languages

Posted 2010-January-15, 05:11

The Australian youth selection, now on vugraph, uses an external datum score (which happens to come from real-life bridge rather than DD).

I think using an external datum score, whether based on DD, robot play or some large high-level tournament, has some advantages:
- When organizing an event for weak players, to reduce the randomness of the datum score.
- When playing a very small pairs tourney, same argument.
- Maybe (I haven't thought this through) a kind of Swiss movement. The travelers for the boards that remain on table 7 could say something like "Datum is 420 NS. Winners go to 5 NS, losers to 9 EW". Maybe it would be possible to construct something similar to Swiss which would allow for faster movements because you don't need to enter the results in the computer and communicate the results from the computer. And you can move as soon as the round finishes even if some late tables haven't entered their results yet.

OK, the new bridgemates can tell you where to move to in a Swiss movement so the last argument is becoming obsolete.

Anyway, I would prefer robot play to DD. Jack has an option for simulating frequency tables by letting it play a board with a range of systems and styles.

The world would be such a happy place, if only everyone played Acol :) --- TramTicket
