BBO Discussion Forums: Suggestion for score when playing at a table - BBO Discussion Forums


Suggestion for score when playing at a table NOOSANCE's Letter to Fred

#41 User is offline   hanp 

  • Group: Advanced Members
  • Posts: 2,987
  • Joined: 2009-February-15

Posted 2010-February-09, 10:00

I do think it would be better to drop the extreme scores for comparison purposes.

I don't care though, and I don't understand why anybody would care much.
and the result can be plotted on a graph.

#42 User is offline   helene_t 

  • The Abbess
  • Group: Advanced Members
  • Posts: 17,253
  • Joined: 2004-April-22
  • Gender:Female
  • Location:Copenhagen, Denmark
  • Interests:History, languages

Posted 2010-February-09, 10:27

Deleting the two best scores for EW and the two best for NS would be fine if we with any degree of confidence could say that the most extreme scores are dumpings or balooka results while the middle 12 are not.

If that were the case, we would remove two good and two bad scores, so XIMP would barely be affected, unlike Butler scores, which do get substantially affected by such an outlier removal. This is why it is so important to understand the difference between Butler and XIMP when discussing these things. But OK, it would be better than nothing.
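To make the comparison concrete, here is a minimal sketch that scores one "normal" result both ways, with and without trimming two results at each end. The 16-result field is made up for illustration, the function names are hypothetical, and the rounding of fractional datum gaps is an assumption; BBO's actual implementation may differ.

```python
from bisect import bisect_left

# Inclusive upper bound of each band in the standard IMP table (0..23 IMPs);
# anything above the last bound scores 24 IMPs.
IMP_BOUNDS = [10, 40, 80, 120, 160, 210, 260, 310, 360, 420, 490, 590, 740,
              890, 1090, 1290, 1490, 1740, 1990, 2240, 2490, 2990, 3490, 3990]

def imps(diff):
    """Signed IMPs for a point difference.

    Assumption: a gap that falls between two bands (possible with a
    fractional datum) is pushed into the higher band.
    """
    n = bisect_left(IMP_BOUNDS, abs(diff))
    return n if diff >= 0 else -n

def trimmed(field, trim):
    """Drop the `trim` lowest and the `trim` highest scores."""
    ordered = sorted(field)
    return ordered[trim:len(ordered) - trim]

def ximp(own, field, trim=0):
    """Average IMPs of `own` against every other surviving result."""
    rest = trimmed(field, trim)
    rest.remove(own)  # raises ValueError if the own result was itself trimmed
    return sum(imps(own - other) for other in rest) / len(rest)

def butler(own, field, trim=0):
    """IMPs of `own` against the mean (datum) of the surviving results."""
    kept = trimmed(field, trim)
    datum = sum(kept) / len(kept)
    return imps(own - datum)

# Hypothetical NS scores: twelve normal games, two overtricks, two disasters.
field = [620] * 12 + [650] * 2 + [-1100] * 2
for trim in (0, 2):
    print(trim, round(ximp(620, field, trim), 2), butler(620, field, trim))
```

On this particular field, trimming moves the +620 pair's XIMP from about +2.1 to 0 and its Butler score from +6 to 0. A different field would give different numbers, and the sketch does not say how a trimmed pair itself should be scored.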

But it is not always the case. Sometimes a balooka auction can give a result within the range of normal results. Sometimes there will be only normal results. In such cases, outlier removal will make the score less meaningful.

I am sure outlier removal per saldo makes butler scores more meaningful. I doubt that they would do any good to XIMP scores, though.
The world would be such a happy place, if only everyone played Acol :) --- TramTicket

#43 User is offline   jdonn 

  • - - T98765432 AQT8
  • Group: Advanced Members
  • Posts: 15,085
  • Joined: 2005-June-23
  • Gender:Male
  • Location:Las Vegas, NV

Posted 2010-February-09, 10:40

Careful helene, you can't discuss any topic except what noosance brought up or else you are missing the point and have no clue.
Please let me know about any questions or interest or bug reports about GIB.

#44 User is offline   jdonn 

  • - - T98765432 AQT8
  • Group: Advanced Members
  • Posts: 15,085
  • Joined: 2005-June-23
  • Gender:Male
  • Location:Las Vegas, NV

Posted 2010-February-09, 10:44

Btw, general business question. If business 1 is getting less and less popular over time, and business 2 is getting more and more popular over time, should business 2 start copying the methods of business 1?
Please let me know about any questions or interest or bug reports about GIB.

#45 User is offline   helene_t 

  • The Abbess
  • Group: Advanced Members
  • Posts: 17,253
  • Joined: 2004-April-22
  • Gender:Female
  • Location:Copenhagen, Denmark
  • Interests:History, languages

Posted 2010-February-09, 10:52

That is generally how it works in the software industry, isn't it?

Anyway, I am sure Fred and Uday are open to the possibility that a few of their choices might turn out to be worse than the choices made by competitors. I mean, it would be rare to see one company be right on everything.
The world would be such a happy place, if only everyone played Acol :) --- TramTicket

#46 User is offline   Gerardo 

  • Group: Admin
  • Posts: 2,502
  • Joined: 2003-February-12
  • Gender:Male
  • Location:Dartmouth, NS, Canada

Posted 2010-February-09, 14:26

Assuming XIMP, taking out the extremes is problematic.

How do you calculate the score for said extremes? You have to use a different method.

Plus, assuming a result is such an outlier, the fact that the IMP scale is roughly logarithmic gives some protection: normal scores won't get much out of it (compared to what the other normal scores get out of it, that is). It needs to be compared across the whole field, though; if you section the field, it would make a huge difference.
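A small sketch of that compression, using the standard IMP table. The sample swings and the +100 "normal" score are arbitrary illustrations, and `imps` is a hypothetical helper name.

```python
from bisect import bisect_left

# Inclusive upper bound of each band in the standard IMP table (0..23 IMPs);
# anything above the last bound scores 24 IMPs.
IMP_BOUNDS = [10, 40, 80, 120, 160, 210, 260, 310, 360, 420, 490, 590, 740,
              890, 1090, 1290, 1490, 1740, 1990, 2240, 2490, 2990, 3490, 3990]

def imps(diff):
    """Signed IMPs for a point difference."""
    n = bisect_left(IMP_BOUNDS, abs(diff))
    return n if diff >= 0 else -n

# Each doubling of the swing adds only a few IMPs.
for swing in (100, 200, 400, 800, 1600, 3200):
    print(swing, imps(swing))

# What a single -3400 result adds to the 15-comparison XIMP of a pair
# that scored a (hypothetical) normal +100 on the same board.
print(imps(100 - (-3400)) / 15)
```

Because no single comparison can be worth more than 24 IMPs, one wild result among 15 comparisons can shift a pair's XIMP by at most 24/15 = 1.6 IMPs.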

#47 User is offline   uday 

  • Group: Advanced Members
  • Posts: 5,808
  • Joined: 2003-January-15
  • Gender:Male
  • Location:USA

Posted 2010-February-10, 09:08

Moving to more comparisons might be a net plus. But the Windows version of BBO would likely have a fit if we tried that. It isn't clear to me that the effort needed to make the Windows version able to handle this is worth the payback.

#48 User is offline   bid_em_up 

  • Group: Advanced Members
  • Posts: 2,351
  • Joined: 2006-March-21
  • Location:North Carolina

Posted 2010-February-10, 10:08

uday, on Feb 10 2010, 10:08 AM, said:

Moving to more comparisons might be a net plus.

I seriously doubt this. What noosance is suggesting sounds nice in theory, but in reality, it is unlikely to make much of a difference.

Let's say the number of comparisons is increased by a factor of 10 (160-board comparison). At best, all this will do is give you 10 times the number of "normal" results and 10 times the number of idiotic random results, which essentially leaves you right where you are now, if not worse, since the "normal" result (say +620) on the board will remain constant, but there is no telling what the random results will be. Realistically, given the randomness of levels of play and partnerships on BBO, I would expect that the more boards that are used for comparison that random results would be increased by a larger factor than the number of "normal" results would.

I think BBO had the right idea earlier when the Relaxed Bridge club and the "Masters"? club were added in addition to the MBC. If the Masters room had succeeded as the "serious" bridge club, and the boards played in that room only contained results from that room, better/more realistic results might have been obtained. However, we all know how successful that experiment was and I doubt it will be revisited.

jmoo.
Is the word "pass" not in your vocabulary?
So many experts, not enough X cards.

#49 User is offline   hotShot 

  • Axxx Axx Axx Axx
  • Group: Advanced Members
  • Posts: 2,976
  • Joined: 2003-August-31
  • Gender:Male

Posted 2010-February-10, 11:24

We sometimes hold BBO parties: four of us meet somewhere with our laptops and play on BBO, using the "crowd" as the "other table".
We use the table result as the score.
Whenever the "normal" result is close to zero, this works perfectly well.

Now I took 3 random results from myhands.
Two of them (Board 1 and Board 3), though not without flaws, work fine for that method, as the normal results for EW and NS are close to 0.
Board 2 has an irrational result that will rip the EW and NS scores about 10 IMPs apart for the normal play.

Board 1
Normal score, 2/16 missed to bid game.
Board 2
One lunatic 6xxN-6 for -3400 distorting the score for everybody else.

Board 3
Everybody makes 10 or 11 tricks, but one pair fails in 4NT: 1/16 distorted results.

Taking the number up to 160 might not change Boards 1 and 3 much; perhaps the proportion of weak players is about 1/16 to 2/16.
But my "random" boards have only 1 lunatic score in 3 tries. Of course that is meaningless because of the small sample size and the lack of true randomness, but if we assume this is a trend, the distortion would reduce from about 10 IMPs to 3.3 IMPs.

Considering the efforts against abuse and BBO's automatic abuse detection, I would hope that the number of lunatics is smaller.

I think a higher number of comparisons would be nice, but it is not vital.

Since the focus is now on the Flash version, I think it's only a matter of time until the Windows client constrains further development of BBO in enough ways to be given up. I assume that by that time only very few people will still be using it.

#50 User is offline   helene_t 

  • The Abbess
  • Group: Advanced Members
  • Posts: 17,253
  • Joined: 2004-April-22
  • Gender:Female
  • Location:Copenhagen, Denmark
  • Interests:History, languages

Posted 2010-February-10, 14:37

bid_em_up, on Feb 10 2010, 05:08 PM, said:

Realistically, given the randomness of levels of play and partnerships on BBO, I would expect that the more boards that are used for comparison that random results would be increased by a larger factor than the number of "normal" results would.

huh? If 10% of all scores are idiotic then 10% of 15 comparisons will be idiotic, on average, as will 10% of 159 comparisons.

The advantage of larger numbers of comparisons is that it reduces the imbalance between idiotic NS scores and idiotic EW scores. Say if out of 159 comparisons there are appr. 10 idiotically bad scores for EW scores and 10 idiotically bad ones for NS. That would roughly cancel out so your XIMP would be ok. But with only 15 comparisons there could easily be 2 idiotically bad ones for EW and none for NS. Of course it could also be the other way round, but the fluctuations on the difference between EW idiocy and NS idiocy are larger the smaller the number of comparisons.
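The size of that effect can be estimated with a short calculation. This is a sketch under a simplifying assumption that is not in the post: each comparison is independently idiotically bad for EW with probability p, idiotically bad for NS with the same probability p, and normal otherwise. The value p = 10/159 comes from the example above; the function name is hypothetical.

```python
from math import sqrt

def imbalance_std(n, p_each):
    """Std. dev. of (bad-for-EW count minus bad-for-NS count) / n.

    Each of the n independent comparisons contributes +1, -1 or 0 with
    probabilities p, p and 1 - 2p, so it has mean 0 and variance 2p.
    """
    return sqrt(2 * p_each / n)

p = 10 / 159  # about 10 bad EW scores and 10 bad NS scores per 159
for n in (15, 159):
    print(n, round(imbalance_std(n, p), 3))

# The relative imbalance shrinks with the square root of the sample size.
print(imbalance_std(15, p) / imbalance_std(159, p))
```

Under this model the average imbalance per comparison is a bit more than three times larger with 15 comparisons than with 159, although the expected share of idiotic scores is the same in both cases.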

So I think increasing the number of comparisons would probably be a good thing.

As I said before, it is not so much the fluctuations of the eventual XIMP you get when it had been played 16 times that bothers me. 16 is a large number. But it means that for every 16 boards you play, on average one of them will have no immediate comparison (giving you 0 IMPs which sometimes is hard to understand until one notices that there is no comparison), one will have only one immediate comparison (which may happen to be an idiotic one, or just an extreme result following from a reasonable but very unlucky decision) etc.
The world would be such a happy place, if only everyone played Acol :) --- TramTicket

#51 User is offline   mrdct 

  • Group: Advanced Members
  • Posts: 1,448
  • Joined: 2003-October-27
  • Gender:Male
  • Location:Moama, NSW

Posted 2010-February-10, 18:43

An issue germane to this discussion is the merits of cross-imping as compared to scoring-up against an average (with or without the exclusion of outliers). I'm still yet to hear a convincing argument as to why cross-imping is superior.

Interestingly, I was informed by the scorer of Australia's Youth Butler Trials (which changed from scoring-up against an average to cross-imping a few years ago) that after rescoring the event using both methods, the final ranking was exactly the same for the three years that he recomputed everything; so I expect there isn't much difference. Youth Butler Trials might not be a great example as it's quite a small field (5 or 6 tables) with fairly significant variation in the standard of the players, so I'd be keen to hear of any similar analysis that might have been done in events such as the Cavendish. Of course there are hundreds of thousands of hands that have been played 16 times in the MBC of BBO that could be similarly analysed.

As I stated previously, my gut feel is that there probably isn't much in it so I prefer to score-up against an average but exclude some outliers to reduce the volatility. Scoring-up against an average, I think 12 results (16 scores less 4 outliers) is quite sufficient and has the advantage on BBO that the scoring is very timely.
Disclaimer: The above post may be a half-baked sarcastic rant intended to stimulate discussion and it does not necessarily coincide with my own views on this topic.
I bidding the suit below the suit I'm actually showing not to be described as a "transfer" for the benefit of people unfamiliar with the concept of a transfer

#52 User is offline   barmar 

  • Group: Admin
  • Posts: 21,674
  • Joined: 2004-August-21
  • Gender:Male

Posted 2010-February-10, 19:24

Cross-imp scores are much easier to relate to team game scores. Suppose you make your contract, while every other pair bids the same contract and goes down 1. Your cross-imp score will be the same as if you were in a team game and made your contract while the other table went down. As the number of other pairs that duplicate your result goes up, this score is then prorated based on the fraction of pairs that duplicated your result; e.g. if the results are half making, half going down, everyone gets half the team game IMP score (if it's a non-vul game, they get half the 10 IMPs for making game, or +/- 5).

Butler, on the other hand, calculates IMPs against a meaningless number, because scoring is non-linear. Suppose half make a non-vul game and half go down 1: the datum is 185, each result is 235 points away from it, and everyone gets +/- 6 IMPs. It's even worse for vul game: +/- 6 IMPs with cross-imps, +/- 8 with Butler. What's happening is that the IMP scale normally flattens out large score differences, and Butler stretches them out again.
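These figures can be checked with a short sketch. It assumes a 16-table field and a major-suit game (+420/-50 non-vul, +620/-100 vul), neither of which is stated in the post, and the function names are hypothetical.

```python
from bisect import bisect_left

# Inclusive upper bound of each band in the standard IMP table (0..23 IMPs);
# anything above the last bound scores 24 IMPs.
IMP_BOUNDS = [10, 40, 80, 120, 160, 210, 260, 310, 360, 420, 490, 590, 740,
              890, 1090, 1290, 1490, 1740, 1990, 2240, 2490, 2990, 3490, 3990]

def imps(diff):
    """Signed IMPs for a point difference."""
    n = bisect_left(IMP_BOUNDS, abs(diff))
    return n if diff >= 0 else -n

def ximp(own, others):
    """Average IMPs of `own` against each other table."""
    return sum(imps(own - other) for other in others) / len(others)

def butler(own, field):
    """IMPs of `own` against the plain average (datum) of the whole field."""
    datum = sum(field) / len(field)
    return imps(own - datum)

def half_and_half(make, down, tables=16):
    """(XIMP, Butler) for a pair that made game when half the field did."""
    field = [make] * (tables // 2) + [down] * (tables // 2)
    others = field[1:]  # everyone except one of the making pairs
    return ximp(make, others), butler(make, field)

print("non-vul:", half_and_half(420, -50))
print("vul:    ", half_and_half(620, -100))
```

With 16 tables a making pair meets 8 failing pairs among its 15 comparisons, so the cross-imp result is 8/15 of the team-game swing (about 5.3 and 6.4 IMPs) instead of exactly half. The Butler results come out at 6 and 8 IMPs.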

#53 User is offline   helene_t 

  • The Abbess
  • Group: Advanced Members
  • Posts: 17,253
  • Joined: 2004-April-22
  • Gender:Female
  • Location:Copenhagen, Denmark
  • Interests:History, languages

Posted 2010-February-11, 05:35

Barmar's argument is the main reason for preferring XIMP to Butler. There are a few other arguments:
- Less effect of the discreteness of the IMP scale. Suppose you chose 3NT over 4M for the same number of tricks so you get 10 points more than you might have. At Butler this will usually yield an IMP difference of 0 but may yield an IMP difference of 1, depending on what the datum score is. At XIMPs it will give you some fractional IMP because the 10 points will matter relative to some tables but not relative to others.
- How many IMPs do you get for being 16.754 points ahead of the datum? 0 or 1? Of course a decision has been made for the Butler score but that is somewhat arbitrary. At XIMPs you don't have the issue.
- You don't have to remove outliers at XIMP because the method is inherently robust. Removing outliers is probably necessary at Butler but there are problems with it. In a very small field you cannot really remove outliers. Suppose there are 3 tables and you remove one outlier at either end. If two tables have 6= and the 3rd table has 4+2, then datum would become 6= for 0 IMPs to the slam bidders, which feels wrong.
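The three-table example in the last point can be worked through as a sketch. The scores assume a non-vulnerable major-suit contract (6M= is +980, 4M+2 is +480), which the post does not specify, and the function names are hypothetical.

```python
from bisect import bisect_left

# Inclusive upper bound of each band in the standard IMP table (0..23 IMPs);
# anything above the last bound scores 24 IMPs.
IMP_BOUNDS = [10, 40, 80, 120, 160, 210, 260, 310, 360, 420, 490, 590, 740,
              890, 1090, 1290, 1490, 1740, 1990, 2240, 2490, 2990, 3490, 3990]

def imps(diff):
    """Signed IMPs for a point difference."""
    n = bisect_left(IMP_BOUNDS, abs(diff))
    return n if diff >= 0 else -n

def ximp_at(i, field):
    """XIMP of table i: average IMPs against every other table."""
    others = field[:i] + field[i + 1:]
    return sum(imps(field[i] - other) for other in others) / len(others)

def butler_trimmed(own, field, trim=1):
    """Butler score after dropping `trim` results at each end of the field."""
    kept = sorted(field)[trim:len(field) - trim]
    datum = sum(kept) / len(kept)
    return imps(own - datum)

scores = [980, 980, 480]  # two tables in slam, one in game with two overtricks
print([ximp_at(i, scores) for i in range(len(scores))])
print([butler_trimmed(s, scores) for s in scores])
```

With one outlier removed at each end, only a single 980 is left to form the datum, so the slam bidders score 0 IMPs at Butler while XIMP gives them +5.5 each. The game bidders lose 11 IMPs under both methods.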

But of course Dave is right. It matters very little, especially in large fields. In a field of two tables it makes a lot of difference: If you score 20 points more than the other table, you win 1 IMP at XIMP and 0 IMP at Butler. Of course nobody would use butler for a two-table event but it illustrates the difference between the two scorings in small fields: The spread of the comparison is reduced at butler because you are comparing to an average to which you contribute yourself. Unless you become an outlier and get removed. So Butler scoring in small fields favors strategies with right-skewed distributions of the points scored.

Also, because of the nonlinearity of the IMP scale, the shrinkage of the difference relative to datum is more severe for small differences than for large differences. So in small fields, Butler makes the high-stake boards (slam decisions) relatively important, more so than XIMP does. (This is not related to outlier removal).
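The two-table case makes this shrinkage easy to see, since the datum is the mean of the two scores and each pair is therefore only half the true difference away from it. A minimal sketch, with arbitrarily chosen sample swings and hypothetical function names:

```python
from bisect import bisect_left

# Inclusive upper bound of each band in the standard IMP table (0..23 IMPs);
# anything above the last bound scores 24 IMPs.
IMP_BOUNDS = [10, 40, 80, 120, 160, 210, 260, 310, 360, 420, 490, 590, 740,
              890, 1090, 1290, 1490, 1740, 1990, 2240, 2490, 2990, 3490, 3990]

def imps(diff):
    """Signed IMPs for a point difference."""
    n = bisect_left(IMP_BOUNDS, abs(diff))
    return n if diff >= 0 else -n

def two_table(diff):
    """(XIMP, Butler) for the pair that is `diff` points ahead of the other table."""
    cross = imps(diff)        # one comparison: the other table
    datum = imps(diff / 2)    # gap to the mean of the two scores
    return cross, datum

for diff in (20, 200, 2000):
    print(diff, two_table(diff))
```

For these swings Butler keeps 0 of 1 IMP, 3 of 5 IMPs and 14 of 19 IMPs, so the small swing shrinks the most. The IMP table is a step function, so the ratio does not rise smoothly for every possible swing.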
The world would be such a happy place, if only everyone played Acol :) --- TramTicket
