Suggestion for score when playing at a table NOOSANCE's Letter to Fred
#41
Posted 2010-February-09, 10:00
I don't care though, and I don't understand why anybody would care much.
#42
Posted 2010-February-09, 10:27
If that were the case, we would remove two good and two bad scores, so XIMP would barely be affected, unlike butler scores, which do get substantially affected by such an outlier removal. This is why it is so important to understand the difference between butler and XIMP when discussing these things. But OK, it would be better than nothing.
But it is not always the case. Sometimes a balooka auction can give a result within the range of normal results. Sometimes there will be only normal results. In such cases, outlier removal will make the score less meaningful.
I am sure outlier removal on balance makes butler scores more meaningful. I doubt that it would do any good to XIMP scores, though.
#43
Posted 2010-February-09, 10:40
#44
Posted 2010-February-09, 10:44
#45
Posted 2010-February-09, 10:52
Anyway, I am sure Fred and Uday are open to the possibility that a few of their choices might turn out to be worse than the choices made by competitors. I mean, it would be rare to see one company be right on everything.
#46
Posted 2010-February-09, 14:26
How do you calculate the score for said extremes? You have to use a different method.
Plus, assuming a result is such an outlier, the fact that the IMP scale is logarithmic gives some protection: normal scores won't get much out of it (compared to what the other normal scores get out of it, that is). It needs to compare across the field, though; if you section it, then it would be a huge difference.
#47
Posted 2010-February-10, 09:08
#48
Posted 2010-February-10, 10:08
uday, on Feb 10 2010, 10:08 AM, said:
I seriously doubt this. What noosance is suggesting sounds nice in theory, but in reality, it is unlikely to make much of a difference.
Let's say the number of comparisons is increased by a factor of 10 (160 board comparison). At best, all this will do is give you 10 times the number of "normal" results, and 10 times the number of
I think BBO had the right idea earlier when the Relaxed Bridge club and the "Masters"? club were added in addition to the MBC. If the Masters room had succeeded as the "serious" bridge club, and the boards played in that room only contained results from that room, better/more realistic results might have been obtained. However, we all know how successful that experiment was and I doubt it will be revisited.
jmoo.
So many experts, not enough X cards.
#49
Posted 2010-February-10, 11:24
You use the table result as score.
Whenever the "normal" result is close to zero this works perfectly well.
Now I took 3 random results from myhands.
2 of them (Board 1 and Board 3), though also not without flaws, work fine for that method, as the normal results for EW and NS are close to 0.
Board 2 has an irrational result that will rip the EW and NS scores about 10 IMPS apart for the normal play.
Board 1
Normal score, 2/16 missed to bid game.
Board 2
One lunatic 6♥xxN-6 for -3400 distorting the score for everybody else.
Board 3
Everybody makes 10 or 11 tricks, but one fails in 4NT: 1/16 distorted results.
Taking the number up to 160 might not change Board 1 and 3 much; perhaps the number of weak players is about 1/16 to 2/16.
But my "random" boards have only 1 lunatic score in 3 tries. Of course that is meaningless because of the small sample size and the lack of true randomness, but if we assume this is a trend, the distortion would reduce from about 10 IMPs to 3.3 IMPs.
Considering the efforts of abuse and BBO's automatic abuse detection, I would hope that the number of lunatics is smaller.
I think a higher number of comparisons would be nice, but it is not vital.
Since the focus is now on the Flash version, I think it's only a matter of time until the Windows client constrains further development of BBO in enough ways that it will be given up. I assume that at that time only very few people will still be using it.
#50
Posted 2010-February-10, 14:37
bid_em_up, on Feb 10 2010, 05:08 PM, said:
huh? If 10% of all scores are idiotic then 10% of 15 comparisons will be idiotic, on average, as will 10% of 159 comparisons.
The advantage of larger numbers of comparisons is that it reduces the imbalance between idiotic NS scores and idiotic EW scores. Say out of 159 comparisons there are approx. 10 idiotically bad scores for EW and 10 idiotically bad ones for NS. That would roughly cancel out, so your XIMP would be ok. But with only 15 comparisons there could easily be 2 idiotically bad ones for EW and none for NS. Of course it could also be the other way round, but the fluctuations in the difference between EW idiocy and NS idiocy are larger the smaller the number of comparisons.
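To put a rough number on that, here is a minimal sketch. It assumes each comparison is independently idiotic-for-EW with probability p, idiotic-for-NS with the same probability p, and normal otherwise; the function name and the model are illustrative assumptions, not anything BBO computes.

```python
from math import sqrt


def imbalance_std(n_comparisons, p_bad):
    """Std. dev. of (bad-for-EW count minus bad-for-NS count) per comparison."""
    # Assumed model: each comparison contributes +1 with probability p_bad,
    # -1 with probability p_bad, and 0 otherwise, so its variance is 2 * p_bad.
    # Averaging n independent comparisons divides the variance by n.
    return sqrt(2 * p_bad / n_comparisons)


p_bad = 10 / 159  # "approx. 10 out of 159" idiotic scores per side
for n in (15, 159):
    print(n, "comparisons:", round(imbalance_std(n, p_bad), 3))
```

Under these assumptions the imbalance per comparison comes out at about 0.09 for 15 comparisons and about 0.03 for 159, i.e. it shrinks with the square root of the number of comparisons.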
So I think increasing the number of comparisons would probably be a good thing.
As I said before, it is not so much the fluctuations of the eventual XIMP you get once the board has been played 16 times that bother me. 16 is a large number. But it means that for every 16 boards you play, on average one of them will have no immediate comparison (giving you 0 IMPs, which sometimes is hard to understand until one notices that there is no comparison), one will have only one immediate comparison (which may happen to be an idiotic one, or just an extreme result following from a reasonable but very unlucky decision) etc.
#51
Posted 2010-February-10, 18:43
Interestingly, I was informed by the scorer of Australia's Youth Butler Trials (which changed from scoring-up against an average to cross-imping a few years ago) that after rescoring the event using both methods, the final ranking was exactly the same for the three years that he recomputed everything; so I expect there isn't much difference. Youth Butler Trials might not be a great example as it's quite a small field (5 or 6 tables) with fairly significant variation in the standard of the players, so I'd be keen to hear of any similar analysis that might have been done in events such as the Cavendish. Of course there are hundreds of thousands of hands that have been played 16 times in the MBC of BBO that could be similarly analysed.
As I stated previously, my gut feel is that there probably isn't much in it so I prefer to score-up against an average but exclude some outliers to reduce the volatility. Scoring-up against an average, I think 12 results (16 scores less 4 outliers) is quite sufficient and has the advantage on BBO that the scoring is very timely.
I ♦ bidding the suit below the suit I'm actually showing not to be described as a "transfer" for the benefit of people unfamiliar with the concept of a transfer
#52
Posted 2010-February-10, 19:24
Butler, on the other hand, calculates IMPs against a meaningless number, because scoring is non-linear. Suppose half make a non-vul game (420) and half go down 1 (-50): the datum is 185, everyone is 235 points away from it, and everyone gets +/- 6 IMPs, against roughly +/- 5 with cross-imps. It's even worse for vul game: +/- 6 IMPs with cross-imps, +/- 8 with Butler. What's happening is that the IMP scale normally flattens out large score differences, and Butler stretches them out again.
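The half-and-half example can be checked with a small sketch. The assumptions are mine: a 16-table field, a major-suit game (420 non-vul, 620 vul), a Butler datum that is the plain unrounded mean with no outlier removal, and differences that fall between IMP-table steps counted in the higher bracket. The function names are illustrative only.

```python
from bisect import bisect_left

# Standard IMP table: largest point difference worth 0, 1, 2, ... 23 IMPs.
IMP_BOUNDS = [10, 40, 80, 120, 160, 210, 260, 310, 360, 420, 490, 590,
              740, 890, 1090, 1290, 1490, 1740, 1990, 2240, 2490, 2990,
              3490, 3990]


def imps(diff):
    """Signed IMPs for a point difference."""
    magnitude = bisect_left(IMP_BOUNDS, abs(diff))
    return magnitude if diff >= 0 else -magnitude


def butler(scores, i):
    """IMPs for table i against the plain mean of all scores (assumed datum)."""
    datum = sum(scores) / len(scores)
    return imps(scores[i] - datum)


def cross_imps(scores, i):
    """Average IMPs for table i against every other table."""
    others = scores[:i] + scores[i + 1:]
    return sum(imps(scores[i] - s) for s in others) / len(others)


non_vul = [420] * 8 + [-50] * 8   # half make game, half go down 1
vul = [620] * 8 + [-100] * 8

for name, field in (("non-vul", non_vul), ("vul", vul)):
    print(name, "Butler:", butler(field, 0),
          "cross-imps:", round(cross_imps(field, 0), 2))
```

With 16 tables the game makers get 6 (non-vul) and 8 (vul) at Butler, against about 5.3 and 6.4 at cross-imps; the cross-imp figures approach 5 and 6 as the field grows, because the comparison with your own table is excluded.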
#53
Posted 2010-February-11, 05:35
- Less effect of the discreteness of the IMP scale. Suppose you chose 3NT over 4M for the same number of tricks so you get 10 points more than you might have. At butler this will usually yield an IMP difference of 0 but may yield an IMP difference of 1, depending on what the datum score is. At XIMPs it will give you some fractional IMP because the 10 points will matter relative to some tables but not relative to others.
- How many IMPs do you get for being 16.754 points ahead of the datum? 0 or 1? Of course a decision has been made for the Butler score but that is somewhat arbitrary. At XIMPs you don't have the issue.
- You don't have to remove outliers at XIMP because the method is inherently robust. Removing outliers is probably necessary at Butler but there are problems with it. In a very small field you cannot really remove outliers. Suppose there are 3 tables and you remove one outlier at either end. If two tables have 6♥= and the 3rd table has 4♥+2, then datum would become 6♥= for 0 IMPs to the slam bidders, which feels wrong.
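The three-table case from the last point can be worked through as a sketch. I am assuming non-vulnerable scores (6♥= is 980, 4♥+2 is 480) and a Butler datum that is the mean of what is left after dropping one result at each end; the function names are illustrative only.

```python
from bisect import bisect_left

# Standard IMP table: largest point difference worth 0, 1, 2, ... 23 IMPs.
IMP_BOUNDS = [10, 40, 80, 120, 160, 210, 260, 310, 360, 420, 490, 590,
              740, 890, 1090, 1290, 1490, 1740, 1990, 2240, 2490, 2990,
              3490, 3990]


def imps(diff):
    """Signed IMPs for a point difference."""
    magnitude = bisect_left(IMP_BOUNDS, abs(diff))
    return magnitude if diff >= 0 else -magnitude


def trimmed_datum(scores, trim=1):
    """Mean of the scores left after dropping `trim` results at each end."""
    kept = sorted(scores)[trim:len(scores) - trim]
    return sum(kept) / len(kept)


def cross_imps(scores, i):
    """Average IMPs for table i against every other table."""
    others = scores[:i] + scores[i + 1:]
    return sum(imps(scores[i] - s) for s in others) / len(others)


# Assumed non-vulnerable: two tables in 6H making, one table in 4H+2.
field = [980, 980, 480]
datum = trimmed_datum(field)
for i, score in enumerate(field):
    print(score, "Butler:", imps(score - datum),
          "cross-imps:", cross_imps(field, i))
```

Under these assumptions the trimmed datum is 980, so the slam bidders get 0 IMPs at Butler while the 4♥ table loses 11; at cross-imps the slam bidders get +5.5 each and the 4♥ table gets -11.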
But of course Dave is right. It matters very little, especially in large fields. In a field of two tables it makes a lot of difference: If you score 20 points more than the other table, you win 1 IMP at XIMP and 0 IMP at Butler. Of course nobody would use butler for a two-table event but it illustrates the difference between the two scorings in small fields: The spread of the comparison is reduced at butler because you are comparing to an average to which you contribute yourself. Unless you become an outlier and get removed. So Butler scoring in small fields favors strategies with right-skewed distributions of the points scored.
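The two-table case is easy to verify. In general, with n tables and a datum that includes your own score, your distance from the datum is (n-1)/n times your distance from the average of the other tables, so with two tables it is halved. A minimal sketch, assuming an unrounded mean as datum and hypothetical scores of 420 and 400:

```python
from bisect import bisect_left

# Standard IMP table: largest point difference worth 0, 1, 2, ... 23 IMPs.
IMP_BOUNDS = [10, 40, 80, 120, 160, 210, 260, 310, 360, 420, 490, 590,
              740, 890, 1090, 1290, 1490, 1740, 1990, 2240, 2490, 2990,
              3490, 3990]


def imps(diff):
    """Signed IMPs for a point difference."""
    magnitude = bisect_left(IMP_BOUNDS, abs(diff))
    return magnitude if diff >= 0 else -magnitude


def butler_two_tables(mine, theirs):
    """IMPs against a datum that is the mean of both tables (assumed, unrounded)."""
    datum = (mine + theirs) / 2
    return imps(mine - datum)


def ximp_two_tables(mine, theirs):
    """IMPs against the only other table."""
    return imps(mine - theirs)


# Hypothetical scores: 20 points better than the other table.
mine, theirs = 420, 400
print("XIMP:", ximp_two_tables(mine, theirs),
      "Butler:", butler_two_tables(mine, theirs))
```

The 20-point lead is worth 1 IMP at XIMP, but only 10 points against the datum, which is 0 IMPs at Butler.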
Also, because of the nonlinearity of the IMP scale, the shrinkage of the difference relative to datum is more severe for small differences than for large differences. So in small fields, Butler makes the high-stake boards (slam decisions) relatively important, more so than XIMP does. (This is not related to outlier removal).