System performance metrics

#1 helene_t

The Abbess

Group: Advanced Members
Posts: 17,240
Joined: 2004-April-22
Gender:Female
Location:Copenhagen, Denmark
Interests:History, languages

Posted Today, 05:29

I have determined opening bids for randomly dealt hands using various bidding systems - so far SA, English Acol, Scottish Acol, KS, Wei, Tarzan, IMPrecision, and Moscito. Will add some more.

And then I plan to derive various system performance metrics such as
- how often can responder immediately decide on something useful such as forcing to 2NT, ruling out a major suit fit etc.
- how much bidding space do we, on average, have left (taking any interference into account) below the safety level on those hands where responder can't immediately decide anything
- how often can either partner decide if it's right (not) to throw in the towel after opps' WJO
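As a rough illustration of the first step (determining opening bids for randomly dealt hands), here is a minimal sketch. The `toy_opening` rule is a made-up, heavily simplified 5-card-major scheme used only for illustration; it is not any of the systems named above, and all function names are hypothetical.

```python
import random
from collections import Counter

SUITS = "SHDC"
RANKS = "AKQJT98765432"
HCP = {"A": 4, "K": 3, "Q": 2, "J": 1}
DECK = [(s, r) for s in SUITS for r in RANKS]


def deal_hand(rng):
    """Return 13 random cards as (suit, rank) tuples."""
    return rng.sample(DECK, 13)


def hcp(hand):
    """High-card points on the 4-3-2-1 scale."""
    return sum(HCP.get(rank, 0) for _, rank in hand)


def shape(hand):
    """Suit lengths in spades, hearts, diamonds, clubs order."""
    return tuple(sum(1 for s, _ in hand if s == suit) for suit in SUITS)


def toy_opening(hand):
    """Hypothetical toy rule (an assumption, not a real system)."""
    points = hcp(hand)
    s, h, d, c = shape(hand)
    if points < 12:
        return "Pass"
    balanced = sorted((s, h, d, c)) in ([3, 3, 3, 4], [2, 3, 4, 4], [2, 3, 3, 5])
    if balanced and 15 <= points <= 17:
        return "1NT"
    if s >= 5 and s >= h:
        return "1S"
    if h >= 5:
        return "1H"
    return "1D" if d >= 4 and d >= c else "1C"


def opening_frequencies(system, n_deals, seed=0):
    """Relative frequency of each opening bid over n_deals random hands."""
    rng = random.Random(seed)
    counts = Counter(system(deal_hand(rng)) for _ in range(n_deals))
    return {bid: k / n_deals for bid, k in counts.items()}


if __name__ == "__main__":
    for bid, freq in sorted(opening_frequencies(toy_opening, 10000).items()):
        print(f"{bid:5s} {freq:.3f}")
```

Swapping in a different `system` function would give the same table for another set of opening agreements, which is the part that scales cheaply to many systems.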

You might wonder what use such a data set is ....

Jan Eric Larsson wrote a book on comparison of bidding systems, using simulated tournaments. Now it is quite a lot of work to extend this, as it requires implementing whole bidding systems. But if one could identify metrics, based on the opening bids alone, that predict system performance, one could easily scale this to many systems, maybe even search a large space of systems for the optimal one.

Or at least, when AWM or Kungsgeten design their own crazy system they could quickly see which weaknesses they have to work on.

Any thoughts?

The world would be such a happy place, if only everyone played Acol :) --- TramTicket

#2 P_Marlowe

Group: Advanced Members
Posts: 10,351
Joined: 2005-March-18
Gender:Male

Posted Today, 06:47

There was a post on a bridge newsgroup doing something similar. As far as I understood, he tried to determine the
information content of the first bid.
It was based on information theory (?), and he assumed that the system quality after the opening bid was the
same for all systems.

The user had a name like tycsn?

With kind regards
Uwe Gebhardt (P_Marlowe)

#3 helene_t

The Abbess

Group: Advanced Members
Posts: 17,240
Joined: 2004-April-22
Gender:Female
Location:Copenhagen, Denmark
Interests:History, languages

Posted Today, 07:22

P_Marlowe, on 2025-February-28, 06:47, said:

This must be our Tysen, he wrote about that idea here also:

https://www3.dal13.s...post__p__473663

Basically he optimized his bidding system using the entropy of the distribution of the optimal contract, as seen by the partner of the player whose bid he wanted to optimize. And then he added an aggressiveness penalty to prevent the system from using up too much bidding space initially.

The "optimal" system turned out to involve a pass in first or second seat promising spade length.
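To make the entropy idea concrete, here is a minimal sketch of one way such an objective could be written. It is not Tysen's actual formula: the linear penalty on bidding steps, the `penalty` weight and the `steps_used` table are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(counts):
    """Shannon entropy in bits of a Counter of outcomes."""
    total = sum(counts.values())
    return -sum((k / total) * math.log2(k / total) for k in counts.values() if k)


def expected_remaining_entropy(samples):
    """Average entropy of the optimal contract as seen by partner after the bid.

    samples is a list of (bid, optimal_contract) pairs, e.g. from simulated deals.
    """
    by_bid = defaultdict(Counter)
    for bid, contract in samples:
        by_bid[bid][contract] += 1
    n = len(samples)
    return sum(sum(c.values()) / n * entropy(c) for c in by_bid.values())


def penalized_score(samples, steps_used, penalty=0.1):
    """Lower is better: remaining uncertainty plus a cost for space consumed.

    steps_used maps each bid to the number of bidding steps it uses up
    (a hypothetical stand-in for the aggressiveness penalty).
    """
    avg_steps = sum(steps_used[bid] for bid, _ in samples) / len(samples)
    return expected_remaining_entropy(samples) + penalty * avg_steps


if __name__ == "__main__":
    samples = [("1C", "3NT"), ("1C", "4S"), ("1S", "4S"), ("1S", "4S")]
    print(penalized_score(samples, {"1C": 1, "1S": 4}))
```

In this toy data, 1C leaves one bit of uncertainty about the final contract and 1S leaves none, so the unpenalized average is half a bit; the penalty term then charges 1S for the extra space it takes.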

#4 P_Marlowe

Group: Advanced Members
Posts: 10,351
Joined: 2005-March-18
Gender:Male

Posted Today, 07:30

helene_t, on 2025-February-28, 07:22, said:

Yes, Tysen, ... my spelling was hitting 3 out of 5, and a bonus for a letter in the wrong position,
not a bad showing of my long term memory, given that I did not think about this the last ? years.

Anyway, the result is basically a HUM: pass promises a hand stronger than the hands that open.
And HUM systems are prohibited, you can't play those in reg. environments.
The question is, can one do the optimization he was doing, using only allowed systems?

With kind regards
Uwe Gebhardt (P_Marlowe)

#5 DavidKok

Group: Advanced Members
Posts: 2,691
Joined: 2020-March-30
Gender:Male
Location:Netherlands

Posted Today, 09:39

I think this is an interesting question, and I'd like to know more about what sort of results pop out.

At the same time I think it's terrifyingly easy to use too simple a model. Bridge is a pretty complicated game, and reducing bidding systems or even opening agreements to a small set of criteria is dangerous. Personally I would approach this by listing as many criteria as possible, in an attempt to reduce the risk of missing something important. Here's a few suggestions:

- What is our combined HCP distribution given responder's hand and opener's first bid? If the chance of it being 25+ is low enough (say, under 5%) or it being 24- is low enough (ditto) we have an answer to 'can we make game'. If not, we need more space to figure it out.
- What is the probability of us having an 8(+)-card major fit?
- What is the probability of the opponents having an 8(+)-card major fit?
- Can we establish a safety level? Important break points are 1NT, 2M, 3m, 3NT and 4M.
- Conditional on partner opening and RHO overcalling at the 1- or 2-level (with an appropriate hand), how often do we have a safety level beyond that?
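The first two questions in this list can be estimated by simple rejection sampling from responder's seat. The sketch below is only one possible approach: the `opener_ok` predicate stands in for whatever the opening bid promises, and the function names and the 5% default are illustrative assumptions taken from the thresholds suggested above.

```python
import random

SUITS = "SHDC"
RANKS = "AKQJT98765432"
HCP = {"A": 4, "K": 3, "Q": 2, "J": 1}
DECK = [(s, r) for s in SUITS for r in RANKS]


def hcp(hand):
    """High-card points on the 4-3-2-1 scale."""
    return sum(HCP.get(rank, 0) for _, rank in hand)


def length(hand, suit):
    """Number of cards held in the given suit."""
    return sum(1 for s, _ in hand if s == suit)


def responder_view(responder, opener_ok, n_trials=20000, seed=0):
    """Estimate P(combined HCP >= 25) and P(8+ card major fit).

    Opener's hand is drawn from the 39 cards responder cannot see and kept
    only if it satisfies opener_ok (the constraint implied by the opening).
    """
    rng = random.Random(seed)
    held = set(responder)
    rest = [card for card in DECK if card not in held]
    kept = game = major_fit = 0
    for _ in range(n_trials):
        opener = rng.sample(rest, 13)
        if not opener_ok(opener):
            continue
        kept += 1
        if hcp(opener) + hcp(responder) >= 25:
            game += 1
        if any(length(opener, m) + length(responder, m) >= 8 for m in "SH"):
            major_fit += 1
    if kept == 0:
        raise ValueError("no sampled hand matched the opening-bid constraint")
    return {"p_25_plus": game / kept, "p_major_fit": major_fit / kept, "accepted": kept}


def decided(p, threshold=0.05):
    """True if the question is settled either way at the given threshold."""
    return p <= threshold or p >= 1 - threshold


if __name__ == "__main__":
    responder = ([("S", r) for r in "AK432"] + [("H", r) for r in "A32"]
                 + [("D", r) for r in "Q32"] + [("C", r) for r in "32"])
    view = responder_view(responder, lambda h: 12 <= hcp(h) <= 14)
    print(view, decided(view["p_25_plus"]), decided(view["p_major_fit"]))
```

The same loop could be extended to deal the opponents' hands as well, which would cover the questions about their major fit and about overcalls, at the cost of a more detailed model of when they bid.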

Maybe double dummy simulations of game odds can take the place of my crude HCP suggestion. Though already some flaws snuck in: in general we should keep bidding until we think bidding on is negative expected value, which is not the same as bidding on until these questions have been answered to a satisfactory (but arbitrary) threshold - instead such thresholds should depend on the amount of bidding space left to the safety level.

#6 jdiana

Group: Full Members
Posts: 297
Joined: 2021-November-17

Posted Today, 10:51

Please don't take this the wrong way - the fact that you can even do such a thing is amazing to me - but, just from a non-technical, pragmatic, point of view it would seem that there might be better uses for the time and effort that this would require.

Deriving system performance metrics based simply on the opening bid seems questionable to me. There are so many decisions that go into any bidding system. No one plays "2/1" exactly the same way. Also, what is the end goal? Is it to conclude that system x is best? I think what I'm hearing is that you might conclude that system x is relatively weaker than system y with respect to a particular metric, which would guide system x players toward trying to shore up that part of their system. That could be useful if it can be done accurately.

When I read about ideas like this - and there are many of them, given how many smart and computer-savvy people play bridge - I always think "How would this help me be better at the game?" (I understand that I'm probably not the target audience - experts will be more interested.)

As an example of something that I think would be really useful, Matthew Kidd has talked about adding a feature to BBO Helper as follows:

"A bigger idea is to provide a heads-up display of your opponent's style based on their playing history, classifying actions along both aggressiveness and wildness dimensions. So for takeout doubles, where a player averages on an (HCP + distribution points) histogram would measure their aggressive while their proclivity to make off-shape doubles would measure their wildness, e.g. a takeout double with 5-3 in the majors or a stiff in the clubs when 1♦ was opened, would drive up your wildness score, and a doubleton in an unbid major even more so. Similarly one can examine two-level overcalls, perhaps with a special statistic for how often they overcall on 5-3-3-2 shape—you see a lot of this even in open ACBL events; sometimes it can't be punished but sometimes it merely goes unpunished."

https://bridgewinner...2-2-se8mmlwks3/

Maybe it would be interesting to poll experts about what unanswered questions they think are most worth solving. It might also be useful to think about what BBO could do to help further this type of research. For example, if everyone was forced to click on a radio button to describe their basic system, would that make this kind of research easier? What else could BBO do to facilitate research?

Again, feel free to ignore these comments from a Luddite. They're just my two cents.

#7 DavidKok

Group: Advanced Members
Posts: 2,691
Joined: 2020-March-30
Gender:Male
Location:Netherlands

Posted Today, 11:02

jdiana, on 2025-February-28, 10:51, said:

Deriving system performance metrics based simply on the opening bid seems questionable to me. There are so many decisions that go into any bidding system. No one plays "2/1" exactly the same way. Also, what is the end goal? Is it to conclude that system x is best? I think what I'm hearing is that you might conclude that system x is relatively weaker than system y with respect to a particular metric, which would guide system x players toward trying to shore up that part of their system. That could be useful if it can be done accurately.

My personal motivation for focusing on opening bids in particular is twofold. Firstly many modern expert systems do really well on constructed auctions, reaching a good contract 90+% of the time or so. In a sense, there are serious diminishing marginal returns to trying to optimise your constructive system over 'expert standard' - though in practice that is a pretty elusive target to hit in the first place. Secondly the bidding has become more competitive (what, really?!), with around 60-70% of modern auctions being contested. It is very reasonable to assume that you might have to deal with interference after your opening bid, and that your partner only has one shot to tell you what to do before you have to make a decision at a (possibly uncomfortably) high level. Therefore optimising the relevant information content conditional on expecting interference is of great value in modern bidding. This flies in the face of classical thinking, where people tried to squeeze their Fibonacci and exponential sequences in their frequency distributions.

#8 P_Marlowe

Group: Advanced Members
Posts: 10,351
Joined: 2005-March-18
Gender:Male

Posted Today, 12:42

jdiana, on 2025-February-28, 10:51, said:

<snip>
As an example of something that I think would be really useful, Matthew Kidd has talked about adding a feature to BBO Helper as follows:

"A bigger idea is to provide a heads-up display of your opponent's style based on their playing history, classifying actions along both aggressiveness and wildness dimensions. So for takeout doubles, where a player averages on an (HCP + distribution points) histogram would measure their aggressive while their proclivity to make off-shape doubles would measure their wildness, e.g. a takeout double with 5-3 in the majors or a stiff in the clubs when 1♦ was opened, would drive up your wildness score, and a doubleton in an unbid major even more so. Similarly one can examine two-level overcalls, perhaps with a special statistic for how often they overcall on 5-3-3-2 shape—you see a lot of this even in open ACBL events; sometimes it can't be punished but sometimes it merely goes unpunished."
<snip>

This is a full disclosure / monitoring problem / question. If you have a database like in chess, you could do it.
At the moment the majority of games are not centrally accessible in an electronic way.
The hand records of most big tournaments are available online, but scattered around.

With kind regards
Uwe Gebhardt (P_Marlowe)
