World Truth League Scoring System

🇺🇸 Happy 4th of July 🇺🇸 Following up on my previous post: Steve Kuhn, several others, and I have been brainstorming a new type of second-generation information market, the World Truth League. (Here is a link to our 16-page white paper, which contains a tentative sketch of our idea, including a one-page prospectus/executive summary.) So, how would our proposed truth league keep score? In brief, we propose a simple Bayesian scoring system in which teams would have an opportunity to update their beliefs. The system consists of three discrete steps, which are further described below the fold and can be summed up in three words: wash, rinse, and repeat.

1. Wash

Disagreement is to the World Truth League what shampoo is to hair: as a general matter, you can’t wash your hair without a bottle of shampoo; likewise, there would be no need for a truth league without disagreement on matters of factual controversy. More specifically, teams in our proposed league would play a series of “truth matches” or “truth contests”. Each match or contest, in turn, would consist of some question or hypothesis that is currently in controversy (i.e. one on which a consensus has yet to take hold among experts or the public at large), and each match would be played over several rounds or until a consensus emerges.

Where would these controversial questions or disputed hypotheses come from? Before the start of the regular season, the Commissioner’s Office of the truth league (Steve and I!) would receive input from fans and then curate and post a finite list of trending forecasting questions regarding discrete future events, or historical hypotheses regarding uncertain events from the past. The truth league could either “steal” these questions from other platforms, like Manifold, Kalshi, or even Twitter, or it could develop its own set of original questions. Is the NBA draft lottery fixed? Is Iran developing weapons of mass destruction? No question or controversy would be off limits.

2. Rinse

After shampooing and rinsing your hair, you should feel better and cleaner than before. Ideally, truth-seeking should yield similar results: after discussing and debating a question, and then updating our beliefs based on new arguments or evidence, we should begin arriving at the truth of the matter. To this end, the teams in our proposed truth league would field the designated forecasting questions and historical hypotheses (see Step 1 above) over a series of rounds. During each round, the teams would provide their best “truth estimates” (i.e. their subjective probability estimates or degrees of belief) in response to the specific forecasting questions or historical hypotheses in play. They would also disclose any information, evidence, or arguments on which they are basing those estimates.

Each team’s truth estimate would thus represent that team’s best guess about the likelihood that an event will occur in the future (a forecast estimate) or about the likelihood that an event actually happened in the past (a hindcast estimate). For simplicity, each team’s truth estimate would consist of a real number on some standard scale that starts at 0, such as a ten-point or 100-point scale (see, for example, the simple truth-estimate scale pictured below). On a 100-point scale, 0 = zero probability, 100 = total certainty, and the numbers in between = various degrees of belief or confidence levels. At the close of each round, a leaderboard would immediately announce the results. Teams would win points depending on how “right” their truth estimates are.

[Image: a simple truth-estimate scale]
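One way to picture a single truth estimate is as a small record holding the team, its number on the scale, and the evidence it discloses. The following is only a sketch under assumptions: the class name `TruthEstimate`, its field names, and the choice of a 0 to 100 scale are hypothetical and do not come from the white paper.

```python
from dataclasses import dataclass, field


@dataclass
class TruthEstimate:
    """Hypothetical record for one team's answer in one round."""

    team: str
    value: float  # degree of belief on an assumed 0-100 scale
    evidence: list[str] = field(default_factory=list)  # disclosed arguments

    def __post_init__(self) -> None:
        # Reject anything outside the scale: 0 = zero probability,
        # 100 = total certainty.
        if not 0 <= self.value <= 100:
            raise ValueError(f"estimate must be between 0 and 100, got {self.value}")

    @property
    def probability(self) -> float:
        """The same estimate expressed as a probability between 0 and 1."""
        return self.value / 100


# Example: Team A's estimate from the lab-leak illustration.
team_a = TruthEstimate("Team A", 90, evidence=["(arguments disclosed by the team)"])
```

Storing the evidence next to the number reflects the requirement that teams disclose the basis for each estimate, not just the estimate itself.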

By way of example, assume for mathematical simplicity that the truth league consists of four teams. Further assume the question presented is some variant of the much-debated Wuhan lab-leak theory. One team (say, Team A) might be super-confident that COVID-19 was the result of a lab leak, thus assigning a 90% probability to the lab-leak theory. Another team (Team B), by contrast, might be super-confident in the opposite direction, that COVID-19 was not caused by a lab leak, thus assigning only a 10% probability to the lab-leak theory. Lastly, the remaining two teams (Teams C and D) might lean toward and against the lab-leak theory, respectively, but with far less confidence, assigning only 55% and 45% probabilities to it.

Truth league team    Probability of lab-leak theory
Team A               90%
Team B               10%
Team C               55%
Team D               45%

In the lab-leak illustration, the average of the four teams’ probability estimates is 50%. Suppose, purely for the sake of illustration, that estimates above the group average turn out to point in the right direction. Teams A and C would therefore win points, since they both guessed in the right direction. Team A, however, would be awarded +2 points, since it was super-confident in its belief, while Team C would be awarded only +1 point, since it was less confident in its guess. (Why two and one points, respectively? Because points scale with distance from the group average or mean: Team A’s 90% estimate sits 40 percentage points above it, while Team C’s 55% estimate sits just 5 points above it. The two-point and one-point awards are best read as illustrative tiers: the standard deviation of these four estimates is about 28.5 points, so the two teams are roughly 1.4 and 0.2 standard deviations from the mean, not exactly two and one.) By the same token, Teams B and D would lose points, since they guessed in the wrong direction. Two points would be deducted from Team B (-2), while only one point would be deducted from Team D (-1).
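The arithmetic of this round can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the league's actual rule: the function name `score_round`, the `favored_high` flag (which side of the mean counts as the "right" direction), and the 25-point `big_gap` cutoff between a two-point and a one-point swing are all hypothetical. The z-score is reported only for reference.

```python
from statistics import fmean, pstdev


def score_round(estimates, favored_high, big_gap=25.0):
    """Score one round of truth estimates on a 0-100 scale.

    estimates:    dict mapping team name -> estimate.
    favored_high: True if estimates above the group mean count as the
                  "right" direction (an assumption supplied by the caller).
    big_gap:      hypothetical cutoff, in percentage points from the mean,
                  separating a 2-point swing from a 1-point swing.
    """
    values = list(estimates.values())
    mean = fmean(values)
    spread = pstdev(values)  # population standard deviation of the round
    results = {}
    for team, estimate in estimates.items():
        gap = estimate - mean
        if gap == 0:
            points = 0  # sitting exactly on the mean earns nothing
        else:
            magnitude = 2 if abs(gap) >= big_gap else 1
            right_side = (gap > 0) == favored_high
            points = magnitude if right_side else -magnitude
        z_score = gap / spread if spread else 0.0
        results[team] = {"points": points, "z_score": round(z_score, 2)}
    return mean, results


# The four estimates from the lab-leak illustration.
estimates = {"Team A": 90, "Team B": 10, "Team C": 55, "Team D": 45}
mean, results = score_round(estimates, favored_high=True)
for team, outcome in results.items():
    print(team, outcome)
```

With these inputs the sketch gives +2, -2, +1, and -1 points to Teams A, B, C, and D. A rule based strictly on standard deviations would need a different set of cutoffs.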

3. Repeat

We now repeat the process all over again. Each question would be played over multiple rounds until the averages of the teams’ estimates converge toward the truth.
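The repeat step can be sketched as a loop with a stopping rule. The sketch below treats "a consensus emerges" as the gap between the highest and lowest estimate falling within a tolerance. That reading, the 10-point tolerance, and the function names are assumptions. Only the first round of numbers comes from the lab-leak illustration; the later rounds are invented purely to show the loop running.

```python
def consensus_reached(estimates, tolerance=10.0):
    """Assumed consensus test: highest and lowest estimates are close together."""
    return max(estimates) - min(estimates) <= tolerance


def play_match(rounds, tolerance=10.0):
    """Play rounds in order until consensus or until the rounds run out.

    rounds: list of rounds, each a list of team estimates on a 0-100 scale.
    Returns (number of rounds played, mean estimate of the last round played).
    """
    if not rounds:
        raise ValueError("a match needs at least one round")
    for round_number, estimates in enumerate(rounds, start=1):
        mean = sum(estimates) / len(estimates)
        if consensus_reached(estimates, tolerance):
            return round_number, mean
    return len(rounds), mean


# Round 1 is the lab-leak illustration; rounds 2 and 3 are hypothetical updates.
rounds = [
    [90, 10, 55, 45],
    [80, 30, 60, 50],
    [72, 64, 68, 66],
]
rounds_played, final_mean = play_match(rounds)
print(rounds_played, final_mean)
```

Note that this rule only detects agreement among the teams. Whether that agreement tracks the truth is a separate question the sketch does not address.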

About F. E. Guerra-Pujol

When I’m not blogging, I am a business law professor at the University of Central Florida.