
Let uncertainty factor be configurable instead of hardcoded 2 #104

Open · tychota opened this issue Jul 1, 2021 · 0 comments

Labels: enhancement (New feature or request), good first issue (Good for newcomers)

tychota commented Jul 1, 2021

(Issue opened after a Discord discussion.)

[gjm] I wonder whether it would be an improvement to make use of the prior information we already have: most networks are similar in strength to previous ones, or some simple extrapolation from previous ones. So why not plot a graph that takes a prior based on other recent networks, updates on evidence from games played, and shows something like the 2.5, 50, 97.5 percentiles of the posterior?

[lightvector] Right now it does show the posterior already, it's just that the prior is very weak. The nice thing about having only a very weak prior that current network = previous network is that the entire thing is essentially unbiased. I know you tried to cover that by mentioning "simple extrapolation", but then you start having additional parameters relating to how to do that, and it gets messy, particularly since the optimization algorithm currently doesn't handle priors like "assume things continue in straight line log-scale trends" or whatever.
It seems a lot simpler to just move the "strongest confident" algorithm to be 3 sigma instead of 2 and let things continue to be nearly-unbiased.
Which is still a (simple) open task.
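
To make the sigma point concrete, here is a tiny sketch (with made-up numbers) of how raising the factor from 2 to 3 changes which network a "strongest confident" rule picks:

    # Made-up numbers: netA is rated higher but measured less precisely than netB.
    def lower_confidence(log_gamma, uncertainty, factor):
        return log_gamma - factor * uncertainty

    candidates = {"netA": (10.0, 0.6), "netB": (9.0, 0.2)}  # (log_gamma, uncertainty)

    for factor in (2.0, 3.0):
        best = max(candidates, key=lambda name: lower_confidence(*candidates[name], factor))
        print(f"factor={factor}: strongest confident network is {best}")

    # factor=2.0 picks netA (10 - 1.2 = 8.8 beats 9 - 0.4 = 8.6);
    # factor=3.0 picks netB (10 - 1.8 = 8.2 loses to 9 - 0.6 = 8.4).
    # A larger factor is more conservative towards uncertain networks.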

First, why do we need to compute that value?

We need the lower confidence bound (log_gamma_lower_confidence) to select the best network (so that, for example, katrain can download it).

We need the upper confidence bound (log_gamma_upper_confidence) to select networks worth spending rating games on. That way, we eliminate falsely good networks (high Elo but high uncertainty).

We need to store log_gamma_lower_confidence = log_gamma - <some_factor> * log_gamma_uncertainty and log_gamma_upper_confidence = log_gamma + <some_factor> * log_gamma_uncertainty in the database, because we are going to sort by log_gamma_lower_confidence or log_gamma_upper_confidence, and without database indexes those queries would be slow. Short of relying on SQL dark magic, the results need to be precomputed so that they can be indexed.
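
As a sketch of why precomputation matters, the fields could look like this (the exact model in katago-server may differ; the db_index placement here is illustrative):

    from django.db import models

    class Network(models.Model):
        log_gamma = models.FloatField(default=0)
        log_gamma_uncertainty = models.FloatField(default=0)
        # Precomputed so that ORDER BY can use a plain b-tree index:
        log_gamma_lower_confidence = models.FloatField(default=0, db_index=True)
        log_gamma_upper_confidence = models.FloatField(default=0, db_index=True)

    # Typical queries that need those indexes:
    # strongest confident network (what e.g. katrain would download):
    #   Network.objects.order_by("-log_gamma_lower_confidence").first()
    # most promising network to schedule rating games for:
    #   Network.objects.order_by("-log_gamma_upper_confidence").first()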

Then, when do we need to compute that value?

We first set the fields on upload, which is:

    data = validated_data.copy()
    if "parent_network" in data:
        if data["parent_network"]:
            # New networks inherit the parent's rating with a default uncertainty,
            # and the bounds use the hardcoded factor of 2.
            data["log_gamma"] = data["parent_network"].log_gamma
            data["log_gamma_uncertainty"] = 2.0
            data["log_gamma_lower_confidence"] = data["log_gamma"] - 2 * data["log_gamma_uncertainty"]
            data["log_gamma_upper_confidence"] = data["log_gamma"] + 2 * data["log_gamma_uncertainty"]
    if "log_gamma" in data and "log_gamma_uncertainty" not in data:
        data["log_gamma_uncertainty"] = 2.0
    if "log_gamma" in data and "log_gamma_uncertainty" in data:
        # Same hardcoded factor of 2 again.
        if "log_gamma_lower_confidence" not in data:
            data["log_gamma_lower_confidence"] = data["log_gamma"] - 2 * data["log_gamma_uncertainty"]
        if "log_gamma_upper_confidence" not in data:
            data["log_gamma_upper_confidence"] = data["log_gamma"] + 2 * data["log_gamma_uncertainty"]
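
One way to remove the duplication would be a small shared helper that both the serializer and the rating task call. This is a sketch only; compute_confidence_bounds and rating_uncertainty_factor are hypothetical names, not existing code:

    def compute_confidence_bounds(log_gamma, log_gamma_uncertainty, factor=2.0):
        """Return (lower, upper) confidence bounds for a rating.

        `factor` would come from the run (e.g. run.rating_uncertainty_factor,
        a hypothetical field) instead of being hardcoded at each call site.
        """
        delta = factor * log_gamma_uncertainty
        return log_gamma - delta, log_gamma + delta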

We also update the networks every 10 minutes here:

    current_run = Run.objects.select_current()
    if current_run is None:
        return
    network_ratings = Network.pandas.get_ratings_dataframe(current_run)
    anchor_network = Network.objects.filter(run=current_run).order_by("pk").first()
    if anchor_network is None:
        return
    detailed_tournament_result = RatingGame.pandas.get_detailed_tournament_results_dataframe(
        current_run, for_tests=for_tests
    )
    assert_no_match_with_same_network = (
        detailed_tournament_result["reference_network"] != detailed_tournament_result["opponent_network"]
    )
    detailed_tournament_result = detailed_tournament_result[assert_no_match_with_same_network]
    bayesian_rating_service = BayesianRatingService(
        network_ratings, anchor_network.id, detailed_tournament_result, current_run.virtual_draw_strength
    )
    new_network_ratings = bayesian_rating_service.update_ratings_iteratively(current_run.elo_number_of_iterations)
    Network.pandas.bulk_update_ratings_from_dataframe(new_network_ratings)
which uses:

    dataframe["log_gamma_upper_confidence"] = dataframe["log_gamma"] + 2 * dataframe["log_gamma_uncertainty"]
    dataframe["log_gamma_lower_confidence"] = dataframe["log_gamma"] - 2 * dataframe["log_gamma_uncertainty"]

This is good, as we don't need to care about old networks: they would be updated anyway. :slight_smile:
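
With a configurable factor, that same dataframe step could read the value from the run instead. A sketch, assuming the hypothetical rating_uncertainty_factor field proposed below:

    factor = current_run.rating_uncertainty_factor  # hypothetical configurable field
    dataframe["log_gamma_upper_confidence"] = dataframe["log_gamma"] + factor * dataframe["log_gamma_uncertainty"]
    dataframe["log_gamma_lower_confidence"] = dataframe["log_gamma"] - factor * dataframe["log_gamma_uncertainty"]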

How to change it?

As you can see, there is some coupling (the same magic constant appears in two places), so maybe the 2 can be extracted into the Run model as an integer field (or maybe a float field), similar to the existing:

    elo_number_of_iterations = IntegerField(
        _("Elo computation number of iterations"),
        help_text=_("How many iterations to use per celery task to compute log_gammas and Elos."),
        default=10,
        validators=[validate_positive],
    )

and add it to the admin: this way it can be changed at runtime.
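
A minimal sketch of what such a field could look like (the name rating_uncertainty_factor and the help text are suggestions, not existing code):

    rating_uncertainty_factor = FloatField(
        _("Rating uncertainty factor"),
        help_text=_("How many standard deviations of log_gamma uncertainty to use for the confidence bounds (e.g. 2 or 3)."),
        default=2.0,
        validators=[validate_positive],
    )

Since it mirrors elo_number_of_iterations, exposing it through the existing Run admin should be enough to make it tunable at runtime.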

Will it break anything?

  • People relying on log_gamma_upper_confidence from the API will now get a really different value.
    => We can just notify potential users here for now.
    => Ideally we shouldn't expose log_gamma_lower_confidence or log_gamma_upper_confidence at all, but let people calculate them from the exposed log_gamma and log_gamma_uncertainty (see the sketch after this list).

  • The UI (network page, Elo graph) needs an update if we want it to be coherent with the internal changes (@lightvector#3657 do we?):
    => The Django-generated view uses

    def rating(self):
        return f"{self.elo:.{self.elo_precision}f} ± {2 * self.elo_uncertainty:.{self.elo_precision}f}"

    => The Elo graph uses
    .attr("y1", network => yScale(network["elo"] - 2.0 * network["elostdev"]))
    and hardcodes X * network["elostdev"] in other places, e.g. for the graph axis min and max Elo:
    var minElo = d3.min(filtered.map(network => network["elo"] - Math.min(300, 2.5 * network["elostdev"])));
    var maxElo = d3.max(filtered.map(network => network["elo"] + Math.min(300, 2.5 * network["elostdev"])));

  • Unit tests may break (if they are good, they will).
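
For the first bullet, a sketch of how an API consumer could compute the bounds themselves from the exposed fields. The endpoint URL and response shape are illustrative, not a documented contract:

    import requests

    # Illustrative endpoint; adjust to the real API route and pagination.
    resp = requests.get("https://katagotraining.org/api/networks/")
    for net in resp.json()["results"]:  # assuming DRF-style paginated results
        factor = 2.0  # or whatever the consumer considers appropriate
        lower = net["log_gamma"] - factor * net["log_gamma_uncertainty"]
        upper = net["log_gamma"] + factor * net["log_gamma_uncertainty"]
        print(net.get("name"), lower, upper)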
