Replies: 4 comments
-
The way I'm thinking this would work is as follows. Currently we have a single fitness score and we select for reproduction for the next generation from the top N scoring genomes, and/or proportionally to fitness score, i.e. fitness=10 has twice the probability of fitness=5. So yes you can combine the multiple fitness scores into a combined score, but I agree it's not ideal. Instead, imagine two scores, so we can imagine all of our genomes positioned in 2D where score1 is the x coordinate, score 2 is y the coord. The best possible genome is at the top-right corner, but we also have genomes that score well on one but not the other score, located along the x and y-axes, and we have a bunch of genomes in between. Typically we would probably see genomes distributed in an arc, which you can think of as a 'pareto front', see pic on this wiki page:
Anyways, we can now envisage some different selection schemes based on thinking about this 2D space, e.g. genomes will tend to form clusters, and we want to select from these clusters where they are positioned nearest to the top-right corner, but we also are interested in 'lone' genomes on the pareto front, as these represent novel strategies, i.e. selecting these helps maintain some diversity in the population, whereas selecting only from the dense clusters just draws the entire population to those clusters. So my initial thoughts here are that we would still combine the two scores, e.g. maybe by using euclidean distance from the origin combinedScore = sqrt(scoreA^2 + scoreB^2) But in addition t this we also have the comcept of genome density in the 2D space, so we also want to add some pressure that increases the probability of selecting from low density regions. Then you probably want weighting factors to balance out these two metrics, and the fitness scores themselves need t be carefully chosen, e.g. don;t have one that has range [0.1] and another with range [0, 10^6]. Those are my initial thoughts FWIW. |
Beta Was this translation helpful? Give feedback.
-
Thanks Colin for your feedback! I found out there's a whole book about this subject: So there are multiple known solutions. I will try using your combinedScore = sqrt(scoreA^2 + scoreB^2) and see how that works. Or we could have multiple fitness scores in AuxFitnessInfo, and modify your class GenomeFitnessComparer and take some code from https://github.com/chen0040/cs-moea/blob/2c54298e29000354ef8e41dc190ec37617471183/cs-moea/ComponentModels/CompareUtil.cs to sort the genomes? What do you think? |
Beta Was this translation helpful? Give feedback.
-
Yeh looks like NSGA-II is pretty much what I was describing, although I suspect it's not useful to try and distinguish between different 'fronts' and instead just take a purely probabilistic approach of higher overall score leads to higher selection probability, but where overall score is a combination not only of the multiple objective scores, but also the density of genomes in the 2D (in my example) fitness space. (lower density => high probability of selection, i.e. there are fewer other genomes like this one, so let's increase the chances of selecting it). Yeh you should be able to store multiple fitness scores in AuxFitnessInfo, and then use them in CreateOffSpring() when selecting genomes. This is one area in the refactor where I want to build in a bit more support in the framework for doing this sort of thing. |
Beta Was this translation helpful? Give feedback.
-
Yeah, you're right they call this density a "Crowding Distance". I think this selection mechanism can have a powerful impact on the overall performance. |
Beta Was this translation helpful? Give feedback.
-
Hi Colin,
could there be a multi fitness solution where your algorithm would minimize one variable and maximize the other? For example to maximize profits and minimize standard deviation of a trading system. I know I can lump these two into one fitness function like fitness = VAR1 * (1-VAR2), but this may now work properly. It seems there could be a more optimal way to search the space. For example your the genetic algorithm could focus on minimizing one parameter for a couple of generations and then switch to maximizing the other parameter for a couple of generations and back and forth. There could be more than 2 objectives of course.
Maybe we could re-use some code from https://github.com/chen0040/cs-moea
Could this be done? I'd pay for this add-on.
Beta Was this translation helpful? Give feedback.
All reactions