
Many dangling nodes without a connection to an output are created / left -> network breaks the longer you run it #250

Open
markste-in opened this issue Aug 16, 2022 · 26 comments


@markste-in

Describe the bug
I have been running NEAT for a long time and inspecting the growth of the network from time to time. I started to notice that NEAT creates many "dead ends" / "dangling nodes": nodes that lead nowhere and are not connected to another node or an output. Yesterday I found an example with quite a few of them.

I am unsure whether this happened with older versions too (or to this extent).
Maybe that behavior is desired?
I would have expected that "dangling" nodes are either removed as well when the node connecting them towards the output is removed, or that they are re-connected to the node that comes after the removed one.

There are also connections "going out" of an output that are not recurrent?! I don't think that should happen either.

The longer you run the algorithm, the more "crude" and broken the network gets, up to the point where you have almost no "functional" nodes anymore because they are no longer connected to an output.
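For concreteness, "dangling" can be made precise with a reachability check. This is an illustrative sketch (plain id/tuple representation, not neat-python's internal genome structures): a node is dangling if no directed path from it reaches an output.

```python
from collections import defaultdict

def find_dangling_nodes(nodes, connections, outputs):
    """Return nodes with no directed path to any output node.

    nodes: iterable of node ids; connections: iterable of (src, dst)
    pairs; outputs: iterable of output node ids.
    """
    # Build a reverse adjacency map so we can walk backwards from outputs.
    preds = defaultdict(set)
    for src, dst in connections:
        preds[dst].add(src)
    # Every node reachable *backwards* from an output can influence an output.
    useful, stack = set(outputs), list(outputs)
    while stack:
        node = stack.pop()
        for p in preds[node]:
            if p not in useful:
                useful.add(p)
                stack.append(p)
    return set(nodes) - useful

# Node 4 feeds nothing that reaches output 9, so it is dangling.
conns = [(1, 3), (3, 9), (1, 4)]
print(find_dangling_nodes({1, 3, 4, 9}, conns, {9}))  # -> {4}
```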

To Reproduce
I used:
- gym 0.25.1 with 'Alien-v4'
- neat 0.93
- config see below

OS:

Linux Ubuntu-2204-jammy-amd64-base 5.15.0-46-generic #49-Ubuntu SMP Thu Aug 4 18:03:25 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Expected behavior
No "dangling" nodes and no connections that "leave" an output (except perhaps recurrent ones),
or:
when a node gets removed and would leave a dangling node behind, the dangling node should either be removed too or be connected to the node that comes after the now-removed one.
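The second expectation (reconnect instead of leaving a dangling node) could look roughly like this. A sketch with an illustrative representation, not the library's actual deletion code: when a node is deleted, bridge each of its predecessors to each of its successors.

```python
def delete_node_with_reconnect(node, connections):
    """Remove `node` and bridge its predecessors to its successors,
    so no neighbour is left dangling. `connections` is a set of
    (src, dst) pairs (illustrative representation only)."""
    preds = {s for s, d in connections if d == node}
    succs = {d for s, d in connections if s == node}
    # Drop every connection touching the deleted node...
    remaining = {(s, d) for s, d in connections if node not in (s, d)}
    # ...and patch each predecessor through to each successor
    # (skipping self-loops that bridging would otherwise create).
    remaining |= {(p, q) for p in preds for q in succs if p != q}
    return remaining

# Deleting node 2 from 1 -> 2 -> 3 leaves the bridge 1 -> 3.
print(delete_node_with_reconnect(2, {(1, 2), (2, 3)}))  # -> {(1, 3)}
```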

Screenshots with problematic nodes
[Screenshot 2022-08-16 at 21 20 50]

[Screenshot 2022-08-16 at 21 36 45]

[Screenshot 2022-08-16 at 21 40 31]

Used Config

[NEAT]
pop_size              = 300
fitness_criterion     = max
fitness_threshold     = 1000.0
reset_on_extinction   = 0
no_fitness_termination = True

[DefaultGenome]
num_inputs              = 128
num_hidden              = 0
num_outputs             = 18
initial_connection      = unconnected
feed_forward            = True
compatibility_disjoint_coefficient = 1.0
compatibility_weight_coefficient   = 1.0
conn_add_prob           = 0.35
conn_delete_prob        = 0.25
node_add_prob           = 0.35
node_delete_prob        = 0.25
activation_default      = random
activation_options      = clamped relu sigmoid sin tanh
activation_mutate_rate  = 0.05
aggregation_default     = sum
aggregation_options     = sum min max mean
aggregation_mutate_rate = 0.15
bias_init_type          = gaussian
bias_init_mean          = 0.0
bias_init_stdev         = 1.0
bias_replace_rate       = 0.15
bias_mutate_rate        = 0.8
bias_mutate_power       = 0.4
bias_max_value          = 30.0
bias_min_value          = -30.0
response_init_mean      = 1.0
response_init_stdev     = 0.0
response_replace_rate   = 0.15
response_mutate_rate    = 0.15
response_mutate_power   = 0.15
response_max_value      = 30.0
response_min_value      = -30.0

weight_init_type        = gaussian
weight_max_value        = 30
weight_min_value        = -30
weight_init_mean        = 0.0
weight_init_stdev       = 1.0
weight_mutate_rate      = 0.8
weight_replace_rate     = 0.02
weight_mutate_power     = 0.4
enabled_default         = True
enabled_mutate_rate     = 0.01

single_structural_mutation = false
structural_mutation_surer = default
response_init_type = gaussian
enabled_rate_to_true_add = 0.0
enabled_rate_to_false_add = 0.0

[DefaultSpeciesSet]
compatibility_threshold = 5

[DefaultStagnation]
species_fitness_func = mean
max_stagnation       = 50
species_elitism      = 4

[DefaultReproduction]
elitism            = 2
survival_threshold = 0.2
min_species_size = 50
@jtoleary

jtoleary commented Sep 1, 2022

I have the exact same problem!

@valpaz

valpaz commented Sep 14, 2022

Same problem !

@nexon33

nexon33 commented Sep 15, 2022

This seems like a feature. I didn't write this library, but I read about the mechanisms used for evolution: sometimes neurons get isolated or replaced at random. This includes disabling a connection that was there before, which could also get reconnected at random.

@markste-in
Author

This seems like a feature. I didn't write this library, but I read about the mechanisms used for evolution: sometimes neurons get isolated or replaced at random. This includes disabling a connection that was there before, which could also get reconnected at random.

I tried the following in the config to deactivate the disabling of connections:
enabled_mutate_rate = 0.0

I still get a lot of dangling nodes and a mostly "broken" net.

@markste-in
Author

markste-in commented Sep 27, 2022

I created a fork with a changed implementation. It removes all dangling nodes and connections at the end of every run.

If people wanna try it out:
https://github.com/markste-in/neat-python/tree/remove_dangling_nodes
I am eager for any feedback. So far I have had good experiences building big functional networks.

[Screenshot 2022-09-25 at 09 57 23]
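For readers who don't want to dig through the fork, the core idea can be sketched like this (assumed names and representation, not the fork's actual code): repeatedly drop hidden nodes with no outgoing connections, plus the connections feeding them, until nothing dangles.

```python
def prune_dangling(hidden, connections):
    """Iteratively remove hidden nodes with no outgoing connections.
    hidden: set of hidden-node ids; connections: set of (src, dst)
    pairs. Sketch of the idea only, not the fork's real code."""
    hidden, connections = set(hidden), set(connections)
    while True:
        dead = {n for n in hidden
                if not any(s == n for s, _ in connections)}
        if not dead:
            return hidden, connections
        hidden -= dead
        # Also drop connections feeding the removed nodes; this may
        # strand further nodes, hence the loop.
        connections = {(s, d) for s, d in connections if d not in dead}

# Node 5 dangles; removing it strands node 4, removed on the next pass.
print(prune_dangling({4, 5}, {(1, 9), (1, 4), (4, 5)}))
# -> (set(), {(1, 9)})
```

Note that a recurrent cycle of dangling nodes would survive this out-degree test; handling that case needs a backward reachability pass from the outputs instead.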

@ntraft

ntraft commented Nov 19, 2022

This seems like a feature. I didn't write this library, but I read about the mechanisms used for evolution: sometimes neurons get isolated or replaced at random. This includes disabling a connection that was there before, which could also get reconnected at random.

I tried the following in the config to deactivate the disabling of connections: enabled_mutate_rate = 0.0

I still get a lot of dangling nodes and a mostly "broken" net.

Even without "disabled" (but existing) edges, you could still get lots of dangling nodes due to deleting edges, right? And you probably wouldn't want to disable the deleting of edges entirely, so that brings us back to where we started.

I don't know whether other NEAT implementations prevent this. It seems a bit ambiguous as to whether this is desirable or not; maybe those nodes could be rewired to be useful in a future generation. On the other hand, you're right that it adds more and more bloat as the algorithm progresses! Even if we're able to prune these away when instantiating the network, we are still wasting lots of time mutating them when they'll have no impact on the final output.

Btw, have you tried DefaultGenome.get_pruned_copy()? Doesn't that eliminate the dangling output nodes? (Though it doesn't remove the dangling input nodes, since they actually are used.)

@Finebouche

Hey, wanted to react to that. I don't think dangling nodes will add more bloat as the algorithm progresses, because:

  • they are not used to compute the outputs of the network
  • they will probably be removed because of stagnation, as any change in those parts of the network has zero chance of improving it.

@Finebouche

Ok, I changed my mind. The problem with dangling nodes is that genome modifications can still land only on those dangling nodes, and the algorithm can get stuck because most changes to the genome happen in those unconnected parts.

@ntraft

ntraft commented May 24, 2024

I've recently realized that this has long been a topic of debate in evolution more broadly: is "neutrality" beneficial or harmful? Is it an important feature of evolvability? (I.e., modifications to the genome which are neutral—they have no effect on selection.) Arguably, it may be useful to have these in the background if they could become useful later. A background store of genetic diversity. But obviously they can also be harmful because they provide no gradient toward improvement.

@Finebouche

From my experience, anyway, it doesn't seem really useful. I always end up with so many disconnected parts. It seems really hard for the network and weight evolution to make sense of those big chunks of nodes (or parts of them) when they happen to get connected.

@markste-in
Author

Yeah, exactly. In most of the cases that I tested, the algorithm got stuck because most of its "brain" was "dead" (dangling nodes without any impact). Further evolution just made it worse, and it never recovered. I thought about changing my implementation from removing dangling nodes directly to removing them after a few generations, to simulate a degeneration of dead cells.

@markste-in
Author

I forked the project, and in the branch "remove_dangling_nodes" I now have the option
trim_dangling_after_n_generations_wo_improvment in the config file.

If you set it to anything greater than 0, it will trim the networks of a species after it has made no improvement for that many generations.
If you set it to 0, it will always trim; if you set it to anything negative, no trimming will be done.

In the code I added a trim function that is called whenever a species has not improved for n generations.

I modified the openai-lander example so people can see how to use it (see its config).

You can find the forked repo here: https://github.com/markste-in/neat-python/tree/remove_dangling_nodes
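The trimming trigger itself is simple to sketch (hypothetical names; the fork's real logic lives in its species handling):

```python
class TrimScheduler:
    """Decide when a species' genomes should be trimmed of dangling
    nodes: after `patience` generations without fitness improvement.
    patience > 0: trim on stagnation; == 0: always trim; < 0: never.
    Sketch with hypothetical names, mirroring the fork's config flag."""

    def __init__(self, patience):
        self.patience = patience
        self.best = float("-inf")
        self.stale = 0

    def should_trim(self, fitness):
        if self.patience < 0:
            return False
        if self.patience == 0:
            return True
        if fitness > self.best:
            self.best, self.stale = fitness, 0
        else:
            self.stale += 1
        return self.stale >= self.patience

# Two flat generations trigger a trim; an improvement resets the counter.
sched = TrimScheduler(patience=2)
print([sched.should_trim(f) for f in [1.0, 1.0, 1.0, 2.0]])
# -> [False, False, True, False]
```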

@markste-in
Author

I also found out that the outputs of a network are sometimes mistakenly used as inputs. I fixed that in my branch too.
[issue screenshot]

@Finebouche

Ah alright, I fixed that in this pull request as well: #282

@markste-in
Author

Ah nice, but it looks like you remove potential dangling nodes directly. I am curious whether it might be helpful to keep them for a while,
e.g. leave them for 30 generations and then start to trim them before they hit the stagnation limit at 40.

Another question: were you able to solve the Lunar Lander with it? I solved it a few years ago, but I can't get it solved anymore with the current code base.

@Finebouche

Finebouche commented May 29, 2024

Hi, actually I don't remove the dangling nodes from the genome, only from the feed-forward network used to do inference. This way dangling nodes can still evolve, but they don't hurt inference time (since they are useless for inference).

And yes, I was able to solve the Lunar Lander with it 👍
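If I read the library right, this matches what neat-python's feed-forward builder already does: required_for_output in neat/graphs.py walks backwards from the outputs, and the network is built only from those nodes, while the genome keeps everything. A simplified, self-contained sketch of that idea:

```python
def required_for_output(inputs, outputs, connections):
    """Nodes needed to compute the outputs: walk backwards from each
    output over the connections. Simplified sketch of the helper of
    the same name in neat-python's neat/graphs.py."""
    required = set(outputs)
    frontier = set(outputs)
    while frontier:
        # All sources feeding the current frontier, minus already-known
        # nodes; plain inputs are used but not counted as "required".
        layer = {s for s, d in connections if d in frontier} - required
        frontier = layer - set(inputs)
        required |= frontier
    return required

# Node 4 is skipped at inference time but stays in the genome.
conns = [(-1, 3), (3, 9), (-1, 4)]
print(required_for_output([-1], [9], conns) == {3, 9})  # -> True
```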

@markste-in
Author

Could you share your config? I think I am off somewhere and I'm trying to figure out where.

@Finebouche

Hi,
Sure thing! I made a nice little interface between Gym and neat-python so that you can use any gymnasium environment easily. Check out this repo: https://github.com/Finebouche/neat-gymnasium
It works with my current branch of neat-python.

@markste-in
Author

Thanks for sharing! I tried to use your config on the current repository, but it never solves the LunarLander.
I started to troubleshoot this repo a bit more, since the version in your repo works (I wanted to understand what goes wrong).

It turns out that the fitness function is totally broken. It calculates some log-loss and never converges (not sure what the initial intention of the author was, since it is not documented).
I reverse-engineered the code and implemented a "pure" fitness-oriented approach. Now I am able to solve the environment too.

I will fix the demo and propose a PR so people have a working demo when they find this repo.
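For reference, a "pure" fitness-oriented approach usually just means scoring a genome by the reward the environment already hands back, e.g. the mean episode return. A minimal sketch with a stubbed rollout (run_episode is a hypothetical helper; the real version would step a gymnasium env with the network):

```python
def evaluate_genome(net, run_episode, episodes=3):
    """Fitness = mean total reward over a few episodes.
    `run_episode(net)` is a stand-in for stepping a gymnasium env
    with the network and returning the summed reward (hypothetical
    helper, not part of neat-python)."""
    returns = [run_episode(net) for _ in range(episodes)]
    return sum(returns) / len(returns)

# With a stubbed rollout, the fitness is just the mean reward.
fake_rewards = iter([100.0, 220.0, 250.0])
fitness = evaluate_genome(None, lambda net: next(fake_rewards))
print(fitness)  # -> 190.0
```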

@Finebouche

Finebouche commented Jun 4, 2024

Ah yes the lunar example is totally broken.

@allinduetime

Working on messing with this stuff today. Is mark's commit still the best one to handle all the issues, or?

@markste-in
Author

markste-in commented Jun 5, 2024

The main repo here is still broken

but

There are two working solutions rn:

A working fork of the current repository:
https://github.com/markste-in/neat-python/tree/remove_dangling_nodes

  • This one fixes all the mentioned issues and has a parameter (trim_dangling_after_n_generations_wo_improvment) to configure when to trim dangling nodes.
  • The LunarLander example works, plots the "best brains" every 10 generations, and validates the best genome on 20 x 3 runs. It will run until the average of those 20 x 3 runs is better than 200. This can take a while, since it takes a lot longer to get a stable and robust model.

Then there is a good example from @Finebouche
https://github.com/Finebouche/neat-gymnasium
It needs the neat package installed and has a working LunarLander example.

  • This example will stop whenever the fitness threshold is met. Since there is no further validation, it appears to solve the LunarLander quicker, because you can hit the fitness threshold early if you are lucky and get 3 decent runs.
  • It is not plotting "the brains" every n-steps
  • ... but it has some good further examples for other gym environments

You should check out both.
Maybe we get the best of both merged into here soon.

@allinduetime

allinduetime commented Jun 5, 2024 via email

@allinduetime

…man I am not responding from email again without yeeting the rest

@markste-in
Author

#282 and https://github.com/markste-in/neat-python/tree/remove_dangling_nodes are both fixing the issues mentioned in #255

@allinduetime

Alright, thank you, I’m gonna go figure out how to install from GitHub now lol
