Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

explicitly define a network topology #199

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

nicolasochem
Copy link
Contributor

@nicolasochem nicolasochem commented May 13, 2021

Add an ability to explicitly link nodes. links value is a list of 2-item lists.

This will allow testing of large topologies where we define the graph fully and prevent arbitrary peering. This way, we can enforce number of hops between nodes and more efficiently detect regressions in block propagation times.

Configuration example:

nodes:
  tezos-baking-node:
    instances:
    - bake_using_accounts:
      - tezos-baking-node-0
      config:
        shell:
          history_mode: archive
        p2p:
          private-mode: true
      is_bootstrap_node: true
    - bake_using_accounts:
      - tezos-baking-node-1
      config:
        shell:
          history_mode: archive
      is_bootstrap_node: true
    runs:
    - baker
    - endorser
    storage_size: 15Gi
  tezos-node:
    instances:
    - config:
        shell:
          history_mode: rolling
      is_bootstrap_node: false
    - config:
        shell:
          history_mode: rolling
      is_bootstrap_node: false
    storage_size: 15Gi
links:
- [ tezos-baking-node-0.tezos-baking-node, tezos-node-0.tezos-node ]
- [ tezos-baking-node-0.tezos-baking-node, tezos-node-1.tezos-node ]
- [ tezos-baking-node-1.tezos-baking-node, tezos-node-0.tezos-node ]
- [ tezos-baking-node-1.tezos-baking-node, tezos-node-1.tezos-node ]
- [ tezos-baking-node-0.tezos-baking-node, tezos-baking-node-1.tezos-baking-node ]

When private_mode is set to "true", none of the bootstrap peers from the usual sources is configured in the node. Instead, only the links listed explicitly are configured.

Previously I found that private-private node connections did not work. But today, I found that it was actually working in my topology, but not reliably. We may want to configure trusted relationship between nodes to make sure connections get established reliably. After talking to devs, the trust is implicit when putting a peer in p2p.bootstrap-peers so it may be a bug. With the example topology above, I expect every node to have 3 peers, but in practice, they can have anything between 0 and 3.

* allow arbitrary config from values.yaml to be set in the node. This
  allow for setting private mode from values.yaml and this setting is
  transparently passed to the node config

* do not fail silently when generate-config.py crashes

* scour both regular and baking nodes for bootstrap nodes in
  wait-for-bootstrap. If you have only non-baking bootstrap nodes, the
  init script would otherwise fail.
@nicolasochem nicolasochem requested review from elric1 and harryttd May 13, 2021 20:15
@nicolasochem nicolasochem changed the title preliminary work for private mode support private mode support May 14, 2021
@nicolasochem nicolasochem marked this pull request as draft May 14, 2021 21:03
@harryttd
Copy link
Contributor

Whatever the structure or concept links will turn out to be, can we please give it a more obvious name? Something like trusted_peers? This is how the docs seem to refer to private node connections. https://tezos.gitlab.io/user/node-configuration.html?highlight=private%20mode

@nicolasochem
Copy link
Contributor Author

nicolasochem commented May 19, 2021

These are not trusted peers, these are links between 2 nodes. We already have nodes, what's wrong with links? What's not obvious about it?

@nicolasochem nicolasochem marked this pull request as ready for review August 11, 2021 06:26
@harryttd
Copy link
Contributor

harryttd commented Aug 11, 2021

Why need a standalone list of node links? With Roland's restructuring of the nodes definition, why not have a list of node links per node instance, per node class?

nodes:
  tezos-baking-node:
    instances:
    - bake_using_accounts:
      - tezos-baking-node-0
      config:
        shell:
          history_mode: archive
        p2p:
          private-mode: true
      is_bootstrap_node: true
      links:
         - tezos-node-0
         - tezos-node-1
         - tezos-baking-node-1
    - bake_using_accounts:
      - tezos-baking-node-1
      config:
        shell:
          history_mode: archive
      is_bootstrap_node: true
      links:
         - tezos-node-0
         - tezos-node-1
    runs:
    - baker
    - endorser
    storage_size: 15Gi
......

This centralizes the node's links inside the node itself. You don't need an explicit 2 way mapping from the instanceA to instanceB as the links are being defined in instanceA.

Also, I don't think one should need to specify the FQDN of each node. Why can't the code handle that for the user? All you need is the name of the node itself.

Personally I still think we need a better name than links. Why not something like link_to_nodes, link_with_nodes, peer_with_nodes? Or links_to_nodes/links_with_nodes/peers_with_nodes? These names are much more explicit as to what this field is for. links is a list of nodes. But linking for what purpose? peers is the description of what such linking is meant for.

(The former names suggestion is in imperative/verb form. The latter names suggestion is in active/descriptive form. The reason I am making this grammatical distinction is because I just realized that we haven't been consistent with these forms in our structures. Examples being runs as the adjective, bake_using_accounts as the imperative. I think if something is a function you would use the imperative/verb form. If it is a property you would use the active/adjective/descriptive. node.runs.baker = true is describing a property of the instance. It isn't a function telling the node to run baker. The function in code would be shouldRunBaker and that would look up the prop. Sorry for this whole long craziness. Just want to express clearly what I'm thinking. So to sum this all up, I think we should use peers_with_nodes.)

@harryttd
Copy link
Contributor

Can we update the name of this PR? Private mode is already supported as you can pass whatever config you want to a node. The PR's main feature is being able to explicitly define a network topology.

@nicolasochem nicolasochem changed the title private mode support explicitly define a network topology Aug 11, 2021
@nicolasochem
Copy link
Contributor Author

I agree with replacing the fqdns with just the hostnames. Also replacing links with peering_links is fine.

I think defining links separately from nodes is cleaner. I briefly searched for reasons why and came across this:

https://stackoverflow.com/questions/43052290/representing-a-graph-in-json

@itkach
Copy link

itkach commented Aug 11, 2021

I think it would be really cool to define topology in dot format (https://graphviz.org/) instead of a custom syntax - it could then be immediately visualized and compared with actual topology reported by https://github.com/oxheadalpha/tezos-net-viz

@harryttd
Copy link
Contributor

@nicolasochem

Does the link array structure imply that it is non-directed and therefore A -> B and and also B -> A ? Or is the graph directed and therefore just A -> B?

If the former then I understand why a separate structure makes sense, as you wouldn't want to define the edge in each node instance. You only want to define the edge once. If we really do require a separate structure and we are talking about DAGs, why not have a map of all the nodes, where each node has a list of every other node they point to? Why should each element in the list be a pair instead of a list of nodes?

The SO question you referenced is talking about a DAG. However, the DAG the user is talking about is more complicated than our use case here. His use case requires that the nodes and edges each have associated data. So the DAG representation is more complicated. We don't need that. Just a string being the hostname of a node pointing to another hostname string.

Here is a SO question about DAGs in json that is just like our use case: https://stackoverflow.com/a/9899057/6379084
The answer is like I'm suggesting here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants