docs: Proposal for FederatedCatalog Distribution and TargetNodeDirectory #1718

Closed
@@ -0,0 +1,89 @@
# Proposal for FederatedCatalog with Tractus-X distribution and its TargetNodeDirectory

## Decision

The Federated Catalog Cache will be deployed as a standalone component. The Tractus-X EDC Connector Helm charts will be updated to include a new Federated Catalog deployment template.
For the TargetNodeDirectory, a new extension in the FederatedCatalog will expose an API for adding participants' identifiers, which will then be used to obtain the respective data from the Discovery Service.

There are multiple discovery services, but we should be a bit more precise here. In addition, it should be mentioned that this is about a core service, not a local service.

Suggested change
Regarding the TargetNodeDirectory, a new extension in the FederatedCatalog will expose an API to allow adding participant's identifiers which will be used to obtain the respective data from the Discovery Service.
Regarding the TargetNodeDirectory, a new extension in the FederatedCatalog will expose an API to allow adding participant's identifiers which will be used to obtain the respective data from the Connector Discovery Core Service.


## Rationale

While a standalone component (a separate Kubernetes deployment) slightly increases configuration complexity, the ability to manage and scale it independently outweighs that cost.

## Approach

The TargetNodeDirectory will be populated by a new extension that exposes an API where a member can submit the DIDs of the participants whose catalogs are wanted; the extension then retrieves and stores the respective Connector URLs. The extension obtains the data from the Discovery Service and will be named `DiscoveryServiceRetrieverExtension`. This solution lets the member choose precisely the Target Catalog Nodes that interest them, resulting in fewer network calls and lower latency.

With a provided DID, how is the member able to choose precisely the catalog nodes he is interested in? He will receive all connectors registered from the discovery service, right?

Additionally, if a Connector URL is registered (or unregistered) in the Discovery Service, the retriever will reflect the change, since it queries by BPN and the currently registered URLs are returned.

  1. Here you talk about BPN although previously you talked about DIDs.
  2. Why will changes in the discovery service be reflected in the federated catalog? Above, it is described, that the federated catalog service will retrieve and store the connector url's, changes in the discovery will only be recognized, if the same BPN/DID will be retrieved again. It is not mentioned that the local storage is a caching mechanism.


This solution improves on the default of keeping the data in a static file, since a dynamic approach avoids downtime when a change is required.

Another solution for the TargetNodeDirectory was also considered:
- File in an S3 bucket (or a different cloud provider's equivalent)
- This solution was discarded due to one file for all instead of each partner having the data that respectively needs does not match the requirement and this solution would lock the usage of a proprietary tool (cloud provider) being harder to sustain in the long run.

I do not understand the part of the sentence "the data that respectively needs does not match the requirement". What does this mean?


Since the Federated Catalog Cache will be a standalone runtime, the Tractus-X EDC Connector Helm charts will be updated to include the Federated Catalog Cache as a separate deployment. The update will include the creation of a dedicated `deployment-federatedcatalog.yaml`, similar [to this one](https://github.com/eclipse-tractusx/tractusx-edc/blob/a263bf71a110245657131509d4b37d058a1d220d/charts/tractusx-connector-azure-vault/templates/deployment-dataplane.yaml#L47) (and likewise for `ingress` and `hpa`), for the different scenarios (InMemory, PostgreSQL, etc.). This results in added configuration complexity.

For its TargetNodeDirectory, the extension obtains the Connectors' URLs through the Discovery Service and stores them. The new extension will provide two APIs, at least during the alpha stage: one that lets the user submit a list of DIDs and another for BPNs. The `DiscoveryServiceRetrieverExtension` is responsible for retrieving the data and storing it (in memory or in a database). The URLs can later be retrieved and crawled by the Federated Catalog Cache.

Suggested change
For its TargetNodeDirectory, the extension is able to obtain the Connectors' URL's through the Discovery Service and store them. Two API's will be provided in this new extension, at least during alpha stage, one to allow the user to input a list of DID's and other for BPN's. The `DiscoveryServiceRetrieverExtension` is responsible to retrieve the data and store it (in memory or in a database). The URL's can later be retrieved and crawled by the Federated Catalog Cache.
For its TargetNodeDirectory, the extension is able to obtain the Connectors' URL's through the Discovery Service and store them. Two API's will be provided in this new extension, at least during alpha stage, one to allow the user to input a list of DID's and other for BPN's. The `DiscoveryServiceRetrieverExtension` is responsible to retrieve the data and store it (in memory or in a database).


Is it necessary to provide two api families, one for bpn and one for did, as it can be easily decided internally, what has been provided. So one api family that allows to CRUD an identifier could be enough.

By default no TargetNodes are stored, so the extension will not request data from the Discovery Service.
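
A minimal sketch of this default behaviour, under stated assumptions: `InMemoryTargetNodeStore` and the `discover` callable are hypothetical names that stand in for the extension's storage and its Discovery Service client, and neither is part of the proposal. The store skips the discovery call while it is empty, and a refresh replaces the stored URLs, which is one possible way for registrations and unregistrations to be picked up.

```python
from typing import Callable, Dict, List


class InMemoryTargetNodeStore:
    """Hypothetical in-memory store mapping a BPN to its connector URLs."""

    def __init__(self) -> None:
        self._nodes: Dict[str, List[str]] = {}

    def add(self, bpn: str) -> None:
        # Register the BPN without URLs; they are filled in on refresh.
        self._nodes.setdefault(bpn, [])

    def remove(self, bpn: str) -> None:
        # Removing an unknown BPN is a no-op.
        self._nodes.pop(bpn, None)

    def refresh(self, discover: Callable[[List[str]], Dict[str, List[str]]]) -> bool:
        """Re-query discovery for all stored BPNs; return False if skipped."""
        if not self._nodes:
            # Nothing stored: do not call the Discovery Service at all.
            return False
        found = discover(sorted(self._nodes))
        for bpn in self._nodes:
            # Replace the stored URLs so unregistered connectors disappear.
            self._nodes[bpn] = list(found.get(bpn, []))
        return True

    def urls(self) -> List[str]:
        # Flat list of every stored connector URL, ready to be crawled.
        return [url for urls in self._nodes.values() for url in urls]
```

Passing the discovery client in as a callable keeps the sketch independent of any concrete HTTP client or authentication setup.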

A DID added through the `DiscoveryServiceRetrieverExtension` API will be resolved with the BDRS client to obtain the BPN, which will be used to query the Discovery Service. The BDRS client must be updated, [since it currently only allows resolving a BPN to a DID and not the other way around](https://github.com/eclipse-tractusx/tractusx-edc/blob/8e1a3202be77d6374731dee5aaf6847feec8963a/spi/bdrs-client-spi/src/main/java/org/eclipse/tractusx/edc/spi/identity/mapper/BdrsClient.java). The change to resolve a BPN from its DID has to be made before the new extension is implemented.

Suggested change
A DID added through the `DiscoveryServiceRetrieverExtension` API will be resolved with the BDRS client to obtain the BPN which will be used to query the Discovery Service. the BDRS client must be updated [since only allows to resolve a BPN to a DID and not the other way around](https://github.com/eclipse-tractusx/tractusx-edc/blob/8e1a3202be77d6374731dee5aaf6847feec8963a/spi/bdrs-client-spi/src/main/java/org/eclipse/tractusx/edc/spi/identity/mapper/BdrsClient.java). A change to resolve a BPN given the respective DID has to be done prior to the new extension.
A DID added through the `DiscoveryServiceRetrieverExtension` API will be resolved with the BDRS client to obtain the BPN which will be used to query the Discovery Service. The BDRS client must be enhanced to support a reversed lookup from a given DID to the matching BPN. This change affects the file [BdrsClient.java](https://github.com/eclipse-tractusx/tractusx-edc/blob/8e1a3202be77d6374731dee5aaf6847feec8963a/spi/bdrs-client-spi/src/main/java/org/eclipse/tractusx/edc/spi/identity/mapper/BdrsClient.java).
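
The reverse lookup discussed here can be illustrated with a small sketch. It assumes the BDRS directory is available locally as a BPN-to-DID mapping with exactly one DID per BPN; `BidirectionalBdrsCache` and its method names are hypothetical and do not reflect the actual `BdrsClient` interface.

```python
from typing import Dict, Optional


class BidirectionalBdrsCache:
    """Hypothetical cache; not the actual BdrsClient interface."""

    def __init__(self, bpn_to_did: Dict[str, str]) -> None:
        self._bpn_to_did = dict(bpn_to_did)
        # Build the reverse index once so DID lookups do not scan the map.
        self._did_to_bpn = {did: bpn for bpn, did in self._bpn_to_did.items()}

    def resolve_did(self, bpn: str) -> Optional[str]:
        # Existing direction: BPN to DID.
        return self._bpn_to_did.get(bpn)

    def resolve_bpn(self, did: str) -> Optional[str]:
        # New direction needed by the extension: DID to BPN.
        return self._did_to_bpn.get(did)
```

Both lookups return `None` for unknown identifiers, so the caller can decide whether to reject the request or skip the entry.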


Connector URLs are retrieved from the Discovery Service through the endpoint:

I do not think, that this section is necessary. It is obvious from what you described before.

```
POST: /api/administration/connectors/discovery
```
The request body contains the BPNs of the participants whose catalogs are to be obtained. Although the Discovery Service accepts a request without BPNs (an empty list), the extension will not issue such a request.
Information regarding the related API can be found [here](https://catenax-ev.github.io/docs/standards/CX-0001-EDCDiscoveryAPI#22-api-specification).
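
As a rough illustration of the rule above, the following sketch builds the discovery request without sending it. It assumes the body is a JSON array of BPN strings and that the technical user authenticates with a bearer token; the base URL, the token handling, and the function name `build_discovery_request` are illustrative assumptions, so the linked specification remains the authority on the actual contract.

```python
import json
from typing import List, Optional
from urllib.request import Request

DISCOVERY_PATH = "/api/administration/connectors/discovery"


def build_discovery_request(base_url: str, bpns: List[str], token: str) -> Optional[Request]:
    """Return a prepared POST request, or None when there is nothing to ask for."""
    if not bpns:
        # The extension never sends an empty list, even though the service accepts it.
        return None
    body = json.dumps(bpns).encode("utf-8")
    return Request(
        url=base_url.rstrip("/") + DISCOVERY_PATH,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer token obtained for the technical user.
            "Authorization": f"Bearer {token}",
        },
    )
```

The prepared request could then be passed to `urllib.request.urlopen`, which is left out here so the sketch performs no network call.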

Some limitations of this TargetNodeDirectory solution are:
- Each partner must have the DIDs beforehand. If a new partner is registered and an existing partner wants their catalog, the DID (or BPN) of the new partner must be obtained first and added through the new extension API;

That is not a limitation as each data space participant needs both a DID and a BPN. You cannot communicate to anyone who does not have this.

- Using the Discovery Service requires a technical user account, which must be requested. Once obtained, the credentials can be stored in the vault;

Again, this should come with the membership in Catena-X, so I do not see this as a limitation, could be mentioned as implementation requirement

- The BDRS client must be changed to allow resolving a BPN from a given DID.

That is also not a limitation, but a fact you already described above.



As indicated, the new extension would have its own API, capable of:

#### Save DIDs
Through this API, a member can add the DIDs (or BPNs, while two APIs are maintained) for which the Connector URLs are needed. The extension will iterate over the listed DIDs, resolve them, and query the Discovery Service.
The request body contains a list of identifiers (DIDs in the example below), allowing them to be stored in bulk.
```
[POST] /api/target-nodes
```
Request Body Example
```json
[ "did:web:info:api:administration:staticdata:did:BPNL000000000001","did:web:info:api:administration:staticdata:did:BPNL000000000002" ]
```
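
One review comment asks whether a single API could accept both kinds of identifier and decide internally which one it received. A minimal sketch of such a check follows, assuming that a DID always starts with `did:` and that a BPN has the form `BPNL` followed by 12 uppercase alphanumeric characters; the pattern and the name `classify_identifiers` are assumptions for illustration, not part of the proposal.

```python
import re
from typing import Dict, Iterable, List

# Assumed BPN shape: "BPNL" plus 12 uppercase alphanumeric characters.
BPN_PATTERN = re.compile(r"^BPNL[0-9A-Z]{12}$")


def classify_identifiers(identifiers: Iterable[str]) -> Dict[str, List[str]]:
    """Split a bulk request body into DIDs, BPNs, and unrecognised values."""
    result: Dict[str, List[str]] = {"did": [], "bpn": [], "invalid": []}
    for raw in identifiers:
        value = raw.strip()
        if value.startswith("did:"):
            # DIDs still need to be resolved to a BPN before discovery.
            result["did"].append(value)
        elif BPN_PATTERN.match(value):
            # BPNs can be used to query the Discovery Service directly.
            result["bpn"].append(value)
        else:
            result["invalid"].append(value)
    return result
```

Collecting unrecognised values separately lets the API report them back to the caller instead of silently dropping them.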

#### Remove a stored DID
Once a member decides that they no longer need the catalogs from a certain DID, it can be removed.
The DID to be removed is sent as a path parameter.
```
[DELETE] /api/target-nodes/{did}

I am not sure, whether a DID is a suitable part in an url, must at least be encoded I suppose.

```
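
Regarding the encoding concern raised in the review comment, the following sketch shows how a client might percent-encode a DID before placing it in the path. The helper name `target_node_path` is hypothetical, and the server is assumed to decode the path segment before looking up the DID.

```python
from urllib.parse import quote, unquote


def target_node_path(did: str) -> str:
    """Build the DELETE path with the DID encoded as a single path segment."""
    # safe="" also encodes "/" and ":", so the DID cannot split the path.
    return "/api/target-nodes/" + quote(did, safe="")


def did_from_segment(segment: str) -> str:
    """Reverse the encoding on the receiving side."""
    return unquote(segment)
```

Because `did:web` identifiers may already contain percent-encoded characters (for example an encoded port), encoding the whole DID once and decoding it once keeps the round trip lossless.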
#### Retrieve DIDs
Get the stored DIDs (each value and the connectors associated with it).
```
[POST] /api/target-nodes/request

Suggested change
[POST] /api/target-nodes/request
[GET] /api/target-nodes?filter_did={did1}&filter_did={did2}

There is no need to make use of POST for such an easy get endpoint

```
Request Body Example
```json
[ "did:web:info:api:administration:staticdata:did:BPNL000000000001","did:web:info:api:administration:staticdata:did:BPNL000000000002" ]
```
Response Example
```json
[
{
"did": "did:web:info:api:administration:staticdata:did:BPNL000000000001",
"connectorEndpoint": [
"https://connector1/api/v1/dsp"
]
},
{
"bpn": "did:web:info:api:administration:staticdata:did:BPNL000000000002",

This is wrong, this is a DID, not a BPN. The BPN would be BPNL000000000002. Actually, I would propose to return both, the DID and the BPN. From this example, it is unclear to me, when a did and when a bpn is returned.

"connectorEndpoint": [
"https://connector2/api/v1/dsp",
"https://connector3/api/v1/dsp",
"https://connector4/api/v1/dsp"
]
}
]
```
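
To show how a consumer such as the Federated Catalog Cache might turn a response of this shape into a list of crawl targets, here is a small sketch. It relies only on the `connectorEndpoint` field from the example above; the function name `crawl_targets` is hypothetical, and the identifier key is ignored on purpose because the example uses both `did` and `bpn`.

```python
import json
from typing import List


def crawl_targets(response_body: str) -> List[str]:
    """Flatten the response into a de-duplicated, order-preserving URL list."""
    entries = json.loads(response_body)
    seen = set()
    targets: List[str] = []
    for entry in entries:
        # Entries without connectors simply contribute nothing.
        for url in entry.get("connectorEndpoint", []):
            if url not in seen:
                seen.add(url)
                targets.append(url)
    return targets
```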