Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: ADR-204 new comms architecture #272

Draft
wants to merge 25 commits into
base: main
Choose a base branch
from
Draft

docs: ADR-204 new comms architecture #272

wants to merge 25 commits into from

Conversation

pentreathm
Copy link
Member

No description provided.

@pentreathm pentreathm requested a review from a team as a code owner October 21, 2024 19:19
@pentreathm pentreathm marked this pull request as draft October 21, 2024 19:19
Copy link

cloudflare-workers-and-pages bot commented Oct 21, 2024

Deploying adr with  Cloudflare Pages  Cloudflare Pages

Latest commit: 888d9d9
Status: ✅  Deploy successful!
Preview URL: https://d4f2014e.adr-cvq.pages.dev
Branch Preview URL: https://adr-comms-ea.adr-cvq.pages.dev

View logs

@pentreathm pentreathm changed the title Adr comms ea docs: ADR-204 new comms architecture Oct 21, 2024
Copy link
Contributor

@nachomazzara nachomazzara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @pentreathm

content/ADR-204-comms-architecture.md Outdated Show resolved Hide resolved
content/ADR-204-comms-architecture.md Outdated Show resolved Hide resolved

#### Design Considerations

The establishment of the connection with the island remains unchanged, as that portion of the protocol is the same. The key difference lies in the types of messages sent through the channel. To establish a connection with a scene-room, a new service called [Gatekeeper](https://github.com/decentraland/comms-gatekeeper/) will be introduced to manage token permissions for the transport layer ([LiveKit](https://livekit.io/)). Only one scene room can be active at a time, as a user can only be present in one scene at any given moment. The scene room operates similarly to the communication service implementation used in worlds, where each scene has its own dedicated room. The Gatekeeper service will also allow scene owners to create authorizations for users, which will be reflected in the LiveKit token. This capability will enable them to moderate voice interactions and data streams within the scene effectively.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one scene room can be active at a time,

What is the limit of connected peers and what happpens if that limit is achieved?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LiveKit does not have a hard limit on the number of participants in a room, allowing us to define our own limits as needed based on client needs.

The client receives information about users through the island and scene rooms. It must determine how many avatars it can render efficiently, especially in scenarios with a high number of participants. An approach could be prioritizing the processes of messages from players based on their proximity, rendering only the closest 200 avatars and ignoring the rest. This selective rendering can reduce processing load by discarding non-essential messages. This requires fine-tuning but these behaviors were deferred to a later phase as it is not a problem in the current state of the platform and the door is opened to different possible solutions. The key point is that based on LiveKit, there could be no limit on the amount of players connected to the room and we would probably find some bottlenecks in the bandwidth and the processing of the messages

I will add a comment in the ADR.

@kuruk-mm
Copy link
Member

Hi, how are you? Great work @pentreathm! I have just a few questions:

  1. Is it possible to connect to multiple scene rooms at the same time, or is only one scene room connection allowed at once?
  2. With this design, are we going to lose the ability to walk around with a friend, explore Decentraland, and voice chat together? It seems like you'll be reconnecting with each scene change, and I think that's one of the best features of Islands. Has this been considered in the design? Are there any future plans for private rooms where we can stay with our friends to mitigate this issue?
  3. I read in the "Consequences" section that you're not going to hear someone who is near you but in a different scene. If the reason for this is to give the scene owner control over the voice chat content, shouldn't the nearby chat be treated the same way?

content/ADR-204-comms-architecture.md Show resolved Hide resolved
content/ADR-204-comms-architecture.md Show resolved Hide resolved
content/ADR-204-comms-architecture.md Show resolved Hide resolved

![comms](/resources/ADR-204/comms.png)

In the illustrated scenario, Peers 1, 2, and 3 are connected to the scene room `E1`. Additionally, Peer 1 establishes a connection to the island `I1`, while Peers 2 and 3 are linked to the island `I2`. This example serves as a simplified illustration. Users may be far apart within the same scene, but this aims to demonstrate that if a peer is at the edge of a scene, they will be able to see other players who are not within the scene but are within viewing range. On the other hand, the scene channel will be there to share consistent data among the visitors, like a stream of video on a texture or the state of an object like a door that can be opened or closed besides also sharing information about the users visiting the scene.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious about this scenario:

  • User A is on scene 101
  • User B is on scene 102 which borders on scene 101
  • Both users shares a single channel because of proximity but also are connected to different channels because of the scene distinct (one channel for scene 101 and another one for scene 102)
  • If User A opens a door in scene 101
  • And User B after the door was opened, joins scene 101, will it see the door open or close?

The question is if when a user enters a new scene it replicates the current state of the scene that has been changing by another users who were already on that scene.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, yes. The state of the scene (CRDT State) should be saved, and when it enters the scene room, it should receive the state and synchronize the scene.

So User B will see the door opened when it joins the scene and the state gets syncronized.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, what @kuruk-mm said is correct.

content/ADR-204-comms-architecture.md Outdated Show resolved Hide resolved
content/ADR-204-comms-architecture.md Outdated Show resolved Hide resolved
content/ADR-204-comms-architecture.md Outdated Show resolved Hide resolved
pentreathm and others added 3 commits October 28, 2024 11:02
@pentreathm
Copy link
Member Author

pentreathm commented Oct 28, 2024

  • Is it possible to connect to multiple scene rooms at the same time, or is only one scene room connection allowed at once?

A user should always be connected to only one scene at a time, only one scene connection should remain open.

  • With this design, are we going to lose the ability to walk around with a friend, explore Decentraland, and voice chat together? It seems like you'll be reconnecting with each scene change, and I think that's one of the best features of Islands. Has this been considered in the design? Are there any future plans for private rooms where we can stay with our friends to mitigate this issue?

Indeed, this feature will be lost in this initial version, it was a necessary trade-off in the design to give scene owners greater control over audio streams and the ability to moderate content. That said, there is potential for improvements. For instance, if a user is connected to Archipelago but not to a specific Scene, they could leverage that channel to share audio, enabling this use case. Additionally, a future "party mode" could be developed using the comms gatekeeper, allowing users to create a private communication channel with friends. This would facilitate sharing audio and text chat, irrespective of other connections in the environment.

  • I read in the "Consequences" section that you're not going to hear someone who is near you but in a different scene. If the reason for this is to give the scene owner control over the voice chat content, shouldn't the nearby chat be treated the same way?

yes, and I agree with your point, however, the decision was made because text chat is generally less intrusive for other users compared to voice chat. With voice, individuals can disrupt the experience for others, whereas text chat can simply be muted or turned off when needed.

cc @kuruk-mm

* Describe Compression and Encoding algorithms

Signed-off-by: Mikhail Agapov <[email protected]>
Comment on lines +473 to +494
**Parcel Linearization**

```csharp
public static class GenesisCityData
{
public static readonly Vector2Int MIN_PARCEL = -150 * Vector2Int.one;
public static readonly Vector2Int MAX_PARCEL = new (163, 158);
}

public int MinX => GenesisCityData.MIN_PARCEL.x - terrainData.borderPadding;
public int MinY => GenesisCityData.MIN_PARCEL.y - terrainData.borderPadding;
public int MaxX => GenesisCityData.MAX_PARCEL.x + terrainData.borderPadding;
public int MaxY => GenesisCityData.MAX_PARCEL.y + terrainData.borderPadding;

public int Encode(Vector2Int parcel) =>
parcel.x - MinX + ((parcel.y - MinY) * width);

public Vector2Int Decode(int index) =>
new ((index % width) + MinX, (index / width) + MinY);
```

`terrainData.borderPadding = 2` in the new client.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this definitely MUST NOT be hard-coded here in the ADR. It could be assumed with the actual map size or included in another message with the part of the constants.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, we should talk about another way to set up the relative parcel you are. Setting 17 bits to this index is a hard-limit that we already are close with Genesis City.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try to avoid referencing new client and make it more generic

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to ensure that that different clients can properly decode the message we need to have this included in ADR.
I agree that all the information should be here in the ADR. But we can approach an identical implementation even if we remove the genesis city shape&size from here.

Setting different origins for player to refer the relative position is a good approach. Then, we can take advantage of that for sending less data. But I think it's a bad idea to put Genesis City shape and size here to index parcels. I also think 17bits is a hard-limit and really close to what we currently have for Genesis-City.

I can add the idea of using another message to send the references/settings instead of hard-coding it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, we could expose it in the /about under comms if needed. @mikhail-dcl what do you think about these options?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • The number of bits needed to encode parcels can be dynamic but in practice, we never expanded it
  • I have no objections to having an endpoint to provide us with the Genesis city description.
  • Still, the size should fit the structure defined as int32 or int64 can't grow per se

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have that data available in the /about endpoint right now.

Example: https://realm-provider-ea.decentraland.org/main/about

We can process the sizes in the configurations.map.sizes and get the minimum bottom, maximum top, minimum left, and maximum right values. Then, we can use these values.

...
  "configurations": {
    "map": {
      ...
      "sizes": [
        { "left": -150, "top": 150, "right": 150, "bottom": -150 },
        { "left": 62, "top": 158, "right": 162, "bottom": 151 },
        { "left": 151, "top": 150, "right": 163, "bottom": 59 }
      ],
      ...
    }
  },

If you query Genesis Plaza, you will get:

Maximum right: 163
Minimum left: -150
Maximum top: 158
Minimum bottom: -150

which is the same data used in your definition:

            public static readonly Vector2Int MIN_PARCEL = -150 * Vector2Int.one;
            public static readonly Vector2Int MAX_PARCEL = new (163, 158);

here is the ADR of that spec: https://adr.decentraland.org/adr/ADR-250

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We decided to move parcel information to Metadata: it will no longer be compressed and will be set only when the parcel is changed. Thus, your concern will be covered and no extra info is needed.

We need to try it out first in E@ to understand how reliable it is before moving on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mikhail-dcl could you please update the ADR so that we can resolve the comments and merge it?

content/ADR-204-comms-architecture.md Outdated Show resolved Hide resolved
Comment on lines +332 to +339
### Movement (Compressed)

```protobuf
message MovementCompressed {
int32 temporal_data = 1; // bit-compressed: timestamp + animations
int64 movement_data = 2; // bit-compressed: position + velocity
}
```
Copy link
Contributor

@leanmendoza leanmendoza Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey folks, since adding all the compressed complexity (and the assumption in each part) is not cheap, I imagine you have an analysis report to share the trade-off here. Adding that analysis to understand the real motivation behind this would be interesting. Also, it'd be interesting to have the comparison in the practice (maybe benchmarks or theoretical scenarios):

  • We have a 48-byte message (uncompressed) vs. a 12-byte message (MovementCompreesed). You could reduce the 48-byte message to 40-byte or fewer with significantly less commitment and assumptions. But let's take 48/12=4; this is the gain ratio. Have you considered the LiveKit overhead that this is sent on top of that? This is directly affected, and it results in a lower gain ratio. Then you also have the WebRTC overhead and then the UDP/IP layer (I imagine this is sent by the unreliable way)

  • What are the rationale behind each constant? I can guess some of them, but let's not leave it to the imagination.

  • Instead of having the constant, would it be more suitable to have a MovementHeader (to be announced with the profile maybe) where all the constant are set it up for that profile? It'd be far flexible and you avoid explain the constant here, only the limitations. For instance, you send the minValues, maxValues, nBits, etc.

content/ADR-204-comms-architecture.md Show resolved Hide resolved
content/ADR-204-comms-architecture.md Outdated Show resolved Hide resolved
Comment on lines +473 to +494
**Parcel Linearization**

```csharp
public static class GenesisCityData
{
public static readonly Vector2Int MIN_PARCEL = -150 * Vector2Int.one;
public static readonly Vector2Int MAX_PARCEL = new (163, 158);
}

public int MinX => GenesisCityData.MIN_PARCEL.x - terrainData.borderPadding;
public int MinY => GenesisCityData.MIN_PARCEL.y - terrainData.borderPadding;
public int MaxX => GenesisCityData.MAX_PARCEL.x + terrainData.borderPadding;
public int MaxY => GenesisCityData.MAX_PARCEL.y + terrainData.borderPadding;

public int Encode(Vector2Int parcel) =>
parcel.x - MinX + ((parcel.y - MinY) * width);

public Vector2Int Decode(int index) =>
new ((index % width) + MinX, (index / width) + MinY);
```

`terrainData.borderPadding = 2` in the new client.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try to avoid referencing new client and make it more generic

@aixaCode
Copy link
Contributor

aixaCode commented Dec 3, 2024

Should we also mark the older comms ADR as deprecated as we introduced the new state?

@leanmendoza
Copy link
Contributor

Should we also mark the older comms ADR as deprecated as we introduced the new state?

Only after this had lived for at least year (or a period that makes sense) I'd set the current comms as deprecated. There is no rush; this is still experimental and not implemented by other clients except by the new Unity Explorer.

@aixaCode
Copy link
Contributor

aixaCode commented Dec 4, 2024

Should we also mark the older comms ADR as deprecated as we introduced the new state?

Only after this had lived for at least year (or a period that makes sense) I'd set the current comms as deprecated. There is no rush; this is still experimental and not implemented by other clients except by the new Unity Explorer.

Agree, there is definitely no rush, just would like to avoid situation when potential new clients start using the old version of the comms.

- **Stats**: A service that exposes users' positions and islands to display realm statistics, providing insights into hot scenes and identifying where crowds are gathering.
- **[NATS](https://nats.io/)**: Message broker to exchange information between services.

**Rom Limits**: The transport layer (LiveKit) does not impose a strict limit on the number of participants in a single room. The real constraint comes from the end user’s bandwidth and the client’s ability to process data efficiently to render nearby players. In Archipelago, islands are capped at a maximum of 100 players, a limitation originally introduced to prevent performance degradation when rendering large numbers of avatars simultaneously.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
**Rom Limits**: The transport layer (LiveKit) does not impose a strict limit on the number of participants in a single room. The real constraint comes from the end user’s bandwidth and the client’s ability to process data efficiently to render nearby players. In Archipelago, islands are capped at a maximum of 100 players, a limitation originally introduced to prevent performance degradation when rendering large numbers of avatars simultaneously.
**Room Limits**: The transport layer (LiveKit) does not impose a strict limit on the number of participants in a single room. The real constraint comes from the end user’s bandwidth and the client’s ability to process data efficiently to render nearby players. In Archipelago, islands are capped at a maximum of 100 players, a limitation originally introduced to prevent performance degradation when rendering large numbers of avatars simultaneously.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to fix the typo in WebScoket Connector -> WebSocket Connector? 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants