Github user.role is potentially inaccurate in graphs with multiple github orgs? #1374

Open
danbrauer opened this issue Oct 31, 2024 · 3 comments
Labels
data-addition Describes adding new data to the graph GitHub Related to GitHub intel module

Comments

@danbrauer
Contributor

danbrauer commented Oct 31, 2024

In the process of working on PR 1373 I noticed what seems like an edge case for users who are in multiple GitHub organizations: imagine a user who is an 'ADMIN' of Org1 and a 'MEMBER' of Org2. The user node's 'role' property will say either 'ADMIN' or 'MEMBER', so it will be incorrect with respect to one of the orgs.
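To make the edge case concrete, here is a minimal sketch of the failure mode. It is not cartography's actual sync code: `user_nodes` and `sync_org_members` are hypothetical stand-ins for the graph and the per-org sync, and it assumes orgs are synced one after another, each writing the role onto the shared user node.

```python
# Hypothetical in-memory stand-in for the graph: one node per user login.
user_nodes: dict[str, dict[str, str]] = {}


def sync_org_members(org: str, members: dict[str, str]) -> None:
    """Upsert user nodes, storing the org-specific role on the node itself."""
    for login, role in members.items():
        node = user_nodes.setdefault(login, {"login": login})
        # The role is a node property, so each org sync overwrites
        # whatever the previously synced org wrote.
        node["role"] = role


sync_org_members("Org1", {"alice": "ADMIN"})
sync_org_members("Org2", {"alice": "MEMBER"})

# Only the role from the last synced org survives.
print(user_nodes["alice"]["role"])
```

Under these assumptions, which role ends up on the node depends only on sync order, so it is wrong for one of the two orgs either way.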

This was not an issue for my PR specifically, but thinking it through did make me think of an alternative graph that might be clearer. Cartography could graph these relationships like so:

(User)-[MEMBER_OF|ADMIN_OF|UNAFFILIATED]-(Org)

And then the user.role property would go away (because its info would now be encoded in the type of the relationship between user and org).

I haven't looked into implementing this yet, because I wasn't sure if it's of any interest, or if maybe there is a reason for the way things are currently done. It would be a 'breaking' change, since it removes a property from a node and would change the relationship type of some users (i.e. admins currently are MEMBER_OF but would become ADMIN_OF).

Does this make sense and, if we wanted to make this update in one of our upcoming sprints, would that be welcome? It's not critical for us, and we have other changes we'd like to make, but it might be something I can do.

  • Cartography release version 0.93
  • Python version: 3.12
@achantavy
Contributor

achantavy commented Nov 1, 2024

Mm, yeah everything you wrote sounds correct. The code right now does not support the idea of a user having different 'role' properties for different orgs. How about a schema like this:

  1. A user belongs to a github org. This is the "tenant relationship". cartography convention is to have one label be that specific relationship so that we can determine what org a user belongs to in a uniform way -- that is, without needing to account for multiple possibilities of label values connecting back to the org.
(:GitHubOrganization)<-[:MEMBER_OF]-(:GitHubUser)
  2. Some users can be admins of github orgs:
(:GitHubOrganization)<-[:ADMIN_OF]-(:GitHubUser)

So in the case where a user is an admin, we would have 1 MEMBER_OF edge + 1 ADMIN_OF edge, so it would be a bit redundant, but that's fine. It'd look something like

(:GitHubUser)-[:MEMBER_OF]->(:GitHubOrganization)
          |                 ^
           -[:ADMIN_OF]----|
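As a rough illustration of that schema (a sketch only, not cartography code), the edges could be derived from per-org role data as below. `Edge`, `build_edges`, `orgs_of`, and `admin_orgs_of` are hypothetical names, and the input shape (org name mapped to login and role) is an assumption.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    user: str
    rel: str
    org: str


def build_edges(memberships: dict[str, dict[str, str]]) -> set[Edge]:
    """Emit MEMBER_OF for every user in an org, plus ADMIN_OF for admins."""
    edges: set[Edge] = set()
    for org, members in memberships.items():
        for login, role in members.items():
            # Tenant relationship: always present, whatever the role.
            edges.add(Edge(login, "MEMBER_OF", org))
            if role == "ADMIN":
                # Redundant with MEMBER_OF by design.
                edges.add(Edge(login, "ADMIN_OF", org))
    return edges


def orgs_of(edges: set[Edge], login: str) -> list[str]:
    """Orgs a user belongs to, found via the single MEMBER_OF type."""
    return sorted(e.org for e in edges if e.user == login and e.rel == "MEMBER_OF")


def admin_orgs_of(edges: set[Edge], login: str) -> list[str]:
    """Orgs where the user is an admin."""
    return sorted(e.org for e in edges if e.user == login and e.rel == "ADMIN_OF")


edges = build_edges({
    "Org1": {"alice": "ADMIN"},
    "Org2": {"alice": "MEMBER", "bob": "MEMBER"},
})
print(orgs_of(edges, "alice"))        # ['Org1', 'Org2']
print(admin_orgs_of(edges, "alice"))  # ['Org1']
```

Because the role lives on the edge rather than on the user node, the per-org answer no longer depends on sync order, and membership lookups only ever need to match one relationship type.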

Thoughts?

cc: @heryxpc @ramonpetgrave64 @chandanchowdhury

@achantavy achantavy added GitHub Related to GitHub intel module data-addition Describes adding new data to the graph labels Nov 1, 2024
@danbrauer
Contributor Author

I like it. I will wait for others to chime in.

Also, a little more thinking on why this could be valuable: maybe multiple orgs in one enterprise is a bit of an edge case, and I am pretty sure GitHub says it's an anti-pattern and tries to discourage it, but I think it still happens, and it can't be bad for the graph to be more accurate, especially for a relationship as important as member vs admin.

(For my part, we have other GitHub module contributions we'll need to work on first, but, assuming this is still not done, I could then argue to my manager that I can do it while all the GitHub module stuff is fresh in my mind.)

@achantavy
Contributor

I agree, cartography is all about verifying assumptions, especially when things are used in a way that was not intended.
