hadoop-env.sh.j2 in two separate locations #770

Open
PACordonnier opened this issue Jun 27, 2023 · 4 comments

@PACordonnier
Member

hadoop-env.sh.j2 is stored in two locations in the repo

  • roles/hadoop/common/templates
  • roles/hdfs/common/templates

I'm not sure this is necessary. From my understanding, roles/hadoop/common/templates/hadoop-env.sh.j2 is used exclusively by hadoop/client, while the HDFS roles use their own hadoop-env.sh.j2.

I think it makes sense to manage only one file (either through a symbolic link or by deleting the unused file).
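
To see how many copies exist and whether they actually differ, something like the following could be run from the repo root. This is only a sketch: the `roles` directory and the file name come from the paths above, while the function name `find_template_copies` is made up for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_template_copies(root, filename="hadoop-env.sh.j2"):
    """Group every copy of `filename` found under `root` by content hash.

    Copies that land in the same group are byte-for-byte identical;
    more than one group means the copies have diverged.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob(filename)):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    return dict(groups)
```

Calling `find_template_copies("roles")` returns a dict keyed by SHA-256 digest. A single key would mean the copies are identical and a symlink or deletion is safe; several keys would mean the files need to be reconciled first.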

@PACordonnier PACordonnier self-assigned this Jun 27, 2023
@rpignolet
Contributor

If this file is rendered at the same location on the target machine, I think it should be templated only by hadoop_client and deleted from the hdfs role. We must check that the hadoop-env content can be rendered with the hadoop tdp_vars.

@PACordonnier
Member Author

That's the issue indeed.

The hdfs hadoop-env.sh.j2 uses vars such as hdfs_datanode_heapsize, which are hdfs vars, not hadoop vars, so rendering it would cause an error in hadoop_client_config.
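
A rough way to spot this kind of mismatch before running the playbook is to list the variables a template references and compare them with the vars available to the role. The sketch below uses a plain regex rather than the Jinja2 parser, so it only catches simple `{{ name }}` references and ignores `{% ... %}` blocks and loop variables. The template line and the `hadoop_log_dir` variable are invented for the example; only `hdfs_datanode_heapsize` comes from the actual template.

```python
import re

# Matches the leading identifier of a simple reference such as
# {{ hdfs_datanode_heapsize }} or {{ some_var | default('x') }}.
VAR_REF = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)")


def missing_vars(template_text, available):
    """Return names referenced in the template but absent from `available`."""
    referenced = set(VAR_REF.findall(template_text))
    return sorted(referenced - set(available))


# Hypothetical template content and hadoop-level vars, for illustration only.
template = (
    'export HDFS_DATANODE_OPTS="-Xmx{{ hdfs_datanode_heapsize }}"\n'
    "export HADOOP_LOG_DIR={{ hadoop_log_dir }}\n"
)
hadoop_vars = {"hadoop_log_dir": "/var/log/hadoop"}

print(missing_vars(template, hadoop_vars))  # prints ['hdfs_datanode_heapsize']
```

Any name in the returned list would have to be defined in the hadoop vars before the template could be rendered at the hadoop_client level.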

@rpignolet
Contributor

This is always the problem with Hadoop, which contains both HDFS and YARN.

I think hadoop-env needs to use hadoop variables from the tdp_vars so that the template can be rendered at the hadoop_client level, even though it contains the memory settings for HDFS components.

I don't understand why the Java heap configuration is in hadoop-env...

Maybe we should discuss it at one of our meetings.

@PACordonnier
Member Author

It's getting worse: hadoop-env is actually in three locations, since it's also in roles/yarn/common/templates 😱 I will update my recent PR.

We can discuss it during a meeting. Now that I have investigated a bit further, I think having 3 files, while not elegant, could actually be an adequate solution.
