feat(sparkR): integration of sparkr #765

Merged (1 commit) on Oct 6, 2023

@selmazrg selmazrg commented Jun 19, 2023

This PR adds support for using the R language on Spark. SparkR can be launched with the command: sparkr3-shell

Which issue(s) this PR fixes

Fixes #764

Additional comments

To integrate SparkR, this PR:

  • installs R;
  • creates a symbolic link from the sparkr folder to the R library;
  • renders the /usr/bin/sparkr-shell command.
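
A minimal sketch of the symbolic-link step listed above. The helper name `link_sparkr` and the example paths are hypothetical, since the PR description does not state the actual locations used by the role.

```shell
#!/bin/sh
# link_sparkr SPARKR_DIR R_LIBRARY_DIR
# Hypothetical helper: exposes the SparkR package shipped with Spark
# inside the system R library through a symbolic link.
link_sparkr() {
    sparkr_dir=$1
    r_library=$2

    # Refuse to create a dangling link or to write into a missing library.
    [ -d "$sparkr_dir" ] || { echo "SparkR package not found: $sparkr_dir" >&2; return 1; }
    [ -d "$r_library" ] || { echo "R library not found: $r_library" >&2; return 1; }

    # -s: symbolic, -f: replace an existing link, -n: do not follow an existing link
    ln -sfn "$sparkr_dir" "$r_library/SparkR"
}

# Example call (both paths are assumptions, adjust to the real layout):
# link_sparkr /opt/tdp/spark3/R/lib/SparkR /usr/lib64/R/library
```

Once the link exists, `library(SparkR)` should resolve from a plain R session, provided the linked directory really contains the built package.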

Requires:

A new build of spark3.2 with the sparkR profile activated is required.
To build spark3.2 with sparkR, follow the guidelines in the README.md file of R. Make sure that your build environment has R > 3.5.
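
The build requirement above could be checked and run roughly as follows. This is a sketch under assumptions: the "sparkR profile" is taken to be the Maven profile `-Psparkr` described in Spark's R/README.md, the command is run from the root of the Spark source tree, and `version_gt` and `build_spark_with_r` are hypothetical helper names. `sort -V` is assumed to be available (GNU coreutils).

```shell
#!/bin/sh
# version_gt A B: succeeds when dotted version A is strictly greater than B.
version_gt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# build_spark_with_r: check the R version, then build Spark with SparkR.
# Must be called from the root of the Spark source tree.
build_spark_with_r() {
    # Ask R itself for its version, e.g. "4.1.2".
    r_version=$(Rscript -e 'cat(as.character(getRversion()))') || {
        echo "Rscript not found in the build environment" >&2
        return 1
    }

    version_gt "$r_version" "3.5" || {
        echo "R $r_version is too old, R > 3.5 is required" >&2
        return 1
    }

    # Assumed profile name; any other profiles the distribution needs
    # are not listed in this PR and would have to be added here.
    ./build/mvn -DskipTests -Psparkr package
}
```

Checking the R version first avoids discovering an unsuitable R only partway through a long Maven build.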

Agreements

GuillaumeHold commented Jul 6, 2023

To use spark3r-shell --master yarn, the worker group should be added to [spark3_client].

@Pierrotws Pierrotws left a comment

OK

topology.ini (outdated, resolved)
@rpignolet

We should add a flag to enable SparkR: add a variable spark_enable_r: false to tdp_vars/spark/spark.yml and tdp_vars/spark3/spark3.yml.

@rpignolet rpignolet merged commit 4e52e25 into TOSIT-IO:master Oct 6, 2023
Linked issue: add SparkR