Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Differing behavior for handling run tags between copy_run and import_run in Databricks #29

Open
smurching opened this issue Dec 9, 2021 · 3 comments

Comments

@smurching
Copy link
Contributor

smurching commented Dec 9, 2021

In Databricks, I noticed there's a behavior difference when calling the RunExporter.export_run + RunImporter.import_run APIs vs just calling the RunCopier.copy_run APIs, to push runs between workspaces. The former approach seems to skip importing internal MLflow tags (e.g. original job ID/job run ID) but the latter just directly copies all tags over (see logic), which results in broken links to e.g. Databricks notebooks or jobs in the run UI. Filing this issue to facilitate discussion

@smurching
Copy link
Contributor Author

One alternative could be to copy over tags known to be absolute/workspace agnostic, e.g. tags like mlflow.source.git.commit should just be copied over since they should be the same regardless of what workspace the run is in.

@amesar
Copy link
Owner

amesar commented Dec 13, 2021

  • I've added to the README files that the copy mode is deprecated.
  • Since the MLflowClient constructor doesn't allow you to pass in credentials, you can't have two (or more) clients active in the same program. This is needed to make the copy mode reliable and maintable. Currently the credentials are obtained by singleton global magic variables (environment variables). Sort of violates dependency injection principles. ;)
  • This results in squirrelly code to overcome this limitation.
  • It makes the copy mode very difficult to code, test and maintain.
  • Besides, under the hood copy mode doesn't provide much advantage since the artifacts will still have to be exported into an intermediate location just like in the export/import mode.
  • Copy mode is basically syntactic sugar over the underlying code export/import mode.
  • So I'd recommend sticking with the export/import mode which has been "battle tested".
  • Do you have a specific use case to prefer copy mode over export/import mode?

@smurching
Copy link
Contributor Author

smurching commented Dec 14, 2021

Since the MLflowClient constructor doesn't allow you to pass in credentials, you can't have two (or more) clients active in the same program. This is needed to make the copy mode reliable and maintable. Currently the credentials are obtained by singleton global magic variables (environment variables). Sort of violates dependency injection principles. ;)

This is something we can make clearer in the MLflow docs :) but it actually is possible to have multiple clients active in the same program - basically what I was doing was supplying the src_client and dst_client parameters of RunCopier and relying on Databricks secrets to configure them to talk to different workspace as described here

Do you have a specific use case to prefer copy mode over export/import mode?

Good question, mostly just to avoid writing code to export to a tmpfile and reimport it into my new workspace, i.e. the code with copy is more declarative + closer to my underlying intention. But it's definitely not a must-have :). The bigger issue I ran into was actually #27

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants