FileTree export can be improved to reduce the noise #214

Open
jecisc opened this issue Feb 2, 2017 · 3 comments

Comments

@jecisc

jecisc commented Feb 2, 2017

When I review code on GitHub, I often see files marked as modified even though no content was actually added.

For example:

I would like to see two things to improve it:

  • The exporter could avoid rewriting metadata that has not changed
  • The exporter could keep the order of the metadata stable between two exports
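To make the two points above concrete, here is a minimal sketch in Python (not Pharo, and not FileTree's actual code). The function name `write_metadata_if_changed` and the file name used in the usage note are hypothetical. It assumes the metadata is a plain JSON object: keys are sorted so the order is stable, and the file is left alone when the text would be identical.

```python
import json
from pathlib import Path


def write_metadata_if_changed(path: Path, metadata: dict) -> bool:
    """Write metadata as JSON only when the text on disk would differ.

    Hypothetical helper for illustration. Returns True if the file was
    (re)written, False if it was left untouched.
    """
    # sort_keys gives a stable key order, so two exports of the same
    # data always produce the same text.
    new_text = json.dumps(metadata, sort_keys=True, indent=1) + "\n"
    if path.exists() and path.read_text(encoding="utf-8") == new_text:
        return False  # nothing changed: do not touch the file
    path.write_text(new_text, encoding="utf-8")
    return True
```

Calling it twice with the same data, even with the keys given in a different order, for example `write_metadata_if_changed(Path("properties.json"), {"b": 1, "a": 2})` and then `{"a": 2, "b": 1}`, should rewrite the file only the first time.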

What do you think?

@ThierryGoubier
Collaborator

Hi @jecisc, this looks like two slightly different JSON exporters producing different text output for the same data. You would have to look into how Iceberg writes to disk and what happens with the different backends Iceberg has.
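One way to check that hypothesis on a given file is to compare the two versions as data rather than as text. The following is only a sketch in Python, with a hypothetical helper name, and it assumes both versions are valid JSON:

```python
import json


def is_noise_only_change(old_text: str, new_text: str) -> bool:
    """True when two JSON texts differ as text but encode the same data.

    Hypothetical helper for illustration; assumes both inputs parse as JSON.
    """
    if old_text == new_text:
        return False  # no change at all, so no noise either
    # Parsed JSON objects compare equal regardless of key order or whitespace.
    return json.loads(old_text) == json.loads(new_text)
```

If this returns True for the files that show up in a review, the difference comes from formatting or key order only, which would be consistent with two exporters writing the same data differently.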

This issue ties in with #186, which was discussed about a year ago with @npasserini. It is about making a better FileTree writer that changes only the files that need to change, by saving a diff instead of a whole package. I even created a branch for that (issue_186).
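As a rough illustration of the idea (a Python sketch under my own assumptions, not what the issue_186 branch does), a package can be modelled as a mapping from relative file path to file text. The names `plan_changes` and `apply_changes` are hypothetical.

```python
from pathlib import Path


def plan_changes(old: dict, new: dict):
    """Compare two {relative_path: text} snapshots of a package.

    Returns (to_write, to_delete): files whose text is new or different,
    and files that no longer exist in the new snapshot.
    """
    to_write = {rel: text for rel, text in new.items() if old.get(rel) != text}
    to_delete = sorted(rel for rel in old if rel not in new)
    return to_write, to_delete


def apply_changes(root: Path, to_write: dict, to_delete: list) -> None:
    """Touch only the files named in the plan; everything else stays as is."""
    for rel, text in to_write.items():
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text, encoding="utf-8")
    for rel in to_delete:
        (root / rel).unlink(missing_ok=True)  # requires Python 3.8+
```

With this shape, a version that changes one method writes one file, instead of rewriting the whole package directory.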

@jecisc
Author

jecisc commented Feb 3, 2017

Hi!

I think it would indeed be good. I would like to help, but I will not have the time for another month. :(

@ThierryGoubier
Collaborator

No need to hurry. The gain could be significant for large packages: writing a new version to disk could be much faster, and git operations too. But doing a proper diff in the first place could be difficult: do the diff in memory, for example, and you risk missing changes made on disk and messing everything up.
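One possible guard against that risk, sketched in Python with a hypothetical function name, is to check that the disk still matches the in-memory baseline before trusting a diff computed from it. It assumes the baseline is a mapping from relative path to file text and that all files are UTF-8 text.

```python
from pathlib import Path


def disk_drift(root: Path, baseline: dict) -> list:
    """List relative paths where the disk disagrees with the in-memory baseline.

    Covers files changed on disk, files deleted from disk, and files that
    exist on disk but are unknown to the baseline.
    """
    on_disk = {
        p.relative_to(root).as_posix(): p.read_text(encoding="utf-8")
        for p in root.rglob("*")
        if p.is_file()
    }
    all_paths = baseline.keys() | on_disk.keys()
    return sorted(rel for rel in all_paths if baseline.get(rel) != on_disk.get(rel))
```

If the returned list is not empty, the in-memory diff cannot be applied blindly; a writer could then fall back to a full export or re-read the affected files first. Reading every file has a cost of its own, so this trades some of the speed gain for safety.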
