Change package name from "transformers" to something less generic #24934
Comments
You do realize this would break the existing code of many, many people?
Yes. My theory/suggestion is that HF is still a relatively young library used by a relatively niche community that is used to moving in a rapidly developing field (we're not talking about the C standard lib or something), that a lot of people likely feel this way, and that if this change were implemented it would be looked back on as a good decision ten years later (it's not as if we're new to breaking changes in the Python community; even HF has pushed breaking changes before).
That kind of stuff would be hell for projects like ours; we have many low-level patches in place to extend HF.
Hello there. General TL;DR: No, because for now the libraries are consistent and helpful in becoming the standard. The following comments have sections
Impact
TL;DR: 17M monthly downloads, 1,700 MAUs, 100K+ repositories impacted.
To gain some data-driven perspective on the impact of this change, I checked the PyPI downloads of the 3 libraries and summed last month's figures, giving roughly 17M overall. I'm assuming there is a clear funnel that separates newcomers, explorers, and MAUs (Monthly Active Users). My analysis focused on the last group, as they use the code regularly or might be using the libraries in production or in a work-dependent project. Taking off 4 orders of magnitude, as a pessimistic estimate, the hypothesis gives about 1,700 MAUs.
I then looked at the used-by count on the front page of each repository, as a measure of the number of repositories that depend on the libraries. Transformers is reported as used by 84.4K repositories, datasets by 20.4K, and evaluate by 2.9K. This gives a total of 100K+ repositories this change could affect.
Hypothesis limitations: this data could change with other insights about MAU funnel conversion, actively maintained repositories, and private repositories.
Before going further, and I guess this is a question directly for @geajack: can I help you brainstorm other names, syntactically and semantically aligned, that could help solve your problem?
Considerations in the matter
TL;DR: Other standard names are also taken.
What I understood from the issue is that the generic package name causes interference and cognitive dissonance with respect to the naming standard of other libraries. Then I went to
I really, really tried to benchmark your motivations against Open Source research insights 1 2 3 to get an empathetic, general view of this concern. I'm still maturing it, but my takeaway is that your view may be aligned with some Open Source ideas (yet to be proven representative) that generic names are not proprietary, beyond your individual code problem. However, I invite you to go deeper into the motivations behind Open Source, as there seem to be equally important motivations that drive contributors and users. I encourage you to share with me any mature ideas that might not be aligned with my mental model. If we can go beyond one individual and capture a community or a more general mental model, that would be amazing.
On the other hand, putting myself in Hugging Face's shoes, I couldn't stop thinking broadly about their Open Source sustainability contribution with respect to other companies and proprietary software. Really recommend this reading!
Before going further, and I guess this is a question for @geajack: would it be worth thinking deeply about the trade-off between what the libraries give and what they take? Can I help you brainstorm the utilities you put in evaluate.py and datasets.py in your code and submit a contribution, so we can make your needs available to all coders and avoid frustration?
Responsibility when becoming the standard
TL;DR: The owners' motivation might be to become the standard. They seem worried about that responsibility in many dimensions.
It might be fair to think that the naming in this case reflects an aim of becoming the standard, and I leave it to the reader to analyze whether the owners of the libraries are being responsible with respect to their Open Source duties, beyond the naming, in order to judge coherence.
On my side, the trust level system and contributor management, together with the proactive response with respect to other Open Source responsibilities, speak for themselves. This doesn't entail that they should have a present and future concern on this matter. I guess this is a question for @geajack: do you think we should take this dimension into account for this matter?
Bibliography and Openness
Beyond the cited readings, I really recommend this book. I acknowledge that this response might be dense, so I would like to thank the reader, the owner of this issue, the contributors, and the maintainers for going through this material. As an emotional openness exercise, and following the bravery of @geajack, I must confess it has taken me a significant amount of courage to press Comment on this one.
@SoyGema thanks for the detailed breakdown. First of all I just want to say that I don't intend to present myself as some kind of sponsor for these issues - I just want there to be a place in the issue tracker for people to voice this concern if it is indeed a common concern.
I do think you may have misunderstood the issue at a couple of points, though. In your second section, it sounds like you think the complaint is that because HF is taking up
My most recent use-case for this was wanting a script called
I'm not under the impression that this is a change that can be made tomorrow or even this year. When I opened these issues I pictured them (assuming they didn't just get buried) being the kinds of issues that sit open for years and years accumulating hundreds of comments, acting as an informal community forum before anything is done about them. The only place on the internet I could find someone expressing a similar sentiment was this highly upvoted /r/Python comment, but I suspect a fair few people feel this way.
Hey @geajack, thanks for your response and for the clarification. Thanks also for the Reddit link; that wasn't on my radar until now. As feedback, if you could share a line with the motivations and links behind this issue when it is opened, that would be great!🙂 I'm happy that you already have a workaround for this. Yes, you are correct. I thought that this went beyond a local use of a script and was more library-oriented, due to the impact of the change and my normal sparks under an 'annoying' naming scenario.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
I agree that the naming is annoying. Is it possible to have a wrapper package with another name?
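One way the wrapper idea could look, as a minimal sketch only: an alias registered in sys.modules so an installed module is also importable under a second name. The helper register_alias and the alias hf_json are hypothetical names made up for illustration, and the standard-library json module stands in for the real package so the sketch runs without Hugging Face installed. Nothing here is an existing Hugging Face feature.

```python
import importlib
import sys


def register_alias(alias: str, target: str):
    """Expose the already-installed module `target` under the extra name `alias`.

    Hypothetical helper for illustration; not part of any Hugging Face package.
    """
    module = importlib.import_module(target)
    sys.modules[alias] = module  # later `import <alias>` resolves to this module
    return module


# Stand-in demo: "json" plays the role of the installed library,
# "hf_json" is a made-up alias name.
register_alias("hf_json", "json")

import hf_json  # resolved from sys.modules, no file named hf_json.py needed

print(hf_json.dumps({"ok": True}))
```

Note that an alias like this only adds a second name. The original top-level name is still occupied, so a local file with that name would still collide with the installed package.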
Feature request
I'm repeatedly finding myself in situations where I want to have a package called datasets.py or evaluate.py in my code and can't because those names are being taken up by Huggingface packages. While I can understand how (even from the user's perspective) it's aesthetically pleasing to have nice terse library names, ultimately a library hogging simple names like this is something I find short-sighted, impractical and at my most irritable, frankly rude.
My preference would be a pattern like what you get with all the other big libraries like numpy or pandas:
or things like
If this isn't possible for some technical reason, at least just call the packages something like hf_transformers and so on.
I realize this is a very big change that's probably been discussed internally already, but I'm making this issue and sister issues on each huggingface project just to start the conversation and begin tracking community feeling on the matter, since I suspect I'm not the only one who feels like this.
Sorry if this has been requested already on this issue tracker, I couldn't find anything looking for terms like "package name".
Sister issues:
Motivation
Not taking up package names the user is likely to want to use.
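The collision behind this motivation can be reproduced without installing anything. The sketch below is an illustration under assumptions: hf_shadow_demo is a made-up module name, and two temporary directories stand in for a project folder and for site-packages. It relies on Python searching sys.path in order, with the running script's directory typically placed first, so a local file wins over an installed package of the same name.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def which_module_wins(name: str = "hf_shadow_demo") -> str:
    """Create a local script and a fake installed package with the same name,
    then report which one `import` actually picks up."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        site = root / "site-packages"  # stand-in for the installed-packages dir
        project = root / "project"     # stand-in for the user's own code dir
        (site / name).mkdir(parents=True)
        project.mkdir()
        (site / name / "__init__.py").write_text('ORIGIN = "installed package"\n')
        (project / f"{name}.py").write_text('ORIGIN = "local script"\n')

        saved_path = list(sys.path)
        try:
            # Project directory first, mirroring how a script's own
            # directory usually precedes site-packages.
            sys.path[:0] = [str(project), str(site)]
            importlib.invalidate_caches()
            sys.modules.pop(name, None)
            return importlib.import_module(name).ORIGIN
        finally:
            # Restore the interpreter state so the demo leaves no trace.
            sys.path[:] = saved_path
            sys.modules.pop(name, None)


print(which_module_wins())
```

With a real datasets.py next to a script, the same lookup order means `import datasets` inside that project would find the local file rather than the installed library.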
Your contribution
No - more a matter of internal discussion among core library authors.