Failed to install on macOS M1 #85

Open
KatHellm opened this issue Nov 20, 2024 · 1 comment

@KatHellm

Description of issue

Package installation fails when building the wheel for tokenizers.

System: macOS M1

Tried: installing and updating Rust and rustup, but the error persists. Also attempted with Python 3.9 and 3.12, each in a new environment.

Error details

      warning: variable does not need to be mutable
         --> tokenizers-lib/src/models/unigram/model.rs:265:21
          |
      265 |                 let mut target_node = &mut best_path_ends_at[key_pos];
          |                     ----^^^^^^^^^^^
          |                     |
          |                     help: remove this `mut`
          |
          = note: `#[warn(unused_mut)]` on by default
      
      warning: variable does not need to be mutable
         --> tokenizers-lib/src/models/unigram/model.rs:282:21
          |
      282 |                 let mut target_node = &mut best_path_ends_at[starts_at + mblen];
          |                     ----^^^^^^^^^^^
          |                     |
          |                     help: remove this `mut`
      
      warning: variable does not need to be mutable
         --> tokenizers-lib/src/pre_tokenizers/byte_level.rs:200:59
          |
      200 |     encoding.process_tokens_with_offsets_mut(|(i, (token, mut offsets))| {
          |                                                           ----^^^^^^^
          |                                                           |
          |                                                           help: remove this `mut`
      
      error: casting `&T` to `&mut T` is undefined behavior, even if the reference is unused, consider instead using an `UnsafeCell`
         --> tokenizers-lib/src/models/bpe/trainer.rs:526:47
          |
      522 |                     let w = &words[*i] as *const _ as *mut _;
          |                             -------------------------------- casting happend here
      ...
      526 |                         let word: &mut Word = &mut (*w);
          |                                               ^^^^^^^^^
          |
          = note: for more information, visit <https://doc.rust-lang.org/book/ch15-05-interior-mutability.html>
          = note: `#[deny(invalid_reference_casting)]` on by default
      
      warning: `tokenizers` (lib) generated 3 warnings
      error: could not compile `tokenizers` (lib) due to 1 previous error; 3 warnings emitted
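
For context on the hard error above: rustc rejects code that turns a shared reference (`&T`) into a mutable one (`&mut T`) through a raw-pointer cast, and the log shows the `invalid_reference_casting` lint is denied by default on this toolchain. The following is a minimal sketch of the flagged pattern and two conventional alternatives. It is not the tokenizers source or its actual fix: the `Word` struct and both function names are hypothetical, and `RefCell` (the safe wrapper built on the `UnsafeCell` that the message mentions) is swapped in for illustration.

```rust
use std::cell::RefCell;

/// Hypothetical stand-in for the `Word` type named in the log;
/// this is not the definition used by tokenizers.
#[derive(Debug, Default)]
struct Word {
    symbols: Vec<u32>,
}

// The pattern the compiler rejects, kept as a comment because it does not
// compile while `invalid_reference_casting` is denied:
//
//     let w = &words[i] as *const Word as *mut Word;
//     let word: &mut Word = unsafe { &mut *w };

/// Appends `symbol` to the word at `index` through a shared slice.
/// `RefCell` provides interior mutability, so no pointer cast is needed;
/// the borrow is checked at run time and panics on a conflicting borrow.
fn push_symbol(words: &[RefCell<Word>], index: usize, symbol: u32) {
    let mut word = words[index].borrow_mut();
    word.symbols.push(symbol);
}

/// The same update when the caller can hand out exclusive access:
/// a plain `&mut` slice is enough and no cell type is involved.
fn push_symbol_exclusive(words: &mut [Word], index: usize, symbol: u32) {
    words[index].symbols.push(symbol);
}
```

Calling `push_symbol(&words, 1, 7)` on a `Vec<RefCell<Word>>` mutates the second word without any `unsafe` block. For building an older tokenizers release unchanged, a commonly reported workaround is to allow the lint at install time by setting `RUSTFLAGS="-A invalid_reference_casting"` for the `pip install` command. That only silences the check and leaves the underlying code as it is, and it has not been verified against this project.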

KasperFyhn self-assigned this Nov 21, 2024
@KasperFyhn
Contributor

The issue does not seem to be specific to conspiracies. It stems from old dependencies in the project, which ultimately pull in a rather old version of tokenizers. Updating the dependencies is long overdue, but some of them carry quite a bit of technical debt, e.g. getting some of the models to run with newer versions of spaCy.

I will look into whether we can move tokenizers to a more recent version and get back to you.

A related issue can be found here.
