In the `scoring.get_eval_score.get_eval_score` function, the model is loaded with the following code when `flash_attn` is installed:
```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,  # this does not hurt performance much
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    f"{request.repo_namespace}/{request.repo_name}",
    revision=request.revision,
    quantization_config=quant_config,
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
    device_map="sequential",
    cache_dir=f"model_cache_dir/{cache_path}",
)
```
In this case I get the following result: `{'eval_score': 0.5030202252204222, 'creativity_score': 0.4620578276830795}`.
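For context, the branch between the two loading paths presumably depends on whether `flash_attn` is importable. A minimal sketch of such a check (the helper name is illustrative, not from the repo):

```python
import importlib.util


def pick_attn_implementation(package: str = "flash_attn") -> str:
    """Choose the attn_implementation argument for from_pretrained,
    depending on whether `package` is importable in this environment."""
    if importlib.util.find_spec(package) is not None:
        return "flash_attention_2"
    return "eager"  # fall back to the default PyTorch attention
```

The value returned here would then be passed as `attn_implementation=...`; any difference in scores between the two branches should ideally come only from this argument, not from a different quantization setup.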
If the `flash_attn` package is not installed, then the model is loaded with this code:
And in this case I get a completely different result: `{'eval_score': 0.8671109773311513, 'creativity_score': 0.18112541666424237}`.

Two questions:
P.S.: I can prepare a PR if you wish.