v6.5.0 - Llama 3.1 & MiniCPM v2

@BBC-Esq released this 07 Aug 18:13
ccc5d5b

General updates

  • Removed the triton dependency, since the cogvlm vision model has also been removed.
  • Redid all benchmarks with more accurate parameters.

Local Models

Overall, the large number of chat models had become unnecessary or redundant. To simplify the user experience, I removed the models that weren't providing optimal responses and added Llama 3.1.

Removed Models

  • Qwen 2 - 0.5b
  • Qwen 1.5 - 0.5b
  • Qwen 2 - 1.5b
  • Qwen 2 - 7b
    • Redundant with Dolphin Qwen 2 - 7b
  • Yi 1.5 - 6b
  • Stablelm2 - 12b
  • Llama 3 - 8b
    • Redundant with Dolphin Llama 3 - 8b

Added Models

  • Dolphin Llama 3.1 - 8b
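For orientation only, here is a minimal sketch of running a Dolphin Llama 3.1 - 8b checkpoint with plain transformers. The repo id is an assumption, and the app's own loading path (quantization, chat-template handling) may differ.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id for a Dolphin Llama 3.1 8b checkpoint; verify against the app's config.
repo = "cognitivecomputations/dolphin-2.9.4-llama3.1-8b"

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "In one sentence, what does a vector database do?"},
]
# Build the prompt with the model's own chat template, then generate a reply.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```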

Vision Models

Overall, two vision models were removed as unnecessary and MiniCPM-V-2_6 - 8b was added. As of this release, MiniCPM-V-2_6 - 8b is the best vision model in terms of quality, and I recommend it if you have the time and VRAM. A minimal usage sketch follows after the list below.

Removed Models

  • cogvlm
  • MiniCPM-Llama3
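For anyone who wants to try MiniCPM-V-2_6 - 8b outside the app, here is a minimal sketch of asking it about an image. It assumes the openbmb/MiniCPM-V-2_6 Hugging Face repo and the chat() interface exposed by that repo's remote code; the application's own loading code may differ.

```python
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

repo = "openbmb/MiniCPM-V-2_6"  # assumed repo id; not necessarily what the app uses

# The model ships its own modeling code, so trust_remote_code is required.
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModel.from_pretrained(
    repo, trust_remote_code=True, torch_dtype=torch.bfloat16
).eval().cuda()

image = Image.open("example.jpg").convert("RGB")
msgs = [{"role": "user", "content": [image, "Describe this image in one sentence."]}]

# chat() is provided by the repo's remote code; its signature may change between revisions.
answer = model.chat(image=None, msgs=msgs, tokenizer=tokenizer)
print(answer)
```

In bf16, an 8b model needs roughly 16 GB of VRAM for the weights alone, which is where the time-and-VRAM caveat above comes from.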

Vector Models

  • Added Stella_en_1.5B_v5, which ranks very high on the MTEB leaderboard. A minimal usage sketch follows below this list.
    • Note: this is a work in progress, as the current results seem sub-optimal.
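Here is a minimal retrieval sketch with the new vector model, assuming the dunzhang/stella_en_1.5B_v5 Hugging Face repo and its published "s2p_query" prompt name; the plugin may wrap the model differently.

```python
from sentence_transformers import SentenceTransformer, util

repo = "dunzhang/stella_en_1.5B_v5"  # assumed repo id
model = SentenceTransformer(repo, trust_remote_code=True)

docs = [
    "Dolphin Llama 3.1 - 8b was added as a chat model.",
    "MiniCPM-V-2_6 - 8b is the recommended vision model.",
]
query = "Which vision model is recommended?"

# Stella expects an instruction prompt on the query side; "s2p_query" is the
# sentence-to-passage prompt name shipped with the model (assumption).
query_emb = model.encode([query], prompt_name="s2p_query")
doc_emb = model.encode(docs)

print(util.cos_sim(query_emb, doc_emb))  # higher score = better match
```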

Current Chat and Vision Models

[chart_chat: chart of current chat models]
[chart_vision: chart of current vision models]