Smaug-72B-v0.1: The New Open-Source LLM Roaring to the Top of the Leaderboard

hexual@lemmy.world · 10 months ago

FaceDeer@kbin.social · 10 months ago

And at 72 billion parameters it’s something you can run on a beefy but not special-purpose graphics card.

glimse@lemmy.world · 10 months ago

Based on the other comments, it seems like this needs 4x as much ram than any consumer card has

FaceDeer@kbin.social · 9 months ago

It hasn’t been quantized, then. I’ve run 70B models on my consumer graphics card at a reasonably good tokens-per-second rate.