• turkishdelight@lemmy.ml · 8 months ago

    You can’t shrink a model to 1/8 of its size and expect it to run at the same quality. But quantization lets me move from a cloud GPU to my laptop’s crappy CPU/iGPU, so I’m fine with that tradeoff.
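
    For anyone curious what that looks like in practice, here’s a rough sketch (assuming llama-cpp-python and a 4-bit GGUF quant; the model filename is just an example) of running a quantized model entirely on the CPU instead of a GPU:

    ```python
    # Minimal sketch: load a 4-bit quantized GGUF model with llama-cpp-python
    # and run it on the CPU only. The model path below is hypothetical.
    from llama_cpp import Llama

    llm = Llama(
        model_path="llama-2-7b.Q4_K_M.gguf",  # ~1/4 the size of the fp16 weights
        n_gpu_layers=0,                        # 0 = keep every layer on the CPU
        n_ctx=2048,                            # context window
    )

    out = llm("Explain quantization in one sentence.", max_tokens=64)
    print(out["choices"][0]["text"])
    ```

    Slower and a bit dumber than the full-precision model on a cloud GPU, but it runs on hardware I already own.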