January 22, 2024 · Boston
llama.cpp: Local Quantized LLMs
Overview
Running smaller, quantized open-source models on your own computer is getting popular, so I thought I would demo how I do that with llama.cpp.
Links
C/C++ LLM inference using ggml, supporting GGUF quantization and diverse backends.
Tech stack