Repository brief

ggml-org/llama.cpp

Read the upstream summary on the left, browse the cached forks below it, and load each fork comparison into the right-hand panel.

Cached analysis
cached 2026-03-29T22:30:12.790Z

ggml-org/llama.cpp

ggml-org/llama.cpp is a very active, widely used open-source LLM inference project written in C/C++. It focuses on local and cloud inference with minimal setup, supports multiple hardware backends and quantization formats, and includes tooling for model conversion, an HTTP server, and a WebUI.

GitHub
Stars: 99,888
Forks: 16,002
Default branch: master
Last pushed: 2026-03-29T21:35:39Z
Best maintained: LostRuins/koboldcpp
Closest to upstream: TheTom/llama-cpp-turboquant
Most feature-rich: cmp-nct/ggllm.cpp
Most opinionated: LostRuins/koboldcpp
Forks

Choose a fork to inspect

6 cached fork briefs