tesseract-ocr/tesseract
This is the main upstream repository for the Tesseract OCR engine, a widely used open-source project with 73k+ stars and 10.5k+ forks. It is active, not archived, and has recent commits through 2026-03-29. Forks are most interesting if you care about OCR, language support, packaging, or engine internals.
Choose a fork to inspect
Prefer upstream Tesseract for almost all uses. Choose this fork only if you need its exact 2024 snapshot for reproducibility or archival reasons.
Choose upstream unless you need this exact frozen snapshot. This fork does not show added functionality and is materially behind current upstream, so it is a weak adoption candidate for new work.
Prefer upstream unless you specifically need the older May 2024 snapshot; this fork adds no visible capabilities and is best treated as stale, archival, or reproducibility-focused.
Choose this fork only if your priority is legacy Windows/Visual Studio build convenience. For most adopters, upstream is the better default because this fork is heavily dated and far behind current Tesseract maintenance.
Prefer upstream unless you specifically need this exact historical snapshot; this fork shows no added capability and is materially behind current Tesseract.
Prefer upstream unless you need this exact older snapshot. This fork adds no observable features and is 219 commits behind, so it is mainly useful as a frozen baseline rather than a better-maintained OCR engine.
Choose this fork only if Windows Runtime or legacy Microsoft app integration is the priority. For general OCR work, upstream is the better default because it is far newer and actively maintained.
Choose this fork only if you need its specific downstream behavior or legacy integration. If you want current Tesseract with minimal maintenance risk, upstream is the safer choice because this fork is heavily diverged and appears stale relative to main.
Prefer this fork only if you need the JavaScript/WASM-oriented Tesseract 4.1.1 branch and its custom build/image-format work. If you want a current, broadly maintained OCR engine, upstream is the better choice.