Qihoo360/360-LLaMA-Factory (activity: slowing; divergence: significant)
Choose this fork if you need sequence parallelism and can tolerate upstream lag and merge overhead. Choose upstream if you want the broadest, freshest LlamaFactory feature set and easier maintenance.
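For concreteness, a minimal sketch of how a sequence-parallel run could be configured through this fork. It assumes the fork exposes a `sequence_parallel_size` training argument on top of the standard LlamaFactory YAML options (verify the exact key against the fork's README); the model name, dataset, and values are placeholders.

```python
# Hypothetical sketch: write a training config that enables sequence parallelism.
# `sequence_parallel_size` is the fork-specific addition (assumption: check the
# fork's README for the exact name); the other keys are standard LlamaFactory
# SFT options.
import yaml

config = {
    "model_name_or_path": "meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder
    "stage": "sft",
    "do_train": True,
    "finetuning_type": "full",
    "dataset": "alpaca_en_demo",
    "cutoff_len": 65536,          # long-context training is the point of SP
    "sequence_parallel_size": 4,  # fork-specific: shard each sequence across 4 GPUs
    "output_dir": "saves/llama3-8b-sp",
    "bf16": True,
}

with open("sp_sft.yaml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)

# Launch with the usual CLI entry point:
#   llamafactory-cli train sp_sft.yaml
print("wrote sp_sft.yaml; run: llamafactory-cli train sp_sft.yaml")
```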
xiezhe-24/ChatTS-Training (activity: active; divergence: significant)
Choose this fork if you need a ChatTS-oriented training workflow and can accept divergence from upstream. Stay with upstream if you want the widest model/backend support, fresher fixes, and the full documentation/demo surface.
LUMIA-Group/PonderingLM (activity: stale; divergence: significant)
Choose this fork if your goal is pondering/continuous-space research and you want the paper's implementation details. Choose upstream if you need the broader, actively maintained fine-tuning platform with current backend support and the workflows this fork omits.
g1f1/LLaMA-Factory (activity: stale; divergence: significant)
Choose this fork only if its custom training/data behavior matches your needs and you can tolerate the maintenance debt. If you want broad model support and up-to-date training backends, upstream is the safer default.
Choose this fork if you want PEFT-focused fine-tuning plus benchmark/evaluation helpers and can tolerate upstream lag. Choose upstream if you need the latest model/backend support and the broadest maintained feature set.
OpenLLM-Ro/LLaMA-Factory (activity: stale; divergence: significant)
Prefer this fork if you are explicitly building Romanian LLM workflows and want the fork's curated data/branding/customization. Prefer upstream if you need the newest training backends, broader docs/examples, or maximum compatibility with the fast-moving LlamaFactory ecosystem.
emrecanacikgoz/Medical-Factory
Choose this fork if you want the added dataset package and are comfortable being behind upstream; choose upstream if you need the newest training backends, fixes, and broader model support.
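For context on what an "added dataset package" typically amounts to: upstream LlamaFactory registers custom datasets through data/dataset_info.json, so a domain fork's delta is often bundled data files plus registration entries. The file name and column mapping below are illustrative (alpaca-style), not this fork's actual data.

```python
# Hypothetical sketch: register a bundled dataset with LlamaFactory by adding an
# entry to data/dataset_info.json. The dataset name, file, and column mapping are
# placeholders, not the fork's actual package contents.
import json

entry = {
    "medical_demo": {
        "file_name": "medical_demo.json",  # placed under data/
        "columns": {                       # map dataset fields to the alpaca schema
            "prompt": "instruction",
            "query": "input",
            "response": "output",
        },
    }
}

with open("data/dataset_info.json", "r+") as f:
    info = json.load(f)
    info.update(entry)
    f.seek(0)
    json.dump(info, f, indent=2, ensure_ascii=False)
    f.truncate()
# Then reference it in a training config: dataset: medical_demo
```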
githisw/LLaMA-Factory (activity: stale; divergence: significant)
Prefer this fork if your priority is long-sequence training with Ulysses and you want that capability integrated into LlamaFactory. Prefer upstream if you want current model support, broader feature coverage, and lower maintenance risk.
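To make the Ulysses approach concrete, here is an illustrative single-process sketch (not the fork's code) of the data movement Ulysses-style sequence parallelism performs: sequence shards are exchanged for head shards before attention, so attention itself runs unmodified over full-length sequences. The all-to-all is simulated with plain tensor ops.

```python
# Illustrative sketch of the Ulysses data movement: each of P ranks starts with a
# sequence shard of shape [L/P, H, D]; an all-to-all regroups the data so each
# rank holds the full sequence for H/P heads. Simulated here on one process.
import torch

P, L, H, D = 4, 16, 8, 32          # ranks, seq len, heads, head dim
x = torch.randn(L, H, D)           # the full activation, for reference

# Step 1: sequence-parallel layout -- rank p holds x[p*L//P:(p+1)*L//P]
seq_shards = list(x.chunk(P, dim=0))             # P tensors of [L/P, H, D]

# Step 2: all-to-all (simulated): every rank sends its sequence slice of each
# head group and receives the full sequence for its own head group.
head_shards = []
for p in range(P):                               # "rank p" after the exchange
    mine = torch.cat([s.chunk(P, dim=1)[p] for s in seq_shards], dim=0)
    head_shards.append(mine)                     # [L, H/P, D]: full seq, fewer heads

# Attention can now run per-rank over full-length sequences; a reverse all-to-all
# restores the sequence-sharded layout afterwards.
assert head_shards[0].shape == (L, H // P, D)
recon = torch.cat(head_shards, dim=1)            # undo the exchange
assert torch.equal(recon, x)                     # matches the original exactly
```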
WGS-note/LLaMA-Factory (activity: stale; divergence: significant)
Prefer this fork only if you specifically need the channel-loss work and the bundled experimental assets. For general LlamaFactory adoption, upstream is the better choice because this fork is stale and materially behind on recent fixes and backend support.
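For readers unfamiliar with the term, "channel loss" here refers to tracking training loss separately per data source. A minimal sketch of that idea follows, with hypothetical names rather than the fork's API: compute token-level cross-entropy once, then aggregate per channel so each data source can be monitored on its own.

```python
# Minimal sketch of per-channel loss tracking, assuming every sample carries an
# integer channel/source id. Names are illustrative, not the fork's API.
import torch
import torch.nn.functional as F

def channel_losses(logits, labels, channel_ids, ignore_index=-100):
    """logits: [B, T, V]; labels: [B, T]; channel_ids: [B] integer source tags."""
    B, T, V = logits.shape
    # Per-token loss, keeping the batch dimension so we can group by channel.
    tok_loss = F.cross_entropy(
        logits.reshape(B * T, V), labels.reshape(B * T),
        ignore_index=ignore_index, reduction="none",
    ).reshape(B, T)
    mask = (labels != ignore_index).float()
    per_sample = (tok_loss * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)
    return {
        int(c): per_sample[channel_ids == c].mean().item()
        for c in channel_ids.unique()
    }

# Example: two channels in one batch; log each channel's loss separately.
logits = torch.randn(4, 8, 100)
labels = torch.randint(0, 100, (4, 8))
print(channel_losses(logits, labels, torch.tensor([0, 0, 1, 1])))
```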
littepan/LLaMA-Efficient-Tuning (activity: stale; divergence: significant)
Choose this fork only if you specifically want the older, simpler PEFT/QLoRA training workflow and its bundled data/fixes. For anyone starting fresh or needing current model, backend, and multimodal support, upstream LlamaFactory is the better default.
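For comparison, the core of such a PEFT/QLoRA workflow is short with today's stock libraries (transformers, peft, bitsandbytes), independent of either the fork or upstream LlamaFactory; the model name and hyperparameters below are illustrative placeholders.

```python
# Sketch of a stock QLoRA setup: 4-bit base weights plus trainable LoRA adapters.
# Model name and hyperparameters are placeholders.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb = BitsAndBytesConfig(
    load_in_4bit=True,                      # QLoRA: quantized 4-bit base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", quantization_config=bnb, device_map="auto"
)
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # attention projections only
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)         # only adapter weights are trainable
model.print_trainable_parameters()
# ...then train with transformers.Trainer or a plain loop as usual.
```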