deepspeedai/DeepSpeed
DeepSpeed is an actively maintained, Apache-2.0-licensed deep learning optimization library for distributed training and inference. Upstream has 41,948 stars and 4,770 forks, with activity as recent as 2026-03-30, which makes it a strong reference point if you care about large-scale model training systems.
Choose a fork to inspect
Choose this fork if your priority is Habana/Gaudi support and you want a DeepSpeed variant already adapted for that platform. Choose upstream if you need the newest DeepSpeed features, fastest bugfix flow, or the least-divergent codebase.
Prefer this fork only if you need its older, customized behavior and are prepared to own maintenance. If you want current DeepSpeed capabilities, active fixes, and modern distributed-training features, upstream is the better choice.
Choose this fork if your priority is accelerator-specific compatibility and you can tolerate lagging upstream features. Choose upstream if you want the latest DeepSpeed capabilities, active maintenance, and lower integration risk.
Choose this fork only if you need its specific older/custom DeepSpeed behavior and are prepared to own major divergence. For most adopters, upstream DeepSpeed is the safer choice because it is active, much newer, and far richer in maintained features.
Choose upstream unless you specifically need this older, unchanged snapshot. This fork does not add capabilities and lags substantially behind current DeepSpeed.
Prefer upstream unless you specifically need this fork's older snapshot or legacy chat/CPU/AMD changes; for new adoption, the fork is too stale and too divergent to be a safe default.
Prefer this fork only if you need Snowflake-specific maintenance or the narrowed codebase it represents. If you want current DeepSpeed features, active upstream alignment, or broad model-system support, upstream is the better choice.
Prefer this fork only if you need its specific 2023-era customizations and can accept major divergence from upstream. For most adopters, upstream DeepSpeed is the safer choice because this fork is stale, heavily rewritten, and likely missing newer features and fixes.
Prefer this fork only if you explicitly want an older, heavily pruned DeepSpeed baseline and are prepared to own maintenance yourself. For most adopters, upstream is the safer choice because this fork is stale and materially diverged.