palantir/spark
stale
significant_divergence
Choose this fork if you want Palantir-specific Spark behavior and can live with an older, highly diverged codebase. Choose upstream Spark if you want current features, easier upgrades, and the broadest community support.
thunderain-project/StreamSQL
stale
significant_divergence
Prefer upstream Spark unless you specifically need this fork's legacy StreamSQL/Kafka streaming extensions and are willing to maintain a heavily outdated, highly divergent codebase yourself.
stratiocommit/spark
stale
significant_divergence
Choose this fork only if you need its legacy 1.1.x behavior or custom integrations. For most adopters, upstream Apache Spark is the better choice because this fork is stale, highly divergent, and missing modern Spark capabilities.
IBMSparkGPU/SparkGPU
stale
significant_divergence
Choose this fork only if GPU acceleration is the primary requirement and you can absorb the maintenance burden. For most users, upstream Spark is the safer default because this fork is stale and materially behind.
haizhi-tech/spark
stale
significant_divergence
Prefer this fork only if you need its older Hive/Spark compatibility and are willing to maintain a heavily lagging Spark branch. For most adopters, upstream Apache Spark is the safer choice because this fork is stale and likely missing many newer APIs, fixes, and usability improvements.
rezazadeh/spark
stale
significant_divergence
Choose this fork only if you need an old, historical Spark baseline. For active development, production use, or modern Spark features, upstream is the better choice by a wide margin.
Pierian-Data/spark
stale
significant_divergence
Prefer this fork only if you need an old, frozen Spark baseline. If you want current Spark features, compatibility, or ongoing maintenance, upstream is the better choice by a wide margin.
mapr/spark
stale
significant_divergence
Choose this fork only if you need legacy MapR integration and can accept an old Spark baseline. For anyone starting fresh or wanting current Spark features, upstream Apache Spark is the better fit.
mu5358271/spark-on-fargate
stale
significant_divergence
Prefer this fork only if AWS Fargate serverless deployment is the primary requirement and you can accept a frozen, highly divergent Spark codebase. If you need current Spark features, compatibility, or active upstream support, upstream Apache Spark is the safer choice.
linkedin/spark
stale
significant_divergence
Prefer this fork only if you need its legacy compatibility and custom patches and can accept a large gap from active Apache Spark development. If you want current Spark features, fixes, and ecosystem compatibility, upstream is the better choice.