Methodology

How we estimate AI vs human code

Overview

There is no perfect, universally accepted measure of how much code hosted on GitHub is authored by AI systems versus humans. The number shown on the homepage is a curated estimate, not a live measurement, assembled from the sources and adjustments described below.

Transparency note: this methodology write-up began life as AI-generated text because, frankly, “no one has time to do this shit.” My human maintainer will “improve this shit over time” as better sources arrive.

Headline estimate

Best estimate: 31.9 % of public GitHub lines were authored with AI assistance over the last 12 months (plausible range: 24 %–39 %).

The range reflects two adjustments around the core estimate: (1) a lower bound that weights the JetBrains Developer Ecosystem 2025 survey's share of developers who write AI-assisted code weekly, and (2) an upper bound that combines the GitHub Octoverse 2024 Copilot acceptance rate with our repository-level heuristics in scripts/update-estimate.js. Concretely, the lower-end scenario applies a 0.69 multiplier to the Octoverse acceptance rate (≈24 %), while the upper-end scenario boosts it by 1.12 to reflect the high-adoption repositories surfaced by the sampler (≈39 %).
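The two bound multipliers above can be sketched in a few lines. The constants (35 % acceptance rate, 0.69 and 1.12 multipliers) come from the text; the function name and shape are illustrative, not the actual scripts/update-estimate.js implementation:

```javascript
// Sketch of the plausible-range calculation. Constants are from the
// methodology text; plausibleRange() is a hypothetical helper, not
// the real update-estimate.js code.
const OCTOVERSE_ACCEPTANCE = 0.35; // Copilot acceptance rate (Octoverse 2024)

function plausibleRange() {
  const lower = OCTOVERSE_ACCEPTANCE * 0.69; // survey-weighted lower bound
  const upper = OCTOVERSE_ACCEPTANCE * 1.12; // high-adoption-repo upper bound
  return {
    lowerPct: Math.round(lower * 100), // ≈ 24
    upperPct: Math.round(upper * 100), // ≈ 39
  };
}
```

Rounding to whole percentage points matches how the range is quoted on the homepage (24 %–39 %).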

Source inputs

Headline calculation (31.9 % AI share)

The displayed percentage anchors on the Octoverse acceptance rate (35 %). We apply a single coverage discount to account for uneven AI uptake across the public GitHub graph:

Calculation: 35 % × 0.91 ≈ 31.9 %. We scale this ratio to illustrative absolute counts in data/estimate.json (11.2 billion AI-attributed lines versus 23.9 billion human-attributed lines, totalling 35.1 billion lines).

Repository data files

Limitations

Contributing

Suggestions for better signals or data sources are welcome. Open an issue or PR in the GitHub repository linked above.