Issue I · Code Uniformity · Q2 2026 · Methodology v0.1.0 · 2026-05-10 · Yash Datta · saucam

How AI authorship
reshapes code structure.

Forty-eight public open-source repositories, five languages, three strata of AI-authorship intensity. We attribute every line of every function to AI or human authorship via git blame against Claude-tagged commits, score each function’s structural similarity to its peers via semble, and ask whether the resulting AI-vs-human uniformity gap holds across languages, across AI authorship intensity, and across function size.¹

¹ Methodology pre-registered in full at v0.1.0 before any sampling. Six hypotheses (H1–H6). We report results regardless of direction: three confirmed, two counter-prediction, one partially confirmed.


Repositories sampled: 48 (47 produced full data)

Languages: 5 (Python · TypeScript · JavaScript · Go · Rust)

Functions analyzed: 12,254 (across 47 repos at locked SHAs)

Pre-registered hypotheses: 6 (3 confirmed · 2 counter · 1 partial)

What we measured

For each public function in each repo, we compute its semantic similarity to the other functions in the same repo (via semble’s hybrid retrieval — Model2Vec embeddings + BM25 + reciprocal rank fusion), its cohesion-vs-coupling profile, and its cyclomatic complexity (Python only, via radon). Every line is attributed to AI or human authorship through git blame against commits carrying Co-Authored-By: Claude or the Claude Code footer. Functions with ≥70% AI lines are classified as AI-authored; those with ≤10% AI lines as human-authored. Per-repo metrics aggregate across all functions; per-language and per-bucket metrics aggregate across the 48-repo sample.
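The classification rule above can be sketched in a few lines. This is a minimal illustration, not the package's code: the function name, the constants, and the "mixed" label for the band between the two thresholds are assumptions introduced here, and the per-line counts are assumed to come from a prior git blame pass.

```python
from typing import Literal

AI_THRESHOLD = 0.70     # >= 70% AI lines -> AI-authored (threshold from the text)
HUMAN_THRESHOLD = 0.10  # <= 10% AI lines -> human-authored (threshold from the text)

Label = Literal["ai", "human", "mixed"]


def classify_function(ai_lines: int, total_lines: int) -> Label:
    """Label one function from its per-line blame counts."""
    if total_lines <= 0:
        raise ValueError("function must have at least one line")
    share = ai_lines / total_lines
    if share >= AI_THRESHOLD:
        return "ai"
    if share <= HUMAN_THRESHOLD:
        return "human"
    # Hypothetical label: the text does not name the band between 10% and 70%.
    return "mixed"
```

For example, `classify_function(7, 10)` lands exactly on the 70% boundary and is labeled AI-authored under this reading of the thresholds.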

Finding 01 · The bucket reversal

AI code is the outlier in human codebases.
AI code is the norm in AI codebases.

Same code, opposite role. The AI-vs-human uniformity gap reverses sign across the AI authorship range, and it rises with AI share in a moderately linear way (Pearson r = +0.58).

Figure 1 · Bucket reversal · n = 45 repos
Pearson r = +0.579 between AI authorship ratio and AI-vs-human uniformity gap

By AI authorship bucket

Bucket	Repos	Mean AI gap	Interpretation
Low (<30% AI authored)	18	-0.00229	AI is the outlier
Mid (30 – 70%)	15	+0.00080	Roughly equal
High (>70%)	15	+0.00283	AI is the norm

The gap is small in absolute terms (a few thousandths at most, on a similarity scale that runs roughly 0–0.05), but its sign tracks the AI authorship bucket and it correlates with AI share. In the eighteen repositories where humans wrote most of the code, AI-authored functions read structurally differently from the surrounding code. In the fifteen repositories where AI dominated, AI functions cluster together and human contributions become the structural minority. The mid-bucket sits near zero. Pearson r between AI ratio and gap is +0.58 across the forty-five repositories with sufficient samples in both groups.

Finding 02 · Per-language divergence

AI converges in JavaScript.
AI varies in TypeScript.

The sign reversal between TypeScript (−0.00254, humans more uniform within-repo) and JavaScript (+0.00383, AI more uniform) is the largest cross-language signal in the run. TypeScript is the only language whose mean AI gap is negative, and JavaScript has the largest positive gap. Go and Rust sit close to zero; Python (+0.00178) falls between them and JavaScript.

Figure 2 · Per-language AI-vs-human uniformity gap · n = 48 repos

By primary language

Language	Repos	Mean AI gap	Uniformity index
javascript	7	+0.00383	0.01355
python	8	+0.00178	0.01274
rust	10	+0.00028	0.01330
go	12	+0.00013	0.01493
typescript	11	-0.00254	0.01506

One plausible explanation, which this benchmark does not test directly: TypeScript’s type system forces explicit structural choices, so humans converge on type-driven idioms while AI uses the remaining structural freedom and varies. JavaScript has weaker conventions, so AI imposes its own pattern rigidly. Go and Rust both have strong cultural conventions (gofmt, rustfmt, idiomatic style guides) that flatten the AI-vs-human distinction to noise.

Finding 03 · DRY density · counter-prediction

AI does not generate
more duplicate code than humans.

We expected DRY-cluster density — the number of similar-looking function pairs above the per-language similarity threshold, normalized by function count — to climb with AI authorship. The intuition: AI generates fourteen copies of pagination, twenty subtly different validation helpers, sprawling near-duplicates that humans would have factored out.

The data does not show that. The high-AI bucket has the lowest density (7.37 pairs per function, against 8.24 in the low bucket). The trend is not monotonic, since the mid bucket is highest at 8.36, so the inverse relationship is slight at best. AI-generated functions cluster differently from human functions, but they do not cluster more. The “AI generates fourteen versions of pagination” concern is not visible at this benchmark’s scale.

Bucket	DRY pairs / fn
low	8.24
mid	8.36
high	7.37
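DRY-cluster density, as defined above, is a count of similar pairs divided by function count. The sketch below shows one way to compute it from a precomputed pairwise similarity matrix. The function name, the matrix values, and the 0.6 threshold are all hypothetical; the report's per-language thresholds are not given in this section.

```python
from itertools import combinations


def dry_pairs_per_function(similarity: list[list[float]], threshold: float) -> float:
    """Unordered function pairs above the threshold, divided by function count."""
    n = len(similarity)
    if n == 0:
        return 0.0
    pairs = sum(
        1 for i, j in combinations(range(n), 2) if similarity[i][j] > threshold
    )
    return pairs / n


# Hypothetical symmetric similarity matrix for a 4-function repo.
sim = [
    [1.0, 0.9, 0.2, 0.8],
    [0.9, 1.0, 0.1, 0.7],
    [0.2, 0.1, 1.0, 0.3],
    [0.8, 0.7, 0.3, 1.0],
]
density = dry_pairs_per_function(sim, threshold=0.6)  # 3 pairs / 4 functions
```

Normalizing by function count rather than by the number of possible pairs means the figure can exceed 1, which is consistent with the 7 to 8 pairs per function reported in the table.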

Finding 04 · The isolated tail · counter-prediction

The rare hand-coded edges
aren’t rarer in AI’s share.

We expected that in AI-dominated codebases the most-isolated functions (the rare hand-coded utilities, the one-off infrastructure pieces) would be disproportionately human. Instead, in 10 of the 15 high-AI repos, the bottom-10% most-isolated functions had AI authorship within ±10% of the repo’s overall AI rate.

15 high-AI repos · isolation surplus signal

3 · human surplus

10 · neutral

2 · AI surplus

The “humans hand-craft the rare edges” mental model is wrong on average, though three repos do show it clearly — bmad-module-skill-forge is 81% AI overall but only 54% AI in the isolated tail. AI writes uniform code; AI also writes the rare-shaped code. Mean AI surplus across the 15 high-AI repos is +0.002 — indistinguishable from zero.
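The surplus signal can be illustrated with a short sketch. It assumes the surplus is the AI share in the isolated tail minus the repo-wide AI share, and that "neutral" means within a ±0.10 band; both readings are inferred from the text rather than confirmed by it, and the function names are hypothetical.

```python
def isolation_surplus(tail_ai_share: float, repo_ai_share: float) -> float:
    """AI share in the most-isolated tail minus the repo-wide AI share (assumed sign)."""
    return tail_ai_share - repo_ai_share


def surplus_signal(surplus: float, band: float = 0.10) -> str:
    """Assumed rule: inside +/- band is neutral, otherwise name the side in surplus."""
    if surplus > band:
        return "AI surplus"
    if surplus < -band:
        return "human surplus"
    return "neutral"


# bmad-module-skill-forge, figures from the text: 81% AI overall, 54% AI in the tail.
bmad = isolation_surplus(tail_ai_share=0.54, repo_ai_share=0.81)
bmad_signal = surplus_signal(bmad)
```

Under these assumptions bmad-module-skill-forge comes out 27 points below its repo-wide AI rate, well outside the neutral band, while the reported mean surplus of +0.002 falls inside it.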

Finding 05 · Complexity by function size · partial

AI Python is simpler at most sizes —
except the 21–50-line band.

For Python functions across five repos with both AI and human samples (cyclomatic complexity via radon), AI is measurably simpler in three of four line-count bins. The 21–50-line band — which contains most “real function” sizes in production code — goes the other way. AI is 11% more complex per function there.

Figure 5 · Mean cyclomatic complexity by function size · Python only · 5 repos with both AI and human samples
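The size-matched comparison amounts to binning functions by line count and averaging complexity per author group within each bin. The sketch below uses assumed bin edges (only the 21–50 band is named in the text) and invented records; in practice the complexity values would come from radon.

```python
from collections import defaultdict
from statistics import mean

# Assumed bin edges: only the 21-50 band is named in the report.
BINS = [(1, 10), (11, 20), (21, 50), (51, None)]


def size_bin(lines: int) -> str:
    """Map a function's line count to its bin label."""
    for low, high in BINS:
        if lines >= low and (high is None or lines <= high):
            return f"{low}-{high}" if high is not None else f"{low}+"
    raise ValueError("line count must be >= 1")


def mean_cc_by_bin(functions):
    """functions: iterable of (author, line_count, cyclomatic_complexity)."""
    grouped = defaultdict(list)
    for author, lines, cc in functions:
        grouped[(size_bin(lines), author)].append(cc)
    return {key: mean(values) for key, values in grouped.items()}


# Hypothetical records, not the published data.
sample = [
    ("ai", 8, 2), ("human", 9, 3),
    ("ai", 30, 10), ("ai", 40, 10), ("human", 35, 8),
]
table = mean_cc_by_bin(sample)
ratio_21_50 = table[("21-50", "ai")] / table[("21-50", "human")]
```

A ratio above 1 in a bin means AI functions are more complex than human functions of similar size, which is the direction the report finds in the 21–50-line band.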

Pre-registered tests

What we predicted.
What we found.

Six hypotheses locked in the methodology document at v0.1.0 before any sampling. We report the result regardless of direction.

Pre-registered hypotheses · v0.1.0 · all 6 · confirmed 3 · counter 2 · partial 1

ID	Prediction	Result	Magnitude
H1	AI more uniform than human within repo	✓ confirmed	+0.00034 mean (CI95 [−0.0013, +0.0020], interval includes zero)
H2	Gap correlates with repo AI ratio	✓ confirmed	Pearson r = 0.579
H3	DRY density higher in AI-heavy repos	✗ counter-prediction	high = 7.37, low = 8.24
H4	Isolated functions disproportionately human	✗ counter-prediction	mean surplus +0.002, 3 / 15 repos show human surplus
H5	AI lower CC than human at matched size	≈ partial	3 / 4 line-count bins; 21–50-line band reverses
H6	AI uniformity gap varies across languages	✓ confirmed	TS −0.0025 ↔ JS +0.0038 (sign reversal; TypeScript is the only one of the five languages with a negative mean gap)
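The H1 row reports a 95% interval around the mean gap. The report does not say here how that interval was computed, so the following is only one common approach, a percentile bootstrap over per-repo gaps, with invented input values and a hypothetical function name.

```python
import random
from statistics import mean


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of per-repo gaps."""
    rng = random.Random(seed)  # fixed seed so reruns give the same interval
    n = len(values)
    # Resample repos with replacement and record each resample's mean.
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical per-repo gaps, not the published data.
gaps = [-0.0030, -0.0015, -0.0004, 0.0002, 0.0008, 0.0019, 0.0031]
lower, upper = bootstrap_ci(gaps)
spans_zero = lower < 0 < upper
```

Checking whether the interval spans zero is a quick way to separate a directional result from one that is distinguishable from no effect.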

Reproduction

Same code path
we ran.

Every number in this report came from the public agent-uniformity package. Install it, point it at any of the 48 sampled repos at their locked SHAs, and you should match within ~1% (semble’s BM25 has small non-determinism).


Reproduce any number · MIT · Python 3.11 / 3.12

# install the public reference implementation
$ pip install agent-uniformity

# verify any single published number
$ agent-uniformity run-one davila7-claude-code-templates --output ./out

# rerun the full sample sequentially (~6-8 hr on a laptop)
$ agent-uniformity run-all --output ./out
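Because reruns are expected to match only within about 1%, a relative-tolerance comparison is a more suitable check than exact equality. The helper below is a hypothetical sketch and not part of the agent-uniformity package; it compares two numbers you have already extracted, since the layout of the output directory is not described here.

```python
import math


def matches_published(reproduced: float, published: float, rel_tol: float = 0.01) -> bool:
    """True when a rerun lands within ~1% of the published figure."""
    return math.isclose(reproduced, published, rel_tol=rel_tol)


# Published Pearson r for H2 is 0.579; the reproduced value here is hypothetical.
ok = matches_published(reproduced=0.581, published=0.579)
```

For values very close to zero, such as the per-language gaps, a relative tolerance becomes strict in absolute terms, so an additional `abs_tol` argument to `math.isclose` may be worth considering.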

Cite

Datta, Y. (saucam). (2026). Code Uniformity Q2 2026 — How AI authorship
reshapes the structure of public open-source code. Agent Almanac.
https://github.com/saucam/agent-uniformity-q2-2026

Run metadata

run id: 2026-05-09T15-02-36Z-4931
run date: 2026-05-09
methodology: v0.1.0
published: 2026-05-10
byline: Yash Datta · saucam
