Why GitHub's contributions API lies to you about private repos (and how we fixed it)
GraphQL's commitContributionsByRepository returns nothing for private repos even with the right scope. We worked around it.
The setup
We wanted to show people a per-repo breakdown of where their commits landed this year. GitHub has a GraphQL field that does exactly that:
user(login: $login) {
  contributionsCollection(from: $from, to: $to) {
    commitContributionsByRepository(maxRepositories: 10) {
      repository { nameWithOwner }
      contributions { totalCount }
    }
  }
}

This works great for public repos. It returns nothing for private ones. Not less, not partial, nothing.
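To make the symptom concrete, here is a minimal Python sketch of running that query and flattening the result. It assumes the standard https://api.github.com/graphql endpoint and a token sent as a bearer header. The names run_query and repo_breakdown are made up for this post, and the typed query wrapper around the fragment is our addition, not something the fragment above spells out.

```python
import json
import urllib.request

GITHUB_GRAPHQL_URL = "https://api.github.com/graphql"

# The fragment from the post, wrapped in a full query with typed variables
# (the wrapper is an assumption for this sketch).
QUERY = """
query($login: String!, $from: DateTime!, $to: DateTime!) {
  user(login: $login) {
    contributionsCollection(from: $from, to: $to) {
      commitContributionsByRepository(maxRepositories: 10) {
        repository { nameWithOwner }
        contributions { totalCount }
      }
    }
  }
}
"""


def run_query(token, query, variables):
    """POST a GraphQL query and return the decoded JSON body."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    request = urllib.request.Request(
        GITHUB_GRAPHQL_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def repo_breakdown(payload):
    """Flatten the response into (nameWithOwner, commit count) pairs."""
    collection = payload["data"]["user"]["contributionsCollection"]
    return [
        (entry["repository"]["nameWithOwner"], entry["contributions"]["totalCount"])
        for entry in collection["commitContributionsByRepository"]
    ]
```

With a user whose activity is all in private repos, repo_breakdown is where the problem shows up: the list comes back empty instead of raising an error.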
The maddening part
We assumed it was a token scope problem. We made a personal access token with the full repo scope. The field still returned an empty array. We checked the token actually had the scope. It did. We checked the viewer matched the user being queried. It did. We checked the contributions calendar at the same time and saw 158 contributions for the year, all of them reported under restrictedContributionsCount.
GitHub knew there was activity. It just refused to break it down.
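Those checks are easy to script. The sketch below is a rough version of two of them, under a few assumptions: that the token is a classic personal access token, whose scopes GitHub echoes back in the X-OAuth-Scopes response header, and that the query also selects restrictedContributionsCount next to commitContributionsByRepository. The helper names parse_scopes and looks_restricted are hypothetical.

```python
def parse_scopes(header_value):
    """Split an X-OAuth-Scopes header value such as 'repo, read:user' into a set."""
    if not header_value:
        return set()
    return {scope.strip() for scope in header_value.split(",") if scope.strip()}


def looks_restricted(collection):
    """Return True when activity is counted as restricted but the breakdown is empty.

    `collection` is the contributionsCollection object from the response.
    """
    restricted = collection.get("restrictedContributionsCount", 0)
    breakdown = collection.get("commitContributionsByRepository", [])
    return restricted > 0 and not breakdown


# Sample shaped like what we saw: activity counted, nothing broken down.
sample = {"restrictedContributionsCount": 158, "commitContributionsByRepository": []}
has_repo_scope = "repo" in parse_scopes("repo, read:user")
hit_the_wall = looks_restricted(sample)
```

If has_repo_scope and hit_the_wall are both true, the token is not the problem, which is exactly the situation we were in.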
The workaround
We stopped using contributionsCollection for the per-repo data and started querying repositories directly:
viewer {
  repositories(
    first: 100
    affiliations: [OWNER, COLLABORATOR, ORGANIZATION_MEMBER]
    orderBy: { field: PUSHED_AT, direction: DESC }
    isFork: false
  ) {
    nodes { name owner { login } pushedAt }
  }
}

Then for each repo with activity in the target year, we run a follow-up query:
repository(owner: $owner, name: $name) {
  defaultBranchRef {
    target {
      ... on Commit {
        history(first: 100, since: $from, until: $to, author: { id: $userId }) {
          nodes { messageHeadline messageBody }
        }
      }
    }
  }
}

This returns real commits with real messages from real private repos.
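One caveat with the query as written: first: 100 caps each repo at 100 commits. Below is a sketch of how paging through the rest might look. It assumes a variant of the query that adds an after: $cursor argument and a pageInfo selection, both our additions for illustration. fetch is a caller-supplied function (hypothetical) that takes the query and variables and returns the decoded JSON, so the loop does not depend on any particular HTTP client.

```python
# Variant of the history query with cursor pagination added (an assumption
# for this sketch; the query in the post has no pageInfo or cursor).
HISTORY_QUERY = """
query($owner: String!, $name: String!, $userId: ID!,
      $from: GitTimestamp!, $to: GitTimestamp!, $cursor: String) {
  repository(owner: $owner, name: $name) {
    defaultBranchRef {
      target {
        ... on Commit {
          history(first: 100, after: $cursor, since: $from, until: $to,
                  author: { id: $userId }) {
            pageInfo { hasNextPage endCursor }
            nodes { messageHeadline messageBody }
          }
        }
      }
    }
  }
}
"""


def collect_commits(fetch, variables):
    """Walk every page of commit history and return all commit nodes.

    fetch(query, variables) must return the decoded JSON response.
    """
    commits = []
    cursor = None
    while True:
        payload = fetch(HISTORY_QUERY, {**variables, "cursor": cursor})
        repository = payload["data"].get("repository") or {}
        ref = repository.get("defaultBranchRef")
        if ref is None:  # inaccessible or empty repository: nothing to walk
            return commits
        history = ref["target"]["history"]
        commits.extend(history["nodes"])
        if not history["pageInfo"]["hasNextPage"]:
            return commits
        cursor = history["pageInfo"]["endCursor"]
```

Passing fetch in as a parameter also makes the loop easy to exercise with canned responses before pointing it at the real API.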
Why this matters
If you ever try to build a tool on top of the GitHub contributions API, do not trust commitContributionsByRepository to include private data. Query the repositories list and walk the commit history yourself. It takes more requests and runs slower, but the data is actually there.
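To keep the extra requests down, it helps to narrow the repositories list before walking any history. The sketch below is one way to read "activity in the target year", and it is an assumption on our part: use pushedAt as a coarse filter, on the reasoning that a repo last pushed before the window opened has nothing new in it. Repos pushed after the window still need the follow-up query, since the date range on history does the real filtering. candidate_repos and parse_github_time are hypothetical names.

```python
from datetime import datetime, timezone


def parse_github_time(value):
    """Parse a UTC timestamp such as '2024-03-01T12:00:00Z'."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def candidate_repos(nodes, window_start):
    """Return (owner, name) pairs for repos pushed at or after window_start.

    Assumes `nodes` is ordered by pushedAt, newest first, as in the
    repositories query with orderBy PUSHED_AT DESC.
    """
    selected = []
    for node in nodes:
        pushed_at = node.get("pushedAt")
        if pushed_at is None:  # never pushed to, so no commits to find
            continue
        if parse_github_time(pushed_at) < window_start:
            break  # everything after this is older still
        selected.append((node["owner"]["login"], node["name"]))
    return selected
```

Because the list is sorted newest first, the loop can stop at the first repo that is too old instead of scanning all 100 nodes.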