GitHub imposes rate limits on their REST API; these limits can be onerous for higher-volume users, since some of our audits (particularly `impostor-commit` and `ref-confusion`) require a lot of individual paginated requests to fetch the needed repository state.
To reduce this, we could switch to (or offer as an option) a model where we `git clone` and collect history locally instead, since GitHub has much higher rate limits on clones.
Pros:
- Fewer rate limit issues.
- Potentially much, much faster overall (since local Git object/history scanning will be a lot faster than our current network roundtrips)
Cons:
- A bit more complicated (but not much more)
- Probably slightly slower on each audit cycle (i.e. per `uses:` clause), since we'd need to `git clone` and pull down more data initially. This will be cached and amortized by the faster filtering (above), but it'll probably make these audits a little less responsive.
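For a rough sense of what the local model could look like, here's a minimal Python sketch. The function names and the reachability heuristic are illustrative assumptions, not a proposed design: it builds a blobless-clone command and then asks the local clone whether a pinned SHA is contained in any branch or tag, which is the kind of question the affected audits currently answer via paginated API requests.

```python
import subprocess

def clone_cmd(repo_url: str, dest: str, blobless: bool = True) -> list[str]:
    """Build the argv for a history-focused clone (hypothetical helper)."""
    cmd = ["git", "clone", "--quiet"]
    if blobless:
        # --filter=blob:none fetches commits and trees but defers blob
        # downloads; refs and history are all these checks need.
        cmd.append("--filter=blob:none")
    cmd += [repo_url, dest]
    return cmd

def sha_reachable_from_refs(clone_dir: str, sha: str) -> bool:
    """True if `sha` is contained in some branch or tag of the clone.

    A pinned SHA that no official ref contains is the impostor-commit
    signal this issue wants to compute locally instead of via the API.
    """
    for subcmd in (["branch", "-a"], ["tag"]):
        res = subprocess.run(
            ["git", "-C", clone_dir, *subcmd, "--contains", sha],
            capture_output=True,
            text=True,
        )
        if res.returncode == 0 and res.stdout.strip():
            return True
    return False
```

One clone per repository then serves every `uses:` clause that points at it, which is where the amortization mentioned above would come from.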
Related thoughts:
- Maybe a blobless clone will be faster? I'm not sure if this induces a tradeoff on GitHub's side or not, versus the hot path for a normal clone.
- This is also somewhat related to Design a static HTTP API for serving pre-computed information #278, which seeks to solve the same problem by deploying our own static API.
(h/t @andrew)