Every scanning tool has the same problem: it's very good at generating findings, and not particularly interested in what happens to them afterwards. You scan a repo, you get a list of issues, you file them somewhere, and a month later that list is longer and the oldest items have been quietly ignored.

I wanted to close that loop. Not by building an automated system that makes changes without oversight — that's a different kind of problem — but by automating the mechanical steps between "we know about this" and "there's a PR ready for review." The pipeline I've been calling AutoDev does that work.

The gap between scanning and fixing

When I mapped out the lifecycle of a finding, the expensive part wasn't the scan. It was everything that happened after: triaging the finding, deciding who owns it, creating a branch, writing the fix, verifying it, opening a PR, shepherding the review, and closing the finding out.

For a single critical finding, that process is fine — it warrants human attention at every stage. But for the long tail of medium-severity, mechanical fixes (unpinned dependency versions, missing resource tags, hardcoded strings that should be config, out-of-date patterns), the overhead kills momentum. Either the team processes them slowly, or they don't process them at all.

AutoDev targets that long tail. High-severity, complex findings still go to humans directly. The mechanical ones go through the pipeline.
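That routing rule can be sketched in a few lines. The severity threshold and the set of "mechanical" categories here are illustrative assumptions, not the real configuration; the category names come from the scanner's own taxonomy described later.

```python
# Hypothetical router: decide whether a finding goes to the pipeline
# or straight to a human. Thresholds and category set are assumptions.
MECHANICAL_CATEGORIES = {"dependency", "governance", "iac", "code_quality"}

def route(severity: str, category: str) -> str:
    """Return 'pipeline' for mechanical medium-or-lower findings, else 'human'."""
    if severity in ("critical", "high"):
        return "human"           # complex or dangerous: always human-first
    if category in MECHANICAL_CATEGORIES:
        return "pipeline"        # long-tail mechanical fix
    return "human"               # e.g. 'security' findings need reasoning
```

The point of keeping the rule this dumb is that it fails safe: anything not explicitly mechanical defaults to a human.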

The pipeline in five stages

1. Ingestion

A finding arrives — from a scanner, a governance check, a dependency audit, whatever. It gets normalised into a standard shape: what repository, what file, what line, what rule was violated, what severity, what suggested fix. That last field is the key one — without a suggested fix, there's nothing for the pipeline to work with.
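A minimal sketch of that normalisation step, assuming an invented raw-scanner format on the input side (the field names on the output side mirror the Finding contract shown later):

```python
# Sketch: normalise raw scanner output into the standard shape.
# The raw input keys ('repo', 'fix_hint', ...) are invented for
# illustration; real scanners each have their own format.
def normalise(raw: dict) -> dict:
    finding = {
        "repository":  raw["repo"],
        "file_path":   raw.get("file"),
        "line_number": raw.get("line"),
        "rule":        raw["rule_id"],
        "severity":    raw["severity"].lower(),
        "remediation": raw.get("fix_hint", ""),
    }
    if not finding["remediation"]:
        # No suggested fix means nothing for the pipeline to work with.
        raise ValueError("finding has no suggested fix; route to a human")
    return finding
```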

2. Proposal generation

The pipeline reads the finding and the relevant file content, then generates a concrete fix proposal. Not a description of what should change — an actual diff, with an explanation of what was wrong and why the proposed change fixes it. This is where an LLM does useful work: translating "dependency X is unpinned" into "here's the exact version to pin it to, and here's the line to change."
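To make "an actual diff" concrete, here's a minimal sketch using the standard library's difflib. In practice the fixed content would come from the LLM step; it's hard-coded here purely for illustration.

```python
import difflib

# The unpinned-dependency example from above: the proposal is a real
# unified diff, not a prose description of the change.
original = ["requests\n"]            # unpinned dependency
fixed    = ["requests==2.31.0\n"]    # concrete version to pin (illustrative)

diff = "".join(difflib.unified_diff(
    original, fixed,
    fromfile="a/requirements.txt",
    tofile="b/requirements.txt",
))
print(diff)
```

A reviewer (and later the apply step) gets the exact line to change, with nothing left to interpret.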

The proposal is stored and surfaced for human review before anything touches the repository. Approving a proposal means: "yes, this is the right fix — proceed." Rejecting it means: "no, route this differently" or "I'll handle this manually."

The proposal gate is non-negotiable. Automated changes to a codebase without human review at the proposal stage aren't a pipeline; they're a liability. The value of the pipeline is that it prepares the work; humans still own the decisions.

3. Branch and change

Once a proposal is approved, the pipeline creates a branch, applies the diff, and runs the project's own verification steps — linting, type-checking, whatever the repo has configured. If verification fails, the branch is flagged and a human is notified. The pipeline doesn't try to iterate on a failing fix; that's a sign the proposed change was too mechanical and needs rethinking.

4. Pull request

If verification passes, a PR is opened. The PR description is generated from the proposal — it explains the original finding, what was changed, and references the governance rule that flagged it. Reviewers get context, not just a diff.
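Generating that description is just templating over the proposal. The field names and template wording below are assumptions about the proposal's shape, not the real implementation:

```python
# Sketch: build the PR body from the approved proposal. All field
# names ('title', 'rule', 'explanation', ...) are illustrative.
PR_TEMPLATE = """\
## Automated fix: {title}

**Finding:** {description}
**Rule:** {rule}  |  **Severity:** {severity}

### What changed
{explanation}

_This PR was generated by the AutoDev pipeline. Review accordingly._
"""

def pr_body(proposal: dict) -> str:
    return PR_TEMPLATE.format(**proposal)
```

The footer line is the machine-generated label discussed below: baked into the template so no PR ships without it.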

The PR is explicitly labelled as machine-generated. I think transparency here matters: reviewers should know they're looking at automated output so they apply appropriate scrutiny rather than assuming a human thought carefully about every line.

5. Close the finding

When the PR merges, the original finding is marked resolved. The scan that next runs on that repo should not surface the same finding. If it does, something went wrong with the fix and the system raises it again — this time with "previously attempted" context so a human knows the history.
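The "previously attempted" logic is a lookup keyed on the finding's identity. A sketch, with an in-memory dict standing in for whatever store the real system uses and a placeholder PR reference:

```python
# Sketch of closing the loop: record resolutions on merge, and if the
# same finding key reappears on a later scan, re-raise it with history
# attached so a human sees that an automated fix was already tried.
resolved: dict[tuple, str] = {}

def on_merge(key: tuple, pr_ref: str) -> None:
    resolved[key] = pr_ref

def on_rescan(key: tuple) -> dict:
    if key in resolved:
        return {"key": key, "previously_attempted": True,
                "prior_fix": resolved[key]}
    return {"key": key, "previously_attempted": False}
```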

What this doesn't solve

AutoDev is useful for a specific class of findings: ones where the correct fix is deterministic enough that an LLM can generate it reliably. That's a narrower category than it sounds. Security vulnerabilities, architectural flaws, logic errors, test coverage gaps — these need human reasoning. The pipeline won't help you there, and claiming otherwise would be dishonest.

What it does help with: unpinned dependency versions, missing resource tags, hardcoded strings that should be config, out-of-date patterns, lint failures with mechanical fixes — the findings where the correct change is unambiguous once the finding is understood.

For larger codebases, this class of finding is a significant proportion of the backlog. Automating it clears space for the team to focus on the findings that actually require thinking.

It was tested: the ACME sandbox

I want to be concrete about this because "I built a pipeline" without any evidence of it running is exactly the kind of vaporware I try to avoid on this site.

The AutoDev pipeline was tested against purpose-built sandbox repositories — deliberately constructed with governance violations, unpinned dependencies, missing tags, and lint failures. The pipeline scanned them, generated proposals, opened branches, applied fixes, and raised PRs. The ACME Inventory System and ACME Widgets CRM were both built as autonomous dev sandboxes — known bad state, controlled environment, real pipeline execution against a real GitHub organisation.

Not everything worked first time. The proposal generation step was over-confident early on — it would generate fixes for findings it didn't fully understand, producing changes that were syntactically correct but semantically wrong. The verification step (run linting and type-checking before opening the PR) caught most of these. The ones it didn't catch were caught in PR review. That's the right failure mode: noisy PRs, not merged bad code.

The upstream contract: the Finding schema

AutoDev works best when scanning is governed — when findings have consistent severity ratings, standard rule identifiers, and machine-readable remediation hints. If your scanner outputs free-text descriptions with no structure, the pipeline has nothing to work with.

This is the actual Finding dataclass from the scanner (api/models.py). It's the contract between the scanner and AutoDev: the core fields are mandatory, and the remediation field carries a machine-actionable hint rather than a free-text blob:

# api/models.py
from dataclasses import dataclass
from typing import Optional

@dataclass
class Finding:
    """Individual scan finding — PK=SCAN#{id} SK=FINDING#{index}."""

    index:       int
    category:    str  # dependency | security | code_quality | iac | governance
                      # + deep: secret | sast | license | quality
    severity:    str  # critical | high | medium | low | info
    title:       str
    description: str
    remediation: str  # concrete, actionable — this is what AutoDev works from
    file_path:   Optional[str] = None
    # Deep scan additions
    analysis_layer: Optional[str] = None  # dependency|secret|sast|iac|license|quality
    line_number:    Optional[int] = None  # exact line for SAST and secret findings

The remediation field is the key one. A finding without a concrete remediation hint ("pin to version 2.31.0" vs "dependency is outdated") gives the pipeline nothing to work with. The scanner is the upstream contract; AutoDev is a consumer of it.

Deduplication is also part of the contract — the scanner runs _deduplicate(findings) before storing, keyed on (file_path, category, title), keeping the highest severity when the same issue appears in multiple layers. AutoDev never sees duplicate findings.
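The deduplication rule described above can be sketched directly, using plain dicts in place of the Finding dataclass for brevity:

```python
# Sketch of the dedup contract: key on (file_path, category, title),
# keep the highest severity when multiple analysis layers report the
# same issue. Severity ranks follow the scanner's five levels.
SEVERITY_RANK = {"critical": 4, "high": 3, "medium": 2, "low": 1, "info": 0}

def deduplicate(findings: list[dict]) -> list[dict]:
    best: dict[tuple, dict] = {}
    for f in findings:
        key = (f.get("file_path"), f["category"], f["title"])
        kept = best.get(key)
        if kept is None or SEVERITY_RANK[f["severity"]] > SEVERITY_RANK[kept["severity"]]:
            best[key] = f
    return list(best.values())
```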

The full cycle — scan, propose, branch, PR, merge, re-scan — is what I mean when I talk about a closed loop. Not lights-out automation, but a pipeline where the mechanical steps don't pile up in a backlog waiting for human time that never comes.
