Two of the more elusive patterns in the Gulli taxonomy are Reflection (pattern 4) and Learning and Adaptation (pattern 9). They sound like advanced concepts that require sophisticated infrastructure. In practice, both appear in every non-trivial agentic workflow in a much simpler form than the names suggest.
Reflection is just a second pass. An agent generates output, then reviews it against a set of criteria before returning it. Learning is just structured memory: saving what worked, what failed, and why, so the next run starts from a better position. Neither requires retraining a model. Both require discipline about when and how to check your own work.
Pattern 4: Reflection
The canonical definition: an agent reviews its own output and revises it before returning a result. This is distinct from a second agent reviewing the first agent's output (that would be multi-agent coordination). Reflection is a loop within a single agent's execution.
The fixer-bot implements this directly. When it generates a code change, it runs the change through a static analysis check before writing it to a file. If the check finds issues, the bot revises the change and re-checks. The output that reaches the file system has already passed its own review.
This is qualitatively different from just having a linter in the CI pipeline. The linter catches issues after the fact. Reflection catches issues before the output leaves the agent, which means the developer sees a finished result rather than a result with known problems attached.
Implementing reflection in practice
The minimal implementation is a check function that the agent calls on its own output before returning it. The check can be rule-based (does this output contain a hardcoded secret?), model-based (ask the same model to review the output), or both.
def generate_with_reflection(issue: Issue, max_iterations: int = 3) -> CodeChange:
    """Generate a code change, reflect on it, revise if needed."""
    change = _generate_change(issue)
    for i in range(max_iterations):
        review = _review_change(change, issue)
        if review.passes:
            logger.info("change passed review on iteration %d", i + 1)
            return change
        logger.info("revision needed: %s", review.feedback)
        change = _revise_change(change, review.feedback)
    # After max iterations, return best effort with review attached
    logger.warning("max iterations reached, returning with review notes")
    return change
def _review_change(change: CodeChange, issue: Issue) -> ReviewResult:
    """Check the change against quality criteria."""
    violations = []
    if _contains_hardcoded_secret(change.diff):
        violations.append("hardcoded credential detected")
    if not _matches_issue_scope(change, issue):
        violations.append("change exceeds issue scope")
    if _breaks_type_signatures(change.diff):
        violations.append("type signature mismatch")
    return ReviewResult(passes=len(violations) == 0, feedback=violations)
The iteration cap matters. Without it, a reflection loop can spin indefinitely on a change that genuinely cannot satisfy the review criteria. Three iterations is a reasonable default: enough to catch accidental errors, not enough to mask a fundamentally ill-specified task.
Pattern 9: Learning and Adaptation
Learning in the Gulli taxonomy means an agent improves its behaviour based on feedback and observed outcomes. This does not require model fine-tuning. It requires a memory system that accumulates structured artefacts: what worked, what failed, and the context that explains the difference.
The roo-context MCP is the learning substrate for this platform. Every session produces notes, decisions, and file change records. The next session loads these artefacts and starts from a richer context than a clean slate. The model does not change, but the effective knowledge available to it does.
What roo-context actually stores
Four artefact types accumulate across sessions on this platform:
- Session summaries -- what was built, what decisions were made, cost incurred. Loaded at the start of each new session to restore context.
- Notes -- tagged, categorised observations: patterns, gotchas, API behaviours, infrastructure facts. Searchable by category and keyword.
- Decisions -- architectural choices with context, rationale, rejected alternatives, and affected files. A decision log that explains why the system looks the way it does.
- File history -- which files were created or modified in which session, with a one-line description of each change.
At the start of a session, get_project_context() loads all four types in a single call. The agent does not start with a blank slate; it starts with everything previous sessions chose to preserve.
# Session start pattern
session = start_session(branch="main", project="ticketyboo")
context = get_project_context(
    project="ticketyboo",
    sessions=3,    # last 3 session summaries
    notes=10,      # most recent 10 notes
    decisions=5    # most recent 5 architecture decisions
)
# context now contains: what was built, what was learned, what was decided
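The write side of the loop is the mirror image: at session end, the agent persists what it learned. The shape of that store can be sketched with a minimal in-memory stand-in; the class and method names below are illustrative, not the roo-context API:

```python
from dataclasses import dataclass, field

@dataclass
class ProjectMemory:
    """Toy in-memory stand-in for the roo-context artefact store."""
    notes: list = field(default_factory=list)
    decisions: list = field(default_factory=list)

    def save_note(self, category: str, text: str) -> None:
        self.notes.append({"category": category, "text": text})

    def save_decision(self, title: str, rationale: str, rejected: list) -> None:
        # Rejected alternatives travel with the decision, so the
        # "why not X?" question is answerable in a later session.
        self.decisions.append(
            {"title": title, "rationale": rationale, "rejected": rejected}
        )

    def get_project_context(self, notes: int = 10, decisions: int = 5) -> dict:
        # Return the most recent artefacts of each type.
        return {"notes": self.notes[-notes:], "decisions": self.decisions[-decisions:]}
```

The point of the sketch is the round trip: anything saved in one session is retrievable, with its context intact, in the next.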
The difference between note and decision
Notes are observations. Decisions are commitments. A note might say: "SSM Parameter Store SecureString is free; Secrets Manager costs $0.40/secret/month." A decision says: "We use SSM Parameter Store instead of Secrets Manager for new secrets. Rationale: cost. Alternatives rejected: Secrets Manager, hardcoded environment variables."
The distinction matters because decisions carry rejected alternatives. That context prevents the same debate happening twice. When a new session asks "why not Secrets Manager?", the decision record has the answer.
The sprint retrospective as a learning artefact
The most direct implementation of learning and adaptation on this platform is the sprint plan itself. Each sprint plan is written using knowledge from previous sprints. The docs/plans/ directory is a series of learning artefacts in chronological order: what was planned, what was built, what the next sprint should focus on.
This is not an accident. It is pattern 9 expressed at the project level rather than the session level. The agent (Roo) improves its planning and execution because each plan reflects lessons from the previous one. The mechanism is explicit: session summaries are saved to roo-context, retrieved at the start of the next session, and used to write a better plan.
Where both patterns appear in ticketyboo
The fixer-bot's code review loop is the clearest reflection implementation: generate, check, revise, output. The Gatekeep governance model adds a layer on top: before any output reaches production, it passes through declarative rules that act as a structured review. That is reflection at the deployment level.
The roo-context MCP is the learning substrate. Notes, decisions, and session histories accumulate across every coding session. The sprint plans are the learning output: a series of documents that record what was built and what to do next.
The data-draft pattern is a human-reviewed learning gate. Every new article or tool is marked data-draft="true" and does not appear on the index until a human has reviewed it and removed the flag. That review is a form of supervised reflection: the agent builds, the human reflects, the learning is captured in the publishing decision.
Useful? The scanner finds the same patterns in your repositories.
ticketyboo runs five governance agents on every pull request: Security, Cost, SRE, CTO, and Dependency. Evidence signed, audit trail complete.