Two of the more elusive patterns in the Gulli taxonomy are Reflection (pattern 4) and Learning and Adaptation (pattern 9). They sound like advanced concepts that require sophisticated infrastructure. In practice, both appear in every non-trivial agentic workflow in a much simpler form than the names suggest.
Reflection is just a second pass. An agent generates output, then reviews it against a set of criteria before returning it. Learning is just structured memory: saving what worked, what failed, and why, so the next run starts from a better position. Neither requires retraining a model. Both require discipline about when and how to check your own work.
Pattern 4: Reflection
The canonical definition: an agent reviews its own output and revises it before returning a result. This is distinct from a second agent reviewing the first agent's output (that would be multi-agent coordination). Reflection is a loop within a single agent's execution.
The fixer-bot implements this directly. When it generates a code change, it runs the change through a static analysis check before writing it to a file. If the check finds issues, the bot revises the change and re-checks. The output that reaches the file system has already passed its own review.
This is qualitatively different from just having a linter in the CI pipeline. The linter catches issues after the fact. Reflection catches issues before the output leaves the agent, which means the developer sees a finished result rather than a result with known problems attached.
Implementing reflection in practice
The minimal implementation is a check function that the agent calls on its own output before returning it. The check can be rule-based (does this output contain a hardcoded secret?), model-based (ask the same model to review the output), or both.
def generate_with_reflection(issue: Issue, max_iterations: int = 3) -> CodeChange:
    """Generate a code change, reflect on it, revise if needed."""
    change = _generate_change(issue)
    for i in range(max_iterations):
        review = _review_change(change, issue)
        if review.passes:
            logger.info("change passed review on iteration %d", i + 1)
            return change
        logger.info("revision needed: %s", review.feedback)
        change = _revise_change(change, review.feedback)
    # After max iterations, return best effort with review attached
    logger.warning("max iterations reached, returning with review notes")
    return change
def _review_change(change: CodeChange, issue: Issue) -> ReviewResult:
    """Check the change against quality criteria."""
    violations = []
    if _contains_hardcoded_secret(change.diff):
        violations.append("hardcoded credential detected")
    if not _matches_issue_scope(change, issue):
        violations.append("change exceeds issue scope")
    if _breaks_type_signatures(change.diff):
        violations.append("type signature mismatch")
    return ReviewResult(passes=len(violations) == 0, feedback=violations)
The iteration cap matters. Without it, a reflection loop can spin indefinitely on a change that genuinely cannot satisfy the review criteria. Three iterations is a reasonable default: enough to catch accidental errors, not enough to mask a fundamentally ill-specified task.
Pattern 9: Learning and Adaptation
Learning in the Gulli taxonomy means an agent improves its behaviour based on feedback and observed outcomes. This does not require model fine-tuning. It requires a memory system that accumulates structured artefacts: what worked, what failed, and the context that explains the difference.
The roo-context MCP is the learning substrate for this platform. Every session produces notes, decisions, and file change records. The next session loads these artefacts and starts from a richer context than a clean slate. The model does not change, but the effective knowledge available to it does.
What roo-context actually stores
Four artefact types accumulate across sessions on this platform:
- Session summaries -- what was built, what decisions were made, cost incurred. Loaded at the start of each new session to restore context.
- Notes -- tagged, categorised observations: patterns, gotchas, API behaviours, infrastructure facts. Searchable by category and keyword.
- Decisions -- architectural choices with context, rationale, rejected alternatives, and affected files. A decision log that explains why the system looks the way it does.
- File history -- which files were created or modified in which session, with a one-line description of each change.
At the start of a session, get_project_context() loads all four types in a single call. The agent does not start with a blank slate; it starts with everything previous sessions chose to preserve.
# Session start pattern
session = start_session(branch="main", project="ticketyboo")
context = get_project_context(
    project="ticketyboo",
    sessions=3,    # last 3 session summaries
    notes=10,      # most recent 10 notes
    decisions=5    # most recent 5 architecture decisions
)
# context now contains: what was built, what was learned, what was decided
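The write side of the loop is the mirror image: at session end, the agent persists what it learned. The shape of that store can be sketched with a minimal in-memory stand-in; the class and method names below are illustrative, not the roo-context API:

```python
from dataclasses import dataclass, field

@dataclass
class ProjectMemory:
    """Toy in-memory stand-in for the roo-context artefact store."""
    notes: list = field(default_factory=list)
    decisions: list = field(default_factory=list)

    def save_note(self, category: str, text: str) -> None:
        self.notes.append({"category": category, "text": text})

    def save_decision(self, title: str, rationale: str, rejected: list) -> None:
        # Rejected alternatives travel with the decision, so the
        # "why not X?" question is answerable in a later session.
        self.decisions.append(
            {"title": title, "rationale": rationale, "rejected": rejected}
        )

    def get_project_context(self, notes: int = 10, decisions: int = 5) -> dict:
        # Return the most recent artefacts of each type.
        return {"notes": self.notes[-notes:], "decisions": self.decisions[-decisions:]}
```

The point of the sketch is the round trip: anything saved in one session is retrievable, with its context intact, in the next.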
The difference between note and decision
Notes are observations. Decisions are commitments. A note might say: "SSM Parameter Store SecureString is free; Secrets Manager costs $0.40/secret/month." A decision says: "We use SSM Parameter Store instead of Secrets Manager for new secrets. Rationale: cost. Alternatives rejected: Secrets Manager, hardcoded environment variables."
The distinction matters because decisions carry rejected alternatives. That context prevents the same debate happening twice. When a new session asks "why not Secrets Manager?", the decision record has the answer.
The sprint retrospective as a learning artefact
The most direct implementation of learning and adaptation on this platform is the sprint plan itself. Each sprint plan is written using knowledge from previous sprints. The docs/plans/ directory is a series of learning artefacts in chronological order: what was planned, what was built, what the next sprint should focus on.
This is not an accident. It is pattern 9 expressed at the project level rather than the session level. The agent (Roo) improves its planning and execution because each plan reflects lessons from the previous one. The mechanism is explicit: session summaries are saved to roo-context, retrieved at the start of the next session, and used to write a better plan.
Where both patterns appear in ticketyboo
The fixer-bot's code review loop is the clearest reflection implementation: generate, check, revise, output. The Gatekeep governance model adds a layer on top: before any output reaches production, it passes through declarative rules that act as a structured review. That is reflection at the deployment level.
The roo-context MCP is the learning substrate. Notes, decisions, and session histories accumulate across every coding session. The sprint plans are the learning output: a series of documents that record what was built and what to do next.
The data-draft pattern is a human-reviewed learning gate. Every new article or tool is marked data-draft="true" and does not appear on the index until a human has reviewed it and removed the flag. That review is a form of supervised reflection: the agent builds, the human reflects, the learning is captured in the publishing decision.
Useful? The scanner finds the same patterns in your repositories.
ticketyboo runs five governance agents on every pull request: Security, Cost, SRE, CTO, and Dependency. Evidence signed, audit trail complete.