I used /goals command wrong... Here are all tips & mistakes

By AI Jason

Key Concepts

  • Goal Feature: A persistent, iterative execution mode for AI coding agents (such as Codex and Hermes) that lets them work on complex, long-running tasks without terminating prematurely.
  • Agentic Loop: A mechanism where an LLM evaluates its own output against a defined "goal" to decide whether to continue working or terminate the session.
  • Quantifiable Definition of Done (DoD): The requirement for specific, measurable criteria to prevent the AI from becoming "lazy" or declaring victory too early.
  • Alignment/Interview Phase: The practice of having a preliminary conversation with the agent to establish context, constraints, and project goals before execution.
  • Mission-Based Execution: An experimental framework for long-term, multi-week goals that involve hypothesis testing, scheduled intervals, and human-in-the-loop intervention.

1. The Problem: Premature Termination and "Lazy" Agents

Standard AI coding agents often struggle with complex, multi-hour projects. They frequently exhibit "laziness," where they perform a small portion of a task and incorrectly report that the entire job is complete.

  • The "Ralph Loop" Predecessor: Early attempts to solve this used programmatic while loops (e.g., the "Ralph Loop"), which simply re-triggered the agent until a maximum iteration count was reached.
  • The Evolution: The new "Goal" feature replaces simple programmatic loops with an LLM-based judge. After each step, the model evaluates whether the goal is satisfied. If it is not, the judge sends feedback telling the agent to continue working.
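
To make the contrast concrete, here is a minimal sketch of the fixed-iteration loop described above. It is an illustration only: run_agent, stub_agent, and max_iterations are hypothetical names, and the stub stands in for a real agent call. Note that nothing in the loop checks whether the work is actually finished.

```python
from typing import Callable, List


def fixed_iteration_loop(
    run_agent: Callable[[str], str],
    prompt: str,
    max_iterations: int = 5,
) -> List[str]:
    """Re-trigger the agent with the same prompt until the cap is reached.

    There is no check on the quality of the output: the loop stops only
    because the iteration count runs out.
    """
    outputs = []
    for _ in range(max_iterations):
        outputs.append(run_agent(prompt))
    return outputs


def stub_agent(prompt: str) -> str:
    # Hypothetical stand-in for a real agent call.
    return f"worked on: {prompt}"


outputs = fixed_iteration_loop(stub_agent, "fix the failing tests", max_iterations=3)
print(len(outputs))
```

The weakness is visible in the signature: the only stopping condition is max_iterations, so the loop can end with the task half done or keep running after it is complete.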

2. How the Goal Feature Works

The feature functions as a continuous feedback loop:

  1. Goal Definition: The user provides a prompt defining the objective, constraints, and the "definition of done."
  2. Execution & Evaluation: The agent performs a task. An LLM call then reviews the output against the goal.
  3. Feedback Loop: If the goal is not met, the agent receives a prompt: "Continuing toward your standing goal... Take the next concrete steps."
  4. Self-Correction: Unlike simple loops, the agent is instructed to explicitly state when it believes the goal is achieved, preventing it from stopping prematurely.
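
The four steps above can be sketched as a small loop. This is a hedged illustration, not the actual implementation of any tool: goal_loop, Verdict, the stub agent, and the stub judge are all hypothetical, and a real judge would be an LLM call rather than a string check. Only the continuation prompt text is taken from the summary.

```python
from dataclasses import dataclass
from typing import Callable, List

# Continuation prompt quoted in the summary above.
CONTINUE_PROMPT = "Continuing toward your standing goal... Take the next concrete steps."


@dataclass
class Verdict:
    satisfied: bool
    feedback: str


def goal_loop(
    goal: str,
    run_agent: Callable[[str], str],
    judge: Callable[[str, str], Verdict],
    max_turns: int = 20,
) -> List[str]:
    """Run the agent until the judge says the goal is met (or a safety cap hits)."""
    transcript = []
    prompt = goal
    for _ in range(max_turns):
        output = run_agent(prompt)
        transcript.append(output)
        verdict = judge(goal, output)
        if verdict.satisfied:
            break
        # Goal not met: feed the judge's feedback back to the agent.
        prompt = f"{CONTINUE_PROMPT}\n{verdict.feedback}"
    return transcript


def make_stub_agent(steps_needed: int) -> Callable[[str], str]:
    # Hypothetical agent that claims completion after a fixed number of turns.
    calls = {"n": 0}

    def run_agent(prompt: str) -> str:
        calls["n"] += 1
        if calls["n"] >= steps_needed:
            return f"step {calls['n']}: GOAL ACHIEVED"
        return f"step {calls['n']}: partial progress"

    return run_agent


def stub_judge(goal: str, output: str) -> Verdict:
    # Hypothetical judge: a real one would ask an LLM to compare output and goal.
    done = "GOAL ACHIEVED" in output
    return Verdict(done, "" if done else "Goal not yet met; keep going.")


transcript = goal_loop("make all tests pass", make_stub_agent(3), stub_judge)
print(len(transcript))
```

The max_turns cap is an added safety assumption so the sketch cannot loop forever if the judge never accepts the output.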

3. Best Practices for Implementation

To ensure success, users must move away from vague prompts toward structured, verifiable objectives:

  • The "Interview" Phase: Before starting, conduct an alignment conversation. Discuss project context, anti-patterns, user expectations, and what "bad" results look like.
  • Quantifiable Goals: Avoid open-ended instructions like "fix all bugs." Instead, use specific targets: "Find 20 discrete issues, produce a reproduction for each, and push a fix to a branch."
  • Tooling (Go-Body): Use tools like go-body to generate a goal.md file and a state.yaml file. This creates a persistent record of the mission, constraints, and progress, which the agent references in every iteration.
  • Verification: Integrate automated testing (e.g., Playwright) into the goal. The agent should be tasked with verifying its own output against reference screens or test suites.
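
One way to picture a quantifiable definition of done is as a check that a script can evaluate, using the "20 discrete issues" example above. The sketch below is an assumption-laden illustration: Issue, definition_of_done, and the field names are hypothetical and do not reflect the format of goal.md or state.yaml.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Issue:
    title: str
    has_reproduction: bool
    fix_branch: Optional[str]  # None until a fix has been pushed


def definition_of_done(issues: List[Issue], target: int = 20) -> Tuple[bool, str]:
    """Return (done, progress message) for the 'find N issues' style of goal."""
    # An issue counts only when it has both a reproduction and a pushed fix.
    complete = [i for i in issues if i.has_reproduction and i.fix_branch]
    remaining = target - len(complete)
    if remaining > 0:
        return False, (
            f"{len(complete)}/{target} issues complete; "
            f"{remaining} still need a reproduction and a pushed fix."
        )
    return True, f"{len(complete)}/{target} issues complete."


# Hypothetical progress snapshot: 19 finished issues and 1 without a fix.
issues = [Issue(f"issue {n}", True, f"fix/issue-{n}") for n in range(1, 20)]
issues.append(Issue("issue 20", True, None))

done, message = definition_of_done(issues)
print(done, message)
```

Because the check returns a concrete count, the agent cannot satisfy it by declaring success early; the progress message can also be fed back as feedback for the next iteration.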

4. Technical Workflow & Commands

  • Enabling: Use codex features list to view experimental features and codex features enable [feature_name] to activate one.
  • Execution: Use the /goal command followed by a detailed prompt or a path to a goal.md file.
  • Monitoring: Run /goal again during execution to check status, token usage, and duration.
  • Interruption: Use /goal pause or /goal clear to stop the agent. Use /side to fork the conversation if you need to ask questions without disrupting the main goal loop.

5. Limitations and Future Directions: "Missions"

The current "Goal" feature is optimized for hours-long coding tasks but struggles with long-term, non-verifiable objectives (e.g., "improve SEO over three months").

  • The "Mission" Framework: This is an experimental approach for long-running goals that span multiple days or weeks.
    • Hypothesis-Driven: The agent forms a strategy, executes a step, and saves the result as an artifact.
    • Scheduled Intervals: Instead of a continuous loop, the agent schedules the next run for hours or days later.
    • Human-in-the-Loop: If the agent encounters ambiguity or needs to make a high-stakes decision, it pauses to request human input.
  • Real-World Application: This has been tested for social media growth (e.g., growing a Twitter following), where the agent analyzes performance metrics, adjusts its "voice," and iterates on content strategy over time.
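
The three mission behaviours above (saving results as artifacts, scheduling the next run, and pausing for a human) can be sketched as a single step function. Everything here is hypothetical and for illustration only: Mission, run_mission_step, and the field names are assumptions, not the framework's actual design.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Mission:
    objective: str
    artifacts: List[dict] = field(default_factory=list)
    next_run: Optional[datetime] = None
    waiting_on_human: Optional[str] = None


def run_mission_step(
    mission: Mission,
    hypothesis: str,
    result: str,
    now: datetime,
    interval: timedelta,
    needs_human: Optional[str] = None,
) -> Mission:
    """Record one hypothesis/result pair, then schedule or pause the mission."""
    # Hypothesis-driven: every step is saved as an artifact for later review.
    mission.artifacts.append(
        {"at": now.isoformat(), "hypothesis": hypothesis, "result": result}
    )
    if needs_human:
        # Human-in-the-loop: stop scheduling until someone answers.
        mission.waiting_on_human = needs_human
        mission.next_run = None
    else:
        # Scheduled interval: run again later instead of looping continuously.
        mission.waiting_on_human = None
        mission.next_run = now + interval
    return mission


start = datetime(2025, 1, 1, 9, 0)
mission = Mission("grow the account's following")
run_mission_step(mission, "shorter posts get more replies", "replies up", start, timedelta(days=1))
print(mission.next_run)

run_mission_step(
    mission,
    "paid promotion would help",
    "untested",
    start + timedelta(days=1),
    timedelta(days=1),
    needs_human="Approve a paid promotion budget?",
)
print(mission.waiting_on_human)
```

The difference from the goal loop is that control returns to a scheduler or a person after every step, which suits objectives whose results only become measurable after hours or days.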

Conclusion

The transition from simple, one-off prompts to persistent "Goal" and "Mission" frameworks represents a shift toward autonomous agentic workflows. By enforcing strict definitions of "done," running alignment interviews, and integrating automated verification, developers can move from simple code generation to managing complex engineering and growth projects that run for hours or even weeks.
