diff --git a/.config/opencode/skills/.skill-builder.disabled/SKILL.md b/.config/opencode/skills/.skill-builder.disabled/SKILL.md new file mode 100644 index 0000000..bd71c49 --- /dev/null +++ b/.config/opencode/skills/.skill-builder.disabled/SKILL.md @@ -0,0 +1,1014 @@ +--- +name: skill-builder +description: This skill should be used when the user asks to "create a new skill", "build a skill", "make a skill for", "scaffold a skill", "validate my skill", "check skill structure", or needs help setting up opencode skills with proper structure, YAML frontmatter, and validation. +license: MIT +compatibility: opencode +metadata: + category: development + version: "1.0.0" + author: user +--- + +# Skill Builder + +Rapidly scaffold, validate, and create well-structured opencode skills. This skill provides scripts and templates to ensure skills follow best practices and pass strict validation. + +## Quick Start + +Create a new skill: + +```bash +# Basic skill +~/.config/opencode/skills/skill-builder/scripts/init-skill.sh my-skill + +# Full skill with all resources +~/.config/opencode/skills/skill-builder/scripts/init-skill.sh my-skill --full + +# Local skill in current project +~/.config/opencode/skills/skill-builder/scripts/init-skill.sh my-skill --local --full +``` + +Validate a skill: + +```bash +~/.config/opencode/skills/skill-builder/scripts/validate-skill.sh ~/.config/opencode/skills/my-skill +``` + +## Skill Creation Workflow + +Follow this process for consistent, high-quality skills: + +### Step 1: Plan the Skill + +Before creating, answer these questions: + +#### Core Intent Questions + +1. **What does it do?** - Core functionality in 1-2 sentences +2. **When is it used?** - Specific trigger phrases users will say +3. **What resources are needed?** - Scripts, references, or assets +4. **What output format is expected?** - How should the model respond? +5. **Should we set up test cases?** - For objectively verifiable outputs (file transforms, data extraction, code generation) + +#### Interview and Research + +Proactively ask questions about edge cases, input/output formats, example files, success criteria, and dependencies: + +- What are common edge cases or error scenarios? +- What input formats will users provide? +- What output format does the user expect? +- Are there example files or data I can reference? +- What constitutes success for this skill? +- Are there any dependencies or prerequisites? + +### Step 2: Scaffold the Skill + +Use the init script with appropriate flags: + +```bash +# Simple knowledge skill (no resources) +init-skill.sh docker-commands + +# Skill with scripts for automation +init-skill.sh docker-helper --with-scripts + +# Skill with detailed documentation +init-skill.sh docker-guide --with-refs + +# Complete skill with all resources +init-skill.sh docker-toolkit --full +``` + +### Step 3: Edit SKILL.md + +Fill in the critical fields: + +**Frontmatter (REQUIRED):** +- `name`: Must match directory name, lowercase with hyphens +- `description`: 20-1024 chars, include trigger phrases in quotes + +**Body (REQUIRED):** +- Brief description of purpose +- Common tasks with examples +- Important notes or caveats + +### Step 4: Validate Strictly + +Always validate before use: + +```bash +validate-skill.sh ~/.config/opencode/skills/my-skill +``` + +This checks: +- ✓ Name matches directory +- ✓ Description is 20-1024 characters +- ✓ YAML frontmatter is valid +- ✓ No second-person writing +- ✓ Referenced files exist +- ✓ Scripts are executable + +### Step 5: Test and Iterate + +Test the skill by asking opencode to use it: + +``` +"Use my-skill to..." +"Help me with [skill trigger phrase]" +``` + +## Progressive Disclosure + +Organize content efficiently to minimize context usage: + +### What Goes in SKILL.md Body + +- Core purpose and overview +- Quick start examples +- Most common use cases +- Pointers to detailed resources + +**Keep under 500 lines, ideally 1,500-2,000 words** + +### What Goes in references/ + +- Detailed documentation +- Advanced techniques +- API references +- Troubleshooting guides +- Long examples + +**Loaded only when referenced** + +### What Goes in scripts/ + +- Executable utilities +- Validation tools +- Automation helpers +- Complex operations + +**Executed without loading into context** + +### What Goes in assets/ + +- Template files +- Configuration examples +- Boilerplate code +- Binary resources + +**Used in output, not loaded into context** + +## Testing and Evaluation + +To ensure your skill works correctly, create test cases and iterate based on results. + +### When to Create Test Cases + +Skills with objectively verifiable outputs benefit from test cases: +- File transforms (converting formats) +- Data extraction (parsing, filtering) +- Code generation (scaffolding, templates) +- Fixed workflow steps (validation, checks) + +Skills with subjective outputs often don't need formal test cases: +- Writing style guidance +- Art or design skills +- General advice + +### Creating Test Prompts + +After writing the skill draft, create 2-3 realistic test prompts - the kind of thing a real user would actually say: + +1. **Common case:** The typical scenario +2. **Edge case:** Something tricky or unusual +3. **Varied phrasing:** Same intent, different words + +Example test prompts for a Docker helper skill: +``` +1. "Show me all running containers and their resource usage" +2. "My container keeps crashing, how do I debug it?" +3. "clean up all the stopped containers and unused images" +``` + +### The Iteration Loop + +``` +Draft → Test → Review → Improve → Repeat +``` + +1. **Draft the skill** - Initial SKILL.md with basic structure +2. **Test it** - Try the skill with realistic prompts +3. **Review outputs** - Check if outputs match expectations +4. **Improve** - Based on what didn't work well +5. **Repeat** - Until outputs are satisfactory + +**When to stop iterating:** +- The user says they're happy +- Test case outputs meet expectations +- You're not making meaningful progress + +### Generalizing from Feedback + +When improving the skill based on test results: + +1. **Don't overfit to test cases** - The skill should work for similar prompts, not just the exact test cases +2. **Explain the "why"** - Instead of rigid "ALWAYS" or "NEVER" rules, explain the reasoning so the model understands +3. **Look for patterns** - If multiple tests have similar issues, fix the root cause, not individual symptoms +4. **Keep it lean** - Remove instructions that aren't helping; add clarity where needed + +## Writing Patterns and Best Practices + +### Defining Output Formats + +Explicitly specify output formats when consistency matters: + +```markdown +## Output Format + +ALWAYS use this exact template: + +# [Title] +## Overview +## Steps +## Examples +## Notes +``` + +### Examples Pattern + +Include clear examples in skills. Format them consistently: + +```markdown +## Example: Task Name + +**Input:** User's request or input data +**Output:** Expected result or response + +**Example 1:** +Input: "Create a Python function to reverse a string" +Output: +```python +def reverse_string(s): + return s[::-1] +``` +``` + +### Description Writing + +The description determines when the skill triggers. Make it: + +- **Specific** - Include concrete trigger phrases +- **Comprehensive** - Cover different ways users might ask +- **"Pushy"** - Encourage use when relevant (combat undertriggering) + +```yaml +# ❌ Too vague +name: dashboard-helper +description: Helps create dashboards + +# ❌ Still vague +name: dashboard-helper +description: How to build dashboards to display data + +# ✅ Good - specific and "pushy" +name: dashboard-helper +description: How to build dashboards to display internal data. Make sure to use this skill whenever the user mentions dashboards, data visualization, internal metrics, or wants to display any kind of data, even if they don't explicitly ask for a "dashboard." +``` + +### Using Imperative Form + +Prefer imperative form in instructions: + +```markdown +# ❌ Don't use second person +"You should run this command first." +"You need to check the configuration." + +# ✅ Use imperative form +"Run this command first." +"Check the configuration for errors." +``` + +### Explaining the Why + +Explain reasoning instead of just dictating rules: + +```markdown +# ❌ Rigid rule without context +ALWAYS validate JSON before parsing. + +# ✅ Explained reasoning +Validate JSON before parsing. Unvalidated JSON can cause cryptic errors that are hard to debug. The validator catches syntax errors and provides helpful error messages with line numbers. +``` + +## Common Mistakes to Avoid + +### ❌ Bad: Vague Description + +```yaml +description: Helps with Docker tasks +``` + +### ✅ Good: Specific with Triggers + +```yaml +description: This skill should be used when the user asks to "create a docker container", "build a docker image", "manage docker compose", or work with Docker workflows +``` + +### ❌ Bad: Second Person in Body + +❌ **Don't write:** "You should run this command first." +❌ **Don't write:** "You need to check the logs." + +### ✅ Good: Imperative Form + +✅ **Do write:** "Run this command first." +✅ **Do write:** "Check the logs for errors." + +### ❌ Bad: Everything in SKILL.md + +```markdown +SKILL.md (8,000 words - all content) +``` + +### ✅ Good: Progressive Disclosure + +```markdown +SKILL.md (1,800 words - essentials) +references/ + advanced.md (2,500 words) + troubleshooting.md (1,200 words) +``` + +## Resources + +### Templates + +See `references/skill-templates.md` for copy-paste templates: +- Simple skill (knowledge only) +- Tool integration skill (with scripts) +- Workflow skill (multi-step) +- Documentation skill (heavy references) + +### Helper Scripts + +- `scripts/init-skill.sh` - Scaffold new skills +- `scripts/validate-skill.sh` - Strict validation + +## Troubleshooting + +### Validation Fails: Name Mismatch + +**Error:** `'name' (my-skill) does not match directory name (my_skill)` + +**Fix:** Ensure the `name` field in SKILL.md matches the directory name exactly, including hyphens vs underscores. + +### Validation Fails: Description Too Short + +**Error:** `'description' must be at least 20 characters` + +**Fix:** Add more detail and specific trigger phrases. Example: +- Bad: "Docker helper" +- Good: "This skill should be used when the user asks to 'create a docker container', 'build an image', or manage Docker workflows" + +### Skill Not Loading + +1. Check SKILL.md filename is ALL CAPS +2. Verify YAML frontmatter starts with `---` +3. Ensure `name` and `description` fields exist +4. Run validation to catch issues + +### Permission Denied on Scripts + +**Fix:** Make scripts executable: + +```bash +chmod +x ~/.config/opencode/skills/my-skill/scripts/*.sh +``` + +## Integration with Projects + +When using `--local` flag, the script: + +1. Creates skill in `.opencode/skills/` (project-local) +2. Checks for existing `AGENTS.md` in project root +3. Suggests creating one if missing + +**Benefits of local skills:** +- Version controlled with project +- Team-shared +- Project-specific workflows + +**Benefits of global skills:** +- Available across all projects +- Personal productivity tools +- Common utilities + +## Description Optimization + +The description field is the primary mechanism that determines whether a skill is used. After creating a skill, optimize the description for better triggering accuracy. + +### What Makes a Good Description + +1. **Comprehensive trigger coverage** + - Include 3-5 specific trigger phrases in quotes + - Cover different ways users might phrase the same request + - Include both formal and casual phrasing + +2. **"Pushy" encouragement** + - Models tend to "undertrigger" skills (not use them when they should) + - Encourage use with phrases like "Make sure to use this skill whenever..." + +3. **Clear scope definition** + - What the skill does + - When to use it (specific contexts) + - What it handles + +### Good vs Bad Descriptions + +```yaml +# ❌ Too vague - won't trigger reliably +description: Docker helper + +# ❌ Passive - doesn't encourage use +description: Helps with Docker container management and image building + +# ✅ Good - specific triggers and encouraging +description: This skill should be used when the user asks to "create a docker container", "build a docker image", "manage docker compose", or work with Docker workflows. Make sure to use this skill for any Docker-related tasks, container operations, or deployment configurations. +``` + +### Testing Description Effectiveness + +After creating a skill: + +1. Try various trigger phrases from the description +2. Note which ones activate the skill +3. If important phrases don't trigger, add them explicitly +4. Test edge cases that are similar but should NOT trigger the skill + +### Iterative Refinement + +As you test the skill: + +1. **Note missed opportunities** - When should the skill have triggered but didn't? +2. **Add those triggers** - Update the description with missing phrases +3. **Test again** - Verify the new triggers work +4. **Watch for overtriggering** - If it triggers when it shouldn't, clarify the scope + +## Validation Rules (Strict Mode) + +The validator enforces these rules: + +1. **Structure** + - SKILL.md must exist + - Must start with YAML frontmatter (`---`) + - Body content required after frontmatter + +2. **Frontmatter** + - `name`: lowercase alphanumeric with hyphens, matches directory + - `description`: 20-1024 characters + - No empty fields + +3. **Writing Style** + - No second person ("You should...") + - No "When to Use This Skill" section in body + - Description should include quoted trigger phrases + +4. **Resources** + - Referenced files must exist + - Scripts must be executable + - No README.md or CHANGELOG.md files + +5. **Warnings (Strict Mode = Errors)** + - Vague descriptions + - Missing trigger phrases + - Unreferenced resource files + - Unexecutable scripts + +## Best Practices + +✅ **DO:** +- Use specific trigger phrases in description +- Keep SKILL.md lean, move details to references/ +- Test scripts before including +- Validate after every change +- Use imperative form in body +- Include working code examples +- Create test cases for objectively verifiable skills +- Iterate based on test results +- Explain the "why" behind instructions (not just rigid rules) +- Generalize improvements from test feedback (don't overfit to specific test cases) +- Make descriptions "pushy" - encourage use when relevant +- Ask questions about edge cases and requirements before building + +❌ **DON'T:** +- Put everything in one file +- Use second person writing +- Create auxiliary documentation files +- Skip validation +- Leave placeholder content +- Forget to make scripts executable +- Skip testing and iteration +- Overfit skills to specific test examples + +## Test Case Structure + +Create test cases to verify your skill works correctly across different scenarios. + +### When to Create Test Cases + +Skills with objectively verifiable outputs benefit from formal test cases: +- File transforms (converting formats) +- Data extraction (parsing, filtering) +- Code generation (scaffolding, templates) +- Fixed workflow steps (validation, checks) + +Skills with subjective outputs often do not need formal test cases: +- Writing style guidance +- Art or design skills +- General advice and best practices + +### Test Case Format + +Store test cases in `evals/evals.json`: + +```json +{ + "skill_name": "my-skill", + "evals": [ + { + "id": 1, + "name": "common-case", + "prompt": "Realistic user request with context", + "expected_output": "Description of expected result", + "assertions": [ + "Output file exists at expected location", + "File contains specific content or pattern" + ] + } + ] +} +``` + +### Required Fields + +- **id**: Unique number for each test case +- **name**: Descriptive identifier (e.g., "common-case", "edge-case-with-large-file") +- **prompt**: The exact user request to test +- **expected_output**: Description of what should happen + +### Optional Fields + +- **assertions**: List of specific checks (text descriptions for manual verification) + +### Creating Test Cases + +Use the helper script to scaffold test cases: + +```bash +# Interactive test case creation +~/.config/opencode/skills/skill-builder/scripts/create-tests.sh ~/.config/opencode/skills/my-skill +``` + +Or create `evals/evals.json` manually following the template in `references/eval-templates.md`. + +### Required Test Case Types + +Every skill should have at least 3 test cases: + +1. **Common Case**: The typical, straightforward scenario +2. **Edge Case**: Something tricky, unusual, or error-prone +3. **Varied Phrasing**: Same intent as case 1, but expressed differently + +This ensures the skill works reliably across different ways users might ask for help. + + +## Writing Effective Test Prompts + +Good test prompts are realistic and include context that real users would provide. + +### Good Test Prompt Example + +``` +ok so my boss just sent me this xlsx file (its in my downloads, called something like 'Q4 sales final FINAL v2.xlsx') and she wants me to add a column that shows the profit margin as a percentage. The revenue is in column C and costs are in column D i think +``` + +**Why this is good:** +- Includes file path and realistic filename +- Mentions personal context ("my boss sent me") +- Has specific column references +- Uses casual, natural language +- Contains realistic uncertainty ("i think") + +### Bad Test Prompt Example + +``` +Format this data +``` + +**Why this is bad:** +- Too vague - no context about what "format" means +- No file path or data type mentioned +- Does not test real-world usage +- Too short to trigger complex skills + +### Test Prompt Guidelines + +**Do:** +- Include file paths when relevant (`~/Downloads/`, `./src/`) +- Add personal context (job role, situation, urgency) +- Use column names, field names, or specific values +- Mix formal and casual language +- Include realistic typos or abbreviations +- Vary the length (some short, some detailed) + +**Do not:** +- Use generic requests without specifics +- Make all prompts the same length +- Use only perfect grammar +- Test only obvious, clear-cut cases + +### Examples by Skill Type + +**File Transform Skill:** +- Good: "Convert the CSV at ./data/raw_export.csv to JSON and save it as ./output/clean_data.json. Make sure dates are in ISO format." +- Bad: "Convert CSV to JSON" + +**Code Generation Skill:** +- Good: "Create a Python script that recursively finds all .log files in /var/log, compresses them if they're older than 30 days, and moves them to /archive. Handle permission errors gracefully." +- Bad: "Write a Python script" + +**Workflow Skill:** +- Good: "I need to deploy my Node.js app to production. The repo is at ~/projects/myapp, it uses Docker, and I need to make sure the database migrations run before the new version starts. Also need to tag the release." +- Bad: "Deploy my app" + + +## The Testing Workflow + +Follow this iterative process to improve your skill: + +### The Iteration Loop + +``` +Draft → Test → Grade → Improve → Repeat +``` + +### Step 1: Draft the Skill + +Create the initial SKILL.md with: +- Clear description with trigger phrases +- Core instructions in the body +- Any needed scripts or references + +### Step 2: Create Test Cases + +Write 3 test prompts following the guidelines above: +1. Common case (typical usage) +2. Edge case (tricky scenario) +3. Varied phrasing (different words, same intent) + +Save them in `evals/evals.json`. + +### Step 3: Run Tests + +Execute the test workflow: + +```bash +# Run through all test cases interactively +~/.config/opencode/skills/skill-builder/scripts/run-tests.sh ~/.config/opencode/skills/my-skill +``` + +This will: +1. Display each test prompt +2. Ask you to test it manually in opencode +3. Record your evaluation (pass/fail/skip) +4. Capture notes about issues +5. Save results to `evals/test-results.json` + +### Step 4: Grade Outputs + +Evaluate the skill's performance: + +**Use the grading helper:** +```bash +~/.config/opencode/skills/skill-builder/scripts/grade-output.sh ~/.config/opencode/skills/my-skill +``` + +**Or grade manually by reviewing:** +- Did the output match expectations? +- Were there errors or omissions? +- Did the skill trigger appropriately? +- Was the output format correct? + +See `references/grading-guide.md` for detailed evaluation criteria. + +### Step 5: Improve the Skill + +Based on test results, update SKILL.md: + +**If specific issues found:** +- Add clarifying instructions +- Include examples for edge cases +- Fix incorrect guidance + +**If patterns emerge across tests:** +- Generalize the solution +- Add helper scripts for common tasks +- Explain the "why" behind instructions + +**General improvement principles:** +- Do not overfit to exact test wording +- Explain reasoning, not just rules +- Remove instructions that are not helping +- Add clarity where needed +- Extract repeated work into scripts/ + +### Step 6: Repeat + +Run the tests again with the improved skill: + +```bash +# Run tests again +~/.config/opencode/skills/skill-builder/scripts/run-tests.sh ~/.config/opencode/skills/my-skill +``` + +Compare results with previous runs and continue iterating. + +### When to Stop Iterating + +Stop when any of these are true: + +1. **User is satisfied** - The skill meets their needs +2. **Outputs meet expectations** - All test cases pass consistently +3. **No meaningful progress** - Iterations are not improving results + +Avoid endless tweaking. A skill that works well for 90% of cases is better than one still being perfected. + + +## Grading Guidelines + +Evaluate skill outputs systematically to identify improvements. + +### Evaluation Checklist + +For each test case, assess: + +**Correctness:** +- [ ] Output matches expected result +- [ ] No factual errors +- [ ] Logic is sound +- [ ] Edge cases handled appropriately + +**Completeness:** +- [ ] All requested tasks completed +- [ ] No steps skipped +- [ ] Appropriate level of detail +- [ ] Relevant context included + +**Format:** +- [ ] Output follows specified format (if any) +- [ ] Consistent with examples in skill +- [ ] Easy to read and understand + +**Triggering:** +- [ ] Skill activated when appropriate +- [ ] Did not activate when inappropriate + +**Efficiency:** +- [ ] No unnecessary steps +- [ ] Reasonable token usage +- [ ] Not overly verbose + +### Identifying Patterns + +Look for patterns across multiple test cases: + +**Single test failure:** +- Likely a specific edge case +- Add targeted instruction or example +- Fix the specific issue + +**Multiple test failures (same issue):** +- Indicates systemic problem +- Fix the root cause, not symptoms +- Consider adding a script to handle this + +**Multiple test failures (different issues):** +- Skill may be too broad or unclear +- Clarify scope in description +- Add more specific instructions + +### Decision Framework + +**Fix the specific case when:** +- Unique edge case not covered +- One-time issue unlikely to recur +- Fix is simple and does not add complexity + +**Generalize the solution when:** +- Same issue appears in 2+ tests +- Pattern suggests broader applicability +- Fix would benefit similar future requests + +**Extract to script when:** +- Same multi-step process repeated +- Deterministic operation (not creative) +- Would save time on every invocation + +### Grading Output Format + +When grading, record: + +```json +{ + "test_id": 1, + "result": "pass|fail|partial", + "issues": [ + "Description of issue 1", + "Description of issue 2" + ], + "suggested_fix": "Brief description of improvement", + "extract_script": false, + "priority": "high|medium|low" +} +``` + +Use the `grade-output.sh` script to generate this structure interactively. + + +## Manual Description Optimization + +Improve the skill description to trigger at the right times. + +### The Description's Role + +The description field determines when opencode loads your skill. It appears in the `skill` tool's available skills list. The agent decides whether to load your skill based on this description. + +### Testing Description Effectiveness + +**Step 1: Create Test Queries** + +Generate 8 test queries - 4 that should trigger the skill, 4 that should not: + +**Should-Trigger Queries (4):** +- Direct request using skill name +- Request using synonyms +- Casual phrasing +- Request needing skill but not naming it + +**Should-Not-Trigger Queries (4):** +- Adjacent domain (similar but different) +- Ambiguous phrasing +- Query that touches on skill but needs different tool +- Completely unrelated request + +**Example for a Docker skill:** + +Should trigger: +1. "Create a docker container for my Node.js app" +2. "Build an image from this Dockerfile" +3. "How do I compose up my services?" +4. "I need to deploy this using containers" + +Should not trigger: +1. "Install Docker on my machine" (installation vs usage) +2. "What is containerization?" (education vs hands-on) +3. "Show me Kubernetes commands" (different tool) +4. "Write a fibonacci function" (completely unrelated) + +**Step 2: Test Each Query** + +For each query: + +1. Start a fresh opencode session +2. Type the query +3. Note whether the skill appears in available skills +4. If it appears, note whether it was loaded +5. Record results + +**Step 3: Analyze Results** + +**Missing triggers (should trigger but does not):** +- Add those phrases to description +- Use "pushy" language: "Make sure to use this skill whenever..." +- Include synonyms and variations + +**False triggers (triggers when it should not):** +- Clarify scope with specific examples +- Add exclusions: "Use this for X, not for Y" +- Refine trigger phrases to be more specific + +**Step 4: Refine Description** + +Update the description field in SKILL.md frontmatter: + +```yaml +# ❌ Too vague +description: Helps with Docker tasks + +# ❌ Good but not pushy +description: Use for Docker container management and image building + +# ✅ Good - specific and encouraging +description: This skill should be used when the user asks to "create a docker container", "build a docker image", "manage docker compose", or work with Docker workflows. Make sure to use this skill for any Docker-related tasks, container operations, or deployment configurations. +``` + +**Step 5: Re-test** + +Run the 8 test queries again with the updated description. + +**Step 6: Iterate** + +Continue refining until: +- All should-trigger queries activate the skill +- All should-not-trigger queries do not activate it +- Description is still concise (under 1024 characters) + +### Description Optimization Tips + +**Be specific:** +- Include 3-5 concrete trigger phrases in quotes +- Use exact language users would type +- Mention file types or tools when relevant + +**Be comprehensive:** +- Cover different ways to ask +- Include both formal and casual phrasing +- Account for synonyms + +**Be "pushy":** +- Models tend to "undertrigger" skills (not use when they should) +- Use phrases like "Make sure to use this skill whenever..." +- Encourage use even if user does not explicitly ask + +**Examples of Good Descriptions:** + +```yaml +# File processing skill +description: This skill should be used when the user asks to "convert CSV to JSON", "transform this file", "parse this data", or work with file format conversions. Make sure to use this skill whenever the user mentions converting, transforming, or reformatting files, even if they do not specify the exact formats. + +# API documentation skill +description: This skill should be used when the user asks to "document an API", "create API docs", "write endpoint documentation", or needs help with REST API documentation. Use for OpenAPI specs, endpoint descriptions, request/response examples, and API reference guides. + +# Git workflow skill +description: This skill should be used when the user asks to "create a release", "publish a new version", "tag a release", or prepare software for deployment. Make sure to use this skill for any release-related tasks, version bumps, or deployment preparation. +``` + +### When to Optimize + +Optimize the description: +- After initial skill creation +- When users report skill not triggering +- When skill triggers inappropriately +- After significant skill updates + +Do not over-optimize - a description that works for 90% of cases is sufficient. + + +## Testing Scripts Reference + +### Create Test Cases + +```bash +# Interactive test case generation +~/.config/opencode/skills/skill-builder/scripts/create-tests.sh ~/.config/opencode/skills/my-skill +``` + +Generates `evals/evals.json` with 3 template test cases. + +### Run Tests + +```bash +# Execute test workflow +~/.config/opencode/skills/skill-builder/scripts/run-tests.sh ~/.config/opencode/skills/my-skill +``` + +Guides you through testing each case and saves results. + +### Grade Outputs + +```bash +# Interactive grading +~/.config/opencode/skills/skill-builder/scripts/grade-output.sh ~/.config/opencode/skills/my-skill +``` + +Provides structured evaluation checklist. + +### Quick Start with Tests + +When creating a new skill with tests: + +```bash +# Create skill with test infrastructure +~/.config/opencode/skills/skill-builder/scripts/init-skill.sh my-skill --full --with-tests + +# Generate test cases +~/.config/opencode/skills/skill-builder/scripts/create-tests.sh ~/.config/opencode/skills/my-skill + +# Edit evals/evals.json to customize prompts + +# Run tests +~/.config/opencode/skills/skill-builder/scripts/run-tests.sh ~/.config/opencode/skills/my-skill +``` + +See `references/eval-templates.md` for copy-paste test case templates. diff --git a/.config/opencode/skills/.skill-builder.disabled/references/eval-templates.md b/.config/opencode/skills/.skill-builder.disabled/references/eval-templates.md new file mode 100644 index 0000000..f46bc73 --- /dev/null +++ b/.config/opencode/skills/.skill-builder.disabled/references/eval-templates.md @@ -0,0 +1,615 @@ +# Test Case Templates + +Copy-paste templates for creating test cases based on skill type. + +## Table of Contents + +1. [File Transform Skills](#file-transform-skills) +2. [Code Generation Skills](#code-generation-skills) +3. [Workflow Skills](#workflow-skills) +4. [Tool Integration Skills](#tool-integration-skills) +5. [Documentation Skills](#documentation-skills) + +--- + +## File Transform Skills + +For skills that convert, parse, or reformat files. + +### Example: CSV to JSON Converter + +```json +{ + "skill_name": "csv-to-json", + "description": "Converts CSV files to JSON format with data type handling", + "evals": [ + { + "id": 1, + "name": "common-case", + "type": "common", + "prompt": "Convert the CSV file at ./data/sales.csv to JSON and save it as ./output/sales.json. The CSV has headers: date, product, quantity, price. Make sure dates are in ISO 8601 format.", + "expected_output": "A JSON file at ./output/sales.json containing an array of objects, each representing a row from the CSV with properly formatted dates.", + "assertions": [ + "JSON file exists at ./output/sales.json", + "File contains valid JSON array", + "All dates are in ISO 8601 format (YYYY-MM-DD)", + "Numeric fields (quantity, price) are numbers, not strings" + ] + }, + { + "id": 2, + "name": "edge-case-large-file", + "type": "edge", + "prompt": "Convert a CSV file with 50,000 rows at ./data/export.csv to JSON. The file contains some rows with missing values in the 'email' column and special characters (emojis) in the 'notes' column.", + "expected_output": "JSON file is created successfully, handling missing values as null or empty strings, and preserving special characters correctly.", + "assertions": [ + "Large file is processed without memory errors", + "Missing email values are handled (null or empty string)", + "Special characters and emojis are preserved in output", + "All 50,000 rows are converted" + ] + }, + { + "id": 3, + "name": "varied-phrasing-casual", + "type": "variation", + "prompt": "Hey, I've got this spreadsheet data/export.csv that I need to turn into JSON. Can you do that for me? Also, the dates are in MM/DD/YYYY format right now - can you make them proper ISO format?", + "expected_output": "Same as common case: JSON file with ISO 8601 dates", + "assertions": [ + "Skill triggers with casual language ('turn into', 'proper format')", + "Implicit date formatting requirement is handled", + "Output matches common case results" + ] + } + ] +} +``` + +### Template: File Transform Skill + +```json +{ + "skill_name": "YOUR_SKILL_NAME", + "description": "[Describe what file transformations this skill does]", + "evals": [ + { + "id": 1, + "name": "common-case", + "type": "common", + "prompt": "[Realistic request with specific file paths and formats]", + "expected_output": "[What the transformed file should contain]", + "assertions": [ + "Output file exists at expected location", + "File is valid [format: JSON, XML, etc.]", + "Data is correctly transformed", + "Specific requirements met (encoding, formatting, etc.)" + ] + }, + { + "id": 2, + "name": "edge-case", + "type": "edge", + "prompt": "[Large files, special characters, missing data, malformed input]", + "expected_output": "[Graceful handling or error with helpful message]", + "assertions": [ + "Edge case is handled appropriately", + "No data loss or corruption", + "Error messages are helpful if transformation fails" + ] + }, + { + "id": 3, + "name": "varied-phrasing", + "type": "variation", + "prompt": "[Same request with casual language, different wording]", + "expected_output": "[Same as common case]", + "assertions": [ + "Skill triggers with varied phrasing", + "Output matches common case", + "Implicit requirements are understood" + ] + } + ] +} +``` + +--- + +## Code Generation Skills + +For skills that generate code, scripts, or templates. + +### Example: Python Script Generator + +```json +{ + "skill_name": "python-script-gen", + "description": "Generates Python scripts for file operations and data processing", + "evals": [ + { + "id": 1, + "name": "common-case", + "type": "common", + "prompt": "Create a Python script that recursively finds all .log files in /var/log, compresses them with gzip if they're older than 30 days, and moves them to /archive. Handle permission errors gracefully and log all actions to cleanup.log.", + "expected_output": "A Python script that implements the log cleanup functionality with proper error handling, logging, and follows Python best practices.", + "assertions": [ + "Script is syntactically valid Python", + "Implements recursive file search", + "Compresses files older than 30 days", + "Handles permission errors gracefully", + "Logs actions to cleanup.log" + ] + }, + { + "id": 2, + "name": "edge-case-empty-directory", + "type": "edge", + "prompt": "Create a Python script that processes all CSV files in ./data and generates a summary report. What should it do if the directory is empty or doesn't exist?", + "expected_output": "Script handles empty/non-existent directories gracefully with informative error messages and doesn't crash.", + "assertions": [ + "Script checks if directory exists before processing", + "Handles empty directory case gracefully", + "Provides informative error message", + "Returns appropriate exit code" + ] + }, + { + "id": 3, + "name": "varied-phrasing-brief", + "type": "variation", + "prompt": "Write me a python script to backup my photos. It should copy everything from ~/Pictures to ~/Backups/photos with today's date in the folder name. Skip duplicates if possible.", + "expected_output": "Python backup script with timestamped folder and duplicate detection", + "assertions": [ + "Skill works with brief, casual description", + "Generates complete script without requiring clarification", + "Handles date formatting for folder name", + "Implements duplicate detection" + ] + } + ] +} +``` + +### Template: Code Generation Skill + +```json +{ + "skill_name": "YOUR_SKILL_NAME", + "description": "[Describe what code this skill generates]", + "evals": [ + { + "id": 1, + "name": "common-case", + "type": "common", + "prompt": "[Detailed request with specific requirements and constraints]", + "expected_output": "[Description of the generated code]", + "assertions": [ + "Code is syntactically valid", + "Implements all requested features", + "Follows language best practices", + "Includes error handling where appropriate", + "Is well-structured and readable" + ] + }, + { + "id": 2, + "name": "edge-case", + "type": "edge", + "prompt": "[Ambiguous requirements, missing data, error conditions]", + "expected_output": "[Code handles edge cases or asks for clarification]", + "assertions": [ + "Edge cases are handled appropriately", + "Error conditions are managed", + "Code doesn't crash on unexpected input" + ] + }, + { + "id": 3, + "name": "varied-phrasing", + "type": "variation", + "prompt": "[Same request with minimal details or casual language]", + "expected_output": "[Same quality as common case]", + "assertions": [ + "Skill fills in reasonable defaults", + "Generates complete solution", + "Output quality matches detailed request" + ] + } + ] +} +``` + +--- + +## Workflow Skills + +For skills that guide multi-step processes. + +### Example: Release Workflow + +```json +{ + "skill_name": "release-workflow", + "description": "Guides the complete software release process", + "evals": [ + { + "id": 1, + "name": "common-case", + "type": "common", + "prompt": "I need to create a new release for my Node.js project. The repo is at ~/projects/myapp. We're currently at version 1.2.3 and this is a minor feature release (1.3.0). I need to update the version, create a changelog entry, commit, tag, and push to GitHub.", + "expected_output": "Step-by-step guide covering: version bump in package.json, CHANGELOG.md update, commit creation, annotated tag, and push commands. Should provide copy-pasteable commands.", + "assertions": [ + "All release steps are covered", + "Commands are copy-pasteable", + "Version numbers are consistent", + "Validation steps are included", + "Provides rollback guidance" + ] + }, + { + "id": 2, + "name": "edge-case-dirty-worktree", + "type": "edge", + "prompt": "Help me release version 2.0.0 of my project. By the way, I have some uncommitted changes in my working directory that I'm not sure about.", + "expected_output": "Workflow detects dirty worktree, suggests stashing or committing changes before proceeding with release. Provides commands to handle the situation.", + "assertions": [ + "Detects uncommitted changes", + "Warns about dirty worktree", + "Provides options: stash, commit, or abort", + "Doesn't proceed without addressing the issue" + ] + }, + { + "id": 3, + "name": "varied-phrasing-urgent", + "type": "variation", + "prompt": "Need to push out v2.1.0 ASAP. Hotfix for critical bug. What's the fastest way to get this released?", + "expected_output": "Accelerated workflow prioritizing speed while maintaining essential safety checks", + "assertions": [ + "Recognizes urgency from language ('ASAP', 'fastest')", + "Still includes critical safety checks", + "Prioritizes speed without skipping validation", + "Provides streamlined command sequence" + ] + } + ] +} +``` + +### Template: Workflow Skill + +```json +{ + "skill_name": "YOUR_SKILL_NAME", + "description": "[Describe what workflow this skill guides]", + "evals": [ + { + "id": 1, + "name": "common-case", + "type": "common", + "prompt": "[Standard workflow request with clear requirements]", + "expected_output": "[Complete step-by-step guide with all necessary steps]", + "assertions": [ + "All workflow steps are included", + "Steps are in logical order", + "Validation points are provided", + "Commands are copy-pasteable where applicable", + "Clear success criteria defined" + ] + }, + { + "id": 2, + "name": "edge-case", + "type": "edge", + "prompt": "[Workflow with complications, errors, or unusual state]", + "expected_output": "[Workflow detects issues and provides guidance]", + "assertions": [ + "Detects unusual states or errors", + "Provides recovery options", + "Doesn't proceed blindly", + "Offers rollback or alternative paths" + ] + }, + { + "id": 3, + "name": "varied-phrasing", + "type": "variation", + "prompt": "[Workflow request with urgency, casual language, or minimal details]", + "expected_output": "[Same workflow adapted to context]", + "assertions": [ + "Adapts to urgency level", + "Works with minimal context", + "Still provides complete guidance" + ] + } + ] +} +``` + +--- + +## Tool Integration Skills + +For skills that wrap command-line tools. + +### Example: Docker Helper + +```json +{ + "skill_name": "docker-helper", + "description": "Streamlines Docker container and image management", + "evals": [ + { + "id": 1, + "name": "common-case", + "type": "common", + "prompt": "I need to deploy my Node.js app using Docker. The Dockerfile is in ~/projects/myapp. Build an image tagged as myapp:v1.0, then run a container named 'myapp-prod' that maps port 3000 to the host. Make sure it restarts automatically if it crashes.", + "expected_output": "Docker commands to build the image and run the container with specified configuration, including restart policy.", + "assertions": [ + "Provides correct build command with tag", + "Provides correct run command with port mapping", + "Includes restart policy (--restart unless-stopped or always)", + "Sets container name correctly", + "Commands are copy-pasteable" + ] + }, + { + "id": 2, + "name": "edge-case-port-conflict", + "type": "edge", + "prompt": "Run my Docker container on port 3000, but I think something might already be using that port on my machine. How do I check and handle this?", + "expected_output": "Commands to check port usage, offer solutions (kill process, use different port, or map to different host port), and proceed accordingly.", + "assertions": [ + "Detects potential port conflict", + "Provides command to check port usage", + "Offers multiple solutions", + "Explains trade-offs of each option" + ] + }, + { + "id": 3, + "name": "varied-phrasing-cleanup", + "type": "variation", + "prompt": "Docker is taking up too much space. Clean up old stuff for me?", + "expected_output": "Commands to clean up stopped containers, unused images, and build cache", + "assertions": [ + "Understands implicit request from context", + "Provides safe cleanup commands", + "Warns about data loss where applicable", + "Shows space savings after cleanup" + ] + } + ] +} +``` + +### Template: Tool Integration Skill + +```json +{ + "skill_name": "YOUR_SKILL_NAME", + "description": "[Describe what tool this skill wraps]", + "evals": [ + { + "id": 1, + "name": "common-case", + "type": "common", + "prompt": "[Standard tool usage with specific options and requirements]", + "expected_output": "[Correct command(s) with proper flags]", + "assertions": [ + "Command syntax is correct", + "All required flags are included", + "Best practices are followed", + "Commands are copy-pasteable", + "Explains what each part does" + ] + }, + { + "id": 2, + "name": "edge-case", + "type": "edge", + "prompt": "[Tool usage with errors, conflicts, or unusual requirements]", + "expected_output": "[Troubleshooting steps and solutions]", + "assertions": [ + "Detects potential issues", + "Provides diagnostic commands", + "Offers multiple solutions", + "Explains risks of each approach" + ] + }, + { + "id": 3, + "name": "varied-phrasing", + "type": "variation", + "prompt": "[Casual request with vague requirements]", + "expected_output": "[Tool commands with reasonable defaults]", + "assertions": [ + "Fills in reasonable defaults", + "Provides complete solution", + "Explains assumptions made" + ] + } + ] +} +``` + +--- + +## Documentation Skills + +For skills that help create or review documentation. + +### Example: README Generator + +```json +{ + "skill_name": "readme-generator", + "description": "Creates comprehensive README files for projects", + "evals": [ + { + "id": 1, + "name": "common-case", + "type": "common", + "prompt": "Create a README for my Python project. It's a CLI tool called 'file-organizer' that sorts files by type and date. The repo is at ~/projects/file-organizer. It supports Python 3.8+, uses click for CLI, and has features for: organizing by extension, organizing by date, dry-run mode, and config file support.", + "expected_output": "Complete README with: title, description, installation, usage examples, features list, configuration, and contributing sections. Formatted in Markdown.", + "assertions": [ + "README is well-structured with clear headings", + "All requested sections are included", + "Installation instructions are clear", + "Usage examples show actual commands", + "Features are listed comprehensively" + ] + }, + { + "id": 2, + "name": "edge-case-minimal-info", + "type": "edge", + "prompt": "Write a README for my project. It's called 'utils' and it does some stuff with files.", + "expected_output": "README template with placeholder sections and prompts for missing information. Asks clarifying questions or provides generic placeholders.", + "assertions": [ + "Creates template structure despite minimal info", + "Uses placeholders for missing details", + "Suggests what information to add", + "Doesn't invent features or functionality" + ] + }, + { + "id": 3, + "name": "varied-phrasing-casual", + "type": "variation", + "prompt": "Hey can you write a readme for my new js library? It's on npm as 'async-queue'. It helps manage async tasks with a queue so you don't overwhelm APIs. Pretty simple but useful.", + "expected_output": "README with npm installation, basic usage example, and API overview", + "assertions": [ + "Understands from package name and casual description", + "Includes npm install instructions", + "Provides JavaScript usage examples", + "Explains the problem it solves" + ] + } + ] +} +``` + +### Template: Documentation Skill + +```json +{ + "skill_name": "YOUR_SKILL_NAME", + "description": "[Describe what documentation this skill helps with]", + "evals": [ + { + "id": 1, + "name": "common-case", + "type": "common", + "prompt": "[Detailed request with project information and requirements]", + "expected_output": "[Complete, well-structured documentation]", + "assertions": [ + "Documentation is well-organized", + "All required sections are included", + "Examples are clear and relevant", + "Formatting is consistent", + "Language is clear and concise" + ] + }, + { + "id": 2, + "name": "edge-case", + "type": "edge", + "prompt": "[Vague request with minimal information]", + "expected_output": "[Template with placeholders or clarifying questions]", + "assertions": [ + "Creates structure despite limited info", + "Uses appropriate placeholders", + "Identifies missing information", + "Doesn't invent false details" + ] + }, + { + "id": 3, + "name": "varied-phrasing", + "type": "variation", + "prompt": "[Casual request with implied requirements]", + "expected_output": "[Documentation meeting implicit needs]", + "assertions": [ + "Understands implicit requirements", + "Provides complete documentation", + "Matches tone of request" + ] + } + ] +} +``` + +--- + +## Tips for Customizing Templates + +### Making Tests Realistic + +**Bad:** +``` +"Convert CSV to JSON" +``` + +**Good:** +``` +"Convert the CSV file at ./data/sales.csv to JSON and save it as ./output/sales.json. The CSV has headers: date, product, quantity, price. Make sure dates are in ISO 8601 format." +``` + +### Assertions Should Be Checkable + +**Vague:** +```json +"assertions": [ + "Output looks good" +] +``` + +**Specific:** +```json +"assertions": [ + "JSON file exists at ./output/sales.json", + "All dates are in ISO 8601 format", + "Numeric fields are numbers, not strings" +] +``` + +### Edge Cases to Consider + +1. **Large data** - Files with 10,000+ rows +2. **Special characters** - Emojis, Unicode, unusual symbols +3. **Missing data** - Empty cells, null values +4. **Wrong format** - CSV with inconsistent columns +5. **Permissions** - Read/write errors +6. **Conflicts** - Port conflicts, file already exists +7. **Empty input** - Zero rows, blank files +8. **Unexpected state** - Dirty git worktree, missing dependencies + +### Variation Ideas + +1. **Casual language** - "Hey, can you...", "I need to..." +2. **Abbreviated** - Minimal words, assumes context +3. **Urgent** - "ASAP", "quickly", "fastest way" +4. **Uncertain** - "I think", "maybe", "probably" +5. **Brief** - Single sentence, minimal details +6. **Verbose** - Extra context, backstory + +--- + +## Validation Checklist + +Before using your evals.json: + +- [ ] Contains exactly 3 test cases (common, edge, variation) +- [ ] Each test has unique id (1, 2, 3) +- [ ] Each test has descriptive name +- [ ] Prompts are realistic with specific details +- [ ] Expected outputs are clear +- [ ] Assertions are objectively checkable +- [ ] JSON is valid (run through jq or json linter) +- [ ] File is saved as evals/evals.json in skill directory + +Run this to validate: +```bash +jq . evals/evals.json > /dev/null && echo "Valid JSON" || echo "Invalid JSON" +``` diff --git a/.config/opencode/skills/.skill-builder.disabled/references/grading-guide.md b/.config/opencode/skills/.skill-builder.disabled/references/grading-guide.md new file mode 100644 index 0000000..e447f98 --- /dev/null +++ b/.config/opencode/skills/.skill-builder.disabled/references/grading-guide.md @@ -0,0 +1,607 @@ +# Grading Guide + +Comprehensive guide for evaluating skill outputs and identifying improvements. + +## Table of Contents + +1. [The Evaluation Framework](#the-evaluation-framework) +2. [Correctness](#correctness) +3. [Completeness](#completeness) +4. [Format](#format) +5. [Triggering](#triggering) +6. [Efficiency](#efficiency) +7. [Identifying Patterns](#identifying-patterns) +8. [Decision Framework](#decision-framework) +9. [Common Issues and Solutions](#common-issues-and-solutions) +10. [When to Stop Iterating](#when-to-stop-iterating) + +--- + +## The Evaluation Framework + +Evaluate skill outputs across five dimensions: + +1. **Correctness** - Is the output accurate and correct? +2. **Completeness** - Are all requirements met? +3. **Format** - Does it follow expected structure? +4. **Triggering** - Does it activate appropriately? +5. **Efficiency** - Is it concise yet complete? + +Each dimension has specific criteria to check. Use the grade-output.sh script for systematic evaluation. + +--- + +## Correctness + +### What to Check + +**Output matches expected result:** +- For file transforms: Output file contains correct data +- For code generation: Code works as described +- For workflows: All steps lead to desired outcome +- For documentation: Information is accurate + +**No factual errors:** +- Commands use correct syntax +- File paths are accurate +- Versions are correct +- Technical details are right + +**Logic is sound:** +- Reasoning makes sense +- Steps follow logically +- No contradictions +- Edge cases considered + +**Edge cases handled appropriately:** +- Empty inputs don't crash +- Large inputs work +- Special characters preserved +- Errors handled gracefully + +### Examples + +**Good (Correct):** +```bash +# Command actually works +docker run -p 3000:3000 --name myapp myimage:v1.0 +``` + +**Bad (Incorrect):** +```bash +# Wrong flag syntax +docker run --port 3000:3000 myimage # Should be -p or --publish +``` + +**Good (Handles Edge Cases):** +```python +import os +if os.path.exists(filename): + process_file(filename) +else: + print(f"Error: {filename} not found") +``` + +**Bad (No Edge Case Handling):** +```python +process_file(filename) # Will crash if file doesn't exist +``` + +--- + +## Completeness + +### What to Check + +**All requested tasks completed:** +- Every requirement from prompt addressed +- No steps forgotten +- All files generated +- All transformations applied + +**No steps skipped:** +- Validation steps included +- Cleanup steps not forgotten +- Confirmation steps present +- Rollback guidance provided + +**Appropriate level of detail:** +- Not too brief (missing context) +- Not too verbose (information overload) +- Explains "why" not just "what" +- Provides examples where helpful + +**Relevant context included:** +- Prerequisites mentioned +- Dependencies noted +- Error scenarios covered +- Alternative approaches offered + +### Examples + +**Good (Complete):** +``` +To release version 1.3.0: + +1. Update version in package.json: "version": "1.3.0" +2. Add entry to CHANGELOG.md with today's date +3. Commit: git commit -am "chore(release): bump version to 1.3.0" +4. Create tag: git tag -a v1.3.0 -m "Release 1.3.0" +5. Push: git push origin main && git push origin v1.3.0 +6. Create GitHub release: gh release create v1.3.0 --notes-from-tag + +Prerequisites: +- All tests passing +- CHANGELOG.md updated +- You have push access to repository +``` + +**Bad (Incomplete):** +``` +To release: +1. Update version +2. Create tag +3. Push +``` + +--- + +## Format + +### What to Check + +**Output follows specified format:** +- If skill defines a template, output matches it +- Markdown formatting is correct +- Code blocks use proper syntax highlighting +- Tables are properly formatted + +**Consistent with examples in skill:** +- Format matches examples in SKILL.md +- Style is consistent +- Terminology is consistent +- Structure follows patterns + +**Easy to read and understand:** +- Clear headings and sections +- Good use of whitespace +- Logical organization +- No wall of text + +### Examples + +**Good (Well-Formatted):** +```markdown +## Deployment Steps + +### 1. Build Docker Image + +```bash +docker build -t myapp:v1.0 . +``` + +### 2. Run Container + +```bash +docker run -d -p 3000:3000 --name myapp myapp:v1.0 +``` + +### 3. Verify Deployment + +```bash +curl http://localhost:3000/health +``` +``` + +**Bad (Poor Formatting):** +``` +step 1: build the image with docker build -t myapp:v1.0 . then step 2 run the container with docker run -d -p 3000:3000 --name myapp myapp:v1.0 and step 3 verify with curl +``` + +--- + +## Triggering + +### What to Check + +**Skill activates when appropriate:** +- Relevant triggers work +- Similar phrasings activate it +- Context clues are recognized +- Doesn't require exact keywords + +**Doesn't activate when inappropriate:** +- Adjacent domains don't trigger it +- Ambiguous queries don't wrongly trigger +- Different tools aren't confused +- Scope is respected + +### Testing Triggering + +Test with 8 queries (4 should-trigger, 4 should-not-trigger): + +**Example for Docker skill:** + +Should trigger: +1. "Create a docker container for my Node.js app" +2. "Build an image from this Dockerfile" +3. "How do I compose up my services?" +4. "I need to deploy this using containers" + +Should not trigger: +1. "Install Docker on my machine" (installation vs usage) +2. "What is containerization?" (education vs hands-on) +3. "Show me Kubernetes commands" (different tool) +4. "Write a fibonacci function" (completely unrelated) + +### Examples + +**Good (Appropriate Triggering):** +- User: "containerize my app" +- Skill: Docker helper activates ✓ + +**Bad (Missed Trigger):** +- User: "I need to run my app in containers" +- Skill: Doesn't activate (too narrow description) ✗ + +**Bad (False Trigger):** +- User: "Install Docker on Ubuntu" +- Skill: Activates but only handles usage, not installation ✗ + +--- + +## Efficiency + +### What to Check + +**No unnecessary steps:** +- Doesn't do redundant work +- Doesn't suggest unnecessary commands +- Shortcuts used where safe +- Process is streamlined + +**Reasonable response length:** +- Not excessively verbose +- Doesn't repeat information +- No filler content +- Every sentence adds value + +**Not overly verbose:** +- Commands not over-explained +- Doesn't explain obvious steps +- Concise but complete +- Respects user's expertise + +### Examples + +**Good (Efficient):** +```bash +# Clean up Docker resources +docker system prune -f +``` + +**Bad (Verbose):** +```bash +# Clean up Docker resources +# First, we need to run the docker command +# Then we use the system subcommand +# Then we use the prune subcommand +# The -f flag means force +docker system prune -f +# This will remove stopped containers, unused networks, and dangling images +``` + +--- + +## Identifying Patterns + +### Single Test Failure + +**Characteristics:** +- Only one test case fails +- Issue is unique to that scenario +- Other tests pass + +**Action:** +- Add targeted instruction +- Include example for that edge case +- Fix the specific issue +- Don't generalize prematurely + +**Example:** +``` +Issue: Large file processing fails +Solution: Add instruction about memory-efficient processing +for files over 10,000 rows +``` + +### Multiple Test Failures (Same Issue) + +**Characteristics:** +- Same problem across 2+ tests +- Pattern suggests root cause +- Symptom appears in different forms + +**Action:** +- Fix the root cause +- Add general principle, not specific fix +- Consider extracting to script +- Explain the "why" + +**Example:** +``` +Issue: Tests 1 and 3 both have incorrect date formatting +Solution: Add general instruction about ISO 8601 date format +rather than fixing each case individually +``` + +### Multiple Test Failures (Different Issues) + +**Characteristics:** +- Different problems in each test +- No clear pattern +- Skill may be too broad + +**Action:** +- Clarify scope in description +- Add more specific instructions +- Consider splitting into multiple skills +- Focus on core functionality first + +**Example:** +``` +Issue: Test 1 fails on format, Test 2 fails on logic, Test 3 +never triggers skill +Solution: Clarify what the skill does/doesn't do in description +Add validation steps +``` + +--- + +## Decision Framework + +### Fix Specific Case When: + +- Unique edge case not covered +- One-time issue unlikely to recur +- Fix is simple and doesn't add complexity +- Doesn't indicate systemic problem + +**Example:** +``` +Test: CSV with Unicode characters fails +Fix: Add note about UTF-8 encoding for international characters +(Don't rewrite entire CSV handling) +``` + +### Generalize Solution When: + +- Same issue in 2+ tests +- Pattern suggests broader applicability +- Fix would benefit similar requests +- Explains underlying principle + +**Example:** +``` +Tests: Both test 1 and 2 have incorrect error handling +Fix: Add general principle about validating inputs before processing +with examples of common validation checks +``` + +### Extract to Script When: + +- Same multi-step process repeated +- Deterministic operation (not creative) +- Would save time on every invocation +- Logic is complex or error-prone + +**Example:** +``` +Pattern: All three tests require converting dates to ISO format +Action: Create scripts/convert-dates.py +Include in SKILL.md: "Use scripts/convert-dates.py for date formatting" +``` + +--- + +## Common Issues and Solutions + +### Issue: Skill Doesn't Trigger + +**Symptoms:** +- User says relevant phrase +- Skill doesn't appear in available skills +- Model handles request without skill + +**Solutions:** +1. Add specific trigger phrases to description +2. Use "pushy" language: "Make sure to use this skill whenever..." +3. Include synonyms and variations +4. Test with 8 trigger queries + +### Issue: Output Format Inconsistent + +**Symptoms:** +- Sometimes follows template, sometimes doesn't +- Format varies between similar requests +- Missing sections or elements + +**Solutions:** +1. Add explicit format template in SKILL.md +2. Provide multiple examples +3. Explain why format matters +4. Use imperative: "ALWAYS use this template" + +### Issue: Edge Cases Not Handled + +**Symptoms:** +- Large files cause errors +- Empty inputs crash +- Special characters corrupted +- Missing data causes failures + +**Solutions:** +1. Add validation step instructions +2. Include error handling examples +3. Create helper script for complex validation +4. Document known limitations + +### Issue: Too Verbose + +**Symptoms:** +- Responses are walls of text +- Over-explains obvious steps +- Repeats information +- Takes too long to get to point + +**Solutions:** +1. Remove redundant explanations +2. Trust user's expertise +3. Move details to references/ +4. Use progressive disclosure + +### Issue: Incomplete Responses + +**Symptoms:** +- Misses steps in workflow +- Forgets validation +- Doesn't mention prerequisites +- No error handling + +**Solutions:** +1. Add comprehensive checklist in SKILL.md +2. Include validation at each step +3. Document prerequisites upfront +4. Add error handling examples + +### Issue: Incorrect Commands + +**Symptoms:** +- Commands don't work when copied +- Syntax errors +- Wrong flags or options +- Outdated versions + +**Solutions:** +1. Test all commands before including +2. Specify version requirements +3. Add validation steps +4. Include error messages to expect + +--- + +## When to Stop Iterating + +### Stop When: + +1. **User is satisfied** + - User says "this works for me" + - User stops requesting changes + - User starts using skill regularly + +2. **Outputs meet expectations** + - All test cases pass + - Common scenarios work + - Edge cases handled reasonably + - No critical issues remain + +3. **No meaningful progress** + - Last 2-3 iterations didn't improve results + - Changes are cosmetic only + - Diminishing returns on effort + - 90% solution is good enough + +### Don't Stop When: + +- **Perfect is the enemy of good** - 90% working is better than endlessly tweaking +- **Edge cases are theoretical** - Don't over-engineer for unlikely scenarios +- **User wants to keep iterating** - Follow user's lead +- **New issues emerge** - Fix real problems as they're found + +### The 90% Rule + +A skill that works well for 90% of cases is sufficient. Users can handle: +- Rare edge cases manually +- Unique situations with custom guidance +- Complex scenarios with multiple steps + +**Focus on:** +- Common use cases (80% of value) +- Clear failure messages (10% of value) +- Good documentation (10% of value) + +**Don't focus on:** +- Every possible edge case (diminishing returns) +- Perfect formatting (good enough is fine) +- Handling every possible error (major errors are enough) + +--- + +## Grading Checklist + +Use this checklist for each test case: + +### Correctness +- [ ] Output matches expected result +- [ ] No factual errors +- [ ] Logic is sound +- [ ] Edge cases handled appropriately + +### Completeness +- [ ] All requested tasks completed +- [ ] No steps skipped +- [ ] Appropriate level of detail +- [ ] Relevant context included + +### Format +- [ ] Output follows specified format +- [ ] Consistent with examples +- [ ] Easy to read and understand + +### Triggering +- [ ] Skill activated when appropriate +- [ ] Did not activate when inappropriate + +### Efficiency +- [ ] No unnecessary steps +- [ ] Reasonable response length +- [ ] Not overly verbose + +### Overall +- [ ] Would use this skill again +- [ ] Would recommend to others +- [ ] Saves time vs manual approach +- [ ] Output quality meets needs + +--- + +## Recording Results + +After grading, record: + +```json +{ + "test_id": 1, + "result": "pass|fail|partial", + "issues": [ + "Description of issue 1", + "Description of issue 2" + ], + "suggested_fix": "Brief description of improvement", + "extract_script": false, + "priority": "high|medium|low" +} +``` + +Use the grade-output.sh script to generate this structure interactively. + +--- + +## Next Steps After Grading + +1. **Review all results** - Look for patterns +2. **Prioritize fixes** - High priority first +3. **Update SKILL.md** - Based on issues found +4. **Create scripts** - Extract repeated work +5. **Re-run tests** - Verify improvements +6. **Repeat** - Until satisfied or good enough diff --git a/.config/opencode/skills/.skill-builder.disabled/references/skill-templates.md b/.config/opencode/skills/.skill-builder.disabled/references/skill-templates.md new file mode 100644 index 0000000..f617eb2 --- /dev/null +++ b/.config/opencode/skills/.skill-builder.disabled/references/skill-templates.md @@ -0,0 +1,888 @@ +# Skill Templates + +Copy-paste templates for common skill patterns. Use these as starting points when creating new skills. + +## Table of Contents + +1. [Simple Skill (Knowledge Only)](#simple-skill) +2. [Tool Integration Skill](#tool-integration-skill) +3. [Workflow Skill](#workflow-skill) +4. [Documentation Skill](#documentation-skill) +5. [Testing Your Skill](#testing-your-skill) + +--- + +## Simple Skill + +For skills that provide knowledge without scripts or complex resources. + +**Use when:** The skill provides guidance, best practices, or reference information that can be explained in text. + +### Directory Structure + +``` +simple-skill/ +└── SKILL.md +``` + +### SKILL.md Template + +```markdown +--- +name: simple-skill +description: This skill should be used when the user asks to "TRIGGER_PHRASE_1", "TRIGGER_PHRASE_2", or needs help with SPECIFIC_TASK +license: MIT +compatibility: opencode +metadata: + category: CATEGORY + version: "1.0.0" +--- + +# Simple Skill Name + +Brief description of what this skill covers. + +## Overview + +High-level explanation of the domain or topic. + +## Common Tasks + +### Task 1: Description + +Step-by-step instructions: + +1. First step with code example +2. Second step with code example +3. Third step + +### Task 2: Description + +Step-by-step instructions: + +1. First step +2. Second step + +## Best Practices + +- Bullet point 1 +- Bullet point 2 +- Bullet point 3 + +## Important Notes + +- Critical caveat 1 +- Critical caveat 2 +``` + +### Example: Git Commit Guidelines + +```markdown +--- +name: git-commit-guide +description: This skill should be used when the user asks to "write a commit message", "format a commit", "conventional commits", or needs help with Git commit standards +license: MIT +compatibility: opencode +--- + +# Git Commit Guidelines + +Follow conventional commits specification for clear, structured commit messages. + +## Format + +``` +(): + +[optional body] + +[optional footer] +``` + +## Types + +- `feat`: New feature +- `fix`: Bug fix +- `docs`: Documentation only +- `style`: Code style (formatting, semicolons, etc) +- `refactor`: Code refactoring +- `test`: Adding or updating tests +- `chore`: Maintenance tasks + +## Examples + +```bash +# Feature +feat(auth): add OAuth2 login support + +# Fix with scope +fix(api): resolve null pointer in user endpoint + +# Breaking change +feat(db)!: migrate to PostgreSQL +``` + +## Rules + +- Use imperative mood: "add" not "added" +- Don't capitalize first letter +- No period at end +- Keep first line under 72 characters +``` + +--- + +## Tool Integration Skill + +For skills that interact with command-line tools and include automation scripts. + +**Use when:** The skill wraps a CLI tool, provides automation, or requires executable scripts. + +### Directory Structure + +``` +tool-skill/ +├── SKILL.md +├── scripts/ +│ └── helper.sh +├── references/ +│ └── advanced-usage.md +└── evals/ + └── evals.json +``` + +### SKILL.md Template + +```markdown +--- +name: tool-skill +description: This skill should be used when the user asks to "TRIGGER_PHRASE_1", "run TOOL_NAME", "TOOL_NAME commands", or work with TOOL_NAME +license: MIT +compatibility: opencode +metadata: + category: tools + version: "1.0.0" +--- + +# Tool Name Integration + +Interact with TOOL_NAME for SPECIFIC_PURPOSE. + +## Quick Start + +Basic usage: + +```bash +tool-name --option value +``` + +## Common Operations + +### Operation 1: Description + +```bash +# Example command +tool-name command --flag +``` + +Explanation of what this does. + +### Operation 2: Description + +```bash +# Example command +tool-name another-command +``` + +## Scripts + +Use helper scripts in `scripts/`: + +- `scripts/helper.sh` - What this script does + +## Advanced Usage + +For detailed options and edge cases, see `references/advanced-usage.md`. + +## Important Notes + +- Requirement 1 +- Requirement 2 +- Common pitfall +``` + +### scripts/helper.sh Template + +```bash +#!/bin/bash +# +# Helper script for TOOL_NAME operations +# + +set -euo pipefail + +# Configuration +DEFAULT_OPTION="value" + +# Functions +show_help() { + cat << EOF +Usage: $0 [OPTIONS] + +Helper script for TOOL_NAME + +Options: + -h, --help Show this help message + -v, --verbose Enable verbose output + -o, --option Specify option value + +Examples: + $0 --verbose + $0 --option custom-value +EOF +} + +# Parse arguments +VERBOSE=false +OPTION="$DEFAULT_OPTION" + +while [[ $# -gt 0 ]]; do + case $1 in + -h|--help) + show_help + exit 0 + ;; + -v|--verbose) + VERBOSE=true + shift + ;; + -o|--option) + OPTION="$2" + shift 2 + ;; + *) + echo "Unknown option: $1" + show_help + exit 1 + ;; + esac +done + +# Main logic +main() { + [[ "$VERBOSE" == true ]] && echo "Running with option: $OPTION" + + # Your tool integration logic here + echo "Helper script executed successfully" +} + +main "$@" +``` + +### Example: Docker Helper + +```markdown +--- +name: docker-helper +description: This skill should be used when the user asks to "manage docker containers", "build docker images", "docker compose operations", or work with Docker workflows +license: MIT +compatibility: opencode +--- + +# Docker Helper + +Streamline Docker container and image management. + +## Quick Commands + +### List Running Containers + +```bash +docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}" +``` + +### Clean Up System + +```bash +# Remove stopped containers, unused networks, dangling images +docker system prune -f +``` + +## Scripts + +Use helper scripts: + +- `scripts/container-logs.sh` - View and follow container logs +- `scripts/cleanup.sh` - Safe cleanup of unused resources + +## Docker Compose + +### Start Services + +```bash +docker compose up -d SERVICE_NAME +``` + +### View Logs + +```bash +docker compose logs -f SERVICE_NAME +``` + +## References + +- `references/compose-patterns.md` - Advanced compose configurations +- `references/troubleshooting.md` - Common issues and solutions +``` + +--- + +## Workflow Skill + +For skills that guide multi-step processes or complex procedures. + +**Use when:** The skill orchestrates a sequence of steps, decisions, or collaborative tasks. + +### Directory Structure + +``` +workflow-skill/ +├── SKILL.md +├── scripts/ +│ ├── step-validator.sh +│ └── automation.sh +├── references/ +│ ├── decision-matrix.md +│ └── examples/ +│ └── sample-workflow.md +└── evals/ + └── evals.json +``` + +### SKILL.md Template + +```markdown +--- +name: workflow-skill +description: This skill should be used when the user asks to "WORKFLOW_NAME", "perform WORKFLOW_TASK", "WORKFLOW_VERB process", or needs help with WORKFLOW_DOMAIN +license: MIT +compatibility: opencode +metadata: + category: workflow + version: "1.0.0" +--- + +# Workflow Name + +Guide the WORKFLOW_PROCESS from start to finish with validation at each step. + +## Prerequisites + +Before starting, ensure: + +1. Requirement 1 +2. Requirement 2 +3. Requirement 3 + +## Workflow Overview + +``` +Step 1 → Step 2 → Step 3 → Completion + ↓ ↓ ↓ +Validate Validate Validate +``` + +## Step-by-Step Process + +### Step 1: Preparation + +**Objective:** What to accomplish in this step + +**Actions:** +1. Action item 1 +2. Action item 2 + +**Validation:** +- [ ] Check item 1 +- [ ] Check item 2 + +**Script:** `scripts/step-validator.sh --step 1` + +### Step 2: Execution + +**Objective:** What to accomplish in this step + +**Actions:** +1. Action item 1 +2. Action item 2 + +**Validation:** +- [ ] Check item 1 +- [ ] Check item 2 + +**Decision Point:** + +If CONDITION_A: +- Follow path A (see `references/path-a.md`) + +If CONDITION_B: +- Follow path B (see `references/path-b.md`) + +### Step 3: Verification + +**Objective:** Final checks and completion + +**Actions:** +1. Run verification command +2. Review output + +**Validation:** +- [ ] All checks pass +- [ ] No errors in logs + +## Automation + +For automated execution: + +```bash +scripts/automation.sh --full +``` + +## Troubleshooting + +See `references/troubleshooting.md` for common issues. + +## Examples + +Complete workflow examples in `references/examples/`. +``` + +### Example: Release Workflow + +```markdown +--- +name: release-workflow +description: This skill should be used when the user asks to "create a release", "publish a new version", "tag a release", or prepare software for deployment +license: MIT +compatibility: opencode +--- + +# Release Workflow + +Guide the complete release process from version bump to deployment. + +## Prerequisites + +- [ ] All tests passing +- [ ] CHANGELOG.md updated +- [ ] Version number decided + +## Workflow + +### Step 1: Pre-Release Checks + +**Actions:** +1. Run full test suite: `npm test` +2. Verify CHANGELOG.md is current +3. Check for uncommitted changes + +**Validation:** +```bash +scripts/pre-release-check.sh +``` + +### Step 2: Version Bump + +**Actions:** +1. Update version in package.json +2. Update CHANGELOG.md with release date +3. Commit with message: `chore(release): bump version to X.Y.Z` + +**Validation:** +- [ ] Version updated in all files +- [ ] CHANGELOG.md has date +- [ ] Commit created + +### Step 3: Create Tag + +**Actions:** +1. Create annotated tag: `git tag -a vX.Y.Z -m "Release X.Y.Z"` +2. Push tag: `git push origin vX.Y.Z` + +**Validation:** +```bash +scripts/verify-tag.sh vX.Y.Z +``` + +### Step 4: GitHub Release + +**Actions:** +1. Draft release notes from CHANGELOG +2. Create GitHub release +3. Attach artifacts if needed + +**Command:** +```bash +gh release create vX.Y.Z --notes-from-tag +``` + +## Decision Matrix + +See `references/decision-matrix.md` for: +- Version numbering (semver vs calver) +- Pre-release vs stable +- Hotfix procedures +``` + +--- + +## Documentation Skill + +For skills focused on writing, reviewing, or managing documentation. + +**Use when:** The skill helps create documentation, enforces standards, or provides writing guidelines. + +### Directory Structure + +``` +docs-skill/ +├── SKILL.md +├── references/ +│ ├── style-guide.md +│ ├── templates/ +│ │ ├── README-template.md +│ │ └── API-doc-template.md +│ └── examples/ +│ └── sample-docs/ +├── assets/ +│ └── diagram-template.png +└── evals/ + └── evals.json +``` + +### SKILL.md Template + +```markdown +--- +name: docs-skill +description: This skill should be used when the user asks to "write documentation", "create a README", "document this code", "review docs", or needs help with technical writing +license: MIT +compatibility: opencode +metadata: + category: documentation + version: "1.0.0" +--- + +# Documentation Helper + +Create clear, consistent, and effective technical documentation. + +## Quick Start + +### Create README + +Start with the template: + +```bash +cp references/templates/README-template.md ./README.md +``` + +Then fill in: +1. Project name and description +2. Installation steps +3. Usage examples +4. API reference (if applicable) + +## Documentation Types + +### README.md + +Essential sections: +- Title and description +- Installation +- Quick start +- Usage examples +- API reference (if applicable) +- Contributing +- License + +Use template: `references/templates/README-template.md` + +### API Documentation + +Structure: +- Endpoint description +- Request format +- Response format +- Error codes +- Examples + +Use template: `references/templates/API-doc-template.md` + +### Code Comments + +Follow style guide in `references/style-guide.md`: +- JSDoc for JavaScript +- Docstrings for Python +- Rustdoc for Rust + +## Style Guide + +See `references/style-guide.md` for: +- Tone and voice +- Formatting rules +- Terminology +- Code example standards + +## Review Checklist + +Before finalizing documentation: + +- [ ] Clear and concise +- [ ] No typos or grammar errors +- [ ] Code examples work +- [ ] All links functional +- [ ] Proper heading hierarchy +- [ ] Consistent formatting + +## Examples + +Review well-documented projects in `references/examples/`. + +## Assets + +- `assets/diagram-template.png` - Template for architecture diagrams +``` + +### Example: API Documentation Helper + +```markdown +--- +name: api-docs-helper +description: This skill should be used when the user asks to "document an API", "create API documentation", "write endpoint docs", or needs help with REST API documentation +license: MIT +compatibility: opencode +--- + +# API Documentation Helper + +Create comprehensive REST API documentation following OpenAPI standards. + +## Endpoint Documentation Template + +For each endpoint, document: + +### 1. Overview + +```markdown +## POST /api/v1/resource + +Brief description of what this endpoint does. + +**Authorization:** Required (Bearer token) +**Rate Limit:** 100 requests/minute +``` + +### 2. Request + +```markdown +### Headers + +| Header | Value | Required | +|--------|-------|----------| +| Content-Type | application/json | Yes | +| Authorization | Bearer {token} | Yes | + +### Body + +```json +{ + "field1": "string (required) - Description", + "field2": "number (optional) - Description" +} +``` +``` + +### 3. Response + +```markdown +### Success (200 OK) + +```json +{ + "id": "string", + "created_at": "ISO 8601 datetime", + "status": "active" +} +``` + +### Error Responses + +- `400 Bad Request` - Invalid input +- `401 Unauthorized` - Missing/invalid token +- `404 Not Found` - Resource doesn't exist +``` + +### 4. Examples + +Provide curl and client library examples. + +## Standards + +Follow conventions in `references/api-standards.md`: +- Use JSON for request/response bodies +- ISO 8601 for dates +- snake_case for field names +- Include pagination metadata for lists + +## Tools + +Validate documentation: +- `scripts/validate-openapi.sh` - Check OpenAPI spec +- `scripts/check-examples.sh` - Verify code examples work +``` + +--- + +## Testing Your Skill + +After creating a skill from these templates, follow this testing workflow: + +### Step 1: Create Test Cases + +Generate test cases for your skill: + +```bash +~/.config/opencode/skills/skill-builder/scripts/create-tests.sh ~/.config/opencode/skills/your-skill +``` + +This creates `evals/evals.json` with 3 template test cases: +1. **Common Case** - Typical usage scenario +2. **Edge Case** - Tricky or unusual situation +3. **Varied Phrasing** - Same intent, different words + +### Step 2: Customize Test Prompts + +Edit `evals/evals.json` and replace placeholder text with realistic test prompts: + +**Good test prompt:** +``` +Convert the CSV file at ./data/sales.csv to JSON and save it as ./output/sales.json. +The CSV has headers: date, product, quantity, price. Make sure dates are in ISO 8601 format. +``` + +**Bad test prompt:** +``` +Convert CSV to JSON +``` + +See `eval-templates.md` for copy-paste templates for different skill types. + +### Step 3: Run Tests + +Execute the test workflow: + +```bash +~/.config/opencode/skills/skill-builder/scripts/run-tests.sh ~/.config/opencode/skills/your-skill +``` + +This will: +- Display each test prompt +- Guide you through manual testing in opencode +- Record your evaluations (pass/fail/skip) +- Save results to `evals/test-results.json` + +### Step 4: Grade Results + +Use the grading checklist for systematic evaluation: + +```bash +~/.config/opencode/skills/skill-builder/scripts/grade-output.sh ~/.config/opencode/skills/your-skill +``` + +This evaluates: +- Correctness +- Completeness +- Format +- Triggering +- Efficiency + +And generates `evals/grading-report.json` with structured feedback. + +### Step 5: Iterate + +Based on test results, improve your skill: + +1. Review issues in grading report +2. Update SKILL.md with fixes +3. Re-run tests +4. Repeat until satisfied + +See `grading-guide.md` for detailed evaluation criteria and decision frameworks. + +### Testing by Skill Type + +**Simple Skills:** +- May not need formal test cases +- Test manually with 2-3 realistic prompts +- Verify outputs are helpful and accurate + +**Tool Integration Skills:** +- Must test all commands work when copied +- Verify edge cases (errors, conflicts) +- Test with different tool versions if possible + +**Workflow Skills:** +- Test complete end-to-end flow +- Verify each step produces correct result +- Test decision points and branches +- Check error recovery paths + +**Code Generation Skills:** +- Verify generated code is syntactically valid +- Test that code actually runs +- Check edge cases (empty input, large data) +- Validate error handling + +### When to Stop Testing + +Stop iterating when: +- All critical test cases pass +- User is satisfied with quality +- Common scenarios work reliably +- No longer making meaningful improvements + +Remember: A skill that works well for 90% of cases is sufficient. Perfection is the enemy of good. + +--- + +## Tips for All Templates + +1. **Replace placeholders immediately** - Search for `TRIGGER_PHRASE`, `TOOL_NAME`, etc. + +2. **Write description first** - Include 3-5 specific trigger phrases in quotes + +3. **Keep examples working** - Test all code examples before finalizing + +4. **Validate early and often** - Run validation after every significant change + +5. **Progressive disclosure** - Move details to references/ as the skill grows + +6. **Be specific** - Generic skills don't trigger well. Make descriptions concrete. + +7. **Test triggers** - After creating, try phrases from your description to verify activation + +## Validation Checklist for New Skills + +Before considering a skill complete: + +- [ ] SKILL.md created with proper frontmatter +- [ ] `name` matches directory name +- [ ] `description` is 20-1024 characters with trigger phrases +- [ ] No second-person writing in body +- [ ] All scripts are executable +- [ ] Referenced files exist and are mentioned in SKILL.md +- [ ] Validation passes: `validate-skill.sh /path/to/skill` +- [ ] Test cases created in `evals/evals.json` +- [ ] Tests pass or issues documented +- [ ] Tested with actual trigger phrases diff --git a/.config/opencode/skills/.skill-builder.disabled/scripts/create-tests.sh b/.config/opencode/skills/.skill-builder.disabled/scripts/create-tests.sh new file mode 100755 index 0000000..c5ced84 --- /dev/null +++ b/.config/opencode/skills/.skill-builder.disabled/scripts/create-tests.sh @@ -0,0 +1,179 @@ +#!/bin/bash +# +# create-tests.sh - Interactive test case generator for skills +# +# Usage: ./create-tests.sh +# +# This script interactively generates evals/evals.json with 3 template test cases. +# + +set -euo pipefail + +# Colors for output +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +BLUE='\033[0;34m' +NC='\033[0m' # No Color + +# Parse arguments +SKILL_DIR="" + +usage() { + echo "Usage: $0 " + echo "" + echo "Examples:" + echo " $0 ~/.config/opencode/skills/my-skill" + echo " $0 ./my-skill" + exit 1 +} + +# Parse command line arguments +while [[ $# -gt 0 ]]; do + case $1 in + -h|--help) + usage + ;; + -*) + echo -e "${RED}Error: Unknown option $1${NC}" + usage + ;; + *) + if [[ -z "$SKILL_DIR" ]]; then + SKILL_DIR="$1" + else + echo -e "${RED}Error: Multiple skill directories provided${NC}" + usage + fi + shift + ;; + esac +done + +# Validate skill directory +if [[ -z "$SKILL_DIR" ]]; then + echo -e "${RED}Error: Skill directory is required${NC}" + usage +fi + +# Resolve path +SKILL_DIR="$(cd "$SKILL_DIR" 2>/dev/null && pwd)" || { + echo -e "${RED}Error: Cannot access directory: $SKILL_DIR${NC}" + exit 1 +} + +# Get skill name from directory +SKILL_NAME="$(basename "$SKILL_DIR")" + +# Verify SKILL.md exists +if [[ ! -f "$SKILL_DIR/SKILL.md" ]]; then + echo -e "${RED}Error: SKILL.md not found in $SKILL_DIR${NC}" + echo "This does not appear to be a valid skill directory." + exit 1 +fi + +echo "========================================" +echo "Creating Test Cases for: $SKILL_NAME" +echo "========================================" +echo "" + +# Create evals directory +mkdir -p "$SKILL_DIR/evals" +echo -e "${GREEN}✓ Created evals/ directory${NC}" +echo "" + +# Check if evals.json already exists +if [[ -f "$SKILL_DIR/evals/evals.json" ]]; then + echo -e "${YELLOW}⚠ evals.json already exists${NC}" + read -p "Overwrite? (y/N): " -n 1 -r + echo + if [[ ! $REPLY =~ ^[Yy]$ ]]; then + echo "Aborted." + exit 0 + fi +fi + +# Get skill purpose from user +echo "Let's create 3 test cases for your skill." +echo "" +echo "First, tell me about your skill's purpose:" +echo " (e.g., 'Helps users convert CSV files to JSON format')" +read -r SKILL_PURPOSE + +if [[ -z "$SKILL_PURPOSE" ]]; then + SKILL_PURPOSE="[Describe what your skill does]" +fi + +echo "" +echo "Now I'll generate 3 test case templates." +echo "You can edit evals/evals.json later to customize them." +echo "" + +# Generate evals.json +cat > "$SKILL_DIR/evals/evals.json" << EVALSJSON +{ + "skill_name": "$SKILL_NAME", + "description": "$SKILL_PURPOSE", + "evals": [ + { + "id": 1, + "name": "common-case", + "type": "common", + "prompt": "[REPLACE WITH REALISTIC USER REQUEST]\n\nExample: Convert the CSV file at ./data/sales.csv to JSON format and save it as ./output/sales.json. Make sure dates are in ISO 8601 format.", + "expected_output": "[DESCRIBE EXPECTED RESULT]\n\nExample: A JSON file at ./output/sales.json containing the sales data from the CSV, with all dates converted to ISO 8601 format.", + "assertions": [ + "Output file exists at the specified location", + "File is valid JSON", + "Dates are in ISO 8601 format" + ], + "notes": "This tests the typical, straightforward use case." + }, + { + "id": 2, + "name": "edge-case", + "type": "edge", + "prompt": "[REPLACE WITH EDGE CASE SCENARIO]\n\nExample: Convert a CSV file with 100,000 rows, some containing special characters (emojis, Unicode) and empty values in certain columns. The file is at ./data/large_export.csv.", + "expected_output": "[DESCRIBE EXPECTED BEHAVIOR]\n\nExample: JSON file is created successfully, handling all special characters correctly and preserving empty values as null or empty strings.", + "assertions": [ + "Large file is processed without errors", + "Special characters are preserved correctly", + "Empty values are handled appropriately" + ], + "notes": "This tests edge cases like large files, special characters, or missing data." + }, + { + "id": 3, + "name": "varied-phrasing", + "type": "variation", + "prompt": "[REPLACE WITH SAME INTENT, DIFFERENT WORDS]\n\nExample: Hey, I've got this spreadsheet in data/sales.csv. Can you turn it into JSON for me? And make sure the dates look right - you know, standard format?", + "expected_output": "[SAME AS COMMON CASE]", + "assertions": [ + "Skill triggers correctly with casual language", + "Output matches common case results", + "Implicit requirements (date formatting) are handled" + ], + "notes": "This tests that the skill works with different phrasings and casual language." + } + ] +} +EVALSJSON + +echo -e "${GREEN}✓ Created evals/evals.json${NC}" +echo "" +echo "========================================" +echo "Next Steps:" +echo "========================================" +echo "" +echo "1. Edit evals/evals.json" +echo " Replace the [PLACEHOLDER] text with realistic test cases" +echo "" +echo "2. Tips for writing good test prompts:" +echo " - Include file paths (e.g., ./data/file.csv)" +echo " - Add personal context (e.g., 'my boss sent me')" +echo " - Use specific values and column names" +echo " - Mix formal and casual language" +echo "" +echo "3. Run tests when ready:" +echo " ~/.config/opencode/skills/skill-builder/scripts/run-tests.sh $SKILL_DIR" +echo "" +echo -e "${GREEN}Done!${NC}" diff --git a/.config/opencode/skills/.skill-builder.disabled/scripts/grade-output.sh b/.config/opencode/skills/.skill-builder.disabled/scripts/grade-output.sh new file mode 100755 index 0000000..d4ef91f --- /dev/null +++ b/.config/opencode/skills/.skill-builder.disabled/scripts/grade-output.sh @@ -0,0 +1,363 @@ +#!/bin/bash +# +# grade-output.sh - Interactive grading checklist for skill outputs +# +# Usage: ./grade-output.sh +# +# This script provides a structured checklist for evaluating skill test outputs. +# + +set -euo pipefail + +# Colors for output +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +BLUE='\033[0;34m' +CYAN='\033[0;36m' +NC='\033[0m' # No Color + +# Parse arguments +SKILL_DIR="" + +usage() { + echo "Usage: $0 " + echo "" + echo "Examples:" + echo " $0 ~/.config/opencode/skills/my-skill" + echo " $0 ./my-skill" + exit 1 +} + +# Parse command line arguments +while [[ $# -gt 0 ]]; do + case $1 in + -h|--help) + usage + ;; + -*) + echo -e "${RED}Error: Unknown option $1${NC}" + usage + ;; + *) + if [[ -z "$SKILL_DIR" ]]; then + SKILL_DIR="$1" + else + echo -e "${RED}Error: Multiple skill directories provided${NC}" + usage + fi + shift + ;; + esac +done + +# Validate skill directory +if [[ -z "$SKILL_DIR" ]]; then + echo -e "${RED}Error: Skill directory is required${NC}" + usage +fi + +# Resolve path +SKILL_DIR="$(cd "$SKILL_DIR" 2>/dev/null && pwd)" || { + echo -e "${RED}Error: Cannot access directory: $SKILL_DIR${NC}" + exit 1 +} + +# Get skill name from directory +SKILL_NAME="$(basename "$SKILL_DIR")" + +echo "========================================" +echo "Grading Output for: $SKILL_NAME" +echo "========================================" +echo "" + +# Check for test results +if [[ -f "$SKILL_DIR/evals/test-results.json" ]]; then + echo -e "${BLUE}Found previous test results${NC}" + echo "" +fi + +# Initialize grading data +declare -a CRITERIA=( + "Correctness: Output matches expected result" + "Correctness: No factual errors" + "Correctness: Logic is sound" + "Correctness: Edge cases handled appropriately" + "Completeness: All requested tasks completed" + "Completeness: No steps skipped" + "Completeness: Appropriate level of detail" + "Completeness: Relevant context included" + "Format: Output follows specified format" + "Format: Consistent with examples in skill" + "Format: Easy to read and understand" + "Triggering: Skill activated when appropriate" + "Triggering: Did not activate when inappropriate" + "Efficiency: No unnecessary steps" + "Efficiency: Reasonable response length" + "Efficiency: Not overly verbose" +) + +GRADES=() +ISSUES=() + +# Function to ask yes/no question +ask_yes_no() { + local prompt="$1" + while true; do + read -p "$prompt (y/n): " -n 1 -r + echo + case $REPLY in + [Yy]) + return 0 + ;; + [Nn]) + return 1 + ;; + *) + echo "Please enter y or n" + ;; + esac + done +} + +echo "This checklist will help you systematically evaluate the skill output." +echo "Answer each question based on the test results you observed." +echo "" +read -p "Press Enter to begin grading..." +echo "" + +# Grade each criterion +echo "========================================" +echo -e "${CYAN}Grading Criteria${NC}" +echo "========================================" +echo "" + +for criterion in "${CRITERIA[@]}"; do + category="${criterion%%:*}" + description="${criterion#*: }" + + echo -e "${BLUE}[$category]${NC} $description" + + if ask_yes_no " Does it meet this criterion"; then + GRADES+=("$criterion: PASS") + echo -e " ${GREEN}✓ Pass${NC}" + else + GRADES+=("$criterion: FAIL") + echo -e " ${RED}✗ Fail${NC}" + + # Ask for issue description + echo " Briefly describe the issue:" + read -r issue + if [[ -n "$issue" ]]; then + ISSUES+=("[$category] $description: $issue") + fi + fi + echo "" +done + +# Overall assessment +echo "========================================" +echo -e "${CYAN}Overall Assessment${NC}" +echo "========================================" +echo "" + +echo "Overall Result:" +echo " [p] Pass - All or most criteria met" +echo " [f] Fail - Significant issues found" +echo " [i] Incomplete - Needs more testing" +echo "" + +while true; do + read -p "Overall result (p/f/i): " -n 1 -r + echo + case $REPLY in + [Pp]) + OVERALL_RESULT="pass" + break + ;; + [Ff]) + OVERALL_RESULT="fail" + break + ;; + [Ii]) + OVERALL_RESULT="incomplete" + break + ;; + *) + echo "Please enter p, f, or i" + ;; + esac +done + +echo "" + +# Priority assessment +echo "Priority of fixes needed:" +echo " [h] High - Critical issues, skill not usable" +echo " [m] Medium - Important issues, skill partially works" +echo " [l] Low - Minor issues, skill mostly works" +echo "" + +while true; do + read -p "Priority (h/m/l): " -n 1 -r + echo + case $REPLY in + [Hh]) + PRIORITY="high" + break + ;; + [Mm]) + PRIORITY="medium" + break + ;; + [Ll]) + PRIORITY="low" + break + ;; + *) + echo "Please enter h, m, or l" + ;; + esac +done + +echo "" + +# Suggested fixes +echo -e "${BLUE}Suggested Improvements (optional):${NC}" +echo "Describe what changes would address the issues:" +read -r SUGGESTED_FIXES + +echo "" + +# Pattern analysis +echo "========================================" +echo -e "${CYAN}Pattern Analysis${NC}" +echo "========================================" +echo "" + +if ask_yes_no "Did the same issue appear in multiple test cases"; then + echo "This suggests a systemic problem. Consider:" + echo " - Fixing the root cause rather than symptoms" + echo " - Adding a helper script for repeated tasks" + echo " - Clarifying instructions in SKILL.md" + PATTERN="systemic" +else + echo "Issues appear to be isolated to specific cases." + PATTERN="isolated" +fi + +echo "" + +# Extract to script recommendation +if ask_yes_no "Should any repeated work be extracted to a script"; then + echo "Consider creating a script in scripts/ directory." + EXTRACT_SCRIPT="true" +else + EXTRACT_SCRIPT="false" +fi + +echo "" + +# Generate grading report +TIMESTAMP=$(date -u +"%Y-%m-%dT%H:%M:%SZ") + +# Build issues array +ISSUES_JSON="[" +for i in "${!ISSUES[@]}"; do + if [[ $i -gt 0 ]]; then + ISSUES_JSON+="," + fi + ISSUES_JSON+="\"${ISSUES[$i]}\"" +done +ISSUES_JSON+="]" + +# Build grades array +GRADES_JSON="[" +for i in "${!GRADES[@]}"; do + if [[ $i -gt 0 ]]; then + GRADES_JSON+="," + fi + GRADES_JSON+="\"${GRADES[$i]}\"" +done +GRADES_JSON+="]" + +# Calculate pass rate +TOTAL_CRITERIA=${#CRITERIA[@]} +PASSED_COUNT=0 +for grade in "${GRADES[@]}"; do + if [[ "$grade" == *"PASS" ]]; then + ((PASSED_COUNT++)) + fi +done + +PASS_RATE=$((PASSED_COUNT * 100 / TOTAL_CRITERIA)) + +cat > "$SKILL_DIR/evals/grading-report.json" << EOF +{ + "skill_name": "$SKILL_NAME", + "timestamp": "$TIMESTAMP", + "overall_result": "$OVERALL_RESULT", + "priority": "$PRIORITY", + "pass_rate": $PASS_RATE, + "criteria_passed": $PASSED_COUNT, + "criteria_total": $TOTAL_CRITERIA, + "pattern_analysis": "$PATTERN", + "extract_script_recommended": $EXTRACT_SCRIPT, + "detailed_grades": $GRADES_JSON, + "issues": $ISSUES_JSON, + "suggested_fixes": "$SUGGESTED_FIXES" +} +EOF + +echo "========================================" +echo -e "${GREEN}Grading Report Generated${NC}" +echo "========================================" +echo "" +echo "Saved to: evals/grading-report.json" +echo "" +echo "Summary:" +echo " Overall: $OVERALL_RESULT" +echo " Priority: $PRIORITY" +echo " Pass Rate: $PASS_RATE% ($PASSED_COUNT/$TOTAL_CRITERIA criteria)" +echo " Pattern: $PATTERN issues" +echo "" + +if [[ ${#ISSUES[@]} -gt 0 ]]; then + echo -e "${YELLOW}Issues Found:${NC}" + for issue in "${ISSUES[@]}"; do + echo " - $issue" + done + echo "" +fi + +# Next steps +echo "========================================" +echo "Next Steps" +echo "========================================" +echo "" + +if [[ "$OVERALL_RESULT" == "pass" ]]; then + echo -e "${GREEN}✓ Skill is working well!${NC}" + echo "" + echo "Consider:" + echo " - Adding more edge case tests" + echo " - Optimizing the description" + echo " - Documenting the skill" +else + echo "To improve the skill:" + echo "" + echo "1. Review grading-report.json for details" + echo "2. Update SKILL.md based on the issues found" + echo "" + if [[ "$EXTRACT_SCRIPT" == "true" ]]; then + echo "3. Create helper scripts for repeated tasks" + echo " - Place scripts in scripts/ directory" + echo " - Update SKILL.md to reference them" + echo "" + fi + echo "4. Re-run tests to verify improvements:" + echo " ~/.config/opencode/skills/skill-builder/scripts/run-tests.sh $SKILL_DIR" +fi + +echo "" +echo -e "${GREEN}Done!${NC}" diff --git a/.config/opencode/skills/.skill-builder.disabled/scripts/init-skill.sh b/.config/opencode/skills/.skill-builder.disabled/scripts/init-skill.sh new file mode 100755 index 0000000..0c5180c --- /dev/null +++ b/.config/opencode/skills/.skill-builder.disabled/scripts/init-skill.sh @@ -0,0 +1,391 @@ +#!/bin/bash +# +# init-skill.sh - Scaffold a new skill with proper structure +# +# Usage: ./init-skill.sh [options] +# +# Options: +# --local Create in current project (.opencode/skills/) instead of global +# --with-scripts Include scripts/ directory +# --with-refs Include references/ directory +# --with-assets Include assets/ directory +# --with-tests Include evals/ directory for test cases +# --full Include all optional directories +# + +set -euo pipefail + +# Colors for output +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +NC='\033[0m' # No Color + +# Get script directory +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +SKILL_BUILDER_DIR="$(dirname "$SCRIPT_DIR")" + +# Parse arguments +SKILL_NAME="" +LOCAL=false +WITH_SCRIPTS=false +WITH_REFS=false +WITH_ASSETS=false +WITH_TESTS=false + +usage() { + echo "Usage: $0 [options]" + echo "" + echo "Options:" + echo " --local Create in current project (.opencode/skills/)" + echo " --with-scripts Include scripts/ directory" + echo " --with-refs Include references/ directory" + echo " --with-assets Include assets/ directory" + echo " --with-tests Include evals/ directory for test cases" + echo " --full Include all optional directories" + echo "" + echo "Examples:" + echo " $0 my-skill" + echo " $0 my-skill --full" + echo " $0 my-skill --local --with-scripts" + exit 1 +} + +# Validate skill name +validate_name() { + local name="$1" + + if [[ -z "$name" ]]; then + echo -e "${RED}Error: Skill name is required${NC}" + usage + fi + + if [[ ! "$name" =~ ^[a-z0-9-]+$ ]]; then + echo -e "${RED}Error: Skill name must be lowercase alphanumeric with hyphens only${NC}" + echo " Valid: my-skill, docker-helper, test-runner" + echo " Invalid: My Skill, docker_helper, testRunner" + exit 1 + fi + + if [[ ${#name} -lt 1 ]]; then + echo -e "${RED}Error: Skill name cannot be empty${NC}" + exit 1 + fi + + if [[ ${#name} -gt 64 ]]; then + echo -e "${RED}Error: Skill name must be under 64 characters${NC}" + exit 1 + fi +} + +# Parse command line arguments +while [[ $# -gt 0 ]]; do + case $1 in + --local) + LOCAL=true + shift + ;; + --with-scripts) + WITH_SCRIPTS=true + shift + ;; + --with-refs) + WITH_REFS=true + shift + ;; + --with-assets) + WITH_ASSETS=true + shift + ;; + --with-tests) + WITH_TESTS=true + shift + ;; + --full) + WITH_SCRIPTS=true + WITH_REFS=true + WITH_ASSETS=true + WITH_TESTS=true + shift + ;; + -h|--help) + usage + ;; + -*) + echo -e "${RED}Error: Unknown option $1${NC}" + usage + ;; + *) + if [[ -z "$SKILL_NAME" ]]; then + SKILL_NAME="$1" + else + echo -e "${RED}Error: Multiple skill names provided${NC}" + usage + fi + shift + ;; + esac +done + +# Validate skill name +validate_name "$SKILL_NAME" + +# Determine target directory +if [[ "$LOCAL" == true ]]; then + # Check if we're in a git repository + if ! git rev-parse --git-dir > /dev/null 2>&1; then + echo -e "${RED}Error: --local requires being in a git repository${NC}" + exit 1 + fi + + TARGET_DIR=".opencode/skills/$SKILL_NAME" + + # Check for AGENTS.md in project + if [[ -f "AGENTS.md" ]]; then + echo -e "${GREEN}✓ Found AGENTS.md in project root${NC}" + else + echo -e "${YELLOW}⚠ No AGENTS.md found in project root${NC}" + echo " Consider creating one for project-specific rules" + fi +else + TARGET_DIR="$HOME/.config/opencode/skills/$SKILL_NAME" +fi + +# Check if skill already exists +if [[ -d "$TARGET_DIR" ]]; then + echo -e "${RED}Error: Skill already exists at $TARGET_DIR${NC}" + exit 1 +fi + +echo -e "${GREEN}Creating skill: $SKILL_NAME${NC}" +echo " Location: $TARGET_DIR" +echo "" + +# Create skill directory +mkdir -p "$TARGET_DIR" + +# Create SKILL.md +cat > "$TARGET_DIR/SKILL.md" << 'SKILL_TEMPLATE' +--- +name: SKILL_NAME_PLACEHOLDER +description: This skill should be used when the user asks to "DESCRIPTION_HERE". Add specific trigger phrases that would activate this skill. +license: MIT +compatibility: opencode +metadata: + category: general + version: "1.0.0" +--- + +# SKILL_NAME_PLACEHOLDER + +Brief description of what this skill does and its purpose. + +## What This Skill Provides + +1. **Feature 1** - Brief description +2. **Feature 2** - Brief description +3. **Feature 3** - Brief description + +## Quick Start + +Basic usage example: + +```bash +# Example command +some-tool --option value +``` + +## Common Tasks + +### Task 1: Description + +Step-by-step instructions: + +1. First step +2. Second step +3. Third step + +### Task 2: Description + +Step-by-step instructions: + +1. First step +2. Second step + +## Important Notes + +- Keep this section brief +- Use bullet points for clarity +- Reference supporting files if available + +## Resources + +SKILL_TEMPLATE + +# Add resource section if directories were created +if [[ "$WITH_REFS" == true ]]; then + cat >> "$TARGET_DIR/SKILL.md" << 'REF_SECTION' + +### Reference Files + +- `references/detailed-guide.md` - Detailed documentation + +REF_SECTION +fi + +if [[ "$WITH_SCRIPTS" == true ]]; then + cat >> "$TARGET_DIR/SKILL.md" << 'SCRIPT_SECTION' + +### Scripts + +- `scripts/helper.sh` - Utility script + +SCRIPT_SECTION +fi + +if [[ "$WITH_ASSETS" == true ]]; then + cat >> "$TARGET_DIR/SKILL.md" << 'ASSET_SECTION' + +### Assets + +- `assets/template.txt` - Template file + +ASSET_SECTION +fi + +# Replace placeholder with actual skill name +sed -i "s/SKILL_NAME_PLACEHOLDER/$SKILL_NAME/g" "$TARGET_DIR/SKILL.md" + +# Create optional directories +if [[ "$WITH_SCRIPTS" == true ]]; then + mkdir -p "$TARGET_DIR/scripts" + + # Create example script + cat > "$TARGET_DIR/scripts/helper.sh" << 'SCRIPT_EXAMPLE' +#!/bin/bash +# +# Helper script for SKILL_NAME_PLACEHOLDER +# + +set -euo pipefail + +echo "Helper script for SKILL_NAME_PLACEHOLDER" +echo "Modify this script for your needs" +SCRIPT_EXAMPLE + + sed -i "s/SKILL_NAME_PLACEHOLDER/$SKILL_NAME/g" "$TARGET_DIR/scripts/helper.sh" + chmod +x "$TARGET_DIR/scripts/helper.sh" + + echo -e "${GREEN} ✓ Created scripts/ directory${NC}" +fi + +if [[ "$WITH_REFS" == true ]]; then + mkdir -p "$TARGET_DIR/references" + + # Create example reference + cat > "$TARGET_DIR/references/detailed-guide.md" << 'REF_EXAMPLE' +# Detailed Guide for SKILL_NAME_PLACEHOLDER + +This file contains detailed documentation that can be referenced on-demand. + +## Advanced Usage + +Detailed instructions go here... + +## Configuration + +Configuration options and examples... + +## Troubleshooting + +Common issues and solutions... +REF_EXAMPLE + + sed -i "s/SKILL_NAME_PLACEHOLDER/$SKILL_NAME/g" "$TARGET_DIR/references/detailed-guide.md" + + echo -e "${GREEN} ✓ Created references/ directory${NC}" +fi + +if [[ "$WITH_ASSETS" == true ]]; then + mkdir -p "$TARGET_DIR/assets" + + # Create example asset + cat > "$TARGET_DIR/assets/template.txt" << 'ASSET_EXAMPLE' +# Template file for SKILL_NAME_PLACEHOLDER + +This is a template file that can be used as a starting point. +Modify it according to your needs. +ASSET_EXAMPLE + + sed -i "s/SKILL_NAME_PLACEHOLDER/$SKILL_NAME/g" "$TARGET_DIR/assets/template.txt" + + echo -e "${GREEN} ✓ Created assets/ directory${NC}" +fi + +if [[ "$WITH_TESTS" == true ]]; then + mkdir -p "$TARGET_DIR/evals" + + # Create template evals.json + cat > "$TARGET_DIR/evals/evals.json" << 'EVALS_TEMPLATE' +{ + "skill_name": "SKILL_NAME_PLACEHOLDER", + "evals": [ + { + "id": 1, + "name": "common-case", + "prompt": "Typical user request with realistic context, file paths, and specific details. This represents the most common way users will ask for help.", + "expected_output": "Description of what the skill should produce for this request", + "assertions": [ + "Check that output contains specific expected content", + "Verify file is created at expected location (if applicable)" + ] + }, + { + "id": 2, + "name": "edge-case", + "prompt": "Unusual or tricky scenario that tests edge cases, error handling, or complex requirements. Include something that might break the skill.", + "expected_output": "Description of expected behavior for this edge case", + "assertions": [ + "Verify edge case is handled appropriately", + "Check for graceful error handling if applicable" + ] + }, + { + "id": 3, + "name": "varied-phrasing", + "prompt": "Same intent as the common case but expressed with different words, casual language, or alternative phrasing. Tests skill robustness to language variation.", + "expected_output": "Same expected output as common case", + "assertions": [ + "Output should match common case results", + "Skill triggers correctly with different phrasing" + ] + } + ] +} +EVALS_TEMPLATE + + sed -i "s/SKILL_NAME_PLACEHOLDER/$SKILL_NAME/g" "$TARGET_DIR/evals/evals.json" + + echo -e "${GREEN} ✓ Created evals/ directory with template evals.json${NC}" +fi + +echo "" +echo -e "${GREEN}✓ Skill '$SKILL_NAME' created successfully!${NC}" +echo "" +echo "Next steps:" +echo " 1. Edit $TARGET_DIR/SKILL.md" +echo " 2. Fill in the description with specific trigger phrases" +echo " 3. Add your skill content" +if [[ "$WITH_REFS" == true ]]; then + echo " 4. Add detailed content to references/ files" +fi +if [[ "$WITH_SCRIPTS" == true ]]; then + echo " 5. Implement scripts in scripts/ directory" +fi +if [[ "$WITH_TESTS" == true ]]; then + echo " 6. Customize test cases in evals/evals.json" + echo " 7. Run tests: ${SKILL_BUILDER_DIR}/scripts/run-tests.sh $TARGET_DIR" +fi +echo "" +echo "Validate your skill:" +echo " ${SKILL_BUILDER_DIR}/scripts/validate-skill.sh $TARGET_DIR" diff --git a/.config/opencode/skills/.skill-builder.disabled/scripts/run-tests.sh b/.config/opencode/skills/.skill-builder.disabled/scripts/run-tests.sh new file mode 100755 index 0000000..a1436c9 --- /dev/null +++ b/.config/opencode/skills/.skill-builder.disabled/scripts/run-tests.sh @@ -0,0 +1,325 @@ +#!/bin/bash +# +# run-tests.sh - Execute test workflow and capture results +# +# Usage: ./run-tests.sh +# +# This script guides you through testing each test case manually and records results. +# + +set -euo pipefail + +# Colors for output +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +BLUE='\033[0;34m' +CYAN='\033[0;36m' +NC='\033[0m' # No Color + +# Parse arguments +SKILL_DIR="" + +usage() { + echo "Usage: $0 " + echo "" + echo "Examples:" + echo " $0 ~/.config/opencode/skills/my-skill" + echo " $0 ./my-skill" + exit 1 +} + +# Parse command line arguments +while [[ $# -gt 0 ]]; do + case $1 in + -h|--help) + usage + ;; + -*) + echo -e "${RED}Error: Unknown option $1${NC}" + usage + ;; + *) + if [[ -z "$SKILL_DIR" ]]; then + SKILL_DIR="$1" + else + echo -e "${RED}Error: Multiple skill directories provided${NC}" + usage + fi + shift + ;; + esac +done + +# Validate skill directory +if [[ -z "$SKILL_DIR" ]]; then + echo -e "${RED}Error: Skill directory is required${NC}" + usage +fi + +# Resolve path +SKILL_DIR="$(cd "$SKILL_DIR" 2>/dev/null && pwd)" || { + echo -e "${RED}Error: Cannot access directory: $SKILL_DIR${NC}" + exit 1 +} + +# Get skill name from directory +SKILL_NAME="$(basename "$SKILL_DIR")" + +# Verify evals.json exists +if [[ ! -f "$SKILL_DIR/evals/evals.json" ]]; then + echo -e "${RED}Error: evals/evals.json not found${NC}" + echo "" + echo "Create test cases first:" + echo " ~/.config/opencode/skills/skill-builder/scripts/create-tests.sh $SKILL_DIR" + exit 1 +fi + +echo "========================================" +echo "Running Tests for: $SKILL_NAME" +echo "========================================" +echo "" + +# Check if jq is available +if ! command -v jq &> /dev/null; then + echo -e "${YELLOW}⚠ jq is not installed${NC}" + echo " Install jq for better JSON handling:" + echo " Ubuntu/Debian: sudo apt-get install jq" + echo " macOS: brew install jq" + echo "" + echo -e "${YELLOW}Falling back to basic parsing...${NC}" + USE_JQ=false +else + USE_JQ=true +fi + +# Initialize results array +RESULTS=() +TEST_COUNT=0 +PASSED=0 +FAILED=0 +SKIPPED=0 + +# Function to extract value from JSON using basic parsing +get_json_value() { + local file="$1" + local key="$2" + local index="${3:-}" + + if [[ "$USE_JQ" == true ]]; then + if [[ -n "$index" ]]; then + jq -r ".evals[$index].$key" "$file" 2>/dev/null || echo "" + else + jq -r ".$key" "$file" 2>/dev/null || echo "" + fi + else + # Basic grep-based extraction (fallback) + if [[ -n "$index" ]]; then + # This is a simplified fallback - won't handle nested structures well + grep -A 100 '"evals":' "$file" | grep -A 20 "\"id\": $index" | grep "\"$key\":" | head -1 | sed 's/.*"'$key'": "\(.*\)".*/\1/' | sed 's/",*$//' + else + grep "\"$key\":" "$file" | head -1 | sed 's/.*"'$key'": "\(.*\)".*/\1/' | sed 's/",*$//' + fi + fi +} + +# Count test cases +if [[ "$USE_JQ" == true ]]; then + TEST_COUNT=$(jq '.evals | length' "$SKILL_DIR/evals/evals.json") +else + TEST_COUNT=$(grep -c '"id":' "$SKILL_DIR/evals/evals.json" 2>/dev/null || echo "0") +fi + +if [[ "$TEST_COUNT" -eq 0 ]]; then + echo -e "${RED}Error: No test cases found in evals.json${NC}" + exit 1 +fi + +echo "Found $TEST_COUNT test case(s)" +echo "" + +# Process each test case +for ((i=0; i "$SKILL_DIR/evals/test-results.json" << EOF +{ + "skill_name": "$SKILL_NAME", + "run_timestamp": "$RUN_TIMESTAMP", + "summary": { + "total": $((${#RESULTS[@]})), + "passed": $PASSED, + "failed": $FAILED, + "skipped": $SKIPPED + }, + "results": $RESULTS_JSON +} +EOF + +echo -e "${GREEN}✓ Saved results to evals/test-results.json${NC}" +echo "" + +# Display summary +echo "========================================" +echo "Test Run Summary" +echo "========================================" +echo "" +echo "Total Tests: $((${#RESULTS[@]}))" +echo -e "${GREEN}Passed: $PASSED${NC}" +echo -e "${RED}Failed: $FAILED${NC}" +echo -e "${YELLOW}Skipped: $SKIPPED${NC}" +echo "" + +# Next steps +if [[ $FAILED -gt 0 ]]; then + echo "========================================" + echo -e "${YELLOW}Next Steps:${NC}" + echo "========================================" + echo "" + echo "Some tests failed. To improve your skill:" + echo "" + echo "1. Review test-results.json for details" + echo "2. Update SKILL.md to address issues" + echo "3. Run tests again:" + echo " $0 $SKILL_DIR" + echo "" + echo "For detailed grading:" + echo " ~/.config/opencode/skills/skill-builder/scripts/grade-output.sh $SKILL_DIR" + echo "" +fi + +echo -e "${GREEN}Done!${NC}" diff --git a/.config/opencode/skills/.skill-builder.disabled/scripts/validate-skill.sh b/.config/opencode/skills/.skill-builder.disabled/scripts/validate-skill.sh new file mode 100755 index 0000000..cbe3479 --- /dev/null +++ b/.config/opencode/skills/.skill-builder.disabled/scripts/validate-skill.sh @@ -0,0 +1,308 @@ +#!/bin/bash +# +# validate-skill.sh - Strictly validate a skill's structure and content +# +# Usage: ./validate-skill.sh [options] +# +# Options: +# --strict Fail on warnings (default) +# --lenient Only fail on errors, report warnings +# --verbose Show detailed output +# + +set -euo pipefail + +# Colors for output +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +BLUE='\033[0;34m' +NC='\033[0m' # No Color + +# Validation counters +ERRORS=0 +WARNINGS=0 + +# Default mode +STRICT=true +VERBOSE=false + +usage() { + echo "Usage: $0 [options]" + echo "" + echo "Options:" + echo " --strict Fail on warnings (default)" + echo " --lenient Only fail on errors" + echo " --verbose Show detailed output" + echo "" + echo "Examples:" + echo " $0 ~/.config/opencode/skills/my-skill" + echo " $0 ./my-skill --verbose" + exit 1 +} + +# Error and warning functions +error() { + echo -e "${RED}✗ ERROR:${NC} $1" + ((ERRORS++)) || true +} + +warn() { + if [[ "$STRICT" == true ]]; then + echo -e "${YELLOW}✗ WARNING (treated as error in strict mode):${NC} $1" + ((ERRORS++)) || true + else + echo -e "${YELLOW}⚠ WARNING:${NC} $1" + ((WARNINGS++)) || true + fi +} + +info() { + if [[ "$VERBOSE" == true ]]; then + echo -e "${BLUE}ℹ${NC} $1" + fi +} + +success() { + echo -e "${GREEN}✓${NC} $1" +} + +# Parse arguments +SKILL_DIR="" + +while [[ $# -gt 0 ]]; do + case $1 in + --strict) + STRICT=true + shift + ;; + --lenient) + STRICT=false + shift + ;; + --verbose) + VERBOSE=true + shift + ;; + -h|--help) + usage + ;; + -*) + echo -e "${RED}Error: Unknown option $1${NC}" + usage + ;; + *) + if [[ -z "$SKILL_DIR" ]]; then + SKILL_DIR="$1" + else + echo -e "${RED}Error: Multiple skill directories provided${NC}" + usage + fi + shift + ;; + esac +done + +# Validate skill directory argument +if [[ -z "$SKILL_DIR" ]]; then + echo -e "${RED}Error: Skill directory is required${NC}" + usage +fi + +# Resolve path +SKILL_DIR="$(cd "$SKILL_DIR" 2>/dev/null && pwd)" || { + echo -e "${RED}Error: Cannot access directory: $SKILL_DIR${NC}" + exit 1 +} + +# Get skill name from directory +SKILL_NAME="$(basename "$SKILL_DIR")" + +echo "========================================" +echo "Validating skill: $SKILL_NAME" +echo "Directory: $SKILL_DIR" +echo "========================================" +echo "" + +# Check 1: Directory exists +if [[ ! -d "$SKILL_DIR" ]]; then + error "Directory does not exist: $SKILL_DIR" + exit 1 +fi +success "Directory exists" + +# Check 2: SKILL.md exists +SKILL_MD="$SKILL_DIR/SKILL.md" +if [[ ! -f "$SKILL_MD" ]]; then + error "SKILL.md file not found in $SKILL_DIR" + error "Skill must contain a SKILL.md file" + exit 1 +fi +success "SKILL.md file exists" + +# Check 3: SKILL.md starts with YAML frontmatter +if ! head -1 "$SKILL_MD" | grep -q '^---\s*$'; then + error "SKILL.md must start with YAML frontmatter (---)" +else + success "SKILL.md starts with YAML frontmatter" +fi + +# Extract frontmatter content +FRONTMATTER=$(sed -n '/^---$/,/^---$/p' "$SKILL_MD" | head -n -1 | tail -n +2) +info "Extracted frontmatter" + +# Check 4: Name field exists and is valid +if ! echo "$FRONTMATTER" | grep -q '^name:'; then + error "Frontmatter missing required field: 'name'" +else + FM_NAME=$(echo "$FRONTMATTER" | grep '^name:' | sed 's/^name:\s*//' | tr -d '"' | tr -d "'") + + if [[ -z "$FM_NAME" ]]; then + error "'name' field is empty" + elif [[ ! "$FM_NAME" =~ ^[a-z0-9-]+$ ]]; then + error "'name' must be lowercase alphanumeric with hyphens: '$FM_NAME'" + elif [[ "$FM_NAME" != "$SKILL_NAME" ]]; then + error "'name' ($FM_NAME) does not match directory name ($SKILL_NAME)" + else + success "'name' field is valid: $FM_NAME" + fi +fi + +# Check 5: Description field exists and is valid +if ! echo "$FRONTMATTER" | grep -q '^description:'; then + error "Frontmatter missing required field: 'description'" +else + FM_DESC=$(echo "$FRONTMATTER" | sed -n '/^description:/,/^\S/{/^description:/p;/^\S/q}' | sed 's/^description:\s*//' | tr -d '"' | tr -d "'" | sed ':a;N;$!ba;s/\n/ /g') + DESC_LEN=${#FM_DESC} + + if [[ -z "$FM_DESC" ]]; then + error "'description' field is empty" + elif [[ $DESC_LEN -lt 20 ]]; then + error "'description' must be at least 20 characters (found $DESC_LEN)" + elif [[ $DESC_LEN -gt 1024 ]]; then + error "'description' must be at most 1024 characters (found $DESC_LEN)" + else + success "'description' field length is valid ($DESC_LEN chars)" + fi + + # Check for common description issues + if echo "$FM_DESC" | grep -qi '^use this skill'; then + warn "Description uses second person ('Use this skill'). Use third person: 'This skill should be used when...'" + fi + + if echo "$FM_DESC" | grep -qi '^this skill provides\|^this skill helps'; then + warn "Description is vague. Include specific trigger phrases users would say" + fi + + # Check for quoted phrases (allow either single or double quotes in original) + if ! echo "$FRONTMATTER" | grep -A 5 '^description:' | grep -q '"\|\"'; then + warn "Description should include specific trigger phrases in quotes, e.g.: \"create X\", \"configure Y\"" + fi +fi + +# Check 6: No unknown frontmatter fields (informational) +KNOWN_FIELDS="^name:\|^description:\|^license:\|^compatibility:\|^metadata:\|^allowed-tools:" +UNKNOWN_FIELDS=$(echo "$FRONTMATTER" | grep -v '^\s*$' | grep -v "$KNOWN_FIELDS" || true) +if [[ -n "$UNKNOWN_FIELDS" ]]; then + info "Unknown frontmatter fields (will be ignored):" + echo "$UNKNOWN_FIELDS" | sed 's/^/ /' +fi + +# Check 7: YAML syntax is valid +if ! echo "$FRONTMATTER" | python3 -c "import yaml, sys; yaml.safe_load(sys.stdin)" 2>/dev/null; then + # Fallback: check with basic pattern matching + if echo "$FRONTMATTER" | grep -q '^\s*:\s*$'; then + error "Invalid YAML syntax: empty key found" + fi + + # Count lines with colons to detect malformed entries + MALFORMED=$(echo "$FRONTMATTER" | grep -c '^\s*[^:]*:\s*$' || true) + if [[ $MALFORMED -gt 0 ]]; then + error "Potential YAML syntax issues detected" + fi +else + success "YAML frontmatter syntax appears valid" +fi + +# Check 8: SKILL.md has content after frontmatter +# Remove everything from line 1 through the second occurrence of --- +BODY_CONTENT=$(sed '0,/^---$/d;0,/^---$/d' "$SKILL_MD") +if [[ -z "$(echo "$BODY_CONTENT" | tr -d '[:space:]')" ]]; then + error "SKILL.md has no content after frontmatter" +else + success "SKILL.md has body content" +fi + +# Check 9: Check for common writing style issues in body +if echo "$BODY_CONTENT" | grep -qE '^You (should|need|can|must)'; then + warn "Body uses second person ('You should...'). Prefer imperative form: 'Do this...'" +fi + +if echo "$BODY_CONTENT" | grep -q '^When to Use This Skill'; then + warn "'When to Use This Skill' section in body is redundant. Include triggers in description field instead" +fi + +# Check 10: Verify referenced files exist +if [[ -d "$SKILL_DIR/references" ]]; then + REF_COUNT=$(find "$SKILL_DIR/references" -type f | wc -l) + success "References directory exists with $REF_COUNT file(s)" + + # Check if references are mentioned in SKILL.md + for ref_file in "$SKILL_DIR/references"/*; do + if [[ -f "$ref_file" ]]; then + ref_name=$(basename "$ref_file") + if ! grep -q "$ref_name" "$SKILL_MD"; then + warn "Reference file '$ref_name' is not mentioned in SKILL.md" + fi + fi + done +fi + +if [[ -d "$SKILL_DIR/scripts" ]]; then + SCRIPT_COUNT=$(find "$SKILL_DIR/scripts" -type f | wc -l) + success "Scripts directory exists with $SCRIPT_COUNT file(s)" + + # Check scripts are executable + for script in "$SKILL_DIR/scripts"/*; do + if [[ -f "$script" ]] && [[ ! -x "$script" ]]; then + warn "Script '$(basename "$script")' is not executable (run: chmod +x)" + fi + done +fi + +if [[ -d "$SKILL_DIR/assets" ]]; then + ASSET_COUNT=$(find "$SKILL_DIR/assets" -type f | wc -l) + success "Assets directory exists with $ASSET_COUNT file(s)" +fi + +# Check 11: File organization +if [[ -f "$SKILL_DIR/README.md" ]]; then + warn "README.md found. Skills should not include README.md files" +fi + +if [[ -f "$SKILL_DIR/CHANGELOG.md" ]]; then + warn "CHANGELOG.md found. Skills should not include auxiliary documentation" +fi + +echo "" +echo "========================================" + +# Final report +if [[ $ERRORS -eq 0 ]]; then + echo -e "${GREEN}✓ VALIDATION PASSED${NC}" + if [[ $WARNINGS -gt 0 ]]; then + echo -e "${YELLOW} $WARNINGS warning(s) (not treated as errors)${NC}" + fi + echo "" + echo "Your skill is ready to use!" + exit 0 +else + echo -e "${RED}✗ VALIDATION FAILED${NC}" + echo -e "${RED} $ERRORS error(s) found${NC}" + if [[ $WARNINGS -gt 0 ]]; then + echo -e "${YELLOW} $WARNINGS warning(s)${NC}" + fi + echo "" + echo "Fix the errors above and run validation again." + exit 1 +fi diff --git a/.config/opencode/skills/agent-browser.disabled/SKILL.md.disabled b/.config/opencode/skills/agent-browser.disabled/SKILL.md.disabled new file mode 100644 index 0000000..73bd784 --- /dev/null +++ b/.config/opencode/skills/agent-browser.disabled/SKILL.md.disabled @@ -0,0 +1,682 @@ +--- +name: agent-browser +description: Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data from a page", "test this web app", "login to a site", "automate browser actions", or any task requiring programmatic web interaction. +allowed-tools: Bash(npx agent-browser:*), Bash(agent-browser:*) +--- + +# Browser Automation with agent-browser + +The CLI uses Chrome/Chromium via CDP directly. Install via `npm i -g agent-browser`, `brew install agent-browser`, or `cargo install agent-browser`. Run `agent-browser install` to download Chrome. Run `agent-browser upgrade` to update to the latest version. + +## Core Workflow + +Every browser automation follows this pattern: + +1. **Navigate**: `agent-browser open ` +2. **Snapshot**: `agent-browser snapshot -i` (get element refs like `@e1`, `@e2`) +3. **Interact**: Use refs to click, fill, select +4. **Re-snapshot**: After navigation or DOM changes, get fresh refs + +```bash +agent-browser open https://example.com/form +agent-browser snapshot -i +# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit" + +agent-browser fill @e1 "user@example.com" +agent-browser fill @e2 "password123" +agent-browser click @e3 +agent-browser wait --load networkidle +agent-browser snapshot -i # Check result +``` + +## Command Chaining + +Commands can be chained with `&&` in a single shell invocation. The browser persists between commands via a background daemon, so chaining is safe and more efficient than separate calls. + +```bash +# Chain open + wait + snapshot in one call +agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser snapshot -i + +# Chain multiple interactions +agent-browser fill @e1 "user@example.com" && agent-browser fill @e2 "password123" && agent-browser click @e3 + +# Navigate and capture +agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser screenshot page.png +``` + +**When to chain:** Use `&&` when you don't need to read the output of an intermediate command before proceeding (e.g., open + wait + screenshot). Run commands separately when you need to parse the output first (e.g., snapshot to discover refs, then interact using those refs). + +## Handling Authentication + +When automating a site that requires login, choose the approach that fits: + +**Option 1: Import auth from the user's browser (fastest for one-off tasks)** + +```bash +# Connect to the user's running Chrome (they're already logged in) +agent-browser --auto-connect state save ./auth.json +# Use that auth state +agent-browser --state ./auth.json open https://app.example.com/dashboard +``` + +State files contain session tokens in plaintext -- add to `.gitignore` and delete when no longer needed. Set `AGENT_BROWSER_ENCRYPTION_KEY` for encryption at rest. + +**Option 2: Persistent profile (simplest for recurring tasks)** + +```bash +# First run: login manually or via automation +agent-browser --profile ~/.myapp open https://app.example.com/login +# ... fill credentials, submit ... + +# All future runs: already authenticated +agent-browser --profile ~/.myapp open https://app.example.com/dashboard +``` + +**Option 3: Session name (auto-save/restore cookies + localStorage)** + +```bash +agent-browser --session-name myapp open https://app.example.com/login +# ... login flow ... +agent-browser close # State auto-saved + +# Next time: state auto-restored +agent-browser --session-name myapp open https://app.example.com/dashboard +``` + +**Option 4: Auth vault (credentials stored encrypted, login by name)** + +```bash +echo "$PASSWORD" | agent-browser auth save myapp --url https://app.example.com/login --username user --password-stdin +agent-browser auth login myapp +``` + +**Option 5: State file (manual save/load)** + +```bash +# After logging in: +agent-browser state save ./auth.json +# In a future session: +agent-browser state load ./auth.json +agent-browser open https://app.example.com/dashboard +``` + +See [references/authentication.md](references/authentication.md) for OAuth, 2FA, cookie-based auth, and token refresh patterns. + +## Essential Commands + +```bash +# Navigation +agent-browser open # Navigate (aliases: goto, navigate) +agent-browser close # Close browser + +# Snapshot +agent-browser snapshot -i # Interactive elements with refs (recommended) +agent-browser snapshot -i -C # Include cursor-interactive elements (divs with onclick, cursor:pointer) +agent-browser snapshot -s "#selector" # Scope to CSS selector + +# Interaction (use @refs from snapshot) +agent-browser click @e1 # Click element +agent-browser click @e1 --new-tab # Click and open in new tab +agent-browser fill @e2 "text" # Clear and type text +agent-browser type @e2 "text" # Type without clearing +agent-browser select @e1 "option" # Select dropdown option +agent-browser check @e1 # Check checkbox +agent-browser press Enter # Press key +agent-browser keyboard type "text" # Type at current focus (no selector) +agent-browser keyboard inserttext "text" # Insert without key events +agent-browser scroll down 500 # Scroll page +agent-browser scroll down 500 --selector "div.content" # Scroll within a specific container + +# Get information +agent-browser get text @e1 # Get element text +agent-browser get url # Get current URL +agent-browser get title # Get page title +agent-browser get cdp-url # Get CDP WebSocket URL + +# Wait +agent-browser wait @e1 # Wait for element +agent-browser wait --load networkidle # Wait for network idle +agent-browser wait --url "**/page" # Wait for URL pattern +agent-browser wait 2000 # Wait milliseconds +agent-browser wait --text "Welcome" # Wait for text to appear (substring match) +agent-browser wait --fn "!document.body.innerText.includes('Loading...')" # Wait for text to disappear +agent-browser wait "#spinner" --state hidden # Wait for element to disappear + +# Downloads +agent-browser download @e1 ./file.pdf # Click element to trigger download +agent-browser wait --download ./output.zip # Wait for any download to complete +agent-browser --download-path ./downloads open # Set default download directory + +# Network +agent-browser network requests # Inspect tracked requests +agent-browser network route "**/api/*" --abort # Block matching requests +agent-browser network har start # Start HAR recording +agent-browser network har stop ./capture.har # Stop and save HAR file + +# Viewport & Device Emulation +agent-browser set viewport 1920 1080 # Set viewport size (default: 1280x720) +agent-browser set viewport 1920 1080 2 # 2x retina (same CSS size, higher res screenshots) +agent-browser set device "iPhone 14" # Emulate device (viewport + user agent) + +# Capture +agent-browser screenshot # Screenshot to temp dir +agent-browser screenshot --full # Full page screenshot +agent-browser screenshot --annotate # Annotated screenshot with numbered element labels +agent-browser screenshot --screenshot-dir ./shots # Save to custom directory +agent-browser screenshot --screenshot-format jpeg --screenshot-quality 80 +agent-browser pdf output.pdf # Save as PDF + +# Clipboard +agent-browser clipboard read # Read text from clipboard +agent-browser clipboard write "Hello, World!" # Write text to clipboard +agent-browser clipboard copy # Copy current selection +agent-browser clipboard paste # Paste from clipboard + +# Diff (compare page states) +agent-browser diff snapshot # Compare current vs last snapshot +agent-browser diff snapshot --baseline before.txt # Compare current vs saved file +agent-browser diff screenshot --baseline before.png # Visual pixel diff +agent-browser diff url # Compare two pages +agent-browser diff url --wait-until networkidle # Custom wait strategy +agent-browser diff url --selector "#main" # Scope to element +``` + +## Batch Execution + +Execute multiple commands in a single invocation by piping a JSON array of string arrays to `batch`. This avoids per-command process startup overhead when running multi-step workflows. + +```bash +echo '[ + ["open", "https://example.com"], + ["snapshot", "-i"], + ["click", "@e1"], + ["screenshot", "result.png"] +]' | agent-browser batch --json + +# Stop on first error +agent-browser batch --bail < commands.json +``` + +Use `batch` when you have a known sequence of commands that don't depend on intermediate output. Use separate commands or `&&` chaining when you need to parse output between steps (e.g., snapshot to discover refs, then interact). + +## Common Patterns + +### Form Submission + +```bash +agent-browser open https://example.com/signup +agent-browser snapshot -i +agent-browser fill @e1 "Jane Doe" +agent-browser fill @e2 "jane@example.com" +agent-browser select @e3 "California" +agent-browser check @e4 +agent-browser click @e5 +agent-browser wait --load networkidle +``` + +### Authentication with Auth Vault (Recommended) + +```bash +# Save credentials once (encrypted with AGENT_BROWSER_ENCRYPTION_KEY) +# Recommended: pipe password via stdin to avoid shell history exposure +echo "pass" | agent-browser auth save github --url https://github.com/login --username user --password-stdin + +# Login using saved profile (LLM never sees password) +agent-browser auth login github + +# List/show/delete profiles +agent-browser auth list +agent-browser auth show github +agent-browser auth delete github +``` + +### Authentication with State Persistence + +```bash +# Login once and save state +agent-browser open https://app.example.com/login +agent-browser snapshot -i +agent-browser fill @e1 "$USERNAME" +agent-browser fill @e2 "$PASSWORD" +agent-browser click @e3 +agent-browser wait --url "**/dashboard" +agent-browser state save auth.json + +# Reuse in future sessions +agent-browser state load auth.json +agent-browser open https://app.example.com/dashboard +``` + +### Session Persistence + +```bash +# Auto-save/restore cookies and localStorage across browser restarts +agent-browser --session-name myapp open https://app.example.com/login +# ... login flow ... +agent-browser close # State auto-saved to ~/.agent-browser/sessions/ + +# Next time, state is auto-loaded +agent-browser --session-name myapp open https://app.example.com/dashboard + +# Encrypt state at rest +export AGENT_BROWSER_ENCRYPTION_KEY=$(openssl rand -hex 32) +agent-browser --session-name secure open https://app.example.com + +# Manage saved states +agent-browser state list +agent-browser state show myapp-default.json +agent-browser state clear myapp +agent-browser state clean --older-than 7 +``` + +### Working with Iframes + +Iframe content is automatically inlined in snapshots. Refs inside iframes carry frame context, so you can interact with them directly. + +```bash +agent-browser open https://example.com/checkout +agent-browser snapshot -i +# @e1 [heading] "Checkout" +# @e2 [Iframe] "payment-frame" +# @e3 [input] "Card number" +# @e4 [input] "Expiry" +# @e5 [button] "Pay" + +# Interact directly — no frame switch needed +agent-browser fill @e3 "4111111111111111" +agent-browser fill @e4 "12/28" +agent-browser click @e5 + +# To scope a snapshot to one iframe: +agent-browser frame @e2 +agent-browser snapshot -i # Only iframe content +agent-browser frame main # Return to main frame +``` + +### Data Extraction + +```bash +agent-browser open https://example.com/products +agent-browser snapshot -i +agent-browser get text @e5 # Get specific element text +agent-browser get text body > page.txt # Get all page text + +# JSON output for parsing +agent-browser snapshot -i --json +agent-browser get text @e1 --json +``` + +### Parallel Sessions + +```bash +agent-browser --session site1 open https://site-a.com +agent-browser --session site2 open https://site-b.com + +agent-browser --session site1 snapshot -i +agent-browser --session site2 snapshot -i + +agent-browser session list +``` + +### Connect to Existing Chrome + +```bash +# Auto-discover running Chrome with remote debugging enabled +agent-browser --auto-connect open https://example.com +agent-browser --auto-connect snapshot + +# Or with explicit CDP port +agent-browser --cdp 9222 snapshot +``` + +Auto-connect discovers Chrome via `DevToolsActivePort`, common debugging ports (9222, 9229), and falls back to a direct WebSocket connection if HTTP-based CDP discovery fails. + +### Color Scheme (Dark Mode) + +```bash +# Persistent dark mode via flag (applies to all pages and new tabs) +agent-browser --color-scheme dark open https://example.com + +# Or via environment variable +AGENT_BROWSER_COLOR_SCHEME=dark agent-browser open https://example.com + +# Or set during session (persists for subsequent commands) +agent-browser set media dark +``` + +### Viewport & Responsive Testing + +```bash +# Set a custom viewport size (default is 1280x720) +agent-browser set viewport 1920 1080 +agent-browser screenshot desktop.png + +# Test mobile-width layout +agent-browser set viewport 375 812 +agent-browser screenshot mobile.png + +# Retina/HiDPI: same CSS layout at 2x pixel density +# Screenshots stay at logical viewport size, but content renders at higher DPI +agent-browser set viewport 1920 1080 2 +agent-browser screenshot retina.png + +# Device emulation (sets viewport + user agent in one step) +agent-browser set device "iPhone 14" +agent-browser screenshot device.png +``` + +The `scale` parameter (3rd argument) sets `window.devicePixelRatio` without changing CSS layout. Use it when testing retina rendering or capturing higher-resolution screenshots. + +### Visual Browser (Debugging) + +```bash +agent-browser --headed open https://example.com +agent-browser highlight @e1 # Highlight element +agent-browser inspect # Open Chrome DevTools for the active page +agent-browser record start demo.webm # Record session +agent-browser profiler start # Start Chrome DevTools profiling +agent-browser profiler stop trace.json # Stop and save profile (path optional) +``` + +Use `AGENT_BROWSER_HEADED=1` to enable headed mode via environment variable. Browser extensions work in both headed and headless mode. + +### Local Files (PDFs, HTML) + +```bash +# Open local files with file:// URLs +agent-browser --allow-file-access open file:///path/to/document.pdf +agent-browser --allow-file-access open file:///path/to/page.html +agent-browser screenshot output.png +``` + +### iOS Simulator (Mobile Safari) + +```bash +# List available iOS simulators +agent-browser device list + +# Launch Safari on a specific device +agent-browser -p ios --device "iPhone 16 Pro" open https://example.com + +# Same workflow as desktop - snapshot, interact, re-snapshot +agent-browser -p ios snapshot -i +agent-browser -p ios tap @e1 # Tap (alias for click) +agent-browser -p ios fill @e2 "text" +agent-browser -p ios swipe up # Mobile-specific gesture + +# Take screenshot +agent-browser -p ios screenshot mobile.png + +# Close session (shuts down simulator) +agent-browser -p ios close +``` + +**Requirements:** macOS with Xcode, Appium (`npm install -g appium && appium driver install xcuitest`) + +**Real devices:** Works with physical iOS devices if pre-configured. Use `--device ""` where UDID is from `xcrun xctrace list devices`. + +## Security + +All security features are opt-in. By default, agent-browser imposes no restrictions on navigation, actions, or output. + +### Content Boundaries (Recommended for AI Agents) + +Enable `--content-boundaries` to wrap page-sourced output in markers that help LLMs distinguish tool output from untrusted page content: + +```bash +export AGENT_BROWSER_CONTENT_BOUNDARIES=1 +agent-browser snapshot +# Output: +# --- AGENT_BROWSER_PAGE_CONTENT nonce= origin=https://example.com --- +# [accessibility tree] +# --- END_AGENT_BROWSER_PAGE_CONTENT nonce= --- +``` + +### Domain Allowlist + +Restrict navigation to trusted domains. Wildcards like `*.example.com` also match the bare domain `example.com`. Sub-resource requests, WebSocket, and EventSource connections to non-allowed domains are also blocked. Include CDN domains your target pages depend on: + +```bash +export AGENT_BROWSER_ALLOWED_DOMAINS="example.com,*.example.com" +agent-browser open https://example.com # OK +agent-browser open https://malicious.com # Blocked +``` + +### Action Policy + +Use a policy file to gate destructive actions: + +```bash +export AGENT_BROWSER_ACTION_POLICY=./policy.json +``` + +Example `policy.json`: + +```json +{ "default": "deny", "allow": ["navigate", "snapshot", "click", "scroll", "wait", "get"] } +``` + +Auth vault operations (`auth login`, etc.) bypass action policy but domain allowlist still applies. + +### Output Limits + +Prevent context flooding from large pages: + +```bash +export AGENT_BROWSER_MAX_OUTPUT=50000 +``` + +## Diffing (Verifying Changes) + +Use `diff snapshot` after performing an action to verify it had the intended effect. This compares the current accessibility tree against the last snapshot taken in the session. + +```bash +# Typical workflow: snapshot -> action -> diff +agent-browser snapshot -i # Take baseline snapshot +agent-browser click @e2 # Perform action +agent-browser diff snapshot # See what changed (auto-compares to last snapshot) +``` + +For visual regression testing or monitoring: + +```bash +# Save a baseline screenshot, then compare later +agent-browser screenshot baseline.png +# ... time passes or changes are made ... +agent-browser diff screenshot --baseline baseline.png + +# Compare staging vs production +agent-browser diff url https://staging.example.com https://prod.example.com --screenshot +``` + +`diff snapshot` output uses `+` for additions and `-` for removals, similar to git diff. `diff screenshot` produces a diff image with changed pixels highlighted in red, plus a mismatch percentage. + +## Timeouts and Slow Pages + +The default timeout is 25 seconds. This can be overridden with the `AGENT_BROWSER_DEFAULT_TIMEOUT` environment variable (value in milliseconds). For slow websites or large pages, use explicit waits instead of relying on the default timeout: + +```bash +# Wait for network activity to settle (best for slow pages) +agent-browser wait --load networkidle + +# Wait for a specific element to appear +agent-browser wait "#content" +agent-browser wait @e1 + +# Wait for a specific URL pattern (useful after redirects) +agent-browser wait --url "**/dashboard" + +# Wait for a JavaScript condition +agent-browser wait --fn "document.readyState === 'complete'" + +# Wait a fixed duration (milliseconds) as a last resort +agent-browser wait 5000 +``` + +When dealing with consistently slow websites, use `wait --load networkidle` after `open` to ensure the page is fully loaded before taking a snapshot. If a specific element is slow to render, wait for it directly with `wait ` or `wait @ref`. + +## Session Management and Cleanup + +When running multiple agents or automations concurrently, always use named sessions to avoid conflicts: + +```bash +# Each agent gets its own isolated session +agent-browser --session agent1 open site-a.com +agent-browser --session agent2 open site-b.com + +# Check active sessions +agent-browser session list +``` + +Always close your browser session when done to avoid leaked processes: + +```bash +agent-browser close # Close default session +agent-browser --session agent1 close # Close specific session +``` + +If a previous session was not closed properly, the daemon may still be running. Use `agent-browser close` to clean it up before starting new work. + +To auto-shutdown the daemon after a period of inactivity (useful for ephemeral/CI environments): + +```bash +AGENT_BROWSER_IDLE_TIMEOUT_MS=60000 agent-browser open example.com +``` + +## Ref Lifecycle (Important) + +Refs (`@e1`, `@e2`, etc.) are invalidated when the page changes. Always re-snapshot after: + +- Clicking links or buttons that navigate +- Form submissions +- Dynamic content loading (dropdowns, modals) + +```bash +agent-browser click @e5 # Navigates to new page +agent-browser snapshot -i # MUST re-snapshot +agent-browser click @e1 # Use new refs +``` + +## Annotated Screenshots (Vision Mode) + +Use `--annotate` to take a screenshot with numbered labels overlaid on interactive elements. Each label `[N]` maps to ref `@eN`. This also caches refs, so you can interact with elements immediately without a separate snapshot. + +```bash +agent-browser screenshot --annotate +# Output includes the image path and a legend: +# [1] @e1 button "Submit" +# [2] @e2 link "Home" +# [3] @e3 textbox "Email" +agent-browser click @e2 # Click using ref from annotated screenshot +``` + +Use annotated screenshots when: + +- The page has unlabeled icon buttons or visual-only elements +- You need to verify visual layout or styling +- Canvas or chart elements are present (invisible to text snapshots) +- You need spatial reasoning about element positions + +## Semantic Locators (Alternative to Refs) + +When refs are unavailable or unreliable, use semantic locators: + +```bash +agent-browser find text "Sign In" click +agent-browser find label "Email" fill "user@test.com" +agent-browser find role button click --name "Submit" +agent-browser find placeholder "Search" type "query" +agent-browser find testid "submit-btn" click +``` + +## JavaScript Evaluation (eval) + +Use `eval` to run JavaScript in the browser context. **Shell quoting can corrupt complex expressions** -- use `--stdin` or `-b` to avoid issues. + +```bash +# Simple expressions work with regular quoting +agent-browser eval 'document.title' +agent-browser eval 'document.querySelectorAll("img").length' + +# Complex JS: use --stdin with heredoc (RECOMMENDED) +agent-browser eval --stdin <<'EVALEOF' +JSON.stringify( + Array.from(document.querySelectorAll("img")) + .filter(i => !i.alt) + .map(i => ({ src: i.src.split("/").pop(), width: i.width })) +) +EVALEOF + +# Alternative: base64 encoding (avoids all shell escaping issues) +agent-browser eval -b "$(echo -n 'Array.from(document.querySelectorAll("a")).map(a => a.href)' | base64)" +``` + +**Why this matters:** When the shell processes your command, inner double quotes, `!` characters (history expansion), backticks, and `$()` can all corrupt the JavaScript before it reaches agent-browser. The `--stdin` and `-b` flags bypass shell interpretation entirely. + +**Rules of thumb:** + +- Single-line, no nested quotes -> regular `eval 'expression'` with single quotes is fine +- Nested quotes, arrow functions, template literals, or multiline -> use `eval --stdin <<'EVALEOF'` +- Programmatic/generated scripts -> use `eval -b` with base64 + +## Configuration File + +Create `agent-browser.json` in the project root for persistent settings: + +```json +{ + "headed": true, + "proxy": "http://localhost:8080", + "profile": "./browser-data" +} +``` + +Priority (lowest to highest): `~/.agent-browser/config.json` < `./agent-browser.json` < env vars < CLI flags. Use `--config ` or `AGENT_BROWSER_CONFIG` env var for a custom config file (exits with error if missing/invalid). All CLI options map to camelCase keys (e.g., `--executable-path` -> `"executablePath"`). Boolean flags accept `true`/`false` values (e.g., `--headed false` overrides config). Extensions from user and project configs are merged, not replaced. + +## Deep-Dive Documentation + +| Reference | When to Use | +| -------------------------------------------------------------------- | --------------------------------------------------------- | +| [references/commands.md](references/commands.md) | Full command reference with all options | +| [references/snapshot-refs.md](references/snapshot-refs.md) | Ref lifecycle, invalidation rules, troubleshooting | +| [references/session-management.md](references/session-management.md) | Parallel sessions, state persistence, concurrent scraping | +| [references/authentication.md](references/authentication.md) | Login flows, OAuth, 2FA handling, state reuse | +| [references/video-recording.md](references/video-recording.md) | Recording workflows for debugging and documentation | +| [references/profiling.md](references/profiling.md) | Chrome DevTools profiling for performance analysis | +| [references/proxy-support.md](references/proxy-support.md) | Proxy configuration, geo-testing, rotating proxies | + +## Browser Engine Selection + +Use `--engine` to choose a local browser engine. The default is `chrome`. + +```bash +# Use Lightpanda (fast headless browser, requires separate install) +agent-browser --engine lightpanda open example.com + +# Via environment variable +export AGENT_BROWSER_ENGINE=lightpanda +agent-browser open example.com + +# With custom binary path +agent-browser --engine lightpanda --executable-path /path/to/lightpanda open example.com +``` + +Supported engines: +- `chrome` (default) -- Chrome/Chromium via CDP +- `lightpanda` -- Lightpanda headless browser via CDP (10x faster, 10x less memory than Chrome) + +Lightpanda does not support `--extension`, `--profile`, `--state`, or `--allow-file-access`. Install Lightpanda from https://lightpanda.io/docs/open-source/installation. + +## Ready-to-Use Templates + +| Template | Description | +| ------------------------------------------------------------------------ | ----------------------------------- | +| [templates/form-automation.sh](templates/form-automation.sh) | Form filling with validation | +| [templates/authenticated-session.sh](templates/authenticated-session.sh) | Login once, reuse state | +| [templates/capture-workflow.sh](templates/capture-workflow.sh) | Content extraction with screenshots | + +```bash +./templates/form-automation.sh https://example.com/form +./templates/authenticated-session.sh https://app.example.com/login +./templates/capture-workflow.sh https://example.com ./output +``` diff --git a/.config/opencode/skills/agent-browser.disabled/references/authentication.md b/.config/opencode/skills/agent-browser.disabled/references/authentication.md new file mode 100644 index 0000000..89f4788 --- /dev/null +++ b/.config/opencode/skills/agent-browser.disabled/references/authentication.md @@ -0,0 +1,303 @@ +# Authentication Patterns + +Login flows, session persistence, OAuth, 2FA, and authenticated browsing. + +**Related**: [session-management.md](session-management.md) for state persistence details, [SKILL.md](../SKILL.md) for quick start. + +## Contents + +- [Import Auth from Your Browser](#import-auth-from-your-browser) +- [Persistent Profiles](#persistent-profiles) +- [Session Persistence](#session-persistence) +- [Basic Login Flow](#basic-login-flow) +- [Saving Authentication State](#saving-authentication-state) +- [Restoring Authentication](#restoring-authentication) +- [OAuth / SSO Flows](#oauth--sso-flows) +- [Two-Factor Authentication](#two-factor-authentication) +- [HTTP Basic Auth](#http-basic-auth) +- [Cookie-Based Auth](#cookie-based-auth) +- [Token Refresh Handling](#token-refresh-handling) +- [Security Best Practices](#security-best-practices) + +## Import Auth from Your Browser + +The fastest way to authenticate is to reuse cookies from a Chrome session you are already logged into. + +**Step 1: Start Chrome with remote debugging** + +```bash +# macOS +"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" --remote-debugging-port=9222 + +# Linux +google-chrome --remote-debugging-port=9222 + +# Windows +"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222 +``` + +Log in to your target site(s) in this Chrome window as you normally would. + +> **Security note:** `--remote-debugging-port` exposes full browser control on localhost. Any local process can connect and read cookies, execute JS, etc. Only use on trusted machines and close Chrome when done. + +**Step 2: Grab the auth state** + +```bash +# Auto-discover the running Chrome and save its cookies + localStorage +agent-browser --auto-connect state save ./my-auth.json +``` + +**Step 3: Reuse in automation** + +```bash +# Load auth at launch +agent-browser --state ./my-auth.json open https://app.example.com/dashboard + +# Or load into an existing session +agent-browser state load ./my-auth.json +agent-browser open https://app.example.com/dashboard +``` + +This works for any site, including those with complex OAuth flows, SSO, or 2FA -- as long as Chrome already has valid session cookies. + +> **Security note:** State files contain session tokens in plaintext. Add them to `.gitignore`, delete when no longer needed, and set `AGENT_BROWSER_ENCRYPTION_KEY` for encryption at rest. See [Security Best Practices](#security-best-practices). + +**Tip:** Combine with `--session-name` so the imported auth auto-persists across restarts: + +```bash +agent-browser --session-name myapp state load ./my-auth.json +# From now on, state is auto-saved/restored for "myapp" +``` + +## Persistent Profiles + +Use `--profile` to point agent-browser at a Chrome user data directory. This persists everything (cookies, IndexedDB, service workers, cache) across browser restarts without explicit save/load: + +```bash +# First run: login once +agent-browser --profile ~/.myapp-profile open https://app.example.com/login +# ... complete login flow ... + +# All subsequent runs: already authenticated +agent-browser --profile ~/.myapp-profile open https://app.example.com/dashboard +``` + +Use different paths for different projects or test users: + +```bash +agent-browser --profile ~/.profiles/admin open https://app.example.com +agent-browser --profile ~/.profiles/viewer open https://app.example.com +``` + +Or set via environment variable: + +```bash +export AGENT_BROWSER_PROFILE=~/.myapp-profile +agent-browser open https://app.example.com/dashboard +``` + +## Session Persistence + +Use `--session-name` to auto-save and restore cookies + localStorage by name, without managing files: + +```bash +# Auto-saves state on close, auto-restores on next launch +agent-browser --session-name twitter open https://twitter.com +# ... login flow ... +agent-browser close # state saved to ~/.agent-browser/sessions/ + +# Next time: state is automatically restored +agent-browser --session-name twitter open https://twitter.com +``` + +Encrypt state at rest: + +```bash +export AGENT_BROWSER_ENCRYPTION_KEY=$(openssl rand -hex 32) +agent-browser --session-name secure open https://app.example.com +``` + +## Basic Login Flow + +```bash +# Navigate to login page +agent-browser open https://app.example.com/login +agent-browser wait --load networkidle + +# Get form elements +agent-browser snapshot -i +# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Sign In" + +# Fill credentials +agent-browser fill @e1 "user@example.com" +agent-browser fill @e2 "password123" + +# Submit +agent-browser click @e3 +agent-browser wait --load networkidle + +# Verify login succeeded +agent-browser get url # Should be dashboard, not login +``` + +## Saving Authentication State + +After logging in, save state for reuse: + +```bash +# Login first (see above) +agent-browser open https://app.example.com/login +agent-browser snapshot -i +agent-browser fill @e1 "user@example.com" +agent-browser fill @e2 "password123" +agent-browser click @e3 +agent-browser wait --url "**/dashboard" + +# Save authenticated state +agent-browser state save ./auth-state.json +``` + +## Restoring Authentication + +Skip login by loading saved state: + +```bash +# Load saved auth state +agent-browser state load ./auth-state.json + +# Navigate directly to protected page +agent-browser open https://app.example.com/dashboard + +# Verify authenticated +agent-browser snapshot -i +``` + +## OAuth / SSO Flows + +For OAuth redirects: + +```bash +# Start OAuth flow +agent-browser open https://app.example.com/auth/google + +# Handle redirects automatically +agent-browser wait --url "**/accounts.google.com**" +agent-browser snapshot -i + +# Fill Google credentials +agent-browser fill @e1 "user@gmail.com" +agent-browser click @e2 # Next button +agent-browser wait 2000 +agent-browser snapshot -i +agent-browser fill @e3 "password" +agent-browser click @e4 # Sign in + +# Wait for redirect back +agent-browser wait --url "**/app.example.com**" +agent-browser state save ./oauth-state.json +``` + +## Two-Factor Authentication + +Handle 2FA with manual intervention: + +```bash +# Login with credentials +agent-browser open https://app.example.com/login --headed # Show browser +agent-browser snapshot -i +agent-browser fill @e1 "user@example.com" +agent-browser fill @e2 "password123" +agent-browser click @e3 + +# Wait for user to complete 2FA manually +echo "Complete 2FA in the browser window..." +agent-browser wait --url "**/dashboard" --timeout 120000 + +# Save state after 2FA +agent-browser state save ./2fa-state.json +``` + +## HTTP Basic Auth + +For sites using HTTP Basic Authentication: + +```bash +# Set credentials before navigation +agent-browser set credentials username password + +# Navigate to protected resource +agent-browser open https://protected.example.com/api +``` + +## Cookie-Based Auth + +Manually set authentication cookies: + +```bash +# Set auth cookie +agent-browser cookies set session_token "abc123xyz" + +# Navigate to protected page +agent-browser open https://app.example.com/dashboard +``` + +## Token Refresh Handling + +For sessions with expiring tokens: + +```bash +#!/bin/bash +# Wrapper that handles token refresh + +STATE_FILE="./auth-state.json" + +# Try loading existing state +if [[ -f "$STATE_FILE" ]]; then + agent-browser state load "$STATE_FILE" + agent-browser open https://app.example.com/dashboard + + # Check if session is still valid + URL=$(agent-browser get url) + if [[ "$URL" == *"/login"* ]]; then + echo "Session expired, re-authenticating..." + # Perform fresh login + agent-browser snapshot -i + agent-browser fill @e1 "$USERNAME" + agent-browser fill @e2 "$PASSWORD" + agent-browser click @e3 + agent-browser wait --url "**/dashboard" + agent-browser state save "$STATE_FILE" + fi +else + # First-time login + agent-browser open https://app.example.com/login + # ... login flow ... +fi +``` + +## Security Best Practices + +1. **Never commit state files** - They contain session tokens + ```bash + echo "*.auth-state.json" >> .gitignore + ``` + +2. **Use environment variables for credentials** + ```bash + agent-browser fill @e1 "$APP_USERNAME" + agent-browser fill @e2 "$APP_PASSWORD" + ``` + +3. **Clean up after automation** + ```bash + agent-browser cookies clear + rm -f ./auth-state.json + ``` + +4. **Use short-lived sessions for CI/CD** + ```bash + # Don't persist state in CI + agent-browser open https://app.example.com/login + # ... login and perform actions ... + agent-browser close # Session ends, nothing persisted + ``` diff --git a/.config/opencode/skills/agent-browser.disabled/references/commands.md b/.config/opencode/skills/agent-browser.disabled/references/commands.md new file mode 100644 index 0000000..46de5f1 --- /dev/null +++ b/.config/opencode/skills/agent-browser.disabled/references/commands.md @@ -0,0 +1,292 @@ +# Command Reference + +Complete reference for all agent-browser commands. For quick start and common patterns, see SKILL.md. + +## Navigation + +```bash +agent-browser open # Navigate to URL (aliases: goto, navigate) + # Supports: https://, http://, file://, about:, data:// + # Auto-prepends https:// if no protocol given +agent-browser back # Go back +agent-browser forward # Go forward +agent-browser reload # Reload page +agent-browser close # Close browser (aliases: quit, exit) +agent-browser connect 9222 # Connect to browser via CDP port +``` + +## Snapshot (page analysis) + +```bash +agent-browser snapshot # Full accessibility tree +agent-browser snapshot -i # Interactive elements only (recommended) +agent-browser snapshot -c # Compact output +agent-browser snapshot -d 3 # Limit depth to 3 +agent-browser snapshot -s "#main" # Scope to CSS selector +``` + +## Interactions (use @refs from snapshot) + +```bash +agent-browser click @e1 # Click +agent-browser click @e1 --new-tab # Click and open in new tab +agent-browser dblclick @e1 # Double-click +agent-browser focus @e1 # Focus element +agent-browser fill @e2 "text" # Clear and type +agent-browser type @e2 "text" # Type without clearing +agent-browser press Enter # Press key (alias: key) +agent-browser press Control+a # Key combination +agent-browser keydown Shift # Hold key down +agent-browser keyup Shift # Release key +agent-browser hover @e1 # Hover +agent-browser check @e1 # Check checkbox +agent-browser uncheck @e1 # Uncheck checkbox +agent-browser select @e1 "value" # Select dropdown option +agent-browser select @e1 "a" "b" # Select multiple options +agent-browser scroll down 500 # Scroll page (default: down 300px) +agent-browser scrollintoview @e1 # Scroll element into view (alias: scrollinto) +agent-browser drag @e1 @e2 # Drag and drop +agent-browser upload @e1 file.pdf # Upload files +``` + +## Get Information + +```bash +agent-browser get text @e1 # Get element text +agent-browser get html @e1 # Get innerHTML +agent-browser get value @e1 # Get input value +agent-browser get attr @e1 href # Get attribute +agent-browser get title # Get page title +agent-browser get url # Get current URL +agent-browser get cdp-url # Get CDP WebSocket URL +agent-browser get count ".item" # Count matching elements +agent-browser get box @e1 # Get bounding box +agent-browser get styles @e1 # Get computed styles (font, color, bg, etc.) +``` + +## Check State + +```bash +agent-browser is visible @e1 # Check if visible +agent-browser is enabled @e1 # Check if enabled +agent-browser is checked @e1 # Check if checked +``` + +## Screenshots and PDF + +```bash +agent-browser screenshot # Save to temporary directory +agent-browser screenshot path.png # Save to specific path +agent-browser screenshot --full # Full page +agent-browser pdf output.pdf # Save as PDF +``` + +## Video Recording + +```bash +agent-browser record start ./demo.webm # Start recording +agent-browser click @e1 # Perform actions +agent-browser record stop # Stop and save video +agent-browser record restart ./take2.webm # Stop current + start new +``` + +## Wait + +```bash +agent-browser wait @e1 # Wait for element +agent-browser wait 2000 # Wait milliseconds +agent-browser wait --text "Success" # Wait for text (or -t) +agent-browser wait --url "**/dashboard" # Wait for URL pattern (or -u) +agent-browser wait --load networkidle # Wait for network idle (or -l) +agent-browser wait --fn "window.ready" # Wait for JS condition (or -f) +``` + +## Mouse Control + +```bash +agent-browser mouse move 100 200 # Move mouse +agent-browser mouse down left # Press button +agent-browser mouse up left # Release button +agent-browser mouse wheel 100 # Scroll wheel +``` + +## Semantic Locators (alternative to refs) + +```bash +agent-browser find role button click --name "Submit" +agent-browser find text "Sign In" click +agent-browser find text "Sign In" click --exact # Exact match only +agent-browser find label "Email" fill "user@test.com" +agent-browser find placeholder "Search" type "query" +agent-browser find alt "Logo" click +agent-browser find title "Close" click +agent-browser find testid "submit-btn" click +agent-browser find first ".item" click +agent-browser find last ".item" click +agent-browser find nth 2 "a" hover +``` + +## Browser Settings + +```bash +agent-browser set viewport 1920 1080 # Set viewport size +agent-browser set viewport 1920 1080 2 # 2x retina (same CSS size, higher res screenshots) +agent-browser set device "iPhone 14" # Emulate device +agent-browser set geo 37.7749 -122.4194 # Set geolocation (alias: geolocation) +agent-browser set offline on # Toggle offline mode +agent-browser set headers '{"X-Key":"v"}' # Extra HTTP headers +agent-browser set credentials user pass # HTTP basic auth (alias: auth) +agent-browser set media dark # Emulate color scheme +agent-browser set media light reduced-motion # Light mode + reduced motion +``` + +## Cookies and Storage + +```bash +agent-browser cookies # Get all cookies +agent-browser cookies set name value # Set cookie +agent-browser cookies clear # Clear cookies +agent-browser storage local # Get all localStorage +agent-browser storage local key # Get specific key +agent-browser storage local set k v # Set value +agent-browser storage local clear # Clear all +``` + +## Network + +```bash +agent-browser network route # Intercept requests +agent-browser network route --abort # Block requests +agent-browser network route --body '{}' # Mock response +agent-browser network unroute [url] # Remove routes +agent-browser network requests # View tracked requests +agent-browser network requests --filter api # Filter requests +``` + +## Tabs and Windows + +```bash +agent-browser tab # List tabs +agent-browser tab new [url] # New tab +agent-browser tab 2 # Switch to tab by index +agent-browser tab close # Close current tab +agent-browser tab close 2 # Close tab by index +agent-browser window new # New window +``` + +## Frames + +```bash +agent-browser frame "#iframe" # Switch to iframe by CSS selector +agent-browser frame @e3 # Switch to iframe by element ref +agent-browser frame main # Back to main frame +``` + +### Iframe support + +Iframes are detected automatically during snapshots. When the main-frame snapshot runs, `Iframe` nodes are resolved and their content is inlined beneath the iframe element in the output (one level of nesting; iframes within iframes are not expanded). + +```bash +agent-browser snapshot -i +# @e3 [Iframe] "payment-frame" +# @e4 [input] "Card number" +# @e5 [button] "Pay" + +# Interact directly — refs inside iframes already work +agent-browser fill @e4 "4111111111111111" +agent-browser click @e5 + +# Or switch frame context for scoped snapshots +agent-browser frame @e3 # Switch using element ref +agent-browser snapshot -i # Snapshot scoped to that iframe +agent-browser frame main # Return to main frame +``` + +The `frame` command accepts: +- **Element refs** — `frame @e3` resolves the ref to an iframe element +- **CSS selectors** — `frame "#payment-iframe"` finds the iframe by selector +- **Frame name/URL** — matches against the browser's frame tree + +## Dialogs + +```bash +agent-browser dialog accept [text] # Accept dialog +agent-browser dialog dismiss # Dismiss dialog +``` + +## JavaScript + +```bash +agent-browser eval "document.title" # Simple expressions only +agent-browser eval -b "" # Any JavaScript (base64 encoded) +agent-browser eval --stdin # Read script from stdin +``` + +Use `-b`/`--base64` or `--stdin` for reliable execution. Shell escaping with nested quotes and special characters is error-prone. + +```bash +# Base64 encode your script, then: +agent-browser eval -b "ZG9jdW1lbnQucXVlcnlTZWxlY3RvcignW3NyYyo9Il9uZXh0Il0nKQ==" + +# Or use stdin with heredoc for multiline scripts: +cat <<'EOF' | agent-browser eval --stdin +const links = document.querySelectorAll('a'); +Array.from(links).map(a => a.href); +EOF +``` + +## State Management + +```bash +agent-browser state save auth.json # Save cookies, storage, auth state +agent-browser state load auth.json # Restore saved state +``` + +## Global Options + +```bash +agent-browser --session ... # Isolated browser session +agent-browser --json ... # JSON output for parsing +agent-browser --headed ... # Show browser window (not headless) +agent-browser --full ... # Full page screenshot (-f) +agent-browser --cdp ... # Connect via Chrome DevTools Protocol +agent-browser -p ... # Cloud browser provider (--provider) +agent-browser --proxy ... # Use proxy server +agent-browser --proxy-bypass # Hosts to bypass proxy +agent-browser --headers ... # HTTP headers scoped to URL's origin +agent-browser --executable-path

# Custom browser executable +agent-browser --extension ... # Load browser extension (repeatable) +agent-browser --ignore-https-errors # Ignore SSL certificate errors +agent-browser --help # Show help (-h) +agent-browser --version # Show version (-V) +agent-browser --help # Show detailed help for a command +``` + +## Debugging + +```bash +agent-browser --headed open example.com # Show browser window +agent-browser --cdp 9222 snapshot # Connect via CDP port +agent-browser connect 9222 # Alternative: connect command +agent-browser console # View console messages +agent-browser console --clear # Clear console +agent-browser errors # View page errors +agent-browser errors --clear # Clear errors +agent-browser highlight @e1 # Highlight element +agent-browser inspect # Open Chrome DevTools for this session +agent-browser trace start # Start recording trace +agent-browser trace stop trace.zip # Stop and save trace +agent-browser profiler start # Start Chrome DevTools profiling +agent-browser profiler stop trace.json # Stop and save profile +``` + +## Environment Variables + +```bash +AGENT_BROWSER_SESSION="mysession" # Default session name +AGENT_BROWSER_EXECUTABLE_PATH="/path/chrome" # Custom browser path +AGENT_BROWSER_EXTENSIONS="/ext1,/ext2" # Comma-separated extension paths +AGENT_BROWSER_PROVIDER="browserbase" # Cloud browser provider +AGENT_BROWSER_STREAM_PORT="9223" # WebSocket streaming port +AGENT_BROWSER_HOME="/path/to/agent-browser" # Custom install location +``` diff --git a/.config/opencode/skills/agent-browser.disabled/references/profiling.md b/.config/opencode/skills/agent-browser.disabled/references/profiling.md new file mode 100644 index 0000000..bd47eaa --- /dev/null +++ b/.config/opencode/skills/agent-browser.disabled/references/profiling.md @@ -0,0 +1,120 @@ +# Profiling + +Capture Chrome DevTools performance profiles during browser automation for performance analysis. + +**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start. + +## Contents + +- [Basic Profiling](#basic-profiling) +- [Profiler Commands](#profiler-commands) +- [Categories](#categories) +- [Use Cases](#use-cases) +- [Output Format](#output-format) +- [Viewing Profiles](#viewing-profiles) +- [Limitations](#limitations) + +## Basic Profiling + +```bash +# Start profiling +agent-browser profiler start + +# Perform actions +agent-browser navigate https://example.com +agent-browser click "#button" +agent-browser wait 1000 + +# Stop and save +agent-browser profiler stop ./trace.json +``` + +## Profiler Commands + +```bash +# Start profiling with default categories +agent-browser profiler start + +# Start with custom trace categories +agent-browser profiler start --categories "devtools.timeline,v8.execute,blink.user_timing" + +# Stop profiling and save to file +agent-browser profiler stop ./trace.json +``` + +## Categories + +The `--categories` flag accepts a comma-separated list of Chrome trace categories. Default categories include: + +- `devtools.timeline` -- standard DevTools performance traces +- `v8.execute` -- time spent running JavaScript +- `blink` -- renderer events +- `blink.user_timing` -- `performance.mark()` / `performance.measure()` calls +- `latencyInfo` -- input-to-latency tracking +- `renderer.scheduler` -- task scheduling and execution +- `toplevel` -- broad-spectrum basic events + +Several `disabled-by-default-*` categories are also included for detailed timeline, call stack, and V8 CPU profiling data. + +## Use Cases + +### Diagnosing Slow Page Loads + +```bash +agent-browser profiler start +agent-browser navigate https://app.example.com +agent-browser wait --load networkidle +agent-browser profiler stop ./page-load-profile.json +``` + +### Profiling User Interactions + +```bash +agent-browser navigate https://app.example.com +agent-browser profiler start +agent-browser click "#submit" +agent-browser wait 2000 +agent-browser profiler stop ./interaction-profile.json +``` + +### CI Performance Regression Checks + +```bash +#!/bin/bash +agent-browser profiler start +agent-browser navigate https://app.example.com +agent-browser wait --load networkidle +agent-browser profiler stop "./profiles/build-${BUILD_ID}.json" +``` + +## Output Format + +The output is a JSON file in Chrome Trace Event format: + +```json +{ + "traceEvents": [ + { "cat": "devtools.timeline", "name": "RunTask", "ph": "X", "ts": 12345, "dur": 100, ... }, + ... + ], + "metadata": { + "clock-domain": "LINUX_CLOCK_MONOTONIC" + } +} +``` + +The `metadata.clock-domain` field is set based on the host platform (Linux or macOS). On Windows it is omitted. + +## Viewing Profiles + +Load the output JSON file in any of these tools: + +- **Chrome DevTools**: Performance panel > Load profile (Ctrl+Shift+I > Performance) +- **Perfetto UI**: https://ui.perfetto.dev/ -- drag and drop the JSON file +- **Trace Viewer**: `chrome://tracing` in any Chromium browser + +## Limitations + +- Only works with Chromium-based browsers (Chrome, Edge). Not supported on Firefox or WebKit. +- Trace data accumulates in memory while profiling is active (capped at 5 million events). Stop profiling promptly after the area of interest. +- Data collection on stop has a 30-second timeout. If the browser is unresponsive, the stop command may fail. diff --git a/.config/opencode/skills/agent-browser.disabled/references/proxy-support.md b/.config/opencode/skills/agent-browser.disabled/references/proxy-support.md new file mode 100644 index 0000000..e86a8fe --- /dev/null +++ b/.config/opencode/skills/agent-browser.disabled/references/proxy-support.md @@ -0,0 +1,194 @@ +# Proxy Support + +Proxy configuration for geo-testing, rate limiting avoidance, and corporate environments. + +**Related**: [commands.md](commands.md) for global options, [SKILL.md](../SKILL.md) for quick start. + +## Contents + +- [Basic Proxy Configuration](#basic-proxy-configuration) +- [Authenticated Proxy](#authenticated-proxy) +- [SOCKS Proxy](#socks-proxy) +- [Proxy Bypass](#proxy-bypass) +- [Common Use Cases](#common-use-cases) +- [Verifying Proxy Connection](#verifying-proxy-connection) +- [Troubleshooting](#troubleshooting) +- [Best Practices](#best-practices) + +## Basic Proxy Configuration + +Use the `--proxy` flag or set proxy via environment variable: + +```bash +# Via CLI flag +agent-browser --proxy "http://proxy.example.com:8080" open https://example.com + +# Via environment variable +export HTTP_PROXY="http://proxy.example.com:8080" +agent-browser open https://example.com + +# HTTPS proxy +export HTTPS_PROXY="https://proxy.example.com:8080" +agent-browser open https://example.com + +# Both +export HTTP_PROXY="http://proxy.example.com:8080" +export HTTPS_PROXY="http://proxy.example.com:8080" +agent-browser open https://example.com +``` + +## Authenticated Proxy + +For proxies requiring authentication: + +```bash +# Include credentials in URL +export HTTP_PROXY="http://username:password@proxy.example.com:8080" +agent-browser open https://example.com +``` + +## SOCKS Proxy + +```bash +# SOCKS5 proxy +export ALL_PROXY="socks5://proxy.example.com:1080" +agent-browser open https://example.com + +# SOCKS5 with auth +export ALL_PROXY="socks5://user:pass@proxy.example.com:1080" +agent-browser open https://example.com +``` + +## Proxy Bypass + +Skip proxy for specific domains using `--proxy-bypass` or `NO_PROXY`: + +```bash +# Via CLI flag +agent-browser --proxy "http://proxy.example.com:8080" --proxy-bypass "localhost,*.internal.com" open https://example.com + +# Via environment variable +export NO_PROXY="localhost,127.0.0.1,.internal.company.com" +agent-browser open https://internal.company.com # Direct connection +agent-browser open https://external.com # Via proxy +``` + +## Common Use Cases + +### Geo-Location Testing + +```bash +#!/bin/bash +# Test site from different regions using geo-located proxies + +PROXIES=( + "http://us-proxy.example.com:8080" + "http://eu-proxy.example.com:8080" + "http://asia-proxy.example.com:8080" +) + +for proxy in "${PROXIES[@]}"; do + export HTTP_PROXY="$proxy" + export HTTPS_PROXY="$proxy" + + region=$(echo "$proxy" | grep -oP '^\w+-\w+') + echo "Testing from: $region" + + agent-browser --session "$region" open https://example.com + agent-browser --session "$region" screenshot "./screenshots/$region.png" + agent-browser --session "$region" close +done +``` + +### Rotating Proxies for Scraping + +```bash +#!/bin/bash +# Rotate through proxy list to avoid rate limiting + +PROXY_LIST=( + "http://proxy1.example.com:8080" + "http://proxy2.example.com:8080" + "http://proxy3.example.com:8080" +) + +URLS=( + "https://site.com/page1" + "https://site.com/page2" + "https://site.com/page3" +) + +for i in "${!URLS[@]}"; do + proxy_index=$((i % ${#PROXY_LIST[@]})) + export HTTP_PROXY="${PROXY_LIST[$proxy_index]}" + export HTTPS_PROXY="${PROXY_LIST[$proxy_index]}" + + agent-browser open "${URLS[$i]}" + agent-browser get text body > "output-$i.txt" + agent-browser close + + sleep 1 # Polite delay +done +``` + +### Corporate Network Access + +```bash +#!/bin/bash +# Access internal sites via corporate proxy + +export HTTP_PROXY="http://corpproxy.company.com:8080" +export HTTPS_PROXY="http://corpproxy.company.com:8080" +export NO_PROXY="localhost,127.0.0.1,.company.com" + +# External sites go through proxy +agent-browser open https://external-vendor.com + +# Internal sites bypass proxy +agent-browser open https://intranet.company.com +``` + +## Verifying Proxy Connection + +```bash +# Check your apparent IP +agent-browser open https://httpbin.org/ip +agent-browser get text body +# Should show proxy's IP, not your real IP +``` + +## Troubleshooting + +### Proxy Connection Failed + +```bash +# Test proxy connectivity first +curl -x http://proxy.example.com:8080 https://httpbin.org/ip + +# Check if proxy requires auth +export HTTP_PROXY="http://user:pass@proxy.example.com:8080" +``` + +### SSL/TLS Errors Through Proxy + +Some proxies perform SSL inspection. If you encounter certificate errors: + +```bash +# For testing only - not recommended for production +agent-browser open https://example.com --ignore-https-errors +``` + +### Slow Performance + +```bash +# Use proxy only when necessary +export NO_PROXY="*.cdn.com,*.static.com" # Direct CDN access +``` + +## Best Practices + +1. **Use environment variables** - Don't hardcode proxy credentials +2. **Set NO_PROXY appropriately** - Avoid routing local traffic through proxy +3. **Test proxy before automation** - Verify connectivity with simple requests +4. **Handle proxy failures gracefully** - Implement retry logic for unstable proxies +5. **Rotate proxies for large scraping jobs** - Distribute load and avoid bans diff --git a/.config/opencode/skills/agent-browser.disabled/references/session-management.md b/.config/opencode/skills/agent-browser.disabled/references/session-management.md new file mode 100644 index 0000000..bb5312d --- /dev/null +++ b/.config/opencode/skills/agent-browser.disabled/references/session-management.md @@ -0,0 +1,193 @@ +# Session Management + +Multiple isolated browser sessions with state persistence and concurrent browsing. + +**Related**: [authentication.md](authentication.md) for login patterns, [SKILL.md](../SKILL.md) for quick start. + +## Contents + +- [Named Sessions](#named-sessions) +- [Session Isolation Properties](#session-isolation-properties) +- [Session State Persistence](#session-state-persistence) +- [Common Patterns](#common-patterns) +- [Default Session](#default-session) +- [Session Cleanup](#session-cleanup) +- [Best Practices](#best-practices) + +## Named Sessions + +Use `--session` flag to isolate browser contexts: + +```bash +# Session 1: Authentication flow +agent-browser --session auth open https://app.example.com/login + +# Session 2: Public browsing (separate cookies, storage) +agent-browser --session public open https://example.com + +# Commands are isolated by session +agent-browser --session auth fill @e1 "user@example.com" +agent-browser --session public get text body +``` + +## Session Isolation Properties + +Each session has independent: +- Cookies +- LocalStorage / SessionStorage +- IndexedDB +- Cache +- Browsing history +- Open tabs + +## Session State Persistence + +### Save Session State + +```bash +# Save cookies, storage, and auth state +agent-browser state save /path/to/auth-state.json +``` + +### Load Session State + +```bash +# Restore saved state +agent-browser state load /path/to/auth-state.json + +# Continue with authenticated session +agent-browser open https://app.example.com/dashboard +``` + +### State File Contents + +```json +{ + "cookies": [...], + "localStorage": {...}, + "sessionStorage": {...}, + "origins": [...] +} +``` + +## Common Patterns + +### Authenticated Session Reuse + +```bash +#!/bin/bash +# Save login state once, reuse many times + +STATE_FILE="/tmp/auth-state.json" + +# Check if we have saved state +if [[ -f "$STATE_FILE" ]]; then + agent-browser state load "$STATE_FILE" + agent-browser open https://app.example.com/dashboard +else + # Perform login + agent-browser open https://app.example.com/login + agent-browser snapshot -i + agent-browser fill @e1 "$USERNAME" + agent-browser fill @e2 "$PASSWORD" + agent-browser click @e3 + agent-browser wait --load networkidle + + # Save for future use + agent-browser state save "$STATE_FILE" +fi +``` + +### Concurrent Scraping + +```bash +#!/bin/bash +# Scrape multiple sites concurrently + +# Start all sessions +agent-browser --session site1 open https://site1.com & +agent-browser --session site2 open https://site2.com & +agent-browser --session site3 open https://site3.com & +wait + +# Extract from each +agent-browser --session site1 get text body > site1.txt +agent-browser --session site2 get text body > site2.txt +agent-browser --session site3 get text body > site3.txt + +# Cleanup +agent-browser --session site1 close +agent-browser --session site2 close +agent-browser --session site3 close +``` + +### A/B Testing Sessions + +```bash +# Test different user experiences +agent-browser --session variant-a open "https://app.com?variant=a" +agent-browser --session variant-b open "https://app.com?variant=b" + +# Compare +agent-browser --session variant-a screenshot /tmp/variant-a.png +agent-browser --session variant-b screenshot /tmp/variant-b.png +``` + +## Default Session + +When `--session` is omitted, commands use the default session: + +```bash +# These use the same default session +agent-browser open https://example.com +agent-browser snapshot -i +agent-browser close # Closes default session +``` + +## Session Cleanup + +```bash +# Close specific session +agent-browser --session auth close + +# List active sessions +agent-browser session list +``` + +## Best Practices + +### 1. Name Sessions Semantically + +```bash +# GOOD: Clear purpose +agent-browser --session github-auth open https://github.com +agent-browser --session docs-scrape open https://docs.example.com + +# AVOID: Generic names +agent-browser --session s1 open https://github.com +``` + +### 2. Always Clean Up + +```bash +# Close sessions when done +agent-browser --session auth close +agent-browser --session scrape close +``` + +### 3. Handle State Files Securely + +```bash +# Don't commit state files (contain auth tokens!) +echo "*.auth-state.json" >> .gitignore + +# Delete after use +rm /tmp/auth-state.json +``` + +### 4. Timeout Long Sessions + +```bash +# Set timeout for automated scripts +timeout 60 agent-browser --session long-task get text body +``` diff --git a/.config/opencode/skills/agent-browser.disabled/references/snapshot-refs.md b/.config/opencode/skills/agent-browser.disabled/references/snapshot-refs.md new file mode 100644 index 0000000..3cc0fea --- /dev/null +++ b/.config/opencode/skills/agent-browser.disabled/references/snapshot-refs.md @@ -0,0 +1,219 @@ +# Snapshot and Refs + +Compact element references that reduce context usage dramatically for AI agents. + +**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start. + +## Contents + +- [How Refs Work](#how-refs-work) +- [Snapshot Command](#the-snapshot-command) +- [Using Refs](#using-refs) +- [Ref Lifecycle](#ref-lifecycle) +- [Best Practices](#best-practices) +- [Ref Notation Details](#ref-notation-details) +- [Troubleshooting](#troubleshooting) + +## How Refs Work + +Traditional approach: +``` +Full DOM/HTML → AI parses → CSS selector → Action (~3000-5000 tokens) +``` + +agent-browser approach: +``` +Compact snapshot → @refs assigned → Direct interaction (~200-400 tokens) +``` + +## The Snapshot Command + +```bash +# Basic snapshot (shows page structure) +agent-browser snapshot + +# Interactive snapshot (-i flag) - RECOMMENDED +agent-browser snapshot -i +``` + +### Snapshot Output Format + +``` +Page: Example Site - Home +URL: https://example.com + +@e1 [header] + @e2 [nav] + @e3 [a] "Home" + @e4 [a] "Products" + @e5 [a] "About" + @e6 [button] "Sign In" + +@e7 [main] + @e8 [h1] "Welcome" + @e9 [form] + @e10 [input type="email"] placeholder="Email" + @e11 [input type="password"] placeholder="Password" + @e12 [button type="submit"] "Log In" + +@e13 [footer] + @e14 [a] "Privacy Policy" +``` + +## Using Refs + +Once you have refs, interact directly: + +```bash +# Click the "Sign In" button +agent-browser click @e6 + +# Fill email input +agent-browser fill @e10 "user@example.com" + +# Fill password +agent-browser fill @e11 "password123" + +# Submit the form +agent-browser click @e12 +``` + +## Ref Lifecycle + +**IMPORTANT**: Refs are invalidated when the page changes! + +```bash +# Get initial snapshot +agent-browser snapshot -i +# @e1 [button] "Next" + +# Click triggers page change +agent-browser click @e1 + +# MUST re-snapshot to get new refs! +agent-browser snapshot -i +# @e1 [h1] "Page 2" ← Different element now! +``` + +## Best Practices + +### 1. Always Snapshot Before Interacting + +```bash +# CORRECT +agent-browser open https://example.com +agent-browser snapshot -i # Get refs first +agent-browser click @e1 # Use ref + +# WRONG +agent-browser open https://example.com +agent-browser click @e1 # Ref doesn't exist yet! +``` + +### 2. Re-Snapshot After Navigation + +```bash +agent-browser click @e5 # Navigates to new page +agent-browser snapshot -i # Get new refs +agent-browser click @e1 # Use new refs +``` + +### 3. Re-Snapshot After Dynamic Changes + +```bash +agent-browser click @e1 # Opens dropdown +agent-browser snapshot -i # See dropdown items +agent-browser click @e7 # Select item +``` + +### 4. Snapshot Specific Regions + +For complex pages, snapshot specific areas: + +```bash +# Snapshot just the form +agent-browser snapshot @e9 +``` + +## Ref Notation Details + +``` +@e1 [tag type="value"] "text content" placeholder="hint" +│ │ │ │ │ +│ │ │ │ └─ Additional attributes +│ │ │ └─ Visible text +│ │ └─ Key attributes shown +│ └─ HTML tag name +└─ Unique ref ID +``` + +### Common Patterns + +``` +@e1 [button] "Submit" # Button with text +@e2 [input type="email"] # Email input +@e3 [input type="password"] # Password input +@e4 [a href="/page"] "Link Text" # Anchor link +@e5 [select] # Dropdown +@e6 [textarea] placeholder="Message" # Text area +@e7 [div class="modal"] # Container (when relevant) +@e8 [img alt="Logo"] # Image +@e9 [checkbox] checked # Checked checkbox +@e10 [radio] selected # Selected radio +``` + +## Iframes + +Snapshots automatically detect and inline iframe content. When the main-frame snapshot runs, each `Iframe` node is resolved and its child accessibility tree is included directly beneath it in the output. Refs assigned to elements inside iframes carry frame context, so interactions like `click`, `fill`, and `type` work without manually switching frames. + +```bash +agent-browser snapshot -i +# @e1 [heading] "Checkout" +# @e2 [Iframe] "payment-frame" +# @e3 [input] "Card number" +# @e4 [input] "Expiry" +# @e5 [button] "Pay" +# @e6 [button] "Cancel" + +# Interact with iframe elements directly using their refs +agent-browser fill @e3 "4111111111111111" +agent-browser fill @e4 "12/28" +agent-browser click @e5 +``` + +**Key details:** +- Only one level of iframe nesting is expanded (iframes within iframes are not recursed) +- Cross-origin iframes that block accessibility tree access are silently skipped +- Empty iframes or iframes with no interactive content are omitted from the output +- To scope a snapshot to a single iframe, use `frame @ref` then `snapshot -i` + +## Troubleshooting + +### "Ref not found" Error + +```bash +# Ref may have changed - re-snapshot +agent-browser snapshot -i +``` + +### Element Not Visible in Snapshot + +```bash +# Scroll down to reveal element +agent-browser scroll down 1000 +agent-browser snapshot -i + +# Or wait for dynamic content +agent-browser wait 1000 +agent-browser snapshot -i +``` + +### Too Many Elements + +```bash +# Snapshot specific container +agent-browser snapshot @e5 + +# Or use get text for content-only extraction +agent-browser get text @e5 +``` diff --git a/.config/opencode/skills/agent-browser.disabled/references/video-recording.md b/.config/opencode/skills/agent-browser.disabled/references/video-recording.md new file mode 100644 index 0000000..e6a9fb4 --- /dev/null +++ b/.config/opencode/skills/agent-browser.disabled/references/video-recording.md @@ -0,0 +1,173 @@ +# Video Recording + +Capture browser automation as video for debugging, documentation, or verification. + +**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start. + +## Contents + +- [Basic Recording](#basic-recording) +- [Recording Commands](#recording-commands) +- [Use Cases](#use-cases) +- [Best Practices](#best-practices) +- [Output Format](#output-format) +- [Limitations](#limitations) + +## Basic Recording + +```bash +# Start recording +agent-browser record start ./demo.webm + +# Perform actions +agent-browser open https://example.com +agent-browser snapshot -i +agent-browser click @e1 +agent-browser fill @e2 "test input" + +# Stop and save +agent-browser record stop +``` + +## Recording Commands + +```bash +# Start recording to file +agent-browser record start ./output.webm + +# Stop current recording +agent-browser record stop + +# Restart with new file (stops current + starts new) +agent-browser record restart ./take2.webm +``` + +## Use Cases + +### Debugging Failed Automation + +```bash +#!/bin/bash +# Record automation for debugging + +agent-browser record start ./debug-$(date +%Y%m%d-%H%M%S).webm + +# Run your automation +agent-browser open https://app.example.com +agent-browser snapshot -i +agent-browser click @e1 || { + echo "Click failed - check recording" + agent-browser record stop + exit 1 +} + +agent-browser record stop +``` + +### Documentation Generation + +```bash +#!/bin/bash +# Record workflow for documentation + +agent-browser record start ./docs/how-to-login.webm + +agent-browser open https://app.example.com/login +agent-browser wait 1000 # Pause for visibility + +agent-browser snapshot -i +agent-browser fill @e1 "demo@example.com" +agent-browser wait 500 + +agent-browser fill @e2 "password" +agent-browser wait 500 + +agent-browser click @e3 +agent-browser wait --load networkidle +agent-browser wait 1000 # Show result + +agent-browser record stop +``` + +### CI/CD Test Evidence + +```bash +#!/bin/bash +# Record E2E test runs for CI artifacts + +TEST_NAME="${1:-e2e-test}" +RECORDING_DIR="./test-recordings" +mkdir -p "$RECORDING_DIR" + +agent-browser record start "$RECORDING_DIR/$TEST_NAME-$(date +%s).webm" + +# Run test +if run_e2e_test; then + echo "Test passed" +else + echo "Test failed - recording saved" +fi + +agent-browser record stop +``` + +## Best Practices + +### 1. Add Pauses for Clarity + +```bash +# Slow down for human viewing +agent-browser click @e1 +agent-browser wait 500 # Let viewer see result +``` + +### 2. Use Descriptive Filenames + +```bash +# Include context in filename +agent-browser record start ./recordings/login-flow-2024-01-15.webm +agent-browser record start ./recordings/checkout-test-run-42.webm +``` + +### 3. Handle Recording in Error Cases + +```bash +#!/bin/bash +set -e + +cleanup() { + agent-browser record stop 2>/dev/null || true + agent-browser close 2>/dev/null || true +} +trap cleanup EXIT + +agent-browser record start ./automation.webm +# ... automation steps ... +``` + +### 4. Combine with Screenshots + +```bash +# Record video AND capture key frames +agent-browser record start ./flow.webm + +agent-browser open https://example.com +agent-browser screenshot ./screenshots/step1-homepage.png + +agent-browser click @e1 +agent-browser screenshot ./screenshots/step2-after-click.png + +agent-browser record stop +``` + +## Output Format + +- Default format: WebM (VP8/VP9 codec) +- Compatible with all modern browsers and video players +- Compressed but high quality + +## Limitations + +- Recording adds slight overhead to automation +- Large recordings can consume significant disk space +- Some headless environments may have codec limitations diff --git a/.config/opencode/skills/agent-browser.disabled/templates/authenticated-session.sh b/.config/opencode/skills/agent-browser.disabled/templates/authenticated-session.sh new file mode 100755 index 0000000..b66c928 --- /dev/null +++ b/.config/opencode/skills/agent-browser.disabled/templates/authenticated-session.sh @@ -0,0 +1,105 @@ +#!/bin/bash +# Template: Authenticated Session Workflow +# Purpose: Login once, save state, reuse for subsequent runs +# Usage: ./authenticated-session.sh [state-file] +# +# RECOMMENDED: Use the auth vault instead of this template: +# echo "" | agent-browser auth save myapp --url --username --password-stdin +# agent-browser auth login myapp +# The auth vault stores credentials securely and the LLM never sees passwords. +# +# Environment variables: +# APP_USERNAME - Login username/email +# APP_PASSWORD - Login password +# +# Two modes: +# 1. Discovery mode (default): Shows form structure so you can identify refs +# 2. Login mode: Performs actual login after you update the refs +# +# Setup steps: +# 1. Run once to see form structure (discovery mode) +# 2. Update refs in LOGIN FLOW section below +# 3. Set APP_USERNAME and APP_PASSWORD +# 4. Delete the DISCOVERY section + +set -euo pipefail + +LOGIN_URL="${1:?Usage: $0 [state-file]}" +STATE_FILE="${2:-./auth-state.json}" + +echo "Authentication workflow: $LOGIN_URL" + +# ================================================================ +# SAVED STATE: Skip login if valid saved state exists +# ================================================================ +if [[ -f "$STATE_FILE" ]]; then + echo "Loading saved state from $STATE_FILE..." + if agent-browser --state "$STATE_FILE" open "$LOGIN_URL" 2>/dev/null; then + agent-browser wait --load networkidle + + CURRENT_URL=$(agent-browser get url) + if [[ "$CURRENT_URL" != *"login"* ]] && [[ "$CURRENT_URL" != *"signin"* ]]; then + echo "Session restored successfully" + agent-browser snapshot -i + exit 0 + fi + echo "Session expired, performing fresh login..." + agent-browser close 2>/dev/null || true + else + echo "Failed to load state, re-authenticating..." + fi + rm -f "$STATE_FILE" +fi + +# ================================================================ +# DISCOVERY MODE: Shows form structure (delete after setup) +# ================================================================ +echo "Opening login page..." +agent-browser open "$LOGIN_URL" +agent-browser wait --load networkidle + +echo "" +echo "Login form structure:" +echo "---" +agent-browser snapshot -i +echo "---" +echo "" +echo "Next steps:" +echo " 1. Note the refs: username=@e?, password=@e?, submit=@e?" +echo " 2. Update the LOGIN FLOW section below with your refs" +echo " 3. Set: export APP_USERNAME='...' APP_PASSWORD='...'" +echo " 4. Delete this DISCOVERY MODE section" +echo "" +agent-browser close +exit 0 + +# ================================================================ +# LOGIN FLOW: Uncomment and customize after discovery +# ================================================================ +# : "${APP_USERNAME:?Set APP_USERNAME environment variable}" +# : "${APP_PASSWORD:?Set APP_PASSWORD environment variable}" +# +# agent-browser open "$LOGIN_URL" +# agent-browser wait --load networkidle +# agent-browser snapshot -i +# +# # Fill credentials (update refs to match your form) +# agent-browser fill @e1 "$APP_USERNAME" +# agent-browser fill @e2 "$APP_PASSWORD" +# agent-browser click @e3 +# agent-browser wait --load networkidle +# +# # Verify login succeeded +# FINAL_URL=$(agent-browser get url) +# if [[ "$FINAL_URL" == *"login"* ]] || [[ "$FINAL_URL" == *"signin"* ]]; then +# echo "Login failed - still on login page" +# agent-browser screenshot /tmp/login-failed.png +# agent-browser close +# exit 1 +# fi +# +# # Save state for future runs +# echo "Saving state to $STATE_FILE" +# agent-browser state save "$STATE_FILE" +# echo "Login successful" +# agent-browser snapshot -i diff --git a/.config/opencode/skills/agent-browser.disabled/templates/capture-workflow.sh b/.config/opencode/skills/agent-browser.disabled/templates/capture-workflow.sh new file mode 100755 index 0000000..3bc93ad --- /dev/null +++ b/.config/opencode/skills/agent-browser.disabled/templates/capture-workflow.sh @@ -0,0 +1,69 @@ +#!/bin/bash +# Template: Content Capture Workflow +# Purpose: Extract content from web pages (text, screenshots, PDF) +# Usage: ./capture-workflow.sh [output-dir] +# +# Outputs: +# - page-full.png: Full page screenshot +# - page-structure.txt: Page element structure with refs +# - page-text.txt: All text content +# - page.pdf: PDF version +# +# Optional: Load auth state for protected pages + +set -euo pipefail + +TARGET_URL="${1:?Usage: $0 [output-dir]}" +OUTPUT_DIR="${2:-.}" + +echo "Capturing: $TARGET_URL" +mkdir -p "$OUTPUT_DIR" + +# Optional: Load authentication state +# if [[ -f "./auth-state.json" ]]; then +# echo "Loading authentication state..." +# agent-browser state load "./auth-state.json" +# fi + +# Navigate to target +agent-browser open "$TARGET_URL" +agent-browser wait --load networkidle + +# Get metadata +TITLE=$(agent-browser get title) +URL=$(agent-browser get url) +echo "Title: $TITLE" +echo "URL: $URL" + +# Capture full page screenshot +agent-browser screenshot --full "$OUTPUT_DIR/page-full.png" +echo "Saved: $OUTPUT_DIR/page-full.png" + +# Get page structure with refs +agent-browser snapshot -i > "$OUTPUT_DIR/page-structure.txt" +echo "Saved: $OUTPUT_DIR/page-structure.txt" + +# Extract all text content +agent-browser get text body > "$OUTPUT_DIR/page-text.txt" +echo "Saved: $OUTPUT_DIR/page-text.txt" + +# Save as PDF +agent-browser pdf "$OUTPUT_DIR/page.pdf" +echo "Saved: $OUTPUT_DIR/page.pdf" + +# Optional: Extract specific elements using refs from structure +# agent-browser get text @e5 > "$OUTPUT_DIR/main-content.txt" + +# Optional: Handle infinite scroll pages +# for i in {1..5}; do +# agent-browser scroll down 1000 +# agent-browser wait 1000 +# done +# agent-browser screenshot --full "$OUTPUT_DIR/page-scrolled.png" + +# Cleanup +agent-browser close + +echo "" +echo "Capture complete:" +ls -la "$OUTPUT_DIR" diff --git a/.config/opencode/skills/agent-browser.disabled/templates/form-automation.sh b/.config/opencode/skills/agent-browser.disabled/templates/form-automation.sh new file mode 100755 index 0000000..6784fcd --- /dev/null +++ b/.config/opencode/skills/agent-browser.disabled/templates/form-automation.sh @@ -0,0 +1,62 @@ +#!/bin/bash +# Template: Form Automation Workflow +# Purpose: Fill and submit web forms with validation +# Usage: ./form-automation.sh +# +# This template demonstrates the snapshot-interact-verify pattern: +# 1. Navigate to form +# 2. Snapshot to get element refs +# 3. Fill fields using refs +# 4. Submit and verify result +# +# Customize: Update the refs (@e1, @e2, etc.) based on your form's snapshot output + +set -euo pipefail + +FORM_URL="${1:?Usage: $0 }" + +echo "Form automation: $FORM_URL" + +# Step 1: Navigate to form +agent-browser open "$FORM_URL" +agent-browser wait --load networkidle + +# Step 2: Snapshot to discover form elements +echo "" +echo "Form structure:" +agent-browser snapshot -i + +# Step 3: Fill form fields (customize these refs based on snapshot output) +# +# Common field types: +# agent-browser fill @e1 "John Doe" # Text input +# agent-browser fill @e2 "user@example.com" # Email input +# agent-browser fill @e3 "SecureP@ss123" # Password input +# agent-browser select @e4 "Option Value" # Dropdown +# agent-browser check @e5 # Checkbox +# agent-browser click @e6 # Radio button +# agent-browser fill @e7 "Multi-line text" # Textarea +# agent-browser upload @e8 /path/to/file.pdf # File upload +# +# Uncomment and modify: +# agent-browser fill @e1 "Test User" +# agent-browser fill @e2 "test@example.com" +# agent-browser click @e3 # Submit button + +# Step 4: Wait for submission +# agent-browser wait --load networkidle +# agent-browser wait --url "**/success" # Or wait for redirect + +# Step 5: Verify result +echo "" +echo "Result:" +agent-browser get url +agent-browser snapshot -i + +# Optional: Capture evidence +agent-browser screenshot /tmp/form-result.png +echo "Screenshot saved: /tmp/form-result.png" + +# Cleanup +agent-browser close +echo "Done" diff --git a/.config/opencode/skills/brainstorming/SKILL.md b/.config/opencode/skills/brainstorming/SKILL.md new file mode 100644 index 0000000..edbc2b5 --- /dev/null +++ b/.config/opencode/skills/brainstorming/SKILL.md @@ -0,0 +1,164 @@ +--- +name: brainstorming +description: "You MUST use this before any creative work - creating features, building components, adding functionality, or modifying behavior. Explores user intent, requirements and design before implementation." +--- + +# Brainstorming Ideas Into Designs + +Help turn ideas into fully formed designs and specs through natural collaborative dialogue. + +Start by understanding the current project context, then ask questions one at a time to refine the idea. Once you understand what you're building, present the design and get user approval. + + +Do NOT invoke any implementation skill, write any code, scaffold any project, or take any implementation action until you have presented a design and the user has approved it. This applies to EVERY project regardless of perceived simplicity. + + +## Anti-Pattern: "This Is Too Simple To Need A Design" + +Every project goes through this process. A todo list, a single-function utility, a config change — all of them. "Simple" projects are where unexamined assumptions cause the most wasted work. The design can be short (a few sentences for truly simple projects), but you MUST present it and get approval. + +## Checklist + +You MUST create a task for each of these items and complete them in order: + +1. **Explore project context** — check files, docs, recent commits +2. **Offer visual companion** (if topic will involve visual questions) — this is its own message, not combined with a clarifying question. See the Visual Companion section below. +3. **Ask clarifying questions** — one at a time, understand purpose/constraints/success criteria +4. **Propose 2-3 approaches** — with trade-offs and your recommendation +5. **Present design** — in sections scaled to their complexity, get user approval after each section +6. **Write design doc** — save to `docs/superpowers/specs/YYYY-MM-DD--design.md` and commit +7. **Spec review loop** — dispatch spec-document-reviewer subagent with precisely crafted review context (never your session history); fix issues and re-dispatch until approved (max 3 iterations, then surface to human) +8. **User reviews written spec** — ask user to review the spec file before proceeding +9. **Transition to implementation** — invoke writing-plans skill to create implementation plan + +## Process Flow + +```dot +digraph brainstorming { + "Explore project context" [shape=box]; + "Visual questions ahead?" [shape=diamond]; + "Offer Visual Companion\n(own message, no other content)" [shape=box]; + "Ask clarifying questions" [shape=box]; + "Propose 2-3 approaches" [shape=box]; + "Present design sections" [shape=box]; + "User approves design?" [shape=diamond]; + "Write design doc" [shape=box]; + "Spec review loop" [shape=box]; + "Spec review passed?" [shape=diamond]; + "User reviews spec?" [shape=diamond]; + "Invoke writing-plans skill" [shape=doublecircle]; + + "Explore project context" -> "Visual questions ahead?"; + "Visual questions ahead?" -> "Offer Visual Companion\n(own message, no other content)" [label="yes"]; + "Visual questions ahead?" -> "Ask clarifying questions" [label="no"]; + "Offer Visual Companion\n(own message, no other content)" -> "Ask clarifying questions"; + "Ask clarifying questions" -> "Propose 2-3 approaches"; + "Propose 2-3 approaches" -> "Present design sections"; + "Present design sections" -> "User approves design?"; + "User approves design?" -> "Present design sections" [label="no, revise"]; + "User approves design?" -> "Write design doc" [label="yes"]; + "Write design doc" -> "Spec review loop"; + "Spec review loop" -> "Spec review passed?"; + "Spec review passed?" -> "Spec review loop" [label="issues found,\nfix and re-dispatch"]; + "Spec review passed?" -> "User reviews spec?" [label="approved"]; + "User reviews spec?" -> "Write design doc" [label="changes requested"]; + "User reviews spec?" -> "Invoke writing-plans skill" [label="approved"]; +} +``` + +**The terminal state is invoking writing-plans.** Do NOT invoke frontend-design, mcp-builder, or any other implementation skill. The ONLY skill you invoke after brainstorming is writing-plans. + +## The Process + +**Understanding the idea:** + +- Check out the current project state first (files, docs, recent commits) +- Before asking detailed questions, assess scope: if the request describes multiple independent subsystems (e.g., "build a platform with chat, file storage, billing, and analytics"), flag this immediately. Don't spend questions refining details of a project that needs to be decomposed first. +- If the project is too large for a single spec, help the user decompose into sub-projects: what are the independent pieces, how do they relate, what order should they be built? Then brainstorm the first sub-project through the normal design flow. Each sub-project gets its own spec → plan → implementation cycle. +- For appropriately-scoped projects, ask questions one at a time to refine the idea +- Prefer multiple choice questions when possible, but open-ended is fine too +- Only one question per message - if a topic needs more exploration, break it into multiple questions +- Focus on understanding: purpose, constraints, success criteria + +**Exploring approaches:** + +- Propose 2-3 different approaches with trade-offs +- Present options conversationally with your recommendation and reasoning +- Lead with your recommended option and explain why + +**Presenting the design:** + +- Once you believe you understand what you're building, present the design +- Scale each section to its complexity: a few sentences if straightforward, up to 200-300 words if nuanced +- Ask after each section whether it looks right so far +- Cover: architecture, components, data flow, error handling, testing +- Be ready to go back and clarify if something doesn't make sense + +**Design for isolation and clarity:** + +- Break the system into smaller units that each have one clear purpose, communicate through well-defined interfaces, and can be understood and tested independently +- For each unit, you should be able to answer: what does it do, how do you use it, and what does it depend on? +- Can someone understand what a unit does without reading its internals? Can you change the internals without breaking consumers? If not, the boundaries need work. +- Smaller, well-bounded units are also easier for you to work with - you reason better about code you can hold in context at once, and your edits are more reliable when files are focused. When a file grows large, that's often a signal that it's doing too much. + +**Working in existing codebases:** + +- Explore the current structure before proposing changes. Follow existing patterns. +- Where existing code has problems that affect the work (e.g., a file that's grown too large, unclear boundaries, tangled responsibilities), include targeted improvements as part of the design - the way a good developer improves code they're working in. +- Don't propose unrelated refactoring. Stay focused on what serves the current goal. + +## After the Design + +**Documentation:** + +- Write the validated design (spec) to `docs/superpowers/specs/YYYY-MM-DD--design.md` + - (User preferences for spec location override this default) +- Use elements-of-style:writing-clearly-and-concisely skill if available +- Commit the design document to git + +**Spec Review Loop:** +After writing the spec document: + +1. Dispatch spec-document-reviewer subagent (see spec-document-reviewer-prompt.md) +2. If Issues Found: fix, re-dispatch, repeat until Approved +3. If loop exceeds 3 iterations, surface to human for guidance + +**User Review Gate:** +After the spec review loop passes, ask the user to review the written spec before proceeding: + +> "Spec written and committed to ``. Please review it and let me know if you want to make any changes before we start writing out the implementation plan." + +Wait for the user's response. If they request changes, make them and re-run the spec review loop. Only proceed once the user approves. + +**Implementation:** + +- Invoke the writing-plans skill to create a detailed implementation plan +- Do NOT invoke any other skill. writing-plans is the next step. + +## Key Principles + +- **One question at a time** - Don't overwhelm with multiple questions +- **Multiple choice preferred** - Easier to answer than open-ended when possible +- **YAGNI ruthlessly** - Remove unnecessary features from all designs +- **Explore alternatives** - Always propose 2-3 approaches before settling +- **Incremental validation** - Present design, get approval before moving on +- **Be flexible** - Go back and clarify when something doesn't make sense + +## Visual Companion + +A browser-based companion for showing mockups, diagrams, and visual options during brainstorming. Available as a tool — not a mode. Accepting the companion means it's available for questions that benefit from visual treatment; it does NOT mean every question goes through the browser. + +**Offering the companion:** When you anticipate that upcoming questions will involve visual content (mockups, layouts, diagrams), offer it once for consent: +> "Some of what we're working on might be easier to explain if I can show it to you in a web browser. I can put together mockups, diagrams, comparisons, and other visuals as we go. This feature is still new and can be token-intensive. Want to try it? (Requires opening a local URL)" + +**This offer MUST be its own message.** Do not combine it with clarifying questions, context summaries, or any other content. The message should contain ONLY the offer above and nothing else. Wait for the user's response before continuing. If they decline, proceed with text-only brainstorming. + +**Per-question decision:** Even after the user accepts, decide FOR EACH QUESTION whether to use the browser or the terminal. The test: **would the user understand this better by seeing it than reading it?** + +- **Use the browser** for content that IS visual — mockups, wireframes, layout comparisons, architecture diagrams, side-by-side visual designs +- **Use the terminal** for content that is text — requirements questions, conceptual choices, tradeoff lists, A/B/C/D text options, scope decisions + +A question about a UI topic is not automatically a visual question. "What does personality mean in this context?" is a conceptual question — use the terminal. "Which wizard layout works better?" is a visual question — use the browser. + +If they agree to the companion, read the detailed guide before proceeding: +`skills/brainstorming/visual-companion.md` diff --git a/.config/opencode/skills/brainstorming/scripts/frame-template.html b/.config/opencode/skills/brainstorming/scripts/frame-template.html new file mode 100644 index 0000000..dcfe018 --- /dev/null +++ b/.config/opencode/skills/brainstorming/scripts/frame-template.html @@ -0,0 +1,214 @@ + + + + + Superpowers Brainstorming + + + +

+

Superpowers Brainstorming

+
Connected
+
+ +
+
+ +
+
+ +
+ Click an option above, then return to the terminal +
+ + + diff --git a/.config/opencode/skills/brainstorming/scripts/helper.js b/.config/opencode/skills/brainstorming/scripts/helper.js new file mode 100644 index 0000000..111f97f --- /dev/null +++ b/.config/opencode/skills/brainstorming/scripts/helper.js @@ -0,0 +1,88 @@ +(function() { + const WS_URL = 'ws://' + window.location.host; + let ws = null; + let eventQueue = []; + + function connect() { + ws = new WebSocket(WS_URL); + + ws.onopen = () => { + eventQueue.forEach(e => ws.send(JSON.stringify(e))); + eventQueue = []; + }; + + ws.onmessage = (msg) => { + const data = JSON.parse(msg.data); + if (data.type === 'reload') { + window.location.reload(); + } + }; + + ws.onclose = () => { + setTimeout(connect, 1000); + }; + } + + function sendEvent(event) { + event.timestamp = Date.now(); + if (ws && ws.readyState === WebSocket.OPEN) { + ws.send(JSON.stringify(event)); + } else { + eventQueue.push(event); + } + } + + // Capture clicks on choice elements + document.addEventListener('click', (e) => { + const target = e.target.closest('[data-choice]'); + if (!target) return; + + sendEvent({ + type: 'click', + text: target.textContent.trim(), + choice: target.dataset.choice, + id: target.id || null + }); + + // Update indicator bar (defer so toggleSelect runs first) + setTimeout(() => { + const indicator = document.getElementById('indicator-text'); + if (!indicator) return; + const container = target.closest('.options') || target.closest('.cards'); + const selected = container ? container.querySelectorAll('.selected') : []; + if (selected.length === 0) { + indicator.textContent = 'Click an option above, then return to the terminal'; + } else if (selected.length === 1) { + const label = selected[0].querySelector('h3, .content h3, .card-body h3')?.textContent?.trim() || selected[0].dataset.choice; + indicator.innerHTML = '' + label + ' selected — return to terminal to continue'; + } else { + indicator.innerHTML = '' + selected.length + ' selected — return to terminal to continue'; + } + }, 0); + }); + + // Frame UI: selection tracking + window.selectedChoice = null; + + window.toggleSelect = function(el) { + const container = el.closest('.options') || el.closest('.cards'); + const multi = container && container.dataset.multiselect !== undefined; + if (container && !multi) { + container.querySelectorAll('.option, .card').forEach(o => o.classList.remove('selected')); + } + if (multi) { + el.classList.toggle('selected'); + } else { + el.classList.add('selected'); + } + window.selectedChoice = el.dataset.choice; + }; + + // Expose API for explicit use + window.brainstorm = { + send: sendEvent, + choice: (value, metadata = {}) => sendEvent({ type: 'choice', value, ...metadata }) + }; + + connect(); +})(); diff --git a/.config/opencode/skills/brainstorming/scripts/server.cjs b/.config/opencode/skills/brainstorming/scripts/server.cjs new file mode 100644 index 0000000..86c3080 --- /dev/null +++ b/.config/opencode/skills/brainstorming/scripts/server.cjs @@ -0,0 +1,338 @@ +const crypto = require('crypto'); +const http = require('http'); +const fs = require('fs'); +const path = require('path'); + +// ========== WebSocket Protocol (RFC 6455) ========== + +const OPCODES = { TEXT: 0x01, CLOSE: 0x08, PING: 0x09, PONG: 0x0A }; +const WS_MAGIC = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11'; + +function computeAcceptKey(clientKey) { + return crypto.createHash('sha1').update(clientKey + WS_MAGIC).digest('base64'); +} + +function encodeFrame(opcode, payload) { + const fin = 0x80; + const len = payload.length; + let header; + + if (len < 126) { + header = Buffer.alloc(2); + header[0] = fin | opcode; + header[1] = len; + } else if (len < 65536) { + header = Buffer.alloc(4); + header[0] = fin | opcode; + header[1] = 126; + header.writeUInt16BE(len, 2); + } else { + header = Buffer.alloc(10); + header[0] = fin | opcode; + header[1] = 127; + header.writeBigUInt64BE(BigInt(len), 2); + } + + return Buffer.concat([header, payload]); +} + +function decodeFrame(buffer) { + if (buffer.length < 2) return null; + + const secondByte = buffer[1]; + const opcode = buffer[0] & 0x0F; + const masked = (secondByte & 0x80) !== 0; + let payloadLen = secondByte & 0x7F; + let offset = 2; + + if (!masked) throw new Error('Client frames must be masked'); + + if (payloadLen === 126) { + if (buffer.length < 4) return null; + payloadLen = buffer.readUInt16BE(2); + offset = 4; + } else if (payloadLen === 127) { + if (buffer.length < 10) return null; + payloadLen = Number(buffer.readBigUInt64BE(2)); + offset = 10; + } + + const maskOffset = offset; + const dataOffset = offset + 4; + const totalLen = dataOffset + payloadLen; + if (buffer.length < totalLen) return null; + + const mask = buffer.slice(maskOffset, dataOffset); + const data = Buffer.alloc(payloadLen); + for (let i = 0; i < payloadLen; i++) { + data[i] = buffer[dataOffset + i] ^ mask[i % 4]; + } + + return { opcode, payload: data, bytesConsumed: totalLen }; +} + +// ========== Configuration ========== + +const PORT = process.env.BRAINSTORM_PORT || (49152 + Math.floor(Math.random() * 16383)); +const HOST = process.env.BRAINSTORM_HOST || '127.0.0.1'; +const URL_HOST = process.env.BRAINSTORM_URL_HOST || (HOST === '127.0.0.1' ? 'localhost' : HOST); +const SCREEN_DIR = process.env.BRAINSTORM_DIR || '/tmp/brainstorm'; +const OWNER_PID = process.env.BRAINSTORM_OWNER_PID ? Number(process.env.BRAINSTORM_OWNER_PID) : null; + +const MIME_TYPES = { + '.html': 'text/html', '.css': 'text/css', '.js': 'application/javascript', + '.json': 'application/json', '.png': 'image/png', '.jpg': 'image/jpeg', + '.jpeg': 'image/jpeg', '.gif': 'image/gif', '.svg': 'image/svg+xml' +}; + +// ========== Templates and Constants ========== + +const WAITING_PAGE = ` + +Brainstorm Companion + + +

Brainstorm Companion

+

Waiting for the agent to push a screen...

`; + +const frameTemplate = fs.readFileSync(path.join(__dirname, 'frame-template.html'), 'utf-8'); +const helperScript = fs.readFileSync(path.join(__dirname, 'helper.js'), 'utf-8'); +const helperInjection = ''; + +// ========== Helper Functions ========== + +function isFullDocument(html) { + const trimmed = html.trimStart().toLowerCase(); + return trimmed.startsWith('', content); +} + +function getNewestScreen() { + const files = fs.readdirSync(SCREEN_DIR) + .filter(f => f.endsWith('.html')) + .map(f => { + const fp = path.join(SCREEN_DIR, f); + return { path: fp, mtime: fs.statSync(fp).mtime.getTime() }; + }) + .sort((a, b) => b.mtime - a.mtime); + return files.length > 0 ? files[0].path : null; +} + +// ========== HTTP Request Handler ========== + +function handleRequest(req, res) { + touchActivity(); + if (req.method === 'GET' && req.url === '/') { + const screenFile = getNewestScreen(); + let html = screenFile + ? (raw => isFullDocument(raw) ? raw : wrapInFrame(raw))(fs.readFileSync(screenFile, 'utf-8')) + : WAITING_PAGE; + + if (html.includes('')) { + html = html.replace('', helperInjection + '\n'); + } else { + html += helperInjection; + } + + res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' }); + res.end(html); + } else if (req.method === 'GET' && req.url.startsWith('/files/')) { + const fileName = req.url.slice(7); + const filePath = path.join(SCREEN_DIR, path.basename(fileName)); + if (!fs.existsSync(filePath)) { + res.writeHead(404); + res.end('Not found'); + return; + } + const ext = path.extname(filePath).toLowerCase(); + const contentType = MIME_TYPES[ext] || 'application/octet-stream'; + res.writeHead(200, { 'Content-Type': contentType }); + res.end(fs.readFileSync(filePath)); + } else { + res.writeHead(404); + res.end('Not found'); + } +} + +// ========== WebSocket Connection Handling ========== + +const clients = new Set(); + +function handleUpgrade(req, socket) { + const key = req.headers['sec-websocket-key']; + if (!key) { socket.destroy(); return; } + + const accept = computeAcceptKey(key); + socket.write( + 'HTTP/1.1 101 Switching Protocols\r\n' + + 'Upgrade: websocket\r\n' + + 'Connection: Upgrade\r\n' + + 'Sec-WebSocket-Accept: ' + accept + '\r\n\r\n' + ); + + let buffer = Buffer.alloc(0); + clients.add(socket); + + socket.on('data', (chunk) => { + buffer = Buffer.concat([buffer, chunk]); + while (buffer.length > 0) { + let result; + try { + result = decodeFrame(buffer); + } catch (e) { + socket.end(encodeFrame(OPCODES.CLOSE, Buffer.alloc(0))); + clients.delete(socket); + return; + } + if (!result) break; + buffer = buffer.slice(result.bytesConsumed); + + switch (result.opcode) { + case OPCODES.TEXT: + handleMessage(result.payload.toString()); + break; + case OPCODES.CLOSE: + socket.end(encodeFrame(OPCODES.CLOSE, Buffer.alloc(0))); + clients.delete(socket); + return; + case OPCODES.PING: + socket.write(encodeFrame(OPCODES.PONG, result.payload)); + break; + case OPCODES.PONG: + break; + default: { + const closeBuf = Buffer.alloc(2); + closeBuf.writeUInt16BE(1003); + socket.end(encodeFrame(OPCODES.CLOSE, closeBuf)); + clients.delete(socket); + return; + } + } + } + }); + + socket.on('close', () => clients.delete(socket)); + socket.on('error', () => clients.delete(socket)); +} + +function handleMessage(text) { + let event; + try { + event = JSON.parse(text); + } catch (e) { + console.error('Failed to parse WebSocket message:', e.message); + return; + } + touchActivity(); + console.log(JSON.stringify({ source: 'user-event', ...event })); + if (event.choice) { + const eventsFile = path.join(SCREEN_DIR, '.events'); + fs.appendFileSync(eventsFile, JSON.stringify(event) + '\n'); + } +} + +function broadcast(msg) { + const frame = encodeFrame(OPCODES.TEXT, Buffer.from(JSON.stringify(msg))); + for (const socket of clients) { + try { socket.write(frame); } catch (e) { clients.delete(socket); } + } +} + +// ========== Activity Tracking ========== + +const IDLE_TIMEOUT_MS = 30 * 60 * 1000; // 30 minutes +let lastActivity = Date.now(); + +function touchActivity() { + lastActivity = Date.now(); +} + +// ========== File Watching ========== + +const debounceTimers = new Map(); + +// ========== Server Startup ========== + +function startServer() { + if (!fs.existsSync(SCREEN_DIR)) fs.mkdirSync(SCREEN_DIR, { recursive: true }); + + // Track known files to distinguish new screens from updates. + // macOS fs.watch reports 'rename' for both new files and overwrites, + // so we can't rely on eventType alone. + const knownFiles = new Set( + fs.readdirSync(SCREEN_DIR).filter(f => f.endsWith('.html')) + ); + + const server = http.createServer(handleRequest); + server.on('upgrade', handleUpgrade); + + const watcher = fs.watch(SCREEN_DIR, (eventType, filename) => { + if (!filename || !filename.endsWith('.html')) return; + + if (debounceTimers.has(filename)) clearTimeout(debounceTimers.get(filename)); + debounceTimers.set(filename, setTimeout(() => { + debounceTimers.delete(filename); + const filePath = path.join(SCREEN_DIR, filename); + + if (!fs.existsSync(filePath)) return; // file was deleted + touchActivity(); + + if (!knownFiles.has(filename)) { + knownFiles.add(filename); + const eventsFile = path.join(SCREEN_DIR, '.events'); + if (fs.existsSync(eventsFile)) fs.unlinkSync(eventsFile); + console.log(JSON.stringify({ type: 'screen-added', file: filePath })); + } else { + console.log(JSON.stringify({ type: 'screen-updated', file: filePath })); + } + + broadcast({ type: 'reload' }); + }, 100)); + }); + watcher.on('error', (err) => console.error('fs.watch error:', err.message)); + + function shutdown(reason) { + console.log(JSON.stringify({ type: 'server-stopped', reason })); + const infoFile = path.join(SCREEN_DIR, '.server-info'); + if (fs.existsSync(infoFile)) fs.unlinkSync(infoFile); + fs.writeFileSync( + path.join(SCREEN_DIR, '.server-stopped'), + JSON.stringify({ reason, timestamp: Date.now() }) + '\n' + ); + watcher.close(); + clearInterval(lifecycleCheck); + server.close(() => process.exit(0)); + } + + function ownerAlive() { + if (!OWNER_PID) return true; + try { process.kill(OWNER_PID, 0); return true; } catch (e) { return false; } + } + + // Check every 60s: exit if owner process died or idle for 30 minutes + const lifecycleCheck = setInterval(() => { + if (!ownerAlive()) shutdown('owner process exited'); + else if (Date.now() - lastActivity > IDLE_TIMEOUT_MS) shutdown('idle timeout'); + }, 60 * 1000); + lifecycleCheck.unref(); + + server.listen(PORT, HOST, () => { + const info = JSON.stringify({ + type: 'server-started', port: Number(PORT), host: HOST, + url_host: URL_HOST, url: 'http://' + URL_HOST + ':' + PORT, + screen_dir: SCREEN_DIR + }); + console.log(info); + fs.writeFileSync(path.join(SCREEN_DIR, '.server-info'), info + '\n'); + }); +} + +if (require.main === module) { + startServer(); +} + +module.exports = { computeAcceptKey, encodeFrame, decodeFrame, OPCODES }; diff --git a/.config/opencode/skills/brainstorming/scripts/start-server.sh b/.config/opencode/skills/brainstorming/scripts/start-server.sh new file mode 100755 index 0000000..a0ef299 --- /dev/null +++ b/.config/opencode/skills/brainstorming/scripts/start-server.sh @@ -0,0 +1,153 @@ +#!/usr/bin/env bash +# Start the brainstorm server and output connection info +# Usage: start-server.sh [--project-dir ] [--host ] [--url-host ] [--foreground] [--background] +# +# Starts server on a random high port, outputs JSON with URL. +# Each session gets its own directory to avoid conflicts. +# +# Options: +# --project-dir Store session files under /.superpowers/brainstorm/ +# instead of /tmp. Files persist after server stops. +# --host Host/interface to bind (default: 127.0.0.1). +# Use 0.0.0.0 in remote/containerized environments. +# --url-host Hostname shown in returned URL JSON. +# --foreground Run server in the current terminal (no backgrounding). +# --background Force background mode (overrides Codex auto-foreground). + +SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" + +# Parse arguments +PROJECT_DIR="" +FOREGROUND="false" +FORCE_BACKGROUND="false" +BIND_HOST="127.0.0.1" +URL_HOST="" +while [[ $# -gt 0 ]]; do + case "$1" in + --project-dir) + PROJECT_DIR="$2" + shift 2 + ;; + --host) + BIND_HOST="$2" + shift 2 + ;; + --url-host) + URL_HOST="$2" + shift 2 + ;; + --foreground|--no-daemon) + FOREGROUND="true" + shift + ;; + --background|--daemon) + FORCE_BACKGROUND="true" + shift + ;; + *) + echo "{\"error\": \"Unknown argument: $1\"}" + exit 1 + ;; + esac +done + +if [[ -z "$URL_HOST" ]]; then + if [[ "$BIND_HOST" == "127.0.0.1" || "$BIND_HOST" == "localhost" ]]; then + URL_HOST="localhost" + else + URL_HOST="$BIND_HOST" + fi +fi + +# Some environments reap detached/background processes. Auto-foreground when detected. +if [[ -n "${CODEX_CI:-}" && "$FOREGROUND" != "true" && "$FORCE_BACKGROUND" != "true" ]]; then + FOREGROUND="true" +fi + +# Windows/Git Bash reaps nohup background processes. Auto-foreground when detected. +if [[ "$FOREGROUND" != "true" && "$FORCE_BACKGROUND" != "true" ]]; then + case "${OSTYPE:-}" in + msys*|cygwin*|mingw*) FOREGROUND="true" ;; + esac + if [[ -n "${MSYSTEM:-}" ]]; then + FOREGROUND="true" + fi +fi + +# Generate unique session directory +SESSION_ID="$$-$(date +%s)" + +if [[ -n "$PROJECT_DIR" ]]; then + SCREEN_DIR="${PROJECT_DIR}/.superpowers/brainstorm/${SESSION_ID}" +else + SCREEN_DIR="/tmp/brainstorm-${SESSION_ID}" +fi + +PID_FILE="${SCREEN_DIR}/.server.pid" +LOG_FILE="${SCREEN_DIR}/.server.log" + +# Create fresh session directory +mkdir -p "$SCREEN_DIR" + +# Kill any existing server +if [[ -f "$PID_FILE" ]]; then + old_pid=$(cat "$PID_FILE") + kill "$old_pid" 2>/dev/null + rm -f "$PID_FILE" +fi + +cd "$SCRIPT_DIR" + +# Resolve the harness PID (grandparent of this script). +# $PPID is the ephemeral shell the harness spawned to run us — it dies +# when this script exits. The harness itself is $PPID's parent. +OWNER_PID="$(ps -o ppid= -p "$PPID" 2>/dev/null | tr -d ' ')" +if [[ -z "$OWNER_PID" || "$OWNER_PID" == "1" ]]; then + OWNER_PID="$PPID" +fi + +# On Windows/MSYS2, the MSYS2 PID namespace is invisible to Node.js. +# Skip owner-PID monitoring — the 30-minute idle timeout prevents orphans. +case "${OSTYPE:-}" in + msys*|cygwin*|mingw*) OWNER_PID="" ;; +esac + +# Foreground mode for environments that reap detached/background processes. +if [[ "$FOREGROUND" == "true" ]]; then + echo "$$" > "$PID_FILE" + env BRAINSTORM_DIR="$SCREEN_DIR" BRAINSTORM_HOST="$BIND_HOST" BRAINSTORM_URL_HOST="$URL_HOST" BRAINSTORM_OWNER_PID="$OWNER_PID" node server.cjs + exit $? +fi + +# Start server, capturing output to log file +# Use nohup to survive shell exit; disown to remove from job table +nohup env BRAINSTORM_DIR="$SCREEN_DIR" BRAINSTORM_HOST="$BIND_HOST" BRAINSTORM_URL_HOST="$URL_HOST" BRAINSTORM_OWNER_PID="$OWNER_PID" node server.cjs > "$LOG_FILE" 2>&1 & +SERVER_PID=$! +disown "$SERVER_PID" 2>/dev/null +echo "$SERVER_PID" > "$PID_FILE" + +# Wait for server-started message (check log file) +for i in {1..50}; do + if grep -q "server-started" "$LOG_FILE" 2>/dev/null; then + # Verify server is still alive after a short window (catches process reapers) + alive="true" + for _ in {1..20}; do + if ! kill -0 "$SERVER_PID" 2>/dev/null; then + alive="false" + break + fi + sleep 0.1 + done + if [[ "$alive" != "true" ]]; then + echo "{\"error\": \"Server started but was killed. Retry in a persistent terminal with: $SCRIPT_DIR/start-server.sh${PROJECT_DIR:+ --project-dir $PROJECT_DIR} --host $BIND_HOST --url-host $URL_HOST --foreground\"}" + exit 1 + fi + grep "server-started" "$LOG_FILE" | head -1 + exit 0 + fi + sleep 0.1 +done + +# Timeout - server didn't start +echo '{"error": "Server failed to start within 5 seconds"}' +exit 1 diff --git a/.config/opencode/skills/brainstorming/scripts/stop-server.sh b/.config/opencode/skills/brainstorming/scripts/stop-server.sh new file mode 100755 index 0000000..2e5973d --- /dev/null +++ b/.config/opencode/skills/brainstorming/scripts/stop-server.sh @@ -0,0 +1,55 @@ +#!/usr/bin/env bash +# Stop the brainstorm server and clean up +# Usage: stop-server.sh +# +# Kills the server process. Only deletes session directory if it's +# under /tmp (ephemeral). Persistent directories (.superpowers/) are +# kept so mockups can be reviewed later. + +SCREEN_DIR="$1" + +if [[ -z "$SCREEN_DIR" ]]; then + echo '{"error": "Usage: stop-server.sh "}' + exit 1 +fi + +PID_FILE="${SCREEN_DIR}/.server.pid" + +if [[ -f "$PID_FILE" ]]; then + pid=$(cat "$PID_FILE") + + # Try to stop gracefully, fallback to force if still alive + kill "$pid" 2>/dev/null || true + + # Wait for graceful shutdown (up to ~2s) + for i in {1..20}; do + if ! kill -0 "$pid" 2>/dev/null; then + break + fi + sleep 0.1 + done + + # If still running, escalate to SIGKILL + if kill -0 "$pid" 2>/dev/null; then + kill -9 "$pid" 2>/dev/null || true + + # Give SIGKILL a moment to take effect + sleep 0.1 + fi + + if kill -0 "$pid" 2>/dev/null; then + echo '{"status": "failed", "error": "process still running"}' + exit 1 + fi + + rm -f "$PID_FILE" "${SCREEN_DIR}/.server.log" + + # Only delete ephemeral /tmp directories + if [[ "$SCREEN_DIR" == /tmp/* ]]; then + rm -rf "$SCREEN_DIR" + fi + + echo '{"status": "stopped"}' +else + echo '{"status": "not_running"}' +fi diff --git a/.config/opencode/skills/brainstorming/spec-document-reviewer-prompt.md b/.config/opencode/skills/brainstorming/spec-document-reviewer-prompt.md new file mode 100644 index 0000000..35acbb6 --- /dev/null +++ b/.config/opencode/skills/brainstorming/spec-document-reviewer-prompt.md @@ -0,0 +1,49 @@ +# Spec Document Reviewer Prompt Template + +Use this template when dispatching a spec document reviewer subagent. + +**Purpose:** Verify the spec is complete, consistent, and ready for implementation planning. + +**Dispatch after:** Spec document is written to docs/superpowers/specs/ + +``` +Task tool (general-purpose): + description: "Review spec document" + prompt: | + You are a spec document reviewer. Verify this spec is complete and ready for planning. + + **Spec to review:** [SPEC_FILE_PATH] + + ## What to Check + + | Category | What to Look For | + |----------|------------------| + | Completeness | TODOs, placeholders, "TBD", incomplete sections | + | Consistency | Internal contradictions, conflicting requirements | + | Clarity | Requirements ambiguous enough to cause someone to build the wrong thing | + | Scope | Focused enough for a single plan — not covering multiple independent subsystems | + | YAGNI | Unrequested features, over-engineering | + + ## Calibration + + **Only flag issues that would cause real problems during implementation planning.** + A missing section, a contradiction, or a requirement so ambiguous it could be + interpreted two different ways — those are issues. Minor wording improvements, + stylistic preferences, and "sections less detailed than others" are not. + + Approve unless there are serious gaps that would lead to a flawed plan. + + ## Output Format + + ## Spec Review + + **Status:** Approved | Issues Found + + **Issues (if any):** + - [Section X]: [specific issue] - [why it matters for planning] + + **Recommendations (advisory, do not block approval):** + - [suggestions for improvement] +``` + +**Reviewer returns:** Status, Issues (if any), Recommendations diff --git a/.config/opencode/skills/brainstorming/visual-companion.md b/.config/opencode/skills/brainstorming/visual-companion.md new file mode 100644 index 0000000..537ed3c --- /dev/null +++ b/.config/opencode/skills/brainstorming/visual-companion.md @@ -0,0 +1,286 @@ +# Visual Companion Guide + +Browser-based visual brainstorming companion for showing mockups, diagrams, and options. + +## When to Use + +Decide per-question, not per-session. The test: **would the user understand this better by seeing it than reading it?** + +**Use the browser** when the content itself is visual: + +- **UI mockups** — wireframes, layouts, navigation structures, component designs +- **Architecture diagrams** — system components, data flow, relationship maps +- **Side-by-side visual comparisons** — comparing two layouts, two color schemes, two design directions +- **Design polish** — when the question is about look and feel, spacing, visual hierarchy +- **Spatial relationships** — state machines, flowcharts, entity relationships rendered as diagrams + +**Use the terminal** when the content is text or tabular: + +- **Requirements and scope questions** — "what does X mean?", "which features are in scope?" +- **Conceptual A/B/C choices** — picking between approaches described in words +- **Tradeoff lists** — pros/cons, comparison tables +- **Technical decisions** — API design, data modeling, architectural approach selection +- **Clarifying questions** — anything where the answer is words, not a visual preference + +A question *about* a UI topic is not automatically a visual question. "What kind of wizard do you want?" is conceptual — use the terminal. "Which of these wizard layouts feels right?" is visual — use the browser. + +## How It Works + +The server watches a directory for HTML files and serves the newest one to the browser. You write HTML content, the user sees it in their browser and can click to select options. Selections are recorded to a `.events` file that you read on your next turn. + +**Content fragments vs full documents:** If your HTML file starts with `/.superpowers/brainstorm/` for the session directory. + +**Note:** Pass the project root as `--project-dir` so mockups persist in `.superpowers/brainstorm/` and survive server restarts. Without it, files go to `/tmp` and get cleaned up. Remind the user to add `.superpowers/` to `.gitignore` if it's not already there. + +**Launching the server by platform:** + +**Claude Code (macOS / Linux):** +```bash +# Default mode works — the script backgrounds the server itself +scripts/start-server.sh --project-dir /path/to/project +``` + +**Claude Code (Windows):** +```bash +# Windows auto-detects and uses foreground mode, which blocks the tool call. +# Use run_in_background: true on the Bash tool call so the server survives +# across conversation turns. +scripts/start-server.sh --project-dir /path/to/project +``` +When calling this via the Bash tool, set `run_in_background: true`. Then read `$SCREEN_DIR/.server-info` on the next turn to get the URL and port. + +**Codex:** +```bash +# Codex reaps background processes. The script auto-detects CODEX_CI and +# switches to foreground mode. Run it normally — no extra flags needed. +scripts/start-server.sh --project-dir /path/to/project +``` + +**Gemini CLI:** +```bash +# Use --foreground and set is_background: true on your shell tool call +# so the process survives across turns +scripts/start-server.sh --project-dir /path/to/project --foreground +``` + +**Other environments:** The server must keep running in the background across conversation turns. If your environment reaps detached processes, use `--foreground` and launch the command with your platform's background execution mechanism. + +If the URL is unreachable from your browser (common in remote/containerized setups), bind a non-loopback host: + +```bash +scripts/start-server.sh \ + --project-dir /path/to/project \ + --host 0.0.0.0 \ + --url-host localhost +``` + +Use `--url-host` to control what hostname is printed in the returned URL JSON. + +## The Loop + +1. **Check server is alive**, then **write HTML** to a new file in `screen_dir`: + - Before each write, check that `$SCREEN_DIR/.server-info` exists. If it doesn't (or `.server-stopped` exists), the server has shut down — restart it with `start-server.sh` before continuing. The server auto-exits after 30 minutes of inactivity. + - Use semantic filenames: `platform.html`, `visual-style.html`, `layout.html` + - **Never reuse filenames** — each screen gets a fresh file + - Use Write tool — **never use cat/heredoc** (dumps noise into terminal) + - Server automatically serves the newest file + +2. **Tell user what to expect and end your turn:** + - Remind them of the URL (every step, not just first) + - Give a brief text summary of what's on screen (e.g., "Showing 3 layout options for the homepage") + - Ask them to respond in the terminal: "Take a look and let me know what you think. Click to select an option if you'd like." + +3. **On your next turn** — after the user responds in the terminal: + - Read `$SCREEN_DIR/.events` if it exists — this contains the user's browser interactions (clicks, selections) as JSON lines + - Merge with the user's terminal text to get the full picture + - The terminal message is the primary feedback; `.events` provides structured interaction data + +4. **Iterate or advance** — if feedback changes current screen, write a new file (e.g., `layout-v2.html`). Only move to the next question when the current step is validated. + +5. **Unload when returning to terminal** — when the next step doesn't need the browser (e.g., a clarifying question, a tradeoff discussion), push a waiting screen to clear the stale content: + + ```html + +
+

Continuing in terminal...

+
+ ``` + + This prevents the user from staring at a resolved choice while the conversation has moved on. When the next visual question comes up, push a new content file as usual. + +6. Repeat until done. + +## Writing Content Fragments + +Write just the content that goes inside the page. The server wraps it in the frame template automatically (header, theme CSS, selection indicator, and all interactive infrastructure). + +**Minimal example:** + +```html +

Which layout works better?

+

Consider readability and visual hierarchy

+ +
+
+
A
+
+

Single Column

+

Clean, focused reading experience

+
+
+
+
B
+
+

Two Column

+

Sidebar navigation with main content

+
+
+
+``` + +That's it. No ``, no CSS, no `