# Skills & Methods Register

Portable export. This file is self-contained: move it into a separate project folder and it carries everything you need to develop these into formal skills later. It captures the reusable methods ("the gold") that emerged while building the Agentic AI Governance glossary, separated from that project's subject matter so each method stands on its own.

Each entry is written so it could later be formalized as a Cowork skill (a SKILL.md). Status reflects how ready it is: **proven** (used end-to-end here, works), **drafted** (used partially), **concept** (articulated, not yet built as a repeatable tool).

Nothing here is subject-locked to AI governance. Every method generalizes to other domains.

---

## 1. Function-First Terminology Crosswalk
**Status:** proven

**What it does:** Builds a glossary where each term is defined by what it *does* (a one-line function statement), grouped by *purpose* rather than alphabety, and linked to other terms by shared function. Kills the "dictionary" failure mode where definitions describe what a thing *is* but never why it matters.

**Trigger:** "build a glossary / ontology / terminology map," "explain these terms to non-technical users," "make a crosswalk."

**Inputs:** a raw list of terms (any domain). **Outputs:** a function-clustered glossary + a crosswalk table mapping terms to a small set of purposes.

**Why it generalizes:** the function-first move works for legal, medical, financial, or any jargon-heavy field. The cluster set changes; the method does not.

---

## 2. Sovereign-Zero Translator
**Status:** drafted (design + data model complete; query layer pending)

**What it does:** Translates between two vocabularies (e.g. plain-English ↔ NIST/MIT/OWASP technical) by calibrating *both* against a neutral function-center, so neither vocabulary is treated as the source of truth. Prevents "contextual mismatch" where a translation silently imports one side's framing.

**Trigger:** "translate between technical and non-technical," "two teams use different words for the same thing," "align our terminology with a standard without losing our own."

**Inputs:** two term sets + a function-center definition. **Outputs:** alias links (`alias_of`) that bind different names to one shared function.

**Why it generalizes:** any field with competing dialects (clinical vs patient language, engineering vs sales, regulator vs operator).

---

## 3. Shepherd → Define → Type → Attribute Pipeline
**Status:** proven (the core engine of this whole project)

**What it does:** A staged pipeline that turns a raw term dump into a structured, auditable knowledge base:
1. **Shepherd** — ingest raw terms as `stub` records, de-duplicated, each tagged with abbreviated provenance family codes (NIST, OWASP, ARXIV, etc.).
2. **Define** — write function statements, assign cluster + purpose, graduate stub → candidate, family by family.
3. **Type** — upgrade associative links into a controlled relation vocabulary (`enables`, `requires`, `mitigates`, `instance_of`, etc.) with direction + evidence.
4. **Attribute** — fill domain-specific attributes via a rule engine, with hand-curated overrides.
5. **Gate** — a deterministic validation check runs on every write.

**Trigger:** "organize this messy term list into something structured," "build a knowledge base / data model from these notes."

**Inputs:** raw terms + optional provenance map. **Outputs:** a single validated JSON source of truth.

**Reusable code assets in this project:** `shepherd.py`, `define_pass*.py`, `type_related.py`, `attr_pass.py`.

---

## 4. Candidate-Register / Validation-Gate Discipline
**Status:** proven

**What it does:** A responsibility discipline for AI-assisted data work. Three rules: (a) every record is `candidate` until a human verifies it, never auto-certified; (b) a deterministic validation gate runs on every write and fails loudly; (c) every value is tagged with how it was produced (`preset` / `curated` / `derived`) so a human reviewer knows exactly what to trust. Abbreviated provenance via source-family codes keeps it efficient.

**Trigger:** any "organize my data responsibly," "I need this auditable," "don't let the AI just make things up" task.

**Why it generalizes:** this is the backbone of doing AI data work you can defend. Domain-independent.

---

## 5. Cold-Read Review
**Status:** proven

**What it does:** Instead of dumping a whole AI-generated dataset for review, it heuristically flags the records most likely to be wrong (e.g. where a rule-engine default contradicts the term's nature) and produces a short, prioritized human-review list grouped by failure pattern. Turns "review 400 rows" into "look at these 50, in 3 buckets."

**Trigger:** "check this AI-generated data," "QA this dataset," "what's likely wrong here."

**Inputs:** a dataset with method/provenance tags. **Outputs:** a ranked review list with the suspected error pattern named.

---

## 6. Red-Team-as-Feature
**Status:** proven

**What it does:** Adversarial review where the *weaknesses become the visible, focal feature* of the deliverable rather than being hidden. Produces a limitations / threat-model section and a re-derivation log, so the artifact's honesty is part of its value.

**Trigger:** "red team this," "what are the weaknesses," "make this honest / defensible."

**Reusable asset:** `Red-Team-Report.md` structure (technical / epistemic / governance findings, each with severity + status + reinforcement).

---

## 7. Truncation-Resistant Build Pipeline
**Status:** proven (technical method)

**What it does:** Builds large single-file HTML deliverables by assembling small part-files via shell instead of one big file write (which silently truncates past a size cap). Adds a multi-source CDN loader + offline 2D fallback so the artifact is genuinely shippable to another machine.

**Trigger:** building any large self-contained HTML/interactive artifact.

**Why it generalizes:** purely technical, applies to any big generated-file task.

---

## 8. 3D-to-2D Reflection Mapping
**Status:** drafted

**What it does:** Represents one structured model two ways from a single JSON source of truth: a 3D layered map (for spatial / multimodal pattern matching) and a flat 2D companion (for reading and search). Patterns you can see but can't name in one view often surface in the other.

**Trigger:** "visualize this model," "I want to see the structure," "multimodal pattern matching."

**Reusable assets:** the map build pipeline + the 2D reflections generator.

---

## How to develop these further
- The fastest path to a formal skill: pick one **proven** entry, point the `skill-creator` at its description + the named code assets, and let it scaffold a SKILL.md.
- Strongest candidates to formalize first: **#3 (the pipeline)** and **#4 (the discipline)** — they are the most transferable and least subject-locked.
- This register is a living document. Re-run a cold read on any future chat to mine new methods into it.
