Abby 2.0 Phase 6: Institutional Intelligence — The Organization Gets Smarter
Abby now learns from the entire research community. When a researcher builds a successful diabetes cohort, that pattern becomes available to every other researcher. Questions asked three or more times across users automatically become institutional FAQs with vetted answers. Data quality findings discovered by one team warn all teams. Phase 6 completes the Abby 2.0 cognitive architecture.
The Problem: Individual Intelligence, Institutional Amnesia
Through Phases 1-5, Abby became individually brilliant — she remembered each researcher, routed to the right brain, understood concept hierarchies, and could take actions. But each researcher's work existed in isolation. When Dr. Chen built an elegant hypertension cohort, Dr. Patel had no way to know. When the data team discovered that lab measurements were sparse before 2019, every subsequent researcher had to rediscover that independently.
Phase 6 turns individual intelligence into collective intelligence.
Knowledge Capture: Automatic Artifact Extraction
The KnowledgeCapture pipeline listens for successful research events and extracts reusable artifacts — no manual documentation required.
| Event | What's Captured | Artifact Type |
|---|---|---|
| Cohort created | Definition pattern, concept composition, design rationale | cohort_pattern |
| Analysis completed | Configuration, key findings, interpretation | analysis_config |
| User corrects Abby | Original response + correction pair | Stored in abby_corrections |
| Data quality issue found | Affected domain, tables, severity, workaround | Stored in abby_data_findings |
Every artifact is embedded with sentence-transformers (384-dim) and stored in abby_knowledge_artifacts with an HNSW vector index. This means semantic search works across all institutional knowledge — "diabetes cohort" finds artifacts about T2DM, HbA1c, and metformin, not just exact keyword matches.
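To make the capture table concrete, here is a minimal sketch of how events might be routed to artifact types and tables. The event keys, the `CapturedArtifact` class, and the `capture` function are hypothetical names chosen for illustration; only the artifact types and table names come from the table above. The embedding step is left out.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple

# Event key -> (artifact type, destination table).
# Artifact types and table names follow the table above; the event keys are assumed.
ROUTES: Dict[str, Tuple[Optional[str], str]] = {
    "cohort_created": ("cohort_pattern", "abby_knowledge_artifacts"),
    "analysis_completed": ("analysis_config", "abby_knowledge_artifacts"),
    "user_correction": (None, "abby_corrections"),
    "data_quality_issue": (None, "abby_data_findings"),
}


@dataclass
class CapturedArtifact:
    table: str                    # where the record would be written
    artifact_type: Optional[str]  # None for rows that go to a dedicated table
    payload: Dict[str, Any]       # event details, copied so the caller's dict is not shared


def capture(event: str, payload: Dict[str, Any]) -> CapturedArtifact:
    """Turn a research event into a record ready for storage (sketch only)."""
    if event not in ROUTES:
        raise ValueError(f"no capture rule for event {event!r}")
    artifact_type, table = ROUTES[event]
    return CapturedArtifact(table=table, artifact_type=artifact_type, payload=dict(payload))
```

A real pipeline would also compute the 384-dim embedding before writing to abby_knowledge_artifacts; that part depends on the embedding model and database client in use.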
Knowledge Surfacing: The Right Context at the Right Time
The KnowledgeSurfacer runs on every chat request, searching for institutional artifacts relevant to the current query. When it finds matches within the relevance threshold (cosine distance < 0.5), it injects them into the system prompt:
INSTITUTIONAL KNOWLEDGE (from other researchers):
- [cohort_pattern] T2DM Incident Cohort: Entry condition with 365-day washout, used 5x
- [analysis_config] Diabetes Characterization: 45% female, mean age 62, used 3x
Mention these to the user if relevant to their question.
Abby then naturally weaves this into her response: "I see that other researchers at your institution have built similar diabetes cohorts. The most recent uses a validated approach with a 365-day washout — would you like to start from that?"
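The surfacing step can be sketched in a few lines: score each artifact by cosine distance, keep those under the 0.5 threshold, and format the prompt block shown above. This is an illustrative in-memory version with assumed names (`cosine_distance`, `surface`, the tuple layout); the text describes an HNSW vector index, which would replace the linear scan here.

```python
import math
from typing import List, Sequence, Tuple

RELEVANCE_THRESHOLD = 0.5  # cosine distance cutoff stated in the text


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; zero vectors are treated as maximally unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def surface(
    query_vec: Sequence[float],
    artifacts: List[Tuple[str, str, Sequence[float]]],  # (artifact_type, summary, vector)
    limit: int = 3,  # hypothetical cap on injected artifacts
) -> str:
    """Return the prompt block for relevant artifacts, or "" when nothing qualifies."""
    scored = [
        (cosine_distance(query_vec, vec), kind, summary)
        for kind, summary, vec in artifacts
    ]
    hits = sorted(
        (s for s in scored if s[0] < RELEVANCE_THRESHOLD),
        key=lambda s: s[0],  # closest artifacts first
    )[:limit]
    if not hits:
        return ""
    lines = ["INSTITUTIONAL KNOWLEDGE (from other researchers):"]
    lines += [f"- [{kind}] {summary}" for _, kind, summary in hits]
    lines.append("Mention these to the user if relevant to their question.")
    return "\n".join(lines)
```

Returning an empty string when nothing clears the threshold keeps the system prompt unchanged for queries with no institutional context.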
FAQ Auto-Promotion
The FAQPromoter monitors question frequency across users. When the same question, matched by text similarity, has been asked three or more times by distinct users, the question-answer pair is automatically promoted to an institutional FAQ.
This creates a self-curating knowledge base. The questions researchers actually ask become the documentation the institution maintains — no manual curation needed.
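As a rough sketch of the promotion rule, the snippet below clusters similar questions and returns those asked by at least three distinct users. The source only says "text similarity", so the use of `difflib.SequenceMatcher`, the 0.8 cutoff, and the function names are assumptions, and counting distinct users is one reading of the "3+ times" rule.

```python
from difflib import SequenceMatcher
from typing import List, Set, Tuple

PROMOTION_THRESHOLD = 3  # "3+ times" from the text
SIMILARITY_CUTOFF = 0.8  # hypothetical; the actual similarity measure is not specified


def normalize(question: str) -> str:
    """Lowercase, collapse whitespace, and drop trailing question marks."""
    return " ".join(question.lower().split()).rstrip("?")


def similar(a: str, b: str) -> bool:
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= SIMILARITY_CUTOFF


def find_faq_candidates(asks: List[Tuple[str, str]]) -> List[str]:
    """asks is a list of (user_id, question) pairs.

    Returns the first-seen wording of each question asked by at least
    PROMOTION_THRESHOLD distinct users.
    """
    clusters: List[Tuple[str, Set[str]]] = []  # (representative question, users who asked it)
    for user, question in asks:
        for representative, users in clusters:
            if similar(representative, question):
                users.add(user)
                break
        else:
            # No existing cluster matched, so start a new one.
            clusters.append((question, {user}))
    return [rep for rep, users in clusters if len(users) >= PROMOTION_THRESHOLD]
```

Tracking a set of user IDs per cluster means one researcher repeating a question three times does not trigger promotion.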
What Shipped
| Component | Tests | Purpose |
|---|---|---|
| KnowledgeCapture | 6 | Automatic artifact extraction from research events |
| KnowledgeSurfacer | 4 | Contextual institutional suggestions |
| FAQPromoter | 3 | Automatic FAQ detection and promotion |
| Database tables | 3 | abby_knowledge_artifacts, abby_corrections, abby_data_findings |
| Pipeline integration | 3 | Surfacing + FAQ promotion wired into chat |
263 tests passing across the Python AI service.
Abby 2.0: Complete
Six phases, one session, one cognitive architecture:
| Phase | What It Gave Abby |
|---|---|
| 1: Memory | She remembers who you are |
| 2: Intelligence | She gets a bigger brain when needed |
| 3: Knowledge Graph | She understands concept relationships |
| 4: Agency | She can take actions for you |
| 5: Advanced Agency | She coordinates complex workflows |
| 6: Institutional | She makes the whole institution smarter |
From a stateless chatbot to a cognitive research assistant with persistent memory, hybrid intelligence, relational understanding, supervised autonomy, and institutional learning. 263 tests. 6 phases. All shipped.
