📄 Reference · Boyle System Suite · Document 6 of 6
Reference

ROADMAP

Prioritized Improvements, Open Questions & Feature Pipeline

What is being built, what needs partner input, and what is on the horizon. For institutional partners, program directors, and the development team.

Version 1.1 | March 2026

Medhavy AI, LLC  |  Bear Brown LLC  |  Humanitarians AI (501(c)(3))

Reviewed by Dev the Dev
How to use this document: Action items (AI-001 through AI-008) are development tasks with priority levels and effort estimates. Open questions (Q1–Q9) require partner input before they can be resolved; look for the questions in your domain and respond to bear@bearbrown.co. The feature roadmap table shows the full horizon.

Prioritized Action Items

Critical Priority

AI-001 MVAL Enforcement Mechanism  CRITICAL  MVAL  Effort: 3–5 days

Design and implement structural enforcement of MVAL field completion. Without this, the entire documentation protocol depends on individual researcher discipline, the least reliable mechanism available. The candidate mechanisms under evaluation are listed in Open Question 1.

Blocking: BD-006. Related: MVAL Reference. Input needed: Open Question 1.
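Whichever mechanism Open Question 1 settles on, the core of hard validation is small: reject any entry with a missing or empty required field at ingest time rather than logging a warning. A minimal sketch in Python; the field names below are illustrative assumptions, not the canonical MVAL schema (which lives in the MVAL Reference):

```python
# Minimal hard validator for MVAL entries: block incomplete entries at ingest.
# REQUIRED_FIELDS is an illustrative assumption, not the canonical schema.
REQUIRED_FIELDS = ("method", "variables", "artifacts", "learnings", "environment")

def validate_mval_entry(entry: dict) -> list[str]:
    """Return the required fields that are missing or blank (empty list = valid)."""
    return [f for f in REQUIRED_FIELDS if not str(entry.get(f, "")).strip()]

def ingest(entry: dict) -> None:
    missing = validate_mval_entry(entry)
    if missing:
        # Hard enforcement: raise instead of accepting a partial entry.
        raise ValueError(f"MVAL entry rejected; missing fields: {missing}")
    # ... write the validated entry to the active research notebook here ...
```

The same check works behind any of the front ends under discussion (a Google Form with required fields, a Markdown template linter, or a custom UI); only the surface changes.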

AI-002 Ouroboros Citation Preservation Protocol  CRITICAL  Corpus  Effort: 2 days

Mandatory metadata checklist and standard template for pre-conversion documentation. Must be completed before every Ouroboros conversion, without exception. A missed conversion permanently destroys citation data.

Blocking: BD-002. Related: Corpus Management §2.

High Priority

AI-003 Python Execution Integration  HIGH  Core  Effort: 3–5 days

Define and document the protocol for routing quantitative queries to Vertex AI Workbench or Google Colab. Quantitative outputs must be logged back to MVAL as artifacts. Without this, researchers who ask the Boyle System mathematical questions may receive confident but incorrect answers.

Blocking: BD-003. Input needed: Open Question 2.
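One possible shape for this protocol, sketched below with a naive keyword router and a local `exec` stand-in for the Workbench/Colab execution step. Both the classifier and the executor are placeholder assumptions pending Open Question 2; only the requirement that outputs be logged back to MVAL as artifacts comes from the roadmap item itself:

```python
import re

# Placeholder classifier: a real router would be far more robust.
QUANT_HINTS = re.compile(
    r"\b(mean|median|regression|p-value|confidence interval|sum|variance)\b", re.I)

def is_quantitative(query: str) -> bool:
    return bool(QUANT_HINTS.search(query))

def run_quantitative(code: str) -> str:
    # Stand-in for dispatch to Vertex AI Workbench or a Colab notebook.
    # Executes trusted, researcher-authored code only.
    scope: dict = {}
    exec(code, scope)
    return str(scope.get("result"))

def handle_query(query: str, code: str, mval_log: list) -> str:
    if not is_quantitative(query):
        return "route to NotebookLM"  # qualitative path: corpus retrieval
    output = run_quantitative(code)
    # Log the computed output back to MVAL as an artifact, per AI-003.
    mval_log.append({"type": "artifact", "query": query, "output": output})
    return output
```

The point of the sketch is the control flow: quantitative questions never reach the language model's arithmetic, and every computed result leaves a trace in the MVAL log.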

AI-004 MAB Phase 1 Priors for Executive Education  HIGH  Adaptive  Effort: 5 days

Develop expert-seeded prior configurations for executive education, think tank, and graduate school cohort profiles. Without these, new institutional deployments begin in a cold-start period with no personalization, a poor first impression for high-stakes institutional pilots.

Input needed: Open Questions 3 and 8.
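Expert-seeded priors amount to choosing Beta parameters per instructional mode before any learner data arrives. A sketch with Thompson sampling over Bernoulli rewards; the cohort names come from this item, but only three of the five modes are shown, and every (alpha, beta) value is an illustrative assumption, not a calibrated prior:

```python
import random

# Expert-seeded Beta(alpha, beta) priors per mode, per cohort profile.
# All numbers are illustrative assumptions, e.g. executive education is
# seeded to favor Direct Instruction until evidence says otherwise.
PRIORS = {
    "executive_education": {"direct": (8, 2), "socratic": (4, 4), "project": (3, 5)},
    "graduate_school":     {"direct": (4, 4), "socratic": (6, 3), "project": (6, 3)},
}

def choose_mode(cohort: str, rng=random) -> str:
    """Thompson sampling: draw from each mode's Beta posterior, pick the max."""
    samples = {mode: rng.betavariate(a, b)
               for mode, (a, b) in PRIORS[cohort].items()}
    return max(samples, key=samples.get)

def update(cohort: str, mode: str, reward: int) -> None:
    """Bernoulli reward in {0, 1}: success bumps alpha, failure bumps beta."""
    a, b = PRIORS[cohort][mode]
    PRIORS[cohort][mode] = (a + reward, b + 1 - reward)
```

Seeded this way, a new deployment behaves sensibly from the first session, and the posteriors wash out the expert guesses as real rewards accumulate.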

AI-005 Data Classification Governance Document  HIGH  Partners  Effort: 2 days

One-page data classification guide for all institutional partners covering: what can go in standard NotebookLM, what requires Workspace / Enterprise tier, and what is prohibited in any cloud system. This document should be deliverable before any new partner ingests institutional data.

Input needed: Open Question 4 (enterprise vs. workspace threshold).

Medium Priority

AI-006 Notebook Taxonomy Standard  MEDIUM  Effort: 1 day

Publish the recommended five-notebook segmentation taxonomy with standardized naming conventions, and apply it across all active deployments. Until this is published, naming conventions remain inconsistent across pilots, making cross-project queries unreliable.

AI-007 MCP Server Deployment and Documentation  MEDIUM  Core  Effort: 3–5 days

Configure and document MCP server integration for Claude Code / Claude Desktop. Publish a configuration template. This enables the most capable developer workflow, the one most likely to sustain researcher adoption.

AI-008 Fairness Audit Protocol  MEDIUM  Adaptive  Effort: 3 days

Implement monitoring to verify that minimum exploration rates are maintained for all five instructional modes across learner demographics. Flag pigeonholing patterns before they solidify into permanent routing decisions.

Related: Adaptive Architecture, Fairness Constraints.
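A minimal version of this monitor only needs the assignment log keyed by demographic group. The sketch below flags any (group, mode) pair whose assignment share falls below an exploration floor; the mode names and the 5% floor are assumptions for illustration, not the audited constraint values:

```python
from collections import Counter

# Illustrative mode names and exploration floor; the real values come from
# the Adaptive Architecture fairness constraints.
MODES = ["direct", "socratic", "project", "case", "simulation"]
MIN_EXPLORATION = 0.05  # every mode should get >= 5% of assignments per group

def audit(assignments: list) -> dict:
    """assignments: (demographic_group, mode) pairs from the routing log.
    Returns, per group, the modes falling below the exploration floor."""
    by_group: dict = {}
    for group, mode in assignments:
        by_group.setdefault(group, Counter())[mode] += 1
    flags = {}
    for group, counts in by_group.items():
        total = sum(counts.values())
        low = [m for m in MODES if counts[m] / total < MIN_EXPLORATION]
        if low:
            flags[group] = low
    return flags
```

Run periodically, a non-empty result is exactly the "pigeonholing pattern" this item is meant to surface before it hardens into permanent routing.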

Open Questions for Partners

These questions cannot be resolved by the development team alone. They require partner input. If you have relevant context, respond to bear@bearbrown.co with the question number.

Architecture

Q1 โ€” MVAL Enforcement Path
Architecture | Blocks AI-001
What is the lowest-friction mechanism for hard MVAL field validation that won't create overhead that prompts partners to circumvent it? Google Form, Markdown template, or custom UI: which is sustainable across your team's existing workflow?
Partner input needed before AI-001 can be scoped.
Q2 โ€” Quantitative Integration Path
Architecture | Blocks AI-003
What is the preferred path for quantitative tasks: Vertex AI Workbench sidebar integration, a Colab notebook that feeds outputs back to MVAL as artifacts, or a separate notebook layer? Tradeoffs are accessibility vs. rigor vs. institutional IT constraints.
Partner input needed. Answer varies significantly by institution.
Q3 โ€” MAB Deployment Scope for Executive Education
Architecture | Informs AI-004
Is the full five-mode bandit appropriate for executive education contexts, or is a simplified two-mode system (Direct Instruction vs. Socratic Questioning) more appropriate for program-level adoption? The full system requires more instrumentation; the simplified version trades personalization for deployment speed.
Program director input preferred.

Institutional Deployment

Q4 โ€” Enterprise vs. Workspace Threshold
Data Governance | Blocks AI-005
For graduate school and think tank partners, does Google Workspace for Education / Workspace for Organizations satisfy data protection requirements, or does full GCP Enterprise become necessary? The answer determines the cost structure of any institutional deployment.
Legal / IT compliance input needed per institution.
Q5 โ€” EU AI Act Compliance
Regulatory | Deadline: August 2026
The EU AI Act becomes fully applicable August 2026. For European institutional partners, does the current regional deployment model (EU multi-region via Discovery Engine) satisfy governance documentation requirements, or are additional controls needed before the deadline?
European partner legal review required before June 2026.
Q6 โ€” Executive Education MVAL Environment Field
Specification | Informs MVAL Variant
The standard MVAL Environment field covers cloud computing context. For executive education and policy research, "environment" means organizational context: stakeholders present, constraints active, data vintage, regulatory regime. What fields are actually needed for a compliant executive education MVAL variant?
Executive education program director input preferred.

Measurement

Q7 โ€” Pilot Instrumentation
Measurement | Blocks formal evaluation
The target metrics are: gap-review <20% of meeting time, onboarding reduction >50%, duplicate work near zero. How are these currently measured in the active pilot, and what is the instrumentation plan for formal partner deployments? Without instrumentation, the metrics are aspirational, not evidential.
Active pilot leads: this needs a response before the next partner evaluation.
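As a starting point for instrumentation, the gap-review metric reduces to tagging meeting segments and computing a share of total time. A sketch; the segment labels are assumed for illustration, not an agreed tagging scheme:

```python
# One target metric made measurable: gap-review share of meeting time
# (target < 20%). The "gap_review" label is an illustrative assumption.
def gap_review_share(segments: list) -> float:
    """segments: (label, minutes) pairs for a single meeting."""
    total = sum(minutes for _, minutes in segments)
    gap = sum(minutes for label, minutes in segments if label == "gap_review")
    return gap / total if total else 0.0
```

The other two targets (onboarding reduction, duplicate work) need a baseline and a dedup signal respectively, but the same principle applies: each metric needs a logging convention agreed on before formal deployments, or it stays aspirational.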
Q8 โ€” MAB Reward Calibration by Context
Measurement | Informs AI-004
How should the composite reward function be weighted differently for executive education (where persistence and engagement may outweigh raw mastery gain) versus research training (where knowledge gain is paramount)? This question determines whether a single reward function can span deployment contexts or whether context-specific functions are required.
Learning scientist or program director input preferred.
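The question can be made concrete as two candidate weightings over the same reward components. The component names and weight values below are illustrative assumptions, not proposed calibrations; Q8 is precisely the question of what these numbers should be and whether one table can serve both contexts:

```python
# Context-specific weightings for the composite reward posed in Q8.
# All weights are illustrative assumptions awaiting partner calibration.
WEIGHTS = {
    "executive_education": {"mastery": 0.3, "engagement": 0.4, "persistence": 0.3},
    "research_training":   {"mastery": 0.7, "engagement": 0.2, "persistence": 0.1},
}

def composite_reward(context: str, signals: dict) -> float:
    """signals: each reward component normalized to [0, 1]."""
    w = WEIGHTS[context]
    return sum(w[k] * signals.get(k, 0.0) for k in w)
```

If the two rows end up nearly identical after calibration, a single reward function spans both contexts; if not, the bandit needs context-specific reward functions, which is the design fork this question decides.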
Q9 โ€” CRITIQ Integration Scope
Tool Integration | Informs roadmap sequencing
Could CRITIQ's peer review protocol run against MVAL entries as a structured critic layer, automatically flagging statistical integrity issues or reproducibility gaps? If so, what is the minimum MVAL field structure CRITIQ would need to evaluate effectively? This would automate a quality check currently covered only by the human Critic role.
CRITIQ team input needed.

Future Feature Roadmap

| Feature | Mechanism | Impact | Status |
| --- | --- | --- | --- |
| Passage-Level Verification | Block outputs lacking direct cited evidence | Eliminates interpretive overreach and drift | Planned |
| Hallucination Detector | Post-hoc corpus auditing with reliability score | Quantitative documentation quality metric per entry | Planned |
| Full MAB Engine (5 Modes) | Thompson Sampling + CMAB + IC-Cache | Real-time personalized instructional mode selection | Planned |
| GAMBITTS Integration | LLM treatment embedding + bandit policy learning | Robust learning despite stochastic LLM output | Planned |
| MVAL Web Interface | Required-field form → auto-ingests to active research notebook | Structural enforcement of documentation standard | In development |
| CRITIQ × Boyle Integration | Peer review protocol applied to MVAL entries | Automated statistical integrity flagging on log entries | Planned |
| Executive Education MVAL Variant | Adapted field definitions for non-technical contexts | Extends Boyle System to business school and policy contexts | Planned |
| OPT / Visa-Transition Handoff Template | MVAL variant optimized for personnel transition documentation | Preserves institutional knowledge across team changes | Planned |
| Diagram Generation | Multimodal visualization of experimental setups | Improves legibility of complex workflows in MVAL entries | Planned |

Ecosystem Tool Integrations

| Tool | Function | Integration with Boyle | Status |
| --- | --- | --- | --- |
| CRITIQ | Peer review: manuscript evaluation, statistical integrity | Automated critic layer on MVAL entries; flags reproducibility gaps | Planned |
| SOCRIT | Socratic prompt evaluation (Paul-Elder framework) | Quality validation for Socratic mode prompts | Planned |
| Popper | Assertion verification: flags factual claims for review | Post-hoc MVAL entry fact-checking | Planned |
| Bookie the Bookmaker | Chapter drafting for domain-specific textbooks | Generates structured knowledge from MVAL entry archives | Planned |
| Eddy the Editor | Article review: structure, line edit, SEO, publish strategy | Post-Bookie editorial pass on generated content | Planned |
| Medhavi Platform | AI-assisted textbook delivery and student documentation | Student-facing Boyle System interface for academic contexts | Roadmap TBD |