Reference · Boyle System Suite · Document 5 of 6
DEPLOYMENT GUIDE
Cross-System Analysis, Operations, Integrations & Security
What to know before deploying: system tradeoffs, integration stability,
active pilot data, security classifications, and target deployment contexts.
For institutional leads, IT administrators, and implementation teams.
Version 1.1 | March 2026 | Reviewed by Dev the Dev
What this reference covers
RAG vs. long-context tradeoffs, grounded vs. non-grounded LLM performance data, active deployment metrics, integration methods and their stability, security and data governance by context, and target deployment scenarios. If you need to understand the underlying system architecture, start with the System Overview.
RAG vs. Long-Context Window
Institutional evaluators frequently ask why the Boyle System uses source-grounded RAG rather than simply loading all documents into a long-context window. The tradeoff is architectural, not a matter of preference.
| Dimension | Long-Context Window | Source-Grounded RAG (Boyle System) |
| --- | --- | --- |
| Data location | Entire document in active working memory | Semantic index; chunks retrieved per query |
| Citation precision | Low; reasoning is holistic | High; specific passage linked to every claim |
| Hallucination risk | Higher; model may blend sources | Lower; constrained to retrieved chunks |
| Audit trail | Difficult; cannot trace a specific claim to a passage | Built-in; inline citation to exact text |
| Best use case | Holistic synthesis of a single large document | Precise retrieval across 50+ diverse sources |
| Regulatory suitability | Limited; hard to satisfy audit requirements | Strong; every claim is traceable to source |
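The audit-trail row in the table above is the crux, and a toy sketch makes it concrete: in a RAG pipeline, every retrieved chunk carries its source and passage identity, so any claim in the answer can cite exact text. The keyword-overlap scoring below is purely illustrative (the real system uses a semantic index), and all names are hypothetical.

```python
# Illustrative-only retrieval: each chunk keeps (source, passage), which is
# what makes inline citation and auditing possible. Scoring is naive
# keyword overlap, NOT the semantic retrieval the Boyle System uses.
def retrieve(query: str, chunks: list[dict], k: int = 2) -> list[dict]:
    q = set(query.lower().split())
    scored = sorted(chunks,
                    key=lambda c: len(q & set(c["text"].lower().split())),
                    reverse=True)
    return scored[:k]

corpus = [
    {"source": "charter.pdf", "passage": 3, "text": "the pilot begins in march"},
    {"source": "mval_log.md", "passage": 7, "text": "hallucination audit is manual"},
]

hits = retrieve("when does the pilot begin", corpus, k=1)
# hits[0] still carries source and passage, the basis of an inline citation.
```

A long-context approach, by contrast, reasons over the whole blob at once, which is why the table marks its audit trail as difficult: there is no per-claim retrieval record to point back to.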
Grounded vs. Non-Grounded LLM Performance
| Metric | Non-Grounded LLM | NotebookLM (Boyle System) |
| --- | --- | --- |
| Hallucination rate | ~40% | ~13% overall; ~0% on specific citation queries |
| Citation precision | Low / variable | 95% in audited clinical tasks |
| Knowledge scope | Pre-trained knowledge (static) | ~25 million words per notebook (dynamic) |
| Update frequency | Requires retraining or fine-tuning | Immediate upon document upload |
| Data privacy | Often shared for training | Private; no sharing under enterprise agreement |
| Specificity of response | Generic; drawn from broad pre-training | Context-specific; bounded by uploaded corpus |
The constraint is the feature
The Boyle System's inability to answer questions about topics not in its corpus is not a limitation to work around; it is the mechanism that makes citation precision possible. A system that can say "I don't have a source for that" is more trustworthy than one that generates plausible-sounding answers from training data.
Known Issues and Technical Debt
BD-001: Source Slot Ceiling
HIGH
The 50-source limit per notebook constrains long-running projects. Ouroboros conversion and notebook stitching mitigate it, but both add manual overhead and citation risk. Recommendation: source slot monitoring with automated alerts at a 40-source threshold.
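The recommended 40-source alert could be as simple as the sketch below. The function name, messages, and alert mechanism (returning a string for a monitoring job to log) are illustrative assumptions; only the 50-slot ceiling and 40-source threshold come from BD-001.

```python
# Minimal sketch of BD-001's mitigation: warn before a notebook hits the
# 50-source ceiling so consolidation can be planned, not improvised.
SOURCE_LIMIT = 50       # hard platform ceiling (BD-001)
ALERT_THRESHOLD = 40    # recommended alert point

def check_source_slots(notebook_name: str, source_count: int) -> str:
    if source_count >= SOURCE_LIMIT:
        return (f"{notebook_name}: at {source_count}/{SOURCE_LIMIT} sources; "
                f"split or archive before adding more")
    if source_count >= ALERT_THRESHOLD:
        return (f"{notebook_name}: {source_count}/{SOURCE_LIMIT} sources; "
                f"plan consolidation now")
    return f"{notebook_name}: {source_count}/{SOURCE_LIMIT} sources; OK"
```

A nightly job running this over every active notebook turns the ceiling from a surprise failure into a scheduled maintenance task.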
BD-002: Citation Loss on Ouroboros Conversion
CRITICAL
Converting notes to sources strips original inline citations. Risk escalates with each cycle. Recommendation: mandatory metadata checklist before every conversion. See Corpus Management §2.
BD-003: No Native Python Execution
HIGH
NotebookLM cannot run code or perform mathematical calculations, so it may return confident but incorrect quantitative answers. Recommendation: integration protocol with Vertex AI Workbench or Colab, with quantitative outputs logged back to MVAL. See Roadmap AI-003.
BD-004: Knowledge-Based Poisoning Vulnerability
HIGH
Malicious or corrupted documents can bias outputs. Zero-width Unicode characters are invisible to human reviewers but readable by the AI. Recommendation: Mandatory source validation workflow before ingestion.
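A pre-ingestion scan for the invisible characters BD-004 describes is straightforward to sketch. The character set below covers the common zero-width codepoints; treat it as a starting assumption, not an exhaustive list, since other invisible or bidirectional-control characters can also carry hidden instructions.

```python
import unicodedata

# Common zero-width codepoints: invisible to human reviewers, readable by
# the model. An assumed starting list, not a complete inventory.
ZERO_WIDTH = {
    "\u200b",  # ZERO WIDTH SPACE
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
    "\u2060",  # WORD JOINER
    "\ufeff",  # ZERO WIDTH NO-BREAK SPACE (BOM)
}

def find_invisible_chars(text: str) -> list[tuple[int, str]]:
    """Return (offset, character-name) pairs for invisible characters."""
    return [(i, unicodedata.name(ch, f"U+{ord(ch):04X}"))
            for i, ch in enumerate(text) if ch in ZERO_WIDTH]

clean = find_invisible_chars("normal text")            # []
poisoned = find_invisible_chars("ig\u200bnore this")   # hit at offset 2
```

Documents with any hits should be quarantined for human review before they ever reach a notebook.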
BD-005: No Automated Hallucination Scoring
MEDIUM
Hallucination auditing is currently manual.
Recommendation: planned passage-level verification with an automated reliability score. See Roadmap §25.
BD-006: MVAL Not Enforced at Platform Level
CRITICAL
MVAL compliance depends entirely on researcher discipline. No hard field validation or submission gate exists. This is the structural gap most likely to undermine the system's core mission.
Recommendation: structured intake form or a lightweight custom web front end with required fields. See Roadmap AI-001.
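The essence of the recommended submission gate is a hard check on required fields before an entry is accepted. The field list below is a hypothetical simplification of the MVAL template, and the function names are assumptions; the point is only that validation is enforced by code, not discipline.

```python
# Sketch of a hard MVAL submission gate (BD-006): entries with missing or
# empty required fields are rejected at intake. Field names are an assumed
# simplification of the real MVAL template.
REQUIRED_FIELDS = ["method", "version", "artifacts", "lineage", "why_decisions"]

def validate_mval_entry(entry: dict) -> list[str]:
    """Return the required fields that are missing or blank (empty list = pass)."""
    return [f for f in REQUIRED_FIELDS
            if not str(entry.get(f, "")).strip()]

entry = {"method": "fine-tune", "version": "v3",
         "artifacts": "model.pkl", "lineage": "run-12"}
missing = validate_mval_entry(entry)   # why_decisions is absent, so it fails
```

Wired into an intake form's submit handler, this closes the structural gap: an incomplete log cannot enter the corpus in the first place.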
BD-007: MAB Cold-Start Data Dependency
MEDIUM
The bandit requires interaction data to personalize. New institutional deployments begin in expert-guided Phase 1 with limited personalization capability.
Recommendation: pre-load institutional cluster priors from similar cohort profiles. See Roadmap AI-004.
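One common way to pre-load bandit priors, shown here as an assumption rather than the system's actual method, is Thompson sampling with Beta priors seeded from aggregate cohort data instead of the uniform Beta(1, 1). The arm names and prior counts below are purely illustrative.

```python
import random

# Assumed mitigation sketch for BD-007: seed each pedagogy arm's Beta prior
# with pseudo-counts from similar cohorts so early picks are informed,
# not random. Arm names and counts are illustrative.
cohort_priors = {
    # arm: (pseudo-successes, pseudo-failures)
    "scaffolding": (8, 2),
    "direct_instruction": (5, 5),
    "cognitive_apprenticeship": (6, 4),
}

def pick_arm(priors: dict[str, tuple[int, int]], rng: random.Random) -> str:
    """Thompson sampling: draw from each arm's Beta posterior, take the argmax."""
    draws = {arm: rng.betavariate(a, b) for arm, (a, b) in priors.items()}
    return max(draws, key=draws.get)

arm = pick_arm(cohort_priors, random.Random(0))
```

As real interaction data accumulates, each arm's observed successes and failures are simply added to its pseudo-counts, so the cohort prior fades in influence over time.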
Target Deployment Contexts
| Context | Primary Value Proposition | Key Features Used |
| --- | --- | --- |
| Business School Executive Education | Preserve case-analysis reasoning across cohorts; structure participant documentation; reduce facilitator gap-filling | MVAL (Why/Decisions field critical), Cognitive Apprenticeship mode, Handoff Notebook |
| Think Tanks & Policy Research | Document research lineage; prevent institutional memory loss at analyst transitions; enable audit-ready citation trails | Source-grounded RAG, Failure Archive, Passage-Level Citation, CRITIQ integration (planned) |
| Graduate & Professional Schools | Replace ad hoc research documentation; shift advisor meetings from gap-filling to strategy; train reproducibility habits early | Full MVAL protocol, Project Charter Notebook, Pre-Meeting Brief Generation, MAB pedagogy engine |
| Independent & Private Schools (STEM) | Build structured research-documentation habits early; scaffold inquiry-based learning; track student progress longitudinally | Scaffolding + Direct Instruction modes, MVAL simplified template, Notebook Segmentation |
| Applied AI Research Labs | Solve the vanishing-laboratory problem in cloud-native ML research; enable reproducible experiment infrastructure | Environment field (MVAL), Failure Artifact Protocol, MCP integration, Python execution bridge |
Active Deployments: Pilot Data
Humanitarians AI Fellows Program
| Program | Research Domain | Primary Boyle Use Case | Status |
| --- | --- | --- | --- |
| AI Skunkworks (Partner University) | Applied AI / data science | Cloud pipeline documentation, inference reproducibility | Live |
| Lyrical Literacy | Music, neuroscience, language acquisition | Software dev logs, neural connectivity tracking | Live |
| Botspeak | AI fluency and human-AI task delegation | Strategic delegation logs, ethical boundary records | Live |
| Fellows Program (general) | Multi-domain applied AI (~150 volunteers) | Onboarding documentation, project handoff infrastructure | Live |
| Metric | Before Boyle System | After Boyle System |
| --- | --- | --- |
| Mentor meeting time on gap review | ~60% | ~20% |
| Mentor meeting time on strategic discussion | ~40% | ~80% |
| Onboarding time for new team members | Baseline | Target: >50% reduction |
| Duplicate work incidents | Frequent | Target: near zero |
Integration Methods and Stability
Stability matters for institutional deployment.
Three of the four available integration methods are unofficial and brittle. Plan accordingly. Build on the official Discovery Engine API for any production institutional deployment.
| Integration Method | Technical Mechanism | Key Capability | Stability |
| --- | --- | --- | --- |
| Python SDK (notebooklm-py) | Browser automation via Playwright | Full access to chat, sources, and artifacts | ⚠ Unofficial; brittle, UI changes break it |
| MCP Server | Model Context Protocol | Integration with Claude Desktop / Claude Code | ⚠ Unofficial; promising but not production-ready |
| Discovery Engine API | Official GCP REST endpoints | Enterprise-grade notebook management | ✓ Official; enterprise-grade, use for production |
| Typer CLI | Command-line interface | Human-operated automation from the terminal | ⚠ Unofficial; suitable for individual researcher use only |
MCP server deployment configuration is in development. See Roadmap AI-007.
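For teams starting on the official path, the main orientation point is that GCP Discovery Engine resources are addressed by hierarchical names under a versioned REST host. The sketch below only builds such a name and URL; authentication (e.g. a bearer token from `gcloud`) and the exact NotebookLM-related resource types should be confirmed against current Google Cloud documentation before use.

```python
# Sketch of Discovery Engine resource addressing. Builds the resource name
# and request URL only; no network call or auth is made here. Project and
# data-store IDs are hypothetical examples.
API_HOST = "https://discoveryengine.googleapis.com/v1"

def data_store_name(project: str, location: str, data_store: str,
                    collection: str = "default_collection") -> str:
    """Hierarchical GCP resource name for a Discovery Engine data store."""
    return (f"projects/{project}/locations/{location}"
            f"/collections/{collection}/dataStores/{data_store}")

name = data_store_name("my-project", "eu", "boyle-corpus")
url = f"{API_HOST}/{name}"
```

The "eu" location matches the EU multi-region deployment model mentioned below; other deployments would substitute their own region.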
Security and Data Governance
Data classification determines which tier of NotebookLM is appropriate for a given deployment. This table is the decision reference โ apply it before ingesting any document into any notebook.
| Data Class | Standard NotebookLM | Workspace / Enterprise |
| --- | --- | --- |
| Public research papers, published documentation | ✓ Permitted | ✓ Permitted |
| Internal project charters, MVAL logs | ⚠ Assess organizational risk | ✓ Permitted |
| Personal health records (HIPAA) | ✗ Prohibited | ⚠ Requires Business Associate Agreement (BAA) |
| Financial records | ✗ Prohibited | ⚠ Assess compliance posture |
| Export-controlled data (ITAR / EAR) | ✗ Prohibited | ✗ Prohibited |
| IRB-adjacent human subjects data | ✗ Prohibited | ⚠ Consult IRB before ingesting any data |
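Because the table is the decision reference, it can also be encoded as a lookup that an ingestion pipeline consults automatically. The class labels and decision strings below are illustrative assumptions; the table above and your compliance team remain authoritative.

```python
# The classification table encoded as a lookup, so an ingestion pipeline
# can enforce it mechanically. Labels and decision strings are assumed
# simplifications of the table above.
RULES = {
    # data_class: (standard NotebookLM, Workspace/Enterprise)
    "public": ("permitted", "permitted"),
    "internal": ("assess_org_risk", "permitted"),
    "phi_hipaa": ("prohibited", "requires_baa"),
    "financial": ("prohibited", "assess_compliance"),
    "export_controlled": ("prohibited", "prohibited"),
    "human_subjects": ("prohibited", "consult_irb"),
}

def ingestion_decision(data_class: str, enterprise: bool) -> str:
    """Look up the tier-appropriate decision for a document's data class."""
    standard, ent = RULES[data_class]
    return ent if enterprise else standard
```

Anything other than "permitted" should block automated ingestion and route the document to a human reviewer.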
EU AI Act (August 2026)
The EU AI Act becomes fully applicable August 2026. For European institutional partners, the current regional deployment model (EU multi-region via Discovery Engine) may need governance documentation review before that date. This is an open question requiring partner input. See Roadmap Open Question 5.
A one-page data-classification guide for institutional partners, covering what can go in standard NotebookLM, what requires Enterprise, and what is prohibited in any cloud system, is in development. See Roadmap AI-005.