📄 Reference – Boyle System Suite · Document 5 of 6

DEPLOYMENT GUIDE

Cross-System Analysis, Operations, Integrations & Security

What to know before deploying: system tradeoffs, integration stability, active pilot data, security classifications, and target deployment contexts. For institutional leads, IT administrators, and implementation teams.

Version 1.1 | March 2026

Medhavy AI, LLC  |  Bear Brown LLC  |  Humanitarians AI (501(c)(3))

Reviewed by Dev the Dev
What this reference covers: RAG vs. long-context tradeoffs, grounded vs. non-grounded LLM performance data, active deployment metrics, integration methods and their stability, security and data governance by context, and target deployment scenarios. If you need to understand the underlying system architecture, start with the System Overview.

RAG vs. Long-Context Window

Institutional evaluators frequently ask why the Boyle System uses source-grounded RAG rather than simply loading all documents into a long-context window. The tradeoff is architectural, not preferential.

| Dimension | Long-Context Window | Source-Grounded RAG (Boyle System) |
| --- | --- | --- |
| Data location | Entire document in active working memory | Semantic index; chunks retrieved per query |
| Citation precision | Low – reasoning is holistic | High – specific passage linked to every claim |
| Hallucination risk | Higher – model may blend sources | Lower – constrained to retrieved chunks |
| Audit trail | Difficult – cannot trace a specific claim to a passage | Built-in – inline citation to exact text |
| Best use case | Holistic synthesis of a single large document | Precise retrieval across 50+ diverse sources |
| Regulatory suitability | Limited – hard to satisfy audit requirements | Strong – every claim is traceable to source |
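The retrieval-plus-citation behavior in the table can be sketched in miniature. This is an illustrative toy only: it uses keyword overlap where a real deployment uses semantic embeddings, and all names (`retrieve`, `chunks`, the chunk IDs) are hypothetical, not part of any Boyle System API. The key property it demonstrates is that every answer carries a chunk ID, and an ungrounded query produces a refusal rather than a guess.

```python
# Toy sketch of source-grounded retrieval with per-claim citation.
# Hypothetical names throughout; keyword overlap stands in for embeddings.

def retrieve(query, chunks, min_overlap=2):
    """Return (chunk_id, text) for the best-matching chunk, or None to refuse."""
    q_terms = set(query.lower().split())
    cid, text, overlap = max(
        ((cid, text, len(q_terms & set(text.lower().split())))
         for cid, text in chunks.items()),
        key=lambda t: t[2],
    )
    if overlap < min_overlap:
        return None  # no grounded source: refuse rather than answer from nothing
    return cid, text  # the chunk ID is the inline citation

chunks = {
    "doc1:p3": "the fellows program onboards volunteers with handoff notebooks",
    "doc2:p1": "source grounded retrieval links every claim to a passage",
}
hit = retrieve("how does retrieval link a claim to a passage", chunks)
```

Because the answer is bounded by `chunks`, the audit trail in the table above falls out for free: the returned `chunk_id` is the traceable citation.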

Grounded vs. Non-Grounded LLM Performance

| Metric | Non-Grounded LLM | NotebookLM (Boyle System) |
| --- | --- | --- |
| Hallucination rate | ~40% | ~13% overall; ~0% on specific citation queries |
| Citation precision | Low / variable | 95% in audited clinical tasks |
| Context window | Pre-trained knowledge (static) | ~25 million words per notebook (dynamic) |
| Update frequency | Requires retraining or fine-tuning | Instantaneous upon document upload |
| Data privacy | Often shared for training | Private; no sharing under enterprise agreement |
| Specificity of response | Generic – drawn from broad pre-training | Context-specific – bounded by uploaded corpus |
The constraint is the feature: the Boyle System's inability to answer questions about topics not in its corpus is not a limitation to work around; it is the mechanism that makes citation precision possible. A system that can say "I don't have a source for that" is more trustworthy than one that generates plausible-sounding answers from training data.

Known Issues and Technical Debt

BD-001: Source Slot Ceiling HIGH
The 50-source limit per notebook constrains long-running projects. It is mitigated by Ouroboros and stitching, but these add manual overhead and citation risk. Recommendation: Source-slot monitoring with automated alerts at a 40-source threshold.
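A minimal sketch of the recommended monitoring, assuming only that a per-notebook source count is available. The function name and message wording are hypothetical; the 50-source ceiling and 40-source alert threshold come from BD-001 itself.

```python
# Hedged sketch: alert when a notebook approaches the 50-source ceiling (BD-001).

SOURCE_LIMIT = 50       # hard platform ceiling per notebook
ALERT_THRESHOLD = 40    # recommended early-warning threshold

def check_source_slots(notebook_name, source_count):
    """Return an alert message when a threshold is crossed, else None."""
    if source_count >= SOURCE_LIMIT:
        return (f"{notebook_name}: at the {SOURCE_LIMIT}-source limit; "
                f"begin stitching or Ouroboros conversion")
    if source_count >= ALERT_THRESHOLD:
        return (f"{notebook_name}: {source_count}/{SOURCE_LIMIT} slots used; "
                f"plan segmentation now")
    return None
```

Wiring this to a scheduler or chat webhook is deployment-specific and left out here.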
BD-002: Citation Loss on Ouroboros Conversion CRITICAL
Converting notes to sources strips original inline citations. Risk escalates with each cycle. Recommendation: Mandatory metadata checklist before every conversion. See Corpus Management §2.
BD-003: No Native Python Execution HIGH
NotebookLM cannot run code or perform mathematical calculations. May return confident but incorrect quantitative answers. Recommendation: Integration protocol with Vertex AI Workbench or Colab; quantitative outputs logged back to MVAL. See Roadmap AI-003.
BD-004: Knowledge-Based Poisoning Vulnerability HIGH
Malicious or corrupted documents can bias outputs. Zero-width Unicode characters are invisible to human reviewers but readable by the AI. Recommendation: Mandatory source validation workflow before ingestion.
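One concrete step in such a validation workflow can be sketched directly: scan incoming text for zero-width Unicode characters before ingestion. The character set below covers the common cases but is not exhaustive, and the function name is illustrative, not part of any existing tool.

```python
# Hedged sketch: flag zero-width Unicode characters that are invisible to
# human reviewers but readable by the model (BD-004). Not an exhaustive list.

ZERO_WIDTH = {
    "\u200b",  # ZERO WIDTH SPACE
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
    "\u2060",  # WORD JOINER
    "\ufeff",  # ZERO WIDTH NO-BREAK SPACE / BOM
}

def find_zero_width(text):
    """Return (index, codepoint) pairs for every zero-width character found."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ch in ZERO_WIDTH]

assert find_zero_width("normal text") == []
assert find_zero_width("poi\u200bsoned") == [(3, "U+200B")]
```

A real pipeline would run this (and related checks, e.g. for bidirectional override characters) on every document before it reaches a notebook, quarantining anything flagged.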
BD-005: No Automated Hallucination Scoring MEDIUM
Hallucination auditing is currently manual. Recommendation: Planned – passage-level verification with automated reliability score. See Roadmap §25.
BD-006: MVAL Not Enforced at Platform Level CRITICAL
MVAL compliance depends entirely on researcher discipline. No hard field validation or submission gate exists. This is the structural gap most likely to undermine the system's core mission. Recommendation: Structured intake form or custom lightweight web front-end with required fields. See Roadmap AI-001.
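The recommended submission gate reduces to a required-fields check. The sketch below assumes a hypothetical field list; the actual MVAL schema is defined in the protocol documents, so treat the names here as placeholders for whatever fields the intake form enforces.

```python
# Hedged sketch of an MVAL intake gate (BD-006): reject entries whose required
# fields are missing or empty. Field names are illustrative placeholders.

REQUIRED_FIELDS = ("why", "decisions", "environment", "outcome")

def validate_mval_entry(entry):
    """Return the list of missing or empty required fields; empty list = pass."""
    return [f for f in REQUIRED_FIELDS if not str(entry.get(f, "")).strip()]

entry = {"why": "compare retrievers", "decisions": "", "environment": "colab"}
problems = validate_mval_entry(entry)  # -> ['decisions', 'outcome']
```

Embedded in a structured intake form or lightweight web front-end, this turns MVAL compliance from a discipline problem into a hard gate: a submission with a non-empty `problems` list is simply not accepted.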
BD-007: MAB Cold-Start Data Dependency MEDIUM
The bandit requires interaction data to personalize. New institutional deployments begin in expert-guided Phase 1 with limited personalization capability. Recommendation: Pre-load institutional cluster priors from similar cohort profiles. See Roadmap AI-004.

Target Deployment Contexts

| Context | Primary Value Proposition | Key Features Used |
| --- | --- | --- |
| Business School Executive Education | Preserve case analysis reasoning across cohorts; structure participant documentation; reduce facilitator gap-filling | MVAL (Why/Decisions field critical), Cognitive Apprenticeship mode, Handoff Notebook |
| Think Tanks & Policy Research | Document research lineage; prevent institutional memory loss at analyst transitions; enable audit-ready citation trails | Source-grounded RAG, Failure Archive, Passage-Level Citation, CRITIQ integration (planned) |
| Graduate & Professional Schools | Replace ad hoc research documentation; shift advisor meetings from gap-filling to strategy; train reproducibility habits early | Full MVAL protocol, Project Charter Notebook, Pre-Meeting Brief Generation, MAB pedagogy engine |
| Independent & Private Schools (STEM) | Build structured research documentation habits early; scaffold inquiry-based learning; track student progress longitudinally | Scaffolding + Direct Instruction modes, MVAL simplified template, Notebook Segmentation |
| Applied AI Research Labs | Solve the vanishing laboratory problem in cloud-native ML research; enable reproducible experiment infrastructure | Environment field (MVAL), Failure Artifact Protocol, MCP integration, Python execution bridge |

Active Deployments – Pilot Data

Humanitarians AI Fellows Program

| Program | Research Domain | Primary Boyle Use Case | Status |
| --- | --- | --- | --- |
| AI Skunkworks (Partner University) | Applied AI / Data Science | Cloud pipeline documentation, inference reproducibility | Live |
| Lyrical Literacy | Music, neuroscience, language acquisition | Software dev logs, neural connectivity tracking | Live |
| Botspeak | AI fluency and human-AI task delegation | Strategic delegation logs, ethical boundary records | Live |
| Fellows Program (general) | Multi-domain applied AI (~150 volunteers) | Onboarding documentation, project handoff infrastructure | Live |
Measured Outcomes – Active Pilot

| Metric | Before Boyle System | After Boyle System |
| --- | --- | --- |
| Mentor meeting time on gap-review | ~60% | ~20% |
| Mentor meeting time on strategic discussion | ~40% | ~80% |
| Onboarding time for new team members | Baseline | Target: >50% reduction |
| Duplicate work incidents | Frequent | Target: near zero |

Integration Methods and Stability

Stability matters for institutional deployment. Three of the four available integration methods are unofficial and brittle. Plan accordingly. Build on the official Discovery Engine API for any production institutional deployment.
| Integration Method | Technical Mechanism | Key Capability | Stability |
| --- | --- | --- | --- |
| Python SDK (notebooklm-py) | Browser automation via Playwright | Full access to chat, sources, and artifacts | ⚠ Unofficial – brittle; UI changes break it |
| MCP Server | Model Context Protocol | Integration with Claude Desktop / Claude Code | ⚠ Unofficial – promising but not production-ready |
| Discovery Engine API | Official GCP REST endpoints | Enterprise-grade notebook management | ✓ Official – enterprise-grade; use for production |
| Typer CLI | Command-line interface | Human-operated automation from terminal | ⚠ Unofficial – suitable for individual researcher use only |

MCP server deployment configuration is in development. See Roadmap AI-007.

Security and Data Governance

Data classification determines which tier of NotebookLM is appropriate for a given deployment. This table is the decision reference; apply it before ingesting any document into any notebook.

| Data Class | Standard NotebookLM | Workspace / Enterprise |
| --- | --- | --- |
| Public research papers, published documentation | ✓ Permitted | ✓ Permitted |
| Internal project charters, MVAL logs | ⚠ Assess organizational risk | ✓ Permitted |
| Personal health records (HIPAA) | ✗ Prohibited | ⚠ Requires Business Associate Agreement (BAA) |
| Financial records | ✗ Prohibited | ⚠ Assess compliance posture |
| Export-controlled data (ITAR / EAR) | ✗ Prohibited | ✗ Prohibited |
| IRB-adjacent human subjects data | ✗ Prohibited | ⚠ Consult IRB before ingesting any data |
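For intake tooling, the classification table can be encoded as a lookup so that only unconditional "Permitted" entries allow automatic ingestion, while every other status routes to a human reviewer. This is a sketch under assumed names; the class keys and status labels below are illustrative, not an official taxonomy.

```python
# Hedged sketch: the data-classification table as a machine-checkable policy.
# Class keys and status labels are illustrative placeholders.

POLICY = {
    "public":            {"standard": "permitted",  "enterprise": "permitted"},
    "internal":          {"standard": "assess",     "enterprise": "permitted"},
    "phi_hipaa":         {"standard": "prohibited", "enterprise": "baa_required"},
    "financial":         {"standard": "prohibited", "enterprise": "assess"},
    "export_controlled": {"standard": "prohibited", "enterprise": "prohibited"},
    "irb_adjacent":      {"standard": "prohibited", "enterprise": "consult_irb"},
}

def may_ingest(data_class, tier):
    """Only an unconditional 'permitted' status allows automatic ingestion;
    'assess', 'baa_required', and 'consult_irb' all require human review."""
    return POLICY[data_class][tier] == "permitted"
```

Treating conditional statuses as a hard stop errs on the safe side: a BAA or an IRB consultation is a process outcome, not something a script can verify.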
EU AI Act – August 2026: The EU AI Act becomes fully applicable in August 2026. For European institutional partners, the current regional deployment model (EU multi-region via Discovery Engine) may need a governance documentation review before that date. This is an open question requiring partner input. See Roadmap Open Question 5.
A one-page data classification guide for institutional partners, covering what can go in standard NotebookLM, what requires Enterprise, and what is prohibited in any cloud system, is in development. See Roadmap AI-005.