Version 1.1 | March 2026 | Reviewed by Dev the Dev

Presented by: Medhavy AI, LLC

In association with: Bear Brown LLC

Research partner: Humanitarians AI (501(c)(3))

Contact: bear@bearbrown.co | bear@humanitarians.ai

What this doc answers: Why does institutional knowledge keep disappearing in research and AI labs, and what is the Boyle System's structural answer? This is an explanation document. If you need to write an MVAL log entry right now, go to the MVAL Reference. If you need to configure a corpus, go to Corpus Management.

This documentation suite — six documents

Type	Document	Audience
Explanation (you are here)	System Overview	Partners, evaluators, research leads
Reference	MVAL Protocol	Researchers logging work
Reference	Corpus Management	Active users managing notebooks
Explanation	Adaptive Architecture	Engineers, program designers
Reference	Deployment Guide	Operations, institutional contacts
Reference	Roadmap	Partners, development team

1. The Problem This System Addresses

A client flags a number on a dashboard built six months ago. The data is still there. The pipeline is still running. The dashboard is still live. But the analyst who built it is gone. The metric definition was never written down. Is the number wrong? Nobody knows. Nobody can know. This is not a rare disaster. It is Tuesday.

Modern AI research occurs within virtualized, elastic cloud environments engineered for rapid instantiation and immediate abandonment. This architecture facilitates what the Boyle System calls the "vanishing laboratory" — where the intricate web of dependencies, library versions, hardware configurations, and environmental variables that produced a result evaporates the moment a virtual machine is decommissioned.

The Vanishing Laboratory

Virtual machines dissolve. Library versions, dataset checksums, and hardware configurations evaporate. The result survives. The conditions do not. The next researcher inherits a number with no reproducible path to it.

The Documentation Gap

Critical decisions happen in undocumented threads, ephemeral terminal sessions, and local notebooks never committed to a repository. The "why" disappears with each personnel transition — leaving successors to reverse-engineer intent from outcomes.

Core Insight: The reproducibility crisis in machine learning is not primarily a problem of statistical methodology. It is a problem of vanishing laboratories. The Boyle System is a structural intervention: it makes the right documentation behavior the natural one, not the effortful one.

The same failure pattern appears in every institutional context where complex knowledge work is handed across people or time: research labs losing experiment context at team transitions, executive education programs where case reasoning disappears after cohort rotation, think tanks where analyst departure takes three years of research lineage with it, graduate programs where advisor meetings are consumed by re-establishing context that should have been logged.

2. Why "Boyle" — The Historical Foundation

Robert Boyle (1627–1691) understood that for an experiment to be scientifically valid, it had to be verifiable by others. Because the physical laboratory was private, Boyle developed a style of reporting so detailed that readers could become "virtual witnesses" — able to evaluate the conditions and reasoning behind a result without being present for it.

The Boyle System applies this same philosophy to cloud credentials, API keys, library versions, and instructional design choices. The log entry is the laboratory. The six MVAL fields are the witnessing protocol.

Documentation Dimension	Aristotelian (Pre-Boyle)	Boyle's Empirical Approach	The Boyle System (Modern)
Primary Methodology	Abstract logic and reasoning	Observation and experimentation	Grounded AI synthesis via RAG
Documentation Depth	Minimal; focused on final truths	Extensive; focused on conditions	Mandatory MVAL fields (all six)
Role of Failure	Ignored as an error of logic	Recorded as essential data	Logged as a first-class artifact
Verification Mechanism	Internal consistency of argument	"Virtual witnessing" via narrative	Citation-backed source grounding
Social Structure	Individual philosopher	Royal Society "matter of fact"	Collaborative AI research labs & classrooms

3. System Architecture

3.1 The Technical Core: Source-Grounded RAG

The Boyle System is powered by NotebookLM's Source-Grounded Retrieval-Augmented Generation (RAG) pipeline. Unlike standard large language models that generate from pre-trained patterns, the system can only "know" what has been uploaded to its corpus — its limitation is its superpower. Every claim the system makes is traceable to a specific passage in a specific source document.

┌─────────────────────────────────────────────────────────────────────┐
│                        RESEARCHER / LEARNER INPUT                    │
│   Project Charter · Degree Requirements · Boyle Principles ·        │
│   MVAL Entries · Cloud Configs · Failed Experiment Logs              │
└───────────────────────────────┬─────────────────────────────────────┘
                                │ Upload / Ingest
                                ▼
┌─────────────────────────────────────────────────────────────────────┐
│                     NOTEBOOKLM CORPUS (RAG)                          │
│  ┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐   │
│  │ Document        │   │ Gemini Embedding │   │ Vector Index    │   │
│  │ Ingestion       │──▶│ Model           │──▶│ (Nearest        │   │
│  │ (Chunking)      │   │ (Vectorization) │   │  Neighbor)      │   │
│  └─────────────────┘   └─────────────────┘   └────────┬────────┘   │
└────────────────────────────────────────────────────────┼────────────┘
                                                         │ Cosine Similarity Retrieval
                                                         ▼
┌─────────────────────────────────────────────────────────────────────┐
│               THREE-ROLE AI PARTNER + ADAPTIVE INSTRUCTOR            │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌────────────────────┐ │
│  │  TUTOR   │  │  CRITIC  │  │  GUIDE   │  │  MAB PEDAGOGY      │ │
│  │ Context- │  │Challenges│  │  Cloud   │  │  ENGINE (5 Modes)  │ │
│  │ aware    │  │  vague   │  │  infra   │  │  Socratic·Scaffold  │ │
│  │ guidance │  │  entries │  │  logging │  │  Direct·Apprentice  │ │
│  └──────────┘  └──────────┘  └──────────┘  │  Metacognitive     │ │
│                                              └────────────────────┘ │
└───────────────────────────────┬─────────────────────────────────────┘
                                │ Cited, Grounded, Personalized Response
                                ▼
┌─────────────────────────────────────────────────────────────────────┐
│                         MVAL LOG ENTRY                               │
│         What · Why · How · Environment · Results · Questions         │
└─────────────────────────────────────────────────────────────────────┘
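The retrieval path in the diagram (chunking, vectorization, nearest-neighbor lookup by cosine similarity) can be sketched in a few lines. This is a toy illustration, not NotebookLM's implementation: the word-count `embed` function stands in for the Gemini embedding model, the brute-force scan stands in for the vector index, and the function names, file names, and sample text are all hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text: str, max_words: int = 40) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase token counts (assumption: stand-in for a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, sources: dict[str, str], k: int = 2) -> list[tuple[float, str, str]]:
    """Return the top-k (score, source_name, chunk) tuples for a query."""
    q = embed(query)
    scored = [
        (cosine(q, embed(c)), name, c)
        for name, text in sources.items()
        for c in chunk(text)
    ]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:k]


# Hypothetical two-document corpus
sources = {
    "project_charter.txt": "All experiments must record library versions and dataset checksums in the log.",
    "failed_runs.txt": "Run 12 failed because the learning rate was too high for the small batch size.",
}
top = retrieve("Why did run 12 fail?", sources, k=1)
score, source_name, passage = top[0]
print(f"{source_name}: {passage}")
```

Because every retrieved chunk carries its source name, the answer can cite where it came from, which is the property the section describes as source grounding.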

3.2 Platform Capacity

Resource	Limit	Notes
Notebooks per account	100	Segment by project / domain / cohort
Sources per notebook	50	Managed via Ouroboros + stitching strategies
Words per source	500,000	Maximized via source stitching
Total corpus per notebook	~25 million words	Equivalent to ~25 large technical monographs
Context window (Gemini 1.5 Pro)	1M tokens	Near-perfect recall (>99.7%) up to this limit on needle-in-a-haystack retrieval tests, as reported by Google
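The per-notebook total follows directly from the two limits above it: 50 sources × 500,000 words = 25 million words. The sketch below reproduces that arithmetic and shows one way a stitching plan might be computed. It assumes "source stitching" means concatenating several smaller documents into one source so the 50-source limit is not exhausted; the `stitch` function and the document names are hypothetical, not part of the Boyle System.

```python
SOURCES_PER_NOTEBOOK = 50
WORDS_PER_SOURCE = 500_000
NOTEBOOKS_PER_ACCOUNT = 100

# Capacity figures derived from the table's limits
words_per_notebook = SOURCES_PER_NOTEBOOK * WORDS_PER_SOURCE
words_per_account = words_per_notebook * NOTEBOOKS_PER_ACCOUNT


def stitch(doc_word_counts: dict[str, int], limit: int = WORDS_PER_SOURCE) -> list[list[str]]:
    """Greedily group documents, in order, into stitched sources under the word limit."""
    groups: list[list[str]] = []
    current: list[str] = []
    used = 0
    for name, count in doc_word_counts.items():
        if count > limit:
            raise ValueError(f"{name} exceeds the per-source limit and must be split first")
        if current and used + count > limit:
            # Close the current stitched source and start a new one
            groups.append(current)
            current, used = [], 0
        current.append(name)
        used += count
    if current:
        groups.append(current)
    return groups


# Hypothetical documents and word counts
docs = {"charter": 300_000, "protocol": 150_000, "logs_q1": 100_000, "logs_q2": 400_000}
plan = stitch(docs)
print(words_per_notebook, plan)
```

A greedy in-order plan keeps related documents adjacent, at the cost of not always using the minimum number of sources.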

4. The Three-Role AI Partnership

The Boyle System deploys three distinct AI roles within each NotebookLM corpus. These are not features; they are structural commitments about what kind of assistance the system provides and what it refuses to do.

🎓 Role 1 — Tutor

Function: Context-aware documentation guidance grounded in the researcher's or learner's actual project charter, degree requirements, and institutional standards — not generic best practices from the internet.

Example: A researcher asks how to document a Python web-scraping project. A generic AI returns README advice. The Boyle System returns guidance specific to the team's standards, citing page references from the uploaded project charter and compliance requirements from the institutional protocol document.

Key behavior: Cannot give generic advice — it has no generic context to draw from. Its knowledge is bounded by what has been uploaded.

🔍 Role 2 — Critic

Function: Continuous audit of log entries. Surfaces vague outcomes, implicit assumptions, and missing failure records.

Example prompts the Critic generates:

"This is an outcome, not a method. How did you get here?"
"What failed before this worked?"
"What assumptions are implicit here that the next researcher won't know?"

Key behavior: Combats "interpretive drift" — the gradual transformation of nuanced observations into unsupported factual declarations over time and across personnel.
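To make the Critic's audit concrete, here is a minimal rule-based sketch. In the Boyle System the Critic is an AI role working over the corpus; the keyword heuristics below are only a stand-in to show the kind of check involved. The `MVALEntry` class, the `VAGUE_PHRASES` list, and the `critique` function are hypothetical names; the six field names and the two quoted prompts come from this document.

```python
from dataclasses import dataclass, fields


@dataclass
class MVALEntry:
    """The six MVAL fields, held as free text."""
    what: str = ""
    why: str = ""
    how: str = ""
    environment: str = ""
    results: str = ""
    questions: str = ""


# Assumption: a small hand-picked list of phrases that signal an outcome, not a method
VAGUE_PHRASES = ("it works", "works now", "fixed it", "as expected", "looks good")


def critique(entry: MVALEntry) -> list[str]:
    """Return Critic-style prompts for empty fields, vague methods, and missing failures."""
    prompts = []
    for f in fields(entry):
        if not getattr(entry, f.name).strip():
            prompts.append(f"The '{f.name}' field is empty. What belongs here?")
    if any(phrase in entry.how.lower() for phrase in VAGUE_PHRASES):
        prompts.append("This is an outcome, not a method. How did you get here?")
    # Crude heuristic: no mention of failure anywhere in the method or results
    if "fail" not in (entry.how + " " + entry.results).lower():
        prompts.append("What failed before this worked?")
    return prompts


entry = MVALEntry(
    what="Tuned the learning rate",
    why="Training loss diverged",
    how="Fixed it",
    environment="",
    results="Validation loss 0.21",
    questions="None yet",
)
for prompt in critique(entry):
    print(prompt)
```

A real deployment would rely on the grounded model rather than string matching, but the contract is the same: an entry goes in and a list of challenges comes out.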

⚙️ Role 3 — Operational Guide

Function: Treats cloud credentials, API keys, library versions, and environment variables as first-class research artifacts integrated into every log entry.

Key behavior: Transforms administrative overhead into a reproducible infrastructure artifact — the "matter of fact" of the cloud laboratory. Without this role, environment data lives only in memory and terminal history.
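One way to get environment data out of memory and terminal history is to capture it programmatically when the entry is written. The sketch below uses only the Python standard library. The `capture_environment` function and its output layout are hypothetical, and recording only whether a credential variable is set (never its value) is a design assumption made here so that secrets stay out of the log.

```python
import json
import os
import platform
from importlib import metadata


def capture_environment(packages: list[str], env_var_names: list[str]) -> dict:
    """Snapshot interpreter, platform, package versions, and presence of named variables."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # Not installed in this environment
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "machine": platform.machine(),
        "packages": versions,
        # Presence only: the value of a credential is never written to the log
        "env_vars_set": {name: name in os.environ for name in env_var_names},
    }


# Hypothetical package and variable names
snapshot = capture_environment(
    packages=["numpy", "requests"],
    env_var_names=["GOOGLE_APPLICATION_CREDENTIALS"],
)
print(json.dumps(snapshot, indent=2))
```

The JSON output can be pasted into the Environment field of an MVAL entry, giving the next researcher a record of the conditions alongside the result.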

5. What the Boyle System Is Not

Clarity on scope prevents misaligned deployments.

What it is	What it is not
A structured documentation protocol enforced through AI grounding	A general-purpose research assistant or LLM wrapper
A corpus management system with defined notebook segmentation	A data warehouse or database replacement
An adaptive instructional layer with five evidence-based modes	A learning management system (LMS) or gradebook
A source-grounded RAG pipeline that limits hallucination	A code execution environment — NotebookLM cannot run code
An institutional knowledge transfer infrastructure	A replacement for version control (Git, DVC, MLflow)

Known limitation: NotebookLM cannot execute Python or perform mathematical calculations. For quantitative workflows, the Boyle System requires a companion integration with Vertex AI Workbench or Google Colab. See Deployment Guide §21 for integration methods and Roadmap AI-003 for the planned execution bridge.

6. Where to Go from Here

Your role	Your immediate goal	Start here
Researcher / Fellow	Write my first MVAL log entry	MVAL Protocol Reference
Research lead / Team lead	Structure my NotebookLM corpus	Corpus Management Guide
Institutional partner / Evaluator	Assess deployment fit and security requirements	Deployment Guide
Engineer / Program designer	Understand the adaptive instruction layer	Adaptive Architecture
Development team / Partner	Track what's planned and what needs input	Roadmap & Open Questions
