
Research

/xavier research <topic> is the topic-first "teach me" skill. Where Learning Your Codebase is repo-first knowledge extraction and Investigate is symptom-first hypothesis search, research starts from a concept you want to understand and fans out across the web, internal docs, and your codebase in parallel. Each remora tackles a different facet, and Xavier stitches the findings into a structured digest you can read in one pass or save to your vault for later.

How it works

  1. Parse input — the --plan flag is extracted; everything else becomes the topic string.
  2. Check prior research — Xavier globs research/ in your vault for matching or related notes. If one exists, it shows you the existing note and asks: "Update or start fresh?"
  3. Decompose the topic — 3-5 research questions are generated using a guided template: Foundations, Practice, State of Art, plus 1-2 dynamic axes, plus Local Context if you are inside a git repo.
  4. --plan gate — if the flag is set, the decomposed questions are presented for approval or edits before any remoras spawn.
  5. Spawn remoras — one remora per question, all launched in parallel. Each has access to WebSearch, WebFetch, Glean, Confluence, and codebase grep/glob/read.
  6. Collect answers — each remora returns a concise factual answer under 500 words and a ### Sources subsection listing the URLs and file paths it consulted.
  7. Synthesize the digest — Xavier produces a TL;DR of 3-5 sentences, one section per axis distilled and connected across remora outputs, and a merged, deduplicated Sources list.
  8. Present inline — the full digest is shown in the conversation.
  9. Suggest a filename — Xavier proposes a kebab-case filename derived from the topic and asks you to confirm.
  10. Save to vault — the digest is written to ~/.xavier/research/<filename>.md.

The parallel spawn uses the same shark/remora pattern described in How It Works — you see progress as each remora completes, not one long blocking wait.
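Xavier's internals are not published here, but the fan-out/collect shape described above can be sketched with asyncio. Everything below (`run_remora`, `fan_out`, the canned answers) is hypothetical illustration under that assumption, not Xavier's actual code:

```python
import asyncio

# Hypothetical stand-in for a single remora. The real remoras have web
# search, Glean/Confluence, and codebase tools; this one just returns a
# canned answer shaped like step 6 (answer plus sources).
async def run_remora(question: str) -> dict:
    await asyncio.sleep(0)  # placeholder for the actual research work
    return {"question": question, "answer": f"findings for: {question}", "sources": []}

async def fan_out(questions: list[str]) -> list[dict]:
    # One task per question, all launched at once. as_completed yields
    # results as each remora finishes, not in one blocking batch.
    tasks = [asyncio.create_task(run_remora(q)) for q in questions]
    return [await t for t in asyncio.as_completed(tasks)]

answers = asyncio.run(fan_out(["Foundations", "Practice", "State of Art"]))
```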

Research axes

The digest is structured around three fixed axes, plus one or two dynamic axes, plus an optional Local Context axis.

| Axis | What it covers |
| --- | --- |
| Foundations | Core concepts, principles, mental model |
| Practice | Industry usage, tools, patterns, tradeoffs |
| State of Art | Latest developments, competing approaches, where the field is heading |

Dynamic axes are generated per topic. For an authentication topic Xavier might add "security implications"; for a data-structure topic, "performance characteristics"; for a BI topic, "tooling landscape". You get 1-2 of these tailored to the subject so the digest is not just a generic template filled with specifics.

Local Context is only spawned when you run research from inside a git repo. A dedicated remora searches the codebase for files, modules, configuration, and comments related to the topic, then adds a Local Context section to the digest showing how the concept connects to the code you are working in. Run from a non-repo directory and this axis is skipped — no repo field in the note either.
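The axis rules above (three fixed axes, at most two dynamic axes, Local Context only inside a git repo) can be sketched as a small function. `build_axes` and its parameters are illustrative, not Xavier's API; in the real system the dynamic axes are generated from the topic rather than passed in:

```python
def build_axes(topic: str, dynamic_axes: list[str], in_git_repo: bool) -> list[str]:
    # Three fixed axes are always present.
    axes = ["Foundations", "Practice", "State of Art"]
    # 1-2 topic-tailored axes (generated per topic in the real system;
    # supplied directly here to keep the sketch self-contained).
    axes += dynamic_axes[:2]
    # Local Context is only added when run from inside a git repo.
    if in_git_repo:
        axes.append("Local Context")
    return axes
```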

The --plan flag

--plan makes Xavier present the decomposed questions via an interactive prompt before any remoras spawn. You can approve the set, edit individual questions, or add your own. Useful when the topic is broad and you want to steer what the remoras focus on before committing to the parallel fan-out.

/xavier research --plan "semantic modelling for BI tools"

Without --plan, Xavier proceeds immediately after decomposition.
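The gate reduces to a single branch. A hypothetical sketch, with a `review` callback standing in for the interactive prompt (nothing here is Xavier's real interface):

```python
from typing import Callable, Optional

def plan_gate(questions: list[str], plan: bool,
              review: Optional[Callable[[list[str]], list[str]]] = None) -> list[str]:
    # Without --plan, the decomposed questions flow straight to the fan-out.
    if not plan or review is None:
        return questions
    # With --plan, the review step can approve the set unchanged, edit
    # individual questions, or append new ones before any remoras spawn.
    return review(questions)
```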

Prior-research flow

If a research note on the topic already exists in your vault, Xavier shows you the existing note and asks whether to update or start fresh.

  • Update — the prior note's content is passed to every remora as context. The prompt instructs them to focus on what is new, changed, or was missed. The old note is overwritten once synthesis completes — not duplicated.
  • Fresh — remoras start clean with no prior context. Note that the old note is also overwritten in this path, because the filename is topic-derived and the same topic maps to the same file. If you want to keep the old version, move it aside first.
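The overwrite behavior in both branches follows from the filename derivation: the same topic always slugs to the same file. A sketch with hypothetical helper names (`note_path`, `save_digest` are not Xavier's API):

```python
import re
import tempfile
from pathlib import Path

def note_path(vault: Path, topic: str) -> Path:
    # Topic-derived kebab-case filename: lowercase, non-alphanumeric runs
    # collapsed to hyphens. Same topic, same file, hence the overwrite.
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return vault / "research" / f"{slug}.md"

def save_digest(vault: Path, topic: str, digest: str) -> Path:
    path = note_path(vault, topic)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(digest)  # replaces any prior note at this path
    return path

vault = Path(tempfile.mkdtemp())
first = save_digest(vault, "Semantic modelling for BI tools", "v1 digest")
second = save_digest(vault, "Semantic modelling for BI tools", "v2 digest")
```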

Research note schema

Research notes use Zettelkasten frontmatter with a type: research marker and type-specific fields: topic, sources, and an optional repo.

---
topic: "Semantic modelling for BI tools"
repo: my-app # optional — omitted if not run from a git repo
type: research
created: 2026-04-23
updated: 2026-04-23
tags: [research, bi, semantic-layer]
related: ["[[research/data-warehousing-patterns]]"]
sources:
- "https://cube.dev/docs/schema"
- "src/semantic/models.ts"
---
  • topic — the original topic string you passed to the command. This is the primary identifier for research notes because research is topic-first, not repo-first.
  • sources — a machine-readable list of URLs and file paths consulted by the remoras. It sits alongside the human-readable ## Sources body section so tools and scripts can read sources without parsing markdown.
  • repo — optional. Included when you run research from inside a git repo, omitted otherwise. A pure concept research session does not need to be pinned to a repository.
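As a sketch of what "machine-readable" buys you: a script can pull sources out of the frontmatter without touching the markdown body. The regex approach below only handles the simple list shape shown above and is an assumption for illustration; a real tool would use a YAML parser:

```python
import re

NOTE = """\
---
topic: "Semantic modelling for BI tools"
type: research
sources:
- "https://cube.dev/docs/schema"
- "src/semantic/models.ts"
---
## TL;DR
...
"""

def read_sources(note: str) -> list[str]:
    # Grab the frontmatter block between the first pair of --- fences,
    # then collect the quoted items under the sources: key.
    match = re.search(r"^---\n(.*?)\n---", note, re.DOTALL)
    if not match:
        return []
    block = re.search(r"^sources:\n((?:- .*\n?)+)", match.group(1), re.MULTILINE)
    if not block:
        return []
    return [line[2:].strip().strip('"') for line in block.group(1).splitlines()]

found = read_sources(NOTE)
```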

See Vault Structure for the full schema and how research notes fit alongside the other Zettelkasten types.

Example

You want to ramp up on semantic modelling before a design discussion:

/xavier research "semantic modelling for BI tools"

Xavier decomposes, spawns five remoras (three fixed axes plus one dynamic axis — tooling landscape — plus Local Context since you are in a repo), and after they return you see a digest like this:

## TL;DR
Semantic modelling is the layer between raw data warehouses and BI consumption tools
that defines business-level entities, measures, and relationships once, so every
dashboard and query uses consistent definitions. Modern semantic layers (Cube, dbt
Semantic Layer, LookML) express models as code and compile to SQL at query time.
The field is converging on headless, API-first semantic layers that decouple
modelling from a single BI vendor. Your codebase already has a thin dimensional
layer in `src/semantic/` — introducing a semantic-layer tool would formalize it.
## Foundations
A semantic model encodes business definitions — what a customer is, how revenue
is computed, how orders roll up to accounts — in a single place so that every
downstream consumer gets the same answer. Core constructs are entities, measures,
dimensions, and joins. The goal is one source of truth for metrics, decoupled
from the physical warehouse schema.
## Practice
LookML pioneered the code-based semantic layer and remains the reference for
enterprise BI. dbt Semantic Layer and Cube have emerged as warehouse-native,
BI-agnostic alternatives. Teams typically version the model in git, test it
with unit tests against known-answer queries, and expose it via SQL or a GraphQL
API. Tradeoffs: heavier upfront modelling cost in exchange for metric consistency
and faster dashboard authoring downstream.
## State of Art
The headless semantic layer is the current direction — Cube, dbt Semantic Layer,
and MetricFlow all position themselves as BI-vendor-agnostic. Query federation
across warehouses is an active area, and LLM-driven text-to-metric interfaces
are starting to appear as a layer on top of semantic models rather than on raw SQL.
## Tooling Landscape
- Cube: open-source, REST/GraphQL/SQL APIs, caching layer, strong community
- dbt Semantic Layer: integrates with existing dbt projects, Semantic Layer API
- LookML (Looker): mature, tightly coupled to Looker as the BI surface
- MetricFlow: acquired into dbt Labs, powers dbt Semantic Layer under the hood
- AtScale, Honeydew: enterprise-focused alternatives
## Local Context
`src/semantic/models.ts` defines Customer, Order, and Revenue entities with
manual SQL templates. `src/reports/` assembles dashboards by composing these
templates — the same metric is recomputed across three files. Introducing a
semantic-layer tool would centralize those templates and remove the drift risk
already visible between `revenueThisMonth` and `monthlyRevenue`.
## Sources
- [Cube Semantic Layer docs](https://cube.dev/docs/schema) — entity and measure syntax
- [dbt Semantic Layer overview](https://docs.getdbt.com/docs/build/about-metricflow) — MetricFlow architecture
- [LookML reference](https://cloud.google.com/looker/docs/what-is-lookml) — the original code-based semantic layer
- `src/semantic/models.ts` — current dimensional definitions
- `src/reports/revenue.ts` — example of duplicated metric logic

Xavier then proposes semantic-modelling-for-bi-tools.md as the filename, you confirm, and the digest lands in ~/.xavier/research/.

When to use it

Pick research for "teach me X" — a concept, technology, or domain you want to understand. Pick Learning Your Codebase for "map my existing repo". Pick Investigate for "why is this broken". Research pulls knowledge in; Learning Your Codebase and Investigate analyze what is already there.

For the command signature and flag details, see the CLI commands reference.

Last updated: 4/23/26, 10:15 AM
