diff --git a/CLAUDE.md b/CLAUDE.md index ea262d27..9c0e8b0f 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -36,7 +36,7 @@ Read the relevant docs before starting work on a subsystem. - `/api/v1/auth/*` is owned end-to-end by **better-auth** (social sign-in, OAuth 2.1 authorize/token, sessions/sign-out) — see `AUTH_DESIGN.md`. The **web UI is served at root `/`** (any non-`/api` GET → static assets / SPA fallback), including the `/login` page the CLI authorize flow redirects to. - **Auth**: humans authenticate via **OAuth (GitHub/Google)** → a **better-auth session** — an httpOnly cookie for the web UI, and for the CLI an OAuth 2.1 authorization-code + PKCE loopback flow (RFC 8252) yielding access/refresh tokens. Agents — and a user, headlessly — use an **api key** (`me..`, a user PAT or agent key). Api keys are **global** per-principal credentials, not space-bound: the same key works in any space the principal has been admitted to (the space comes from `X-Me-Space`, gated by `build_tree_access`). better-auth owns session + OAuth-token storage (OAuth access/refresh tokens hashed sha256 at rest); api-key secrets are **core** sha256 (compared by equality in SQL), not argon2. Full design: `AUTH_DESIGN.md`. - **Embedding**: Vercel AI SDK; OpenAI `text-embedding-3-small` (1536-dim) in production; Ollama supported for local dev. -- **CLI**: `me` binary — `login`, `logout`, `whoami`, `space`, `group`, `access`, `agent`, `apikey`, `memory` (+ top-level aliases like `me search`, `me create` — except `import`), `import` (the source group: `memories`/`claude`/`codex`/`opencode`/`git`; `me memory import` and `me import` remain as aliases), `mcp`, `claude`/`codex`/`gemini`/`opencode`, `serve`, `pack`. 
+- **CLI**: `me` binary — `login`, `logout`, `whoami`, `space`, `group`, `access`, `agent`, `apikey`, `memory` (+ top-level aliases like `me search`, `me create` — except `import`), `import` (the source group: `memories`/`claude`/`codex`/`opencode`/`granola`/`git`; `me memory import` and `me import` remain as aliases), `mcp`, `claude`/`codex`/`gemini`/`opencode`, `serve`, `pack`. ## Principals, members, spaces (terminology) diff --git a/docs/cli/me-import.md b/docs/cli/me-import.md index aa185fed..391db455 100644 --- a/docs/cli/me-import.md +++ b/docs/cli/me-import.md @@ -8,6 +8,7 @@ Get data into Memory Engine — one subcommand per source. - [me import claude](#me-import-claude--codex--opencode) -- import Claude Code sessions - [me import codex](#me-import-claude--codex--opencode) -- import Codex sessions - [me import opencode](#me-import-claude--codex--opencode) -- import OpenCode sessions +- [me import granola](#me-import-granola) -- import Granola meeting notes and transcripts - [me import git](#me-import-git) -- import a repo's git commit history - [me import git-hook](#me-import-git-hook) -- install a post-commit hook that keeps git history memories current @@ -41,6 +42,83 @@ See [agent session imports](agent-session-imports.md) for the shared option refe --- +## me import granola + +Import meetings from [Granola](https://granola.ai) — one memory per meeting, holding the AI summary notes and (by default) the full transcript. Past meetings become searchable agent context ("what did we decide about X", "who was in the Y review"). + +``` +me import granola [options] +``` + +| Option | Description | +|--------|-------------| +| `--tree-root ` | Tree root under which `` leaves are placed. Default: `~/granola`. | +| `--since ` | Only import meetings started at or after this ISO 8601 timestamp. | +| `--until ` | Only import meetings started at or before this ISO 8601 timestamp. 
| +| `--no-transcript` | Import notes only, skipping the full meeting transcript (and its per-meeting API call). | +| `--include-invalid` | Include notes Granola did not flag as a valid meeting (ad-hoc notes, calendar stubs). | +| `--granola-dir ` | Override the Granola application-support directory (default: the standard macOS path). | +| `--dry-run` | Fetch and report what would be imported without writing. | + +### Authentication (no separate login) + +The import reuses the **Granola desktop app's** existing session — there is no separate `me`-side Granola login. It reads Granola's locally-stored, `safeStorage`-encrypted WorkOS tokens (decrypting them via the macOS login keychain), refreshes the short-lived access token through Granola's API, then pulls your meetings. Requirements: + +- The Granola desktop app is **installed and signed in** on this machine. +- **macOS only** for now (the credential read uses the login keychain). On other platforms the command exits with an actionable error. + +If the token refresh fails (e.g. Granola has been signed out), open the Granola app to refresh its session and re-run. + +### Tree layout + +Each meeting is a named leaf (its Granola `document_id`) under the tree root: + +``` +/ +``` + +The default root is your personal home (`~/granola`), so meetings are private to you. Pass `--tree-root /share/meetings` (or similar) to import into a shared space instead. + +### Content shape + +Each memory's content is a self-contained Markdown document: a title heading, a metadata line (date, attendees), the AI **summary notes**, and — unless `--no-transcript` — the full **transcript**. Notes are sourced, in order of preference, from the meeting's own `notes_markdown`, then an AI summary panel (its structured content, else its HTML). Transcript segments are grouped into speaker turns labelled `Me` (your microphone) and `Them` (everyone else); Granola does not attribute remote speakers by name. 
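The speaker-turn grouping described above can be sketched roughly as follows. This is an illustrative sketch, not the importer's actual code: `groupIntoTurns`, `renderTranscript`, and the `Segment`/`Turn` shapes are hypothetical names, and it assumes segments arrive in chronological order with `source === "microphone"` marking your own audio.

```typescript
// Hypothetical shapes for illustration; only `text` and `source` are used.
interface Segment {
  text?: string;
  /** "microphone" is the local user; anything else is treated as remote. */
  source?: string;
}

interface Turn {
  speaker: "Me" | "Them";
  text: string;
}

/** Merge consecutive segments from the same side into one speaker turn. */
function groupIntoTurns(segments: Segment[]): Turn[] {
  const turns: Turn[] = [];
  for (const seg of segments) {
    const text = (seg.text ?? "").trim();
    if (text.length === 0) continue; // skip empty segments
    const speaker = seg.source === "microphone" ? "Me" : "Them";
    const last = turns[turns.length - 1];
    if (last && last.speaker === speaker) {
      last.text += ` ${text}`; // same speaker: extend the current turn
    } else {
      turns.push({ speaker, text }); // speaker changed: start a new turn
    }
  }
  return turns;
}

/** Render turns as Markdown paragraphs, one per turn (format is an assumption). */
function renderTranscript(turns: Turn[]): string {
  return turns.map((t) => `**${t.speaker}:** ${t.text}`).join("\n\n");
}
```

Grouping by side rather than by named speaker matches the limitation noted above: everything that is not your microphone collapses into `Them`.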
+ +Meetings Granola flagged as not a valid meeting are skipped by default (`--include-invalid` keeps them), as are meetings with neither notes nor a transcript. + +### Idempotency and re-runs + +Idempotency is keyed on `(tree, document_id)` — each meeting is named by its Granola document id. The id is a timestamp-prefixed UUIDv7 (meeting start in the prefix, random tail), so meetings sort by date on the id. Re-imports reconcile in place via the server's content-aware upsert: an unchanged meeting is a no-op, a meeting whose notes/transcript changed (or an importer-version bump) is rewritten, and nothing is ever duplicated. Run it on a schedule to keep your meeting memory current. + +### Metadata + +| Key | Description | +|-----|-------------| +| `type` | Always `"granola_meeting"`. | +| `source_tool` | Always `"granola"`. | +| `source_document_id` | Granola document id (also the leaf name). | +| `display_name` | Human label for the web tree (`"Title — YYYY-MM-DD"`); the leaf `name` stays the document id so re-imports stay idempotent. | +| `source_workspace_id` | Granola workspace id (when present). | +| `source_calendar_event_id` | Google Calendar event id (when the meeting has one). | +| `attendees` | Calendar attendee emails (when present). | +| `content_mode` | `"with_transcript"` or `"notes_only"`. | +| `has_notes` / `has_transcript` | Whether each section was captured. | +| `transcript_segment_count` | Number of transcript segments. | +| `valid_meeting` | Granola's valid-meeting flag (when set). | +| `importer_version` | Version tag of the importer schema. | + +Temporal spans the meeting: calendar start→end when known, else the meeting's created time and last transcript segment. 
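The temporal fallback described above could look something like this. It is a minimal sketch under stated assumptions, not the importer's implementation: `resolveTemporal` and the trimmed-down interfaces are hypothetical, the field names follow the Granola document and transcript shapes, segments are assumed to be in chronological order, and collapsing to a zero-length span when there is no transcript is an assumption of this sketch.

```typescript
// Hypothetical, trimmed-down shapes for illustration only.
interface CalendarTime {
  dateTime?: string;
  date?: string;
}

interface Meeting {
  created_at?: string | null;
  google_calendar_event?: { start?: CalendarTime; end?: CalendarTime } | null;
}

interface Segment {
  end_timestamp?: string;
}

interface TemporalSpan {
  start: string;
  end: string;
}

/** Prefer the calendar start/end; otherwise fall back to created time and last segment. */
function resolveTemporal(
  meeting: Meeting,
  segments: Segment[],
): TemporalSpan | undefined {
  const event = meeting.google_calendar_event;
  const calStart = event?.start?.dateTime ?? event?.start?.date;
  const calEnd = event?.end?.dateTime ?? event?.end?.date;
  if (calStart && calEnd) return { start: calStart, end: calEnd };

  const start = meeting.created_at ?? undefined;
  if (!start) return undefined; // nothing to anchor the span on

  // Walk backwards to the last segment that carries an end timestamp.
  let end = start; // assumption: zero-length span when no transcript exists
  for (let i = segments.length - 1; i >= 0; i--) {
    const ts = segments[i]?.end_timestamp;
    if (ts) {
      end = ts;
      break;
    }
  }
  return { start, end };
}
```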
+ +### Example + +```bash +me import granola --dry-run # preview everything Granola has +me import granola # full import (notes + transcripts) into ~/granola +me import granola --no-transcript # notes only (faster; fewer API calls) +me import granola --since 2026-01-01 # just this year's meetings +``` + +--- + ## me import git Import a repo's git commit history as memories — one memory per commit, holding the commit message plus a capped changed-file list. Commit intent ("why did we do X") and touched paths become searchable agent context. diff --git a/docs/concepts.md b/docs/concepts.md index cc3446e5..8743aad4 100644 --- a/docs/concepts.md +++ b/docs/concepts.md @@ -156,6 +156,9 @@ Metadata is indexed with a GIN index, making attribute-based filtering fast. You | `status` | Track lifecycle | `"active"`, `"implemented"`, `"superseded"`, `"archived"` | | `source` | Where it came from | `"slack"`, `"meeting"`, `"docs"`, `"code-review"` | | `confidence` | How certain you are | `"high"`, `"medium"`, `"low"` | +| `display_name` | Human label for the web tree | `"Weekly Sync — 2026-06-23"` | + +`display_name` is a presentation hint: the web UI's tree view prefers it over the memory's `name` and content when labelling a leaf. Use it when the stable `name` is an opaque id (e.g. an importer keying idempotency on a source id) but you still want a readable label. It does not affect addressing or search. ### Meta vs. tree diff --git a/packages/cli/commands/import-granola.ts b/packages/cli/commands/import-granola.ts new file mode 100644 index 00000000..0de0a35d --- /dev/null +++ b/packages/cli/commands/import-granola.ts @@ -0,0 +1,197 @@ +/** + * `me import granola` — import Granola meeting notes & transcripts as memories. + * + * Granola stores its meetings behind its cloud API, but persists the signed-in + * session locally (encrypted with Electron safeStorage). 
We read & decrypt that + * session, refresh the access token, then pull every meeting and write one + * memory per meeting under `.` (default `~/granola`). + * Idempotency is keyed on `(tree, name=document_id)`, so re-runs reconcile in + * place via the server's content-aware `onConflict: "replace"`. + * + * macOS only for now (the local-credential read uses the login keychain). + */ + +import * as clack from "@clack/prompts"; +import { Command } from "commander"; +import { resolveCredentials } from "../credentials.ts"; +import { + GranolaAuthError, + readGranolaTokens, +} from "../importers/granola/auth.ts"; +import { + DEFAULT_GRANOLA_TREE_ROOT, + type GranolaImportOptions, + type GranolaImportResult, + runGranolaImport, +} from "../importers/granola/index.ts"; +import { createProgressReporter } from "../importers/index.ts"; +import { getOutputFormat, output } from "../output.ts"; +import { + buildMemoryClient, + handleError, + requireAuth, + requireSpace, +} from "../util.ts"; +import { VALID_TREE_ROOT_RE } from "./import.ts"; + +/** Validate raw Commander opts into a typed import-option set (minus secrets). */ +function buildGranolaOptions( + opts: Record, +): Omit { + const treeRoot = + typeof opts.treeRoot === "string" + ? opts.treeRoot + : DEFAULT_GRANOLA_TREE_ROOT; + if (!VALID_TREE_ROOT_RE.test(treeRoot)) { + throw new Error( + `Invalid --tree-root: '${treeRoot}'. Use ltree labels ([A-Za-z0-9_-]) ` + + `separated by '.' or '/', with an optional leading '~' for your home.`, + ); + } + for (const field of ["since", "until"] as const) { + const value = opts[field]; + if (typeof value === "string" && Number.isNaN(Date.parse(value))) { + throw new Error( + `Invalid --${field}: '${value}' is not a valid ISO 8601 timestamp`, + ); + } + } + return { + granolaDir: + typeof opts.granolaDir === "string" ? opts.granolaDir : undefined, + treeRoot, + since: typeof opts.since === "string" ? opts.since : undefined, + until: typeof opts.until === "string" ? 
opts.until : undefined, + // --include-invalid disables the default skip of non-meeting notes. + skipInvalid: opts.includeInvalid !== true, + // Transcripts are included by default; --no-transcript turns them off. + includeTranscript: opts.transcript !== false, + dryRun: opts.dryRun === true, + }; +} + +/** Run one Granola import end-to-end and render the outcome. */ +export async function runGranolaImportCommand( + rawOpts: Record, + globalOpts: Record, +): Promise { + const creds = resolveCredentials( + typeof globalOpts.server === "string" ? globalOpts.server : undefined, + ); + const fmt = getOutputFormat(globalOpts); + requireAuth(creds, fmt); + requireSpace(creds, fmt); + + let opts: Omit; + try { + opts = buildGranolaOptions(rawOpts); + } catch (error) { + handleError(error, fmt); + } + + // Read & decrypt Granola's local session before touching the network. + let refreshToken: string; + try { + refreshToken = readGranolaTokens(opts.granolaDir).refresh_token; + } catch (error) { + if (error instanceof GranolaAuthError) { + handleError(error, fmt); + } + throw error; + } + + const engine = buildMemoryClient(creds); + const progress = + fmt === "text" ? createProgressReporter(process.stderr) : undefined; + progress?.start(); + + let result: GranolaImportResult; + try { + result = await runGranolaImport( + engine, + { ...opts, refreshToken }, + progress, + ); + } catch (error) { + progress?.stop(); + handleError(error, fmt); + } finally { + progress?.stop(); + } + + renderGranolaResult(result, fmt); + if (result.failed > 0 && result.inserted === 0 && result.updated === 0) { + process.exit(2); + } + if (result.failed > 0) process.exit(1); +} + +/** Print the import result in text or structured format. */ +function renderGranolaResult( + result: GranolaImportResult, + fmt: "text" | "json" | "yaml", +): void { + output(result, fmt, () => { + const verb = result.dryRun ? 
"Would import" : "Imported"; + clack.log.success( + `${verb} ${result.inserted} new, ${result.updated} updated, ` + + `${result.skipped} unchanged, ${result.failed} failed meetings ` + + `into ${result.tree}`, + ); + console.log(` Scanned ${result.meetingsSeen} Granola meetings`); + const skipTotal = Object.values(result.skipReasons).reduce( + (a, b) => a + b, + 0, + ); + if (skipTotal > 0) { + const parts = Object.entries(result.skipReasons) + .filter(([, n]) => n > 0) + .map(([reason, n]) => `${reason}=${n}`); + console.log(` Meetings skipped: ${parts.join(", ")}`); + } + if (!result.includeTranscript) { + console.log(" Transcripts omitted (--no-transcript)"); + } + for (const e of result.errors) { + console.log(` ✗ ${e.documentId}: ${e.error}`); + } + }); +} + +/** `me import granola` subcommand factory. */ +export function createGranolaImportCommand(): Command { + return new Command("granola") + .description("import Granola meeting notes and transcripts as memories") + .option( + "--tree-root ", + `tree root under which '' leaves are placed (default: ${DEFAULT_GRANOLA_TREE_ROOT})`, + ) + .option( + "--since ", + "only import meetings started at or after this timestamp", + ) + .option( + "--until ", + "only import meetings started at or before this timestamp", + ) + .option( + "--no-transcript", + "import notes only, skipping the full meeting transcript", + ) + .option( + "--include-invalid", + "include notes Granola did not flag as valid meetings", + ) + .option( + "--granola-dir ", + "override the Granola application-support directory", + ) + .option( + "--dry-run", + "fetch and report what would be imported without writing", + ) + .action(async (opts, cmdRef) => { + const globalOpts = cmdRef.optsWithGlobals(); + await runGranolaImportCommand(opts, globalOpts); + }); +} diff --git a/packages/cli/commands/import-group.ts b/packages/cli/commands/import-group.ts index c5db9d4c..f75f6085 100644 --- a/packages/cli/commands/import-group.ts +++ 
b/packages/cli/commands/import-group.ts @@ -8,6 +8,7 @@ * me import codex Codex sessions * me import opencode OpenCode sessions * me import git [repo] git commit history + * me import granola Granola meeting notes & transcripts * * There is deliberately no bare default: `me import ` does not parse. * The pre-group spellings stay registered as aliases built from the same @@ -23,6 +24,7 @@ import { } from "./import.ts"; import { createGitImportCommand } from "./import-git.ts"; import { createGitHookCommand } from "./import-git-hook.ts"; +import { createGranolaImportCommand } from "./import-granola.ts"; import { createMemoryImportCommand } from "./memory-import.ts"; export function createImportCommand(): Command { @@ -33,6 +35,7 @@ export function createImportCommand(): Command { imp.addCommand(createClaudeImportCommand("claude")); imp.addCommand(createCodexImportCommand("codex")); imp.addCommand(createOpenCodeImportCommand("opencode")); + imp.addCommand(createGranolaImportCommand()); imp.addCommand(createGitImportCommand()); imp.addCommand(createGitHookCommand()); imp.addHelpText( diff --git a/packages/cli/importers/granola/auth.test.ts b/packages/cli/importers/granola/auth.test.ts new file mode 100644 index 00000000..0014ab7c --- /dev/null +++ b/packages/cli/importers/granola/auth.test.ts @@ -0,0 +1,116 @@ +/** + * Tests for the Granola local-credential reader. We don't have Granola's real + * keychain entry under test, so we reproduce its two-layer scheme in reverse to + * synthesize a `storage.dek` + `supabase.json.enc` pair from a known password, + * then assert `readGranolaTokens` recovers the tokens — and surfaces actionable + * `GranolaAuthError`s when material is missing or corrupt. 
+ */ + +import { afterEach, beforeEach, describe, expect, test } from "bun:test"; +import { createCipheriv, pbkdf2Sync, randomBytes } from "node:crypto"; +import { mkdtempSync, rmSync, writeFileSync } from "node:fs"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; +import { + type CommandRunner, + GranolaAuthError, + readGranolaTokens, +} from "./auth.ts"; + +const PASSWORD = "dGVzdC1wYXNzd29yZC0xMjM0"; // arbitrary base64-looking string +const CBC_IV = Buffer.alloc(16, 0x20); + +/** Encrypt a base64 DEK into a `storage.dek` blob (CBC, v10 prefix). */ +function makeStorageDek(password: string, dek: Buffer): Buffer { + const cbcKey = pbkdf2Sync(password, "saltysalt", 1003, 16, "sha1"); + const cipher = createCipheriv("aes-128-cbc", cbcKey, CBC_IV); + const body = Buffer.concat([ + cipher.update(dek.toString("base64"), "utf8"), + cipher.final(), + ]); + return Buffer.concat([Buffer.from("v10"), body]); +} + +/** Encrypt a plaintext into a Granola `.enc` blob (GCM: iv|ct|tag). */ +function makeEnc(dek: Buffer, plaintext: string): Buffer { + const iv = randomBytes(12); + const cipher = createCipheriv("aes-256-gcm", dek, iv); + const ct = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]); + return Buffer.concat([iv, ct, cipher.getAuthTag()]); +} + +let dir: string; + +/** A runner that returns the known password for the keychain lookup. 
*/ +const okRunner: CommandRunner = () => ({ + exitCode: 0, + stdout: `${PASSWORD}\n`, +}); + +beforeEach(() => { + dir = mkdtempSync(join(tmpdir(), "granola-auth-")); + const dek = randomBytes(32); + writeFileSync(join(dir, "storage.dek"), makeStorageDek(PASSWORD, dek)); + const tokens = { + access_token: "access-abc", + refresh_token: "refresh-xyz", + expires_in: 3600, + }; + const supabase = { workos_tokens: JSON.stringify(tokens) }; + writeFileSync( + join(dir, "supabase.json.enc"), + makeEnc(dek, JSON.stringify(supabase)), + ); +}); + +afterEach(() => { + rmSync(dir, { recursive: true, force: true }); +}); + +describe("readGranolaTokens", () => { + test.skipIf(process.platform !== "darwin")( + "decrypts the DEK chain and recovers tokens", + () => { + const tokens = readGranolaTokens(dir, okRunner); + expect(tokens.access_token).toBe("access-abc"); + expect(tokens.refresh_token).toBe("refresh-xyz"); + expect(tokens.expires_in).toBe(3600); + }, + ); + + test.skipIf(process.platform !== "darwin")( + "throws GranolaAuthError when the keychain lookup fails", + () => { + const failRunner: CommandRunner = () => ({ exitCode: 1, stdout: "" }); + expect(() => readGranolaTokens(dir, failRunner)).toThrow( + GranolaAuthError, + ); + }, + ); + + test.skipIf(process.platform !== "darwin")( + "throws GranolaAuthError when storage.dek is missing", + () => { + rmSync(join(dir, "storage.dek")); + expect(() => readGranolaTokens(dir, okRunner)).toThrow(GranolaAuthError); + }, + ); + + test.skipIf(process.platform !== "darwin")( + "throws GranolaAuthError when the password is wrong", + () => { + const wrongPw: CommandRunner = () => ({ + exitCode: 0, + stdout: "totally-different-password\n", + }); + expect(() => readGranolaTokens(dir, wrongPw)).toThrow(GranolaAuthError); + }, + ); + + test.skipIf(process.platform === "darwin")( + "throws on non-macOS platforms", + () => { + expect(() => readGranolaTokens(dir, okRunner)).toThrow(GranolaAuthError); + }, + ); +}); diff --git 
a/packages/cli/importers/granola/auth.ts b/packages/cli/importers/granola/auth.ts new file mode 100644 index 00000000..a464aadc --- /dev/null +++ b/packages/cli/importers/granola/auth.ts @@ -0,0 +1,225 @@ +/** + * Granola local-credential reader. + * + * Granola (an Electron app) persists its WorkOS session tokens locally, but + * encrypted with Electron's `safeStorage`. There is no public credential file + * to read — instead we reproduce the same two-layer key derivation the app + * uses, entirely from local material: + * + * 1. The OS keychain holds a base64 password under the generic-password + * service `"Granola Safe Storage"` (Chromium's `os_crypt` convention). + * 2. That password, run through PBKDF2-HMAC-SHA1 (salt `saltysalt`, 1003 + * iterations, 16-byte key), is an AES-128-CBC key (IV = 16 spaces) that + * decrypts `storage.dek` — yielding a base64 32-byte **data encryption + * key** (DEK). + * 3. The DEK decrypts the app's `*.enc` blobs as AES-256-GCM with the layout + * `iv(12) | ciphertext | tag(16)`. + * + * `supabase.json.enc` decrypts to `{ workos_tokens: "" }`, whose + * inner JSON carries the `access_token` / `refresh_token`. The access token is + * short-lived (~hours), so callers refresh it through the Granola API before + * use (see `client.ts`). + * + * This is macOS-only for now: the keychain step shells out to `security`. On + * any other platform — or when Granola isn't installed / the user isn't logged + * in — `readGranolaTokens` throws a `GranolaAuthError` with an actionable + * message instead of a raw crypto failure. + */ + +import { type CipherKey, createDecipheriv, pbkdf2Sync } from "node:crypto"; +import { readFileSync } from "node:fs"; +import { homedir } from "node:os"; +import { join } from "node:path"; + +/** Default Granola application-support directory (macOS). 
*/ +export function defaultGranolaDir(): string { + return join(homedir(), "Library", "Application Support", "Granola"); +} + +/** Keychain generic-password service that holds the safeStorage password. */ +const KEYCHAIN_SERVICE = "Granola Safe Storage"; +/** Chromium os_crypt PBKDF2 parameters. */ +const PBKDF2_SALT = "saltysalt"; +const PBKDF2_ITERATIONS = 1003; +const PBKDF2_KEY_LEN = 16; +/** safeStorage CBC IV is 16 spaces; the ciphertext carries a 3-byte `v10` tag. */ +const CBC_IV = Buffer.alloc(16, 0x20); +const SAFE_STORAGE_PREFIX_LEN = 3; +/** GCM layout used by Granola's `*.enc` blobs. */ +const GCM_IV_LEN = 12; +const GCM_TAG_LEN = 16; + +/** The WorkOS token bundle Granola persists (only the fields we use). */ +export interface GranolaTokens { + access_token: string; + refresh_token: string; + /** Seconds-from-issue lifetime, when present. */ + expires_in?: number; +} + +/** A user-actionable failure to read Granola's local credentials. */ +export class GranolaAuthError extends Error { + constructor(message: string) { + super(message); + this.name = "GranolaAuthError"; + } +} + +/** Minimal shape of a process runner, injectable for tests. */ +export type CommandRunner = ( + cmd: string[], +) => { exitCode: number; stdout: string } | null; + +/** Default runner: shells out via Bun.spawnSync, returns null on spawn error. */ +const defaultRunner: CommandRunner = (cmd) => { + try { + const r = Bun.spawnSync({ cmd, stdout: "pipe", stderr: "pipe" }); + return { exitCode: r.exitCode ?? 1, stdout: r.stdout.toString() }; + } catch { + return null; + } +}; + +/** Read the base64 safeStorage password from the macOS login keychain. 
*/ +function readKeychainPassword(run: CommandRunner): string { + if (process.platform !== "darwin") { + throw new GranolaAuthError( + "Granola import currently supports macOS only (it reads Granola's " + + "credentials from the login keychain).", + ); + } + const r = run([ + "security", + "find-generic-password", + "-s", + KEYCHAIN_SERVICE, + "-w", + ]); + if (!r || r.exitCode !== 0 || r.stdout.trim().length === 0) { + throw new GranolaAuthError( + `Could not read the Granola key from the login keychain (service ` + + `"${KEYCHAIN_SERVICE}"). Is the Granola desktop app installed and ` + + `signed in on this machine?`, + ); + } + return r.stdout.trim(); +} + +/** Derive the 32-byte AES-256-GCM data encryption key from `storage.dek`. */ +function readDataEncryptionKey(dir: string, run: CommandRunner): Buffer { + const password = readKeychainPassword(run); + const cbcKey = pbkdf2Sync( + password, + PBKDF2_SALT, + PBKDF2_ITERATIONS, + PBKDF2_KEY_LEN, + "sha1", + ); + + let dekBlob: Buffer; + try { + dekBlob = readFileSync(join(dir, "storage.dek")); + } catch { + throw new GranolaAuthError( + `Granola's storage.dek key file was not found under ${dir}. Is Granola ` + + `installed and signed in on this machine?`, + ); + } + + let dekBase64: string; + try { + const decipher = createDecipheriv("aes-128-cbc", cbcKey, CBC_IV); + const plaintext = Buffer.concat([ + decipher.update(dekBlob.subarray(SAFE_STORAGE_PREFIX_LEN)), + decipher.final(), + ]); + dekBase64 = plaintext.toString("utf8"); + } catch { + throw new GranolaAuthError( + "Failed to decrypt Granola's data key (storage.dek). The keychain key " + + "may be stale — try opening the Granola app, then re-run.", + ); + } + + const dek = Buffer.from(dekBase64, "base64"); + if (dek.length !== 32) { + throw new GranolaAuthError( + `Granola's decrypted data key has an unexpected length (${dek.length} ` + + `bytes; expected 32). 
Granola's storage format may have changed.`, + ); + } + return dek; +} + +/** AES-256-GCM-decrypt one of Granola's `*.enc` blobs with the DEK. */ +function decryptEnc(blob: Buffer, key: CipherKey): string { + const iv = blob.subarray(0, GCM_IV_LEN); + const tag = blob.subarray(blob.length - GCM_TAG_LEN); + const ciphertext = blob.subarray(GCM_IV_LEN, blob.length - GCM_TAG_LEN); + const decipher = createDecipheriv("aes-256-gcm", key, iv); + decipher.setAuthTag(tag); + return Buffer.concat([ + decipher.update(ciphertext), + decipher.final(), + ]).toString("utf8"); +} + +/** + * Read Granola's locally-stored WorkOS tokens, decrypting the on-disk + * `supabase.json.enc` blob. Throws `GranolaAuthError` with an actionable + * message when Granola isn't installed / signed in or the format has changed. + * + * `dir` defaults to the standard application-support path; `run` is injectable + * so tests can stub the keychain lookup. + */ +export function readGranolaTokens( + dir: string = defaultGranolaDir(), + run: CommandRunner = defaultRunner, +): GranolaTokens { + const dek = readDataEncryptionKey(dir, run); + + let blob: Buffer; + try { + blob = readFileSync(join(dir, "supabase.json.enc")); + } catch { + throw new GranolaAuthError( + `Granola's supabase.json.enc was not found under ${dir}. Sign in to the ` + + `Granola desktop app, then re-run.`, + ); + } + + let outer: { workos_tokens?: string }; + try { + outer = JSON.parse(decryptEnc(blob, dek)) as { workos_tokens?: string }; + } catch { + throw new GranolaAuthError( + "Failed to decrypt Granola's session file (supabase.json.enc). Try " + + "opening the Granola app to refresh its local state, then re-run.", + ); + } + + if (!outer.workos_tokens) { + throw new GranolaAuthError( + "Granola's session file did not contain WorkOS tokens. 
Sign in to the " + + "Granola desktop app, then re-run.", + ); + } + + let tokens: GranolaTokens; + try { + tokens = JSON.parse(outer.workos_tokens) as GranolaTokens; + } catch { + throw new GranolaAuthError( + "Granola's WorkOS token bundle was not valid JSON. Granola's storage " + + "format may have changed.", + ); + } + + if (!tokens.access_token || !tokens.refresh_token) { + throw new GranolaAuthError( + "Granola's session is missing an access or refresh token. Sign in to " + + "the Granola desktop app, then re-run.", + ); + } + return tokens; +} diff --git a/packages/cli/importers/granola/client.ts b/packages/cli/importers/granola/client.ts new file mode 100644 index 00000000..92cee72e --- /dev/null +++ b/packages/cli/importers/granola/client.ts @@ -0,0 +1,200 @@ +/** + * Minimal Granola HTTP API client. + * + * Granola's desktop app talks to `api.granola.ai` with a WorkOS bearer token + * plus a client-version header — the server rejects requests without a + * recognized `X-Client-Version` ("Unsupported client"). We mirror those + * headers and refresh the (short-lived) access token through + * `/v1/refresh-access-token` before the first data call, so an import works + * even when Granola's on-disk access token has expired. + * + * Only the read endpoints the importer needs are wrapped: + * - `POST /v2/get-documents` → meeting metadata + AI notes (paged by offset) + * - `POST /v1/get-document-panels` → per-meeting AI summary panels (rich notes) + * - `POST /v1/get-document-transcript` → per-meeting transcript segments + */ + +/** Base URL for Granola's API. */ +const API_BASE = "https://api.granola.ai"; +/** + * Client version sent as `X-Client-Version`. The server gates on a recognized + * value; this tracks a known-good desktop release. Overridable via + * `GRANOLA_CLIENT_VERSION` if Granola tightens the gate. 
+ */ +const DEFAULT_CLIENT_VERSION = "7.356.2"; + +function clientVersion(): string { + return process.env.GRANOLA_CLIENT_VERSION || DEFAULT_CLIENT_VERSION; +} + +/** Thrown for any non-2xx Granola API response. */ +export class GranolaApiError extends Error { + constructor( + message: string, + readonly status: number, + ) { + super(message); + this.name = "GranolaApiError"; + } +} + +/** One meeting document as returned by `get-documents` (only fields we read). */ +export interface GranolaDocument { + id: string; + title?: string | null; + created_at?: string | null; + updated_at?: string | null; + notes_markdown?: string | null; + notes_plain?: string | null; + /** ProseMirror notes doc, when present. */ + notes?: unknown; + summary?: string | null; + overview?: string | null; + valid_meeting?: boolean | null; + deleted_at?: string | null; + google_calendar_event?: GranolaCalendarEvent | null; + people?: unknown; + workspace_id?: string | null; +} + +/** The slice of a meeting's Google Calendar event we surface. */ +export interface GranolaCalendarEvent { + id?: string; + summary?: string; + start?: { dateTime?: string; date?: string; timeZone?: string }; + end?: { dateTime?: string; date?: string; timeZone?: string }; + attendees?: Array<{ + email?: string; + responseStatus?: string; + self?: boolean; + }>; + htmlLink?: string; +} + +/** An AI summary panel (rich notes) for a meeting. */ +export interface GranolaPanel { + id: string; + document_id?: string; + title?: string | null; + template_slug?: string | null; + /** ProseMirror content doc. */ + content?: unknown; + /** HTML rendering of the panel, when present. */ + original_content?: string | null; + created_at?: string | null; + updated_at?: string | null; +} + +/** One transcript segment for a meeting. 
*/ +export interface GranolaTranscriptSegment { + id?: string; + document_id?: string; + start_timestamp?: string; + end_timestamp?: string; + text?: string; + /** "microphone" (the user) vs "system" (everyone else). */ + source?: string; + is_final?: boolean; + detected_speaker_name?: string | null; +} + +/** + * A live Granola API session: holds the current access token and refreshes it + * once up front. Construct via `GranolaClient.create`, which performs the + * refresh, so every data call carries a valid token. + */ +export class GranolaClient { + private constructor(private accessToken: string) {} + + /** + * Build a client from a refresh token, exchanging it for a fresh access + * token. Using the refresh token (rather than the possibly-expired on-disk + * access token) means an import works regardless of how long Granola has been + * closed. + */ + static async create(refreshToken: string): Promise<GranolaClient> { + const res = await fetch(`${API_BASE}/v1/refresh-access-token`, { + method: "POST", + headers: { + "Content-Type": "application/json", + "X-Client-Version": clientVersion(), + "User-Agent": `Granola/${clientVersion()} Electron`, + }, + body: JSON.stringify({ refresh_token: refreshToken }), + }); + if (!res.ok) { + throw new GranolaApiError( + `Granola token refresh failed (HTTP ${res.status}). Re-open the ` + + `Granola desktop app to refresh its session, then retry.`, + res.status, + ); + } + const body = (await res.json()) as { access_token?: string }; + if (!body.access_token) { + throw new GranolaApiError( + "Granola token refresh returned no access token.", + res.status, + ); + } + return new GranolaClient(body.access_token); + } + + /** POST a JSON body to a Granola API path and parse the JSON response. 
*/ + private async post<T>(path: string, body: unknown): Promise<T> { + const res = await fetch(`${API_BASE}${path}`, { + method: "POST", + headers: { + Authorization: `Bearer ${this.accessToken}`, + "Content-Type": "application/json", + "X-Client-Version": clientVersion(), + "User-Agent": `Granola/${clientVersion()} Electron`, + }, + body: JSON.stringify(body), + }); + if (!res.ok) { + throw new GranolaApiError( + `Granola API ${path} failed (HTTP ${res.status}).`, + res.status, + ); + } + return (await res.json()) as T; + } + + /** + * Stream every meeting document, paging by offset. Granola returns documents + * newest-first; we yield each page's docs in order until a short page signals + * the end. + */ + async *listDocuments(pageSize = 100): AsyncIterable<GranolaDocument> { + let offset = 0; + for (;;) { + const page = await this.post<{ docs?: GranolaDocument[] }>( + "/v2/get-documents", + { limit: pageSize, offset }, + ); + const docs = page.docs ?? []; + for (const doc of docs) yield doc; + if (docs.length < pageSize) return; + offset += docs.length; + } + } + + /** Fetch the AI summary panels for one meeting (may be empty). */ + async getPanels(documentId: string): Promise<GranolaPanel[]> { + const res = await this.post<GranolaPanel[] | { panels?: GranolaPanel[] }>( + "/v1/get-document-panels", + { document_id: documentId }, + ); + if (Array.isArray(res)) return res; + return res.panels ?? []; + } + + /** Fetch the transcript segments for one meeting (may be empty). */ + async getTranscript(documentId: string): Promise<GranolaTranscriptSegment[]> { + const res = await this.post< + GranolaTranscriptSegment[] | { segments?: GranolaTranscriptSegment[] } + >("/v1/get-document-transcript", { document_id: documentId }); + if (Array.isArray(res)) return res; + return res.segments ?? 
[]; + } +} diff --git a/packages/cli/importers/granola/index.test.ts b/packages/cli/importers/granola/index.test.ts new file mode 100644 index 00000000..acd809a6 --- /dev/null +++ b/packages/cli/importers/granola/index.test.ts @@ -0,0 +1,222 @@ +/** + * Tests for the Granola importer orchestration: pre-filtering (deleted, + * invalid, since/until), the empty-meeting skip, per-meeting panel/transcript + * fetching, and the batchCreate submission path — all against a fake source + + * in-memory engine (no network, no DB). + */ +import { describe, expect, test } from "bun:test"; +import type { MemoryClient } from "../../client.ts"; +import type { + GranolaDocument, + GranolaPanel, + GranolaTranscriptSegment, +} from "./client.ts"; +import { + type GranolaImportOptions, + type GranolaSource, + runGranolaImport, +} from "./index.ts"; + +const BASE_OPTS: Omit = { + treeRoot: "~.granola", + skipInvalid: true, + includeTranscript: true, + dryRun: false, +}; + +function opts(over: Partial = {}): GranolaImportOptions { + return { refreshToken: "r", ...BASE_OPTS, ...over }; +} + +/** A fake Granola API source backed by in-memory docs/panels/transcripts. */ +function fakeSource(opts: { + docs: GranolaDocument[]; + panels?: Record; + transcripts?: Record; +}): GranolaSource & { panelCalls: string[]; transcriptCalls: string[] } { + const panelCalls: string[] = []; + const transcriptCalls: string[] = []; + return { + panelCalls, + transcriptCalls, + async *listDocuments() { + for (const d of opts.docs) yield d; + }, + async getPanels(id) { + panelCalls.push(id); + return opts.panels?.[id] ?? []; + }, + async getTranscript(id) { + transcriptCalls.push(id); + return opts.transcripts?.[id] ?? []; + }, + }; +} + +/** An in-memory engine that records batchCreate inputs and reports inserts. 
*/ +function mockEngine() { + const submitted: Array<{ tree: string; name?: string | null }> = []; + const client = { + memory: { + batchCreate: async (p: { + memories: Array<{ tree: string; name?: string | null }>; + }) => { + submitted.push(...p.memories); + return { + results: p.memories.map((m) => ({ + id: "00000000-0000-7000-8000-000000000000", + status: "inserted" as const, + name: m.name ?? null, + })), + }; + }, + }, + } as unknown as MemoryClient; + return { client, submitted }; +} + +const NOTES_DOC: GranolaDocument = { + id: "doc-notes", + title: "Has Notes", + created_at: "2026-01-02T00:00:00.000Z", + notes_markdown: "# real notes", + valid_meeting: true, +}; + +describe("runGranolaImport pre-filters", () => { + test("skips deleted, invalid, and out-of-window meetings", async () => { + const { client, submitted } = mockEngine(); + const source = fakeSource({ + docs: [ + NOTES_DOC, + { id: "del", title: "Deleted", deleted_at: "2026-01-01T00:00:00Z" }, + { id: "inv", title: "Invalid", valid_meeting: false }, + { + id: "old", + title: "Old", + created_at: "2020-01-01T00:00:00.000Z", + notes_markdown: "old", + }, + ], + }); + const result = await runGranolaImport( + client, + opts({ since: "2026-01-01T00:00:00Z" }), + undefined, + source, + ); + expect(result.meetingsSeen).toBe(4); + expect(result.inserted).toBe(1); + expect(result.skipReasons.deleted).toBe(1); + expect(result.skipReasons.invalid_meeting).toBe(1); + expect(result.skipReasons.since_filter).toBe(1); + expect(submitted).toHaveLength(1); + expect(submitted[0]?.name).toBe("doc-notes"); + expect(submitted[0]?.tree).toBe("~.granola"); + }); + + test("--include-invalid keeps non-meeting notes", async () => { + const { client, submitted } = mockEngine(); + const source = fakeSource({ + docs: [ + { + id: "inv", + title: "Invalid", + notes_markdown: "x", + valid_meeting: false, + }, + ], + }); + const result = await runGranolaImport( + client, + opts({ skipInvalid: false }), + undefined, + source, + 
); + expect(result.inserted).toBe(1); + expect(submitted).toHaveLength(1); + }); +}); + +describe("runGranolaImport content fetching", () => { + test("fetches panels only when the doc lacks notes_markdown", async () => { + const { client } = mockEngine(); + const source = fakeSource({ + docs: [ + NOTES_DOC, + { + id: "no-notes", + title: "Needs Panel", + created_at: "2026-01-03T00:00:00Z", + }, + ], + panels: { + "no-notes": [{ id: "p", original_content: "

<p>panel notes</p>

" }], + }, + }); + await runGranolaImport(client, opts(), undefined, source); + // doc-notes already has notes_markdown → no panel call; no-notes → one call. + expect(source.panelCalls).toEqual(["no-notes"]); + }); + + test("skips a meeting with no notes and no transcript", async () => { + const { client, submitted } = mockEngine(); + const source = fakeSource({ + docs: [ + { id: "stub", title: "Empty", created_at: "2026-01-04T00:00:00Z" }, + ], + }); + const result = await runGranolaImport(client, opts(), undefined, source); + expect(result.skipReasons.empty).toBe(1); + expect(result.inserted).toBe(0); + expect(submitted).toHaveLength(0); + }); + + test("--no-transcript skips transcript fetches", async () => { + const { client } = mockEngine(); + const source = fakeSource({ + docs: [NOTES_DOC], + transcripts: { "doc-notes": [{ text: "hi", source: "microphone" }] }, + }); + await runGranolaImport( + client, + opts({ includeTranscript: false }), + undefined, + source, + ); + expect(source.transcriptCalls).toHaveLength(0); + }); + + test("includes a meeting that has only a transcript", async () => { + const { client, submitted } = mockEngine(); + const source = fakeSource({ + docs: [ + { id: "t-only", title: "Talk", created_at: "2026-01-05T00:00:00Z" }, + ], + transcripts: { + "t-only": [ + { text: "Hello", source: "microphone" }, + { text: "Hi", source: "system" }, + ], + }, + }); + const result = await runGranolaImport(client, opts(), undefined, source); + expect(result.inserted).toBe(1); + expect(submitted).toHaveLength(1); + }); +}); + +describe("runGranolaImport dry run", () => { + test("reports planned inserts without submitting", async () => { + const { client, submitted } = mockEngine(); + const source = fakeSource({ docs: [NOTES_DOC] }); + const result = await runGranolaImport( + client, + opts({ dryRun: true }), + undefined, + source, + ); + expect(result.inserted).toBe(1); + expect(submitted).toHaveLength(0); + }); +}); diff --git 
a/packages/cli/importers/granola/index.ts b/packages/cli/importers/granola/index.ts new file mode 100644 index 00000000..578a50cc --- /dev/null +++ b/packages/cli/importers/granola/index.ts @@ -0,0 +1,251 @@ +/** + * Granola meeting importer. + * + * Reads Granola's locally-stored session (see `auth.ts`), refreshes the access + * token, and pulls every meeting from the Granola API (`client.ts`). Each + * meeting becomes one memory under `.`, named by the meeting's + * Granola document id so `(tree, name)` is the idempotency key — re-imports + * collapse onto the same row. The id is a timestamp-prefixed UUIDv7 so memories + * sort by meeting time. + * + * Reconciliation is server-side: every meeting is submitted through + * `memory.batchCreate` with `onConflict: "replace"`. The deterministic meta + * carries `importer_version`, so a render change (version bump) re-renders in + * place while an unchanged re-import is a no-op. Notes-only meetings are cheap + * (one list call); transcripts and panels are fetched per meeting only when + * needed, so `--no-transcript` imports skip those round-trips entirely. + */ + +import type { MemoryCreateParams } from "@memory.build/protocol/memory"; +import { batchCreateChunked } from "../../chunk.ts"; +import type { MemoryClient } from "../../client.ts"; +import { IMPORTER_VERSION } from "../index.ts"; +import type { ProgressReporter } from "../progress.ts"; +import { boundedUniqueLabel } from "../slug.ts"; +import { uuidv7At } from "../uuid.ts"; +import { + GranolaClient, + type GranolaDocument, + type GranolaPanel, + type GranolaTranscriptSegment, +} from "./client.ts"; +import { type GranolaMeeting, meetingStart, renderMeeting } from "./render.ts"; + +/** Default tree root for imported meetings. Under the caller's home. */ +export const DEFAULT_GRANOLA_TREE_ROOT = "~.granola"; +/** Memory-name length cap (DB CHECK) for the meeting leaf. 
*/ +const MEETING_NAME_MAX = 128; + +/** Options that affect what the importer pulls and writes. */ +export interface GranolaImportOptions { + /** Override the Granola application-support directory. */ + granolaDir?: string; + /** Refresh token (already read from local storage by the caller). */ + refreshToken: string; + /** Tree root under which `` leaves are placed. */ + treeRoot: string; + /** Only import meetings started at or after this ISO timestamp. */ + since?: string; + /** Only import meetings started at or before this ISO timestamp. */ + until?: string; + /** Skip meetings Granola flagged `valid_meeting: false`. Default true. */ + skipInvalid: boolean; + /** Include the full transcript in each memory (extra API calls). */ + includeTranscript: boolean; + /** Don't write — just report what would happen. */ + dryRun: boolean; +} + +/** Per-reason skip counts for the run. */ +export type GranolaSkipReason = + | "deleted" + | "invalid_meeting" + | "since_filter" + | "until_filter" + | "empty"; + +/** Structured result of one Granola import run. */ +export interface GranolaImportResult { + tree: string; + dryRun: boolean; + includeTranscript: boolean; + meetingsSeen: number; + inserted: number; + updated: number; + skipped: number; + failed: number; + skipReasons: Record; + errors: Array<{ documentId: string; error: string }>; +} + +/** The meeting leaf name within the tree root: the Granola document id. */ +function meetingName(documentId: string): string { + return boundedUniqueLabel( + documentId, + (s) => s.replace(/[^A-Za-z0-9._-]/g, "_"), + MEETING_NAME_MAX, + ); +} + +/** Decide whether a document is skipped before any per-meeting fetches. 
*/ +function preFilter( + doc: GranolaDocument, + options: GranolaImportOptions, +): GranolaSkipReason | null { + if (doc.deleted_at) return "deleted"; + if (options.skipInvalid && doc.valid_meeting === false) { + return "invalid_meeting"; + } + const start = meetingStart(doc); + if (start) { + const ms = Date.parse(start); + if (options.since) { + const s = Date.parse(options.since); + if (!Number.isNaN(s) && ms < s) return "since_filter"; + } + if (options.until) { + const u = Date.parse(options.until); + if (!Number.isNaN(u) && ms > u) return "until_filter"; + } + } + return null; +} + +/** + * Build a meeting's memory payload, fetching panels (notes) and, when + * requested, the transcript. Returns null when the meeting has no notes and + * no transcript; a title alone is not worth storing. + */ +async function buildMeetingMemory( + client: GranolaSource, + doc: GranolaDocument, + options: GranolaImportOptions, +): Promise<MemoryCreateParams | null> { + // Notes may already be on the document; otherwise panels carry the AI summary. + const needPanels = !doc.notes_markdown?.trim(); + const panels = needPanels ? await client.getPanels(doc.id) : []; + const transcript = options.includeTranscript + ? await client.getTranscript(doc.id) + : []; + + const meeting: GranolaMeeting = { document: doc, panels, transcript }; + const rendered = renderMeeting(meeting, { + includeTranscript: options.includeTranscript, + }); + + // A meeting with no notes and no transcript is just a calendar stub, so skip + // it. A title and a date count as enough signal only when there is also + // notes or transcript content. + if (!rendered.meta.has_notes && !rendered.meta.has_transcript) { + return null; + } + + const startMs = rendered.startedAt + ? Date.parse(rendered.startedAt) + : Date.now(); + const temporal = rendered.startedAt + ? rendered.endedAt + ? 
{ start: rendered.startedAt, end: rendered.endedAt } + : { start: rendered.startedAt } + : undefined; + + return { + id: uuidv7At(startMs), + name: meetingName(doc.id), + content: rendered.content, + meta: { ...rendered.meta, importer_version: IMPORTER_VERSION }, + tree: options.treeRoot, + ...(temporal ? { temporal } : {}), + }; +} + +/** The slice of GranolaClient the importer consumes (injectable for tests). */ +export interface GranolaSource { + listDocuments(pageSize?: number): AsyncIterable; + getPanels(documentId: string): Promise; + getTranscript(documentId: string): Promise; +} + +/** + * Run a full Granola import: list meetings, render each, and submit through the + * server's conditional upsert. Progress (when provided) ticks per meeting. + * + * `source` is injectable for tests; in production the caller omits it and we + * build a real `GranolaClient` from the refresh token. + */ +export async function runGranolaImport( + engine: MemoryClient, + options: GranolaImportOptions, + progress?: ProgressReporter, + source?: GranolaSource, +): Promise { + const client = source ?? (await GranolaClient.create(options.refreshToken)); + + const skipReasons: Record = {}; + const errors: Array<{ documentId: string; error: string }> = []; + const planned: MemoryCreateParams[] = []; + let meetingsSeen = 0; + + for await (const doc of client.listDocuments()) { + meetingsSeen++; + progress?.process(doc.title?.trim() || doc.id); + + const skip = preFilter(doc, options); + if (skip) { + skipReasons[skip] = (skipReasons[skip] ?? 0) + 1; + continue; + } + + try { + const payload = await buildMeetingMemory(client, doc, options); + if (!payload) { + skipReasons.empty = (skipReasons.empty ?? 0) + 1; + continue; + } + planned.push(payload); + } catch (error) { + errors.push({ + documentId: doc.id, + error: error instanceof Error ? 
error.message : String(error), + }); + } + } + + let inserted = 0; + let updated = 0; + let skipped = 0; + let failed = errors.length; + + if (options.dryRun) { + inserted = planned.length; + } else if (planned.length > 0) { + const { results, errors: chunkErrors } = await batchCreateChunked( + engine, + planned, + { onConflict: "replace" }, + ); + for (const r of results) { + if (r.status === "inserted") inserted++; + else if (r.status === "updated") updated++; + else if (r.status === "skipped") skipped++; + } + for (const e of chunkErrors) { + failed += e.itemCount; + for (const id of e.ids) errors.push({ documentId: id, error: e.error }); + } + } + + return { + tree: options.treeRoot, + dryRun: options.dryRun, + includeTranscript: options.includeTranscript, + meetingsSeen, + inserted, + updated, + skipped, + failed, + skipReasons, + errors, + }; +} diff --git a/packages/cli/importers/granola/render.test.ts b/packages/cli/importers/granola/render.test.ts new file mode 100644 index 00000000..e39f7400 --- /dev/null +++ b/packages/cli/importers/granola/render.test.ts @@ -0,0 +1,251 @@ +/** + * Tests for the Granola meeting renderer: HTML→Markdown and + * ProseMirror→Markdown conversion, notes extraction precedence, transcript + * speaker-turn grouping, and the assembled memory payload/metadata. + */ +import { describe, expect, test } from "bun:test"; +import type { + GranolaDocument, + GranolaPanel, + GranolaTranscriptSegment, +} from "./client.ts"; +import { + displayName, + extractNotes, + type GranolaMeeting, + htmlToMarkdown, + meetingStart, + meetingTitle, + proseMirrorToMarkdown, + renderMeeting, +} from "./render.ts"; + +function meeting(over: Partial = {}): GranolaMeeting { + return { + document: { id: "doc-1", title: "Weekly Sync" }, + panels: [], + transcript: [], + ...over, + }; +} + +describe("htmlToMarkdown", () => { + test("converts headings, lists, bold, and links", () => { + const html = + "

<h3>Topic</h3><ul><li>First point</li><li>Second <strong>bold</strong></li></ul>
" + + '

<p>See <a href="https://x.test">docs</a></p>

'; + const md = htmlToMarkdown(html); + expect(md).toContain("### Topic"); + expect(md).toContain("- First point"); + expect(md).toContain("- Second **bold**"); + expect(md).toContain("[docs](https://x.test)"); + }); + + test("decodes entities and strips unknown tags", () => { + expect(htmlToMarkdown("

<p>Tom &amp; Jerry &lt;3</p>

x")).toBe( + "Tom & Jerry <3\n\nx", + ); + }); +}); + +describe("proseMirrorToMarkdown", () => { + test("renders headings, bullets, and marked text", () => { + const doc = { + type: "doc", + content: [ + { + type: "heading", + attrs: { level: 2 }, + content: [{ type: "text", text: "Decisions" }], + }, + { + type: "bulletList", + content: [ + { + type: "listItem", + content: [ + { + type: "paragraph", + content: [ + { type: "text", text: "Ship " }, + { + type: "text", + text: "now", + marks: [{ type: "bold" }], + }, + ], + }, + ], + }, + ], + }, + ], + }; + const md = proseMirrorToMarkdown(doc); + expect(md).toContain("## Decisions"); + expect(md).toContain("- Ship **now**"); + }); + + test("renders links from marks", () => { + const doc = { + type: "doc", + content: [ + { + type: "paragraph", + content: [ + { + type: "text", + text: "site", + marks: [{ type: "link", attrs: { href: "https://y.test" } }], + }, + ], + }, + ], + }; + expect(proseMirrorToMarkdown(doc)).toBe("[site](https://y.test)"); + }); + + test("returns empty for nullish or non-object input", () => { + expect(proseMirrorToMarkdown(null)).toBe(""); + expect(proseMirrorToMarkdown("nope")).toBe(""); + }); +}); + +describe("extractNotes precedence", () => { + test("prefers document notes_markdown", () => { + const m = meeting({ + document: { id: "d", notes_markdown: "# from doc" }, + panels: [{ id: "p", original_content: "

<p>from panel</p>

" }], + }); + expect(extractNotes(m)).toBe("# from doc"); + }); + + test("prefers panel prosemirror over its HTML, falls back to HTML", () => { + // ProseMirror models nested lists, so it wins when both are present. + const bothPanel: GranolaPanel = { + id: "p1", + original_content: "

<p>from html</p>

", + content: { + type: "doc", + content: [ + { + type: "paragraph", + content: [{ type: "text", text: "prose body" }], + }, + ], + }, + }; + expect(extractNotes(meeting({ panels: [bothPanel] }))).toBe("prose body"); + + // HTML is used when the panel has no structured content. + const htmlOnly: GranolaPanel = { + id: "p2", + original_content: "

<h3>Summary</h3>

", + }; + expect(extractNotes(meeting({ panels: [htmlOnly] }))).toContain( + "### Summary", + ); + }); + + test("returns empty string when no notes anywhere", () => { + expect(extractNotes(meeting())).toBe(""); + }); +}); + +describe("displayName", () => { + test("suffixes the meeting date when known", () => { + expect(displayName("Weekly Sync", "2026-06-23T18:30:00.000Z")).toBe( + "Weekly Sync — 2026-06-23", + ); + }); + + test("returns the bare title without a start time", () => { + expect(displayName("Ad-hoc note")).toBe("Ad-hoc note"); + expect(displayName("Bad date", "not-a-date")).toBe("Bad date"); + }); +}); + +describe("meetingStart / meetingTitle", () => { + test("meetingStart prefers calendar start over created_at", () => { + const doc: GranolaDocument = { + id: "d", + created_at: "2026-01-01T00:00:00.000Z", + google_calendar_event: { + start: { dateTime: "2026-02-02T10:00:00-05:00" }, + }, + }; + expect(meetingStart(doc)).toBe("2026-02-02T15:00:00.000Z"); + }); + + test("meetingStart falls back to created_at", () => { + expect( + meetingStart({ id: "d", created_at: "2026-03-03T08:00:00.000Z" }), + ).toBe("2026-03-03T08:00:00.000Z"); + }); + + test("meetingTitle falls back to calendar summary then default", () => { + expect( + meetingTitle({ + id: "d", + google_calendar_event: { summary: "Cal Title" }, + }), + ).toBe("Cal Title"); + expect(meetingTitle({ id: "d" })).toBe("Untitled meeting"); + }); +}); + +describe("renderMeeting", () => { + const transcript: GranolaTranscriptSegment[] = [ + { text: "Hello there.", source: "microphone" }, + { text: "How are you?", source: "microphone" }, + { text: "Doing well.", source: "system" }, + { text: "Great.", source: "microphone" }, + ]; + + test("groups transcript into speaker turns when included", () => { + const m = meeting({ + document: { + id: "d", + title: "Standup", + notes_markdown: "notes here", + google_calendar_event: { + start: { dateTime: "2026-01-01T10:00:00Z" }, + end: { dateTime: 
"2026-01-01T10:30:00Z" }, + attendees: [{ email: "A@Example.com" }, { email: "b@example.com" }], + }, + }, + transcript, + }); + const r = renderMeeting(m, { includeTranscript: true }); + expect(r.title).toBe("Standup"); + expect(r.content).toContain("# Standup"); + expect(r.content).toContain("**Attendees:** a@example.com, b@example.com"); + expect(r.content).toContain("## Notes"); + expect(r.content).toContain("## Transcript"); + // Two microphone turns separated by one system turn. + expect(r.content).toContain("**Me:** Hello there. How are you?"); + expect(r.content).toContain("**Them:** Doing well."); + expect(r.content).toContain("**Me:** Great."); + expect(r.meta.has_transcript).toBe(true); + expect(r.meta.transcript_segment_count).toBe(4); + expect(r.meta.display_name).toBe("Standup — 2026-01-01"); + expect(r.meta.attendees).toEqual(["a@example.com", "b@example.com"]); + expect(r.startedAt).toBe("2026-01-01T10:00:00.000Z"); + expect(r.endedAt).toBe("2026-01-01T10:30:00.000Z"); + }); + + test("omits transcript section when not requested", () => { + // The importer leaves transcript empty when --no-transcript, so render sees + // an empty array and reports has_transcript=false. + const r = renderMeeting( + meeting({ + document: { id: "d", title: "T", notes_markdown: "n" }, + transcript: [], + }), + { includeTranscript: false }, + ); + expect(r.content).not.toContain("## Transcript"); + expect(r.meta.content_mode).toBe("notes_only"); + expect(r.meta.has_transcript).toBe(false); + }); +}); diff --git a/packages/cli/importers/granola/render.ts b/packages/cli/importers/granola/render.ts new file mode 100644 index 00000000..d00e11d2 --- /dev/null +++ b/packages/cli/importers/granola/render.ts @@ -0,0 +1,385 @@ +/** + * Render a Granola meeting into memory content + metadata. + * + * One meeting becomes one memory. 
The content is a Markdown document with a + * title heading, a metadata line (date, attendees), the AI summary notes, and, + * when requested, the full transcript. Keeping it one self-contained + * Markdown blob (rather than fanning out per-segment memories like the agent + * importers) matches how a human reads a meeting note and keeps the memory + * searchable as a unit. + * + * Notes are sourced, in order of preference, from: the document's + * `notes_markdown`, an AI summary panel's ProseMirror `content`, that panel's + * `original_content` (HTML → Markdown), or the document's own ProseMirror + * `notes`. The transcript is grouped into + * speaker-turn blocks (microphone = "Me", system = "Them") since Granola's + * segments carry a source but no per-speaker names. + */ + +import type { + GranolaDocument, + GranolaPanel, + GranolaTranscriptSegment, +} from "./client.ts"; + +/** A meeting assembled from the three Granola endpoints. */ +export interface GranolaMeeting { + document: GranolaDocument; + panels: GranolaPanel[]; + transcript: GranolaTranscriptSegment[]; +} + +/** Options affecting rendered content. */ +export interface RenderOptions { + /** Include the full transcript text below the notes. */ + includeTranscript: boolean; +} + +/** The rendered memory payload (content + structured metadata). */ +export interface RenderedMeeting { + title: string; + content: string; + meta: Record; + /** ISO start timestamp for the memory's temporal, when known. */ + startedAt?: string; + /** ISO end timestamp for the memory's temporal, when known. */ + endedAt?: string; +} + +/** A meeting's best-known start time: calendar start, else created_at. 
*/ +export function meetingStart(doc: GranolaDocument): string | undefined { + const dt = doc.google_calendar_event?.start?.dateTime; + if (dt && !Number.isNaN(Date.parse(dt))) return new Date(dt).toISOString(); + if (doc.created_at && !Number.isNaN(Date.parse(doc.created_at))) { + return new Date(doc.created_at).toISOString(); + } + return undefined; +} + +/** A meeting's best-known end time: calendar end, else last transcript segment. */ +function meetingEnd( + doc: GranolaDocument, + transcript: GranolaTranscriptSegment[], +): string | undefined { + const dt = doc.google_calendar_event?.end?.dateTime; + if (dt && !Number.isNaN(Date.parse(dt))) return new Date(dt).toISOString(); + const last = transcript[transcript.length - 1]?.end_timestamp; + if (last && !Number.isNaN(Date.parse(last))) + return new Date(last).toISOString(); + return undefined; +} + +/** Attendee email list from the calendar event (deduped, lowercased). */ +function attendeeEmails(doc: GranolaDocument): string[] { + const attendees = doc.google_calendar_event?.attendees ?? []; + const emails = new Set(); + for (const a of attendees) { + if (a.email) emails.add(a.email.toLowerCase()); + } + return [...emails]; +} + +/** A human title for the meeting, with sensible fallbacks. */ +export function meetingTitle(doc: GranolaDocument): string { + const t = doc.title?.trim(); + if (t) return t; + const cal = doc.google_calendar_event?.summary?.trim(); + if (cal) return cal; + return "Untitled meeting"; +} + +/** + * Extract the notes body as Markdown. Prefers the document's own + * `notes_markdown`; otherwise renders the first non-empty AI summary panel. + * + * For a panel we prefer its ProseMirror `content` over the HTML + * `original_content`: ProseMirror models nested lists structurally, so the + * converter preserves indentation, whereas Granola's HTML nests `
<ul>` inside + * `<li>` and a flat regex pass would merge a child bullet into its parent line. + * HTML is the fallback when a panel carries no structured content. + */ +export function extractNotes(meeting: GranolaMeeting): string { + const md = meeting.document.notes_markdown?.trim(); + if (md) return md; + + for (const panel of meeting.panels) { + const fromProse = proseMirrorToMarkdown(panel.content); + if (fromProse.trim()) return fromProse.trim(); + const html = panel.original_content?.trim(); + if (html) return htmlToMarkdown(html); + } + + // Last resort: the document's ProseMirror notes, if any. + const fromDocProse = proseMirrorToMarkdown(meeting.document.notes); + return fromDocProse.trim(); +} + +/** + * Render the full meeting into a memory payload. Returns content even when + * notes and transcript are empty (the metadata header alone is still a useful, + * searchable record of the meeting). + */ +export function renderMeeting( + meeting: GranolaMeeting, + options: RenderOptions, +): RenderedMeeting { + const { document: doc } = meeting; + const title = meetingTitle(doc); + const startedAt = meetingStart(doc); + const endedAt = meetingEnd(doc, meeting.transcript); + const emails = attendeeEmails(doc); + + const lines: string[] = [`# ${title}`, ""]; + const metaBits: string[] = []; + if (startedAt) metaBits.push(`**Date:** ${formatDate(startedAt)}`); + if (emails.length > 0) metaBits.push(`**Attendees:** ${emails.join(", ")}`); + if (metaBits.length > 0) { + lines.push(metaBits.join(" \n"), ""); + } + + const notes = extractNotes(meeting); + if (notes) { + lines.push("## Notes", "", notes, ""); + } + + if (options.includeTranscript && meeting.transcript.length > 0) { + lines.push("## Transcript", "", renderTranscript(meeting.transcript), ""); + } + + const meta: Record = { + type: "granola_meeting", + source_tool: "granola", + source_document_id: doc.id, + // A human label for the tree/UI (the leaf `name` is the document id, which + // is the idempotency key); "Title — 
YYYY-MM-DD" so meetings read well and a + // recurring title stays distinguishable by date. + display_name: displayName(title, startedAt), + content_mode: options.includeTranscript ? "with_transcript" : "notes_only", + has_notes: notes.length > 0, + has_transcript: meeting.transcript.length > 0, + transcript_segment_count: meeting.transcript.length, + }; + if (doc.workspace_id) meta.source_workspace_id = doc.workspace_id; + if (doc.google_calendar_event?.id) { + meta.source_calendar_event_id = doc.google_calendar_event.id; + } + if (emails.length > 0) meta.attendees = emails; + if (typeof doc.valid_meeting === "boolean") { + meta.valid_meeting = doc.valid_meeting; + } + + return { + title, + content: lines.join("\n").trimEnd(), + meta, + startedAt, + endedAt, + }; +} + +/** + * Render transcript segments into speaker-turn Markdown. Granola tags each + * segment with a `source` ("microphone" = the local user, "system" = remote + * participants) but no names, so we collapse consecutive same-source segments + * into one labelled paragraph. + */ +function renderTranscript(segments: GranolaTranscriptSegment[]): string { + const blocks: string[] = []; + let currentSource: string | undefined; + let buffer: string[] = []; + + const flush = (): void => { + if (buffer.length === 0) return; + const label = currentSource === "microphone" ? "Me" : "Them"; + blocks.push(`**${label}:** ${buffer.join(" ")}`); + buffer = []; + }; + + for (const seg of segments) { + const text = seg.text?.trim(); + if (!text) continue; + if (seg.source !== currentSource) { + flush(); + currentSource = seg.source; + } + buffer.push(text); + } + flush(); + return blocks.join("\n\n"); +} + +/** + * A human display label: the title, suffixed with the meeting's calendar date + * (`YYYY-MM-DD`, UTC) when known. Used as `meta.display_name` so the web tree + * shows a friendly name while the leaf `name` stays the stable document id. 
+ */ +export function displayName(title: string, startedAt?: string): string { + if (!startedAt) return title; + const date = new Date(startedAt); + if (Number.isNaN(date.getTime())) return title; + return `${title} — ${date.toISOString().slice(0, 10)}`; +} + +/** Format an ISO timestamp as a readable date (UTC, no library). */ +function formatDate(iso: string): string { + const d = new Date(iso); + if (Number.isNaN(d.getTime())) return iso; + return d + .toISOString() + .replace("T", " ") + .replace(/:\d\d\.\d{3}Z$/, " UTC"); +} + +/** + * Convert Granola's panel HTML into Markdown. The panels use a small, fixed + * tag set (`h1`-`h6`, `ul`/`ol`/`li`, `p`, `strong`/`b`, `em`/`i`, `a`, `hr`, + * `br`), so a focused converter is simpler and more predictable than pulling + * in a general HTML→MD dependency. + */ +export function htmlToMarkdown(html: string): string { + let out = html; + + // Links: <a href="x">text</a> → [text](x) + out = out.replace( + /<a\b[^>]*?href=["']([^"']*)["'][^>]*>([\s\S]*?)<\/a>/gi, + (_m, href, text) => `[${stripTags(text).trim()}](${href})`, + ); + // Bold / italic. + out = out.replace(/<(strong|b)\b[^>]*>([\s\S]*?)<\/\1>/gi, "**$2**"); + out = out.replace(/<(em|i)\b[^>]*>([\s\S]*?)<\/\1>/gi, "*$2*"); + + // Headings. + for (let level = 1; level <= 6; level++) { + const hashes = "#".repeat(level); + const re = new RegExp( + `<h${level}\\b[^>]*>([\\s\\S]*?)<\\/h${level}>`, + "gi", + ); + out = out.replace( + re, + (_m, inner) => `\n${hashes} ${stripTags(inner).trim()}\n`, + ); + } + + // List items: turn each <li> into a bullet. Nested lists are flattened: + // their depth in the original markup is lost, so we keep a single level, + // which is good enough for searchable notes. + out = out.replace(/<li\b[^>]*>([\s\S]*?)<\/li>/gi, (_m, inner) => { + const text = stripTags(inner).replace(/\s+/g, " ").trim(); + return text ? `- ${text}\n` : ""; + }); + // Drop list wrappers; their items already became bullets. + out = out.replace(/<\/?(ul|ol)\b[^>]*>/gi, "\n"); + + // Paragraphs and line breaks. + out = out.replace(/<\/p>/gi, "\n\n").replace(/<p\b[^>]*>/gi, ""); + out = out.replace(/<br\s*\/?>/gi, "\n"); + out = out.replace(/<hr\s*\/?>/gi, "\n---\n"); + + // Anything left over. + out = stripTags(out); + out = decodeEntities(out); + + // Collapse excessive blank lines. + return out.replace(/\n{3,}/g, "\n\n").trim(); +} + +/** Strip any remaining HTML tags. */ +function stripTags(s: string): string { + return s.replace(/<[^>]+>/g, ""); +} + +/** Decode the handful of HTML entities Granola emits. */ +function decodeEntities(s: string): string { + return s + .replace(/&amp;/g, "&") + .replace(/&lt;/g, "<") + .replace(/&gt;/g, ">") + .replace(/&quot;/g, '"') + .replace(/&#39;/g, "'") + .replace(/&nbsp;/g, " "); +} + +/** + * Render a ProseMirror document node into Markdown. Handles the node types + * Granola's panels emit (doc, heading, paragraph, bulletList/orderedList, + * listItem, text with bold/link marks, horizontalRule). Returns "" for an + * empty or unrecognized input. + */ +export function proseMirrorToMarkdown(node: unknown): string { + if (!node || typeof node !== "object") return ""; + const lines = renderProseNode(node as ProseNode, 0); + return lines + .join("\n") + .replace(/\n{3,}/g, "\n\n") + .trim(); +} + +interface ProseNode { + type?: string; + text?: string; + attrs?: { level?: number }; + marks?: Array<{ type?: string; attrs?: { href?: string } }>; + content?: ProseNode[]; +} + +/** Render a ProseMirror node to an array of Markdown lines. 
*/ +function renderProseNode(node: ProseNode, listDepth: number): string[] { + switch (node.type) { + case "doc": + return (node.content ?? []).flatMap((c) => [ + ...renderProseNode(c, listDepth), + "", + ]); + case "heading": { + const level = node.attrs?.level ?? 1; + return [`${"#".repeat(level)} ${renderInline(node.content ?? [])}`]; + } + case "paragraph": + return [renderInline(node.content ?? [])]; + case "bulletList": + case "orderedList": + return (node.content ?? []).flatMap((item) => + renderProseNode(item, listDepth), + ); + case "listItem": { + const indent = " ".repeat(listDepth); + const parts = node.content ?? []; + const head = parts[0] ? renderInline(parts[0].content ?? []) : ""; + const lines = [`${indent}- ${head}`]; + // Nested lists or extra paragraphs inside the item. + for (const child of parts.slice(1)) { + lines.push(...renderProseNode(child, listDepth + 1)); + } + return lines; + } + case "horizontalRule": + return ["---"]; + case "text": + return [renderInline([node])]; + default: + return node.content + ? renderProseNode({ type: "doc", content: node.content }, listDepth) + : []; + } +} + +/** Render inline ProseMirror nodes (text with bold/link marks) to Markdown. */ +function renderInline(nodes: ProseNode[]): string { + return nodes + .map((n) => { + if (n.type !== "text" || !n.text) { + // Could be a nested inline structure; recurse on content. + return n.content ? renderInline(n.content) : ""; + } + let text = n.text; + for (const mark of n.marks ?? 
[]) { + if (mark.type === "bold") text = `**${text}**`; + else if (mark.type === "italic") text = `*${text}*`; + else if (mark.type === "link" && mark.attrs?.href) { + text = `[${text}](${mark.attrs.href})`; + } + } + return text; + }) + .join(""); +} diff --git a/packages/web/src/lib/tree-build.test.ts b/packages/web/src/lib/tree-build.test.ts index 71e005e5..a79a0ee5 100644 --- a/packages/web/src/lib/tree-build.test.ts +++ b/packages/web/src/lib/tree-build.test.ts @@ -270,6 +270,33 @@ describe("memoryToLeaf + sortLeaves + titleForMemory", () => { expect(leaf.title).toBe("jwt-rotation"); }); + test("memoryToLeaf prefers meta.display_name over name and content", () => { + const leaf = memoryToLeaf( + mkMemory({ + meta: { display_name: "Weekly Sync — 2026-06-23" }, + name: "d5b1e4f7-c265-4e41-af4a-83f44deef56a", + content: "# Ghost/Memory Weekly\n\nbody", + }), + 0, + ); + expect(leaf.title).toBe("Weekly Sync — 2026-06-23"); + }); + + test("memoryToLeaf ignores a blank or non-string display_name", () => { + expect( + memoryToLeaf( + mkMemory({ meta: { display_name: " " }, name: "keep-name" }), + 0, + ).title, + ).toBe("keep-name"); + expect( + memoryToLeaf( + mkMemory({ meta: { display_name: 42 }, name: "keep-name" }), + 0, + ).title, + ).toBe("keep-name"); + }); + test("sortLeaves: newest temporal first, nulls last, title tiebreak", () => { const leaves = [ memoryToLeaf( diff --git a/packages/web/src/lib/tree-build.ts b/packages/web/src/lib/tree-build.ts index b3748b79..c97d6a2e 100644 --- a/packages/web/src/lib/tree-build.ts +++ b/packages/web/src/lib/tree-build.ts @@ -193,9 +193,14 @@ export function memoryToLeaf( return { kind: "memory", id: memory.id, - // A named memory shows its name (the filename-like leaf); otherwise fall - // back to the first content line, then the id tail. - title: memory.name ?? titleForMemory(memory.content, memory.id), + // Label preference: an explicit `meta.display_name` (a human label an + // importer chose, e.g. 
a meeting's "Title — date", independent of the + // idempotency-key `name`), then the filename-like `name`, then the first + // content line, then the id tail. + title: + displayNameFromMeta(memory.meta) ?? + memory.name ?? + titleForMemory(memory.content, memory.id), tree: memory.tree, temporalStart: memory.temporal?.start ?? null, depth, @@ -313,6 +318,20 @@ function compareLeaves(a: MemoryLeaf, b: MemoryLeaf): number { return a.title.localeCompare(b.title); } +/** + * A non-empty `meta.display_name` string, trimmed, or undefined. Lets an + * importer pick a human-friendly tree label without disturbing the memory's + * `name` (which the engine uses as the `(tree, name)` idempotency key). + */ +export function displayNameFromMeta( + meta: Record<string, unknown> | null | undefined, +): string | undefined { + const value = meta?.display_name; + if (typeof value !== "string") return undefined; + const trimmed = value.trim(); + return trimmed.length > 0 ? trimmed : undefined; +} + /** * First non-empty line of `content`, stripped of leading markdown heading * chars and truncated to ~60 chars. Falls back to the last 8 chars of the