docs: update AGENTS.md and README with pipeline and citation details

This commit is contained in:
Ignacio Ballesteros
2026-02-20 10:02:36 +01:00
parent 511b003da8
commit 9ecc9336da
2 changed files with 113 additions and 9 deletions

View File

@@ -246,21 +246,22 @@ git branch -d feature/my-feature
## Org-Roam Workflow
Notes live in a **separate directory** outside this repo. The export script
converts them to Markdown via ox-hugo, then Quartz builds the site.
Notes live in a **separate directory** outside this repo. The export pipeline
converts them to Markdown via ox-hugo, applies post-processing transforms, then
Quartz builds the site.
### Tooling
The dev shell (`nix develop`) provides:
- `nodejs_22` — Quartz build
- `elixir` — runs the export script
- `elixir` — runs the export script and pipeline
- `emacs` + `ox-hugo` — performs the org → markdown conversion
### Export and build
```bash
# Export only (wipes content/, exports all .org files)
# Export only (wipes content/, exports all .org files, runs pipeline)
NOTES_DIR=/path/to/notes npm run export
# Export then build the site
@@ -270,9 +271,85 @@ NOTES_DIR=/path/to/notes npm run build:notes
elixir scripts/export.exs /path/to/notes
```
The export script (`scripts/export.exs`) runs Emacs in batch mode, calling
`org-hugo-export-to-md` per file (per-file mode, not per-subtree). It uses
YAML frontmatter (ox-hugo default). `content/` is wiped before each export.
Optional env vars for the pipeline:
| Var | Default | Purpose |
| --------------- | ------------------------ | ----------------------------------------- |
| `BIBTEX_FILE` | — | Path to `.bib` file for citation fallback |
| `ZOTERO_URL` | `http://localhost:23119` | Zotero Better BibTeX base URL |
| `CITATION_MODE` | `warn` | `silent` / `warn` / `strict` |
### Export pipeline phases
`scripts/export.exs` runs four phases in sequence:
1. **Wipe** `content/` (preserving `.gitkeep`)
2. **Export** each `.org` file via `emacs --batch` + `ox-hugo``content/**/*.md`
3. **Pipeline** — run Elixir transform modules over every `.md` file
4. **Index** — generate a fallback `content/index.md` if none was exported
The export uses TOML frontmatter (`+++`) and per-file mode (not per-subtree).
### Markdown pipeline (`scripts/pipeline/`)
A standalone Mix project that post-processes `content/*.md` after ox-hugo.
It is compiled automatically on first run; subsequent runs use the `_build/`
cache and are fast.
**Architecture:**
```
scripts/pipeline/
├── mix.exs # deps: req, jason
└── lib/
├── pipeline.ex # Generic runner (fold transforms over .md files)
├── pipeline/
│ ├── application.ex # OTP app — starts Finch HTTP pool
│ ├── transform.ex # Behaviour: init/1, apply/3, teardown/1
│ ├── transforms/
│ │ └── citations.ex # Resolves cite:key → [Label](url)
│ └── resolvers/
│ ├── zotero.ex # JSON-RPC to Zotero Better BibTeX
│ ├── bibtex.ex # Parses local .bib file
│ └── doi.ex # Bare-key fallback (always succeeds)
```
**Adding a new transform:**
1. Create `scripts/pipeline/lib/pipeline/transforms/my_transform.ex`
2. Implement the `Pipeline.Transform` behaviour (`init/1`, `apply/3`)
3. Append the module to `transforms` in `scripts/export.exs`
```elixir
transforms = [
Pipeline.Transforms.Citations,
Pipeline.Transforms.MyTransform, # new
]
```
### Citation resolution (`Pipeline.Transforms.Citations`)
Handles org-citar syntax that passes through ox-hugo unchanged:
| Syntax | Example |
| ---------------- | -------------------- |
| org-cite / citar | `[cite:@key]` |
| multiple keys | `[cite:@key1;@key2]` |
| bare (legacy) | `cite:key` |
Resolution chain (first success wins):
1. **Zotero** — JSON-RPC to `localhost:23119/better-bibtex/json-rpc`
- Calls `item.search` to find the item, then `item.attachments` to get
the PDF link (`zotero://open-pdf/library/items/KEY`)
- Falls back to `zotero://select/library/items/KEY` if no PDF attachment
- Probe uses a JSON-RPC call, **not** `/better-bibtex/cayw`
(that endpoint blocks waiting for interactive input)
2. **BibTeX** — parses `BIBTEX_FILE`; extracts authors, year, DOI/URL
3. **DOI fallback** — always succeeds; renders bare key or `https://doi.org/...`
**Zotero JSON-RPC gotcha:** `Req 0.5` does not allow combining `:finch` and
`:connect_options` in the same call. Use `:receive_timeout` only.
## Important Notes
@@ -280,3 +357,5 @@ YAML frontmatter (ox-hugo default). `content/` is wiped before each export.
- **Isomorphic code**: `quartz/util/path.ts` must not use Node.js APIs
- **Incremental builds**: Plugins can implement `partialEmit` for efficiency
- **Markdown flavors**: Supports Obsidian (`ofm.ts`) and Roam (`roam.ts`) syntax
- **Pipeline build artifacts**: `scripts/pipeline/_build/` and `scripts/pipeline/deps/`
are gitignored — run `mix deps.get` inside `scripts/pipeline/` after a fresh clone

View File

@@ -41,8 +41,33 @@ NOTES_DIR=/path/to/notes npm run serve:notes
NOTES_DIR=/path/to/notes npm run export
```
The export script mirrors the subdirectory structure of your notes into `content/`
and generates a fallback `index.md` if your notes don't include one.
The export pipeline runs in four phases:
1. **Wipe** `content/` clean
2. **Export** every `.org` file via `emacs --batch` + ox-hugo → Markdown
3. **Transform** — post-process the Markdown (citation resolution, etc.)
4. **Index** — generate a fallback `index.md` if none was exported
#### Citations (org-citar → Zotero links)
org-citar references (`[cite:@key]`) are resolved to clickable Zotero links.
With Zotero running and the [Better BibTeX](https://retorque.re/zotero-better-bibtex/)
plugin installed, no extra configuration is needed — the pipeline detects it
automatically and links directly to the PDF in your library.
```bash
# Use a local .bib file as fallback when Zotero is not running
BIBTEX_FILE=/path/to/refs.bib NOTES_DIR=/path/to/notes npm run export
# Control warning verbosity for unresolved keys
CITATION_MODE=strict NOTES_DIR=/path/to/notes npm run export
```
| Env var | Default | Purpose |
| --------------- | ------------------------ | ----------------------------------------- |
| `BIBTEX_FILE` | — | Path to `.bib` file for citation fallback |
| `ZOTERO_URL` | `http://localhost:23119` | Zotero Better BibTeX base URL |
| `CITATION_MODE` | `warn` | `silent` / `warn` / `strict` |
### Building without org-roam notes