Compare commits

...

9 Commits

Author SHA1 Message Date
Ignacio Ballesteros
0ea5808cd2 initial flake app 2026-02-20 15:08:04 +01:00
Ignacio Ballesteros
d913138726 fix(ox-hugo): serve gitignored content and static assets, fix figure shortcode order
- glob.ts: add optional respectGitignore param (default true) so callers
  can opt out of gitignore filtering for generated directories
- build.ts: pass respectGitignore=false when globbing content/ so
  gitignored-but-generated markdown files are always picked up
- static.ts: copy top-level static/ → output/ root with gitignore disabled,
  mirroring Hugo's convention so ox-hugo images at static/ox-hugo/* are
  served at /ox-hugo/* as expected by the exported markdown src paths
- oxhugofm.ts: run replaceFigureWithMdImg before removeHugoShortcode so the
  precise figure regex sees the intact shortcode before the generic {{.*}}
  stripper destroys it; also upgrade figureTagRegex to figureShortcodeRegex
  which matches the full shortcode form ox-hugo actually emits
2026-02-20 11:11:42 +01:00
Ignacio Ballesteros
3deec2d011 example: images 2026-02-20 10:51:05 +01:00
Ignacio Ballesteros
a907d6513b docs(examples): add image reference example note with all three scenarios
Adds notes/example-images.org demonstrating how org-mode image links of
each kind pass through ox-hugo and land in content/:

  Scenario 1 — External URL: linked image using #+ATTR_HTML + a local
    thumbnail as the display image; ox-hugo emits a {{< figure >}} shortcode
    with a link= attribute pointing to the remote URL.

  Scenario 2 — Local file (same notes dir): [[file:quartz-logo.jpg]]
    ox-hugo copies the file to static/ox-hugo/ and rewrites the src to
    /ox-hugo/quartz-logo.jpg.

  Scenario 3 — Absolute path outside notes dir: [[file:/abs/path/img.jpg]]
    ox-hugo copies the file from outside the notes tree into static/ox-hugo/
    and rewrites the src identically to scenario 2.

Also adds:
  notes/quartz-logo.jpg          — image asset for scenarios 1 & 2
  notes-external/                — simulates an assets dir outside notes
  static/ox-hugo/ added to .gitignore (populated at export time)
2026-02-20 10:09:54 +01:00
Ignacio Ballesteros
9ecc9336da docs: update AGENTS.md and README with pipeline and citation details 2026-02-20 10:02:36 +01:00
Ignacio Ballesteros
511b003da8 Add Elixir markdown pipeline with org-citar citation resolution
Introduces scripts/pipeline/, a Mix project that runs as a post-export
transformation pass over content/*.md before Quartz builds the site.

Pipeline (scripts/export.exs phase 3):
- Compiles and loads the Mix project at export time (cached after first run)
- Applies a list of Transform modules sequentially over all .md files
- Only rewrites files that were actually changed

Citations transform (Pipeline.Transforms.Citations):
- Resolves [cite:@key] and bare cite:key syntax produced by ox-hugo/citar
- Resolution chain: Zotero BBT JSON-RPC → BibTeX file → DOI/bare-key fallback
- Zotero probe uses a no-op JSON-RPC call (cayw endpoint blocks indefinitely)
- Zotero resolver fetches PDF attachments via item.attachments, producing
  zotero://open-pdf/... links; falls back to zotero://select/library/items/...
- BibTeX resolver parses .bib files with a simple regex parser (no deps)
- DOI resolver is the always-succeeding last resort

Configuration via env vars:
  BIBTEX_FILE   — path to .bib file for fallback resolution
  ZOTERO_URL    — Zotero base URL (default: http://localhost:23119)
  CITATION_MODE — silent | warn (default) | strict

Adding future transforms requires only implementing Pipeline.Transform
behaviour and appending the module to the transforms list in export.exs.
2026-02-20 10:00:11 +01:00
Ignacio Ballesteros
ec1426241c Ignore generated content/ output and erl_crash.dump 2026-02-19 18:21:02 +01:00
Ignacio Ballesteros
692f23bc36 Add org-roam workflow: ox-hugo export pipeline and example notes
- Add flake.nix dev shell with Node.js 22, Elixir, and Emacs+ox-hugo
- Add scripts/export.exs: exports org-roam notes to content/ via ox-hugo,
  mirroring subdirectory structure and generating a fallback index.md
- Add npm scripts: export, build:notes, serve:notes
- Configure FrontMatter plugin for TOML (ox-hugo default output)
- Replace ObsidianFlavoredMarkdown with OxHugoFlavouredMarkdown
- Add example notes: Madrid public transport (metro, bus, roads)
- Update README and AGENTS.md with org-roam workflow instructions
2026-02-19 18:20:43 +01:00
Ignacio Ballesteros
86eaaae945 Add branch workflow documentation to AGENTS.md 2026-02-18 12:44:49 +01:00
32 changed files with 1830 additions and 17 deletions

13
.gitignore vendored
View File

@@ -9,3 +9,16 @@ tsconfig.tsbuildinfo
private/
.replit
replit.nix
erl_crash.dump
# content/ is generated by the export script; only keep the placeholder
content/*
!content/.gitkeep
# static/ox-hugo/ is populated by ox-hugo during export
static/ox-hugo/
# Elixir/Mix build artifacts for the pipeline project
scripts/pipeline/_build/
scripts/pipeline/deps/
scripts/pipeline/erl_crash.dump
# Test helpers (not needed in production)
scripts/test.bib
scripts/test_pipeline.exs

361
AGENTS.md Normal file
View File

@@ -0,0 +1,361 @@
# AGENTS.md - Coding Agent Instructions
This document provides essential information for AI coding agents working in this repository.
## Project Overview
**Quartz** is a static site generator for publishing digital gardens and notes as websites.
Built with TypeScript, Preact, and unified/remark/rehype for markdown processing.
| Stack | Technology |
| ------------- | ----------------------------------------- |
| Language | TypeScript 5.x (strict mode) |
| Runtime | Node.js >=22 (v22.16.0 pinned) |
| Package Mgr | npm >=10.9.2 |
| Module System | ES Modules (`"type": "module"`) |
| UI Framework | Preact 10.x (JSX with `react-jsx` pragma) |
| Build Tool | esbuild |
| Styling | SCSS via esbuild-sass-plugin |
## Environment
This is a Nix project. Use the provided `flake.nix` to enter a dev shell with Node.js 22 and npm:
```bash
nix develop
```
All `npm` commands below must be run inside the dev shell.
## Build, Lint, and Test Commands
```bash
# Type check and format check (CI validation)
npm run check
# Auto-format code with Prettier
npm run format
# Run all tests
npm run test
# Run a single test file
npx tsx --test quartz/util/path.test.ts
# Run tests matching a pattern (use --test-name-pattern)
npx tsx --test --test-name-pattern="typeguards" quartz/util/path.test.ts
# Build the static site
npx quartz build
# Build and serve with hot reload
npx quartz build --serve
# Profile build performance
npm run profile
```
### Test Files Location
Tests use Node.js native test runner via `tsx`. Test files follow the `*.test.ts` pattern:
- `quartz/util/path.test.ts`
- `quartz/util/fileTrie.test.ts`
- `quartz/components/scripts/search.test.ts`
## Code Style Guidelines
### Prettier Configuration (`.prettierrc`)
```json
{
"printWidth": 100,
"tabWidth": 2,
"semi": false,
"trailingComma": "all",
"quoteProps": "as-needed"
}
```
**No ESLint** - only Prettier for formatting. Run `npm run format` before committing.
### TypeScript Configuration
- **Strict mode enabled** (`strict: true`)
- `noUnusedLocals: true` - no unused variables
- `noUnusedParameters: true` - no unused function parameters
- JSX configured for Preact (`jsxImportSource: "preact"`)
### Import Conventions
```typescript
// 1. External packages first
import { PluggableList } from "unified"
import { visit } from "unist-util-visit"
// 2. Internal utilities/types (relative paths)
import { QuartzTransformerPlugin } from "../types"
import { FilePath, slugifyFilePath } from "../../util/path"
import { i18n } from "../../i18n"
```
### Naming Conventions
| Element | Convention | Example |
| ---------------- | ------------ | ----------------------------------- |
| Files (utils) | camelCase | `path.ts`, `fileTrie.ts` |
| Files (comps) | PascalCase | `TableOfContents.tsx`, `Search.tsx` |
| Types/Interfaces | PascalCase | `QuartzComponent`, `FullSlug` |
| Type Guards | `is*` prefix | `isFilePath()`, `isFullSlug()` |
| Constants | UPPER_CASE | `QUARTZ`, `UPSTREAM_NAME` |
| Options types | `Options` | `interface Options { ... }` |
### Branded Types Pattern
This codebase uses branded types for type-safe path handling:
```typescript
type SlugLike<T> = string & { __brand: T }
export type FilePath = SlugLike<"filepath">
export type FullSlug = SlugLike<"full">
export type SimpleSlug = SlugLike<"simple">
// Always validate with type guards before using
export function isFilePath(s: string): s is FilePath { ... }
```
### Component Pattern (Preact)
Components use a factory function pattern with attached static properties:
```typescript
export default ((userOpts?: Partial<Options>) => {
const opts: Options = { ...defaultOptions, ...userOpts }
const ComponentName: QuartzComponent = ({ cfg, displayClass }: QuartzComponentProps) => {
return <div class={classNames(displayClass, "component-name")}>...</div>
}
ComponentName.css = style // SCSS styles
ComponentName.afterDOMLoaded = script // Client-side JS
return ComponentName
}) satisfies QuartzComponentConstructor
```
### Plugin Pattern
Three plugin types: transformers, filters, and emitters.
```typescript
export const PluginName: QuartzTransformerPlugin<Partial<Options>> = (userOpts) => {
const opts = { ...defaultOptions, ...userOpts }
return {
name: "PluginName",
markdownPlugins(ctx) { return [...] },
htmlPlugins(ctx) { return [...] },
externalResources(ctx) { return { js: [], css: [] } },
}
}
```
### Testing Pattern
Use Node.js native test runner with `assert`:
```typescript
import test, { describe, beforeEach } from "node:test"
import assert from "node:assert"
describe("FeatureName", () => {
test("should do something", () => {
assert.strictEqual(actual, expected)
assert.deepStrictEqual(actualObj, expectedObj)
assert(condition) // truthy assertion
assert(!condition) // falsy assertion
})
})
```
### Error Handling
- Use `try/catch` for critical operations (file I/O, parsing)
- Custom `trace` utility for error reporting with stack traces
- `process.exit(1)` for fatal errors
- `console.warn()` for non-fatal issues
### Async Patterns
- Prefer `async/await` over raw promises
- Use async generators (`async *emit()`) for streaming file output
- Use `async-mutex` for concurrent build protection
## Project Structure
```
quartz/
├── bootstrap-cli.mjs # CLI entry point
├── build.ts # Build orchestration
├── cfg.ts # Configuration types
├── components/ # Preact UI components
│ ├── *.tsx # Components
│ ├── scripts/ # Client-side scripts (*.inline.ts)
│ └── styles/ # Component SCSS
├── plugins/
│ ├── transformers/ # Markdown AST transformers
│ ├── filters/ # Content filters
│ ├── emitters/ # Output generators
│ └── types.ts # Plugin type definitions
├── processors/ # Build pipeline (parse/filter/emit)
├── util/ # Utility functions
└── i18n/ # Internationalization (30+ locales)
```
## Branch Workflow
This is a fork of [jackyzha0/quartz](https://github.com/jackyzha0/quartz) with org-roam customizations.
| Branch | Purpose |
| ----------- | ------------------------------------------------ |
| `main` | Clean mirror of upstream quartz — no custom code |
| `org-roam` | Default branch — all customizations live here |
| `feature/*` | Short-lived branches off `org-roam` |
### Pulling Upstream Updates
```bash
git checkout main
git fetch upstream
git merge upstream/main
git checkout org-roam
git merge main
# Resolve conflicts if any, then commit
```
### Working on Features
```bash
git checkout org-roam
git checkout -b feature/my-feature
# ... work ...
git checkout org-roam
git merge feature/my-feature
git branch -d feature/my-feature
```
**Merge direction:** `upstream → main → org-roam → feature/*`
## Org-Roam Workflow
Notes live in a **separate directory** outside this repo. The export pipeline
converts them to Markdown via ox-hugo, applies post-processing transforms, then
Quartz builds the site.
### Tooling
The dev shell (`nix develop`) provides:
- `nodejs_22` — Quartz build
- `elixir` — runs the export script and pipeline
- `emacs` + `ox-hugo` — performs the org → markdown conversion
### Export and build
```bash
# Export only (wipes content/, exports all .org files, runs pipeline)
NOTES_DIR=/path/to/notes npm run export
# Export then build the site
NOTES_DIR=/path/to/notes npm run build:notes
# Positional arg also works
elixir scripts/export.exs /path/to/notes
```
Optional env vars for the pipeline:
| Var | Default | Purpose |
| --------------- | ------------------------ | ----------------------------------------- |
| `BIBTEX_FILE` | — | Path to `.bib` file for citation fallback |
| `ZOTERO_URL` | `http://localhost:23119` | Zotero Better BibTeX base URL |
| `CITATION_MODE` | `warn` | `silent` / `warn` / `strict` |
### Export pipeline phases
`scripts/export.exs` runs four phases in sequence:
1. **Wipe** `content/` (preserving `.gitkeep`)
2. **Export** each `.org` file via `emacs --batch` + `ox-hugo``content/**/*.md`
3. **Pipeline** — run Elixir transform modules over every `.md` file
4. **Index** — generate a fallback `content/index.md` if none was exported
The export uses TOML frontmatter (`+++`) and per-file mode (not per-subtree).
### Markdown pipeline (`scripts/pipeline/`)
A standalone Mix project that post-processes `content/*.md` after ox-hugo.
It is compiled automatically on first run; subsequent runs use the `_build/`
cache and are fast.
**Architecture:**
```
scripts/pipeline/
├── mix.exs # deps: req, jason
└── lib/
├── pipeline.ex # Generic runner (fold transforms over .md files)
├── pipeline/
│ ├── application.ex # OTP app — starts Finch HTTP pool
│ ├── transform.ex # Behaviour: init/1, apply/3, teardown/1
│ ├── transforms/
│ │ └── citations.ex # Resolves cite:key → [Label](url)
│ └── resolvers/
│ ├── zotero.ex # JSON-RPC to Zotero Better BibTeX
│ ├── bibtex.ex # Parses local .bib file
│ └── doi.ex # Bare-key fallback (always succeeds)
```
**Adding a new transform:**
1. Create `scripts/pipeline/lib/pipeline/transforms/my_transform.ex`
2. Implement the `Pipeline.Transform` behaviour (`init/1`, `apply/3`)
3. Append the module to `transforms` in `scripts/export.exs`
```elixir
transforms = [
Pipeline.Transforms.Citations,
Pipeline.Transforms.MyTransform, # new
]
```
### Citation resolution (`Pipeline.Transforms.Citations`)
Handles org-citar syntax that passes through ox-hugo unchanged:
| Syntax | Example |
| ---------------- | -------------------- |
| org-cite / citar | `[cite:@key]` |
| multiple keys | `[cite:@key1;@key2]` |
| bare (legacy) | `cite:key` |
Resolution chain (first success wins):
1. **Zotero** — JSON-RPC to `localhost:23119/better-bibtex/json-rpc`
- Calls `item.search` to find the item, then `item.attachments` to get
the PDF link (`zotero://open-pdf/library/items/KEY`)
- Falls back to `zotero://select/library/items/KEY` if no PDF attachment
- Probe uses a JSON-RPC call, **not** `/better-bibtex/cayw`
(that endpoint blocks waiting for interactive input)
2. **BibTeX** — parses `BIBTEX_FILE`; extracts authors, year, DOI/URL
3. **DOI fallback** — always succeeds; renders bare key or `https://doi.org/...`
**Zotero JSON-RPC gotcha:** `Req 0.5` does not allow combining `:finch` and
`:connect_options` in the same call. Use `:receive_timeout` only.
## Important Notes
- **Client-side scripts**: Use `.inline.ts` suffix, bundled via esbuild
- **Isomorphic code**: `quartz/util/path.ts` must not use Node.js APIs
- **Incremental builds**: Plugins can implement `partialEmit` for efficiency
- **Markdown flavors**: Supports Obsidian (`ofm.ts`) and Roam (`roam.ts`) syntax
- **Pipeline build artifacts**: `scripts/pipeline/_build/` and `scripts/pipeline/deps/`
are gitignored — run `mix deps.get` inside `scripts/pipeline/` after a fresh clone

View File

@@ -1,13 +1,96 @@
# Quartz v4
# Quartz v4 — org-roam edition
> [One] who works with the door open gets all kinds of interruptions, but [they] also occasionally gets clues as to what the world is and what might be important. — Richard Hamming
> "[One] who works with the door open gets all kinds of interruptions, but [they] also occasionally gets clues as to what the world is and what might be important." — Richard Hamming
Quartz is a set of tools that helps you publish your [digital garden](https://jzhao.xyz/posts/networked-thought) and notes as a website for free.
🔗 Read the documentation and get started: https://quartz.jzhao.xyz/
This fork adds first-class support for [org-roam](https://www.orgroam.com/) notes via [ox-hugo](https://ox-hugo.scripter.co/).
🔗 Upstream documentation: https://quartz.jzhao.xyz/
[Join the Discord Community](https://discord.gg/cRFFHYye7t)
## Quick Start
### Prerequisites
This project uses Nix. Enter the development shell, which provides Node.js 22, Elixir, and Emacs with ox-hugo:
```bash
nix develop
```
All commands below must be run inside this shell.
```bash
npm install
```
### Building from org-roam notes
Your org-roam notes live in a separate directory. Point `NOTES_DIR` at it:
```bash
# Export notes to content/ and build the site
NOTES_DIR=/path/to/notes npm run build:notes
# Export, build, and serve with hot reload
NOTES_DIR=/path/to/notes npm run serve:notes
# Export only (wipes content/ and re-exports all .org files)
NOTES_DIR=/path/to/notes npm run export
```
The export pipeline runs in four phases:
1. **Wipe** `content/` clean
2. **Export** every `.org` file via `emacs --batch` + ox-hugo → Markdown
3. **Transform** — post-process the Markdown (citation resolution, etc.)
4. **Index** — generate a fallback `index.md` if none was exported
#### Citations (org-citar → Zotero links)
org-citar references (`[cite:@key]`) are resolved to clickable Zotero links.
With Zotero running and the [Better BibTeX](https://retorque.re/zotero-better-bibtex/)
plugin installed, no extra configuration is needed — the pipeline detects it
automatically and links directly to the PDF in your library.
```bash
# Use a local .bib file as fallback when Zotero is not running
BIBTEX_FILE=/path/to/refs.bib NOTES_DIR=/path/to/notes npm run export
# Control warning verbosity for unresolved keys
CITATION_MODE=strict NOTES_DIR=/path/to/notes npm run export
```
| Env var | Default | Purpose |
| --------------- | ------------------------ | ----------------------------------------- |
| `BIBTEX_FILE` | — | Path to `.bib` file for citation fallback |
| `ZOTERO_URL` | `http://localhost:23119` | Zotero Better BibTeX base URL |
| `CITATION_MODE` | `warn` | `silent` / `warn` / `strict` |
### Building without org-roam notes
If you manage `content/` directly with Markdown files:
```bash
# Build the site
npx quartz build
# Build and serve with hot reload
npx quartz build --serve
```
The site is generated in `public/`. When serving, visit http://localhost:8080.
### Development
```bash
npm run check # type check + format check
npm run format # auto-format with Prettier
npm run test # run tests
```
## Sponsors
<p align="center">

61
flake.lock generated Normal file
View File

@@ -0,0 +1,61 @@
{
"nodes": {
"flake-utils": {
"inputs": {
"systems": "systems"
},
"locked": {
"lastModified": 1731533236,
"narHash": "sha256-l0KFg5HjrsfsO/JpG+r7fRrqm12kzFHyUHqHCVpMMbI=",
"owner": "numtide",
"repo": "flake-utils",
"rev": "11707dc2f618dd54ca8739b309ec4fc024de578b",
"type": "github"
},
"original": {
"owner": "numtide",
"repo": "flake-utils",
"type": "github"
}
},
"nixpkgs": {
"locked": {
"lastModified": 1771008912,
"narHash": "sha256-gf2AmWVTs8lEq7z/3ZAsgnZDhWIckkb+ZnAo5RzSxJg=",
"owner": "NixOS",
"repo": "nixpkgs",
"rev": "a82ccc39b39b621151d6732718e3e250109076fa",
"type": "github"
},
"original": {
"owner": "NixOS",
"ref": "nixos-unstable",
"repo": "nixpkgs",
"type": "github"
}
},
"root": {
"inputs": {
"flake-utils": "flake-utils",
"nixpkgs": "nixpkgs"
}
},
"systems": {
"locked": {
"lastModified": 1681028828,
"narHash": "sha256-Vy1rq5AaRuLzOxct8nz4T6wlgyUR7zLU309k9mBC768=",
"owner": "nix-systems",
"repo": "default",
"rev": "da67096a3b9bf56a91d16901293e51ba5b49a27e",
"type": "github"
},
"original": {
"owner": "nix-systems",
"repo": "default",
"type": "github"
}
}
},
"root": "root",
"version": 7
}

99
flake.nix Normal file
View File

@@ -0,0 +1,99 @@
{
description = "Quartz org-roam dev shell and build app";
inputs = {
nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
flake-utils.url = "github:numtide/flake-utils";
};
outputs = { self, nixpkgs, flake-utils }:
flake-utils.lib.eachDefaultSystem (system:
let
pkgs = import nixpkgs { inherit system; };
# Emacs with ox-hugo — shared between devShell and buildApp
emacsWithOxHugo = (pkgs.emacsPackagesFor pkgs.emacs-nox).emacsWithPackages
(epkgs: [ epkgs.ox-hugo ]);
# Pre-fetched npm dependency tree (node_modules)
quartzDeps = pkgs.buildNpmPackage {
pname = "quartz-deps";
version = "4.5.2";
src = ./.;
npmDepsHash = "sha256-7u+VlIx44B3/ivM9vLMIOn+e4TL4eS6B682vhS+Ikb4=";
dontBuild = true;
installPhase = ''
mkdir -p $out
cp -r node_modules $out/node_modules
'';
};
# Pre-fetched Hex/Mix dependencies for scripts/pipeline
pipelineMixDeps = pkgs.beamPackages.fetchMixDeps {
pname = "pipeline-mix-deps";
version = "0.1.0";
src = ./scripts/pipeline;
sha256 = "sha256-E79X+nUy86G1Jrwv3T7dXekoGv8Hd14ZgJSKWjvlmAw=";
};
# The build application wrapper script
buildApp = pkgs.writeShellApplication {
name = "build";
runtimeInputs = [ pkgs.nodejs_22 pkgs.elixir emacsWithOxHugo ];
text = ''
NOTES_DIR="''${1:?Usage: build <path-to-notes-dir>}"
NOTES_DIR=$(realpath "$NOTES_DIR")
ORIG_CWD=$(pwd)
# Set up a writable working copy of the repo in a temp dir
WORK=$(mktemp -d)
trap 'rm -rf "$WORK"' EXIT
cp -r ${self}/. "$WORK/repo"
chmod -R u+w "$WORK/repo"
# Drop in pre-built node_modules
ln -s ${quartzDeps}/node_modules "$WORK/repo/node_modules"
# Drop in pre-fetched Mix deps so mix compile runs offline
cp -r ${pipelineMixDeps} "$WORK/repo/scripts/pipeline/deps"
chmod -R u+w "$WORK/repo/scripts/pipeline/deps"
# ox-hugo requires static/ to exist before it can copy image assets
mkdir -p "$WORK/repo/static"
# Run the export pipeline (org md, citations transform)
NOTES_DIR="$NOTES_DIR" elixir "$WORK/repo/scripts/export.exs"
# Build the static site from within the repo copy so relative paths
# (e.g. ./package.json in constants.js) resolve correctly.
# --output is absolute so the result lands in the caller's cwd.
cd "$WORK/repo"
node quartz/bootstrap-cli.mjs build \
--directory "$WORK/repo/content" \
--output "$ORIG_CWD/public"
'';
};
in
{
devShells.default = pkgs.mkShell {
buildInputs = [
pkgs.nodejs_22
pkgs.elixir
emacsWithOxHugo
pkgs.mcp-nixos
];
shellHook = ''
echo "Node $(node --version) / npm $(npm --version)"
elixir --version 2>/dev/null | head -1 || true
echo "Emacs $(emacs --version | head -1)"
'';
};
packages.default = buildApp;
packages.build = buildApp;
apps.default = { type = "app"; program = "${buildApp}/bin/build"; };
apps.build = { type = "app"; program = "${buildApp}/bin/build"; };
});
}

Binary file not shown.

After

Width:  |  Height:  |  Size: 8.2 KiB

16
notes/bus/emt-madrid.org Normal file
View File

@@ -0,0 +1,16 @@
:PROPERTIES:
:ID: emt-madrid
:END:
#+title: EMT Madrid (urban bus)
Empresa Municipal de Transportes (EMT) operates the urban bus network
within the municipality of Madrid — around 200 lines.
* Notable lines
- *Line 27* — connects Embajadores with Barrio de la Concepción, one of the
oldest routes in the network.
- *Line 34* — Argüelles to Carabanchel, crossing the city centre via Gran Vía.
- *Búho (owl) lines* — night buses running from Cibeles from midnight to 6 am.
* See also
- [[id:madrid-transport][Madrid Public Transport]]

View File

@@ -0,0 +1,13 @@
#+title: Example: Citation Reference
This file demonstrates how org-citar citations pass through ox-hugo into
markdown, where the pipeline transform resolves them.
The methodology described in [cite:@podlovics2021journalArticle] provides a
useful framework for analysis.
Multiple citations can appear together:
[cite:@podlovics2021journalArticle]
Older bare-cite style (org-roam v1 / older citar) also works:
cite:podlovics2021journalArticle

33
notes/example-images.org Normal file
View File

@@ -0,0 +1,33 @@
:PROPERTIES:
:ID: example-images
:END:
#+title: Example: Image References
This note demonstrates the three image reference scenarios that the pipeline
must handle.
* Scenario 1: External image (URL)
An image hosted on the web — ox-hugo passes the URL through as-is and no
local file handling is needed.
#+attr_html: :link "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcSkzsTuLOt8esM6enoKwkzqA52G3p9hldlf2g&s"
[[file:quartz-logo-external.png]]
* Scenario 2: Local image (same notes directory)
An image sitting next to this .org file inside the notes directory.
ox-hugo copies files referenced with a relative path into the Hugo =static/=
assets tree automatically.
#+CAPTION: Quartz logo (local, same notes dir)
[[file:quartz-logo.png]]
* Scenario 3: External image (outside notes directory)
An image that lives outside the notes directory entirely — for example a
shared assets folder or a system path. ox-hugo still copies it into =static/=
and rewrites the reference.
#+CAPTION: Quartz logo (outside notes dir)
[[file:../notes-external/external-location-image.png]]

View File

@@ -0,0 +1,17 @@
:PROPERTIES:
:ID: madrid-transport
:END:
#+title: Madrid Public Transport
Madrid has one of the most extensive public transport networks in Europe,
operated primarily by [[id:crtm][Consorcio Regional de Transportes de Madrid]] (CRTM).
* Modes
- [[id:metro-madrid][Metro de Madrid]] — 13 lines, ~300 km of track
- [[id:emt-madrid][EMT Bus]] — urban buses within the city
- Cercanías — suburban rail run by Renfe
- Interurbano — regional buses to the wider Community of Madrid
* Ticketing
A single [[https://www.crtm.es][tarjeta transporte]] (transport card) works across all modes.
The Multi card covers zones AC and is topped up at any metro station.

View File

@@ -0,0 +1,18 @@
:PROPERTIES:
:ID: metro-madrid
:END:
#+title: Metro de Madrid
The Madrid Metro is the main rapid transit network in the city, opened in 1919.
It is the second oldest metro in the Iberian Peninsula after Barcelona.
* Key Lines
| Line | Name | Colour | Terminals |
|------+-----------------+--------+------------------------------|
| L1 | Pinar de ChamartínValdecarros | Blue | Pinar de Chamartín / Valdecarros |
| L6 | Circular | Grey | Circular (loop) |
| L10 | — | Dark blue | Hospital Infanta Sofía / Tres Olivos |
* See also
- [[id:madrid-transport][Madrid Public Transport]]
- [[id:sol-interchange][Sol interchange]]

View File

@@ -0,0 +1,12 @@
:PROPERTIES:
:ID: sol-interchange
:END:
#+title: Sol (interchange)
Sol is the busiest interchange station in the Madrid Metro, sitting beneath
Puerta del Sol in the city centre.
Lines serving Sol: [[id:metro-madrid][L1]], L2, L3.
It also connects to the Cercanías hub underneath, making it the de-facto
zero point of Madrid's public transport.

BIN
notes/quartz-logo.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 8.2 KiB

22
notes/roads/crtm.org Normal file
View File

@@ -0,0 +1,22 @@
:PROPERTIES:
:ID: crtm
:END:
#+title: CRTM — Consorcio Regional de Transportes de Madrid
The CRTM is the regional authority that coordinates public transport across
the Community of Madrid. It does not operate services directly but sets
fares, zones, and integration policy.
* Fare zones
| Zone | Coverage |
|-------+-----------------------------|
| A | Municipality of Madrid |
| B1 | Inner ring municipalities |
| B2 | Outer ring municipalities |
| B3 | Further suburban area |
| C1C2 | Commuter belt |
* Related
- [[id:madrid-transport][Madrid Public Transport]]
- [[id:metro-madrid][Metro de Madrid]]
- [[id:emt-madrid][EMT Madrid]]

19
notes/roads/m30.org Normal file
View File

@@ -0,0 +1,19 @@
:PROPERTIES:
:ID: m30
:END:
#+title: M-30
The M-30 is Madrid's innermost ring road, circling the city centre at a
radius of roughly 35 km from Puerta del Sol.
It runs mostly underground through the Madrid Río tunnel section along the
Manzanares river, built during the 20042007 renovation that reclaimed the
riverbank as a public park.
* Key junctions
- Nudo Norte — connects to A-1 (Burgos) and A-6 (La Coruña)
- Nudo Sur — connects to A-4 (Cádiz) and A-42 (Toledo)
* See also
- [[id:crtm][CRTM]]
- [[id:madrid-transport][Madrid Public Transport]]

10
opencode.json Normal file
View File

@@ -0,0 +1,10 @@
{
"$schema": "https://opencode.ai/config.json",
"mcp": {
"nixos": {
"type": "local",
"command": ["mcp-nixos"],
"enabled": true
}
}
}

View File

@@ -17,7 +17,10 @@
"check": "tsc --noEmit && npx prettier . --check",
"format": "npx prettier . --write",
"test": "tsx --test",
"profile": "0x -D prof ./quartz/bootstrap-cli.mjs build --concurrency=1"
"profile": "0x -D prof ./quartz/bootstrap-cli.mjs build --concurrency=1",
"export": "elixir scripts/export.exs",
"build:notes": "elixir scripts/export.exs && npx quartz build",
"serve:notes": "elixir scripts/export.exs && npx quartz build --serve"
},
"engines": {
"npm": ">=10.9.2",

View File

@@ -55,7 +55,7 @@ const config: QuartzConfig = {
},
plugins: {
transformers: [
Plugin.FrontMatter(),
Plugin.FrontMatter({ delimiters: "+++", language: "toml" }),
Plugin.CreatedModifiedDate({
priority: ["frontmatter", "git", "filesystem"],
}),
@@ -66,7 +66,11 @@ const config: QuartzConfig = {
},
keepBackground: false,
}),
Plugin.ObsidianFlavoredMarkdown({ enableInHtmlEmbed: false }),
// OxHugoFlavouredMarkdown must come before GitHubFlavoredMarkdown.
// Note: not compatible with ObsidianFlavoredMarkdown — use one or the other.
// If ox-hugo exports TOML frontmatter, change FrontMatter to:
// Plugin.FrontMatter({ delims: "+++", language: "toml" })
Plugin.OxHugoFlavouredMarkdown(),
Plugin.GitHubFlavoredMarkdown(),
Plugin.TableOfContents(),
Plugin.CrawlLinks({ markdownLinkResolution: "shortest" }),

View File

@@ -71,7 +71,7 @@ async function buildQuartz(argv: Argv, mut: Mutex, clientRefresh: () => void) {
console.log(`Cleaned output directory \`${output}\` in ${perf.timeSince("clean")}`)
perf.addEvent("glob")
const allFiles = await glob("**/*.*", argv.directory, cfg.configuration.ignorePatterns)
const allFiles = await glob("**/*.*", argv.directory, cfg.configuration.ignorePatterns, false)
const markdownPaths = allFiles.filter((fp) => fp.endsWith(".md")).sort()
console.log(
`Found ${markdownPaths.length} input files from \`${argv.directory}\` in ${perf.timeSince("glob")}`,

View File

@@ -7,6 +7,7 @@ import { dirname } from "path"
export const Static: QuartzEmitterPlugin = () => ({
name: "Static",
async *emit({ argv, cfg }) {
// Copy Quartz's own internal static assets (quartz/static/) → output/static/
const staticPath = joinSegments(QUARTZ, "static")
const fps = await glob("**", staticPath, cfg.configuration.ignorePatterns)
const outputStaticPath = joinSegments(argv.output, "static")
@@ -18,6 +19,21 @@ export const Static: QuartzEmitterPlugin = () => ({
await fs.promises.copyFile(src, dest)
yield dest
}
// Copy user-facing static assets (static/) → output/ preserving paths.
// This mirrors Hugo's convention: static/ox-hugo/foo.png is served at /ox-hugo/foo.png,
// which matches the src="/ox-hugo/..." paths that ox-hugo writes into exported markdown.
const userStaticPath = "static"
if (fs.existsSync(userStaticPath)) {
const userFps = await glob("**", userStaticPath, cfg.configuration.ignorePatterns, false)
for (const fp of userFps) {
const src = joinSegments(userStaticPath, fp) as FilePath
const dest = joinSegments(argv.output, fp) as FilePath
await fs.promises.mkdir(dirname(dest), { recursive: true })
await fs.promises.copyFile(src, dest)
yield dest
}
}
},
async *partialEmit() {},
})

View File

@@ -27,7 +27,10 @@ const defaultOptions: Options = {
const relrefRegex = new RegExp(/\[([^\]]+)\]\(\{\{< relref "([^"]+)" >\}\}\)/, "g")
const predefinedHeadingIdRegex = new RegExp(/(.*) {#(?:.*)}/, "g")
const hugoShortcodeRegex = new RegExp(/{{(.*)}}/, "g")
const figureTagRegex = new RegExp(/< ?figure src="(.*)" ?>/, "g")
// Matches the full Hugo {{< figure src="..." ... >}} shortcode and captures src.
// Must run before the generic shortcode stripper to avoid partial-match issues
// with captions that contain HTML (e.g. <span class="figure-number">).
const figureShortcodeRegex = new RegExp(/{{<\s*figure\b[^}]*\bsrc="([^"]*)"[^}]*>}}/, "g")
// \\\\\( -> matches \\(
// (.+?) -> Lazy match for capturing the equation
// \\\\\) -> matches \\)
@@ -70,6 +73,14 @@ export const OxHugoFlavouredMarkdown: QuartzTransformerPlugin<Partial<Options>>
})
}
if (opts.replaceFigureWithMdImg) {
src = src.toString()
src = src.replaceAll(figureShortcodeRegex, (_value, ...capture) => {
const [imgSrc] = capture
return `![](${imgSrc})`
})
}
if (opts.removeHugoShortcode) {
src = src.toString()
src = src.replaceAll(hugoShortcodeRegex, (_value, ...capture) => {
@@ -78,14 +89,6 @@ export const OxHugoFlavouredMarkdown: QuartzTransformerPlugin<Partial<Options>>
})
}
if (opts.replaceFigureWithMdImg) {
src = src.toString()
src = src.replaceAll(figureTagRegex, (_value, ...capture) => {
const [src] = capture
return `![](${src})`
})
}
if (opts.replaceOrgLatex) {
src = src.toString()
src = src.replaceAll(inlineLatexRegex, (_value, ...capture) => {

View File

@@ -10,12 +10,13 @@ export async function glob(
pattern: string,
cwd: string,
ignorePatterns: string[],
respectGitignore: boolean = true,
): Promise<FilePath[]> {
const fps = (
await globby(pattern, {
cwd,
ignore: ignorePatterns,
gitignore: true,
gitignore: respectGitignore,
})
).map(toPosixPath)
return fps as FilePath[]

217
scripts/export.exs Normal file
View File

@@ -0,0 +1,217 @@
#!/usr/bin/env elixir
# Export org-roam notes (per-file) to content/ via ox-hugo,
# then run the markdown transformation pipeline (citations, etc.).
#
# Usage:
# NOTES_DIR=~/notes elixir scripts/export.exs
# elixir scripts/export.exs /path/to/notes
#
# Optional env vars:
# BIBTEX_FILE — path to a .bib file used as citation fallback
# ZOTERO_URL — Zotero Better BibTeX base URL (default: http://localhost:23119)
# CITATION_MODE — silent | warn (default) | strict
#
# The positional argument takes precedence over the NOTES_DIR env var.
# ---------------------------------------------------------------------------
# Load the pipeline Mix project so its modules are available in this script.
# ---------------------------------------------------------------------------
repo_root = __DIR__ |> Path.join("..") |> Path.expand()
pipeline_dir = Path.join(repo_root, "scripts/pipeline")
# Compile and load the pipeline project's modules into this runtime.
# Mix.install is NOT used here because we have a local Mix project — instead
# we compile it and push its beam files onto the code path.
#
# This runs `mix deps.get` + `mix compile` the first time; subsequent runs
# use the compiled artifacts from _build/ (fast, same as Mix caching).
{_, 0} =
System.cmd("mix", ["deps.get", "--quiet"],
cd: pipeline_dir,
env: [{"MIX_ENV", "prod"}],
into: IO.stream()
)
{_, 0} =
System.cmd("mix", ["compile", "--quiet"],
cd: pipeline_dir,
env: [{"MIX_ENV", "prod"}],
into: IO.stream()
)
# Add compiled beam files to the load path so we can call pipeline modules.
pipeline_build = Path.join(pipeline_dir, "_build/prod/lib")
pipeline_build
|> File.ls!()
|> Enum.each(fn app ->
ebin = Path.join([pipeline_build, app, "ebin"])
if File.dir?(ebin), do: Code.prepend_path(ebin)
end)
# Start the pipeline OTP application (which starts Finch for HTTP).
Application.ensure_all_started(:pipeline)
# ---------------------------------------------------------------------------
# Argument / env resolution
# ---------------------------------------------------------------------------
notes_dir =
case System.argv() do
[dir | _] -> dir
[] ->
System.get_env("NOTES_DIR") ||
(IO.puts(:stderr, "Usage: NOTES_DIR=/path/to/notes elixir scripts/export.exs")
System.halt(1))
end
notes_dir = Path.expand(notes_dir)
content_dir = Path.join(repo_root, "content")
unless File.dir?(notes_dir) do
IO.puts(:stderr, "Error: notes directory does not exist: #{notes_dir}")
System.halt(1)
end
# ---------------------------------------------------------------------------
# Phase 1: Wipe content/
# ---------------------------------------------------------------------------
IO.puts("==> Wiping #{content_dir}")
content_dir
|> File.ls!()
|> Enum.reject(&(&1 == ".gitkeep"))
|> Enum.each(fn entry ->
Path.join(content_dir, entry) |> File.rm_rf!()
end)
# ---------------------------------------------------------------------------
# Phase 2: Export org files via Emacs + ox-hugo
# ---------------------------------------------------------------------------
IO.puts("==> Exporting org files from #{notes_dir}")
org_files =
Path.join(notes_dir, "**/*.org")
|> Path.wildcard()
if org_files == [] do
IO.puts("No .org files found in #{notes_dir}")
System.halt(0)
end
results =
Enum.map(org_files, fn orgfile ->
IO.puts(" exporting: #{orgfile}")
section =
orgfile
|> Path.dirname()
|> Path.relative_to(notes_dir)
{output, exit_code} =
System.cmd(
"emacs",
[
"--batch",
"--eval", "(require 'ox-hugo)",
"--eval", ~s[(setq org-hugo-base-dir "#{repo_root}")],
"--eval", ~s[(setq org-hugo-default-section-directory "#{section}")],
"--visit", orgfile,
"--funcall", "org-hugo-export-to-md"
],
stderr_to_stdout: true
)
filtered =
output
|> String.split("\n")
|> Enum.reject(&String.match?(&1, ~r/^Loading|^ad-handle|^For information/))
|> Enum.join("\n")
if filtered != "", do: IO.puts(filtered)
{orgfile, exit_code}
end)
failures = Enum.filter(results, fn {_, code} -> code != 0 end)
if failures != [] do
IO.puts(:stderr, "\nFailed to export #{length(failures)} file(s):")
Enum.each(failures, fn {f, code} -> IO.puts(:stderr, " [exit #{code}] #{f}") end)
System.halt(1)
end
# ---------------------------------------------------------------------------
# Phase 3: Markdown transformation pipeline
# ---------------------------------------------------------------------------
IO.puts("==> Running markdown pipeline")
pipeline_opts = %{
zotero_url: System.get_env("ZOTERO_URL", "http://localhost:23119"),
bibtex_file: System.get_env("BIBTEX_FILE"),
citation_mode:
case System.get_env("CITATION_MODE", "warn") do
"silent" -> :silent
"strict" -> :strict
_ -> :warn
end
}
transforms = [Pipeline.Transforms.Citations]
case Pipeline.run(content_dir, transforms, pipeline_opts) do
{:ok, stats} ->
Enum.each(stats, fn {mod, count} ->
IO.puts(" #{inspect(mod)}: #{count} file(s) modified")
end)
{:error, reason} ->
IO.puts(:stderr, "Pipeline error: #{inspect(reason)}")
System.halt(1)
end
# ---------------------------------------------------------------------------
# Phase 4: Generate default index.md if none was exported
# ---------------------------------------------------------------------------
md_count =
Path.join(content_dir, "**/*.md")
|> Path.wildcard()
|> length()
index_path = Path.join(content_dir, "index.md")
unless File.exists?(index_path) do
IO.puts("==> Generating default index.md")
pages =
Path.join(content_dir, "**/*.md")
|> Path.wildcard()
|> Enum.map(fn path ->
slug = Path.relative_to(path, content_dir) |> Path.rootname()
title =
path
|> File.read!()
|> then(fn content ->
case Regex.run(~r/^title\s*=\s*"(.+)"/m, content) do
[_, t] -> t
_ -> slug
end
end)
{slug, title}
end)
|> Enum.sort_by(fn {_, title} -> title end)
|> Enum.map(fn {slug, title} -> "- [#{title}](#{slug})" end)
|> Enum.join("\n")
File.write!(index_path, """
---
title: Index
---
#{pages}
""")
end
IO.puts("==> Done. #{md_count} markdown files in #{content_dir}")

View File

@@ -0,0 +1,83 @@
defmodule Pipeline do
@moduledoc """
Post-export markdown transformation pipeline.
Applies a list of transform modules sequentially over every .md file
in a content directory. Each transform module must implement:
apply(content :: String.t(), opts :: map()) :: String.t()
Transforms are applied in the order given. A file is rewritten only
when at least one transform mutates its content (checked via equality).
## Usage
opts = %{
zotero_url: "http://localhost:23119",
bibtex_file: System.get_env("BIBTEX_FILE"),
citation_mode: :warn # :silent | :warn | :strict
}
Pipeline.run(content_dir, [Pipeline.Transforms.Citations], opts)
"""
require Logger
@type transform :: module()
@type opts :: map()
@doc """
Run all transforms over every .md file under `content_dir`.
Returns `{:ok, stats}` where stats maps each transform to a count of files it changed.
"""
@spec run(String.t(), [transform()], opts()) :: {:ok, map()}
def run(content_dir, transforms, opts \\ %{}) do
md_files =
content_dir
|> Path.join("**/*.md")
|> Path.wildcard()
if md_files == [] do
Logger.warning("Pipeline: no .md files found in #{content_dir}")
{:ok, %{}}
else
Logger.info("Pipeline: processing #{length(md_files)} markdown files with #{length(transforms)} transform(s)")
# Initialise transforms (allows them to perform setup such as loading a .bib file).
# Each transform module must implement the Pipeline.Transform behaviour.
initialized =
Enum.map(transforms, fn mod ->
state = mod.init(opts)
{mod, state}
end)
stats =
Enum.reduce(md_files, %{}, fn path, acc ->
original = File.read!(path)
{transformed, file_stats} =
Enum.reduce(initialized, {original, %{}}, fn {mod, state}, {content, fstats} ->
result = mod.apply(content, state, opts)
changed = result != content
{result, Map.update(fstats, mod, (if changed, do: 1, else: 0), &(&1 + (if changed, do: 1, else: 0)))}
end)
if transformed != original do
File.write!(path, transformed)
Logger.debug("Pipeline: updated #{Path.relative_to_cwd(path)}")
end
Map.merge(acc, file_stats, fn _k, a, b -> a + b end)
end)
Enum.each(initialized, fn {mod, state} ->
# teardown/1 is optional in the behaviour
if function_exported?(mod, :teardown, 1) do
mod.teardown(state)
end
end)
{:ok, stats}
end
end
end

View File

@@ -0,0 +1,14 @@
defmodule Pipeline.Application do
@moduledoc false
use Application
@impl true
def start(_type, _args) do
children = [
{Finch, name: Pipeline.Finch}
]
opts = [strategy: :one_for_one, name: Pipeline.Supervisor]
Supervisor.start_link(children, opts)
end
end

View File

@@ -0,0 +1,178 @@
defmodule Pipeline.Resolvers.BibTeX do
@moduledoc """
Resolves citation keys from a local BibTeX (.bib) file.
Configured via the `BIBTEX_FILE` environment variable, or passed directly
as `opts.bibtex_file`. The file is parsed once at init time and the
resulting entry map is reused for all lookups.
Supports extracting: author last names, year, title, DOI, URL.
BibTeX entry format parsed:
@type{citationkey,
author = {Last, First and Last2, First2},
year = {2021},
title = {Some Title},
doi = {10.xxxx/yyyy},
url = {https://example.com},
}
Returns `{:ok, %{label: "Author, Year", url: "..."}}` or `:error`.
"""
require Logger
# ------------------------------------------------------------------
# Public API
# ------------------------------------------------------------------
@doc """
Parse a .bib file and return a map of `%{citation_key => entry_map}`.
Returns `{:ok, entries}` or `{:error, reason}`.
"""
@spec load(String.t()) :: {:ok, map()} | {:error, term()}
def load(path) do
case File.read(path) do
{:ok, content} ->
entries = parse_entries(content)
Logger.info("BibTeX: loaded #{map_size(entries)} entries from #{path}")
{:ok, entries}
{:error, reason} ->
{:error, reason}
end
end
@doc """
Resolve a citation key from pre-loaded BibTeX entries.
"""
@spec resolve(String.t(), map()) :: {:ok, map()} | :error
def resolve(key, entries) do
case Map.fetch(entries, key) do
{:ok, entry} ->
label = build_label(entry)
url = build_url(entry)
{:ok, %{label: label, url: url}}
:error ->
:error
end
end
# ------------------------------------------------------------------
# Parsing
# ------------------------------------------------------------------
# Match @type{key, ...fields...}
# We handle nested braces by scanning character by character after
# finding the opening, rather than relying on a single regex.
@entry_header ~r/@\w+\s*\{\s*([^,\s]+)\s*,/
defp parse_entries(content) do
# Split on "@" boundaries, then parse each chunk
content
|> String.split(~r/(?=@\w+\s*\{)/, trim: true)
|> Enum.reduce(%{}, fn chunk, acc ->
case Regex.run(@entry_header, chunk) do
[_, key] ->
fields = parse_fields(chunk)
Map.put(acc, String.trim(key), fields)
_ ->
acc
end
end)
end
# Extract key = {value} or key = "value" pairs from an entry block.
# Handles simple single-depth braces; good enough for common fields.
@field_regex ~r/(\w+)\s*=\s*(?:\{([^{}]*(?:\{[^{}]*\}[^{}]*)*)\}|"([^"]*)")/
defp parse_fields(chunk) do
@field_regex
|> Regex.scan(chunk)
|> Enum.reduce(%{}, fn match, acc ->
field_name = Enum.at(match, 1) |> String.downcase()
# Value is in capture group 2 (braces) or 3 (quotes)
value =
case {Enum.at(match, 2, ""), Enum.at(match, 3, "")} do
{"", q} -> q
{b, _} -> b
end
Map.put(acc, field_name, String.trim(value))
end)
end
# ------------------------------------------------------------------
# Label & URL building
# ------------------------------------------------------------------
defp build_label(entry) do
author_part =
entry
|> Map.get("author", "")
|> parse_authors()
|> format_authors()
year = Map.get(entry, "year", Map.get(entry, "date", ""))
year = extract_year(year)
if year && author_part != "", do: "#{author_part}, #{year}", else: author_part
end
defp parse_authors(""), do: []
defp parse_authors(author_str) do
author_str
|> String.split(" and ", trim: true)
|> Enum.map(&extract_last_name/1)
|> Enum.reject(&(&1 == ""))
end
# Handles "Last, First" and "First Last" formats
defp extract_last_name(name) do
name = String.trim(name)
cond do
String.contains?(name, ",") ->
name |> String.split(",") |> List.first() |> String.trim()
String.contains?(name, " ") ->
name |> String.split(" ") |> List.last() |> String.trim()
true ->
name
end
end
defp format_authors([]), do: "Unknown"
defp format_authors([single]), do: single
defp format_authors([first | rest]), do: "#{first} & #{List.last(rest)}"
defp extract_year(""), do: nil
defp extract_year(str) do
case Regex.run(~r/\b(\d{4})\b/, str) do
[_, year] -> year
_ -> nil
end
end
defp build_url(entry) do
cond do
doi = Map.get(entry, "doi", "") |> non_empty() ->
"https://doi.org/#{doi}"
url = Map.get(entry, "url", "") |> non_empty() ->
url
true ->
nil
end
end
defp non_empty(""), do: nil
defp non_empty(v), do: v
end

View File

@@ -0,0 +1,18 @@
defmodule Pipeline.Resolvers.DOI do
@moduledoc """
Last-resort citation resolver — always succeeds.
If the citation key looks like a DOI (starts with "10."), returns a
`https://doi.org/...` link. Otherwise returns the key itself as a
plain label with no URL.
"""
@spec resolve(String.t()) :: {:ok, map()}
def resolve(key) do
if String.starts_with?(key, "10.") do
{:ok, %{label: key, url: "https://doi.org/#{key}"}}
else
{:ok, %{label: key, url: nil}}
end
end
end

View File

@@ -0,0 +1,182 @@
defmodule Pipeline.Resolvers.Zotero do
@moduledoc """
Resolves citation keys via Zotero Better BibTeX's JSON-RPC API.
Requires Zotero to be running with the Better BibTeX plugin installed.
Default endpoint: http://localhost:23119/better-bibtex/json-rpc
Resolution strategy:
1. Search by citation key via `item.search`
2. If found, try to get a PDF attachment link (zotero://open-pdf/...)
3. Fall back to zotero://select/items/@key
Returns `{:ok, %{label: "Author, Year", url: "zotero://..."}}` or `:error`.
"""
require Logger
@rpc_path "/better-bibtex/json-rpc"
@doc """
Attempt to resolve `key` against a running Zotero instance.
`base_url` defaults to `http://localhost:23119`.
"""
@spec resolve(String.t(), String.t()) :: {:ok, map()} | :error
def resolve(key, base_url \\ "http://localhost:23119") do
url = base_url <> @rpc_path
payload =
Jason.encode!(%{
jsonrpc: "2.0",
method: "item.search",
params: [
[["citationKey", "is", key]]
],
id: 1
})
case Req.post(url,
body: payload,
headers: [{"content-type", "application/json"}],
receive_timeout: 5_000,
finch: Pipeline.Finch
) do
{:ok, %{status: 200, body: body}} ->
parse_response(body, key, base_url)
{:ok, %{status: status}} ->
Logger.debug("Zotero: unexpected HTTP #{status} for key #{key}")
:error
{:error, reason} ->
Logger.debug("Zotero: connection failed for key #{key}: #{inspect(reason)}")
:error
other ->
Logger.debug("Zotero: unexpected result for key #{key}: #{inspect(other)}")
:error
end
rescue
e ->
Logger.debug("Zotero: exception resolving key #{key}: #{inspect(e)}")
:error
end
# ------------------------------------------------------------------
# Private helpers
# ------------------------------------------------------------------
defp parse_response(%{"result" => [item | _]}, key, base_url) do
label = build_label(item)
url = resolve_url(item, key, base_url)
{:ok, %{label: label, url: url}}
end
defp parse_response(%{"result" => []}, key, _base_url) do
Logger.debug("Zotero: no item found for key #{key}")
:error
end
defp parse_response(%{"error" => err}, key, _base_url) do
Logger.debug("Zotero: RPC error for key #{key}: #{inspect(err)}")
:error
end
defp parse_response(body, key, _base_url) do
Logger.debug("Zotero: unexpected response shape for key #{key}: #{inspect(body)}")
:error
end
defp fetch_pdf_url(key, base_url) do
payload =
Jason.encode!(%{
jsonrpc: "2.0",
method: "item.attachments",
params: [key],
id: 2
})
case Req.post(base_url <> @rpc_path,
body: payload,
headers: [{"content-type", "application/json"}],
receive_timeout: 5_000,
finch: Pipeline.Finch
) do
{:ok, %{status: 200, body: %{"result" => attachments}}} when is_list(attachments) ->
attachments
|> Enum.find_value(fn att ->
open = Map.get(att, "open", "")
path = Map.get(att, "path", "")
if String.ends_with?(path, ".pdf"), do: open, else: nil
end)
_ ->
nil
end
rescue
_ -> nil
end
# CSL-JSON format: authors are under "author" with "family"/"given" keys.
# Year is under "issued" -> "date-parts" -> [[year, month, day]].
defp build_label(item) do
authors = Map.get(item, "author", [])
year = extract_year(item)
author_part =
case authors do
[] ->
"Unknown"
[single] ->
Map.get(single, "family", Map.get(single, "literal", "Unknown"))
[first | rest] ->
first_name = Map.get(first, "family", Map.get(first, "literal", "Unknown"))
last_name =
rest
|> List.last()
|> then(&Map.get(&1, "family", Map.get(&1, "literal", "Unknown")))
"#{first_name} & #{last_name}"
end
if year, do: "#{author_part}, #{year}", else: author_part
end
# "issued": {"date-parts": [["2021", 2, 3]]}
defp extract_year(item) do
case get_in(item, ["issued", "date-parts"]) do
[[year | _] | _] -> to_string(year)
_ -> nil
end
end
defp resolve_url(item, key, base_url) do
# Prefer zotero://open-pdf/... for items with a PDF attachment.
# Fall back to zotero://select/library/items/KEY to open the item in Zotero.
# The "id" field is a URI like "http://zotero.org/users/123/items/ABCD1234".
pdf_url = fetch_pdf_url(key, base_url)
if pdf_url do
pdf_url
else
item_key =
item
|> Map.get("id", "")
|> String.split("/")
|> List.last()
|> non_empty()
if item_key do
"zotero://select/library/items/#{item_key}"
else
"zotero://select/items/@#{key}"
end
end
end
defp non_empty(nil), do: nil
defp non_empty(""), do: nil
defp non_empty(v), do: v
end

View File

@@ -0,0 +1,48 @@
defmodule Pipeline.Transform do
@moduledoc """
Behaviour that all markdown transform modules must implement.
## Callbacks
- `init/1` — called once before processing; returns transform-specific state.
Default implementation returns the opts map unchanged.
- `apply/3` — called per .md file; returns the (possibly modified) content.
- `teardown/1` — optional cleanup after all files are processed.
## Example
defmodule MyTransform do
@behaviour Pipeline.Transform
@impl true
def init(opts), do: %{some_state: opts[:value]}
@impl true
def apply(content, state, _opts) do
String.replace(content, "foo", state.some_state)
end
end
"""
@doc "One-time initialisation. Returns opaque state passed to apply/3."
@callback init(opts :: map()) :: term()
@doc "Transform file content. Returns the (possibly modified) content string."
@callback apply(content :: String.t(), state :: term(), opts :: map()) :: String.t()
@doc "Optional cleanup after all files are processed."
@callback teardown(state :: term()) :: :ok
@optional_callbacks teardown: 1
defmacro __using__(_) do
quote do
@behaviour Pipeline.Transform
@impl Pipeline.Transform
def init(opts), do: opts
defoverridable init: 1
end
end
end

View File

@@ -0,0 +1,231 @@
defmodule Pipeline.Transforms.Citations do
@moduledoc """
Markdown transform: resolves org-citar citation keys to hyperlinks.
## Recognised citation syntax (as output by ox-hugo from org-citar)
[cite:@key] → org-cite / citar standard (most common)
[cite:@key1;@key2] → multiple citations
cite:key → older roam-style bare cite syntax
## Resolution chain (in order)
1. Zotero (live instance via Better BibTeX JSON-RPC) — preferred
2. BibTeX file (BIBTEX_FILE env var) — fallback
3. DOI / bare key — always succeeds
## Modes (opts.citation_mode)
:silent — silently use DOI/bare-key fallback when Zotero+BibTeX fail
:warn — (default) emit a Logger.warning for unresolved keys
:strict — raise on unresolved keys (aborts pipeline)
## Format
Resolved citations are rendered as:
[Label](url) when a URL is available
[Label] when no URL could be determined (bare key fallback)
Multiple semicolon-separated keys become space-separated links:
[cite:@a;@b] → [Author A, 2020](url_a) [Author B, 2019](url_b)
## init/1 callback
Loads the BibTeX file (if configured) once before processing begins,
and probes Zotero availability, emitting warnings as appropriate.
"""
@behaviour Pipeline.Transform
require Logger
alias Pipeline.Resolvers.Zotero
alias Pipeline.Resolvers.BibTeX
alias Pipeline.Resolvers.DOI
# Match [cite:@key] and [cite:@key1;@key2;...] (org-cite / citar style)
@cite_bracket_regex ~r/\[cite:(@[^\]]+)\]/
# Match bare cite:key (older roam style, no brackets, no @ prefix)
@cite_bare_regex ~r/(?<![(\[])cite:([a-zA-Z0-9_:-]+)/
# ------------------------------------------------------------------
# Pipeline callbacks
# ------------------------------------------------------------------
@doc """
Called once before processing any files. Loads BibTeX, probes Zotero.
Returns a state map passed to every `apply/3` call.
"""
def init(opts) do
bibtex_entries = load_bibtex(opts)
zotero_available = probe_zotero(opts)
if not zotero_available and bibtex_entries == %{} do
Logger.warning(
"Citations: neither Zotero nor a BibTeX file is available. " <>
"All citations will fall back to bare-key rendering. " <>
"Set BIBTEX_FILE env var or start Zotero with Better BibTeX to resolve citations."
)
end
%{
bibtex_entries: bibtex_entries,
zotero_available: zotero_available,
zotero_url: Map.get(opts, :zotero_url, "http://localhost:23119"),
citation_mode: Map.get(opts, :citation_mode, :warn)
}
end
@doc """
Apply citation resolution to a single markdown file's content.
"""
def apply(content, state, _opts) do
content
|> resolve_bracket_citations(state)
|> resolve_bare_citations(state)
end
# ------------------------------------------------------------------
# Resolution passes
# ------------------------------------------------------------------
defp resolve_bracket_citations(content, state) do
Regex.replace(@cite_bracket_regex, content, fn _full, keys_str ->
keys_str
|> String.split(";")
|> Enum.map(&String.trim/1)
|> Enum.map(fn "@" <> key -> key end)
|> Enum.map(&resolve_key(&1, state))
|> Enum.join(" ")
end)
end
defp resolve_bare_citations(content, state) do
Regex.replace(@cite_bare_regex, content, fn _full, key ->
resolve_key(key, state)
end)
end
# ------------------------------------------------------------------
# Single-key resolution chain
# ------------------------------------------------------------------
defp resolve_key(key, state) do
info =
with :error <- try_zotero(key, state),
:error <- try_bibtex(key, state) do
handle_unresolved(key, state)
else
{:ok, citation_info} -> citation_info
end
format_result(info)
end
defp try_zotero(_key, %{zotero_available: false}), do: :error
defp try_zotero(key, %{zotero_url: url}) do
Zotero.resolve(key, url)
end
defp try_bibtex(_key, %{bibtex_entries: entries}) when map_size(entries) == 0, do: :error
defp try_bibtex(key, %{bibtex_entries: entries}) do
BibTeX.resolve(key, entries)
end
defp handle_unresolved(key, %{citation_mode: mode}) do
case mode do
:strict ->
raise "Citations: could not resolve citation key '#{key}' and mode is :strict"
:warn ->
Logger.warning("Citations: unresolved citation key '#{key}' — using bare-key fallback")
{:ok, result} = DOI.resolve(key)
result
:silent ->
{:ok, result} = DOI.resolve(key)
result
end
end
defp format_result(%{label: label, url: nil}), do: "[#{label}]"
defp format_result(%{label: label, url: url}), do: "[#{label}](#{url})"
# ------------------------------------------------------------------
# Init helpers
# ------------------------------------------------------------------
defp load_bibtex(opts) do
path = Map.get(opts, :bibtex_file) || System.get_env("BIBTEX_FILE")
cond do
is_nil(path) ->
Logger.debug("Citations: BIBTEX_FILE not set — BibTeX resolver disabled")
%{}
not File.exists?(path) ->
Logger.warning("Citations: BIBTEX_FILE=#{path} does not exist — BibTeX resolver disabled")
%{}
true ->
case BibTeX.load(path) do
{:ok, entries} -> entries
{:error, reason} ->
Logger.warning("Citations: failed to load BibTeX file #{path}: #{inspect(reason)}")
%{}
end
end
end
defp probe_zotero(opts) do
url = Map.get(opts, :zotero_url, "http://localhost:23119")
# Use a no-op JSON-RPC call to probe availability.
# /better-bibtex/cayw is intentionally avoided — it blocks waiting for
# user interaction and never returns without a pick.
payload =
Jason.encode!(%{
jsonrpc: "2.0",
method: "item.search",
params: [[[]]],
id: 0
})
result =
try do
Req.post(url <> "/better-bibtex/json-rpc",
body: payload,
headers: [{"content-type", "application/json"}],
receive_timeout: 3_000,
finch: Pipeline.Finch
)
rescue
e -> {:error, e}
end
case result do
{:ok, %{status: 200}} ->
Logger.info("Citations: Zotero Better BibTeX is available at #{url}")
true
{:ok, %{status: status}} ->
Logger.warning(
"Citations: Zotero responded HTTP #{status} at #{url}" <>
"is Better BibTeX installed?"
)
false
_ ->
Logger.warning(
"Citations: Zotero not reachable at #{url}" <>
"start Zotero with Better BibTeX or set BIBTEX_FILE as fallback"
)
false
end
end
end

27
scripts/pipeline/mix.exs Normal file
View File

@@ -0,0 +1,27 @@
defmodule Pipeline.MixProject do
use Mix.Project
def project do
[
app: :pipeline,
version: "0.1.0",
elixir: "~> 1.15",
start_permanent: Mix.env() == :prod,
deps: deps()
]
end
def application do
[
extra_applications: [:logger, :inets, :ssl],
mod: {Pipeline.Application, []}
]
end
defp deps do
[
{:req, "~> 0.5"},
{:jason, "~> 1.4"}
]
end
end

11
scripts/pipeline/mix.lock Normal file
View File

@@ -0,0 +1,11 @@
%{
"finch": {:hex, :finch, "0.21.0", "b1c3b2d48af02d0c66d2a9ebfb5622be5c5ecd62937cf79a88a7f98d48a8290c", [:mix], [{:mime, "~> 1.0 or ~> 2.0", [hex: :mime, repo: "hexpm", optional: false]}, {:mint, "~> 1.6.2 or ~> 1.7", [hex: :mint, repo: "hexpm", optional: false]}, {:nimble_options, "~> 0.4 or ~> 1.0", [hex: :nimble_options, repo: "hexpm", optional: false]}, {:nimble_pool, "~> 1.1", [hex: :nimble_pool, repo: "hexpm", optional: false]}, {:telemetry, "~> 0.4 or ~> 1.0", [hex: :telemetry, repo: "hexpm", optional: false]}], "hexpm", "87dc6e169794cb2570f75841a19da99cfde834249568f2a5b121b809588a4377"},
"hpax": {:hex, :hpax, "1.0.3", "ed67ef51ad4df91e75cc6a1494f851850c0bd98ebc0be6e81b026e765ee535aa", [:mix], [], "hexpm", "8eab6e1cfa8d5918c2ce4ba43588e894af35dbd8e91e6e55c817bca5847df34a"},
"jason": {:hex, :jason, "1.4.4", "b9226785a9aa77b6857ca22832cffa5d5011a667207eb2a0ad56adb5db443b8a", [:mix], [{:decimal, "~> 1.0 or ~> 2.0", [hex: :decimal, repo: "hexpm", optional: true]}], "hexpm", "c5eb0cab91f094599f94d55bc63409236a8ec69a21a67814529e8d5f6cc90b3b"},
"mime": {:hex, :mime, "2.0.7", "b8d739037be7cd402aee1ba0306edfdef982687ee7e9859bee6198c1e7e2f128", [:mix], [], "hexpm", "6171188e399ee16023ffc5b76ce445eb6d9672e2e241d2df6050f3c771e80ccd"},
"mint": {:hex, :mint, "1.7.1", "113fdb2b2f3b59e47c7955971854641c61f378549d73e829e1768de90fc1abf1", [:mix], [{:castore, "~> 0.1.0 or ~> 1.0", [hex: :castore, repo: "hexpm", optional: true]}, {:hpax, "~> 0.1.1 or ~> 0.2.0 or ~> 1.0", [hex: :hpax, repo: "hexpm", optional: false]}], "hexpm", "fceba0a4d0f24301ddee3024ae116df1c3f4bb7a563a731f45fdfeb9d39a231b"},
"nimble_options": {:hex, :nimble_options, "1.1.1", "e3a492d54d85fc3fd7c5baf411d9d2852922f66e69476317787a7b2bb000a61b", [:mix], [], "hexpm", "821b2470ca9442c4b6984882fe9bb0389371b8ddec4d45a9504f00a66f650b44"},
"nimble_pool": {:hex, :nimble_pool, "1.1.0", "bf9c29fbdcba3564a8b800d1eeb5a3c58f36e1e11d7b7fb2e084a643f645f06b", [:mix], [], "hexpm", "af2e4e6b34197db81f7aad230c1118eac993acc0dae6bc83bac0126d4ae0813a"},
"req": {:hex, :req, "0.5.17", "0096ddd5b0ed6f576a03dde4b158a0c727215b15d2795e59e0916c6971066ede", [:mix], [{:brotli, "~> 0.3.1", [hex: :brotli, repo: "hexpm", optional: true]}, {:ezstd, "~> 1.0", [hex: :ezstd, repo: "hexpm", optional: true]}, {:finch, "~> 0.17", [hex: :finch, repo: "hexpm", optional: false]}, {:jason, "~> 1.0", [hex: :jason, repo: "hexpm", optional: false]}, {:mime, "~> 2.0.6 or ~> 2.1", [hex: :mime, repo: "hexpm", optional: false]}, {:nimble_csv, "~> 1.0", [hex: :nimble_csv, repo: "hexpm", optional: true]}, {:plug, "~> 1.0", [hex: :plug, repo: "hexpm", optional: true]}], "hexpm", "0b8bc6ffdfebbc07968e59d3ff96d52f2202d0536f10fef4dc11dc02a2a43e39"},
"telemetry": {:hex, :telemetry, "1.3.0", "fedebbae410d715cf8e7062c96a1ef32ec22e764197f70cda73d82778d61e7a2", [:rebar3], [], "hexpm", "7015fc8919dbe63764f4b4b87a95b7c0996bd539e0d499be6ec9d7f3875b79e6"},
}