Files
quartz-org-roam/scripts/pipeline/lib/pipeline.ex
Ignacio Ballesteros 511b003da8 Add Elixir markdown pipeline with org-citar citation resolution
Introduces scripts/pipeline/, a Mix project that runs as a post-export
transformation pass over content/*.md before Quartz builds the site.

Pipeline (scripts/export.exs phase 3):
- Compiles and loads the Mix project at export time (cached after first run)
- Applies a list of Transform modules sequentially over all .md files
- Only rewrites files that were actually changed

Citations transform (Pipeline.Transforms.Citations):
- Resolves [cite:@key] and bare cite:key syntax produced by ox-hugo/citar
- Resolution chain: Zotero BBT JSON-RPC → BibTeX file → DOI/bare-key fallback
- Zotero probe uses a no-op JSON-RPC call (cayw endpoint blocks indefinitely)
- Zotero resolver fetches PDF attachments via item.attachments, producing
  zotero://open-pdf/... links; falls back to zotero://select/library/items/...
- BibTeX resolver parses .bib files with a simple regex parser (no deps)
- DOI resolver is the always-succeeding last resort

Configuration via env vars:
  BIBTEX_FILE   — path to .bib file for fallback resolution
  ZOTERO_URL    — Zotero base URL (default: http://localhost:23119)
  CITATION_MODE — silent | warn (default) | strict

Adding future transforms requires only implementing Pipeline.Transform
behaviour and appending the module to the transforms list in export.exs.
2026-02-20 10:00:11 +01:00

84 lines
2.6 KiB
Elixir

defmodule Pipeline do
@moduledoc """
Post-export markdown transformation pipeline.
Applies a list of transform modules sequentially over every .md file
in a content directory. Each transform module must implement:
apply(content :: String.t(), opts :: map()) :: String.t()
Transforms are applied in the order given. A file is rewritten only
when at least one transform mutates its content (checked via equality).
## Usage
opts = %{
zotero_url: "http://localhost:23119",
bibtex_file: System.get_env("BIBTEX_FILE"),
citation_mode: :warn # :silent | :warn | :strict
}
Pipeline.run(content_dir, [Pipeline.Transforms.Citations], opts)
"""
require Logger
@type transform :: module()
@type opts :: map()
@doc """
Run all transforms over every .md file under `content_dir`.
Returns `{:ok, stats}` where stats maps each transform to a count of files it changed.
"""
@spec run(String.t(), [transform()], opts()) :: {:ok, map()}
def run(content_dir, transforms, opts \\ %{}) do
md_files =
content_dir
|> Path.join("**/*.md")
|> Path.wildcard()
if md_files == [] do
Logger.warning("Pipeline: no .md files found in #{content_dir}")
{:ok, %{}}
else
Logger.info("Pipeline: processing #{length(md_files)} markdown files with #{length(transforms)} transform(s)")
# Initialise transforms (allows them to perform setup such as loading a .bib file).
# Each transform module must implement the Pipeline.Transform behaviour.
initialized =
Enum.map(transforms, fn mod ->
state = mod.init(opts)
{mod, state}
end)
stats =
Enum.reduce(md_files, %{}, fn path, acc ->
original = File.read!(path)
{transformed, file_stats} =
Enum.reduce(initialized, {original, %{}}, fn {mod, state}, {content, fstats} ->
result = mod.apply(content, state, opts)
changed = result != content
{result, Map.update(fstats, mod, (if changed, do: 1, else: 0), &(&1 + (if changed, do: 1, else: 0)))}
end)
if transformed != original do
File.write!(path, transformed)
Logger.debug("Pipeline: updated #{Path.relative_to_cwd(path)}")
end
Map.merge(acc, file_stats, fn _k, a, b -> a + b end)
end)
Enum.each(initialized, fn {mod, state} ->
# teardown/1 is optional in the behaviour
if function_exported?(mod, :teardown, 1) do
mod.teardown(state)
end
end)
{:ok, stats}
end
end
end