Parsers & writers

The parse layer turns text into TreeBuffers (plus an AnnotationTable); the write layer serializes back out. Number formatting follows Java’s Double.toString / Integer.toString semantics, so numeric values written by FigTree and BEAST round-trip byte-for-byte.

import { detectFormat } from "insomni-phylo";
detectFormat("#NEXUS\n..."); // "nexus"
detectFormat("((A,B),C);"); // "newick"

detectFormat is a cheap leading-token sniff: a #NEXUS header → "nexus", otherwise "newick". TreeFormat is "newick" | "nexus".
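
To make the "leading-token sniff" concrete, here is a minimal sketch of the idea. It is not the library's implementation: sniffFormat and SketchFormat are hypothetical names, and skipping leading whitespace, tolerating a byte-order mark, and matching the header case-insensitively are all assumptions.

```typescript
// Hypothetical stand-in for the library's TreeFormat type.
type SketchFormat = "newick" | "nexus";

// Look only at the first non-blank token; never scan the whole file.
function sniffFormat(source: string): SketchFormat {
  let i = 0;
  // Assumption: leading whitespace and a byte-order mark are skipped.
  while (
    i < source.length &&
    (source.charCodeAt(i) <= 32 || source.charCodeAt(i) === 0xfeff)
  ) {
    i++;
  }
  // Assumption: the #NEXUS header is matched case-insensitively.
  const head = source.slice(i, i + 6).toUpperCase();
  return head === "#NEXUS" ? "nexus" : "newick";
}
```

Because only a handful of characters are inspected, the cost stays constant no matter how large the input is.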

parseNewick is a single-pass scanner that pre-sizes its buffers from the source length, so even 500k-leaf trees never reallocate mid-parse.

import { AnnotationTable, NewickParseError, parseNewick } from "insomni-phylo";
const annotations = new AnnotationTable();
const diagnostics = [];
try {
  const tree = parseNewick("((A:0.1,B:0.2):0.3,C:0.4);", {
    annotations, // [&k=v] and [&&NHX:...] blocks are extracted into this table
    diagnostics, // recoverable parse warnings collected here
  });
} catch (err) {
  if (err instanceof NewickParseError) console.error(err.message, err.pos);
}

ParseNewickOptions has two optional fields: annotations (an AnnotationTable to populate) and diagnostics (a ParseDiagnostic[] sink). NewickParseError carries a pos source offset.
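
The single-pass, pre-sized scan mentioned above for parseNewick can be pictured with a small topology-only sketch. This is an illustration under stated assumptions, not the library's scanner: scanTopology and FlatTree are hypothetical names, labels and branch lengths are skipped, and quoted labels and bracketed comments are not handled. The point is that the node count is bounded by the source length, so one allocation up front is enough.

```typescript
// Hypothetical flat layout: parent[i] is the parent index of node i (-1 for the root).
interface FlatTree {
  parent: Int32Array;
  nodeCount: number;
}

function scanTopology(source: string): FlatTree {
  // Each "(" or "," creates exactly one node, plus the root,
  // so source.length + 1 is a safe upper bound on the node count.
  const parent = new Int32Array(source.length + 1).fill(-1);
  let nodeCount = 1; // node 0 is the root
  let current = 0;
  for (let i = 0; i < source.length; i++) {
    const ch = source.charCodeAt(i);
    if (ch === 40) {
      // "(" opens the first child of the current node
      parent[nodeCount] = current;
      current = nodeCount++;
    } else if (ch === 44) {
      // "," starts a sibling under the same parent
      parent[nodeCount] = parent[current];
      current = nodeCount++;
    } else if (ch === 41) {
      // ")" returns to the parent
      current = parent[current];
    } else if (ch === 59) {
      break; // ";" ends the tree
    }
  }
  return { parent, nodeCount };
}
```

For "((A,B),C);" this sketch yields five nodes with parents [-1, 0, 1, 1, 0], and the buffer is never resized during the scan.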

Annotation bodies are parsed by parseAnnotationBody(body, table, nodeIdx), which is exported for direct use and handles both conventions:

  • FigTree / BEAST: &key=value,key=value,… (comma-separated)
  • NHX: &&NHX:key=value:key=value:… (colon-separated)

Both support quoted strings, barewords, numbers, booleans, and {a,b,c} nested lists. Plain bracketed comments (no leading &) are ignored.
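
The two conventions differ mainly in their prefix and separator, which the following sketch tries to show. It is a simplified illustration, not parseAnnotationBody itself: parseAnnotationSketch, splitTopLevel, parseValue, and AnnotationValue are hypothetical names, the result is a plain Map rather than an AnnotationTable, and the body is assumed to be the text inside the brackets, including the leading &.

```typescript
type AnnotationValue = string | number | boolean | AnnotationValue[];

// Split on `sep`, ignoring separators inside quotes or {...} lists.
function splitTopLevel(text: string, sep: string): string[] {
  const parts: string[] = [];
  let depth = 0;
  let quote = "";
  let start = 0;
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (quote !== "") {
      if (ch === quote) quote = "";
    } else if (ch === '"' || ch === "'") {
      quote = ch;
    } else if (ch === "{") {
      depth++;
    } else if (ch === "}") {
      depth--;
    } else if (ch === sep && depth === 0) {
      parts.push(text.slice(start, i));
      start = i + 1;
    }
  }
  parts.push(text.slice(start));
  return parts;
}

// Interpret one value: nested list, quoted string, boolean, number, or bareword.
function parseValue(raw: string): AnnotationValue {
  const text = raw.trim();
  const first = text.charAt(0);
  const last = text.charAt(text.length - 1);
  if (first === "{" && last === "}") {
    const inner = text.slice(1, -1);
    return inner.trim() === "" ? [] : splitTopLevel(inner, ",").map(parseValue);
  }
  if (text.length >= 2 && (first === '"' || first === "'") && last === first) {
    return text.slice(1, -1);
  }
  if (text === "true") return true;
  if (text === "false") return false;
  if (text !== "" && !Number.isNaN(Number(text))) return Number(text);
  return text; // bareword
}

// Returns null for plain comments (no leading "&").
function parseAnnotationSketch(body: string): Map<string, AnnotationValue> | null {
  let sep = ",";
  let rest = "";
  if (body.indexOf("&&NHX:") === 0) {
    sep = ":"; // NHX pairs are colon-separated
    rest = body.slice(6);
  } else if (body.charAt(0) === "&") {
    rest = body.slice(1); // FigTree / BEAST pairs are comma-separated
  } else {
    return null;
  }
  const out = new Map<string, AnnotationValue>();
  for (const pair of splitTopLevel(rest, sep)) {
    const eq = pair.indexOf("=");
    if (eq < 0) continue; // skip malformed pairs in this sketch
    out.set(pair.slice(0, eq).trim(), parseValue(pair.slice(eq + 1)));
  }
  return out;
}
```

The splitter is the part that matters: a naive split(",") would break a quoted value such as "x,y" or a list such as {1,2} into pieces.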

parseNexus returns a NexusDocument — it parses every TREES block, the TAXA labels, the FIGTREE settings block, and the TRANSLATE map, and preserves the verbatim text of blocks it does not model (DATA, CHARACTERS, custom BEAST blocks) so round-trips never silently drop payloads.

import { NexusParseError, parseNexus } from "insomni-phylo";
const doc = parseNexus(nexusText);
for (const t of doc.trees) {
  console.log(t.name, t.isDefault, t.rooted, t.tree.tipCount);
}
doc.figtreeSettings; // Map<string, string> of `set key=value;` entries, raw
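
The idea of keeping unmodeled blocks verbatim can be sketched as a block scan that records every block name and stores the raw text of anything it does not recognize. This is an assumption-laden illustration, not how parseNexus works internally: scanBlocks, BlockScan, and the MODELED list are hypothetical, and the regular expression ignores NEXUS comments and quoted tokens that a real parser would have to respect.

```typescript
interface BlockScan {
  seenBlocks: string[]; // every block name, in source order
  preservedBlocks: string[]; // verbatim text of blocks outside MODELED
}

// Assumption: these are the blocks the semantic model understands.
const MODELED = ["TAXA", "TREES", "FIGTREE"];

function scanBlocks(source: string): BlockScan {
  // BEGIN <name>; ... END;  (case-insensitive, shortest match per block)
  const re = /\bbegin\s+([^\s;]+)\s*;[\s\S]*?\bend(?:block)?\s*;/gi;
  const scan: BlockScan = { seenBlocks: [], preservedBlocks: [] };
  let m: RegExpExecArray | null;
  while ((m = re.exec(source)) !== null) {
    const name = m[1].toUpperCase();
    scan.seenBlocks.push(name);
    // Keep the exact source slice so a writer could re-emit it untouched.
    if (MODELED.indexOf(name) < 0) scan.preservedBlocks.push(m[0]);
  }
  return scan;
}
```

Storing the matched slice itself, rather than a parsed form, is what lets a DATA or CHARACTERS block survive a round-trip without the writer understanding it.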

NexusDocument fields:

| Field | Meaning |
| --- | --- |
| taxa: string[] | TAXA labels, in declared order. |
| trees: NexusTree[] | Each carries name, rooted ([&R]/[&U]/null), isDefault, tree, annotations. |
| figtreeSettings: Map<string,string> | FIGTREE-block set entries, verbatim. |
| translate: Map<string,string> | TRANSLATE map (numeric id → label). |
| preservedBlocks?: string[] | Verbatim source of unmodeled blocks. |
| seenBlocks: string[] | Top-level block names in order (derailment detection). |
| warnings: string[] / diagnostics: ParseDiagnostic[] | Recoverable issues. |

import { writeNewick, writeNexus } from "insomni-phylo";
const newick = writeNewick(tree, { annotations });
const nexus = writeNexus(doc, { useTranslate: true });

writeNewick(tree, options?) walks iteratively (stack-safe on huge trees), single-quotes names with special characters, and never emits a root branch length. WriteNewickOptions:

  • annotations: emit [&k=v,…] after each node’s label for columns that have a value on that node.
  • nameMap: rewrite stored names to emitted names (the NEXUS writer uses this to swap tip labels for numeric TRANSLATE keys).
  • wrapWidth: insert a newline after a comma past this many chars (helps legacy apps that freeze on one giant line).
  • precision: deprecated and ignored; lengths use Java Double.toString semantics for byte-stable round-trips.
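
Three of the writeNewick behaviors described above (an iterative walk, single-quoting of special names, and no root branch length) can be sketched over a plain object tree. This is an illustration only: SketchNode, quoteLabel, and writeNewickSketch are hypothetical, the set of characters treated as special is an assumption, and lengths are formatted with JavaScript's String rather than Java semantics.

```typescript
interface SketchNode {
  name?: string;
  length?: number;
  children?: SketchNode[];
}

// Assumption: whitespace, brackets, parentheses, quotes, ":", ";" and "," force quoting.
function quoteLabel(name: string): string {
  return /[\s()\[\]':;,]/.test(name)
    ? "'" + name.replace(/'/g, "''") + "'"
    : name;
}

function writeNewickSketch(root: SketchNode): string {
  const out: string[] = [];
  // Explicit stack instead of recursion, so tree depth cannot overflow the call stack.
  const stack = [{ node: root, next: 0, isRoot: true }];
  while (stack.length > 0) {
    const frame = stack[stack.length - 1];
    const kids = frame.node.children ?? [];
    if (frame.next < kids.length) {
      out.push(frame.next === 0 ? "(" : ",");
      stack.push({ node: kids[frame.next], next: 0, isRoot: false });
      frame.next++;
      continue;
    }
    // All children written: close the group, then emit label and length.
    if (kids.length > 0) out.push(")");
    if (frame.node.name !== undefined) out.push(quoteLabel(frame.node.name));
    if (!frame.isRoot && frame.node.length !== undefined) {
      out.push(":" + String(frame.node.length)); // the root length is never emitted
    }
    stack.pop();
  }
  out.push(";");
  return out.join("");
}
```

Each stack frame remembers which child to visit next, which is the usual way to turn a recursive post-order walk into a loop.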

writeNexus(doc, options?) emits only the blocks whose backing maps are non-empty, and re-emits preserved blocks. WriteNexusOptions has useTranslate (default true) and wrapWidth.
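
The interplay between useTranslate and nameMap can be sketched as building two maps from the tip labels: one for the TRANSLATE block and its inverse for the tree writer. The names here (buildTranslateSketch, renderTranslateSketch, TranslateSketch) are hypothetical, and 1-based keys, the indentation, and the lack of label quoting are assumptions rather than the library's documented output.

```typescript
interface TranslateSketch {
  translate: Map<string, string>; // numeric key -> label (for the TRANSLATE block)
  nameMap: Map<string, string>; // label -> numeric key (for the tree writer)
}

function buildTranslateSketch(tipLabels: string[]): TranslateSketch {
  const translate = new Map<string, string>();
  const nameMap = new Map<string, string>();
  tipLabels.forEach((label, i) => {
    const key = String(i + 1); // assumption: keys start at 1
    translate.set(key, label);
    nameMap.set(label, key);
  });
  return { translate, nameMap };
}

// Render the map as a TRANSLATE command; labels are emitted as-is in this sketch.
function renderTranslateSketch(translate: Map<string, string>): string {
  const rows: string[] = [];
  translate.forEach((label, key) => rows.push("    " + key + " " + label));
  return "  Translate\n" + rows.join(",\n") + "\n  ;";
}
```

A tree writer given nameMap would then emit 1 and 2 in place of the full tip labels, which keeps large multi-tree files compact.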

Three formatting primitives back the round-trip and are exported for direct use:

| Function | Purpose |
| --- | --- |
| formatJavaDouble(n) | Double.toString-compatible (5.0, 1.0E-7, NaN). |
| formatJavaInteger(n) | Integer.toString, 32-bit range checked. |
| encodeFigtreeColor(r, g, b, a?) | #rrggbb / #aarrggbb color token. |
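
To show what Double.toString-compatible output involves, here is a rough sketch of the rules: plain decimal for magnitudes from 10^-3 up to (but not including) 10^7, E-notation otherwise, and always at least one digit after the decimal point. javaDoubleSketch and javaIntegerSketch are hypothetical names, not the exported functions. The digits come from JavaScript's shortest round-trip representation, which may differ from a particular JDK in edge cases; that is not verified here. Throwing on out-of-range integers is also an assumption.

```typescript
function javaDoubleSketch(n: number): string {
  if (Number.isNaN(n)) return "NaN";
  if (n === Infinity) return "Infinity";
  if (n === -Infinity) return "-Infinity";
  if (n === 0) return Object.is(n, -0) ? "-0.0" : "0.0";

  const sign = n < 0 ? "-" : "";
  const abs = Math.abs(n);
  // With no argument, toExponential() uses as many digits as needed, e.g. "1.23456e+2".
  const sci = abs.toExponential();
  const eIdx = sci.indexOf("e");
  const digits = sci.slice(0, eIdx).replace(".", "");
  const exp = Number(sci.slice(eIdx + 1));

  if (abs >= 1e-3 && abs < 1e7) {
    if (exp >= 0) {
      // Integer part is the first exp + 1 digits, zero-padded if the digits run out.
      const intPart =
        digits.slice(0, exp + 1) + "0".repeat(Math.max(0, exp + 1 - digits.length));
      const frac = digits.slice(exp + 1) || "0";
      return sign + intPart + "." + frac;
    }
    // Values below 1: leading zeros after the decimal point.
    return sign + "0." + "0".repeat(-exp - 1) + digits;
  }
  // E-notation: one digit, a point, remaining digits (or "0"), then the exponent.
  return sign + digits.charAt(0) + "." + (digits.slice(1) || "0") + "E" + exp;
}

function javaIntegerSketch(n: number): string {
  // Assumption: values outside the signed 32-bit range are rejected with an error.
  if (!Number.isInteger(n) || n < -2147483648 || n > 2147483647) {
    throw new RangeError("not a 32-bit integer: " + n);
  }
  return String(n);
}
```

The visible difference from JavaScript's own formatting is that 5 becomes 5.0 and 1e-7 becomes 1.0E-7, which is what keeps rewritten branch lengths textually identical to what Java tools emit.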

parseNewick / parseNexus produce a semantic model: great for layout and rendering, but they normalize whitespace, comment placement, and number formatting. When the goal is to edit a file in place and write back exactly what was there minus your edit, use @phylon/parser instead — a concrete syntax tree that retains every token, comment, and byte of formatting for surgical, lossless round-trips.

Rule of thumb: parse with insomni-phylo to understand a tree, parse with @phylon/parser to rewrite one without disturbing the rest of the file.