Changelog
v2.3.0
A focused release that brings alt-chord training to drill and book
modes, adds an interactive chordgen add command for one-off word
additions, and ships a long list of alt-generation cleanups so the
default chords.csv is noticeably less noisy.
Highlights
- Alt chords show up everywhere. Drill and book modes now surface `alt1`/`alt2`/`alt3` inflections (e.g. `sets`, `setting`, `settings` for `set`). They inherit their base row's FSRS mastery, get the slot-suffixed chord display (`au` → `au1`), and light up the `alt1 alt2 alt3` indicator below the keyboard. Disable via `drill.include_alts: false`.
- `chordgen add WORDS...`: interactive one-off additions to `chords.csv` with a live red/green chord prompt, inline rejection reasons, auto-detected category, and reserved-row pinning so future `chordgen gen` runs leave the chosen chord alone.
- `always_show_chords` for drill and book. Reveal chords beneath every word from the start, not just on a stumble. Off by default in both modes.
- Cleaner alt generation by default. Six families of bogus alts (contraction-stem residue, subtitle artefacts, mis-tagged interjections, non-gradable adjectives, double-pluralised rows, irregular-verb conjugations) no longer leak into `chords.csv` from `chordgen setup` / `chordgen gen`.
- README + docs revamp. README is now targeted at first-time visitors with a one-paragraph value pitch, highlights, and quickstart. Development instructions moved to a new Development page. Docs site published to GitHub Pages.
Detailed changes
- New `chordgen add` command for interactively adding words to `chords.csv` one at a time. Skips words that already exist, auto-detects category (with override), shows collision-free chord options ranked by score (or accepts a custom chord validated through the same scorer the assigner uses), generates alts, and writes the row with the chord pinned so future `chordgen gen` runs leave it alone. The CSV is rewritten atomically after every accepted word. The chord prompt is live: as you type, the buffer turns red until it becomes a valid choice (then green), with the rejection reason shown inline next to the buffer.
- Renamed `docs/images/training.png` to `learn.png` to match the renamed `learn` command. README and learn-mode docs updated to point at the new asset path.
- Alt chords surface in drill and book modes. Alt-slot inflections (`alt1`/`alt2`/`alt3` columns of `chords.csv`) inherit their base row's FSRS mastery, so graduating `set` also drills `sets`/`setting`/`settings` and highlights them in book prose. The chord shown beneath a stumbled word is suffixed with the slot digit (`au` becomes `au1` for `alt1`), and an `alt1 alt2 alt3` indicator below the keyboard always shows the firmware modifier names; the active slot is highlighted in yellow when the current word is an alt form. Disable via `drill.include_alts: false`.
- `drill.always_show_chords` (default `false`). When enabled, drill mode reveals chords below every word in the row from the start, not just on a stumble. Useful while you're still building muscle memory.
- `book.always_show_chords` (default `false`). When enabled, book mode renders a chord row beneath every text line, with each learned word's chord aligned under its first letter. Mirrors the drill-mode option.
- Cleaner alt generation. Closed off six families of bogus alts that surfaced in `chords.csv`:
  - SUBTLEX contraction-stem residue (`ca`/`wo`/`ai` from `can't`/`won't`/`ain't`) is dropped at ingest, so the pipeline no longer imports them as bogus verbs with garbage conjugations.
  - Other SUBTLEX subtitle artefacts that aren't real standalone words (`co` as a hyphenation prefix, colloquial residue `na` and `da`) are dropped via the same `_non_word` sentinel.
  - The interjection `eh`, which SUBTLEX mis-tags as `verb`, is retagged to no category at ingest so it stops generating `ehs`/`ehed`/`ehing` alts.
  - Non-gradable adjectives (`other`, `whole`, `welcome`, `chinese`, `important`, `available`, …) now get `more X`/`most X` instead of `wholer`/`importanter`. A length-based fallback also catches long adjectives where `pattern.en` produced a naive `Xer`/`Xest` suffix.
  - Plurals are suppressed for `-thing`/`-one`/`-body`/`-where` compounds, mass-noun pseudo-words (`gonna`, `wanna`, `gotta`, `huh`, `hm`, …), and rows that are already plural: `mps` no longer pluralises to `mpss`, and `ears` no longer becomes `earss`.
  - Irregular verbs use a small override table so `pay` → `paid` (not `payed`), `feed` → `fed` (not `feed`), `escape` → `escaped` (not `scaped`), and `bear` → `bore`/`born`. Pseudo-verbs like `wanna`/`gotta`/`gonna`/`born` and the contraction stems silence all four conjugation forms.

  `is_base_form` for nouns now also rejects rows where `pluralize(w)` is exactly `w + "s"`, catching the double-`s` artefacts above.

  Existing rows in your `chords.csv` keep their stale alts unless you re-run `chordgen gen` with `gen.alts.overwrite: true` or blank out the affected alt columns. Existing `ca`/`wo`/`ai` rows need to be deleted by hand, because the SUBTLEX fix only stops future `chordgen setup` runs from re-importing them.
- Pronoun alts category. SUBTLEX rows tagged `pronoun` now feed a built-in lookup-table inflector covering personal pronouns (`I`/`you`/`he`/`she`/`it`/`we`/`they`) with five forms each (nominative, objective, possessive determiner, possessive pronoun, reflexive). Default `gen.alts.pronoun.forms` is `[objective, possessive_det, reflexive]`. Re-run `chordgen setup` to refresh categories, then `chordgen gen` to populate alt slots.
- Casing preserved on import. `chords.csv` now stores SUBTLEX surface forms verbatim (so `"I"`, not `"i"`); dedup is case-insensitive (`"The"` and `"the"` collapse to the higher-frequency entry). Drill and learn comparisons are strictly case-sensitive: chords output the correct casing, so typing lowercase `i` against `"I"` is treated as a mistake. Existing user CSVs are unaffected; `setup --force` rewrites them.
- Drill on arbitrary words, with learned-word highlighting. The default pool is still your FSRS-graduated words. When you pass an explicit list (positional `WORDS` or `--words-file`), drill now uses every word from that list with a chord assigned: graduated words are highlighted in yellow, the rest are shown dim, and reveal-on-stumble still works.
- Import SUBTLEX-split contraction tails as apostrophe-prefixed forms. SUBTLEX tokenises on whitespace, so `he's`/`we'll`/`I'm` etc. surface as bogus high-frequency `s`/`ll`/`m` rows. These are now rewritten to `'s`, `'re`, `'m`, `'ve`, `'ll`, `'d` under a new `contraction` category (no alts generated). The kept-intact contraction `n't` (`don't`/`can't`/`won't`/...) is also retagged into the same category. The qmk/zmk/kanata/charachorder emitters prepend a backspace before all `contraction` rows so the apostrophe attaches cleanly to the previously-typed word. Genuine apostrophe words like `o'clock` flow through untouched. Learn mode excludes the `contraction` category since those forms aren't typed standalone. Re-run `chordgen setup --force` to pick up the new rows.
- `gen.key_replacement` accepts punctuation keys. Source-side keys are no longer restricted to alphabetic letters, so you can remap characters like `'` (e.g. `"'": x`) to a real keyboard letter for chord scoring. The typed word is unaffected; only the chord string changes. Replacement values still must be single lowercase letters present on your keyboard. Recommended for users who don't keep `'` on a comfortable chord position: without a remap, contraction rows like `'s`/`n't` and apostrophe words like `o'clock` will fail to score (the keyboard scorer rejects `'`) and won't get chords assigned.
- Drop lone-letter subtitle artefacts on import. SUBTLEX-UK surfaces single letters like `e` (used as grades / spelling letters) tagged `unclassified`. Only `a` and `I`/`i` are real English single-letter words, so the pipeline now drops every other lone letter at ingest. Re-run `chordgen setup --force` to refresh.
- Demonstrative, modal, and number alt categories. Three new closed-class lookup-table inflectors join `pronoun`:
  - `demonstrative` (`this`/`that`/`these`/`those`) with axes `singular`/`plural`/`proximal`/`distal`. Default forms: `[plural, distal]` so `this` -> `these`, `that`.
  - `modal` (`can`/`could`, `will`/`would`, `shall`/`should`, `may`/`might`, `must`) paired present <-> past. Default forms: `[past]`. Also fixes the long-standing bug where SUBTLEX tagged these as `verb` and `pattern.en` produced nonsense like `canned`/`canning`.
  - `number` (`one`..`million` + ordinals) with `cardinal`/`ordinal` pairs. Default forms: `[ordinal]` so `one` -> `first`.

  Affected words are retagged at ingest so the proper inflector runs. Re-run `chordgen setup --force` then `chordgen gen` to refresh.
- Bug fixes.
  - Directional change penalty (`directional_change_penalty`) was always 0 regardless of config: `get_directional_changes` returned the wrong variable. It now counts real per-hand direction changes, and `-1` correctly rejects such chords.
  - Identity self-alts (past of "hurt" → "hurt", plural of "series" → "series") now skip the alt slot instead of wasting it.
  - Same-finger middle+bottom row pairs (e.g. qwerty `a`+`z`) now receive the `same_column_chord_penalty`; previously only top+middle was penalised and top+bottom rejected.
  - `frequency_exponent` config docstring now correctly says the default is 3.0 (not 1.0).
  - Contractions (`'s`, `'m`, …) are now exempt from `min_word_length` in the assigner pool filter, matching the scorer; previously they were scored but then silently dropped from the chord pool.
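
The atomic CSV rewrite that `chordgen add` performs after each accepted word could look roughly like the sketch below. This is an illustration only, not chordgen's actual code: the helper name `write_rows_atomically` and the column names used with it are assumptions. The idea is to write a temporary file in the same directory and swap it in with `os.replace`, so an interrupted write leaves the previous CSV intact.

```python
import csv
import os
import tempfile


def write_rows_atomically(path, fieldnames, rows):
    """Rewrite a CSV so readers see either the old file or the new one.

    Hypothetical helper; chordgen's real implementation may differ.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives next to the target so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
            handle.flush()
            os.fsync(handle.fileno())
        # Swap the finished file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the partial temp file and leave the original untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Calling it once per accepted word, with the full row list each time, matches the "rewritten after every accepted word" behaviour described above.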
v2.2.0
A focused release that tightens the training loop, sharpens chord assignment defaults, and introduces book mode and drill improvements.
Renamed train → learn
The interactive spaced-repetition command is now chordgen learn,
the matching config.yaml section is learn, and the TUI title
reads "chordgen learn". The old train config key is silently
ignored.
Learn mode
- Two-phase chord reveal during initial learning. Brand-new cards show their chord for the first `learn.show_chord_steps` (default 3) consecutive correct reps, then hide it for the remaining `learn.learning_steps - show_chord_steps` (2) reps before graduating. An error resets the FSRS step counter to zero, which brings the chord back: guided reps first, then recall from memory.
- Separate learning vs. relearning step counts. `make_scheduler` now takes distinct `learning_steps` (new cards, default 5) and `relearn_steps` (lapsed cards, default 2), so a longer initial staircase no longer drags out lapsed-card re-graduation.
- Calendar-date overdue check. Overdue cards are matched by calendar date (Anki-style) instead of wall-clock time, so cards due later today appear in the review queue immediately.
- Chord stays hidden during relearning. When a mastered word lapses and returns to Relearning, the chord no longer reappears by default; it only shows on a current-word error, same as mastered words.
- Red flash no longer blocks keystrokes. The red flash on mistype is purely visual: backspace, correct letters, and space-to-complete all work during the flash instead of being silently dropped for 400–500 ms.
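
The two-phase reveal can be sketched with a plain step counter. This is a simplified model under stated assumptions, not the real scheduler: `step` is taken to be a 0-based count of consecutive correct reps, and the function names are hypothetical. Defaults follow the values quoted above (5 learning steps, chord shown for the first 3).

```python
def chord_visible(step, show_chord_steps=3):
    """Return True while the card is still in its guided reps."""
    return step < show_chord_steps


def next_step(step, correct):
    """Advance on a correct rep; an error resets the counter to zero."""
    return step + 1 if correct else 0


def simulate(answers, learning_steps=5, show_chord_steps=3):
    """Return, per rep, whether the chord was shown, stopping at graduation."""
    step = 0
    shown = []
    for correct in answers:
        if step >= learning_steps:
            break  # the card has graduated
        shown.append(chord_visible(step, show_chord_steps))
        step = next_step(step, correct)
    return shown


# Five clean reps: three guided, then two from memory.
print(simulate([True] * 5))
# A mistake on the fifth rep resets the counter, so the chord returns.
print(simulate([True, True, True, True, False, True]))
```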
Book mode
- New `chordgen book PATH` command. Type your way through an arbitrary book (`.txt`, `.md`, or `.epub`). The TUI shows a window of the text centred on the cursor, your keyboard layout pinned to the bottom, and a sliding-window WPM (default last 30 seconds, configurable via `book.wpm_window_seconds`). Words you've already learned (FSRS Review state) are highlighted in yellow, including the current word, and mistyping a learned word reveals its chord and lights up the chord keys on the keyboard view. Cursor position is auto-saved per-book to `~/.config/chordgen/books.json` so re-running `chordgen book <path>` resumes where you left off (`--restart` to start over). Navigation: `←`/`→` by word, `↑`/`↓` by paragraph, `PgUp`/`PgDn` by half a screen-page.
- Line-based rendering with max width. The book view scrolls by line rather than by word. The cursor's line stays vertically centred with as many previous and following lines as fit on screen, and text wraps to `book.max_width` (default 80 columns). The view fills the available height immediately on launch: a second render is scheduled after the initial layout pass so the text window uses the widget's full height from the start.
- Typeable-character normalisation. Smart quotes, em/en dashes, ligatures, accented Latin (à, é, ñ, ç, æ, œ, ß, …) and miscellaneous symbols (™, …, •, ©, ®, °, ×, ÷) are folded to plain ASCII on load so a basic QWERTY layout never gets stuck on an untypeable glyph.
- Resume uses file content hash. Progress is keyed by SHA-1 of the file's bytes instead of its absolute path, so renamed or moved books still pick up where you left off.
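
Keying progress by content rather than path can be illustrated with the standard library. The function name `book_key` is hypothetical, and how chordgen stores the key in `books.json` is not shown here; the sketch only demonstrates that two copies of the same bytes produce the same key.

```python
import hashlib
from pathlib import Path


def book_key(path):
    """Return the SHA-1 hex digest of the file's bytes, read in chunks."""
    digest = hashlib.sha1()
    with Path(path).open("rb") as handle:
        # Read in 64 KiB chunks so large books are never loaded whole.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Because the key depends only on the bytes, renaming or moving a book leaves it unchanged, while editing the file produces a new key and therefore a fresh position.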
Drill mode
- Arbitrary word lists. Pass words as positional arguments (`chordgen drill the quick brown fox`) or point at a file with `--words-file` / `-f`. In this mode the FSRS graduated pool is bypassed and `progress.json` is left untouched. Words without a chord in `chords.csv` are silently dropped.
- Personal-best leaderboard. Each completed drill records its WPM into a per-keyboard-layout leaderboard at `~/.config/chordgen/scores.json`, stored separately from `progress.json` so high-scores survive FSRS schema migrations. Only scores that beat the current #1 are recorded; every PB milestone is kept on disk and the summary screen shows the top 5 with dates. The leaderboard is keyed by `<keyboard-type>:<layout>`; custom layouts use the new `gen.keyboard.<type>.custom_layout_name` field (default `custom`) so multiple custom layouts can keep separate scoreboards.
- Failed words captured at first mistype. A word is recorded as failed the moment you mistype it, rather than when the word completes, so words you were stuck on when the timer expires now correctly appear in the failed-words list.
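
The personal-best rule (record only when the current #1 is beaten, keep every milestone, show the top 5) can be modelled in a few lines. The in-memory dictionary shape and the function names below are assumptions for illustration; the actual layout of `scores.json` is not documented here.

```python
from datetime import date


def leaderboard_key(keyboard_type, layout):
    """Build the '<keyboard-type>:<layout>' key described above."""
    return f"{keyboard_type}:{layout}"


def record_score(scores, key, wpm, today=None):
    """Append the score only if it beats the current best for this key."""
    entries = scores.setdefault(key, [])
    best = max((entry["wpm"] for entry in entries), default=None)
    if best is not None and wpm <= best:
        return False  # not a new personal best, nothing is stored
    entries.append({"wpm": wpm, "date": (today or date.today()).isoformat()})
    return True


def top_scores(scores, key, limit=5):
    """Return the highest milestones first, as a summary screen might."""
    entries = scores.get(key, [])
    return sorted(entries, key=lambda entry: entry["wpm"], reverse=True)[:limit]
```

Since only new bests are appended, the stored list is a history of milestones rather than of every drill.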
Chord assignment
- Default `frequency_exponent` raised to 3.0. Cubic frequency weighting ensures short chords go to common words. A word at Zipf 6.0 is now weighted 216× more than one at 1.0. Previously 1.0 (linear) let rare words compete too aggressively.
- Key replacement for missing layout keys. If your keyboard lacks certain letters (e.g. `q` or `z`), set `gen.key_replacement` in `config.yaml` to substitute them in chord candidates: chords use `k` for `q`, `s` for `z`, while the typed word stays unchanged.
- Faster re-runs. When `alts.overwrite` is `false` and all three alt slots are already populated, the expensive `pattern` library import is skipped entirely, making `chordgen gen` much faster on iterations.
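
The 216× figure follows directly from raising the Zipf value to the exponent. The helper below is a hypothetical stand-in for however the assigner computes its weight; it only reproduces the arithmetic.

```python
def frequency_weight(zipf, exponent=3.0):
    """Weight a word by its Zipf frequency raised to `exponent`."""
    return zipf ** exponent


# Cubic weighting (new default): 6.0**3 / 1.0**3
cubic_ratio = frequency_weight(6.0) / frequency_weight(1.0)
# Linear weighting (old default): 6.0 / 1.0
linear_ratio = frequency_weight(6.0, exponent=1.0) / frequency_weight(1.0, exponent=1.0)

print(cubic_ratio)   # 216.0
print(linear_ratio)  # 6.0
```

Under the old linear default the common word was only 6× heavier, which is why rare words could out-compete it for short chords.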
TUI / general
- Layout refresh. Learn and drill modes render session stats above the word stream, with the current word horizontally centred on screen. The underline cursor between the word and its chord has been removed so the chord sits directly beneath the word. The keyboard view stays at the bottom.
New config keys
| Key | Default | Section | Purpose |
|---|---|---|---|
| `learn.learning_steps` | `5` | learn | Consecutive corrects before a new card graduates |
| `learn.show_chord_steps` | `3` | learn | How many learning steps show the chord |
| `book.wpm_window_seconds` | `30` | book | Sliding window for running WPM |
| `book.max_width` | `80` | book | Max width of rendered text block |
| `gen.key_replacement` | `{}` | gen | Map missing layout keys to replacements |
| `gen.keyboard.<type>.custom_layout_name` | `"custom"` | gen.keyboard | Leaderboard label for the custom layout |
Changed defaults
| Key | Old | New |
|---|---|---|
| `learn.relearn_steps` | `3` | `2` |
| `gen.assignment.frequency_exponent` | `1.0` | `3.0` |
v2.0.0
A major release that overhauls the chord-generation pipeline and introduces interactive practice. Highlights:
- Train mode: a Textual TUI backed by the FSRS spaced-repetition algorithm, with Anki-style daily quotas, per-word speed grading, leech detection, and an ASCII keyboard view that highlights chord keys.
- Drill mode: a read-only speed-drill TUI for words you've already graduated, with timer or word-count sessions and live WPM.
- Vocabulary pipeline: on-demand SUBTLEX downloads at `setup` time, with explicit `frequency` (Zipf) and `category` columns; reserve a chord by leaving its `frequency` cell empty.
- Optimal chord assignment: replaces the old greedy + 2-swap passes with a sparse minimum-cost bipartite matcher, plus alt-coverage filtering and optional frequency tiers.
- Redesigned alt generator: category/inflector registry instead of hard-coded UD POS tags, fully configurable from `config.yaml`.
Upgrading
The vocabulary pipeline and chords.csv schema have changed. Existing users must recreate their chords.csv:
This will re-download the frequency list and regenerate chords.csv with the
new columns. Any manual edits to the previous chords.csv will be lost — back
it up first if you want to preserve them.
Train mode
- New `chordgen train` command: an interactive Textual-based TUI that drills your chords with spaced repetition. Words flow horizontally across the screen; type each word followed by a space and the next one is appended.
- Long-term scheduling is backed by py-fsrs (the FSRS algorithm). Per-word state (FSRS card, cumulative review count, lapse count, last-seen date, and per-word WPM EWMA) persists to `~/.config/chordgen/progress.json` after every word commit, so quitting mid-session never loses progress.
- Anki-style daily quotas: each calendar day has a budget of `new_words_per_day` brand-new words and `reviews_per_day` overdue reviews. Once both budgets are spent and any in-flight learning words have graduated, you land on a "No more words due today!" screen instead of a per-session summary.
- Anki-style queue counts under the chord row show the on-screen composition at a glance: blue = new, red = learning / relearning, green = graduated.
- New / learning words show their chord directly under the word. Once a word has graduated to FSRS Review state and accumulated `mastery_threshold` total reviews, the chord is hidden until you lapse on it.
- Any mistake during a word grades the review as `Again`, sending the card back into the learning queue. An `Again` on a card already in Review counts as a lapse; words that accumulate `leech_threshold` lapses are flagged as leeches so you can re-pin or revise the chord in `chords.csv`.
- Per-word speed grading: each clean word's WPM is compared to a rolling median of recent samples. Words below `slow_wpm_fraction` of the median are graded `Hard` (instead of `Good`) so FSRS schedules them sooner. The first word and any word that flashed red are excluded from speed grading.
- Words rescheduled mid-session are appended to the tail of the visible queue rather than inserted right after the current word, so the next word doesn't flip under your fingers.
- An ASCII representation of the configured keyboard renders below the word list, with the chord keys for the current word highlighted. Both `standard` and `directional` keyboard types are supported.
- The active Textual theme is persisted to `config.yaml` whenever you change it from the in-app command palette (`Ctrl+P`).
- New `train` block in `config.yaml`: `show_words` (default 10), `new_words_per_day` (default 20), `reviews_per_day` (default 200), `leech_threshold` (default 8), `mastery_threshold` (default 3), `relearn_steps` (default 3), `target_retention` (default 0.9), `slow_wpm_fraction` (default 0.7), `slow_min_samples` (default 20).
- New top-level `theme` field in `config.yaml` (default `"textual-dark"`).
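
The per-word speed grading rule can be sketched as follows. This is a guess at the shape of the logic, not the real implementation: the function name is hypothetical, and treating `slow_min_samples` as the number of samples required before speed grading starts is an assumption, since the changelog only lists the key and its default.

```python
from statistics import median


def speed_grade(wpm, recent_wpms, slow_wpm_fraction=0.7, slow_min_samples=20):
    """Return 'Hard' for a clean word typed well below the rolling median."""
    if len(recent_wpms) < slow_min_samples:
        # Assumption: with too little history, speed is not judged at all.
        return "Good"
    threshold = slow_wpm_fraction * median(recent_wpms)
    return "Hard" if wpm < threshold else "Good"
```

With a rolling median of 60 WPM and the default fraction of 0.7, the cut-off sits at 42 WPM, so a clean word typed at 40 WPM would be graded `Hard` and one at 50 WPM `Good`.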
Drill mode
- New `chordgen drill` command: a read-only speed-drill TUI for words you've already learned. Drill mode does not touch FSRS state, lapse counters, or daily quotas; use it to warm up or benchmark WPM against the chords you already know.
- Word pool is restricted to cards in FSRS Review state; if no graduated words exist yet, drill prompts you to run `chordgen train` first. Words are sampled by random shuffle.
- A drill ends after a fixed word count (`drill.mode = count`, using `drill.count`) or a fixed timer (`drill.mode = time`, using `drill.time_seconds`). Default is a 30-second timed drill.
- Live WPM is displayed during the run. Chords stay hidden unless you make a mistake, mirroring the mastered-word behaviour in train mode.
- Summary screen reports WPM, accuracy, and any failed words (de-duplicated). Press `Tab` to start another drill (`Tab` also restarts mid-drill), or `Esc` / `Ctrl+C` to quit.
- New `drill` block in `config.yaml`: `show_words` (default 10), `mode` (default `"time"`), `count` (default 25), `time_seconds` (default 30).
Vocabulary pipeline
- Replaced the bundled word list with on-demand downloads from SUBTLEX during `chordgen setup`.
- Added `--source` flag to choose between `subtlex-us` and `subtlex-uk`.
- Added explicit `frequency` column to `chords.csv` (Zipf scale 0–7 for SUBTLEX sources). Replaces the implicit row-order frequency assumption.
- Removed the `reserved_chord` column. To reserve a chord, set the `chord` column on a row with an empty `frequency` cell; that combination is the new signal for "user-pinned, leave alone". See the README for details.
- Renamed CLI flag `--min-zipf` → `--min-frequency`.
- Renamed `pos` column → `category` to match the alt-generator config naming.
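
The "chord set, frequency empty" pinning signal is easy to check when reading the CSV. The sketch below assumes a `word` column name and an inline sample file purely for illustration; only the `chord` and `frequency` column names come from the notes above.

```python
import csv
import io


def is_pinned(row):
    """A row with a chord but an empty frequency cell is user-pinned."""
    chord = (row.get("chord") or "").strip()
    frequency = (row.get("frequency") or "").strip()
    return bool(chord) and not frequency


def pinned_words(csv_text, word_column="word"):
    """Return the words whose chords are reserved (hypothetical column name)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[word_column] for row in reader if is_pinned(row)]


sample = "word,chord,frequency\nthe,th,7.0\nfoo,fo,\nbar,,3.1\n"
print(pinned_words(sample))  # ['foo']
```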
Alt generation
- Redesigned the alt-generator around a category/inflector registry instead of hard-coded UD POS tags.
- Added `alts` block to `config.yaml` schema with per-category options (all enabled by default).
Chord assignment
- New optimal assignment algorithm in `assigner.py`:
  - Reduces the problem to a sparse minimum-weight bipartite matching solved exactly with `scipy.sparse.csgraph.min_weight_full_bipartite_matching`. Replaces the previous greedy + 2-swap + eviction passes with a single globally-optimal solve.
  - Cost model: `option.score * weight(word)`, where weight is the row's `frequency` floored at `min_frequency_weight`. Frequent words attract low-score (short / fast) chords.
  - Slack edges with cost `unmatched_penalty * weight` keep the matching feasible; words for which leaving them unmatched is cheaper than the best available chord are reported in diagnostics with the words holding their top candidates.
  - Runs in well under a second for ~2000 words.
- Alt-coverage filter: words already reachable as another row's alt (e.g. `made` is `make`'s past tense) no longer get their own primary chord. Cycles in the alt graph (e.g. `could ↔ can`) are broken by keeping the higher-frequency word.
- `assignment` block in `config.yaml` exposes `min_frequency_weight`, `unmatched_penalty`, `frequency_exponent` (raise the cost weight of frequent words so they aren't out-bid by rare ones), and `priority_tiers` (split the pool into frequency tiers and solve each in order, reserving previous-tier chords). The previous `top_k` and `max_swap_passes` knobs were removed; the matcher considers every viable option per word.
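
To make the cost model concrete, the toy below minimises `score * weight` with a slack option of `unmatched_penalty * weight` per word. It deliberately swaps the sparse bipartite matcher for an exhaustive search over all combinations, which only works for a handful of words; the function name, the data shapes, and the example scores are all assumptions for illustration.

```python
from itertools import product


def solve(options, weights, unmatched_penalty=10.0):
    """Exhaustively find the cheapest assignment of distinct chords to words.

    options: {word: {chord: score}}; weights: {word: frequency weight}.
    Brute force stands in for the real matcher, so keep inputs tiny.
    """
    words = list(options)
    best_cost, best = float("inf"), None
    # None means "leave this word unmatched" (the slack edge).
    for combo in product(*[[None, *options[word]] for word in words]):
        used = [chord for chord in combo if chord is not None]
        if len(used) != len(set(used)):
            continue  # two words cannot share a chord
        cost = sum(
            (unmatched_penalty if chord is None else options[word][chord]) * weights[word]
            for word, chord in zip(words, combo)
        )
        if cost < best_cost:
            best_cost, best = cost, dict(zip(words, combo))
    return best, best_cost


options = {
    "the": {"th": 1.0, "te": 2.0},
    "zebra": {"th": 1.0, "zb": 3.0},
}
weights = {"the": 7.0, "zebra": 2.0}
print(solve(options, weights))  # ({'the': 'th', 'zebra': 'zb'}, 13.0)
```

Both words would prefer `th`, but giving it to the heavier word costs 7 + 6 = 13, whereas the reverse costs 14 + 2 = 16, so the frequent word keeps the low-score chord.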