MarkdownDB

A MarkdownDB is a ContentBase where the primary text files are markdown.

A database where the "data" is markdown.

Markdown with frontmatter combines unstructured content with structured data in an open, highly accessible format.

Combine that with blocks (i.e. backtick-labelled sections) and you have extensible unstructured content in which you can embed other structured or unstructured content.

Combine that with web components / JSX and you have full inline JavaScript.

More reasons: see notes/markdown-is-eating-the-world (aka markdown is cool) 😎

Job Stories High Level

i.e. why do I want a MarkdownDB?

When generating rendered pages I want to get a list of all or some of the markdown files we have so that I can render them and create a blog index etc.

When generating an individual rendered page I want information about all the pages that reference it so that I can list all the backlinks.

When creating a tag page I want to list all pages with that tag so that I can generate that page
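
Each of the three stories above comes down to a query over an index of file records. A minimal sketch, assuming a hypothetical in-memory `FileRecord` shape (the names `path`, `tags`, `links` and the three functions are illustrative, not a confirmed MarkdownDB API):

```typescript
// Hypothetical record shape: one entry per indexed file.
interface FileRecord {
  path: string;    // path relative to the content folder
  tags: string[];  // tags taken from frontmatter
  links: string[]; // paths this file links to
}

// Story 1: list all files, or only those under one folder (e.g. "blog").
function listFiles(index: FileRecord[], folder?: string): FileRecord[] {
  if (folder === undefined) return index.slice();
  return index.filter((f) => f.path.startsWith(folder + "/"));
}

// Story 2: backlinks are the files whose links point at the target page.
function backlinks(index: FileRecord[], target: string): FileRecord[] {
  return index.filter((f) => f.links.indexOf(target) !== -1);
}

// Story 3: all pages carrying a given tag.
function withTag(index: FileRecord[], tag: string): FileRecord[] {
  return index.filter((f) => f.tags.indexOf(tag) !== -1);
}

const sampleIndex: FileRecord[] = [
  { path: "blog/hello.md", tags: ["intro"], links: ["notes/markdown.md"] },
  { path: "notes/markdown.md", tags: ["intro", "markdown"], links: [] },
];
```

A real index would more likely live in a database than in an array, but the queries keep the same shape.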

Job stories

Index a folder of files

I want to create a db index given a folder of markdown and other files
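
As a rough illustration of what "creating an index" could produce, the sketch below turns a list of files into index rows. It assumes the folder has already been read into memory, and the `SourceFile` and `IndexRow` shapes are hypothetical:

```typescript
// Assumption: the folder was already read into (path, content) pairs;
// walking the disk is left out to keep the sketch self-contained.
interface SourceFile {
  path: string;
  content: string;
}

// Hypothetical index row; the column names are illustrative.
interface IndexRow {
  path: string;
  extension: string;
  isMarkdown: boolean;
  size: number; // length of the content in characters
}

function indexFolder(files: SourceFile[]): IndexRow[] {
  return files.map((file) => {
    // Look for the extension in the file name only, not in folder names.
    const name = file.path.slice(file.path.lastIndexOf("/") + 1);
    const dot = name.lastIndexOf(".");
    const extension = dot <= 0 ? "" : name.slice(dot + 1).toLowerCase();
    return {
      path: file.path,
      extension,
      isMarkdown: extension === "md" || extension === "mdx",
      size: file.content.length,
    };
  });
}
```

Non-markdown files still get a row, so images and other assets can be queried alongside pages.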

Bonus

Index multiple folders, with support for configuring each one, e.g. giving a folder a prefix ("I have all my blog files in this separate folder over here").
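
One way the per-folder configuration could look is a list of folders, each mapped to a URL prefix. This is only a sketch: `FolderConfig`, `urlPathFor` and the folder names are made up for illustration.

```typescript
// Hypothetical config: each source folder maps to a URL prefix.
interface FolderConfig {
  folder: string;    // where the files live on disk
  urlPrefix: string; // prefix for the generated URL path, "" for none
}

// Turn a file path inside a configured folder into a URL path.
// Returns undefined when the file is not under any configured folder.
function urlPathFor(filePath: string, configs: FolderConfig[]): string | undefined {
  for (const config of configs) {
    const root = config.folder.replace(/\/+$/, "") + "/";
    if (filePath.startsWith(root)) {
      // Drop the folder root and the .md / .mdx extension.
      const relative = filePath.slice(root.length).replace(/\.mdx?$/, "");
      return config.urlPrefix === "" ? relative : config.urlPrefix + "/" + relative;
    }
  }
  return undefined;
}

const folderConfigs: FolderConfig[] = [
  { folder: "content", urlPrefix: "" },
  { folder: "../my-blog-files", urlPrefix: "blog" },
];
```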

Structured data extraction

Frontmatter

Extract frontmatter

- Deal with nested frontmatter
- Deal with casting types (e.g. string, number, date) so that we can query in useful ways, e.g. find me all blog posts before date X
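
A sketch of both points, assuming the frontmatter has already been parsed into a plain object by some YAML parser (not shown). Nested keys are flattened to dotted paths and string values are cast where they look like numbers or ISO dates. The dotted-path convention and the function names are assumptions, not a settled design.

```typescript
type Scalar = string | number | boolean | Date | null;
type Value = Scalar | Scalar[];

// Cast a raw value to a number or a Date when it looks like one.
// Assumption: dates are written as ISO "YYYY-MM-DD"; anything else stays a string.
function castValue(raw: unknown): Scalar {
  if (raw === null || raw === undefined) return null;
  if (typeof raw === "number" || typeof raw === "boolean") return raw;
  if (raw instanceof Date) return raw;
  const text = String(raw).trim();
  if (/^-?\d+(\.\d+)?$/.test(text)) return Number(text);
  if (/^\d{4}-\d{2}-\d{2}$/.test(text)) {
    const date = new Date(text + "T00:00:00Z");
    if (!isNaN(date.getTime())) return date;
  }
  return text;
}

// Flatten nested frontmatter into dotted keys: { seo: { title } } becomes "seo.title".
function flattenFrontmatter(
  data: Record<string, unknown>,
  prefix: string = "",
  out: Record<string, Value> = {}
): Record<string, Value> {
  for (const key of Object.keys(data)) {
    const value = data[key];
    const path = prefix === "" ? key : prefix + "." + key;
    if (Array.isArray(value)) {
      out[path] = value.map((item) => castValue(item));
    } else if (typeof value === "object" && value !== null && !(value instanceof Date)) {
      flattenFrontmatter(value as Record<string, unknown>, path, out);
    } else {
      out[path] = castValue(value);
    }
  }
  return out;
}

// "Find me all blog posts before date X" then becomes a plain comparison.
const post = flattenFrontmatter({ title: "Hello", date: "2023-05-01", seo: { priority: "2" } });
const cutoff = new Date("2024-01-01T00:00:00Z");
const postDate = post["date"];
const isBeforeCutoff = postDate instanceof Date && postDate.getTime() < cutoff.getTime();
```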

Link extraction

I want to extract links so I can compute backlinks, dead links etc.

- standard markdown links
- obsidian wiki links
- embeds of files e.g. ![...]
- wiki link embeds of files

So all of these

[...](...)
![...](...)
[[...]]   # wiki link style
[[...|my title]]   # wiki link style with title
![[...]]  # for images and other embeds
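
The five forms above can be picked up with two regular expressions. This is a simplified sketch: it does not skip links inside code blocks, does not handle reference-style links, and the `ExtractedLink` shape is hypothetical. A production version would more likely walk a markdown AST.

```typescript
interface ExtractedLink {
  target: string;      // link destination as written
  text: string | null; // link text or wiki-link title, if any
  embed: boolean;      // true for ![...](...) and ![[...]]
  syntax: "markdown" | "wiki";
}

function extractLinks(markdown: string): ExtractedLink[] {
  const links: ExtractedLink[] = [];
  // [text](target) and ![alt](target); an optional "title" after the target is ignored.
  const markdownLink = /(!?)\[([^\]]*)\]\(([^)\s]+)[^)]*\)/g;
  // [[target]], [[target|title]] and ![[target]]
  const wikiLink = /(!?)\[\[([^\]|]+)(?:\|([^\]]*))?\]\]/g;

  let match: RegExpExecArray | null;
  while ((match = markdownLink.exec(markdown)) !== null) {
    const text = match[2];
    links.push({
      target: match[3] ?? "",
      text: text ? text : null,
      embed: match[1] === "!",
      syntax: "markdown",
    });
  }
  while ((match = wikiLink.exec(markdown)) !== null) {
    const title = match[3];
    links.push({
      target: (match[2] ?? "").trim(),
      text: title ? title.trim() : null,
      embed: match[1] === "!",
      syntax: "wiki",
    });
  }
  return links;
}
```

Backlinks follow by inverting the result: for every extracted `target`, record the file the link was found in.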

Tasks

Extracting tasks like:

- [ ] this is a task

See Obsidian Dataview.
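
A minimal sketch of task extraction, assuming tasks are list items that start with `[ ]` or `[x]`. The `Task` shape is illustrative and nested tasks are treated like top-level ones.

```typescript
interface Task {
  description: string;
  checked: boolean;
  line: number; // 1-based line number in the source file
}

function extractTasks(markdown: string): Task[] {
  const tasks: Task[] = [];
  // A list marker (-, * or +), then "[ ]" or "[x]", then the task text.
  const taskPattern = /^\s*[-*+]\s+\[([ xX])\]\s+(.*)$/;
  const lines = markdown.split(/\r?\n/);
  lines.forEach((line, index) => {
    const match = taskPattern.exec(line);
    if (match !== null) {
      tasks.push({
        description: (match[2] ?? "").trim(),
        checked: (match[1] ?? " ") !== " ",
        line: index + 1,
      });
    }
  });
  return tasks;
}
```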

Computed fields (or just any operations on incoming records)

cf https://www.contentlayer.dev/docs/reference/source-files/define-document-type#computedfields

When loading a file I want to create new fields based on some computation so that I can have additional metadata

- Add a type based on the folder of the file so that I can label blog posts
- Add a layout based on the folder so that I can change layouts based on folder
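
Both examples can be expressed as small functions that run over each file as it is loaded. The sketch below assumes a hypothetical `ComputedField` function type, and assumes that values already present in frontmatter win over computed ones. The folder and layout names are made up.

```typescript
interface LoadedFile {
  path: string;
  metadata: Record<string, unknown>;
}

// A computed field looks at the file and returns extra metadata to merge in.
type ComputedField = (file: LoadedFile) => Record<string, unknown>;

// First path segment, e.g. "blog" for "blog/2023/post.md".
function topFolder(path: string): string {
  const slash = path.indexOf("/");
  return slash === -1 ? "" : path.slice(0, slash);
}

const typeFromFolder: ComputedField = (file) => ({
  type: topFolder(file.path) === "blog" ? "Blog" : "Page",
});

const layoutFromFolder: ComputedField = (file) => ({
  layout: topFolder(file.path) === "blog" ? "blog" : "default",
});

function applyComputedFields(file: LoadedFile, fields: ComputedField[]): LoadedFile {
  let metadata: Record<string, unknown> = { ...file.metadata };
  for (const field of fields) {
    // Assumption: explicit frontmatter wins over computed values.
    metadata = { ...field(file), ...metadata };
  }
  return { path: file.path, metadata };
}
```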

Validation of data on way in

When loading a file I want to validate it against a schema/type so that I know the data in the database is "valid"

When validation fails what happens?
Error messages should be super helpful
Follow the principle of erroring early

When loading a file I want to allow "extra" metadata by default so that I don't get endless warnings about extra fields

When accessing a File I want to cast it to a proper TypeScript type so that I can use it from code with all the benefits of TypeScript
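
To make the validation stories concrete, here is a sketch with a deliberately tiny, hypothetical schema format (a real implementation would probably lean on an existing schema library). It allows extra fields by default, names the file and field in every message, and throws as soon as a file fails.

```typescript
type FieldType = "string" | "number" | "boolean";
type Schema = Record<string, { type: FieldType; required: boolean }>;

interface ValidationOptions {
  allowExtra: boolean; // extra metadata is accepted unless this is false
}

// Returns one message per problem; an empty array means the file is valid.
function validate(
  path: string,
  data: Record<string, unknown>,
  schema: Schema,
  options: ValidationOptions = { allowExtra: true }
): string[] {
  const errors: string[] = [];
  for (const key of Object.keys(schema)) {
    const rule = schema[key];
    if (rule === undefined) continue;
    const value = data[key];
    if (value === undefined) {
      if (rule.required) {
        errors.push(`${path}: missing required field "${key}" (expected ${rule.type})`);
      }
    } else if (typeof value !== rule.type) {
      errors.push(
        `${path}: field "${key}" should be ${rule.type} but got ${typeof value} (${JSON.stringify(value)})`
      );
    }
  }
  if (!options.allowExtra) {
    for (const key of Object.keys(data)) {
      if (!Object.prototype.hasOwnProperty.call(schema, key)) {
        errors.push(`${path}: unexpected field "${key}"`);
      }
    }
  }
  return errors;
}

// Erroring early: throw on the first invalid file, listing every problem found in it.
function assertValid(path: string, data: Record<string, unknown>, schema: Schema): void {
  const errors = validate(path, data, schema);
  if (errors.length > 0) throw new Error(errors.join("\n"));
}

const blogSchema: Schema = {
  title: { type: "string", required: true },
  weight: { type: "number", required: false },
};
```

Whether a failed validation should abort the whole indexing run or only skip the file is the open question noted above; `assertValid` shows the abort variant.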

BYOT (bring your own types)

When working with MarkdownDB I want to create my own types … so that when I get an object out it is cast to the right TypeScript type
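
One possible shape for this, sketched with a user-supplied interface and a type guard. `queryAs`, `StoredFile` and `BlogPost` are hypothetical names, not an existing MarkdownDB API.

```typescript
interface StoredFile {
  path: string;
  metadata: Record<string, unknown>;
}

// The user's own type ...
interface BlogPost {
  title: string;
  date: string;
}

// ... plus a guard that proves a metadata object matches it.
function isBlogPost(
  metadata: Record<string, unknown>
): metadata is Record<string, unknown> & BlogPost {
  return typeof metadata["title"] === "string" && typeof metadata["date"] === "string";
}

// Returns only the files that pass the guard, with metadata typed as T.
function queryAs<T>(
  files: StoredFile[],
  guard: (metadata: Record<string, unknown>) => metadata is Record<string, unknown> & T
): Array<{ path: string; metadata: T }> {
  const result: Array<{ path: string; metadata: T }> = [];
  for (const file of files) {
    const metadata = file.metadata;
    if (guard(metadata)) {
      result.push({ path: file.path, metadata });
    }
  }
  return result;
}

const storedFiles: StoredFile[] = [
  { path: "blog/hello.md", metadata: { title: "Hello", date: "2023-05-01" } },
  { path: "notes/scratch.md", metadata: { draft: true } },
];

// posts is typed as Array<{ path: string; metadata: BlogPost }>.
const posts = queryAs<BlogPost>(storedFiles, isBlogPost);
const postTitles: string[] = posts.map((p) => p.metadata.title);
```

The guard does a runtime check, so the TypeScript type is backed by the data rather than by an unchecked cast.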

Misc

CLI tool for indexing

When working with a folder of markdown files I want to create a MarkdownDB (index) on the command line so that I can use it

Architecture

Tasks

What is the pipeline
How do we have "plugins" e.g. for adding …
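
Both questions are still open. As one possible answer, the pipeline could be a list of functions that each take a file and return an updated copy, with plugins being nothing more than extra functions in that list. Every name in this sketch is hypothetical.

```typescript
interface PipelineFile {
  path: string;
  content: string;
  metadata: Record<string, unknown>;
}

// A plugin is one pipeline step: file in, updated copy of the file out.
type Plugin = (file: PipelineFile) => PipelineFile;

// Run the steps in order, feeding each step the output of the previous one.
function runPipeline(file: PipelineFile, plugins: Plugin[]): PipelineFile {
  return plugins.reduce((current, plugin) => plugin(current), file);
}

// Example plugin: use the first level-1 heading as title when frontmatter has none.
const titleFromHeading: Plugin = (file) => {
  const match = /^#\s+(.+)$/m.exec(file.content);
  if (match === null || file.metadata["title"] !== undefined) return file;
  return { ...file, metadata: { ...file.metadata, title: (match[1] ?? "").trim() } };
};

// Example plugin: count whitespace-separated words in the raw content.
const wordCount: Plugin = (file) => ({
  ...file,
  metadata: {
    ...file.metadata,
    wordCount: file.content.split(/\s+/).filter((word) => word !== "").length,
  },
});
```

Link extraction, task extraction, computed fields and validation would all fit this shape as further steps.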

Schema

Nodes / Files
Blocks within them

Node properties:

type
- text (markdown)
- blob
metadata
body/payload ? not sure we have this in DB (this is on disk / storage)
blob_type (? or inferred)
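
The node properties above could be written down as types along these lines. This is a sketch of the notes, not a settled schema: `path` as the key, the `Block` fields, and inferring `blobType` from the file extension are all assumptions.

```typescript
// A node is either a text (markdown) file or a blob (image, PDF, ...).
type NodeType = "text" | "blob";

// A block is a backtick-labelled section inside a text node.
interface Block {
  label: string;   // label on the fence
  content: string; // raw text between the fences
}

interface DbNode {
  path: string;                      // assumption: nodes are keyed by path
  type: NodeType;
  metadata: Record<string, unknown>; // frontmatter plus computed fields
  blobType?: string;                 // only for blobs; stored here or inferred
  blocks: Block[];                   // empty for blobs
  // The body/payload is not stored here: it stays on disk / in storage.
}

// One option for "inferred": derive the blob type from the file extension.
function inferBlobType(path: string): string | undefined {
  const known: Record<string, string> = {
    png: "image/png",
    jpg: "image/jpeg",
    pdf: "application/pdf",
  };
  const extension = path.slice(path.lastIndexOf(".") + 1).toLowerCase();
  return Object.prototype.hasOwnProperty.call(known, extension) ? known[extension] : undefined;
}

const exampleNode: DbNode = {
  path: "img/diagram.png",
  type: "blob",
  metadata: {},
  blobType: inferBlobType("img/diagram.png"),
  blocks: [],
};
```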
