#QueryEngine

Karsten Schmidt (toxi@mastodon.thi.ng)
2026-02-07

#ReleaseSaturday — This week I've been working on extracting, refactoring & generalizing the minimal column store database I've been using for my personal knowledge/media management toolset, and I'm happy to share it with the world now:

thi.ng/column-store

This is an in-memory column store database with:

- Customizable column storage types with configurable min/max cardinality, support for optional and/or tuple-values, default values
- Support for custom column type implementations
- Optional dictionary encoding of column values (memory & filesize saving)
- Powerful extensible multi-term query engine with built-in OR/AND/NOR/NAND operators and predicate-based matchers (column, row, partial row). Queries can be pre-built and then executed as standard JS iterables
- Optional bitfield indexing for dramatic query acceleration (esp. for complex multi-term queries)
- Dynamic adding/removing of columns
- JSON serialization with optional RLE compression (in my PKM dataset with ~20k items, the RLE compressed version is only 29% of the normal JSON serialization)
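To make the dictionary-encoding and RLE items above more concrete, here is a minimal sketch of both ideas applied to a single column. It is not the thi.ng/column-store API: `dictEncode`, `rleEncode` and `rleDecode` are hypothetical helper names, and the flat `[value, count, ...]` layout is an assumption chosen for brevity.

```typescript
// Hypothetical helpers for illustration only; not the thi.ng/column-store API.

/** Dictionary-encode a column: each distinct value is stored once, rows hold integer IDs. */
function dictEncode<T>(values: T[]): { dict: T[]; ids: number[] } {
  const index = new Map<T, number>();
  const dict: T[] = [];
  const ids = values.map((v) => {
    let id = index.get(v);
    if (id === undefined) {
      id = dict.length;
      dict.push(v);
      index.set(v, id);
    }
    return id;
  });
  return { dict, ids };
}

/** Run-length encode IDs as flat [value, count, value, count, ...] pairs (assumed layout). */
function rleEncode(ids: number[]): number[] {
  const out: number[] = [];
  for (const id of ids) {
    const n = out.length;
    if (n > 0 && out[n - 2] === id) out[n - 1]++;
    else out.push(id, 1);
  }
  return out;
}

/** Inverse of rleEncode. */
function rleDecode(rle: number[]): number[] {
  const out: number[] = [];
  for (let i = 0; i < rle.length; i += 2) {
    for (let k = 0; k < rle[i + 1]; k++) out.push(rle[i]);
  }
  return out;
}

const column = ["jpg", "jpg", "jpg", "png", "png", "jpg"];
const { dict, ids } = dictEncode(column); // dict: ["jpg", "png"], ids: [0, 0, 0, 1, 1, 0]
const packed = rleEncode(ids); // [0, 3, 1, 2, 0, 1]
const restored = rleDecode(packed).map((id) => dict[id]);
```

RLE only pays off when equal values sit next to each other, so the savings depend heavily on the data and its row order.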

I hope the readme and code examples give a decent overview for now... I've been using the overall system for a couple of years now, but this new packaged version is still marked as _alpha_. Everything's still being worked on.

Also, for those wondering what the point of all this is and why not just use SQLite etc.: I find there are many use cases for which a pure JSON-based approach is more than sufficient (without requiring extra tools and interfacing layers). The structure/storage model and the bitfield optimizations enable very fast query performance (compared to other JSON DBs I've tried in the past)...
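As a rough illustration of why bitfield indexing speeds up multi-term queries: if each distinct column value has one bit per row, AND/OR terms reduce to bitwise operations over a few machine words. The sketch below is an assumption about the general technique, not the package's actual implementation; `BitfieldIndex`, `and`, `or` and `rows` are hypothetical names.

```typescript
// Hypothetical sketch; names are illustrative and not taken from thi.ng/column-store.

class BitfieldIndex {
  private fields = new Map<string, Uint32Array>();
  private words: number;

  constructor(numRows: number) {
    // one bit per row, packed into 32-bit words
    this.words = Math.ceil(numRows / 32);
  }

  /** Mark `row` as containing `value`. */
  add(value: string, row: number): void {
    let bits = this.fields.get(value);
    if (!bits) {
      bits = new Uint32Array(this.words);
      this.fields.set(value, bits);
    }
    bits[row >>> 5] |= 1 << (row & 31);
  }

  /** Bitfield of rows containing `value` (all zeros for unknown values). */
  get(value: string): Uint32Array {
    return this.fields.get(value) ?? new Uint32Array(this.words);
  }
}

// Query terms combine word by word, 32 rows at a time.
const and = (a: Uint32Array, b: Uint32Array) => a.map((x, i) => x & b[i]);
const or = (a: Uint32Array, b: Uint32Array) => a.map((x, i) => x | b[i]);

/** Lazily yield the IDs of all rows whose bit is set. */
function* rows(bits: Uint32Array): IterableIterator<number> {
  for (let w = 0; w < bits.length; w++) {
    for (let b = 0; b < 32; b++) {
      if (bits[w] & (1 << b)) yield w * 32 + b;
    }
  }
}

// Example data: row ID -> values of a multi-value column
const tags = [["alps", "photo"], ["pnw"], ["alps"], ["photo", "pnw"]];
const index = new BitfieldIndex(tags.length);
tags.forEach((ts, row) => ts.forEach((t) => index.add(t, row)));

// (alps OR pnw) AND photo
const hits = Array.from(
  rows(and(or(index.get("alps"), index.get("pnw")), index.get("photo")))
); // [0, 3]
```

The row values themselves are only touched once the combined bitfield is known, which is where the gain for complex multi-term queries would come from.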

(Including all dependencies [only some other thi.ng packages], the entire DB package is ~6KB brotli'd, 19KB uncompressed...)

#ThingUmbrella #TypeScript #JavaScript #JSON #Database #QueryEngine #RLE #SmallWeb

TypeScript code example from the package readme (too long for alt text, link to original: https://github.com/thi-ng/umbrella/tree/develop/packages/column-store#basic-usage)
AI Daily Post (aidailypost)
2025-11-16

Embedding policy enforcement directly into query engines gives AI agents fine‑grained, auditable control over data. Think row‑ and column‑level security, purpose‑binding, and seamless IAM integration—without sacrificing performance. Learn how this opens the path to trustworthy, open‑source AI.

🔗 aidailypost.com/news/embedding

Karsten Schmidt (toxi@mastodon.thi.ng)
2025-11-07

Tags are sets. Many apps support tagging of content, but most of them (incl. Mastodon) treat tags only as singular/isolated topic filters, akin to a flat folder-based approach. But tagging can be so, so much more powerful when treating tags as sets and offering users the possibility to combine and query tagged content as sets (think Venn diagrams), i.e. allowing tags to be combined using AND/OR/NOT aka intersection/union/difference operations...

Below is a simple query engine to do just that in ~40 lines of code (sans comments), incl. using an extensible interpreter for a simple Lisp-like S-Expression language to define arbitrarily complex nested tag queries (the code is actually lifted & simplified from my personal knowledge graph tooling, also talked about here recently[1]...)

gist.github.com/postspectacula

For example, the query:

`(and (or 'Alps' 'PNW') (or 'LandscapePhotography' 'NaturePhotography') (not 'Monochrome'))`

...would select all items which have been tagged with `Alps` OR `PNW`, AND have at least one of the two photography tags given, but have NOT the `Monochrome` tag.
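For readers who want to see the mechanics without the linked gist, here is a minimal sketch of the same idea using only plain JS `Set`s. The original uses thi.ng/lispy, thi.ng/associative and thi.ng/oquery; none of those APIs are used here, and `parse`, `evaluate` and the sample data are hypothetical.

```typescript
// Minimal sketch with plain Sets; all names and sample data are hypothetical.

type Expr = string | Expr[];
type Index = Map<string, Set<string>>; // tag -> IDs of items carrying that tag

/** Parse a query like (and (or 'a' 'b') (not 'c')) into nested arrays. */
function parse(src: string): Expr {
  const tokens = src.match(/[()]|'[^']*'|[^\s()']+/g) ?? [];
  let pos = 0;
  const read = (): Expr => {
    const tok = tokens[pos++];
    if (tok === undefined) throw new Error("unexpected end of query");
    if (tok === ")") throw new Error("unexpected ')'");
    if (tok !== "(") return tok;
    const list: Expr[] = [];
    while (tokens[pos] !== ")") {
      if (pos >= tokens.length) throw new Error("missing ')'");
      list.push(read());
    }
    pos++; // skip the closing ")"
    return list;
  };
  return read();
}

const union = (sets: Set<string>[]): Set<string> => {
  const out = new Set<string>();
  for (const s of sets) for (const x of s) out.add(x);
  return out;
};

const intersection = (sets: Set<string>[]): Set<string> =>
  sets.length
    ? sets.reduce((acc, s) => new Set(Array.from(acc).filter((x) => s.has(x))))
    : new Set<string>();

/** Evaluate a parsed query to the set of matching item IDs. */
function evaluate(expr: Expr, index: Index, all: Set<string>): Set<string> {
  if (typeof expr === "string") {
    // quoted atom = tag name; unknown tags match nothing
    return index.get(expr.replace(/^'|'$/g, "")) ?? new Set<string>();
  }
  const [op, ...rest] = expr;
  const args = rest.map((e) => evaluate(e, index, all));
  switch (op) {
    case "and":
      return intersection(args);
    case "or":
      return union(args);
    case "not": {
      // difference: everything except items matching any argument
      const excluded = union(args);
      return new Set(Array.from(all).filter((x) => !excluded.has(x)));
    }
    default:
      throw new Error(`unknown operator: ${String(op)}`);
  }
}

// Sample data: item ID -> tags
const items: Record<string, string[]> = {
  a: ["Alps", "LandscapePhotography"],
  b: ["PNW", "NaturePhotography", "Monochrome"],
  c: ["Alps", "Architecture"],
  d: ["PNW", "LandscapePhotography", "NaturePhotography"],
};
const index: Index = new Map();
for (const [id, tags] of Object.entries(items)) {
  for (const t of tags) {
    if (!index.has(t)) index.set(t, new Set<string>());
    index.get(t)!.add(id);
  }
}
const all = new Set(Object.keys(items));

const query =
  "(and (or 'Alps' 'PNW') (or 'LandscapePhotography' 'NaturePhotography') (not 'Monochrome'))";
const result = Array.from(evaluate(parse(query), index, all)).sort(); // ["a", "d"]
```

New operators (e.g. `xor`) would only need another `case` in `evaluate`, which is the extensibility the interpreter approach is meant to provide.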

Whilst this syntax is probably alien-looking to the average user, it would be fairly straightforward to create visual/structural UIs for defining such queries (over the past 20 years I've done that myself several times already). Heck, even an SLM (small language model) could be used to translate natural language into such query expressions. What matters here is the widespread lack of treating tags this way in terms of conceptual/data modeling in most applications. Imagine being able to use hashtags this way on Mastodon to assemble personalized timelines (and extend the system to not just deal with hashtags, but other post metadata/provenance too)...

The code example illustrates how, with the right tools, such features are actually not hard to implement (or to integrate into existing apps). The example uses the following #ThingUmbrella packages for its key functionality:

- thi.ng/associative: Set-theory operations, custom Map/Set data types (unused here)
- thi.ng/lispy: Customizable/extensible S-expression parser, interpreter & runtime
- thi.ng/oquery: Optimized object and array pattern query engine

[1] mastodon.thi.ng/@toxi/11549755

#Tagging #Sets #QueryEngine #Lisp #Syntax #Parser #Interpreter #TypeScript #JavaScript

Syntax highlighted TypeScript sourcecode of the linked code example...
2024-01-09

This looks like a very interesting project that has gained a ton of interest and adoption. pola.rs/ #DataFrames #QueryEngine

@homelessjun

This is probably the smartest thing I've seen in ages.

I remember when I had a #MediaWiki locally and was using it pretty liberally until I had to update it. I had no clue how to do that, so I kinda lost touch with it later, after switching from #Windows to #Linux.

What I'd prefer to have this time around, though, would be a #Wikibase with a #QueryEngine set up so that I can query all the data I enter in my personal database.

You got one up and running?

heise online (inoffiziell) (heiseonline@squeet.me)
2022-08-04
The Photon query engine, designed for SQL and other languages, is now available for lakehouse data systems on the major cloud platforms.
Query Engine Photon for all lakehouse systems
