#dataformats

DataFormatHub dataformathub
2026-02-03

📰 Python Data Processing 2026: Deep Dive into Pandas, Polars, and DuckDB

Stop waiting for your CSVs to load. Learn how Pandas 2.x, Polars, and DuckDB are revolutionizing tabular data processing with Apache Arrow in 2026.

🌐 Also in: 🇪🇸 🇫🇷 🇩🇪 🇧🇷 🇮🇹

🔗 dataformathub.com/blog/python-

DataFormatHub dataformathub
2026-02-02

📰 TOML vs INI vs ENV: Why Configuration is Still Broken in 2026

Stop using .env files for secrets. Discover why INI, TOML, and ENV variables are failing modern dev teams in 2026 and how to fix your config workflow now.

🌐 Also in: 🇪🇸 🇫🇷 🇩🇪 🇧🇷 🇮🇹

🔗 dataformathub.com/blog/toml-vs

DataFormatHub dataformathub
2026-02-01

📰 Zod vs JSON Schema: Why 2026 is the Year of Type-Safe Data Contracts

Stop guessing your data types. Discover how Zod v4, JSON Schema Draft 2020-12, and TypeBox are revolutionizing API safety and performance in 2026.

🌐 Also in: 🇪🇸 🇫🇷 🇩🇪 🇧🇷 🇮🇹

🔗 dataformathub.com/blog/zod-vs-

DataFormatHub dataformathub
2026-01-26

📰 ELK Stack vs OpenTelemetry: The Ultimate Guide to Log Parsing in 2026

Is your observability platform a house of cards? Learn the truth about schema drift, AI-driven anomalies, and the real cost of log storage in 2026.

🌐 Also in: 🇪🇸 🇫🇷 🇩🇪 🇧🇷 🇮🇹

🔗 dataformathub.com/blog/elk-sta

DataFormatHub dataformathub
2026-01-21

📰 JSON vs JSON5 vs YAML: The Ultimate Data Format Guide for 2026

Master the evolution of JSON, JSON5, and YAML in 2026. Learn how to handle massive payloads, avoid YAML bombs, and implement strict schema validation today.

🌐 Also in: 🇪🇸 🇫🇷 🇩🇪 🇧🇷 🇮🇹

🔗 dataformathub.com/blog/json-vs

N-gated Hacker News ngate
2026-01-12

🛒 Ah, the modern marvel of squishing into a flat text file! Because who doesn't want the thrill of assembling their own dataset from a jumble of words? 😂 was obviously too roomy, so they went ahead and invented 'CommerceTXT' - the IKEA of data formats. 🏗️📦
huggingface.co/datasets/tsazan

JavaScriptBuzz JavaScriptBuzz
2026-01-10

JSON Handling: Which Language Makes It Easier?!

JavaScript JSON vs PHP JSON - native vs functions, which wins? SURPRISING results!

youtube.com/watch?v=K6QzMpuBY7s

DataFormatHub dataformathub
2025-12-31

📰 Zod vs Yup vs TypeBox: The Ultimate Schema Validation Guide for 2025

Stop guessing your data's shape. Master Zod, Yup, and TypeBox to build bulletproof, type-safe TypeScript applications in 2025. Learn the latest features now.

🌐 Also in: 🇪🇸 🇫🇷 🇩🇪 🇧🇷 🇮🇹

🔗 dataformathub.com/blog/zod-vs-

DataFormatHub dataformathub
2025-12-25

📰 JSON vs YAML vs JSON5: The Truth About Data Formats in 2025

Is YAML a security nightmare? Discover the truth about JSON, JSON5, and YAML in 2025. Learn why your choice of data format impacts performance and safety.

🌐 Also in: 🇪🇸 🇫🇷 🇩🇪 🇧🇷 🇮🇹

🔗 dataformathub.com/blog/json-vs

2025-11-24

Dear Archive Bubble,
are there any journals and magazines I should be aware of when it comes to news about (digital) archiving?
• Also data formats, metadata, events, etc.
• Journals from Europe

Feel free to share and thanks!

#archives #dataformats #archivistodon berlin.social/@aoe/11558175405

2025-11-20

Dear archive bubble,
Are there any trade journals I should know about when it comes to news on (digital) archival work?
• Also data formats, metadata, events, etc.
• Journals from the D-A-CH region, but also English-language journals from elsewhere in Europe.

Feel free to boost. Thanks!
#Archiv #Archive #archives #dataformats #archivistodon

N-gated Hacker News ngate
2025-08-26

🎉 Behold, the groundbreaking revelation: is not the Holy Grail of data formats! 🚀 Apparently, using xz for digital preservation is like using a sieve as a bucket, bound to fail. Who knew? 🤦‍♂️ Stick to , , or if you want actual functionality and avoid sinking your data into the abyss of inadequacy. 🔍💾
nongnu.org/lzip/xz_inadequate.

N-gated Hacker News ngate
2025-08-24

Ah, the thrilling world of Parquet file formats, where the only thing more riveting than the two versions is the general towards updating them. 🤦‍♂️ Apparently, SQL engines have taken on the role of format overlords, ensuring progress is locked away tighter than my interest in reading another tech blog post. 🙄
jeronimo.dev/the-two-versions-

2025-06-19

Shane O’Sullivan: Search Huge JSON files on the Web. “Working with very large JSON files (20MB+) using online tools tends to be a crashy affair. Whether you’re looking to format or search them, all the tools I found just crash. I found myself having to work with huge JSON files recently, so I built a tool specifically optimized for huge JSON files, called Huge JSON Viewer.”

https://rbfirehose.com/2025/06/19/shane-osullivan-search-huge-json-files-on-the-web/

๐Ÿ’ง๐ŸŒ Greg CocksGregCocks@techhub.social
2025-01-31
Conceptual diagram of GSPy workflow. Data from a variety of formats and types are read into GSPy, along with required metadata files. Through the GSPy software, data are converted into a standardized NetCDF file containing the dataset and metadata appropriate for archiving and sharing.

GS data convention. (A) Datasets are structured into three fundamental group types based on content and data geometry. The Survey group contains general metadata about the dataset. Unstructured datasets, such as from CSV or TXT files, form Tabular groups, whereas structured (gridded) datasets are categorized under the Raster group. Metadata is attached to all groups, with various required attributes (green text) that expand on the CF-1.8 convention. (B) Groups follow a strict hierarchy in the NetCDF file, with a single Survey group at the top to which all data groups are attached. Datasets are indexed within their respective group type. (C) Tabular and Raster data groups must contain clearly defined dimensions, such as index or x, y, z, as well as coordinate variables. Raster groups are distinct in that dimensions are also coordinates, whereas Tabular datasets are assigned spatial coordinates that align with the index dimension. Lastly, the coordinate variable “spatial_ref” is required for all data groups, which expands on the “coordinate_information” variable required in the Survey metadata.

photo - rigs preparing to do a seismic survey, Middle East

GSPy code base - Writing and plotting examples. Once all groups have been attached to a Survey, the “write_netcdf” and “write_ncml” methods will write the GS NetCDF and NcML files, respectively. GSPy also provides methods to generate scatter and pcolor plots for variables.
2023-12-12

mastodon.edufor.me/@schmittlau

Maybe #FAIR ness in research data should extend to #FAIR ness in other forms of data we generate and consume in our daily lives?

#FAIRData #OpenData #DataFormats #DeustcheBahn #DB and maybe even #NFDI #NFDIRocks ?

Doc Edward Morbius ⭕ dredmorbius@toot.cat
2021-04-30

On Paperwork vs. Digital Formats

tired: Our customer's paperwork is profit. Our own paperwork is loss.[1]

wired: Your proprietary data format is loss. Our proprietary data format is profit.

I'd remembered the first aphorism from a long-ago collection of Murphy's Laws.

Thinking through my struggles at organising online and digital media, references, etc., I realised that a huge problem is that these formats don't serve my goals. They're designed far more around their authors' goals, or even more often, the publishers' goals, largely around advertising, marketing, tracking, building lock-in, creating and defending monopolies, and the like.

Digital formats that are in the end-user's interest and specification serve the user. Those that are in the publisher's specification serve the publisher.

A related thought is that a key affordance of printed periodicals (newspapers, magazines, journals) is that of garbage collection, to put a contemporary spin on it.

When you're done reading a newspaper or magazine, you pick up the whole lot and throw it out. There's an intermediate level of organisation between "the article" and "the whole collection" (that is, everything published in your office or home): "the issue". (Or perhaps a box or shelf of archived media.) That is, _there are multiple naturally-occurring levels of aggregation._

When you're trying to sort through a set of browser tabs, you generally have only two levels of aggregation: the individual tab, or the entire session. There are typically no intermediate levels, and sorting through what you want to keep (or re-read, or work with) means you've got to go through the set one at a time and resolve disposition. The data format serves the browser vendor, but not the user.

Tools such as Tree-Style Tabs, an absolutely essential Firefox extension, give a higher level of natural organisation, the tab tree. Here, a structure emerges, without user effort, of related content. At the top of the tree is whatever page began an exploration, and as you descend it, you go further down into the search. When cleaning up, it's possible to pick any given tab, branch, or whole tree, and close it out in one fell swoop. Garbage collection costs are reduced.

(Three guesses as to what I've been attempting to do, and the first two don't count.)

#media #paperwork #DigitalMedia #DigitalFormats #FileFormats #DataFormats #kfc #docfs #UserCentricDesign #TreeStyleTabs

https://mastodon.social/@rzrrzr@social.samsunginter.net
2019-08-28
