Harmonique

Notes

Making Discogs Data 13% Smaller with Parquet

Marc in Space · Builds software and draws with robots

Recently, I have been working with the Discogs data dumps. Discogs publishes monthly dumps of its database as gzipped XML, one each for artists, labels, masters, and releases. I was curious about converting them to Parquet, a binary columnar file format heavily used in data engineering. It supports nested structures and a different compression algorithm per column, and databases such as ClickHouse and DuckDB can read it natively. Mostly, I wanted to compare sizes: would a Parquet file be smaller than the gzipped XML? If so, by how much? And how fast would the conversion be?
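To make the row-versus-column idea concrete, here is a minimal sketch in Python using only the standard library. It streams a gzipped XML dump and gathers the records into columns, which is roughly the shape a Parquet writer would take as input. It does not write Parquet itself. The element names (`artists`, `artist`, `id`, `name`), the sample data, and both function names are assumptions made for illustration, not the actual Discogs schema or the conversion code from the post.

```python
import gzip
import io
import xml.etree.ElementTree as ET


def read_columns(gz_stream, record_tag="artist", fields=("id", "name")):
    """Stream a gzipped XML dump and return its records as columns."""
    columns = {field: [] for field in fields}
    with gzip.open(gz_stream, "rb") as xml_stream:
        # iterparse yields each element once its closing tag is read,
        # so the dump is processed as a stream rather than loaded at once.
        for _event, elem in ET.iterparse(xml_stream, events=("end",)):
            if elem.tag != record_tag:
                continue
            for field in fields:
                # findtext only looks at direct children of the record.
                columns[field].append(elem.findtext(field))
            elem.clear()  # drop the record's children once copied
    return columns


def percent_smaller(original_bytes, converted_bytes):
    """How much smaller the converted file is, as a percentage."""
    return 100.0 * (original_bytes - converted_bytes) / original_bytes


# Tiny hypothetical sample standing in for a real dump.
SAMPLE_XML = (
    b"<artists>"
    b"<artist><id>1</id><name>Artist A</name></artist>"
    b"<artist><id>2</id><name>Artist B</name></artist>"
    b"</artists>"
)

if __name__ == "__main__":
    dump = io.BytesIO(gzip.compress(SAMPLE_XML))
    print(read_columns(dump))
    # {'id': ['1', '2'], 'name': ['Artist A', 'Artist B']}
    print(percent_smaller(100, 87))
    # 13.0
```

Storing each field as its own list is what lets a columnar format pick a compression scheme per column. The `percent_smaller` helper only spells out the arithmetic behind a figure like "13% smaller"; the byte counts passed to it here are made up.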

0b5vr GLSL Techno Live Set - "0mix"

A 7-minute techno live set created entirely in GLSL shaders that fits in just 64KB. Yes, 64KB. This WebGL intro by 0b5vr was submitted to the Revision 2023 demoscene competition. Procedural visuals meets algorave meets extreme compression. My mind is blown.

Small-scale data engineering with Go and PostgreSQL: a few lessons learned

I just released dgtools, a command-line utility for working with the Discogs data dumps. This little endeavor was supposed to be a quick side quest, but it turned into a rabbit hole.

Discogs is the go-to service for record collectors. They might have one of the biggest databases of physical music releases. Every month, they release a subset of their database as compressed XML under a CC0 license. Tools already exist to import these dumps into a PostgreSQL database, but I wanted the flexibility of a custom-built solution. I started building something in a Ruby on Rails app but quickly switched to Go, as I didn't want to pay the ActiveRecord performance cost.

OpenSimplex noise

OpenSimplex noise is a gradient noise function designed to avoid patent issues with simplex noise while fixing the directional artifacts in Perlin noise. It uses a different grid structure with stretched hypercubic honeycombs and larger kernel sizes, making it smoother but slower than simplex noise.

The Art of Rosa Menkman

Late to the party (as I often can be), I recently discovered Rosa Menkman’s work while at NXT Museum for the “Still Processing” exhibition.

It turns out Rosa Menkman has quite the background in glitch art: she has theorized it and produced artworks acquired for the Stedelijk Museum collection. I am usually not a big fan of video essays, but I ended up being very interested in two of her productions. The first is about racial and sexist biases in analog and digital image processing. The second is about the changing nature of rainbows due to atmospheric conditions (pollution) or changes in our analog wetware (eyes).
