luna's up on her bullshit again

do you want to send words at me? reach at email [email protected]

Alternative title: “My answer for the next decade of my digital life”.

I am, generally speaking, a data hoarder. I don't really have the money to invest in large ZFS or CephFS or whateverFS clusters of storage, so I try my best with my 1TB drive I use for my entire system (nowadays with a 256GB NVMe drive for the root partition, so another 100GB can be quickly eaten away by datahoard-ness).

If I recall correctly, this way of doing things started back in the 2010s, when broadband was still scarce in the country (10Mbps down and less than 1Mbps up was common; nowadays, in 2022, I'm getting almost 300Mbps down and 40Mbps up for the same price as back then), and a girl needed her electronic music fix. Since my school didn't have a connection, I started hoarding YouTube mp3 files from a random youtube-to-mp3 service, and copying the files to my 2GB phone.

My laptop had way more than 2GB, and there's an infinite amount of music to listen to. What to do? Organizational systems, of course! The first thing I remember doing was ordering folders by increments of 1, where each folder had some 10 to 20 files. So I could grow the library up in my laptop, and copy the most recent tunes to my phone in a very easy manner (delete all folders below 25, for example), while not having to lose the old stuff, just in case. If you know my soulseek, you can see that same system living in 2022.
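The numbered-folder scheme can be sketched in a few lines of Python. This is only an illustration of the idea, not the script I actually used: the function names and the default of 15 files per folder are made up for the example.

```python
from typing import Dict, List


def assign_folders(filenames: List[str], per_folder: int = 15) -> Dict[int, List[str]]:
    """Group files, in arrival order, into folders numbered 1, 2, 3, ..."""
    folders: Dict[int, List[str]] = {}
    for index, name in enumerate(filenames):
        # Integer division decides which folder the file lands in.
        folders.setdefault(index // per_folder + 1, []).append(name)
    return folders


def folders_for_phone(folders: Dict[int, List[str]], keep_from: int) -> Dict[int, List[str]]:
    """Keep only the most recent folders, i.e. drop every folder below keep_from."""
    return {number: files for number, files in folders.items() if number >= keep_from}
```

Calling `folders_for_phone(folders, 25)` is the "delete all folders below 25" step: the laptop keeps the full dictionary, the phone only gets the tail.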

As time passed, I started gathering different kinds of media, videos, papers, books, and I noticed a pattern: they aren't properly read in sequential order. Say, if I wanted to find some scientific paper I had saved ages ago, and I know what it's about, I would have to either:

  • Go through each folder sequentially.
  • Hope that I remember the title.

If you're yelling “booru” off the top of your head, then yes. I didn't know it at the time, even though I was a heavy user of booru software to search for new images of anime women holding hands, but booru systems are the ideal solution for this kind of problem. They are non-hierarchical tagging systems that have (digitally) lived with us for a decade now, though the ideas behind them go back much earlier.

I'm ashamed that it took this long for me to refer to it in writing, but Nayuki's blogpost on non-hierarchical systems is an eye opener to what I've been wanting to have for ages and I didn't know it. It's absurdly long, but it's orchestrated akin to a machine gun of ideas, and I love that kind of stuff.

There are many booru systems out there, but one of them that's close to what I've been wanting is Hydrus. From the website:

> The hydrus network client is a desktop application written for Anonymous and other internet enthusiasts with large media collections. It organises your files into an internal database and browses them with tags instead of folders, a little like a booru on your desktop.

I have attempted to use it to organize my libraries more than once, since it would “technically” fit like a glove, but I had issues with it, though they are not to say that hydrus is bad, more that it doesn't align with my vision for such a system.

In hydrus, you import your files into the system, you can add/remove tags, share metadata around in the “hydrus network”, etc. But the biggest dealbreaker for me is that once a file is added, its original location becomes meaningless. To be able to refer to that file in the future, you must use hydrus to find it, because everything is in an internal client_files/ directory where the filenames are renamed to their hashes.

Everything still works from a filesystem level, sure, but if you lose access to Hydrus, you now get a client_files/ folder you can't understand anything out of by filepath anymore. That is a design decision that I perfectly understand where it comes from, but it brings me pain (see: the folder organization structure I just showed is not possible to happen inside Hydrus unless I create hacks like symbolic links from client_files/ into the sequential folder structure).

So, if I wanted to make my own non-hierarchical system, it would have to operate as an overlay on top of an existing filesystem, keeping references to the original file paths. That becomes a problem really fast, and that's the main reason why hydrus does what it does by design: renames. If a file is renamed, your reference to it goes away. A system that does not take ownership of the file contents entirely would have to keep track of path renaming.

There are two approaches I have found for this:

  • FUSE.
  • syscall tracing.

FUSE is something I have never touched on and there's the possibility of causing high latency on FS operations as things go back and forth from kernel-space to user-space (in theory io_uring could help in this case but I have no idea how it works).

Syscall tracing is possible thanks to the eBPF virtual machine that's in the Linux Kernel. bpftrace is a CLI utility for Linux that is heavily inspired by dtrace which works on the BSDs and illumos systems.

So, well. I guess I made it.

https://github.com/lun-4/awtfdb.

Here it is. Thanks to bpftrace integration, I am able to make my own “rename tracker” and update the index database file automatically. Because of all of that, I have a much harsher opinion on bpftrace, but that's answered by “it's not 1.0 yet, so don't complain”.

awtfdb is a collection of CLI tools (at the moment, Linux-only, macOS soon maybe?) that operate on a single index database file, powered mainly by Zig, SQLite, and my power to bother a colleague so hard they make a library. You can include files and their tags into the index with ainclude, search for tags with afind, remove files through arm, etc. Since bpftrace has to be run as root, awtfdb-watcher is provided so that only that dedicated process can have root access, while the others just operate under your normal user.
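To make the "overlay index" idea concrete, here is a minimal Python sketch using the standard sqlite3 module. It is not awtfdb's actual schema or code (awtfdb is written in Zig); the table layout and the function names `include`, `find` and `on_rename` are assumptions made up for this example. The point is only that tags hang off the file's hash, while the path is a separate record that a rename tracker can update.

```python
import hashlib
import sqlite3
from typing import Iterable, List


def open_index(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the single index database file."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, hash TEXT NOT NULL)")
    db.execute("CREATE TABLE IF NOT EXISTS tags (hash TEXT NOT NULL, tag TEXT NOT NULL, UNIQUE (hash, tag))")
    return db


def include(db: sqlite3.Connection, path: str, content: bytes, tags: Iterable[str]) -> None:
    """Record a file's path and attach tags to its content hash."""
    digest = hashlib.sha256(content).hexdigest()
    db.execute("INSERT OR REPLACE INTO files (path, hash) VALUES (?, ?)", (path, digest))
    db.executemany("INSERT OR IGNORE INTO tags (hash, tag) VALUES (?, ?)",
                   [(digest, tag) for tag in tags])


def on_rename(db: sqlite3.Connection, old_path: str, new_path: str) -> None:
    """What a rename tracker would call: the file stays where the user put it,
    only the index entry moves."""
    db.execute("UPDATE OR REPLACE files SET path = ? WHERE path = ?", (new_path, old_path))


def find(db: sqlite3.Connection, tag: str) -> List[str]:
    """Return the current paths of every file carrying the given tag."""
    rows = db.execute(
        "SELECT files.path FROM files JOIN tags ON tags.hash = files.hash "
        "WHERE tags.tag = ? ORDER BY files.path", (tag,))
    return [row[0] for row in rows]
```

Because the tags never reference the path, a rename is a single-row update and the tags survive it untouched.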

This goes back to the alternative title. If all goes well, this project is planned to stay around for a long time for me, and is how I want to manage my media libraries in the long far future. It isn't super stable right now, and there are design questions to answer, but I'm hopeful for that future.

It has been almost two months since I received my own VR device: The Oculus (now Meta) Quest 2. A few years ago I used a PlayStation VR at a friend's house, but that was only for a handful of hours, and nothing other than Beat Saber.

I would say it has been one of the greatest investments I've made using the money I got from my full-time job. Thanks to economies, I'm not able to just get a powerful setup to have PCVR, which means the Quest 2, with its price range, becomes very enticing (as in, orders of magnitude more enticing). The amount of things and experiments I can do with it justifies the cost, and several times a day I've been having even more ideas on what to do with it, it's great.

Because of all of this, I'm now also considering financing my PCVR setup, but that's for the long term (as in, “2-3 years” long term to save enough, if something more pressing doesn't happen by next year).

This blogpost will condense various takes I've had over the months regarding various topics related to VR.

The limitations of Standalone

The Quest 2 is, for all intents and purposes, an Android phone with a big screen, 4 infrared cameras, weird lenses, wild speakers, and buttons! Fucked up! It's absurdly impressive work by the Meta engineers. I believe that the future will be in this standalone space, because you can drive the price down, and cheaper tech means more adoption of that said tech.

Currently (December 2021) a Valve Index costs 1000 USD, then you need a powerful PC to achieve smooth VR, and for me, currency conversion and shipping costs mean that it's too much of a risk for me. The Quest 2, on the other hand, is 300 USD by itself. Sure, import taxes bring that value up a notch, but it's much less risky of a purchase for someone in my situation. And now I just need the “powerful PC” part, since I can use something like Virtual Desktop to play SteamVR games in the Quest 2.

This is not to say the Quest 2 is perfect. It is a phone, and so, performance takes a big punch. If you compare PCVR experiences and their equivalent Quest 2 ports, you can smell the optimization (without that smell box!! It's a fucking smell box).

Sidenote: Privacy

You can remove your connections to Oculus's servers, but in the end, that turns the Quest 2 into a PCVR headset. While that's a viable plan for me, who wants a PCVR setup in the future, I do not think it is a useful answer for the large number of Standalone VR players that will begin to appear in the future (see this market forecast, and Steam's hardware survey, in which the Quest 2 has the largest share with 36.32%, per the November 2021 Hardware Survey data).

If what we want is to release ourselves from Meta's shackles, I think the only way to do it is to root the Quest 2, and start to develop custom system software. While the former is a hard endeavour, the latter is an even harder one. You would need to:

  • Reverse engineer the Qualcomm Snapdragon XR2, an SoC which does not have public documentation, and I don't want to talk to Qualcomm to find out if I need an NDA or not.
  • Write your own SLAM system to counteract Quest Insight
  • God, the hand tracking data (it's just a neural network, for sure, but oof)
  • Possibly need to reverse engineer VrApi, that might change as Meta wants things to use OpenXR, but I can't find reliable sources for that one.
  • Oh god how do we get APKs from the Oculus Store to run on the headset
  • And a lot more that I don't know of!

There still should be the ability to root so that those reverse engineering efforts can begin. The whole point of the Quest 2 is in the hardware it's giving and some amount of a social platform, but if you jailbreak that hardware, you jailbreak the value of the headset. That enables freedom while making Meta lose money on the headset (source for the business loss).

Metaverse projects and the point of VR

Right now, there are around 10000000000000 metaverse wannabes, I'll categorize them into 3 areas:

  • VRChat
  • Horizon Worlds
  • The rest (Rec Room, AltspaceVR, etc)

I am of the opinion that VRChat is the only one that will have any kind of foothold in the future because of one singular thing the others don't have: Personal Identity. I have talked about this topic before with many others, but in the other platforms you have a “style picker”-style screen which you have seen in many other games, like The Sims, while on VRC you can have just about any 3D mesh (within the limits, of course). That allows for a much greater array of possibilities compared to the other platforms, and in turn, generally better social interactions. A close friend of mine says that when she's talking to me in Rec Room (when we were trying it out), it felt like talking to a manufactured avatar, but switching back to VRC, she reported it felt like having a conversation with me, Luna, even though I was appearing to her as a completely different being.

I'm not sure if other people would experience this slice of VR differently, and I might have biases on identity exploration due to being transgender and doing such myself, but if someone wants to be a cute anime girl or whatever else they want, they should be encouraged to do so, and being a permutation of a humanoid avatar without legs is not really unconstrained enough to do such experiments with.

VRChat and Geocities

Practically all of the worlds and avatars in VRChat are created by users. The VRChat company provides the client and server infrastructure, and sometimes partnered worlds, whose actual creation process I don't know.

You can draw direct parallels from that to how Geocities worked. It was free hosting servers for user content, and then someone else came with the web browser, but it was raw, random creativity by other people, which you had to dig through and find yourself.

Though, Geocities died of unknown reasons. I was too young to have gone overseas in internet space, and the country I'm in didn't really have any sort of Geocities-esque movement; we were diving in by the millions by the time FAANG had already become an information behemoth.

One guess I can make, which also applies to many other platforms, is the cost versus revenue math. You are giving people infrastructure for free, and how do you plan to pay for that infrastructure? It isn't even known if YouTube is properly profitable for Google, but AdSense pays for it all. Will something akin to that happen in the VR space?

To take a more positive light, even though I feel a lot of worry in the infrastructure side of things, I don't feel as much in the actual content generation ecosystem. Likely because platforms like Patreon exist, which enables a far easier donation scheme than back in Geocities days. Hell, even Neocities, the spiritual successor to it, is backed up by either donations or a small 5USD/mo plan.

My wanted VR future

Disclaimer: The “My” in that title is on purpose. You might disagree with what I say in here, on any front.

While I don't usually love to think in long-term timelines, I have some wishes here and there.

  • VR should be brought to more people, and it's why I can see the bet on standalone VR because it is a great driver for that.
  • Metaverse being owned by people, not profit motive. VRChat is not that, as they are a startup and own the entire infrastructure of itself.
    • Can federation solve this? I don't know. I feel disillusioned by it nowadays, you can probably get the benefits by being centralized, but structured like a worker-owned cooperative.
    • I don't think blockchain will save us because of the complexity of the technical hurdles while centralized systems bring development speed and other things, but this could be written in more detail on another blogpost.
  • Full-dive VR is not something I directly want in the current state of things, it requires us as a society to think differently about technology, because in virtual, anything is instantly copyable compared to meatspace.
    • There's also the whole “write-based BCI” which brings an avalanche of ethical discussions, which, again, can be developed in a separate blogpost.

Conclusion

I think VR is neat. Its neatness should be defended.

Here's a fun music track: Dom & Roland – Imagination

Hello, if this is your first time reading this, I'm luna, one half of LavaTech and we provide continuous integration (CI) services for the Zig programming language.

We started doing this because Zig could not use their pre-existing CI service for FreeBSD builds, sr.ht, because the compiler became very RAM-intensive and that hits the limits on their CI. We stepped in and we now provide a selfhosted Sourcehut instance where FreeBSD builds can still happen while the self-hosted compiler is not finished yet (Genuinely, I have no idea what will happen once the memory usage is brought down, but for now, I'm happy to provide the service, and wish to continue).

On a sunny day with nothing much to do, I decided to bring better NetBSD support for Zig, and one of the areas I can learn on is the one where I'm operating: CI.

This blogpost outlines the things I had to do to bring NetBSD to Zig CI.

Some background, or, How does CI work

Sourcehut is a “software forge”, which is a collection of tools to manage codebases, either publicly or privately. sr.ht is the main instance of Sourcehut, operated by its original creator, Drew DeVault.

builds.sr.ht, the CI service of Sourcehut, works by providing QEMU/KVM virtual machines that run specifically-crafted operating system images, and from them, your project can be built and the resulting artifacts can be tested inside the virtual machine, all via executing SSH into the VM. After all the commands are run without errors, the build has “passed”. An example of builds.sr.ht on the main instance can be seen here

The outline of the available VM images can be seen in the compatibility page of builds.sr.ht, and from there you can see Alpine Linux, Arch Linux, FreeBSD, among others (hell, even 9front!).

NetBSD on sr.ht

Back when I was developing support for it, the compatibility page did not mention NetBSD, and thanks to Michael Forney, it now does!

The history for it is as follows:

  • The initial versions of the NetBSD image were created by Drew DeVault.
  • However, they weren't really finished.
  • I began working on finishing it.
  • Michael Forney also began working on it, in parallel. Their patch has been successfully upstreamed!
  • Even though we worked in parallel, our changes are mostly equivalent and so, there isn't a need to upstream my changes:
    • Replacing anita (The automated NetBSD installer) with directly downloading the binary sets and partitioning the virtual disk. This is done to remove the QEMU requirement on the build script, because I am building the image in a NetBSD VM itself, and I was not sure on how to bring nested virtualization to it.
    • Replacing pkgsrc building source packages with downloading binary packages via pkgin.

Here is a successful test build with the image!

Preparing the CI

Zig CI does not depend on the upstream platforms' LLVM builds. Instead, a self-built LLVM is created via the zig-bootstrap project. With it, you can go from a C compiler to a fully functioning Zig compiler for any architecture. It achieves this via four large compile steps (with the relevant names I will use for them throughout this blogpost):

  • Compiling LLVM for the host system (“llvm-host”).
  • Compiling Zig for the host system (“zig-host”).
  • Compiling LLVM for the target system (“llvm-crosscompiled”).
  • Compiling Zig for the target system (“zig-crosscompiled”).

With, for example, the zig-bootstrap result for aarch64-linux-musl, you can start an ARM64 Linux CI, where zig-crosscompiled compiles the new commits of Zig. This is possible because Zig is a C compiler as well. Here's the script for Zig Linux CI on Azure, which shows such artifacts being used in a CI script ($CACHE_BASENAME).

For NetBSD, the process shouldn't be different (FreeBSD follows it as well), so what we need to do is get zig-bootstrap running on it. This process took a couple of days, with help of andrewrk and washbear, and long arduous moments of having to cyclically “Start the build, find something to do, mentally switch back to fixing the build”. It all paid off, and I generated fantastic quickfixes in the end! Here they all are!

The submitted PR is more of a discussion place about what can be done regarding my workarounds, rather than putting all of them into zig-bootstrap (because I am pretty sure some of those fixes would break... every other target, lol).

Spinning up new infrastructure for CI

builds.sr.ht has the idea of build runners for jobs, so you can dynamically add more capacity for CI as needed. Each runner can have N workers, and those are the ones running QEMU VMs.

As we started with FreeBSD, we learned that a single worker requires 16GB RAM (the actual compile requirements are approximately 8GB, but there are a lot of build artifacts, the filesystem for the VM is composed of both the image and a growing temporary ramdisk). And as time passed, we learned that 4 build workers can handle the throughput needed by Zig when it comes to FreeBSD (as of July 2021, this is true).

To have NetBSD support added to the mix, we would have to add 4 build workers to the network, which means finding 64GB of available RAM somewhere in our infrastructure. LavaTech operates in a hybrid cloud model: There are VPS'es in various cloud providers, but most of our services run in our own colocated hardware, shared with general programming. We are able to do so because of Hurricane Electric's free colo deal. Also, Peer with us!
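The RAM budget is plain multiplication, sketched below in Python. The 16GB-per-worker figure is the FreeBSD number quoted above; applying the same figure to NetBSD workers is an assumption, and the function name is made up for the example.

```python
RAM_PER_WORKER_GB = 16  # FreeBSD figure from above; assumed to hold for NetBSD too


def ram_needed_gb(workers: int, per_worker_gb: int = RAM_PER_WORKER_GB) -> int:
    """Total RAM a set of build workers needs, in GB."""
    return workers * per_worker_gb
```

`ram_needed_gb(4)` gives the 64GB mentioned above, and serving both FreeBSD and NetBSD with 4 workers each doubles that to 128GB.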

In general, our rack has two power supply units, one big momma computer (which we kindly hostnamed “laserjet”; it was close to being “HP-LaserJet-Enterprise-MFP”), and a bunch of blades.

There will not be a hard requirement for this list (or the infrastructure mentioned in this article) to be updated as time passes.

And this is the build runner allocation before NetBSD work:

  • runner1: Laserjet VM (2 workers)
  • runner2: Laserjet VM (2 workers)

After talking to our friends at generalprogramming, we were able to allocate two more runners:

  • runner3: GP Blade (2 workers)
  • runner4: GP Blade (2 workers)

However, they were unstable as time passed, and we decided to rent a server which would take care of both FreeBSD and NetBSD:

  • bigrunner: Hetzner Rented Server (8 workers)

runner1 and runner2 were decommissioned to decrease load on Laserjet. runner3 and runner4 were decommissioned because of their instability.

One infrastructure note to keep in mind is that we aren't using all of the available blades in the rack, because of power issues. The PSUs can handle 20A maximum current, and we have both of them so that we can stay redundant. Even though we have 40A theoretical max current to be used, we must only use 20A max, and we were already on that limit. Foreshadowing: A recent failure in the datacenter brought down one of the PSUs and our router was not plugged into both, causing an outage.
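The power budget follows the usual redundancy reasoning: only count the PSUs that would still be alive after the failures you want to survive. A small Python sketch of that arithmetic, with a made-up function name:

```python
def usable_current_amps(psu_count: int, amps_per_psu: float, tolerated_failures: int = 1) -> float:
    """Current we may draw while still surviving `tolerated_failures` dead PSUs."""
    surviving = psu_count - tolerated_failures
    if surviving < 0:
        raise ValueError("cannot tolerate more failures than there are PSUs")
    return surviving * amps_per_psu
```

With two 20A PSUs, `usable_current_amps(2, 20)` is 20A, while ignoring redundancy (`tolerated_failures=0`) gives the theoretical 40A. The budget only helps if every device is actually plugged into both PSUs.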

A collection of operation problems that happened in the build network while getting NetBSD CI up

Accidental Ramdisk in Production (July 12th 2021)

The issue was first identified by seeing jobs close successfully with “Connection to localhost closed by remote host”. This is bad (it should fail, really), but we knew that it was related to the OOM killer, as we had it before.

We noticed that the RAM usage on affected runners was 50%, even though no CI jobs were running on them. Before deciding to reboot, thinking it was some cursed I/O caching eating RAM (it wasn't, after checking free output), I checked df and found that the rootfs was a tmpfs, not something like ext4.

Turns out I accidentally installed those runners on the alpine livecd ramdisk. I didn't run setup-alpine. The motd as you enter a shell tells you to run it. I don't know how I missed it.

After backing up the important data (build logs), I was able to properly recreate them. Since I knew that more runners were bound to be created in the future, I wrote a pyinfra script to set everything up beforehand, and became a fangirl of it afterwards. Oh well, it happens.
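The check that exposed the problem (is the root filesystem a ramdisk?) can also be automated. Below is a hedged Python sketch that parses text in the format of Linux's /proc/mounts; it is not part of my pyinfra script, and the function names are made up for the example.

```python
from typing import Optional


def root_fs_type(mounts_text: str) -> Optional[str]:
    """Return the filesystem type mounted at "/", given /proc/mounts-style text."""
    fs_type = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each line is: device mountpoint fstype options dump pass
        if len(fields) >= 3 and fields[1] == "/":
            fs_type = fields[2]  # a later mount on "/" shadows earlier ones
    return fs_type


def is_ramdisk_root(mounts_text: str) -> bool:
    """True when "/" lives in RAM, as it does on a live CD that was never installed."""
    return root_fs_type(mounts_text) in {"tmpfs", "ramfs"}
```

On a runner, this would be fed the contents of /proc/mounts, for example `is_ramdisk_root(open("/proc/mounts").read())`.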

Network Timeouts (July 13th 2021 and beyond)

Sometimes connectivity to the builds service simply times out with curl: (7) Failed to connect to builds.hut.lavatech.top port 443: Host is unreachable.

There are high packet losses, maybe because of one of our network upstreams, but I can't dive deeper into this issue. Hopefully it isn't as frequent.

Docker networking mishaps (July 13th 2021)

Jobs simply didn't start with the following error:

docker: Error response from daemon: Conflict. The container name "/builds_job_unknown_1626160482" is already in use by container "90c8a6c874840755e9b75db5b3d9b223fbc00ba0a540b60f43d78378efc4376a". You have to remove (or rename) that container to be able to reuse that name.

The name of the Docker container running QEMU is decided in the control script, here's the relevant snippet of code:

        --name "builds_job_${BUILD_JOB_ID:-unknown_$(date +"%s")}" \

The script that actually boots the VM containing a CI job is the control script, usually found in /var/lib/images/control (when using the Alpine Linux Sourcehut packages, at least). Sourcehut documentation says that the user running that script should be locked down to only being able to run it so that vulnerabilities on the worker process don't lead to a privilege escalation. That is done via the doas utility, an alternative to sudo.

A missing parameter in the doas.conf file makes the control script run without any environment variables. Since BUILD_JOB_ID is missing, the container name falls back to date +"%s".

This issue didn't arise until NetBSD builds were merged into Zig. Before that happened, we only spawned a single job per PR: the FreeBSD one. However, after adding NetBSD, we started spawning two jobs per PR. Since they were close in time, the container names would be the same, one of them would work, and the other would crash with the aforementioned error.
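The collision is easy to reproduce outside of shell. This Python sketch mimics the `${BUILD_JOB_ID:-unknown_$(date +"%s")}` expansion from the control script; the function itself is made up for illustration, and the `now` parameter only exists so the example is deterministic.

```python
import time
from typing import Mapping, Optional


def container_name(env: Mapping[str, str], now: Optional[float] = None) -> str:
    """Mimic: builds_job_${BUILD_JOB_ID:-unknown_$(date +"%s")}"""
    job_id = env.get("BUILD_JOB_ID")
    if not job_id:  # ":-" falls back when the variable is unset or empty
        seconds = int(time.time() if now is None else now)  # whole seconds, like date +%s
        job_id = f"unknown_{seconds}"
    return f"builds_job_{job_id}"
```

Two jobs started within the same second with an empty environment get the same name, which is exactly the conflict Docker complained about. Once BUILD_JOB_ID reaches the script, the names are distinct no matter how close the jobs are in time.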

VM Settling times out (July 28th 2021)

Some VMs were timing out on their settling, but only when the machine was under load (such as after a merge spree in the Zig project). When running a CI job, a new VM is created running the specific OS image, then a “settling” task is run for it. The task attempts to SSH into the virtual machine, run echo "hello world", and if that works, send the rest of the commands (cloning the repo, running the scripts declared in the YAML manifest, etc).
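As a rough Python sketch of what a settling step looks like (this is not builds.sr.ht's actual implementation; the user name, host, port handling, attempt count and delays are all assumptions for the example):

```python
import subprocess
import time
from typing import Callable


def ssh_probe(port: int, user: str = "build", host: str = "localhost") -> bool:
    """Try to run `echo "hello world"` inside the VM over SSH; True if it worked."""
    try:
        result = subprocess.run(
            ["ssh", "-p", str(port), f"{user}@{host}", "echo", "hello world"],
            capture_output=True, text=True, timeout=10,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0 and result.stdout.strip() == "hello world"


def settle(probe: Callable[[], bool], attempts: int = 10, delay: float = 1.0,
           sleep: Callable[[float], None] = time.sleep) -> bool:
    """Retry the probe until it succeeds; False means settling timed out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)  # give the VM more time to boot
    return False
```

A caller would do something like `settle(lambda: ssh_probe(port))`, with `port` being whatever port the VM's SSH is forwarded to. When the host is overloaded, the VM boots slower than the retry budget allows and `settle` returns False, which is the timeout described here.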

We kept missing the timing on it, so by the time I connected to the runner, it was fine and stable. But on the date mentioned in the subtitle, we were able to catch it in real time, and saw that the load numbers were close to 16, even though the runner has 8 cores total.

Turns out the CI VMs have 2 cores allocated to them; that's hardcoded in the control script.

Current solution: Edit the control script to make the VMs be single-core.

Possible future solution: Buy more servers
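The numbers line up once you multiply them out. A small Python sketch of the oversubscription arithmetic, assuming all 8 workers of the runner are busy at once (function names are made up for the example):

```python
def vcpu_demand(workers: int, cores_per_vm: int) -> int:
    """vCPUs requested when every worker runs a VM at the same time."""
    return workers * cores_per_vm


def oversubscription(workers: int, cores_per_vm: int, host_cores: int) -> float:
    """Ratio of requested vCPUs to physical cores; above 1.0 the host is oversubscribed."""
    return vcpu_demand(workers, cores_per_vm) / host_cores
```

With 8 workers and 2 cores per VM, the demand is 16 vCPUs on an 8-core host (a ratio of 2.0), which matches the load of roughly 16 we observed. Single-core VMs bring the ratio down to 1.0.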

What next?

  • It works, and Zig provides master versions of the compiler for NetBSD!
  • It might take a while until I decide to add OpenBSD into the mix.
  • All of this should have a status page and log analysis to find CI failures that were caused by operation errors, instead of false-positives due to someone writing incomplete code in a work-in-progress PR.

A bit of history

In the current LavaTech infrastructure, we maintain a single application server that runs Proxmox VE, and from there we slice it up into containers for most of our applications, with some (like the ejabberd instance powering a3.pm) having their own dedicated servers.

Proxmox VE provides both LXC containers and KVM/QEMU virtual machines so we can use specific operating systems. One of the VMs provides FreeBSD CI infrastructure for the Zig project. The architecture for it is composed of two VMs: one that runs the Sourcehut instance on Alpine Linux, and another running FreeBSD for experimentation (such as compiling LLVM if needed) and also to update the Sourcehut instance's FreeBSD CI image.

That was the first proper contact with FreeBSD I had, lol.

Why?

Since then, I have been experimenting with NetBSD and FreeBSD, and one of the latest experiments was installing FreeBSD over a serial line.

There wasn't a production need to do it, as we don't have a project that needs thousands of FreeBSD VMs, but the serial line idea helps us because we wouldn't need to go the whole way through Proxmox VE's web admin panel to get a VNC session. In theory we should be able to just SSH to the host and manage that VM directly, without needing SSH inside the actual VM.

The experiment was successful and here are my findings.

Resources

Proxmox provides a way to attach to a VM that has a serial line, all documented here, so the bit that I had to dig through was FreeBSD's own interaction with the serial line.

I found a random blog post that helped in explaining the general idea of FreeBSD-install-in-a-serial-line, but it isn't directly applicable because its process uses USB installation media. That's something I did not want to research, because our mental model for creating VMs involves creating them with ISO files/CD/DVD, and Proxmox doesn't quite provide an obvious button for USB media.

From that finding, we know that we would need to create our own ISO file containing the serial line settings. I then found the installation manuals for FreeBSD 7.4 that talked about using the serial line for installation, all here; strangely, I couldn't locate that same documentation for 12.1-RELEASE.

Actual installation

After creating a modified ISO file with the process from the FreeBSD manuals (and from that blogpost, as I don't know if comconsole_speed="115200" is a default, I assumed not), it was straightforward to add it to the Proxmox VM image library, and boot up the VM. After doing qm terminal ... I got the bootloader messages, the bootloader screen, and was able to have it process my input to boot (it would automatically boot either way, but pressing a button and seeing things happen is a good feedback mechanism).

FreeBSD bootloader screen on serial line

While it was booting, though, we found an anomaly.

I typed cd9660:/dev/cd0, hoping it would work, because I didn't understand the issue that was happening (Ave suggested that maybe something related to ISO repacking was doing something bad here, but I didn't investigate it further). Sure enough, it worked, and I got to a selection of which terminal I am on, so that the installer can draw its boxes!

I recommend using xterm when possible, because ansi and vt100 can be hard to use and lead you to type wrong things in the wrong places.

Drawbacks compared to doing VGA

Serial line can get clunky, especially if you try to just tmux(1) inside of it (I was a reboot away from being unable to boot due to a broken rc.conf, as vi drew things on the terminal in the wrong ways), but it is enough to curl my SSH keys to a user, and use it inside a proper terminal.

I'm glad you asked (you did, I promise) because I can infodump for an hour.

hatsune miku votes too!

Huh???

Every 2 years, Brazil runs elections. One kind is the local elections, where someone votes for the city Mayor and the city's representative, and the other is where someone votes for the President, State Senator, State governor, and State deputies.

Local elections happened on November 15th, 2020, so what better day to start writing a blogpost describing how it works than today.

This is going to be a technical explanation of how things work. So I'm skipping many pre-election and post-election processes. I'm going from candidate loading until vote counting.

The courts

In Brazil, we have the TSE, Tribunal Superior Eleitoral, which translates directly to the Superior Electoral Court. It is the practical creator and maintainer of the voting systems used around the country. After TSE come the TREs, Tribunal Regional Eleitoral, which translates to Regional Electoral Court; they are the courts that actually collect the data from the voting machines back to TSE (more about that in the future of this post).

Pre-election things

Every Brazilian citizen between 18-70 years of age must vote. Citizens above 16 can vote. Not voting when you must vote means that you must pay a fine to TSE (if the fine is not paid, there are other consequences as well, such as being unable to get a passport issued, being unable to be admitted to a public university, etc). To be able to vote, a citizen must create their Voter Card (my translation of Título Eleitoral), and as of recent years, the Voter Card creation process involves taking a picture of the voter and a record of their fingerprints (more on that later).
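Written as code, the age rule looks like the Python sketch below. Treating the boundaries as inclusive (16 counts as "can vote", 18 and 70 count as "must vote") is my assumption here, and the function name and return strings are made up for the example.

```python
def voting_status(age: int) -> str:
    """Classify a citizen by age according to the rule described above."""
    if age < 16:
        return "cannot vote"
    if 18 <= age <= 70:
        return "mandatory"
    # 16-17 year olds and people over 70 may vote, but are not fined if they don't.
    return "optional"
```

So `voting_status(17)` and `voting_status(71)` are both "optional", while anything in between is "mandatory".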

In the creation process, the citizen is asked where they would want to vote. I don't know the criteria, but places can become unavailable as they might be full of voters. After your selection, you are assigned an Electoral Zone and an Electoral Session. The hierarchy goes as follows: State –> Location (city, suburb) –> Zone (number) –> Session (number). Schools are the most common place to be Electoral Zones.

After candidates for the respective year's positions are selected, they are loaded in blank voting machines. The loading process is done via a specific kind of USB flash drive with a specific plastic shell to guide it into the back of the machine (preventing people from trying to jam it in, then switch sides, then attempt again, the usual USB problem). After that is done, another USB flash drive (of same structure) is loaded in, and the spot it's on is locked, as that flash drive will contain votes.

Electoral Session pre-election-day ceremonies

The election days are selected by TSE, and are composed of a 1st and a 2nd round. A 2nd round happens when no candidate for a major position (such as mayor, president, or governor) reaches a majority (more than half of the valid votes) in a location with more than 200 thousand voters. The 2nd round only has the 2 leading candidates from the 1st round.
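The 2nd round rule can be written down as a tiny check. This is only a sketch of the rule as described in this post, reading "majority" as strictly more than half of the votes counted; the function names and the `RUNOFF_MIN_VOTERS` constant are made up for illustration and are not anything TSE publishes.

```python
from typing import Dict, List

# Threshold taken from the text: locations with more than 200 thousand voters.
RUNOFF_MIN_VOTERS = 200_000


def needs_second_round(valid_votes: Dict[str, int], registered_voters: int) -> bool:
    """Return True when the 1st round did not produce a winner by majority."""
    if registered_voters <= RUNOFF_MIN_VOTERS:
        # Small locations are decided in a single round under this rule.
        return False
    total = sum(valid_votes.values())
    leader = max(valid_votes.values())
    # A majority means strictly more than half of the votes counted.
    return leader * 2 <= total


def second_round_candidates(valid_votes: Dict[str, int]) -> List[str]:
    """The 2 leading candidates from the 1st round."""
    ranked = sorted(valid_votes, key=lambda name: valid_votes[name], reverse=True)
    return ranked[:2]
```

For example, with votes of 40/35/25 in a city of 300 thousand voters nobody has a majority, so the two candidates with 40 and 35 votes go to the 2nd round.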

On election day, the mesários (which are voluntary workers for TSE/TRE to manage the election) assigned for a specific Electoral Session, with relevant managers, do the starting ceremonies for the voting machine the respective TRE gave them. That ceremony happens before the actual hours of the election day (election day happens from 07:00 (7AM) until 17:00 (5PM)), so between 6-7AM.

The setup ceremonies involve testing the machine's hardware and making it print a report that it does not contain any votes inside. That report is called the Zerésima. You can see examples of them on the TSE website.

After those ceremonies, the session is ready to accept voters.

Incoming voter process

When a new voter comes into the voting session, they must have some form of identification of themselves and their “voter selves”. For the former, they can use either TSE's digital voter card app, the E-Título (recommended during the pandemic, and it only works for voters that registered a picture of themselves), or any government-issued ID document (passport, national ID card, driver's license). For the latter, either E-Título or their voter card works.

Upon showing their identification, one of the Mesários checks the identification provided against the session's book. That is a physical paper notebook that possibly contains a picture of the voter, their name, and their voter card number. When a match is found, the voter number is told to the Mesário operating the Terminal. Depending on whether the voter had their fingerprint registered or not, the Terminal requires the voter to put their finger on its reader. (This bit may be disabled entirely, considering COVID. As an example, I was not asked to put my fingerprint in for 2020's elections, even though I did in 2018's.)

The Terminal controls the voting machine in terms of “starting a voting session”, “ending the session”, “stopping a vote if the voter didn't finish their vote”, “requesting accessibility options”, etc.

After voter identification, the voting machine is now ready to receive the vote.

Voting machine architecture (physical)

The voting machines are all manufactured by TSE and TSE contractors and sent out to TREs around the country in the months leading to the election.

Here's a picture of it:

[image: Brazilian voting machine]

You can use a little simulation TSE made using highly advanced JavaScript (note: it's entirely in Portuguese).

Voting machines do not have any networking equipment (Wi-Fi, Ethernet, Bluetooth, etc). Their only connections are to the Mesário Terminal and a power cable. Voting machines have batteries as well, to handle remote locations.

Voting machine software architecture (high-level)

The voting machine runs Linux with a highly customized userland made by TSE.

Inside the voting machine there is the Registro Digital do Voto (RDV, Digital Vote Registry). That file can be thought of as a spreadsheet where the columns are the political positions, each row is a voter, and each cell is a vote for that political position. The vote, when cast by the voter, is randomized amongst the other votes, emulating a physical ballot box being shuffled around. Portuguese high-level explanation by TSE.
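To make the "shuffled spreadsheet" idea a bit more concrete, here's a toy model of it. This is my own illustration, not TSE's actual file format or code: the `VoteRegistry` class, its methods, and the choice of placing each cell in a random free slot per column are all assumptions.

```python
import random
from typing import Dict, List, Optional


class VoteRegistry:
    """Toy RDV: one column per position, with a fixed number of empty cells."""

    def __init__(self, positions: List[str], capacity: int,
                 rng: Optional[random.Random] = None) -> None:
        self.positions = positions
        self._columns: Dict[str, List[Optional[str]]] = {
            position: [None] * capacity for position in positions
        }
        # SystemRandom draws from the OS entropy source; a seeded Random can
        # be passed in for reproducible tests.
        self._rng = rng or random.SystemRandom()

    def cast(self, ballot: Dict[str, str]) -> None:
        """Store each choice in a random free cell of its position's column."""
        for position in self.positions:
            column = self._columns[position]
            free = [i for i, cell in enumerate(column) if cell is None]
            if not free:
                raise RuntimeError("registry is full")
            # Every column picks its own slot, so neither the order of
            # arrival nor the row links a vote to a voter.
            column[self._rng.choice(free)] = ballot[position]

    def column(self, position: str) -> List[str]:
        """All votes recorded for a position, in storage order."""
        return [cell for cell in self._columns[position] if cell is not None]
```

Casting `{"mayor": "13", "councillor": "13000"}` and then `{"mayor": "45", "councillor": "45000"}` leaves both votes in each column, but their position in the column says nothing about who voted first.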

After a single vote

After the voter has cast their vote, they're asked to sign the notebook and the Mesário gives them a piece of paper, taken from the notebook itself, which is the proof that voter actually voted. The voter can now leave the session, and the next one comes in. This is the cycle that happens until 5PM.

Electoral session post-election-day ceremonies

At 17:00 (5PM), all voting sessions are due to stop soon. If there are any remaining voters in the queue, they must be attended (which means the session might stop after 5PM), but no new voters will be attended (so the actual place closes its doors, but the sessions inside are still open).

After all voters have gone through, the Mesários issue the command to make the voting machine stop. When that happens, it prints out the total amount of votes that were cast in that session, as in, “candidate X got N votes”, “candidate Y got M votes”, etc. This report is called “Boletim de Urna”, and I'll keep its translation as Vote Report.

The Vote Report is generated from the RDV file cited earlier, and it is printed by the voting machine up to 11 times. The first copy is a check to see if the printer of the machine is operational, then come 5 required copies, then 5 optional copies that may be requested by political party representatives, the media, or the Ministério Público. Other reports printed are the Justificative Report, and the Mesário Identification Report.
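As a rough sketch of what "generated from the RDV" means, here's a tally over an RDV-like structure (a mapping from position to the list of votes stored for it). The data layout and the function names are assumptions for illustration only; the real report carries much more than this.

```python
from collections import Counter
from typing import Dict, List

# 1 printer check + 5 required copies + 5 optional copies, as described above.
MAX_COPIES = 1 + 5 + 5


def vote_report(rdv: Dict[str, List[str]]) -> Dict[str, Dict[str, int]]:
    """Count the votes per candidate for every position."""
    return {position: dict(Counter(votes)) for position, votes in rdv.items()}


def format_report(report: Dict[str, Dict[str, int]]) -> str:
    """Render the tally as 'candidate X got N votes' lines."""
    lines = []
    for position, tally in report.items():
        lines.append(position)
        # Most voted first, ties broken by candidate number.
        for candidate, count in sorted(tally.items(), key=lambda kv: (-kv[1], kv[0])):
            lines.append(f"  candidate {candidate} got {count} votes")
    return "\n".join(lines)
```

Since the tally only depends on the contents of the columns, the shuffling done when votes are stored doesn't change the result.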

All 5 required copies of the Vote Report and the Justificative Report are signed by the President of the Voting Session, the Mesários, and other Inspectors. Mesários must sign the Mesário Identification Report.

The Vote Report MUST be publicly available, either through physical means, where one of the copies is put up in the voting session itself, or through digital means, on the TSE's website. The Vote Report contains a QR code, which means you can cross-reference the tally made by the voting machine with what the TSE received.

Another copy is given to one of the Inspectors in the session, or the media, or a representative from Ministério Público. Remaining reports are sent back to TREs.

After all of those are finished, the voting machine displays that the session is over, and can now be turned off by the President of the voting session.

Vote Transfer

The Vote Report, while it is sent to TREs, is not the way votes are processed. The voting machine writes all of the information in its Vote Report to the locked-up USB flash drive. The seal and lock are undone under the supervision of the session's President, and all flash drives for the Zone are sent physically back to TRE.

On the TRE, those flash drives are loaded for final counting, their data is sent back to TSE via a separate network (network is provided by Embratel), and from there, results are distributed via online/TV to everyone in the country. Here's the website for the 2020 Local Elections.

Before actually starting this I want to say that programming language discourse is 99% of the time based on subjective experience. The writing here is my subjective experience dealing with Python. Python may be good for you, or bad for you, or whatever. Replacements beware.

Time changes, people change

I started learning Python after attempting to learn C but failing, back in approximately 2013. But I would only consider writing Python for internet-facing production in 2017.

This rant has hindsight bias, for sure. But a good point on Python is that I didn't quite know how to program things before learning it. Sure, I did write massive if chains with C, but nothing more complex, because as soon as I started reading the pointers chapter I would get confused.

Nowadays I'm not that confused with C pointers, or the difference between the stack and the heap (thanks Zig), and I'm a different person, with different goals and needs on what I want my software to be, or, what its principles are. But, even though I'm a different person, I'm still locked in to the younger me with Python, and now that I can put “5 years of experience writing Python” on my CV, I don't think this will go away any time soon.

A non-exhaustive list of things that keep me sad while writing Python

I could go on and enumerate things I'm not happy with and that I ultimately don't have power to fix, so here we go:

  • Typing is sad. It tries to give you static type checking without actually breaking the language. I mean, yes, typing is good, but in the end you can declare a function that takes an int and give it a string and you'll only know something is wrong at execution time if you don't have mypy, and so you go to install mypy, and then it doesn't find some library, or a library doesn't have typehints, or it decides to never want typehints, etc, etc, etc.
    • A subproblem of typing is using things like typing.Protocol. Which I tried to use, once. It type passed afterwards, but I feel dirty after writing something like that. I wish I could elaborate further on why, other than that general description of “something's wrong” while looking at it.
  • Packaging. God. Packaging. After going through requirements.txt, setup.py, poetry, dephell, pipenv, I just gave up and settled on setup.py since it's the only guaranteed thing that might work.
  • Not being happy with the “large” frameworks AKA Django, meaning I have to go to things like Flask, but then there's asyncio, so I'm going to Sanic, then Sanic is its own hellish thing, and I'm now comfortable on Quart. And by using a niche of a niche of a niche web framework, I'm having to create my own patterns on, for example, how to hold a database connection by just putting app.db = ... on @app.before_serving. You can be sure mypy does not catch anything about that at all. Love it.
  • A bug in a completely separate part of my software was caused by a bump in dependencies without bumping my Python version on CI. I kept banging my head for hours until I gave up and updated from 3.7 to 3.9.
  • Endless language constructs seem to be the norm now. I don't like that we now have 4 different ways to merge dictionaries together, because the existing one didn't have “a good syntax”. That feels like a really low bar for “new” language constructs.
    • This is new on 3.9, so it's likely the two last entries are more of rants than critiques, idk.
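Two of the points in the list above are easy to show in a few lines. This is only a sketch: the `increment` function is made up, and the four merge spellings are my guess at which ones are meant (the last one needs Python 3.9).

```python
def increment(x: int) -> int:
    return x + 1


# The annotation is not enforced by the interpreter: a checker such as mypy
# would reject this call, but without it the mistake only shows up when run.
try:
    increment("1")
    failed_at_runtime = False
except TypeError:
    failed_at_runtime = True

a = {"a": 1, "b": 2}
b = {"b": 3, "c": 4}

# 1. copy, then update in place
merged_update = dict(a)
merged_update.update(b)

# 2. dict() with keyword arguments (only works when the keys are strings)
merged_kwargs = dict(a, **b)

# 3. unpacking inside a dict display
merged_unpack = {**a, **b}

# 4. the union operator added in Python 3.9
merged_pipe = a | b
```

All four produce `{"a": 1, "b": 3, "c": 4}`, with values from the right-hand dictionary winning on duplicate keys.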

The future?

I don't know. None of those things look like they will ever be fixed. I'm seeing if I can write webshit with Zig, but that's all experimental. Might take much, much longer to “ship” but I feel that by not having the ecosystem or an interpreter slowing me down I'd have a better development experience.

I can't deny that I'm now with years-long experience with Python and I have to put something in my curriculum, though.

The tech industry makes me sad.

Or, “Monolith First, Complicated Thing Later, Also Plan Before Switching, And How Those Things Have Been Known For Ages, HELP”

I want to start this with Gall's law, from General Systemantics:

A complex system that works is invariably found to have evolved from a simple system that worked. The inverse proposition also appears to be true: A complex system designed from scratch never works and cannot be made to work. You have to start over, beginning with a working simple system.

When I see large scale distributed systems being written with Kubernetes or whatever, I have the gut feeling Gall's law wasn't followed, though my sample size is very small: 1.5 systems. 0.5 comes in because the other system I'm managing isn't Kubernetes (yet, I'm sure it'll become that after a while).

I consider that jumping from a monolith line of thought to a microservice line of thought, for any team or system, is a very large jump that must be scrutinized before actually doing it. Microservices are no silver bullet, mostly because developers are taught on writing monoliths for their whole life, and then you push up microservices, which come with a lot of specific tooling, and as years pass, even more tooling is created, to the point enumerating them all can be absolute sensory overload. Really.

My Bad Kubernetes Experience

In one of the systems I was developing on, I wasn't actually messing with Kubernetes directly, but, regardless, I had a very poor experience integrating with it. The monolith app I developed was inside a VPS, and had to push data to MongoDB and Redis nodes which were living inside the Kubernetes cluster. The person that managed it told us to use kubectl port-forward, and sure, we tested it out, and it worked, at least at the start, but we all know how it goes from here.

While the system was operating at spike load, the cluster restarted itself, and kubectl port-forward lost connectivity with it. In my mental model, it would reconnect to the cluster: it has authentication, it doesn't need to hold any state. Yet it didn't, and our app was unable to connect to Redis (with a very cryptic ECONNRESET in the app logs) for a long time while we attempted to debug it. We asked the person operating the cluster how Redis was doing, and they told us it was up. In the end, a restart of the systemd unit that did the port-forward “fixed” it.

After talking to a friend about that situation (a long time later), they told me port-forward is “a broken ssh tunnel that might not even work”, and “it's not meant to be used for anything other than testing”. So I could have used some other Kubernetes solution which wouldn't have made me write three paragraphs.

My Bad Microservices Experience

I think I can say this one is caused by bad management wanting to put “novel” things on teams that are very new, don't know anything about what a container actually is, or don't know what distributed system design actually entails in terms of the failure modes of such a system.

When you jump into microservices, and design with them in mind, you have to know that you're designing a distributed system, and while the units composing that system are small and simple and understandable, the whole can be much, much more complicated (see: natural systems in the whole world, including human-made ones!). You must also know all the “fallacies of distributed systems”, which can become somewhat cliché, but are actually quite true. At this rate, if I'm developing a distributed system, I should print out the list and put it next to me and every other developer. I've seen them happen, even though the fallacies have been enumerated since... 1997? It's weird how we're still having issues that come from those fallacies in 2020. We should do better.

I think this boils down to “people should learn a little bit before jumping in the microservice hype”, and in most cases, Gall's law still applies. I believe that things should be written first as a monolith, to have the developers learn the business logic in an environment where function calls aren't remote everywhere. And if scaling is required, then proceed to investigate a distributed architecture, if you can actually scale there, or do something else.

Offloading to the database

Another data point from overall system design is that you shouldn't offload everything to the database, as in, make every microservice need it for any operation, because when things like autoscaling jump in and you now have 10 times the amount of microservices, your database will suffer 10 times the load. That might catch fire.

Though if the database does catch on fire AND is a managed database solution, you can blame the cloud company providing it instead of yourself (hrrrrm Discord loved to do this on their postmortems, but nowadays they don't even give any postmortem. It's sad).

Disclaimer: I do not work for Discord, or with Discool. If I wanted to look cool I would call myself an “independent security researcher”, but in the end I just stare at Chromium's DevTools while Discord is open for an hour.

What are Intents?

Discord unveiled Gateway Intents a while ago. A PR on the Discord API Docs repository was created to help library developers start adding support for Intents, and it provides some answers to questions developers raised.

The general idea of Intents is that instead of having all events thrown at your gateway connection, only the events you actually want are sent, which gives huge wins in general network usage. Most bots get a ton of typing events while being unable to do anything about that, for example. There was a stop-gap measure for this via the “guild_subscriptions” field, but the general idea is that Intents is the way to go, as it's more generic.

From all intents, two of them are considered “Privileged”: members and presence, providing the member list and user status (online/offline/what game they're playing) respectively.
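On the wire, intents are a bitfield sent when the bot identifies itself to the gateway. Here's a small sketch of that with a handful of flags; the bit positions are the ones I remember from the Discord API docs, so double-check them there, and the `Intents` class and `identify_payload` helper are made-up names, not part of any library.

```python
from enum import IntFlag


class Intents(IntFlag):
    # Only a few of the documented flags are listed here.
    GUILDS = 1 << 0
    GUILD_MEMBERS = 1 << 1        # privileged: member list
    GUILD_PRESENCES = 1 << 8      # privileged: user status
    GUILD_MESSAGES = 1 << 9
    GUILD_MESSAGE_TYPING = 1 << 11


PRIVILEGED = Intents.GUILD_MEMBERS | Intents.GUILD_PRESENCES


def needs_privileged(intents: Intents) -> bool:
    """True if any of the requested intents is a privileged one."""
    return bool(intents & PRIVILEGED)


def identify_payload(token: str, intents: Intents) -> dict:
    """Trimmed-down Identify payload; other required fields are left out."""
    return {"op": 2, "d": {"token": token, "intents": int(intents)}}


# A bot that wants guild and message events, but no typing events.
wanted = Intents.GUILDS | Intents.GUILD_MESSAGES
```

With this, `int(wanted)` is 513, and leaving `GUILD_MESSAGE_TYPING` out of the bitfield is what stops the typing events from being sent at all.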

I'm writing this on October 3rd, 2020 (with many drafts afterward). I did have a lot of time in my hands to study about Intents but I'm only doing it right now and sharing the results since I had to port my personal bot to it.

What's the actual deal/reasoning with Privileged Intents?

From the original blogpost:

We believe that whitelisting access to certain information at scale, as well as requiring verification to reach that scale, will be a big positive step towards combating bad actors and continuing to uphold the privacy and safety of Discord users.

To use Privileged Intents, your bot:

  • must be in fewer than 100 guilds, or
  • if it is in more than 100 guilds, you must be a “verified developer” with a “verified bot”. If you attempt to bring your bot down to less than 100 guilds after that, you can't have the intents back.

The verification process involves filling a form describing your bot's functionality, and giving a government-issued identification document via Stripe Identity. Depending on your country, your passport may be the only available option. You can check that information on the Stripe Identity documentation.

There is a grace period before bots are forced to be verified: October 7th, 2020. Bots beyond 100 guilds past that date will not be able to join any more guilds.

From the blogpost, and the surrounding timing on it, I can safely say the creation of Privileged Intents is to provide an answer to what I dub the “discool discourse”. They haven't said that publicly, and I can only base this on feeling alone.

Discool was a service that stored a lot of Discord user metadata. You could put a User ID in, and get the list of guilds they were in (of course, not all guilds, but a lot of public guilds had their data scraped).

The way they did it was by creating burner user accounts able to join those guilds, scraping all the data already available to users (the Discord client needs the member list after all), and store them in a database. Give enough time and effort, and you have a pretty big privacy scandal regarding user data and Discord.

From my perspective, Privileged Intents were made so that developers who create a large-scale privacy scandal are legally bound to Discord, and can be taken to court, and etc, etc, etc.

Are the Privileged Intents a wrong way to protect data?

I think so, or at least, very flawed. As I said, Discool used user accounts, they never used bots in the first place, so any Intent business would never get to those bad actors. Of course, Discord has been implementing various kinds of user extra-verification with “machine learning”, but the point is that anyone with a browser can start up Selenium, point it to Discord, and start scraping. Hell, given enough time and effort someone could gcore the browser process to scrape data while still being classified as having “human sentient behavior”.

Even in the case of an actual privacy scandal on Discord, being limited to 100 guilds doesn't remove the damage: it's technically fewer users, but it's the same level of data, and you can still fuck up the lives of some people. I think Discord only cares about making the numbers smaller.

When giving data out to anyone, you should assume the worst, because you don't control their intent (pun unintended). Sure, requiring ID and so providing the existing legal system as a protection against misuse of user data works with more than 0% efficacy, but I don't think that is a good solution.

In regards to giving out my own data, I'm not sure I can trust Discord to handle it. I lost my trust after they explained their data collection (the famous /api/science route, formerly /api/tracking, which got blocked by extensions to the point they had to rename it): if turned off in the client, the requests would still continue (you can prove this with devtools), and their FAQ post just said that the data is dropped on the server.

The nitty gritty: when you turn the flag off, the events are sent, but we tell our servers to not store these events. They're dropped immediately — they're not stored or processed at all. The reason that we chose to do it this way is so that when you turn it off on your desktop app it also turns off automatically on your phone – and vice-versa. This allows us to keep things the same across all of our apps and clients, across upgrades.

Believe me when I say, since they already sync user configuration across clients, it's easy to make the app disable tracking upon seeing the already existing configuration option to disable tracking.

Learning to bypass limitations imposed by Privileged Intents

For context: the library that I'm using for my bot, discord.py, includes a full command framework, discord.ext.commands. In it you can declare a command like this:

from discord.ext import commands

@commands.command()
async def some_command(ctx, argument1, argument2):
    ...

One specific feature of the framework is the ability to convert from raw strings to useful objects, for example, if I wanted to receive a user, I could say

async def some_command(ctx, user: discord.User):
    ...

and discord.py will convert either a mention, or an ID, or the username back to the User object. This is possible because Discord gives all member data to the bot upon the bot's startup. It is fully dependent on the members privileged intent.

Without the members intent, the member lists are basically empty, and the feature doesn't work anymore.

Bringing the feature back

Discord's message events contain full user objects (which have ID, name, avatar, discriminator, etc) and member objects (which contain roles, nickname, etc), since the client must be able to draw out the UI containing that user.

Even though you don't have full member lists on startup, you can use that fact to start passively collecting data and re-create the feature. It works like this: on any message, store the user in persistent storage. When needed, look the user up from that storage, and maybe put it in cache for performance reasons.

Considering message events also contain the guild's ID, any message that contains an author and a guild ID tells you that the author is a member of that guild, so you can still do the same thing Discool did. It is technically less data than originally, since you're only collecting data about the people that are actually sending messages in the guild.
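Here's a minimal sketch of that passive collection, working on raw message payloads as plain dicts instead of going through discord.py. The `seen_users` table and the function names are made up for this example; the payload fields used (`guild_id`, `author.id`, `author.username`) and the `<@id>` / `<@!id>` mention format are the ones Discord documents.

```python
import re
import sqlite3
from typing import Optional

MENTION_RE = re.compile(r"<@!?(\d+)>")


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the persistent storage for users seen in messages."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS seen_users ("
        " guild_id TEXT NOT NULL,"
        " user_id TEXT NOT NULL,"
        " username TEXT NOT NULL,"
        " PRIMARY KEY (guild_id, user_id))"
    )
    return db


def record_message(db: sqlite3.Connection, message: dict) -> None:
    """Remember the author of a guild message; DMs have no guild_id."""
    author = message.get("author")
    guild_id = message.get("guild_id")
    if author is None or guild_id is None:
        return
    db.execute(
        "INSERT OR REPLACE INTO seen_users VALUES (?, ?, ?)",
        (guild_id, author["id"], author["username"]),
    )
    db.commit()


def resolve_user(db: sqlite3.Connection, guild_id: str, argument: str) -> Optional[str]:
    """Turn a mention, an ID, or a username into a user ID, if we saw them."""
    match = MENTION_RE.fullmatch(argument)
    if match:
        argument = match.group(1)
    row = db.execute(
        "SELECT user_id FROM seen_users"
        " WHERE guild_id = ? AND (user_id = ? OR username = ?)",
        (guild_id, argument, argument),
    ).fetchone()
    return row[0] if row else None
```

In a real bot, `record_message` would be called from the message event handler, and `resolve_user` would back a custom converter. It only knows about people who have sent at least one message since the bot started collecting.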

Keep in mind that messages being sent also include the system messages created upon a user's join (they contain the same user object). They're messages like any other, and will reach any bot in the guild. You can have the same effect with bots that create welcome messages to new users, as they usually put a mention to the user (“welcome @user to $guild!”), and since mentions internally are represented by putting the user ID on the message, bots can still scrape it up.

You can also use that same method to extract relationships from typing events as well, as they contain guild ID and user ID.

Future work

New Discord features

UPDATE: Two months after this blogpost, in December 2020, Discord released slash commands, which is the finalization of the work-in-progress that Discord shared.

Recently I've seen work-in-progress inside the Discord client on a way for bots to declare their own commands (you can also see the Figma prototypes on the “The Future of Bots” blogpost), removing the need for bots to have every single message being sent to them (it currently works with bots checking in if the message starts with a given prefix, etc).

It is definitely a better idea than the current state of things, but I'm not sure if the boundary of “hey, your user data is now in the hands of the bot developer” is very well defined. Many users would still use the bot, even once, and have their user info stored on a database. I'm not using it for malicious intent, but if we're talking about the technical purpose of Privileged Intents, we must assume every bot is malicious, and design around that.

discord.py using Request Guild Members for the User and Member converters

See this issue. It would sidestep the problem of having to extract info out of messages, which would help my usecase immensely.

I don't know if it'll be implemented, and I'm too anxious to open an issue on discord.py to use it, but it is a nice idea.

Closing thoughts

this music is fucking awesome

I tried, as an experiment/curious endeavour, to run osu!lazer (2020.717.0) on a NetBSD 9.0 QEMU/KVM guest, on my machine. Here's the results:

  • Follow the Linux Emulation NetBSD guides, installing the linux compatibility packages (suse_*) was good enough to continue.
  • FUSE is not included by default, but thankfully, the AppImage gives instructions to extracting itself into a folder.
  • An AppRun file exists in the folder. Trying to execute it would give “Failed to initialize CoreCLR, HRESULT: 0x80070008”. It is related to ulimit -n: the default is way lower than 2048.
  • Raise the system-wide file descriptor limit using sysctl kern.maxfiles, then bring up the hard limit (ulimit -Hn), then bring up the soft limit (ulimit -Sn). If I recall correctly, I bumped it up to 8000.
  • Some other failure happened while running AppRun, upon further inspection via ktrace/kdump, a sched_setaffinity syscall failed because of permissions.
  • Running it as root fixed that.
  • A traceback appeared regarding ICU libraries not being found, even though I have them installed.
  • You can edit osu!.runtimeconfig.json to disable that requirement. More here
  • osuTK kept talking about unsupported platforms. Upon further investigation, it was because no valid GPUs were found. A very cryptic error, for sure.
  • In the end, after some more discussion, we blamed the fact I was running NetBSD as a KVM guest as the culprit. virt-manager only provides QXL, VGA, and virtio, and virtio gpu drivers aren't on NetBSD 9.0.
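The file descriptor step from the list above can also be checked from a small launcher script instead of by hand. This is a sketch and not what I actually ran: `ensure_fd_limit` is a made-up helper using Python's `resource` module, and 2048 is just the figure mentioned above.

```python
import resource

# Figure taken from the list above.
REQUIRED_FDS = 2048


def ensure_fd_limit(required: int = REQUIRED_FDS) -> int:
    """Raise the soft open-files limit to `required` if it is lower.

    Returns the soft limit in effect afterwards. The hard limit is left
    alone, since raising it needs privileges (ulimit -Hn as root).
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft >= required:
        return soft
    if hard != resource.RLIM_INFINITY and hard < required:
        raise RuntimeError(
            f"hard limit {hard} is below {required}, raise it first"
        )
    # An unprivileged process may raise its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_NOFILE, (required, hard))
    return required
```

Calling `ensure_fd_limit()` before spawning AppRun would have the child process inherit the raised limit, which is the same effect as running ulimit -Sn in the shell first.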

Maybe I'll run NetBSD as a host on one of my machines and keep this experiment running. Until then, that's what I got.

To be continued..?

awawawaw

beep boop

welcome to writing moment

  • lunlunlunlunluna