LavaTech WriteFreely

Reader

Read the latest posts from LavaTech WriteFreely.

from is it gpt-3, or is it just fantasy?

Hello! If this is your first time reading this blog: I'm luna, one half of LavaTech, and we provide continuous integration (CI) services for the Zig programming language.

We started doing this because Zig could no longer use its pre-existing CI service for FreeBSD builds, sr.ht: the compiler became very RAM-intensive and hit the memory limits of their CI. We stepped in, and we now provide a self-hosted Sourcehut instance where FreeBSD builds can keep happening while the self-hosted compiler is not finished yet. (Genuinely, I have no idea what will happen once the memory usage is brought down, but for now I'm happy to provide the service, and wish to continue.)

On a sunny day with nothing much to do, I decided to bring better NetBSD support to Zig, and one of the areas where I could learn something is the one I already operate in: CI.

This blogpost outlines the things I had to do to bring NetBSD to Zig CI.

Some background, or, How does CI work

Sourcehut is a “software forge”: a collection of tools to manage codebases, either publicly or privately. sr.ht is the main instance of Sourcehut, operated by its original creator, Drew DeVault.

builds.sr.ht, the CI service of Sourcehut, works by providing QEMU/KVM virtual machines that run specifically-crafted operating system images; inside them, your project is built and the resulting artifacts can be tested, all by executing commands over SSH into the VM. Once all the commands run without errors, the build has “passed”. An example of builds.sr.ht on the main instance can be seen here.

The list of available VM images can be seen on the compatibility page of builds.sr.ht, and there you can find Alpine Linux, Arch Linux, FreeBSD, among others (hell, even 9front!).

NetBSD on sr.ht

Back when I was developing support for it, the compatibility page did not mention NetBSD; thanks to Michael Forney, it now does!

The history for it is as follows:

  • The initial versions of the NetBSD image were created by Drew DeVault. However, they weren't really finished.
  • I began working on finishing it.
  • Michael Forney also began working on it, in parallel. Their patch has been successfully upstreamed!
  • Even though we worked in parallel, our changes are mostly equivalent, so there isn't a need to upstream mine:
    • Replacing anita (the automated NetBSD installer) with directly downloading the binary sets and partitioning the virtual disk. This removes the QEMU requirement from the build script, because I am building the image in a NetBSD VM itself, and I was not sure how to bring nested virtualization to it.
    • Replacing pkgsrc source package builds with downloading binary packages via pkgin.
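As an illustration of that last change (the package names here are hypothetical), pulling prebuilt binaries with pkgin is a one-liner instead of a long pkgsrc source build:

# Fetch prebuilt packages from the pkgin repository instead of
# compiling everything from pkgsrc inside the image build.
pkgin -y install bash curl cmake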

Here is a successful test build with the image!

Preparing the CI

Zig CI does not depend on the upstream platforms' LLVM builds. Instead, a self-built LLVM is created via the zig-bootstrap project. With it, you can go from a C compiler to a fully functioning Zig compiler for any architecture. It achieves this via four large compile steps (with the relevant names I will use for them throughout this blogpost):

  • Compiling LLVM for the host system (“llvm-host”).
  • Compiling Zig for the host system (“zig-host”).
  • Compiling LLVM for the target system (“llvm-crosscompiled”).
  • Compiling Zig for the target system (“zig-crosscompiled”).
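As a sketch (check the zig-bootstrap README for the exact interface, which may have changed since), a bootstrap run boils down to handing the build script a target triple and a CPU name:

# Clone zig-bootstrap and build the whole chain for one target.
git clone https://github.com/ziglang/zig-bootstrap
cd zig-bootstrap
# <arch>-<os>-<abi>, plus a CPU name (“baseline” for the generic CPU).
./build aarch64-linux-musl baseline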

With, for example, the zig-bootstrap result for aarch64-linux-musl, you can start an ARM64 Linux CI, where zig-crosscompiled compiles the new commits of Zig. This is possible because Zig is a C compiler as well. Here's an example of such artifacts being used in a CI script: the Zig Linux CI script on Azure (see $CACHE_BASENAME).

For NetBSD, the process shouldn't be any different (FreeBSD follows it as well), so what we needed to do was get zig-bootstrap running on it. This took a couple of days, with the help of andrewrk and washbear, and long, arduous cycles of “start the build, find something else to do, mentally switch back to fixing the build”. It all paid off, and I generated fantastic quickfixes in the end! Here they all are!

The submitted PR is more of a discussion place about what can be done regarding my workarounds, rather than putting all of them into zig-bootstrap (because I am pretty sure some of those fixes would break... every other target, lol).

Spinning up new infrastructure for CI

builds.sr.ht has the idea of build runners for jobs, so you can dynamically add more capacity for CI as needed. Each runner can have N workers, and those are the ones running QEMU VMs.

As we started with FreeBSD, we learned that a single worker requires 16GB RAM (the actual compile requires approximately 8GB, but there are a lot of build artifacts, and the VM's filesystem is composed of both the image and a growing temporary ramdisk). As time passed, we also learned that 4 build workers can handle the FreeBSD throughput Zig needs (true as of July 2021).

To add NetBSD support to the mix, we would have to add 4 build workers to the network, which means finding 64GB of available RAM somewhere in our infrastructure. LavaTech operates in a hybrid cloud model: there are VPSes in various cloud providers, but most of our services run on our own colocated hardware, shared with General Programming. We are able to do so because of Hurricane Electric's free colo deal. Also, peer with us!

In general, our rack has two power supply units, one big momma computer (which we kindly hostnamed “laserjet”; it was close to being “HP-LaserJet-Enterprise-MFP”), and a bunch of blades.

Don't expect this list (or the infrastructure mentioned in this article) to be kept up to date as time passes.

And this is the build runner allocation before the NetBSD work:

  • runner1: Laserjet VM (2 workers)
  • runner2: Laserjet VM (2 workers)

After talking to our friends at General Programming, we were able to allocate two more runners:

  • runner3: GP Blade (2 workers)
  • runner4: GP Blade (2 workers)

However, they became unstable as time passed, and we decided to rent a server to take care of both FreeBSD and NetBSD:

  • bigrunner: Hetzner Rented Server (8 workers)

runner1 and runner2 were decommissioned to decrease load on Laserjet. runner3 and runner4 were decommissioned because of their instability.

One infrastructure note to keep in mind: we aren't using all of the available blades in the rack, because of power issues. Each PSU can handle 20A of current at maximum, and we have two of them so that we can stay redundant. Even though there is a theoretical maximum of 40A, we must only draw 20A, and we were already at that limit. Foreshadowing: a recent failure in the datacenter brought down one of the PSUs, and our router was not plugged into both, causing an outage.

A collection of operation problems that happened in the build network while getting NetBSD CI up

Accidental Ramdisk in Production (July 12th 2021)

The issue was first identified by seeing jobs close successfully with “Connection to localhost closed by remote host”. This is bad (the job should fail, really), but we knew it was related to the OOM killer, as we had seen it before.

We noticed that RAM usage on the affected runners was at 50%, even though no CI jobs were running on them. Before deciding to reboot, I first suspected some cursed I/O caching eating RAM (it wasn't, as checking free output showed), then I checked df and found that the rootfs was a tmpfs, not something like ext4.
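For reference, this is the kind of two-command check that gives it away (a sketch; your mount layout may differ):

# Is the root filesystem an actual disk, or a ramdisk?
df -h /     # the Filesystem column reads “tmpfs” on a live-CD root
free -m     # confirms the used RAM is not just page cache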

Turns out I had accidentally installed those runners on the Alpine live CD ramdisk: I didn't run setup-alpine. The motd tells you to run it as you enter a shell. I don't know how I missed it.

After backing up the important data (build logs), I was able to properly recreate them. Since I knew that more runners were bound to be created in the future, I wrote a pyinfra script to set everything up beforehand, and became a fangirl of it afterwards. Oh well, it happens.

Network Timeouts (July 13th 2021 and beyond)

Sometimes connectivity to the builds service simply times out with curl: (7) Failed to connect to builds.hut.lavatech.top port 443: Host is unreachable.

There is high packet loss, possibly because of one of our network upstreams, but I can't dive deeper into this issue. Hopefully it stays infrequent.

Docker networking mishaps (July 13th 2021)

Jobs simply didn't start, failing with the following error:

docker: Error response from daemon: Conflict. The container name "/builds_job_unknown_1626160482" is already in use by container "90c8a6c874840755e9b75db5b3d9b223fbc00ba0a540b60f43d78378efc4376a". You have to remove (or rename) that container to be able to reuse that name.

The name of the Docker container running QEMU is decided in the control script; here's the relevant snippet of code:

        --name "builds_job_${BUILD_JOB_ID:-unknown_$(date +"%s")}" \

The script that actually boots the VM containing a CI job is the control script, usually found in /var/lib/images/control (when using the Alpine Linux Sourcehut packages, at least). Sourcehut documentation says that the user running that script should be locked down to only being able to run it, so that vulnerabilities in the worker process don't lead to privilege escalation. That is done via the doas utility, an alternative to sudo.

A missing parameter in the doas.conf file makes the control script run without any environment variables; since BUILD_JOB_ID is then missing, the container name falls back to date +"%s".
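A sketch of the fix (the username and path are assumptions; adjust them to your runner's setup): doas has to be told to keep the environment the worker passes in, for example via keepenv:

# /etc/doas.conf on the build runner
# keepenv preserves the caller's environment (including BUILD_JOB_ID)
# instead of scrubbing it away.
permit nopass keepenv buildsrht as root cmd /var/lib/images/control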

This issue didn't arise until NetBSD builds were merged into Zig. Before that happened, we only spawned a single job per PR: the FreeBSD one. After adding NetBSD, we started spawning two jobs per PR. Since they were close in time, the container names would be the same: one of them would work, and the other would crash with the aforementioned error.

VM Settling times out (July 28th 2021)

Some VMs were timing out on their settling, but only when the machine was under load (such as after a merge spree in the Zig project). When running a CI job, a new VM is created running the specific OS image, then a “settling” task is run for it. The task attempts to SSH into the virtual machine, run echo "hello world", and if that works, send the rest of the commands (cloning the repo, running the scripts declared in the YAML manifest, etc).

We kept missing it, so by the time I connected to the runner, it was fine and stable again. But on the date mentioned in the heading, we were able to catch it in real time, and saw load numbers close to 16, even though the runner has 8 cores in total.

Turns out the CI VMs have 2 cores allocated to each of them; that's hardcoded in the control script.

Current solution: Edit the control script to make the VMs single-core.
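Conceptually, the edit was this (a sketch; the exact QEMU flags in /var/lib/images/control may differ between Sourcehut versions):

# before: every job VM gets two vCPUs
#   -smp cpus=2
# after: single-core VMs, so 8 workers can no longer oversubscribe 8 cores
#   -smp cpus=1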

Possible future solution: Buy more servers

What next?

  • It works, and zig provides master versions of the compiler for NetBSD!
  • It might take a while until I decide to add OpenBSD into the mix.
  • All of this should get a status page, plus log analysis to find CI failures caused by operational errors, as opposed to genuine failures caused by someone pushing incomplete code in a work-in-progress PR.
 
Read more...

from LavaTech Blog

This is our second transparency report to date. We've provided comparisons in this document, but if you'd like to review the 2019 one yourself regardless, you can access it here.

We also released the 2020 financial report, you can read it here.

This document was prepared on 2021-03-14, with numbers based around this date. It was mostly written by Ave, but it was reviewed and approved by both LavaTech members.

In general...

We haven't received any warrants or anything like that to date.

No DMCAs in any service in 2020.

No C&Ds in any service in 2020.

elixi.re

elixi.re is alive, as it has been for the past few years.

We're at 66 domains (including admin-only ones), compared to 60 in 2019.

We've had a complete hardware change in 2020, moving from a Hetzner dedicated server to a colocated server. Compared to 2019, storage space is a bigger concern than before, and we're exploring options.

v3 backend is feature complete, frontend still needs work. ETA: Soon™.

We have 691 active and 803 inactive users, which is to say that we got 85 new active users and 510 new inactive users since 2019. We're quite picky, I suppose.

There are 747922 files (taking up 182.21GiB), with 3881 of them uploaded in the last week (taking up 982.94MiB).

In 2019, we had 473480 files (112.79GiB), and in the week that blog post was written, there had been 5333 uploads (1361.02MiB).

While our total file count has increased by around 50%, our weekly counts slowed down a bit, even with Julian uploading a ton of cat pictures.

There are 1927 shortens, with 4 of them from the last week. That's only 4 new shortens in a whole year. It's not the most popular feature, you see.

a3.pm XMPP services

User count is 1072, up from 1042 in 2019.

No one was banned in 2020.

We've kept our server up to date as before, albeit more slowly, and we're running the latest version right now. We still release our latest config files openly, to encourage people to deploy their own XMPP servers.

90dns

90dns has almost doubled in daily unique IP counts.

Right now, there's ~39k unique IPs per day (up from ~20k), ~10k of these in the US instance (up from ~4k).

Gitdab

Gitdab had a major user registration spam issue over the year, and we had to constantly wipe unverified users. Sadly, we also had to switch from a built-in captcha to hcaptcha after a while to stop this spam, which ended up working fine.

  • 473 users (2019: 323)
  • 98 organizations (2019: 61)
  • 415 repositories (2019: 153)

Bitwarden

We spun up a bitwarden+bitbetter instance in 2020, and after a couple months of non-stop issues, we moved to bitwarden_rs.

We're quite happy with bitwarden_rs so far.

  • 57 users

Mirrors

We didn't add any new mirrors in 2020, but we've maintained our existing mirrors at:

LavaDNS

LavaDNS was killed... and... LavaDNS v2 was born: https://dns.lavate.ch/

LavaDNS v2 uses industry standard software, and provides plain DNS, DoH, DoT and DNSCrypt. Oh, and it doesn't keep logs, and is available in two locations: the US and Finland.

LavaSearx

Heck, I (ave) forgot LavaSearx was even a thing!

The service is still up, but it's at a degraded state. It's not synced to upstream, and is probably broken.

We should probably take it down.

(Tags to jump to other related posts: #transparencyreport #report)

 
Read more...

from LavaTech Blog

This is our second financial report to date. We've provided comparisons in this document, but if you'd like to review the 2019 one yourself regardless, you can access it here.

We also released the 2020 transparency report, you can read it here.

This document was mostly written by Ave, but it was reviewed and approved by both LavaTech members.

Income: Patreon

Patreon is our main source of donations for LavaTech.

Patreon graph

  • In 2020, we made $1028 through Patreon (2019: $563).
  • We had no refunds (2019: same).

Before fees, we made $1028.

Expenses: Fees

  • $53.00 went to Patreon processing fees (2019: $35.93).
  • $51.40 went to Patreon platform fees (2019: $28.15).
  • $36.00 went to Payoneer maintenance fees (2019: $11.62).
  • $21.62 went to Payoneer (outbound) transaction fees (2019: $9.32).
  • $12.00 went to Payoneer (inbound) transaction fees (2019: $6.00).

In total, $172.02 (16.73% of the income) went to taxes and fees in 2020 (2019: $139.78, 29.12%).

Here's a lazy pie chart for your enjoyment:

Pie chart showing distribution of fees in 2020

Expense: Backblaze B2

Backblaze B2 is our data storage service of choice for backups.

  • We paid $98.25 to B2 in 2020 (2019: $35.07).
  • Our bill fluctuated over the months. We started the year with a $5.89 bill, went up to $13.83 in February, and went down to $3.55 in April after we moved lots of old data to cold storage (relevant blog post). We finished the year with a $10.77 bill in December.

Expense: Cloudflare

We started using Cloudflare Load Balancing for Switchroot in July 2019 and stopped in July 2020.

  • $12 went to Cloudflare Load Balancing.
  • Domain fees from Cloudflare are not included here. You can find them in the domain expenses section of this post.

Expense: Servers

2020 was an interesting year, as we moved from a dedicated server at Hetzner to a colocated server in August.

The new server belongs to Ave; the network and rack belong to Lasagna Ltd (a company owned by Ave, unrelated to LavaTech). As such, server, network and colocation costs are NOT reflected here.

  • We paid €470.70 ($532.58) to Hetzner. (2019: €534.94)
  • The most expensive month on Hetzner was July with €49.98, and the cheapest months were September, October and November with €20.60 each. The infrastructure review we did in Q1 2020 also helped with reducing expenses.
  • We paid €143.88 (~$171.98) to Online.net (Scaleway Dedibox). All months were €11.99. (2019: €143.88)
  • We paid €67.90 (~$81.16) to Scaleway Elements (2019: €55.99).

Note: All of Scaleway was paid directly by Ave. All but 3 months (€35.97) of Online.net were paid by Ave.

In total, we paid €682.48 (~$785.72) for servers in 2020 (2019: €734.81).

Expense: Domains

We bought many cute domains this year.

While we do our best to reflect the correct numbers here, they're not perfect. Stuff like elixi.re domains we pay for renewal of (but don't own yet) aren't included, and we may have forgotten to include some registrars.

Do note that 2019 numbers do not exclude personal purchases while 2020 numbers do.

  • We paid $86.85 to Porkbun. (2019: $146.21)
  • We paid $27.96 to Dynadot. (2019: $27.46)
  • We paid $41.96 to Namecheap (2019: $22.53).
  • We paid $16.21 to Cloudflare (2019: $8.03).

In total, we paid $172.98 for domains in 2020 (2019: $228.19).

Expense: KernelCare

We have KernelCare on our main hypervisor, laserjet.

We have a 2-server license: one in use by LavaTech, and the other in use by our friends at General Programming (which is excluded from the number below). This arrangement gets us a slightly favorable per-server cost.

During certain parts of the year we bought more licenses temporarily to test other servers and move between servers.

  • We paid $46.01 for KernelCare in 2020 (2019: $35.40)

Note: Only one month ($6.90) was paid by LavaTech funds. Rest was paid directly by Ave.

Personal Expenses through LavaTech funds

I (Ave) owe a personal apology here for using the wrong card for one payment. Considering I also paid for a large number of expenses with my own funds, I hope this can be excused.

  • $53.53 was paid to General Programming, LLC for helping install servers. (without fees: $53)

In total, we paid $53.53 for personal expenses through LavaTech funds in 2020 (2019: $26.29).

In conclusion

  • We made $1028. (2019: $563)
  • We got to keep $855.98 of it after fees. (2019: $340.22)
  • We spent $1114.96 (doesn't include personal expenses) (2019: $1140.22).
  • After fees, 76.77% of our expenses were paid by donations (2019: 29.83%).
  • Exactly $258.98 was paid out of pocket (2019: $800).

Pie chart showing distribution of expenses in 2020

We'd like to thank all of you for supporting us, by using our services, by recommending our services, and by donating domains and funds to our services.

We're closer than ever to being self-sustaining, and that makes us unbelievably happy. Thank you so much to everyone who used our services, recommended them, or supported us in any way.

Shameless plug: If you'd like to help make that percentage be higher for 2021, here's our patreon. Anything helps.

(Tags to jump to other related posts: #financialreport #report)

 
Read more...

from is it gpt-3, or is it just fantasy?

A bit of history

In the current LavaTech infrastructure, we maintain a single application server that runs Proxmox VE, and from there we slice it up into containers for most of our applications, with some (like the ejabberd instance powering a3.pm) having their own dedicated servers.

Proxmox VE provides both LXC containers and KVM/QEMU virtual machines, so we can use specific operating systems. One set of VMs provides the FreeBSD CI infrastructure for the Zig project. Its architecture is composed of two VMs: one that runs the Sourcehut instance on Alpine Linux, and another running FreeBSD for experimentation (such as compiling LLVM if needed) and for updating the Sourcehut instance's FreeBSD CI image.

That was the first proper contact with FreeBSD I had, lol.

Why?

Since then, I have been experimenting with NetBSD and FreeBSD, and one of the latest experiments was installing FreeBSD over a serial line.

There wasn't a production need to do it, as we don't have a project that needs thousands of FreeBSD VMs, but the serial line idea helps us because we wouldn't need to go the whole way through Proxmox VE's web admin panel to get a VNC session. In theory we should be able to just SSH to the host and manage that VM directly, without needing SSH inside the actual VM.

The experiment was successful, and here are my findings.

Resources

Proxmox provides a way to attach to a VM that has a serial line, all documented here, so the bit that I had to dig through was FreeBSD's own interaction with the serial line.

I found a random blog post that helped explain the general idea of a FreeBSD install over a serial line, but it isn't directly applicable, because that process uses USB installation media, and that's something I did not want to research: our mental model for creating VMs is built around ISO files/CD/DVD, and Proxmox doesn't quite provide an obvious button for USB media.

From that finding, we knew we would need to create our own ISO file containing the serial line settings. I then found the installation manual for FreeBSD 7.4, which talks about using the serial line for installation; strangely, I couldn't locate the same documentation for 12.1-RELEASE.
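For reference, the serial console settings boil down to a few loader variables baked into the remastered ISO (a minimal sketch, assuming 115200 baud is not already the default):

# Appended to boot/loader.conf inside the modified ISO
boot_serial="YES"
console="comconsole"
comconsole_speed="115200"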

Actual installation

After creating a modified ISO file with the process from the FreeBSD manuals (and from that blogpost, as I don't know if comconsole_speed="115200" is a default, I assumed not), it was straightforward to add it to the Proxmox VM image library, and boot up the VM. After doing qm terminal ... I got the bootloader messages, the bootloader screen, and was able to have it process my input to boot (it would automatically boot either way, but pressing a button and seeing things happen is a good feedback mechanism).

FreeBSD bootloader screen on serial line

While it was booting, though, we found an anomaly.

I typed cd9660:/dev/cd0, hoping it would work, because I didn't understand the issue that was happening (Ave suggested that something related to the ISO repacking was doing something bad here, but I didn't investigate further). Sure enough, it worked, and I got to a prompt asking which terminal I was on, so that the installer could draw its boxes!

I recommend using xterm when possible, because ansi and vt100 can be hard to use and lead you to type the wrong things in the wrong places.

Drawbacks compared to doing VGA

The serial line can get clunky, especially if you try to just tmux(1) inside of it (I was a reboot away from being unable to boot due to a broken rc.conf, as vi drew things on the terminal in the wrong ways), but it is enough to curl my SSH keys onto a user and then use the machine from a proper terminal.

 
Read more...

from Ave's Blog

Introduction

bigbigvpn is a management system for self-hosted, ephemeral VPNs. You give it API keys from your favorite VPS providers, and it lets you get VPNs in all the countries they support.

You pick a country, hit connect, and in a minute or so it finishes setting everything up and automatically connects you to the VPN. You disconnect (or stay offline for a configurable amount of time), and it automatically deletes the server.

You get the best of both worlds: very low prices, dynamic IPs, a large number of locations, AND the comfort of self-hosting.

It mostly started as a “would be nice”, then quickly turned into a 6-hour PoC. I stopped working on it for a while, but picked it back up to add more features as I ended up using it more and more, and I finally feel comfortable sharing what I've accomplished so far, alongside some thoughts about the whole concept.

I was supposed to post this 2 weeks ago (even said so on twitter), but life happened, and I decided to delay the PoC until it got to a state where it represented the whole potential, and that's the state now.

Demos

Please note that this is just a PoC at the moment, and while it works fairly well, it's not as polished as I'd like:

As a note, bigbigvpn can currently have providers configured with a weight value. While both DigitalOcean and Hetzner provide servers in Germany (and I have them both configured), in the video it automatically picked Hetzner because I had a higher weight set on it.
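As a sketch of what that weighting means (the names are made up for illustration; this isn't bigbigvpn's actual code), the selection boils down to taking the highest-weighted configured provider that serves the requested country:

# Hypothetical weights: Hetzner is preferred over DigitalOcean.
providers = {"hetzner": 10, "digitalocean": 5}

def pick_provider(candidates: dict) -> str:
    # Highest weight wins among providers serving the chosen country.
    return max(candidates, key=candidates.get)

print(pick_provider(providers))  # -> "hetzner"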

Post-release edit: My very cool friends Mary and Linuxgemini had some ideas and helped implement them, and now server spin-up takes significantly less time, down to ~30-40s on Hetzner.

(Also available as some asciinema recordings: Connecting to a server on Hetzner, Connecting to a server on scaleway, Getting region list)

Conclusions from the “experiment”

(Please do note that I have a conflict of interest as the developer of the project.)

I do think that the idea of ephemeral VPN servers is fairly viable as long as you're willing to wait 60-90s for a server to spin up, are okay with the fact that there's little to no tooling or clients right now, and most importantly, need the benefits it provides.

Many VPS providers offer hourly pricing, and most of them (that I've used) just charge for one hour when you initially spin a server up, though there are some exceptions (like Scaleway). This makes them quite viable for short-term use.

There is some stuff that can be improved from a technical perspective, which I intend to address (see the next section), but even with that covered, several shortcomings remain:

  • Your IP is still not shared with others, though you're not on the same one all the time. This is a step up from traditional self-hosting.
  • Some VPS providers (Hetzner etc) don't give you a random IP on every server creation, but seem to effectively reserve an IP to hand back to you on your next server creation for a while after you delete it. This is good if you accidentally fat-fingered the prod server away, and bad if you're looking for a VPN service with dynamic IPs.
  • Many commercial VPN services have tons of locations, and the 3 VPS providers bigbigvpn currently supports add up to “just” 11 countries. Even if more providers were added, it might not be possible to get to a similar number of countries.
  • Compared to both commercial VPN services and self-hosting, payments can get very segmented depending on how many providers you enable.
  • Compared to commercial VPN services, registering is a slower process, and in many cases can involve a KYC process (Hetzner required me to send over my passport several years ago, and Scaleway is slowly rolling out ID verification too). This may not be desirable to all, but it's a compromise I'm okay with, the saying does go “Be gay, be law abiding” after all.

Going forward

Overall, bigbigvpn was intended to be a quick experiment that I'd stop thinking about after a few hours, but after I talked about it with friends, quite a few of them said it might actually be useful, and as such I do intend to continue working on it. It's quite fun anyways :)

Right now, bigbigvpn code is not something I'm ready to publish. While most of the server related bits are fairly clean (and as such will be carried over), web bits were fairly rushed, and I still make breaking changes to the API on a regular basis.

I have significant changes in mind and will do a complete redesign of many parts of it, and intend to open source the proper implementation early on in the development cycle.

Some of the changes I want to make include:

  • Multiple device support per server/location, so you can connect from both your phone and PC at the same time, or even add your friend to your current VPN box.
  • Multiple user support, so you don't have to also pay for a server to run the server-side code if you don't want to. We'll probably host an instance on lavatech too.
  • Better accounting for pricing schemes (for example, Scaleway charges for the first 3 hours on DEV1-S spin-up, so it doesn't make sense to delete the server before that time).
  • More clients, most notably mobile clients.
  • Support for more providers, or perhaps even just terraform.
  • Preparing images and just creating VPSes from those to have faster start times. I also intend to experiment with optimizing other parts of the process to minimize the spinup time.

I'll likely be publishing the repositories under https://gitlab.com/bigbiglarge and if there's enough interest I may post more updates here in my blog.

Thanks for reading this post! I've also posted some parts I cut out from this blog post (like how we got the idea, why I want this etc) over in my side blog in case you want to read more.

(tag: #bigbigvpn)

 
Read more...

from Ave's Blog

How does a Turkish ID work anyways

Official image of the TCKK

The Turkish ID Card, aka TCKK, is a smartcard, and quite an interesting one actually. It has two separate chips for contact and contactless use. Both run a locally developed smartcard OS (AKİS/“Akıllı Kart İşletim Sistemi”, lit. “Smart Card Operating System”). It can theoretically use a locally developed IC (UKTÜM), but none of my IDs so far have had one.

The contactless interface

On the contactless (NFC) interface, it's an ISO/IEC 14443A based ICAO 9303-compliant eMRTD (Electronic Machine Readable Travel Document). I've done quite a bit of work recently to add eMRTD support to Proxmark3 and it can read my ID perfectly, but that's a blog post for another day.

The contact interface

On the contact interface, however, it's a completely different beast: It's based on a number of Turkish Standards [1], and it's seemingly quite secure.

It has various applets for things like ID verification, e-signature (both by your identity and by your dedicated e-imza cert, though the latter wasn't deployed yet, I believe) and emergency health information. Sadly, however, it's not well documented publicly (other than some exceptions [4]), and all these cool features are simply... unused.

Dumping the cert

I dumped my first TCKK cert from my first ever TCKK back in 2018 by sniffing USB communications [7], wrote a tool to automate it in 2019 when I renewed my ID to get my picture updated, and finally got to use that tool again when I got a new ID in 2020 after I lost my wallet in Leipzig after 36c3 [2].

Anyhow, today I open sourced that script and another one. I'll probably be publishing more over there in the future, especially as I understand ISO/IEC 7816-4 and ASN.1 better after implementing ICAO 9303. In this post, I will simply go over using that script.

This isn't intended to be a “TCKK verification for the masses” post, so I'll skip through the simple details.

Clone TCKKTools, install python3.6+, install dependencies, plug in your favorite contact smart card reader (I use an ACS ACR39U), put in your ID with the chip facing up. You'll likely also want to install openssl as we'll be using that for converting the certificate and verifying it.

Run python3 dumpcert.py, and it should dump your certificate as a file called cert.der:

Cert dump procedure and the dumped der file shown on a terminal

You can convert the cert from der to pem with openssl x509 -in cert.der -inform der -out cert.pem.

You can view certificate details with openssl x509 -in cert.pem -text.

Verifying the cert [3]

First off, ensure that you converted the certificate to pem format and that you have openssl installed.

Secondly, let's grab the required files. Download the following URLs (do be aware that the .crl url is fairly big, around 350MB):

http://depo.tckk.gov.tr/kok/kokshs.v1.crt
http://depo.tckk.gov.tr/kyshs/kyshs.v1.crt
http://depo.tckk.gov.tr/kyshs/kyshs.v1.crl

So... there's a quirk where kokshs.v1.crt is a DER file and kyshs.v1.crt is a PEM file (and kyshs lacks a newline at the end of the file), so the procedure is a little odd. In any case...

# Convert CRL to a PEM
openssl crl -inform DER -in kyshs.v1.crl -outform PEM -out crl.pem
# Add a newline to kyshs.v1.crt
echo "" >> kyshs.v1.crt
# Convert the kokshs.v1.crt file to a PEM
openssl x509 -in kokshs.v1.crt -inform der -out kok.pem
# Join intermediary cert with root cert to create a cert chain
cat kyshs.v1.crt kok.pem > chain.pem
# Join chain and CRL into a single CRL chain
cat chain.pem crl.pem > crl_chain.pem

Additionally, you may have issues verifying the certificate as the CRL at the time of writing has expired (roughly 2 weeks ago), so we'll be skipping CRL expiry checks. If this is no longer the case in the future (see [5] for more info on how you can check), drop the -no_check_time. See [6] for more info on what happens if you run without that.

To verify the certificate, run this command:

openssl verify -no_check_time -crl_check -CAfile crl_chain.pem cert.pem

It should take a while, but it will go through the whole CRL and verify your TCKK cert's validity.

If you see a message like this, then your TCKK certificate is valid:

cert.pem: OK

However, if you see one like this, then it isn't:

C = TR, serialNumber = 1234568902, CN = ACAR HASAN
error 23 at 0 depth lookup: certificate revoked
error cert2017.pem: verification failed

Conclusion

I've been curious whether the old ID certificates I was keeping around were in the long, long CRL that the government publishes, but only got around to checking today. It was nice to see that they were indeed in there.

I've also been meaning to publish some of the TCKK research I made, and publishing this and the two scripts over at TCKKTools feels good. I look forward to publishing more stuff.

Disclosure

This is just one of the many ways to verify the identity of someone using the TCKK. It may not be a legally acceptable way of verifying someone's ID for actual commercial purposes (I simply haven't checked the regulations).

Notes

1: TS 13582, TS 13583, TS 13584, TS 13585, TS 13678, TS 13679, TS 13680, TS 13681.

2: Funny story, actually. I got through the whole event without losing anything, then dropped my wallet at an Aldi in Leipzig Hbf. Almost missed my flight searching for it. Called my banks on the S-Bahn to cancel my cards. When I got to the airport there was a “Final Call” for me; Turkish Airlines staff warned me that I was late but let me through, and airport staff practically pushed me to the front of the passport line. The border control dude still took his sweet time counting the days I had spent in Germany before finally letting me through. I was the last to board. I ended up getting my NVI date while taxiing to the gate at Istanbul Airport. But in the end everything worked out and I got everything reissued, which is okay I guess.

3: Huge shoutouts to this article on raymii.org as I based the CRL verification on that.

4: There's apparently a person called Furkan Duman, who works at a company developing ID verification technologies and has posted some tidbits in Turkish on his blog. I haven't had a chance to read much of it so far, but it looks quite interesting: https://furkanduman.com/blog/category/tckk

5: Run openssl crl -in crl.pem -text -noout | grep "Next Update". You can safely Ctrl-C after the first output; otherwise it'll go through the whole file for no good reason. If the shown date is past the current date, then the CRL has expired.

6: Running without -no_check_time leads to a rather confusing output from openssl. You still get the same output when feeding it invalid certificates, but you also get error 12 at 0 depth lookup: CRL has expired. However, on valid certificates, while you don't get error 23 at 0 depth lookup: certificate revoked like you do on invalids, you still get the CRL has expired line, and that leads to a verification failure, which ends up being a little confusing.

7: Huge shoutouts to linuxgemini for informing me this was possible (and overall sparking my interest in smartcards and RFID tech) and showing me how to do it on a cold election day in Ankara when I flew back to vote.

 
Read more...

from is it gpt-3, or is it just fantasy?

I'm glad you asked (you did, I promise) because I can infodump for an hour.

hatsune miku votes too!

Huh???

Every 2 years, Brazil runs elections: local ones, where you vote for the city Mayor and the city's representatives, and general ones, where you vote for the President, State Senators, State Governor, and State Deputies.

Local elections happened on November 15th, 2020, so what better day to start writing a blogpost describing how it all works than today?

This is going to be a technical explanation of how things work, so I'm skipping many pre-election and post-election processes. I'm going from candidate loading to vote counting.

The courts

In Brazil, we have the TSE (Tribunal Superior Eleitoral, which translates roughly to Superior Electoral Court). It is the practical creator and maintainer of the voting systems used around the country. Below the TSE come the TREs (Tribunal Regional Eleitoral, Regional Electoral Court), the courts that actually collect the data from the voting machines and send it back to the TSE (more about that later in this post).

Pre-election things

Every Brazilian citizen between 18 and 70 years of age must vote; citizens above 16 can vote. Not voting when you must means paying a fine to the TSE (and if the fine is not paid, there are other consequences as well, such as being unable to get a passport issued, or being unable to be admitted to a public university). To be able to vote, a citizen must create their Voter Card (my translation of Título Eleitoral), and as of recent years the Voter Card creation process involves taking a picture of the voter and a record of their fingerprints (more on that later).

In the creation process, the citizen is asked where they would like their voting to take place. I don't know the criteria, but places can become unavailable as they fill up with voters. After your selections, you are assigned an Electoral Zone and an Electoral Session. The hierarchy goes as follows: State –> Location (city, suburb) –> Zone (number) –> Session (number). Schools are the most common places to host Electoral Zones.

After candidates for the respective year's positions are selected, they are loaded into blank voting machines. The loading is done via a specific kind of USB flash drive with a specific plastic shell that guides it into the back of the machine (preventing people from trying to jam it in, then switching sides, then attempting again: the usual USB problem). After that is done, another USB flash drive (of the same structure) is loaded in, and the slot it sits in is locked, as that flash drive will contain votes.

Electoral Session pre-election-day ceremonies

The election days are selected by the TSE, and are composed of a 1st and a 2nd round. 2nd rounds happen when no candidate for a major position (such as mayor, president, or governor) reaches a majority (more than 50% of the valid votes) in a location with more than 200 thousand voters. 2nd rounds have just the 2 leading candidates from the 1st round.

On election day, the mesários (volunteer workers for the TSE/TRE who manage the election) assigned to a specific Electoral Session, together with the relevant managers, perform the starting ceremonies for the voting machine their TRE gave them. That ceremony happens before the actual voting hours (election day runs from 07:00 (7AM) until 17:00 (5PM)), so between 6 and 7AM.

The setup ceremonies involve testing the machine's hardware and making it print a report stating that it does not contain any votes. That report is called the Zerésima. You can see examples of it on the TSE website.

After those ceremonies, the session is ready to accept voters.

Incoming voter process

When a new voter comes into the voting session, they must have some form of identification of both themselves and their “voter selves”. For the former, that can be done via TSE's digital voter card app, the E-Título (recommended during the pandemic; it only works for voters who registered a picture of themselves), or via any government-issued ID document (passport, national ID card, driver's license). For the latter, either the E-Título or their voter card works.

Upon the voter showing their identification, one of the mesários checks it against the session's book: a physical paper notebook that contains each voter's name, their voter card number, and possibly their picture. When a match is found, the voter number is read out to the mesário operating the Terminal. Depending on whether the voter has their fingerprint registered, the Terminal requires the voter to put it on its reader. (This bit may be disabled entirely, considering COVID. As an example, I was not asked to put my fingerprint in for 2020's elections, even though I did in 2018's.)

The Terminal controls the voting machine in terms of “starting a voting session”, “ending the session”, “stopping a vote if the voter didn't finish their vote”, “requesting accessibility options”, etc.

After voter identification, the voting machine is now ready to receive the vote.

Voting machine architecture (physical)

The voting machines are all manufactured by TSE and TSE contractors and sent out to TREs around the country in the months leading to the election.

Here's a picture of it:

brazillian voting machine

You can play with a little simulation the TSE made using highly advanced JavaScript (note: fully in Portuguese).

Voting machines do not have any networking equipment (Wi-Fi, Ethernet, Bluetooth, etc); their only connections are the link to the Mesário Terminal and a power cable. Voting machines have batteries as well, to handle remote locations.

Voting machine software architecture (high-level)

The voting machine runs Linux with a highly customized userland made by TSE.

Inside the voting machine there is the Registro Digital do Voto (RDV, Digital Vote Registry): a file that can be thought of as a spreadsheet where the columns are the political positions, each row is a voter, and each cell is a vote for that political position. When a vote is cast, it is placed at a random position among the existing votes, emulating a physical ballot box being shuffled around. There is a Portuguese high-level explanation by the TSE.
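A minimal sketch of that shuffling idea, in Python (illustrative only; the real RDV format and code are the TSE's):

import random

# Each row holds one voter's choices, e.g. {"mayor": "13", "councillor": "4545"}.
rdv = []

def cast_vote(votes):
    # Insert at a random row, so the order of records reveals nothing
    # about the order in which voters showed up.
    rdv.insert(random.randint(0, len(rdv)), votes)

cast_vote({"mayor": "13", "councillor": "4545"})
cast_vote({"mayor": "45", "councillor": "1111"})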

After a single vote

After the voter has cast their vote, they're asked to sign the notebook, and the mesário gives them a piece of paper, taken from the notebook itself, which is proof that the voter actually voted. The voter can now leave the session, and the next one comes in. This is the cycle that repeats until 5PM.

Electoral session post-election-day ceremonies

At 17:00 (5PM), all voting sessions are due to stop soon. If there are any remaining voters in the queue, they must be attended to (which means the session might stop after 5PM), but no new voters will be admitted (so the actual place closes its doors, while the sessions inside are still open).

After all voters have gone through, the mesários issue the command to make the voting machine stop. When that happens, it prints out the total count of votes that were cast in that session, as in “candidate X got N votes”, “candidate Y got M votes”, etc. This report is called the “Boletim de Urna”, and I'll keep its translation as Vote Report.

The Vote Report is generated from the RDV file cited earlier and is printed by the voting machine up to 11 times: the first copy is a check that the machine's printer is operational, then come 5 required copies, then up to 5 optional copies that may be requested by political party representatives, the media, or the Ministério Público. Other reports printed are the Justificative Report and the Mesário Identification Report.

All 5 required copies of the Vote Report and the Justificative Report are signed by the President of the Voting Session, the Mesários, and other Inspectors. Mesários must sign the Mesário Identification Report.

The Vote Report MUST be publicly available, either through physical means, where one of the copies is posted at the voting session itself, or through digital means, on the TSE's website. The Vote Report contains a QR code, so you can cross-reference the tally made by the voting machine with what the TSE received.

Another copy is given to one of the Inspectors in the session, or the media, or a representative from Ministério Público. Remaining reports are sent back to TREs.

After all of those are finished, the voting machine displays that the session is over, and can now be turned off by the President of the voting session.

Vote Transfer

The Vote Report, while it is sent to the TREs, is not how votes are processed. The voting machine writes all of the information in its Vote Report to the locked-up USB flash drive. The seal and lock are undone under the supervision of the session's President, and all flash drives for the Zone are physically sent back to the TRE.

At the TRE, those flash drives are loaded for final counting; their data is sent to the TSE via a separate network (provided by Embratel), and from there, results are distributed online and on TV to everyone in the country. Here's the website for the 2020 Local Elections.

 
Read more...

from is it gpt-3, or is it just fantasy?

Before actually starting this I want to say that programming language discourse is 99% of the time based on subjective experience. The writing here is my subjective experience dealing with Python. Python may be good for you, or bad for you, or whatever. Replacements beware.

Time changes, people change

I started learning Python around 2013, after attempting to learn C but failing. But I only started writing Python for internet-facing production in 2017.

This rant has hindsight bias, for sure. But a point in Python's favor is that I didn't quite know how to program before learning it. Sure, I did write massive if chains in C, but nothing more complex, because as soon as I started reading the pointers chapter I would get confused.

Nowadays I'm not that confused by C pointers, or by the difference between the stack and the heap (thanks, Zig), and I'm a different person, with different goals and needs regarding what I want my software to be and what its principles are. But even though I'm a different person, I'm still locked in to the younger me's choice of Python, and now that I can put “5 years of experience writing Python” on my CV, I don't think this will go away any time soon.

A non-exhaustive list of things that keep me sad while writing Python

I could go on and enumerate things I'm not happy with and that I ultimately don't have power to fix, so here we go:

  • Typing is sad: it tries to mix in static type checking without actually breaking the language. I mean, yes, typing is good, but in the end you can declare a function that takes an int, give it a string, and you'll only know something is wrong at execution time unless you have mypy; so you go install mypy, and then it doesn't find some library, or a library doesn't have typehints, or it decides to never want typehints, etc, etc, etc (see the sketch after this list).
    • A subproblem of typing is using things like typing.Protocol, which I tried to use, once. The type check passed afterwards, but I felt dirty after writing something like that. I wish I could elaborate further on why, beyond the general feeling of “something's wrong” while looking at it.
  • Packaging. God. Packaging. After going through requirements.txt, setup.py, poetry, dephell, pipenv, I just gave up and settled on setup.py, since it's the only guaranteed thing that might work.
  • Not being happy with the “large” frameworks (AKA Django) means I have to go to things like Flask, but then there's asyncio, so I go to Sanic, but Sanic is its own hellish thing, and I'm now comfortable on Quart. And by using a niche of a niche of a niche web framework, I'm having to create my own patterns for, say, how to hold a database connection, by just putting app.db = ... in @app.before_serving; you can be sure mypy does not catch anything about that at all. Love it.
  • A bug in a completely separate part of my software was caused by a bump in dependencies without bumping my Python version on CI. I kept banging my head for hours until I gave up and updated from 3.7 to 3.9.
  • Endless language constructs seem to be the norm now. I don't like that we now have 4 different ways to merge dictionaries together, because the existing ones didn't have “a good syntax”. That feels like a really low bar for “new” language constructs.
    • This one is new in 3.9, so it's likely the two last entries are more rants than critiques, idk.
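To make the typing complaint concrete, here's a minimal sketch: this runs fine under plain CPython, and only mypy (if it's installed and cooperating) flags the problem:

def double(x: int) -> int:
    # Annotations are not enforced at runtime: "ab" * 2 is happily "abab".
    return x * 2

print(double("ab"))  # runtime: prints "abab"; mypy: incompatible argument type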

The future?

I don't know. None of those things look like they will ever be fixed. I'm seeing if I can write webshit with Zig, but that's all experimental. It might take much, much longer to “ship”, but I feel that by not having the ecosystem or an interpreter slowing me down, I'd have a better development experience.

I can't deny that I now have years of experience with Python, though, and I have to put something on my CV.

The tech industry makes me sad.

 
Read more...

from is it gpt-3, or is it just fantasy?

Or, “Monolith First, Complicated Thing Later, Also Plan Before Switching, And How Those Things Have Been Known For Ages, HELP”

I want to start this with Gall's law, from General Systemantics:

A complex system that works is invariably found to have evolved from a simple system that worked. The inverse proposition also appears to be true: A complex system designed from scratch never works and cannot be made to work. You have to start over, beginning with a working simple system.

When I see large scale distributed systems being written with Kubernetes or whatever, I have the gut feeling Gall's law wasn't followed, though my sample size is very small: 1.5 systems. 0.5 comes in because the other system I'm managing isn't Kubernetes (yet, I'm sure it'll become that after a while).

I consider that jumping from a monolith line of thought to a microservice line of thought, for any team or system, is a very large jump that must be scrutinized before actually being made. Microservices are no silver bullet, mostly because developers are taught to write monoliths their whole lives, and then you push microservices on them, which come with a lot of specific tooling; and as years pass, even more tooling is created, to the point where enumerating it all can be absolute sensory overload. Really.

My Bad Kubernetes Experience

In one of the systems I was developing on, I wasn't actually messing with Kubernetes directly, but, regardless, I had a very poor experience integrating with it. The monolith app I developed was inside a VPS, and had to push data to MongoDB and Redis nodes which were living inside the Kubernetes cluster. The person that managed it told us to use kubectl port-forward, and sure, we tested it out, and it worked, at least at the start, but we all know how it goes from here.

While the system was operating at spike load, the cluster restarted itself, and kubectl port-forward lost connectivity with it. In my mental model, it would reconnect to the cluster: it has authentication, and it doesn't need to hold any state. Yet it didn't, and our app remained unable to connect to Redis (with a very cryptic ECONNRESET in the app logs) for a long time while we attempted to debug it. After we asked the person operating the cluster how Redis was doing (they told us it was up), a restart of the systemd unit that did the port-forward “fixed” it.

After talking to a friend about that situation (a long time later), they told me port-forward is “a broken ssh tunnel that might not even work”, and “it's not meant to be used for anything other than testing”. So I could have used some other Kubernetes solution which wouldn't have made me write three paragraphs.
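For context, the setup was conceptually this (the service name is made up for illustration): a long-lived foreground tunnel that kubectl does not re-establish when the remote end goes away:

# Forward local port 6379 to the Redis service inside the cluster.
# If the cluster restarts, this process stays up but the tunnel is dead;
# the app then sees ECONNRESET until the unit is restarted.
kubectl port-forward svc/redis 6379:6379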

My Bad Microservices Experience

I think I can say this one is caused by bad management wanting to push “novel” things onto teams that are very new, don't know anything about what a container actually is, or don't know what distributed system design actually entails, down to the failure modes of such a system.

When you jump into microservices and design with them in mind, you have to know that you're designing a distributed system, and while the units composing that system are small, simple, and understandable, the whole can be much, much more complicated (see: natural systems in the whole world, including human-made ones!). You must also know all the “fallacies of distributed computing”, which can feel somewhat cliché but are actually quite true; at this rate, if I'm developing a distributed system, I should print out the list and put it next to me and every other developer. I've seen them happen, even though the fallacies have been enumerated since... 1997? It's weird how we're still having issues that come from those fallacies in 2020. We should do better.

I think this boils down to “people should learn a little bit before jumping on the microservice hype”, and in most cases, Gall's law still applies. I believe that things should be written first as a monolith, to let the developers learn the business logic in an environment where function calls aren't remote everywhere. Then, if scaling is required, investigate whether a distributed architecture can actually get you that scaling, or do something else.

Offloading to the database

Another data point from overall system design is that you shouldn't offload everything to the database, as in, make every microservice need it for every operation; as things like autoscaling jump in and you suddenly have 10 times the number of microservice instances, your database will suffer 10 times the load. It might catch fire.

Though if the database does catch fire AND is a managed database solution, you can blame the cloud company providing it instead of yourself (hrrrrm, Discord loved to do this in their postmortems, but nowadays they don't even give any postmortems. It's sad).

 
Read more...

from LavaTech Blog

Hello everyone, welcome to my quick guide on how to change your password in Conversations:

0) Open Conversations, go to the main view, hit the 3 dots on top right

1) Go to Manage accounts

2) Pick your account

3) Hit the 3 dots on top right, smash that “Change password” button

4) Change your password by putting your current password into the first box and the new one into the second, then hitting the “Change Password” button

Thank you for reading my guide.

(Tags to jump to other related posts: #guide, #a3pm)

 
Read more...

from Ave's Blog

aka “How to use a ZTE MF110/MF627/MF636/MF190 on Linux in 2020”

I was looking at random cheap networking stuff on the Turkish ebay clone the other day when I stumbled upon the good old 3G modems.

I always wanted one, and at some point one entered the house, but I never got a chance to use it, and I'm no longer on good terms with the person who owns it now. I was also given one by my grandparents several years ago, but I managed to break the SIM slot before I could test it: I stored a SIM converter in it without the actual SIM, it got stuck on the pins, and the attempts to pull it out destroyed them. Ouch.

So, after all this time, I wanted to finally give one a shot, and they were being sold for as cheap as $3.50 + shipping, so I ordered the first one I saw, an “Avea Jet” model. It arrived this morning in a bootleg Nutella box, and I had to pay like $2 for shipping. Yup. Can't make that shit up.

But yeah, the modem was inside that, wrapped in bubble wrap. I quickly opened it up, grabbed a SIM that has an actual data plan, found an appropriate SIM converter, stuffed it in:

So... I plugged it into my computer... and it didn't show up as a modem on either ModemManager (when I ran mmcli --list-modems) or dmesg:

Uh oh.

Back when my grandparents gave me their old one years ago, I didn't need to install drivers or do any special config for it to be detected, but I was running Ubuntu 16.04 back then.

This suddenly got me worried. What if the relevant kernel module was dropped since then? What if arch doesn't build it in? What if this device never got supported? What if the modem itself was broken? What if the modem is carrier locked (I was using a Turkcell SIM)?

I plugged it into the only Windows computer around to try and figure things out, but that left me with more concerns, especially about the modem itself being broken: the virtual CD drive these modems expose just decided to stop loading after some time, and I couldn't see the modem in Device Manager.

So I started searching, and stumbled upon this arch wiki page, and it's pretty much the basis of this blog post.

I already had much of the needed software, such as modemmanager, usbutils and mobile-broadband-provider-info, but it turns out I also needed usb_modeswitch, plus optionally nm-connection-editor for easy configuration of the settings (nmtui doesn't support it) and modem-manager-gui if you want a GUI for SMS.

The whole thing boiled down to:

# pacman -S mobile-broadband-provider-info modemmanager usbutils usb_modeswitch --needed
# pacman -S nm-connection-editor modem-manager-gui --needed  # these are optional
# systemctl enable --now ModemManager

But even after that, I was a little scared. usb_modeswitch --help talked about modes for huawei, cisco and many other manufacturers, but ZTE was missing from that list:

I sighed. I took a deep breath, re-plugged the modem and... what do you know, it showed up as a modem in dmesg and mmcli:
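
Apparently just having usb_modeswitch installed (presumably via its bundled udev rules) was enough for the switch to happen automatically on re-plug. If it ever doesn't kick in for you, it can be poked by hand; a minimal sketch, where the IDs are an assumption on my part (19d2:2000 is the common ZTE mass-storage mode, check what lsusb reports for your stick):

$ lsusb | grep -i zte  # in storage mode, ZTE sticks commonly show up as 19d2:2000
# usb_modeswitch -v 19d2 -p 2000 -K  # -K sends the standard eject sequence, flipping it to modem mode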

First off I opened modem-manager-gui, and sent a text to my girlfriend, who was able to confirm that she got it:

Being relieved that there's no apparent SIM lock or any other modem issues, I booted up nm-connection-editor, hit +, picked Mobile Broadband, and followed the wizard:

After saving, I finally booted up nmtui, disconnected from wifi and connected to the mobile broadband network:

And voila. I'm now writing this blog post while on a mobile network.

It's not fast by any means, but at least I can fall back to this in an emergency:

We're also planning to have a similar setup on a colo with an IoT SIM so that we can still access the network if anything goes wrong.

That's all! Hope you enjoyed this blog post.

Optional reading after this post: A retrospective of 3G routers in Turkey

 

from LavaTech Hall of Fame

Violet M. discovered a vulnerability in our reverse proxy handling code on 2020-05-03 and informed us through the mail address specified in our security.txt.

We were using the X-Forwarded-For header for IP ratelimiting without realizing that Cloudflare doesn't override client-supplied IPs in this header, but appends to them. The relevant document can be found here.
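
In other words, if an attacker behind Cloudflare sends their own X-Forwarded-For value, the origin receives something like this (an illustrative example using documentation IPs, not a real request we saw):

X-Forwarded-For: 198.51.100.42, 203.0.113.7

Here, 198.51.100.42 is whatever the attacker claimed, and 203.0.113.7 is the address Cloudflare actually saw; naively taking the first entry lets an attacker pick their own ratelimit bucket.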

This could potentially allow an attacker to bypass IP ratelimiting. We deployed a fix shortly after she reported the issue to us:

We also deployed a workaround for our non-CF domains, which we will improve in v3. We did this by overriding the CF-Connecting-IP header with the user's IP in the nginx instance running on the routing server. We had already been overriding X-Forwarded-For before this, so non-CF domains were safe from this vulnerability the whole time.
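
For reference, the nginx side of such an override boils down to stamping the real peer address over anything the client sent; a minimal sketch (the upstream name is made up):

location / {
    # overwrite any client-supplied value with the actual TCP peer address
    proxy_set_header CF-Connecting-IP $remote_addr;
    proxy_set_header X-Forwarded-For $remote_addr;
    proxy_pass http://elixire_backend;  # hypothetical upstream name
}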

Violet also found a potential vulnerability that an attacker who knew our IP addresses could abuse, depending on how our setup was configured behind the scenes. However, this wasn't an issue that affected our specific setup, as we have IP whitelists in place and don't have elixi.re as the default route on the routing server. Specifically, we only allow CF IPs on our routing server, and only the routing server on the server that runs our elixire instance.

While the main instance at elixi.re is now safe from all these security issues, we recommend that anyone running elixire on their own server upgrade to the latest master, deploy IP whitelists if they don't already have them in place (here's Cloudflare's IP list), and not use elixire as the default route (as an attacker may point your IP at their own Cloudflare account to bypass the whitelists). If you have a short list of domains, consider enabling Authenticated Origin Pulls.

We'd like to publicly thank Violet for helping us make elixire safer.

You may find/contact Violet at:

 

from LavaTech Blog

Hello everyone, a3.pm XMPP Services had several (unannounced) downtimes today as I made some changes to a3.pm.

  • I've updated the server from ejabberd 20.03 to 20.04, released 4 days ago.
  • I've enabled STUN/TURN to allow video/audio calls. This is somewhat big. Here's the relevant issue, and a number of links explaining this. You may use this feature with Conversations. I haven't tested it yet, so if you do manage to try it out, please let me know whether it works via email or XMPP at [email protected]
  • I synced the configuration on the server with the one in the repository. The actual settings were the same, but their locations weren't, and each was missing notes etc. from the other. I've synced these, mostly from the repo to the server. You can find the relevant commit here.
  • I've open sourced our MAM wiping systemd service/timer.
  • We've updated the SSL certificate. This is a downtime we have to face roughly every 80 days.

Thanks as always for using our services, Ave

(Tags to jump to other related posts: #a3pm)

 

from LavaTech Blog

Hey all,

As the title says, we used to be a Manjaro ARM mirror for several months, but exactly 4 weeks ago Manjaro ARM became a part of Manjaro's regular mirror system. This was rather big and honestly amazing news for Manjaro ARM.

It did however mean that our mirror would be deprecated.

We thought of becoming a regular mirror, but couldn't find easy resources on how to become an official one quickly, and I (ave) was quite busy at the time, so it got kind of ignored.

Well, that changes today! We're now an official Manjaro mirror. Technically it was ready yesterday, but it got approved today, and we only switched to the manjaro.org upstream today (that's the equivalent of a T2 Arch repo going to T1).

You shouldn't need to do much and it should get activated automatically based on ping, but if you want to add it manually, our mirror is at https://manjaro.mirrors.lavatech.top/
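
If you do want to pin it manually, my understanding is that it comes down to an entry in /etc/pacman.d/mirrorlist (which pacman-mirrors may regenerate later); a sketch, assuming the stable branch and the usual Manjaro mirror layout:

Server = https://manjaro.mirrors.lavatech.top/stable/$repo/$arch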

Signed, LavaTech Team

(Tags to jump to other related posts: #manjaro #mirror)

 

from Ave's Blog

Superonline, aka SOL, aka Turkcell Superonline, aka AS34984 is one of the largest ISPs in Turkey.

One of the ads from the ISP, modified to say oof

I've been using their 100/5Mbps unlimited fiber service (their highest-end plan, other than the 1000Mbps one that has its own listing and costs 1000TRY/mo) for over a year now.

Let me tell you: I suffered a lot. Anything from random internet cuts to constant network-wide slowdowns whenever we watched anything on Netflix. I was constantly spammed with calls trying to sell me Turkcell TV+ (even after I told them countless times that I don't watch TV) and, starting roughly 5 months before my contract expired, trying to sell me expensive and lengthy contract renewals*.

And even when it worked, it wasn't as fast as promised, at least over WiFi (5GHz):

Speedtest.net showing 45/20

Meet the routers

Huawei HG253

When I first got my home internet, I was given a Huawei HG253, a rather bad router: no 5GHz WiFi, horrible DHCP (you can't even set static assignments), etc.

Apparently this is a rather hated router according to bad internet forums (yes, I called donanimhaber bad, bite me).

Back then I set up a pihole instance at home just to deal with the DHCP issues (and ofc, also to block some ads).

All in all, this is how it looked (before I did cable management; haha, I never did):

Huawei HG253

Thankfully though, the HG253 had a security vulnerability that ended up in my favor: it sent the PPPoE password to the client in the UI and merely marked the field as a password field. You can literally just check the page source and read the PPPoE password. Back then I realized this and noted down the credentials (more on this later).

The HG253 had at least one public firmware (link to my HG253 files archive, including a user guide and firmware), and had SSH enabled.

I extracted this firmware and pretty much just explored it Back Then™, but found nothing too interesting. I think I found some hashed credentials but never bothered with hashcat-ing them. SSH was also out of the question: it was ancient, and even when I forced older ciphers it liked to error out, so I couldn't get very far with it.

I don't remember exactly what happened to this router, but IIRC it just died one day, and upon calling the support line, they replaced it with a...

Huawei HG255S

The HG255S, my current ISP router, is a fairly decent router compared to the HG253 and to other ISP routers I've used so far: it has 5GHz WiFi (but it sucks, you saw the speedtests earlier), decent DHCP (after the HG253 it felt nice to have), 3G modem support, built-in SIP and DECT, a USB port with Samba and FTP support, etc.

Huawei HG255S

However, as you may expect, most of these features are either locked down or behind a paywall. I'd honestly love to be able to modify the SIP settings so that I can have a DECT network at home that connects to my SIP network, but SOL only allows buying phone service from them, and the SIP settings menu is removed from the UI. More on all this later; this is what finally brought me to the point of replacing the router.

I still kept my Pihole install with this setup in order not to lose my DHCP and DNS data if my ISP ever swapped my router again, and at that point, I was already doing a bunch of other odd stuff on that Pi anyway (like running openhab2).

“So just replace the router”

Well... Superonline doesn't allow you to replace their router if you're a fiber customer. The PPPoE credentials are not given to you even if you ask for them unless you're an enterprise customer (Relevant page for enterprise customers).

They hate the idea of you replacing the router. Whenever I call the support line with a technical problem they ask if my router is the one they gave or not.

There's literally no technical reason for this that I can see; it's all red tape: the fiber cable doesn't even plug into the router, they give you a free GPON terminal:

Huawei HG8010 GPON

The fiber cable goes into that and terminates as a female RJ45 port, which then gets plugged into the WAN port on their router. After that, it's just PPPoE.

I've previously looked into getting an inexpensive router that can run DD-WRT or OpenWRT to plug into the ISP router (limiting the ISP router to just serving the DD-WRT/OpenWRT router instead), but the things I found were either incredibly high end or simply unavailable. I ordered a router that could run OpenWRT a couple of months ago, and the order got canceled with the excuse that they didn't actually have any left. I gave up.

The straw that broke the camel's back

A couple of weeks back, I was looking into messing with the HG255S again, mostly to figure out how I could get my own SIP stuff running on it so that I wouldn't have to worry about the horrible SIP implementation on my Cisco phone, and so that I could free an Ethernet port.

While doing my usual scouring to find any new information, I stumbled upon this specific post on a bad Turkish forum, where someone mentioned running OpenWRT on the Xiaomi Mini router and asked whether moving to that would get them better performance. I quickly checked N11 (the Turkish Amazon, basically) and saw that there are some other Xiaomi Mi Routers available, specifically the Mi Router 4 and 4A (Gigabit Edition). I checked their OpenWRT compatibility, and after seeing that they're supported, I ordered a 4A for merely 230TRY.

I considered getting something better that costs more, but due to COVID-19, I am trying to lower my expenses.

I also went ahead and dropped ~120TRY for a bunch of different programmers to have around, lol.

More on the Mi Router 4A

It's a Xiaomi Mi Router (to be called MiR) 3Gv2, where 3Gv2 is just the 3G, but worse. If you can get an actual 3G, go ahead; sadly though, they're not available in Turkey. It has 3 gigabit Ethernet ports, one of which is for WAN, plus 2.4GHz and 5GHz WiFi.

It has support for OpenWRT snapshots, though that support has been broken for over a week now as part of the move to kernel 5.4. I talk more about this later.

It runs their own OpenWRT fork called MiWiFi:

MiWiFi

MiWiFi is fairly decent and, honestly, pretty usable by default. However, as you might expect, it's not very extensible. I wanted to use WireGuard with this router, and MiWiFi simply didn't offer that (though it did have built-in PPTP and L2TP). I also have some privacy concerns with Xiaomi due to the amount of telemetry my Xiaomi Mi phone sends.

There are two ways of getting proper OpenWRT on it:

The physical way

You can go the physical way by opening up the device, dumping the SPI flash, changing the U-Boot parameters, then flashing it back.

This is safer, as you have a known-good point to recover to if you somehow manage to softbrick. In the end though, there are also people who have posted their own images on the Internet (which will change your MAC address, btw; you'll need to edit your MAC back if you flash those images).

For the record, I was unable to successfully dump the SPI flash myself, as I couldn't get the programmer to see it. I also couldn't find enough information on several parts of this process before I could even attempt it, so here are some tips:

Software (OpenWRTInvasion)

The other option is to take the lazy approach and use the software exploit, OpenWRTInvasion. This is what I ended up doing.

FWIW, to get the stok (session token), open the panel (http://192.168.31.1) and log in. It will be in the URL:

stok
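
The URL looks roughly like this (the token here is made up, and the exact path may vary between MiWiFi versions):

http://192.168.31.1/cgi-bin/luci/;stok=a1b2c3d4e5f6a7b8/web/home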

OpenWRT on MiR 4A time

Shortly after it arrived, I ended up installing a build from the OpenWRT forum, as the latest builds reportedly soft-bricked the device. I spent the day setting it up and learning how to use LuCI (the web UI) and OpenWRT.

Sadly though, I realized shortly after that I wouldn't be able to run WireGuard on it for some time, as:

  • MiR 4A doesn't have stable releases, just snapshots.
  • The build I installed was an unofficial build (I later tried another build and it was one too).
  • Snapshots do not have packages for older versions (except kmods, but obviously only for official builds; I tried force-installing one with a matching kernel version, but it obviously didn't work, as the symbols couldn't be matched).
  • The OpenWRT image builder uses the latest packages from the repo.
  • Official snapshots do not get archived, which means that I couldn't switch to an official version.

So a couple of days later, I decided to make my own build. Being scared of bricking my router (even if I could recover from it, I didn't want the hassle), I spent hours trying to find which commit was the last safe one, and then realized that the version I was running included a git hash in the version string. Oops. I ended up going with that one.

So I set up an OpenWRT build environment and built it for the first time, and while praying to the tech gods that it wouldn't end in a brick, I flashed it.

And it worked... though it was missing LuCI and a bunch of other packages, as I had compiled them as modules, not as mandatory. Apparently, module means that you just get the ipks, while mandatory means that you get the ipks AND the package gets built into the image.
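
If I understand the buildroot right, this corresponds to <M> versus <*> in make menuconfig, which ends up in the build's .config like this (luci is just an example package name; you'd have one line or the other, not both):

# module: only the .ipk gets built
CONFIG_PACKAGE_luci=m
# mandatory/built-in: the .ipk gets built AND baked into the image
CONFIG_PACKAGE_luci=y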

I SSH'd in and installed the LuCI packages I had compiled (it was painful, it's like 10 packages), then did another build with everything set as mandatory.

And sure enough, it worked! I quickly posted my build and talked about my success in the OpenWRT forum.

All basic functionality worked as expected AND it had the wireguard kmod, so I could call it a success, right? Well, no.

I just couldn't get WireGuard to work: it showed as connected on the router, but when I checked on the peers, the router never showed up. I had never used OpenWRT before, so I had no idea whether I was doing something wrong or not; I simply noted that down in the forum post and moved on.

The next day though, someone who's an OpenWRT dev posted about a patch they proposed to fix the issue on master. I quickly applied the patch, improved the set of packages I included, compiled, flashed, confirmed that it worked, and posted a build to the forum.

I had to reset the settings to get it to work due to these DTS changes, and after reconfiguring, I was happy to see that WireGuard actually worked... mostly.

While it did work for IPv4, IPv6 just kept not working. The same happened when I tried 6in4 too, which is rather annoying, as I've been wanting IPv6 at home for some time. I think IPv6 is just broken somehow. I'll dig into it more later.

Edit, a couple of days later: IPv6 on the router was okay; however, there were two issues:

  • The server I was WireGuarding to ended up having constant issues due to its upstream, leading to IPv6 downtime for some time (without me realizing it, oops).
  • I had no idea how to properly distribute an IPv6 block to the LAN with WireGuard, and I still don't. Yell at me with instructions here.

Anyhow, I got it working. See the conclusion for more details.

This is mostly where things stand right now. A modified version of the proposed patch was merged into master, and I also posted a build including it, but there's not much noteworthy there; nothing else in the build was changed.

Extracting SOL's PPPoE creds

And as promised, what you came for: PPPoE magic.

Well, first of all, I tried using the PPPoE credentials I had extracted from the HG253, but they didn't work. They'd probably still work if I still had the HG253, but the password presumably got changed when my router was swapped for the HG255S. That's all there is to the “more on this later”. Yep.

There are guides out there on how to extract the credentials, but they're mostly aimed at people who aren't familiar with Linux, which makes them helpful for that audience but a waste of time for those who are. Some are better, though IMO they could still be improved.

Here's my take on it:

  • Log into your router, find the PPPoE username. It should look like this: [email protected]. Note it down.
  • Install rp-pppoe

On Debian-based distros: # apt install pppoe

On Arch-based distros: # pacman -S rp-pppoe

  • Edit /etc/ppp/pppoe-server-options

Change the contents to:

# PPP options for the PPPoE server
# LIC: GPL
require-pap
login
lcp-echo-interval 10
lcp-echo-failure 2
show-password
debug
logfile /var/log/pppoe-server-log
  • Edit /etc/ppp/pap-secrets

Change the contents to (replace REPLACETHISWITHYOURUSERNAME with your username):

# Secrets for authentication using PAP
# client        server  secret                  IP addresses
"REPLACETHISWITHYOURUSERNAME" * ""
  • Create the log file for rp-pppoe: # touch /var/log/pppoe-server-log; chmod 0774 /var/log/pppoe-server-log
  • Find your Ethernet interface with ip a. Mine is enp3s0; it's what I'll use in the commands below, so replace it with your own.
  • Shut down your router, plug a cable into its WAN port, and plug the other end into your computer.
  • Run # pppoe-server -F -I enp3s0 -O /etc/ppp/pppoe-server-options in a terminal, replacing enp3s0 with your own interface.
  • Run # tail -f /var/log/pppoe-server-log in another terminal.
  • Turn on your router and wait a little until you see lines like this:
rcvd [PAP AuthReq id=0x7 user="[email protected]" password="no"]
sent [PAP AuthNak id=0x7 "Session started successfully"]
PAP peer authentication failed for [email protected]
sent [LCP TermReq id=0x2 "Authentication failed"]

and

  script /usr/bin/pppoe -n -I enp3s0 -e 7:no:no:no:no:no:no -S '', pid 4767
Script /usr/bin/pppoe -n -I enp3s0 -e 7:no:no:no:no:no:no -S '' finished (pid 4767), status = 0x1

Take the password from the first block, and the MAC address from the second one (ignore the 7: or whatever number at the start).

Now you have everything you need to replace your SOL router.

Finally: Replacing the ISP router with a MiR 4A

This is the simple part.

Plug the cable from the GPON to your router.

Log into LuCI, edit WAN (and disable WAN6), change the protocol to PPPoE, and put the username and password we got earlier into the PAP/CHAP username and password fields, like this:

PPPoE settings

Then save and apply.

SSH into your router, edit /etc/config/network, and find config interface 'wan'. Add a line to it (with proper indentation) with something like option macaddr 'no:no:no:no:no:no', replacing no:no:no:no:no:no with the MAC address we found earlier.
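
The end result would look something like this, keeping whatever other lines were already in the section (the username, password and MAC here are the placeholders from earlier, not real values):

config interface 'wan'
    option proto 'pppoe'
    option username '[email protected]'
    option password 'no'
    option macaddr 'no:no:no:no:no:no'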

Then finally run service network restart, and you'll be free from the curse that is Superonline's ISP routers.

In conclusion

My wifi speeds are MUCH better now :)

97/20 speedtest

And I can connect to our internal network without needing to VPN on the device itself :D

Me accessing a server on edgebleed

Soon I'll even be able to have IPv6 at home :P

Edit: I even have IPv6 at home now, thanks to linuxgemini :3

IPv6 test showing IPv4 from SOL and IPv6 from Lasagna Ltd

Also: Capitalism is a failure, and free market ideologies are a joke. You don't get companies competing for cheaper prices, better service and fewer restrictions; you get companies all limiting their customers and fucking them over in different ways. I am forced to use SOL because VodafoneNet and TT both have a contract minimum of 2 years, TT is unreliable AF, TurkNet Fiber is unavailable in 99% of Turkey (including where I live), and everyone else is just a reseller.

Bonus vent

*: I constantly turned down their offers, as they were all worse than what I was already getting, or slower than 100Mbps. I was also lied to: they claimed fees would go up after the new year due to BTK, which was simply wrong, as they still sell the same plan for the same price my contract started at. I ended up calling them 2 weeks before my contract's expiry date and telling them exactly what I wanted (100Mbps, with a contract no longer than a year); they came up with a 15-month 100Mbps plan at 135TRY for the first 6 months, then 160TRY for the next 9. I kinda hesitated about the 15-month thing, but I said meh and agreed to it.

 

from Ave's Blog

I'm not the fastest typist, and I don't really use 10 fingers (I tend to use 7-8), but in general, I try to minimize the number of keypresses I make. This means that I use shortcuts and dedicated keys as much as I can. One example of this (that involves the delete key) is how I press delete instead of right arrow plus backspace.

And ever since I got my Pinebook Pro, I've felt the distinct lack of a delete key.

What's worse is the fact that in place of a delete key there was a power key, one that, once tapped, either showed a power menu or shut off the PBP, depending on the DE:

Power button on PBP's keyboard, in place of delete key, at top right corner of keyboard, as an actual keyboard key (official image from Pine64 Store, modified with circle around power button)

One of the first things I did after installing Manjaro ARM was disable the power button's system shutdown behavior in /etc/systemd/logind.conf by setting HandlePowerKey=ignore (and restarting systemd-logind, which, FYI, kills your X session).
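
For reference, the whole change is one line:

# in /etc/systemd/logind.conf, under the [Login] section:
HandlePowerKey=ignore

followed by # systemctl restart systemd-logind to apply it (which is the part that kills the X session).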

Later on, to actually get it to work as a delete key, I reused a trick from long ago: I got the keycode from xev and mapped it to delete with xmodmap.
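
A sketch of that trick; the keycode below is a made-up example, read the real one from xev's output when you tap the key:

$ xev | grep keycode  # tap the power key, note the keycode it prints
$ xmodmap -e 'keycode 124 = Delete'  # replace 124 with the keycode you saw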

This wasn't perfect by any means: it had some delay, and some software like GIMP just ignored it (which made image editing a lot more painful).

Then the project working on improving the keyboard and touchpad ended up releasing an update, one that allowed people to make their own keymappings.

I saw this while at work and put down a note for myself:

The original note from February 9

I've been meaning to put aside some time to try and implement this behavior in the firmware itself, but I just couldn't find the time or the energy.

Until today.

The setup

I don't have much of a story to tell tbh. I cloned the repo, downloaded the requirements, compiled the tools.

I compiled and installed firmware/src/keymaps/default_iso.h (by following the instructions in firmware/src/README.md) just to see whether it works; it did, so I continued on.

After setting up this new firmware, I did notice that some functionality worked differently though, such as:

  • numlock didn't turn ha3f 6f the 2eyb6ard 5nt6 a n40-ad (numlock didn't turn half of the keyboard into a numpad), but simply allowed the numpad area on the keyboard to be used with fn keys, which is a much better way of doing things.
  • Fn+F3 no longer pressed p. p.
  • Keyboard/Touchpad name changed from the actual part name to “Pine64 Pinebook Pro”, breaking my xinput set-prop settings. Simply renaming the device on the commands fixed this.
  • Sleep button combination (Fn+Esc) did not work (I don't use this combination, but the fact that it had the image on the keyboard and worked prior to the flashing bothered me).

The tinkering

I copied the file default_iso.h to ave_iso.h and tried to figure out how it's structured. I looked for the power button, and couldn't find it.

There was this vague keyboard-shaped array with key mappings, and while I did get how one half of them worked, I couldn't understand how the other half did:

The keyboard shaped array

Well, I dug into the codebase for a couple of hours, trying to figure everything out, and it finally made sense.

FR, FS and FK map into the fns_regular, fns_special and fns_keypad arrays in the same file, respectively. This is all explained in firmware/src/README.md.

The number given as an argument (such as the 6 in FR(6)) is the index into said array.

An example entry of REG_FN(KC_Z, KC_NUBS) means that the default action is KC_Z, while the action when Fn is held down is KC_NUBS.

KC means keycode, and these are mapped in firmware/src/include/keycodes.h. Do note that not all descriptions are correct in practice though; one example is that KC_SYSTEM_POWER says 0xA5, but 0xA5 is actually used for brightness up (I explain why this is the case later).

The R() entries used on the rest of the keyboard are “regular” keys, ones that have no Fn actions. They're directly passed on to their KC_ versions.

If you hate yourself, you can also supply plain integers in place of any of the aforementioned functions, and anywhere you see a KC_; this did help when I was trying to understand how things work.

FK can only be used with Fn keys when numlock is on. I'm not exactly sure what the difference between FR and FS is outside of semantics. (Looking at my own PR, I regret using FR instead of FS, as I'm not fitting the semantics properly. Functionality seems the same though.)
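
To make the structure concrete, here's a hypothetical one-key tweak in a copied keymap like ave_iso.h (whether KC_DELETE exists under that exact name in keycodes.h is an assumption on my part):

REG_FN(KC_Z, KC_DELETE), /* Z normally, Delete when Fn is held; the stock entry maps Fn+Z to KC_NUBS */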

I ended up implementing the sleep button combination, and I learned a lot about keyboards while trying to figure out how I could even emulate the power button. I have some links that I used during my adventure at the bottom of this article. I sent a PR with that patch and it got merged.

The realization

After asking around in the Pine64 #pinebook channel, I was told by a helpful person that the power button is wired directly to the SoC, and that the SoC sends the power key input itself (or rather, that this input is handled by the device tree in the Linux kernel and turned into an actual emulated keypress).

Most importantly, however, they said that it could be remapped with udev. Now, I had only used udev rules to date, and this got me rather confused, as I had no idea how one would remap anything with those. That got me to research how to do it, and I learned about a tool I had never used before: evtest.

And sure enough, I found it:

gpio-key-power on evtest's device list

Upon picking gpio-key-power and hitting the key, I immediately saw the keypress (this image was taken after the change, so it says KEY_DELETE; before the change it used to say KEY_POWER):

Power key press event on evtest

Upon more research, I learned how to write udev hwdb entries, not rules. I also found an already-existing hwdb file at /etc/udev/hwdb.d/10-usb-kbd.hwdb, which explained why the KC_SYSTEM_POWER key was mapped to brightness up: because the hwdb was set up that way. For reference, here's what it looks like:

evdev:input:b0003v258Ap001E*
  KEYBOARD_KEY_700a5=brightnessdown
  KEYBOARD_KEY_700a6=brightnessup
  KEYBOARD_KEY_70066=sleep

This also explained why KC_POWER caused a sleep action and not a power key action when sent through the built-in keyboard (but not through the dedicated power button).

The ending

I quickly wrote a hwdb file myself at /etc/udev/hwdb.d/20-power-button.hwdb:

evdev:name:gpio-key-power*
  KEYBOARD_KEY_0=delete

And upon rebuilding the hwdb with # systemd-hwdb update and triggering it with # udevadm trigger /dev/input/event2, the power button started working as a proper delete key.

evtest saw it as KEY_DELETE, the delay when tapping it rapidly vanished, and stuff like GIMP started to acknowledge it. Now I just need to avoid holding it down.

Handy resources

 