Quotes and sources

God I love Yahtzee’s game reviews. Especially for choice games like Ride to Hell Retribution 1%, Aliens: Colonial Marines and Amy.

  • I got shit to doooo! Ride to Hell Retribution
  • The entire QA team simultaneously resigned to start a shotgun-tasting business. Ride to Hell Retribution
  • Because: fuck you! Ride to Hell Retribution
  • If a game is bad it’s usually because not enough people cared; not because development was forming a murder-suicide. Best and worst of 2012
  • In one of its many dalliances with total fucking pointlessness. Soul Calibur IV
  • A few seconds earlier would have been expedient deary! Siren Blood Curse
  • It’s all right, you can swear on the internet. Your mum probably isn’t going to read it. I know because she’s too busy being fucked by me. Mailbag Showdown
  • The play time is shorter than a documentary about French war heroes. Tomb Raider: Underworld
  • A lesson who could well have been learnt by those people who lived next door to Auschwitz and thought all that smoke came from an unusually screamy pie-factory. Lego Indiana Jones
  • Random documents and audio logs! Song
  • Knifing people with sideburns. That’s ‘Knifing people who have got sideburns’ rather, the alternative would be absurd. E3 2012
  • More shaky-cam footage than a Paul Greengrass film being projected onto a fat jogger’s tits. FFXIII
  • Badger watcher with anger-management issues. Sniper Elite V2
  • I remember it being in God of War III, Shattered Dimensions, The Force Unleashed Two, Wet, Wolverine… That’s the game Wet and the game Wolverine, not a game about a wet wolverine, no such thing exists. Dead Space 2

Other carefully crafted burns from Youtube-celebs:

  • Vinny didn’t entirely enjoy No Man’s Sky. Its creator got that slightly wrong but Vinny was kind enough to clarify, which made it into his 2019 retrospective.
  • It’s good to know that Civvie11 wants to be one of my people. Otherwise his Quake 4 review was kind of… Quake 4.
  • Rifftrax comment on seeing Lycan colony: I saw Manos the Hands of Fate and thought, how could it get any worse? Then I saw Birdemic and thought, how could it get any worse? Then I saw Feeders and thought, how could it get any worse? Then I saw Suburban Sasquatch and thought, how could it get any worse? Then I saw Lycan Colony and thought, oh. That’s how.
  • Arin has plans for Sonic ’06: Well then… I’ll better pack my suitcase… ’cause we’re going straight to hell!
  • Somewhere, if you reach deep into Mike Matei, you will find a smidgeon of humanity! But for now we can only see the Beast! Sonic ’06 part 25
  • What can we say that’s PC here? … Because I don’t think ‘Retard-child’ will go over well. Sonic ’06 part 26
  • These two girls they make quite a pair. They both come from your worst nightmare. They will haunt your soul forever, and now every time you see pink you’re gonna think: we’re doomed. They are agents of Satan… MST3K 0421 – Monster A Go-Go
  • Uh-oh, I think we’re looking at a Pinky Promise Protocol here fellas… I uncovered a billion dollar fraud
  • Trees! The Horrible World of Kinect Games – Caddicarus

Notes:

“Beautiful ladies who want to meet me don’t need an appointment.”
Scrooge McDuck – E049 – Raiders of the Lost Harp – 12:45

Like Wet, Wolverine… That’s the game Wet and the game Wolverine, not a game about a Wet Wolverine, no such thing exists.

For Brain Rose. Coffeezilla

“Let’s say good-bye to the bullshit van! ‘Good-bye bullshit van!’ [ He waves theatrically to a spot in the middle distance ] Now, it’s gone! Now I won’t bullshit you and you don’t get to bullshit me.”
Sean Anderson – Differently Morphous – Chapter seventeen – 01:28

Bahnhof service

Bahnhof needs to do some service so we’re running internet via Telia 4G for a few hours. I took the opportunity to get Galera on galera02 going again after I used that node for VPN experiments. It took some time since I confused backend0X with galera0X. I couldn’t quite figure out why mariadbgalera was down on all nodes… My monitoring system explained the discrepancy…

Turns out Zabbix has a bunch of stuck queries in Galera so I guess the max_user_connections warning makes sense. I used this loop to kill ongoing queries quickly and now it works:

for PD in $(mysqladmin processlist | awk '$2 ~ /^[0-9]+$/ {print $2}'); do mysqladmin kill "$PD"; done
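The loop above kills every connection it finds. A minimal sketch of a more selective variant, assuming the usual ASCII table layout of mysqladmin processlist with Id, User, Command and Time columns: it only picks queries from one user that have been running longer than a threshold, and prints the kill commands instead of running them. The sample table, the threshold and the function names are made up for illustration.

```python
# Hypothetical sample shaped like `mysqladmin processlist` output; not real data.
SAMPLE = """\
+----+--------+-----------+--------+---------+------+--------------+------------------+
| Id | User   | Host      | db     | Command | Time | State        | Info             |
+----+--------+-----------+--------+---------+------+--------------+------------------+
| 11 | zabbix | 192.0.2.5 | zabbix | Query   | 912  | Sending data | SELECT 1         |
| 12 | zabbix | 192.0.2.5 | zabbix | Sleep   | 3    |              |                  |
| 13 | root   | localhost |        | Query   | 0    | starting     | show processlist |
+----+--------+-----------+--------+---------+------+--------------+------------------+
"""


def parse_processlist(text):
    """Turn the ASCII table into a list of dicts keyed by the header row."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +----+ border lines and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
        else:
            # zip() drops surplus cells, so a "|" inside the Info column
            # cannot shift the Id/User/Command/Time columns before it.
            rows.append(dict(zip(header, cells)))
    return rows


def stuck_ids(rows, user, min_seconds):
    """Ids of queries owned by `user` that have run for at least `min_seconds`."""
    return [
        int(row["Id"])
        for row in rows
        if row["User"] == user
        and row["Command"] == "Query"
        and row["Time"].isdigit()
        and int(row["Time"]) >= min_seconds
    ]


if __name__ == "__main__":
    for pid in stuck_ids(parse_processlist(SAMPLE), "zabbix", 60):
        print(f"mysqladmin kill {pid}")
```

Printing the commands first makes it possible to eyeball the list before anything is actually killed.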

VPN

I got an Ubuntu VPN up and running with my Android phone before but I wanted it to work with my Rocky 9.2 install and now it works. Server conf:

conn roadwarriors
    ikev2=insist
    fragmentation=yes
    left=%any
    leftsubnet=192.168.0.0/21
    leftcert="IPsec client cjp"
    leftid=%fromcert
    right=192.168.2.72
    # trust our own Certificate Agency
    rightca=%same
    # pick an IP address pool to assign to remote users
    rightaddresspool=192.168.4.1-192.168.4.20
    # if you want remote clients to use some local DNS zones and servers
    modecfgdns="192.168.0.220, 192.168.0.1"
    modecfgdomains="incandescent.tech"
    rightcert="IPsec server cert"
    authby=rsasig
    auto=add
    # kill vanished roadwarriors
    dpddelay=1m
    dpdtimeout=5m
    dpdaction=clear

I’ve imported the CA, the server cert and the client cert:

ipsec import dualca.p12
ipsec import ipsecserver.p12
ipsec import cjpipsec.p12

[root@runner02 ~]# ipsec trafficstatus
006 #3: "roadwarriors"[2] 83.191.105.10, type=ESP, add_time=1698601747, inBytes=0, outBytes=0, maxBytes=2^63B, id='CN=cjp', lease=192.168.4.1/32
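The trafficstatus line above shows a lease but zero bytes in both directions, which may be a hint that the tunnel is up while nothing is actually passing through it. A minimal sketch that pulls the key=value fields out of such a line, assuming the format stays as shown; the function names are made up for illustration.

```python
import re

# The status line from above, with plain ASCII quotes.
LINE = (
    '006 #3: "roadwarriors"[2] 83.191.105.10, type=ESP, add_time=1698601747, '
    "inBytes=0, outBytes=0, maxBytes=2^63B, id='CN=cjp', lease=192.168.4.1/32"
)


def parse_trafficstatus(line):
    """Return the key=value fields of one status line plus the connection name."""
    pairs = re.findall(r"(\w+)=('[^']*'|[^,\s]+)", line)
    fields = {key: value.strip("'") for key, value in pairs}
    name = re.search(r'"([^"]+)"', line)
    fields["conn"] = name.group(1) if name else None
    return fields


def is_idle(fields):
    """True when no bytes have been counted in either direction."""
    return int(fields["inBytes"]) == 0 and int(fields["outBytes"]) == 0


if __name__ == "__main__":
    fields = parse_trafficstatus(LINE)
    print(fields["conn"], fields["lease"], "idle" if is_idle(fields) else "carrying traffic")
```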

I can’t get forwarding to work on the server end though like I could with Ubuntu. I tried debugging it with my Ubuntu-machine on Vultr but the two just would not communicate and my Ubuntu host gave no response to ipsec listcerts. Anyway, this seems to work with RHEL now:

conn roadwarriors
    ikev2=insist
    # support (roaming) MOBIKE clients (RFC 4555)
    #mobike=yes
    fragmentation=yes
    left=%any
    # if access to the LAN is given, enable this, otherwise use 0.0.0.0/0
    leftsubnet=192.168.0.0/21
    leftcert="IPsec client cjp"
    #leftcert="IPsec server cert"
    leftid=%fromcert
    #leftxauthserver=yes
    #leftmodecfgserver=yes
    right=192.168.2.72
    # trust our own Certificate Agency
    rightca=%same
    # pick an IP address pool to assign to remote users
    rightaddresspool=192.168.9.1-192.168.9.20
    # if you want remote clients to use some local DNS zones and servers
    modecfgdns="192.168.0.220, 192.168.0.1"
    modecfgdomains="incandescent.tech"
    #rightxauthclient=yes
    #rightmodecfgclient=yes
    rightcert="IPsec server cert"
    rightsubnet=192.168.0.0/21
    authby=rsasig
    # optionally, run the client X.509 ID through pam to allow or deny client
    # pam-authorize=yes
    # load connection, do not initiate
    auto=add
    # kill vanished roadwarriors
    dpddelay=1m
    dpdtimeout=5m
    dpdaction=clear

So we have to set rightsubnet for some reason. I intend to look into that tomorrow because I don’t understand that configuration name. BTW, my Android won’t accept redirected routes if they contain a slash. Uhm… Your own example 10.0.0/8 contains a slash and without it it isn’t a proper subnet so… How is that supposed to work? Thankfully rightsubnet does the work for me it seems. Oh, here are the firewall rules I’ve added:

*filter
:INPUT ACCEPT [0:0]
:FORWARD DROP [0:0]
:OUTPUT ACCEPT [0:0]
-I FORWARD --match policy --pol ipsec --dir in --proto esp -s 192.168.9.0/26 -j ACCEPT
-I FORWARD --match policy --pol ipsec --dir out --proto esp -d 192.168.9.0/26 -j ACCEPT
COMMIT

*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
-I POSTROUTING -s 192.168.9.0/26 -o ens18 -m policy --pol ipsec --dir out -j ACCEPT
-I POSTROUTING -s 192.168.9.0/26 -o ens18 -j MASQUERADE
COMMIT

*mangle
-I FORWARD --match policy --pol ipsec --dir in -s 192.168.9.0/26 -o ens18 -p tcp -m tcp --tcp-flags SYN,RST SYN -m tcpmss --mss 1361:1536 -j TCPMSS --set-mss 1360
COMMIT

Note that we do NOT want -A because that appends the rules to the end of the existing chains. I use iptables-restore -n FILENAME to read these rules in. I’d better check tomorrow that I can read these extra rules in on boot. Then I can keep Ubuntu for Galera only.
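The two configs and the firewall rules use three different ranges, so it can help to check how they relate. A minimal sketch using Python's ipaddress module with the values from above: the first address pool (192.168.4.x) sits inside the 192.168.0.0/21 LAN range, while the second pool (192.168.9.x) sits outside it and inside the 192.168.9.0/26 range the firewall rules match. Whether that overlap had anything to do with the forwarding problem is a guess and not something established above; the helper name is made up for illustration.

```python
import ipaddress

LAN = ipaddress.ip_network("192.168.0.0/21")        # leftsubnet / rightsubnet
FIREWALL = ipaddress.ip_network("192.168.9.0/26")   # range matched by the iptables rules
OLD_POOL = "192.168.4.1-192.168.4.20"               # rightaddresspool, first config
NEW_POOL = "192.168.9.1-192.168.9.20"               # rightaddresspool, second config


def pool_within(pool, network):
    """True if both ends of a 'first-last' address pool lie inside the network."""
    first, last = (ipaddress.ip_address(part) for part in pool.split("-"))
    return first in network and last in network


if __name__ == "__main__":
    print("LAN range:", LAN[0], "to", LAN[-1])
    print("old pool inside LAN:", pool_within(OLD_POOL, LAN))
    print("new pool inside LAN:", pool_within(NEW_POOL, LAN))
    print("new pool covered by firewall rules:", pool_within(NEW_POOL, FIREWALL))
```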

Online Safety Bill

So Britain went ahead with their Online Safety Bill… I’m not entirely sure they understand how the internet works. It’s not national, it’s one giant set covering the world. You’ll easily wind up with Facebook being required by Indian law to take down a post while British law states that it can’t be taken down. So Meta would have to choose which law to break.

The most contentious part of the bill is the requirement to install government-sponsored scanning software on devices to get around end-to-end-encryption. If we assume for the time being that every phone in the UK has such scanning software, communication with anyone in the UK will have three parties: the two interlocutors and the British government.

I’m not sure how to make all UK phones carry scanning software either. If a Frenchman enters the UK, will scanning software be installed on his phone then? Normally people have control over what is installed on their phones so that seems unrealistic. This extends to British people as well. Is Google required to force government scanning software onto anyone running Android and connecting to a British telecommunications tower? Similarly Apple with iPhones? That seems like it leaves them open to lawsuits in other jurisdictions as they have installed software on someone else’s phone.

By and large I don’t think tech giants will impair their product to accommodate one country of roughly 67 million people. It will be interesting to see what Britain does in response. Blocking Google, Facebook and so on seems like a reasonable response but that will cause their islands to catch fire very soon (metaphorically speaking).

Then there’s the PR issue. The bill might have done better before Snowden but I don’t think people are very inclined to trust governments generally. Giving people end-to-end-encryption means not even the service provider can access the contents, let alone the governments. People are not keen on what amounts to government-sponsored spyware forcibly installed to get around end-to-end-encryption.

It should be noted of course that end-to-end-encryption doesn’t require Signal or WhatsApp. It can be arranged easily with gpg. Using a Yubikey is very secure but gpg with something like Qubes OS is pretty darn good as well.

The EU’s DSA is less intrusive thus far. It represents the entire bloc of roughly 450 million people and based on the GDPR there is likely to be wide adoption there. I’m no fan of the “right to be forgotten” (Google shouldn’t be required to eliminate search results if the original content is still there, it’s a half-measure to force them to hide it in search results) but tech giants have chosen to comply.

Galera again

I’ve tried out Galera on my workstation with a few VMs on VirtualBox and it has worked okay now that I overwrite the wsrep_sst_rsync script to keep lsof from running and, by extension, from using 100% of CPU capacity and never completing. I tried the latest version, MariaDB 11.0.3, to see if that removed the need to skip lsof but no, the behavior is still there. I tried Bitnami’s build of MariaDB in Docker but there were more error/warning messages than I find acceptable when I started it up.

Below is an example of me turning off galera03, galera02 and galera01 in sequence. galera03 does an IST synchronization whereas galera02 and galera01 removed the local directory for MariaDB to force SST with rsync. In reality my MariaDB storage is more like 9GB and not 600MB like this toy setup but I think it will work.

The bottom three graphs indicate the status of MySQL on the three VMs and you can see from the top graph that writes continue uninterrupted. The second graph indicates the number of bytes sent from each node. You only see a difference when galera03 is down and galera02 takes over thanks to keepalived.

I’m going to try to let Zabbix use ProxySQL next because as it stands I rely on Keepalived to move an IP around as mariadbgalera goes up and down. Then my plan is to set up new backend nodes – now with MariaDB Galera – and no virtual IP. The new Galera cluster will replicate from the current MariaDB master and then I will do a switchover. Maybe I should keep keepalived and the VIP so Grafana can access it easily. (thinking)

Well anyway, I’ll include the log of galera02 doing SST for general edification.

Oct 21 14:23:26 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:26 0 [Note] WSREP: Joiner monitor thread started to monitor
Oct 21 14:23:26 galera02.incandescent.tech docker[1565]: WSREP_SST: [INFO] rsync SST started on joiner (20231021 12:23:26.298)
Oct 21 14:23:26 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:26 1 [Note] WSREP: ####### IST uuid:00000000-0000-0000-0000-000000000000 f: 0, l: 113995, STRv: 3
Oct 21 14:23:26 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:26 1 [Note] WSREP: IST receiver addr using tcp://192.168.2.162:4568
Oct 21 14:23:26 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:26 1 [Note] WSREP: IST receiver bind using tcp://0.0.0.0:4568
Oct 21 14:23:26 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:26 1 [Note] WSREP: Prepared IST receiver for 0-113995, listening at: tcp://0.0.0.0:4568
Oct 21 14:23:26 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:26 0 [Note] WSREP: Member 2.0 (11a0c1b70499) requested state transfer from 'any'. Selected 0.0 (e541a05fe7a3)(SYNCED) as donor.
Oct 21 14:23:26 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:26 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 113996)
Oct 21 14:23:26 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:26 1 [Note] WSREP: Requesting state transfer: success, donor: 0
Oct 21 14:23:26 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:26 1 [Note] WSREP: Resetting GCache seqno map due to different histories.
Oct 21 14:23:26 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:26 1 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> 16442fed-6f68-11ee-ae8d-9b68eafda4eb:113995
Oct 21 14:23:28 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:28 0 [Note] WSREP: (a1d6cfa1-90b2, 'tcp://0.0.0.0:4567') turning message relay requesting off
Oct 21 14:23:47 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:47 0 [Note] WSREP: 0.0 (e541a05fe7a3): State transfer to 2.0 (11a0c1b70499) complete.
Oct 21 14:23:47 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:47 0 [Note] WSREP: Member 0.0 (e541a05fe7a3) synced with group.
Oct 21 14:23:47 galera02.incandescent.tech docker[1565]: WSREP_SST: [INFO] Extracting binlog files: (20231021 12:23:47.818)
Oct 21 14:23:47 galera02.incandescent.tech docker[1565]: galera03-bin.000004
Oct 21 14:23:47 galera02.incandescent.tech docker[1565]: WSREP_SST: [INFO] Galera co-ords from recovery: 16442fed-6f68-11ee-ae8d-9b68eafda4eb:113997 0 (20231021 12:23:47.848)
Oct 21 14:23:47 galera02.incandescent.tech docker[1565]: WSREP_SST: [INFO] rsync SST completed on joiner (20231021 12:23:47.850)
Oct 21 14:23:47 galera02.incandescent.tech docker[1565]: WSREP_SST: [INFO] Joiner cleanup: rsync PID=255, stunnel PID=0 (20231021 12:23:47.852)
Oct 21 14:23:48 galera02.incandescent.tech docker[1565]: WSREP_SST: [INFO] Joiner cleanup done. (20231021 12:23:48.360)
Oct 21 14:23:48 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:48 3 [Note] WSREP: SST received
Oct 21 14:23:48 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:48 3 [Note] WSREP: Server status change joiner -> initializing
Oct 21 14:23:48 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:48 3 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
Oct 21 14:23:48 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:48 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
Oct 21 14:23:48 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:48 0 [Note] InnoDB: Number of transaction pools: 1
Oct 21 14:23:48 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:48 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
Oct 21 14:23:48 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:48 0 [Note] InnoDB: Initializing buffer pool, total size = 1.000GiB, chunk size = 16.000MiB
Oct 21 14:23:48 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:48 0 [Note] InnoDB: Completed initialization of buffer pool
Oct 21 14:23:48 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:48 0 [Note] InnoDB: File system buffers for log disabled (block size=512 bytes)
Oct 21 14:23:48 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:48 0 [Note] InnoDB: End of log at LSN=306743910
Oct 21 14:23:49 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:49 0 [Note] InnoDB: Opened 3 undo tablespaces
Oct 21 14:23:49 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:49 0 [Note] InnoDB: 128 rollback segments in 3 undo tablespaces are active.
Oct 21 14:23:49 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:49 0 [Note] InnoDB: Setting file './ibtmp1' size to 12.000MiB. Physically writing the file full; Please wait ...
Oct 21 14:23:49 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:49 0 [Note] InnoDB: File './ibtmp1' size is now 12.000MiB.
Oct 21 14:23:49 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:49 0 [Note] InnoDB: log sequence number 306743910; transaction id 305297
Oct 21 14:23:49 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:49 0 [Note] Plugin 'FEEDBACK' is disabled.
Oct 21 14:23:49 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:49 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
Oct 21 14:23:49 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:49 0 [Note] Recovering after a crash using galera02-bin
Oct 21 14:23:49 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:49 0 [Note] Starting table crash recovery...
Oct 21 14:23:49 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:49 0 [Note] Crash table recovery finished.
Oct 21 14:23:49 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:49 0 [Note] InnoDB: Buffer pool(s) load completed at 231021 12:23:49
Oct 21 14:23:49 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:49 0 [Note] Server socket created on IP: '0.0.0.0'.
Oct 21 14:23:49 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:49 0 [Note] WSREP: wsrep_init_schema_and_SR (nil)
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Server initialized
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Server status change initializing -> initialized
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 3 [Note] WSREP: Recovered position from storage: 16442fed-6f68-11ee-ae8d-9b68eafda4eb:113997
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 3 [Note] WSREP: Server status change initialized -> joined
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 3 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 3 [Note] WSREP: Recovered view from SST:
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: id: 16442fed-6f68-11ee-ae8d-9b68eafda4eb:113995
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: status: primary
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: protocol_version: 4
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: final: no
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: own_index: 2
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: members(3):
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 0: 27fcd22a-7009-11ee-9a2b-5647f27ea5ae, e541a05fe7a3
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 1: 860d53c3-6f83-11ee-88b1-1309baa24770, 780dc59b17db
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2: a1d6cfa1-700c-11ee-90b2-176fa6c70192, 11a0c1b70499
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 3 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 6 [Note] WSREP: Recovered cluster id 16442fed-6f68-11ee-ae8d-9b68eafda4eb
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 3 [Note] WSREP: SST received: 16442fed-6f68-11ee-ae8d-9b68eafda4eb:113997
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 3 [Note] WSREP: SST succeeded for position 16442fed-6f68-11ee-ae8d-9b68eafda4eb:113997
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Joiner monitor thread ended with total time 24 sec
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 1 [Note] WSREP: Installed new state from SST: 16442fed-6f68-11ee-ae8d-9b68eafda4eb:113997
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] mariadbd: ready for connections.
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: Version: '11.0.3-MariaDB-1:11.0.3+maria~ubu2204-log' socket: '/run/mysqld/mysqld.sock' port: 3306 mariadb.org binary distribution
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 1 [Note] WSREP: Cert. index preload up to 113997
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: ####### IST applying starts with 113998
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: ####### IST current seqno initialized to 112986
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Receiving IST... 0.0% ( 0/1010 events) complete.
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: IST preload starting at 112986
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Service thread queue flushed.
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:112985, protocol version: 5
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: REPL Protocols: 10 (5)
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: ####### Adjusting cert position: 113015 -> 113016
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Service thread queue flushed.
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Lowest cert index boundary for CC from preload: 112986
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Min available from gcache for CC from preload: 112986
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: REPL Protocols: 10 (5)
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: ####### Adjusting cert position: 113994 -> 113995
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Service thread queue flushed.
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Lowest cert index boundary for CC from preload: 112986
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Min available from gcache for CC from preload: 112986
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Receiving IST...100.0% (1010/1010 events) complete.
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 1 [Note] WSREP: Cert. index preloaded up to 113995
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 1 [Note] WSREP: Lowest cert index boundary for CC from sst: 112986
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 1 [Note] WSREP: Min available from gcache for CC from sst: 112986
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: 2.0 (11a0c1b70499): State transfer from 0.0 (e541a05fe7a3) complete.
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Shifting JOINER -> JOINED (TO: 114023)
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Processing event queue:... 0.0% ( 0/25 events) complete.
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Member 2.0 (11a0c1b70499) synced with group.
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Processing event queue:...100.0% (26/26 events) complete.
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 114023)
Oct 21 14:23:52 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:52 1 [Note] WSREP: Server 11a0c1b70499 synced with group
Oct 21 14:23:52 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:52 1 [Note] WSREP: Server status change joined -> synced
Oct 21 14:23:52 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:52 1 [Note] WSREP: Synchronized with group, ready for connections
Oct 21 14:23:52 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:52 1 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
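A log like this is easier to digest when reduced to its state transitions. A minimal sketch, assuming the "Shifting X -> Y" wording shown above stays stable: it extracts each transition with its timestamp and measures the time from the first to the last one. On the lines quoted here that comes to 24 seconds, which agrees with the "Joiner monitor thread ended with total time 24 sec" message. The function names are made up for illustration.

```python
import re
from datetime import datetime

# Four lines copied from the log above.
LOG = """\
Oct 21 14:23:26 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:26 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 113996)
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Joiner monitor thread ended with total time 24 sec
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Shifting JOINER -> JOINED (TO: 114023)
Oct 21 14:23:50 galera02.incandescent.tech docker[1565]: 2023-10-21 12:23:50 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 114023)
""".splitlines()

SHIFT = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \d+ \[Note\] WSREP: Shifting (\w+) -> (\w+)"
)


def state_shifts(lines):
    """Return (timestamp, old_state, new_state) for every 'Shifting' line."""
    shifts = []
    for line in lines:
        match = SHIFT.search(line)
        if match:
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            shifts.append((when, match.group(2), match.group(3)))
    return shifts


def elapsed_seconds(shifts):
    """Seconds between the first and the last recorded transition."""
    return (shifts[-1][0] - shifts[0][0]).total_seconds()


if __name__ == "__main__":
    shifts = state_shifts(LOG)
    for when, old, new in shifts:
        print(when.time(), old, "->", new)
    print("elapsed:", elapsed_seconds(shifts), "seconds")
```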

PiHole downtime

Cool visualization of how PiHole 02 went down for some reason and PiHole 03 took over. PiHole 02:

PiHole 03:

As you can see these are the nodes runner02 and runner03 respectively. I’m adding this to my docker commands because there is some issue with shm, even though it might not be what happened to PiHole in this case:

docker run --shm-size=256m --name pihole --network host

The error message in question:

RAM shortage (/dev/shm) ahead: 99% used
/dev/shm: 67.1MB used, 67.1MB total, FTL uses 67.1MB
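The 67.1MB total in that message lines up with Docker's default /dev/shm size of 64 MiB when expressed in decimal megabytes. A quick bit of arithmetic, assuming the message reports decimal MB, shows what the default and the 256m setting above work out to; the helper name is made up for illustration.

```python
def mib_to_mb(mib):
    """Convert mebibytes (2**20 bytes) to decimal megabytes (10**6 bytes)."""
    return mib * 1024 * 1024 / 1_000_000


if __name__ == "__main__":
    print("Docker default, 64 MiB:", round(mib_to_mb(64), 1), "MB")
    print("--shm-size=256m:", round(mib_to_mb(256), 1), "MB")
```

So with the new setting FTL would have roughly four times the shared memory it ran out of here.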

UPS failure

Need to back up PVE nodes. /etc/pve is especially important.

Might not get a new UPS. I only had one back when I had one Solaris 11 server that didn’t like unclean shutdowns.

pve1 lost its 1000GB SSD somehow. Half of the RAID1 OS disk is worn out so it’s time to replace both it and the other OS disk which is nearing 100% wearout.

Rollout

I’m going to start rolling out new nodes to make sure my Ansible playbooks are correct and to use the right template. I thought I knew how to expand a partitioned drive without making the corresponding VM unbootable but apparently not… I connected the old workstation VM-drive to a new VM and ran testdisk. What a wonderful utility. Now I’ve made sure both Ubuntu and Rocky Linux use a partitioned drive for /boot and a separate disk for LVM and no partitioning there! So it should be expandable.

I’m giving up on Galera as it didn’t handle reboots appropriately.

It worked before and fortunately it works fine for people at work but I’m switching to pure primary/replica with my old scripts for switchover and failover. I have some software lying around that I wrote to keep track of MariaDB’s status.

MinIO node replacement

Works just fine once I set the right owner for the MinIO storage area.

NUC: /ho/cj/St$ mc admin info svea
●  backend01.incandescent.tech:9000
   Uptime: 2 weeks
   Version: 2023-08-09T23:30:22Z
   Network: 3/4 OK
   Drives: 1/1 OK
   Pool: 1st

●  backend02.incandescent.tech:9000
   Uptime: 2 weeks
   Version: 2023-08-09T23:30:22Z
   Network: 3/4 OK
   Drives: 1/1 OK
   Pool: 1st

●  backend03.incandescent.tech:9000
   Uptime: 1 minute
   Version: 2023-08-09T23:30:22Z
   Network: 4/4 OK
   Drives: 1/1 OK
   Pool: 1st

●  backend04.incandescent.tech:9000
   Uptime: 2 weeks
   Version: 2023-08-09T23:30:22Z
   Network: 3/4 OK
   Drives: 1/1 OK
   Pool: 1st

3.6 GiB Used, 10 Buckets, 73,525 Objects

Now Scylla needs to be primed so backend03 joins the fray. When backend02 is replaced I hope to get more insight into why Zabbix throws max_user_connections errors after an SST…

MariaDB Galera

MariaDB Galera was ok to start but now with me reinstalling backend03 it’s not going great.

Sep 28 20:30:49 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] Stale sst_in_progress file: /var/lib/mysql/sst_in_progress (20230928 18:30:49.649)
Sep 28 20:30:49 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:49.720)
Sep 28 20:30:50 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:50.746)
Sep 28 20:30:50 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:50 0 [Warning] WSREP: last inactive check more than PT6S ago (PT6.00067S), skipping check
Sep 28 20:30:51 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:51.786)
Sep 28 20:30:52 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:52.819)
Sep 28 20:30:53 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:53.840)
Sep 28 20:30:54 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:54.864)
Sep 28 20:30:55 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:55.893)
Sep 28 20:30:56 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:56.938)
Sep 28 20:30:57 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:57.978)
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:59.003)
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: WSREP_SST: [ERROR] previous SST script still running. (20230928 18:30:59.009)
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [ERROR] WSREP: Failed to read 'ready <addr>' from: wsrep_sst_mariabackup --role 'joiner' --address '192.168.2.83' --datadir '/var/lib/mysql/' --parent 1 --progress 0 --binlog 'backend03-bin' --binlog-index 'backend03-bin.index'
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]:         Read: '(null)'
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [ERROR] WSREP: Process completed with error: wsrep_sst_mariabackup --role 'joiner' --address '192.168.2.83' --datadir '/var/lib/mysql/' --parent 1 --progress 0 --binlog 'backend03-bin' --binlog-index 'backend03-bin.index': 114 (Operation already in progress)
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [ERROR] WSREP: Failed to prepare for 'mariabackup' SST. Unrecoverable.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [ERROR] WSREP: SST request callback failed. This is unrecoverable, restart required.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: ReplicatorSMM::abort()
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: Closing send monitor...
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: Closed send monitor.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: gcomm: terminating thread
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: gcomm: joining thread
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: gcomm: closing backend
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: view(view_id(NON_PRIM,52cd15fb-9b29,87) memb {
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]:         be038c1b-8061,0
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: } joined {
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: } left {
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: } partitioned {
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]:         52cd15fb-9b29,0
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]:         82c6ead4-bf0f,0
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: })
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: PC protocol downgrade 1 -> 0
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: view((empty))
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: gcomm: closed
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [Note] WSREP: Flow-control interval: [16, 16]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [Note] WSREP: Received NON-PRIMARY.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [Note] WSREP: Shifting PRIMARY -> OPEN (TO: 8042075)
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [Note] WSREP: New SELF-LEAVE.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [Note] WSREP: Flow-control interval: [0, 0]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [Note] WSREP: Received SELF-LEAVE. Closing connection.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 8042075)
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [Note] WSREP: RECV thread exiting 0: Success
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: recv_thread() joined.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: Closing replication queue.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: Closing slave action queue.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: mariadbd: Terminated.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 230928 18:30:59 [ERROR] mysqld got signal 11 ;
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: This could be because you hit a bug. It is also possible that this binary
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: or one of the libraries it was linked against is corrupt, improperly built,
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: or misconfigured. This error can also be caused by malfunctioning hardware.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: We will try our best to scrape up some info that will hopefully help
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: diagnose the problem, but since we have already crashed,
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: something is definitely wrong and this may fail.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Server version: 11.1.2-MariaDB-1:11.1.2+maria~ubu2204-log source revision: 9bc25d98209df6810f7a7d5e7dd3ae677a313ab5
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: key_buffer_size=0
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: read_buffer_size=131072
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: max_used_connections=0
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: max_threads=1002
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: thread_count=3
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: It is possible that mysqld could use up to
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2206964 K  bytes of memory
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Hope that's ok; if not, decrease some variables in the equation.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Thread pointer: 0x7f6acc000c68
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Attempting backtrace. You can use the following information to find out
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: where mysqld died. If you see no messages after this, something went
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: terribly wrong...
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: stack_bottom = 0x7f6ae8fb0c68 thread_stack 0x49000
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Printing to addr2line failed
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: mariadbd(my_print_stacktrace+0x32)[0x5575c97de7c2]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: mariadbd(handle_fatal_signal+0x488)[0x5575c92b7cf8]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f6af380b520]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /lib/x86_64-linux-gnu/libc.so.6(abort+0x178)[0x7f6af37f1898]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /usr/lib/libgalera_smm.so(+0x156812)[0x7f6aeb1fb812]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /usr/lib/libgalera_smm.so(+0x6f151)[0x7f6aeb114151]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /usr/lib/libgalera_smm.so(+0x6bdb4)[0x7f6aeb110db4]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /usr/lib/libgalera_smm.so(+0x8a5b1)[0x7f6aeb12f5b1]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /usr/lib/libgalera_smm.so(+0x5f690)[0x7f6aeb104690]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /usr/lib/libgalera_smm.so(+0x47611)[0x7f6aeb0ec611]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: mariadbd(_ZN5wsrep18wsrep_provider_v2611run_applierEPNS_21high_priority_serviceE+0x12)[0x5575c989d3a2]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: mariadbd(+0xd5f191)[0x5575c9571191]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: mariadbd(_Z15start_wsrep_THDPv+0x26b)[0x5575c955f15b]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: mariadbd(+0xcd1906)[0x5575c94e3906]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /lib/x86_64-linux-gnu/libc.so.6(+0x94b43)[0x7f6af385db43]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x44)[0x7f6af38eebb4]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Trying to get some variables.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Some pointers may be invalid and cause the dump to abort.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Query (0x0): (null)
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Connection ID (thread ID): 2
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Status: NOT_KILLED
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off,hash_join_cardinality=on
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: information that should help you find out what is causing the crash.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: We think the query pointer is invalid, but we will try to print it anyway.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Query:
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Writing a core file...
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Working directory at /var/lib/mysql
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Resource Limits:
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Limit                     Soft Limit           Hard Limit           Units
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max cpu time              unlimited            unlimited            seconds
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max file size             unlimited            unlimited            bytes
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max data size             unlimited            unlimited            bytes
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max stack size            8388608              unlimited            bytes
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max core file size        0                    0                    bytes
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max resident set          unlimited            unlimited            bytes
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max processes             unlimited            unlimited            processes
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max open files            1073741816           1073741816           files
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max locked memory         8388608              8388608              bytes
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max address space         unlimited            unlimited            bytes
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max file locks            unlimited            unlimited            locks
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max pending signals       14479                14479                signals
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max msgqueue size         819200               819200               bytes
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max nice priority         0                    0
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max realtime priority     0                    0
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max realtime timeout      unlimited            unlimited            us
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Core pattern: |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Kernel version: Linux version 5.14.0-284.11.1.el9_2.x86_64 (mockbuild@x64-builder01.almalinux.org) (gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), GNU ld version 2.35.2-37.el9) #1 SMP PREEMPT_DYNAMIC Tue May 9 05:49:00 EDT 2023
Sep 28 20:30:59 backend03.incandescent.tech systemd[1]: mariadbgalera.service: Main process exited, code=exited, status=139/n/a
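
Most of that dump is shutdown chatter and the stock crash report, so the few lines that matter are easy to miss. A minimal sketch for pulling only the [ERROR] lines out of a journald capture like the one above. The function name and the prefix regex are my own and assume the "date host docker[PID]: message" layout seen in this post.

```python
import re

# journald prefix as it appears in this post:
# "Sep 28 20:30:59 host docker[PID]: message"
JOURNAL_PREFIX = re.compile(r"^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+\S+\[\d+\]:\s?(.*)$")


def error_messages(journal_lines):
    """Return the message part of every line that carries an [ERROR] tag."""
    errors = []
    for line in journal_lines:
        match = JOURNAL_PREFIX.match(line)
        # Fall back to the raw line if the prefix does not match.
        message = match.group(1) if match else line
        if "[ERROR]" in message:
            errors.append(message.strip())
    return errors


# Three lines copied from the log above.
sample = [
    "Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: WSREP_SST: [ERROR] previous SST script still running. (20230928 18:30:59.009)",
    "Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: Closing send monitor...",
    "Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [ERROR] WSREP: Failed to prepare for 'mariabackup' SST. Unrecoverable.",
]

for message in error_messages(sample):
    print(message)
```

Run against the full capture, this leaves the "previous SST script still running" line and the errors that follow from it, which is where the failure starts.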

This is with the following config:

[root@backend03 ~]# cat /etc/containers/mariadbgalera/config/*
[mariadb]
log-bin                    = ON
server-id                  = 3
log-basename               = backend03

wsrep_cluster_address      = gcomm://backend01.incandescent.tech,backend02.incandescent.tech,backend03.incandescent.tech

wsrep_cluster_name         = svealiden
binlog-format              = ROW
default_storage_engine     = InnoDB
innodb_autoinc_lock_mode   = 2
wsrep_on                   = ON
wsrep_log_conflicts        = ON
wsrep_node_address         = 192.168.2.83
wsrep_sst_receive_address  = 192.168.2.83
wsrep_provider             = /usr/lib/libgalera_smm.so
wsrep_provider_options     = ist.recv_addr=192.168.2.83;ist.recv_bind=0.0.0.0;evs.inactive_check_period=PT2S;evs.view_forget_timeout=P15M
wsrep_sst_method           = mariabackup
[mysqld]
skip-external-locking
bind-address                    = 0.0.0.0
expire_logs_days                = 4
gtid-domain-id                  = 10
character-set-server            = utf8mb4
collation-server                = utf8mb4_general_ci
innodb_buffer_pool_size         = 1G
innodb_compression_algorithm    = zlib
innodb_compression_default      = ON
performance_schema              = 1
max_connect_errors              = 1000
max_connections                 = 1000
max_user_connections            = 50
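
The wsrep_provider_options line packs several settings into one semicolon-separated string, which makes it easy to end up with an address there that disagrees with wsrep_node_address. A small sketch, with a helper name of my own choosing, that splits the string so the values can be compared. It only does string handling and makes no claim about how Galera itself parses the options.

```python
def parse_provider_options(value):
    """Split a 'key=value;key=value' provider option string into a dict."""
    options = {}
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon
        key, _, val = item.partition("=")
        options[key.strip()] = val.strip()
    return options


# Values copied from the config above.
wsrep_node_address = "192.168.2.83"
provider_options = parse_provider_options(
    "ist.recv_addr=192.168.2.83;ist.recv_bind=0.0.0.0;"
    "evs.inactive_check_period=PT2S;evs.view_forget_timeout=P15M"
)

# True when the IST receive address matches the node address.
addresses_agree = provider_options["ist.recv_addr"] == wsrep_node_address
print(addresses_agree)
```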

I’m also using a custom SST script with the initial timeout raised to one hour (3600 seconds):

    impts=$(parse_cnf sst inno-move-opts "")
    stimeout=$(parse_cnf sst sst-initial-timeout 3600)
    ssyslog=$(parse_cnf sst sst-syslog 0)

The script is deployed with Ansible:

 - name: Copy wsrep script file
   ansible.builtin.copy:
     src: wsrep_sst_mariabackup
     dest: /etc/incandescent/containers/mariadbgalera/config/wsrep_sst_mariabackup
     mode: '755'

The service is started like this:

cjp@workstation:~/incandescent.tech/roles$ cat mariadbcluster/templates/mariadbgalera.service.j2
[Unit]
Description=MariaDB Galera

[Service]
TimeoutStartSec=3600
RestartSec=20
Restart=always
ExecStartPre=-/usr/bin/docker stop mariadbgalera
ExecStartPre=-/usr/bin/docker rm mariadbgalera
ExecStart=/usr/bin/docker run --name mariadbgalera -p {{publicport}}:{{privateport}} -p {{publicportgalera}}:{{privateportgalera}} -p {{publicportist}}:{{privateportist}} -p {{publicportsst}}:{{privateportsst}} -v /etc/incandescent/containers/mariadbgalera/config/wsrep_sst_mariabackup:/usr/bin/wsrep_sst_mariabackup -v /etc/incandescent/containers/mariadbgalera/config:/etc/mysql/conf.d -v /srv/storage/mariadb/data:/var/lib/mysql  -v /srv/storage/mariadbbackups:/backup --env-file /etc/incandescent/containers/mariadbgalera/environment/mariadbgalera.env --cpu-quota=100000 --memory={{memlimit}}m "{{registryhost}}:{{registryport}}/{{image_basename}}:{{image_tagname}}" {% if bootstrap != 0 %}-- --wsrep-new-cluster{% endif %}

[Install]
WantedBy=multi-user.target

Weird. Now I’m seeing lsof taking all CPU:

top - 20:43:54 up  8:04,  1 user,  load average: 1.09, 1.31, 1.25
Tasks: 207 total,   2 running, 205 sleeping,   0 stopped,   0 zombie
%Cpu(s): 10.4 us, 13.3 sy,  0.0 ni, 71.2 id,  0.0 wa,  0.6 hi,  0.6 si,  3.9 st
MiB Mem :   3661.7 total,   1561.0 free,   1003.4 used,   1362.7 buff/cache
MiB Swap:   3584.0 total,   3583.2 free,      0.8 used.   2658.3 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 188514 systemd+  20   0    5424   1616   1372 R  92.7   0.0  10:49.62 lsof -Pnl -i :4444

Der Angriff Steiner war ein Befehl! (Steiner’s attack was an order!) OK, I’ll try to calm down…

So I went back to rsync as the SST method and yes, lsof stalls there too somehow. I changed the wsrep_sst_rsync script so that it no longer runs lsof, and now we get past that part.
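
As far as I can tell, the SST script only uses lsof to see whether something is listening on the SST port. That can be answered without lsof by reading /proc/net/tcp. Below is a minimal sketch of that idea, not what the stock script does. The function names are mine and the sample table is made up for illustration. One unverified guess about the stall itself: the crash report above shows Max open files at 1073741816 inside the container, and some lsof builds are known to crawl when the file-descriptor limit is that large.

```python
TCP_LISTEN = "0A"  # state code the kernel uses for LISTEN in /proc/net/tcp


def listening_ports(proc_net_text):
    """Return the set of TCP ports in LISTEN state from /proc/net/tcp(6) text."""
    ports = set()
    for line in proc_net_text.splitlines():
        fields = line.split()
        # Data rows start with an index such as "0:"; the header row does not.
        if len(fields) < 4 or not fields[0].endswith(":"):
            continue
        local_address, state = fields[1], fields[3]
        if state == TCP_LISTEN:
            # The port is the hex number after the last colon.
            ports.add(int(local_address.rsplit(":", 1)[1], 16))
    return ports


def is_listening(proc_net_text, port):
    return port in listening_ports(proc_net_text)


def read_proc_net(paths=("/proc/net/tcp", "/proc/net/tcp6")):
    """Concatenate the IPv4 and IPv6 socket tables, skipping missing files."""
    chunks = []
    for path in paths:
        try:
            with open(path) as handle:
                chunks.append(handle.read())
        except OSError:
            pass
    return "\n".join(chunks)


# Illustrative sample (made up): port 4444 listening, port 3306 with an
# established connection.
SAMPLE = """
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:115C 00000000:0000 0A 00000000:00000000 00:00000000 00000000   999        0 12345 1 0000000000000000 100 0 0 10 0
   1: 5302A8C0:0CEA 5102A8C0:D431 01 00000000:00000000 00:00000000 00000000   999        0 12346 1 0000000000000000 20 4 30 10 -1
"""

print(is_listening(SAMPLE, 4444))
```

On a live system the call would be is_listening(read_proc_net(), 4444), run inside the container's network namespace.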

top - 21:36:40 up  8:57,  1 user,  load average: 1.04, 0.93, 1.16
Tasks: 206 total,   1 running, 205 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.9 us,  1.7 sy,  0.0 ni, 84.1 id,  0.0 wa,  0.9 hi,  6.0 si,  6.4 st
MiB Mem :   3661.7 total,    127.2 free,   1017.9 used,   2787.6 buff/cache
MiB Swap:   3584.0 total,   3583.0 free,      1.0 used.   2643.8 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 194726 systemd+  20   0   10912   3344   2048 S   3.3   0.1   0:35.21 rsync --daemon --no-detach --port 4444 --config /var/lib/mysql/rsync_sst.conf
  26799 root      20   0 1619348  83060  51416 S   1.3   2.2   9:16.91 /usr/sbin/promtail-linux-amd64 -config.file /etc/promtail/promtail.yaml
 194291 root      20   0  719844  17620   7096 S   1.3   0.5   0:03.42 /usr/bin/containerd-shim-runc-v2 -namespace moby -id e4ccec842b4c4f20225b2daf876f55613032eb49a72e5+
     28 root      20   0       0      0      0 S   0.7   0.0   0:55.24 [ksoftirqd/2]

It’s not done yet, so it might still crash like before. Weird how strace shows a lot of SQL statements, since with rsync it is binary files that get transferred. Well, tarballs are pretty great.

Uhm, I’m starting to think rsync got stuck. These are the table sizes on backend02:

4.0K    /srv/storage/mariadb/data/zabbix/history_log.frm
28K     /srv/storage/mariadb/data/zabbix/history_log.ibd
4.0K    /srv/storage/mariadb/data/zabbix/history_str.frm
2.1M    /srv/storage/mariadb/data/zabbix/history_str.ibd
4.0K    /srv/storage/mariadb/data/zabbix/history_text.frm
1.3G    /srv/storage/mariadb/data/zabbix/history_text.ibd
4.0K    /srv/storage/mariadb/data/zabbix/history_uint.frm
328M    /srv/storage/mariadb/data/zabbix/history_uint.ibd

If we go to backend03:

4.0K    /srv/storage/mariadb/data/zabbix/history_log.frm
64K     /srv/storage/mariadb/data/zabbix/history_log.ibd
4.0K    /srv/storage/mariadb/data/zabbix/history_str.frm
15M     /srv/storage/mariadb/data/zabbix/history_str.ibd
4.0K    /srv/storage/mariadb/data/zabbix/history_text.frm
8.0G    /srv/storage/mariadb/data/zabbix/history_text.ibd
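
Eyeballing two du listings gets old quickly, so here is a rough sketch that turns them into per-file growth factors. The helper names are mine, the input is the du -h style output shown above, and only the .ibd lines are pasted in to keep it short.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_size(text):
    """Convert a du -h style size such as '1.3G' or '28K' to bytes."""
    text = text.strip()
    if text and text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def parse_du(output):
    """Map each path in du output to its size in bytes."""
    sizes = {}
    for line in output.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            sizes[parts[1].strip()] = parse_size(parts[0])
    return sizes


def growth(donor, joiner):
    """Joiner size divided by donor size for every path present on both."""
    return {
        path: joiner[path] / donor[path]
        for path in donor
        if path in joiner and donor[path] > 0
    }


BACKEND02 = """\
28K     /srv/storage/mariadb/data/zabbix/history_log.ibd
2.1M    /srv/storage/mariadb/data/zabbix/history_str.ibd
1.3G    /srv/storage/mariadb/data/zabbix/history_text.ibd
328M    /srv/storage/mariadb/data/zabbix/history_uint.ibd
"""

BACKEND03 = """\
64K     /srv/storage/mariadb/data/zabbix/history_log.ibd
15M     /srv/storage/mariadb/data/zabbix/history_str.ibd
8.0G    /srv/storage/mariadb/data/zabbix/history_text.ibd
"""

ratios = growth(parse_du(BACKEND02), parse_du(BACKEND03))
for path, ratio in sorted(ratios.items()):
    print(f"{ratio:5.1f}x  {path}")
```

Files that only exist on one side, like history_uint.ibd here, are left out of the result rather than reported as zero.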

Uhm, why is history_text 8G on backend03 when it is 1.3G on backend02? No compression? It’s the same binary running on both systems. history_str is also larger by a wide margin (15M versus 2.1M). rsync seems to have completed, but then we get the crash:

Sep 28 21:58:40 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:40 0 [Note] WSREP: 0.0 (5e5dd1122766): State transfer to 2.0 (e4ccec842b4c) complete.
Sep 28 21:58:40 backend03.incandescent.tech docker[194177]: WSREP_SST: [INFO] Extracting binlog files: (20230928 19:58:40.822)
Sep 28 21:58:40 backend03.incandescent.tech docker[194177]: backend01-bin.000018
Sep 28 21:58:40 backend03.incandescent.tech docker[194177]: WSREP_SST: [INFO] Galera co-ords from recovery: 52d278a6-50ad-11ee-8431-2ab12cc70be8:8066709 0 (20230928 19:58:40.857)
Sep 28 21:58:40 backend03.incandescent.tech docker[194177]: WSREP_SST: [INFO] rsync SST completed on joiner (20230928 19:58:40.866)
Sep 28 21:58:40 backend03.incandescent.tech docker[194177]: WSREP_SST: [INFO] Joiner cleanup: rsync PID=263, stunnel PID=0 (20230928 19:58:40.874)
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: WSREP_SST: [INFO] Joiner cleanup done. (20230928 19:58:41.415)
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 3 [Note] WSREP: SST received
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 3 [Note] WSREP: Server status change joiner -> initializing
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 3 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] mariadbd: Aria engine: starting recovery
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: tables to flush: 1 0 (0.0 seconds);
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] mariadbd: Aria engine: recovery done
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] InnoDB: Number of transaction pools: 1
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] InnoDB: Using SSE4.2 crc32 instructions
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] InnoDB: Initializing buffer pool, total size = 1.000GiB, chunk size = 16.000MiB
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] InnoDB: Completed initialization of buffer pool
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] InnoDB: File system buffers for log disabled (block size=512 bytes)
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [ERROR] InnoDB: Upgrade after a crash is not supported. The redo log was created with MariaDB 10.6.15. You must start up and shut down MariaDB 10.7 or earlier.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] InnoDB: Starting shutdown...
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] Plugin 'FEEDBACK' is disabled.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [ERROR] Unknown/unsupported storage engine: InnoDB
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [ERROR] Aborting
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 3 [ERROR] WSREP: sst_received failed: State wait was interrupted
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 1 [ERROR] WSREP: Application received wrong state:
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]:         Received: 00000000-0000-0000-0000-000000000000
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]:         Required: 52d278a6-50ad-11ee-8431-2ab12cc70be8
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 1 [ERROR] WSREP: Application state transfer failed. This is unrecoverable condition, restart required.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 230928 19:58:41 [ERROR] mysqld got signal 11 ;
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: This could be because you hit a bug. It is also possible that this binary
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: or one of the libraries it was linked against is corrupt, improperly built,
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: or misconfigured. This error can also be caused by malfunctioning hardware.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: We will try our best to scrape up some info that will hopefully help
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: diagnose the problem, but since we have already crashed,
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: something is definitely wrong and this may fail.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Server version: 11.1.2-MariaDB-1:11.1.2+maria~ubu2204-log source revision: 9bc25d98209df6810f7a7d5e7dd3ae677a313ab5
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: key_buffer_size=134217728
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: read_buffer_size=131072
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: max_used_connections=0
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: max_threads=1002
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: thread_count=2
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: It is possible that mysqld could use up to
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2338036 K  bytes of memory
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Hope that's ok; if not, decrease some variables in the equation.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Thread pointer: 0x7fe694000c68
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Attempting backtrace. You can use the following information to find out
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: where mysqld died. If you see no messages after this, something went
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: terribly wrong...
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: stack_bottom = 0x7fe6ac602c68 thread_stack 0x49000
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Printing to addr2line failed
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: mariadbd(my_print_stacktrace+0x32)[0x5610db8167c2]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: mariadbd(handle_fatal_signal+0x488)[0x5610db2efcf8]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7fe6b665c520]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /lib/x86_64-linux-gnu/libc.so.6(abort+0x178)[0x7fe6b6642898]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /usr/lib/libgalera_smm.so(+0x156812)[0x7fe6ae04c812]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /usr/lib/libgalera_smm.so(+0x6f151)[0x7fe6adf65151]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /usr/lib/libgalera_smm.so(+0x6cde1)[0x7fe6adf62de1]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /usr/lib/libgalera_smm.so(+0x8a5b1)[0x7fe6adf805b1]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /usr/lib/libgalera_smm.so(+0x5f690)[0x7fe6adf55690]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /usr/lib/libgalera_smm.so(+0x47611)[0x7fe6adf3d611]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: mariadbd(_ZN5wsrep18wsrep_provider_v2611run_applierEPNS_21high_priority_serviceE+0x12)[0x5610db8d53a2]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: mariadbd(+0xd5f191)[0x5610db5a9191]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: mariadbd(_Z15start_wsrep_THDPv+0x26b)[0x5610db59715b]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: mariadbd(+0xcd1906)[0x5610db51b906]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /lib/x86_64-linux-gnu/libc.so.6(+0x94b43)[0x7fe6b66aeb43]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x44)[0x7fe6b673fbb4]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Trying to get some variables.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Some pointers may be invalid and cause the dump to abort.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Query (0x0): (null)
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Connection ID (thread ID): 1
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Status: NOT_KILLED
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off,hash_join_cardinality=on
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: information that should help you find out what is causing the crash.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: We think the query pointer is invalid, but we will try to print it anyway.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Query:
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Writing a core file...
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Working directory at /var/lib/mysql
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Resource Limits:
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Limit                     Soft Limit           Hard Limit           Units
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max cpu time              unlimited            unlimited            seconds
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max file size             unlimited            unlimited            bytes
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max data size             unlimited            unlimited            bytes
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max stack size            8388608              unlimited            bytes
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max core file size        0                    0                    bytes
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max resident set          unlimited            unlimited            bytes
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max processes             unlimited            unlimited            processes
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max open files            1073741816           1073741816           files
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max locked memory         8388608              8388608              bytes
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max address space         unlimited            unlimited            bytes
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max file locks            unlimited            unlimited            locks
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max pending signals       14479                14479                signals
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max msgqueue size         819200               819200               bytes
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max nice priority         0                    0
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max realtime priority     0                    0
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max realtime timeout      unlimited            unlimited            us
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Core pattern: |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Kernel version: Linux version 5.14.0-284.11.1.el9_2.x86_64 (mockbuild@x64-builder01.almalinux.org) (gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), GNU ld version 2.35.2-37.el9) #1 SMP PREEMPT_DYNAMIC Tue May 9 05:49:00 EDT 2023
Sep 28 21:58:42 backend03.incandescent.tech systemd[1]: mariadbgalera.service: Main process exited, code=exited, status=139/n/a
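
One thing that stands out in this dump: the InnoDB error names 10.6.15 as the version that wrote the redo log, while the crash report names 11.1.2 as the running server. That hints, without proving it, that the donor's data files come from a different release series than the joiner's binary, which a physical SST such as rsync or mariabackup copies over as-is. A small sketch for spotting that in a log capture. The regexes and function names are mine and only cover the two message formats seen above.

```python
import re

REDO_RE = re.compile(r"redo log was created with MariaDB (\d+)\.(\d+)\.(\d+)")
SERVER_RE = re.compile(r"Server version: (\d+)\.(\d+)\.(\d+)-MariaDB")


def versions_from_log(lines):
    """Return (redo_log_version, server_version) as int tuples, or None if absent."""
    redo = server = None
    for line in lines:
        match = REDO_RE.search(line)
        if match:
            redo = tuple(int(part) for part in match.groups())
        match = SERVER_RE.search(line)
        if match:
            server = tuple(int(part) for part in match.groups())
    return redo, server


def same_release_series(a, b):
    """Compare major.minor only, ignoring the patch level."""
    return a[:2] == b[:2]


# Two lines copied from the log above (journald prefix trimmed).
log_lines = [
    "2023-09-28 19:58:41 0 [ERROR] InnoDB: Upgrade after a crash is not supported. "
    "The redo log was created with MariaDB 10.6.15. You must start up and shut down MariaDB 10.7 or earlier.",
    "Server version: 11.1.2-MariaDB-1:11.1.2+maria~ubu2204-log source revision: 9bc25d98209df6810f7a7d5e7dd3ae677a313ab5",
]

redo_version, server_version = versions_from_log(log_lines)
if redo_version and server_version and not same_release_series(redo_version, server_version):
    print("redo log written by", redo_version, "but server is", server_version)
```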

I’ll try again tomorrow using Ubuntu 22.04, which is what my backend cluster currently runs, and we’ll see what happens. But I hope Scylla behaves better. I’m a-ok with replace-node and all that, but these crashes? Clustered RDBMSes are a pain.

Addendum 1:

Scylla is nice.

root@backend02:~# scylla nodetool status
Datacenter: svealiden
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens       Owns    Host ID                               Rack
UN  192.168.2.81  4.64 MB    1            ?       6be20c62-a7f6-41b9-8924-1de608b5fb49  one
UN  192.168.2.82  4.67 MB    1            ?       c4e0631b-a282-4bd8-866b-aa2f15877f1c  one
UN  192.168.2.83  4.64 MB    1            ?       5a902584-96a2-4a0e-9e79-77ddcaa1f62f  one

From backend03:

[root@backend03 ~]# scylla nodetool status
Datacenter: svealiden
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens       Owns    Host ID                               Rack
UN  192.168.2.81  ?          1            ?       6be20c62-a7f6-41b9-8924-1de608b5fb49  one
UN  192.168.2.82  ?          1            ?       c4e0631b-a282-4bd8-866b-aa2f15877f1c  one
UN  192.168.2.83  6.03 MB    1            ?       365a7f90-1626-46ff-86e2-0d2ebdf3d762  one

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless
[root@backend03 ~]# scylla nodetool status
Datacenter: svealiden
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens       Owns    Host ID                               Rack
UN  192.168.2.81  ?          1            ?       6be20c62-a7f6-41b9-8924-1de608b5fb49  one
UN  192.168.2.82  4.71 MB    1            ?       c4e0631b-a282-4bd8-866b-aa2f15877f1c  one
UN  192.168.2.83  6.03 MB    1            ?       365a7f90-1626-46ff-86e2-0d2ebdf3d762  one

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless
[root@backend03 ~]# scylla nodetool status
Datacenter: svealiden
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens       Owns    Host ID                               Rack
UN  192.168.2.81  4.68 MB    1            ?       6be20c62-a7f6-41b9-8924-1de608b5fb49  one
UN  192.168.2.82  4.71 MB    1            ?       c4e0631b-a282-4bd8-866b-aa2f15877f1c  one
UN  192.168.2.83  6.03 MB    1            ?       365a7f90-1626-46ff-86e2-0d2ebdf3d762  one

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless
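
In the backend03 outputs above, Load shows as ? for the other nodes at first and fills in over the next couple of runs, while every node stays UN throughout. A minimal sketch of checking that automatically, assuming the output format shown above; the function names parse_status and all_up_normal are made up for illustration and are not part of Scylla:

```python
import re

# Matches node rows such as:
#   "UN  192.168.2.81  4.64 MB  1  ?  6be20c62-...  one"
# First letter is Up/Down, second is Normal/Leaving/Joining/Moving.
ROW = re.compile(r"^([UD][NLJM])\s+(\S+)\s+")


def parse_status(text):
    """Return {address: status_code} for each node row in the output."""
    nodes = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            nodes[match.group(2)] = match.group(1)
    return nodes


def all_up_normal(text):
    """True only if at least one node was found and all of them are UN."""
    nodes = parse_status(text)
    return bool(nodes) and all(code == "UN" for code in nodes.values())


# Sample copied from one of the backend03 runs above.
SAMPLE = """Datacenter: svealiden
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens       Owns    Host ID                               Rack
UN  192.168.2.81  ?          1            ?       6be20c62-a7f6-41b9-8924-1de608b5fb49  one
UN  192.168.2.82  4.71 MB    1            ?       c4e0631b-a282-4bd8-866b-aa2f15877f1c  one
UN  192.168.2.83  6.03 MB    1            ?       365a7f90-1626-46ff-86e2-0d2ebdf3d762  one
"""

if __name__ == "__main__":
    print(parse_status(SAMPLE))
    print("all UN:", all_up_normal(SAMPLE))
```

In practice the text would come from running scylla nodetool status on a node. The sketch only looks at the two-letter status code and the address, so a Load of ? doesn’t trip it up.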

Addendum 2:

I’m now running MariaDB 10.11.5 in Docker, and it works if I use rsync for state transfer. The line below, however, causes a crash just like the one I saw on 11.1.2, so it has to go from the config:

wsrep_sst_method           = mariabackup
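
For comparison, a minimal sketch of the rsync variant that did work. The [mysqld] section name is an assumption, since the actual layout of my config file isn’t shown here; rsync itself is a standard value for wsrep_sst_method:

```ini
# Assumption: the setting lives in the server section of the MariaDB config.
[mysqld]
wsrep_sst_method = rsync
```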

In the meantime I tried running Zabbix with PostgreSQL and CockroachDB. But Zabbix needs some features that CockroachDB hasn’t implemented, so no dice there. If Galera is too unpredictable I can always fall back to standard replication: a single master and two slaves, with my scripts for switching between them.

Addendum 3:

I think I might have come across a good combo. It passes my exceptional “can be fixed while I’m drunk” test! Sure, it still requires me to patch the wsrep_sst_rsync script, but I did that part while I was sober. Getting the Galera cluster from 1 to 3 nodes passed the test. Now for some music!