UPS failure

Need to back up PVE nodes. /etc/pve is especially important.
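A minimal sketch of what such a backup could look like: archive a directory into a timestamped tarball. The function name and the destination directory are made up for illustration. It only copies the files as they are visible under /etc/pve at that moment, so treat it as a starting point rather than a complete PVE backup.

```python
import tarfile
import time
from pathlib import Path


def backup_dir(src: str, dest_dir: str) -> Path:
    """Write src into a timestamped .tar.gz under dest_dir and return its path."""
    src_path = Path(src)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    name = src_path.name or "root"
    archive = dest / f"{name}-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # arcname keeps paths inside the archive relative (pve/..., not /etc/pve/...)
        tar.add(str(src_path), arcname=name)
    return archive


# Example call (hypothetical destination, run as root on each node):
#   backup_dir("/etc/pve", "/root/pve-backups")
```

The archive should then be copied off the node, since a backup that lives on the same worn-out disks does not help much.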

I might not get a new UPS. I only got one back when I had a Solaris 11 server that didn’t like unclean shutdowns.

pve1 lost its 1000GB SSD somehow. One half of the RAID1 OS mirror is worn out, so it’s time to replace both it and the other OS disk, which is nearing 100% wearout.

Rollout

I’m going to start rolling out new nodes, both to make sure my Ansible playbooks are correct and to get everything onto the right template. I thought I knew how to expand a partitioned drive without making the corresponding VM unbootable, but apparently not… I connected the old workstation VM drive to a new VM and ran testdisk. What a wonderful utility. Now I’ve made sure the Ubuntu and Rocky Linux templates both use a partitioned drive for /boot and a separate, unpartitioned disk for LVM, so that disk should be expandable.
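With LVM sitting directly on an unpartitioned disk, growing it after enlarging the virtual disk should come down to two commands. Here is a rough sketch that builds them and only prints them by default. The device and volume names in the usage line are hypothetical, and the guest is assumed to already see the new disk size.

```python
import subprocess


def grow_commands(pv_device: str, lv_path: str) -> list[list[str]]:
    """Commands to grow an LVM volume after its whole-disk PV got bigger."""
    return [
        # Let the physical volume pick up the new size of the disk.
        ["pvresize", pv_device],
        # Give all free extents to the LV and resize its filesystem (-r).
        ["lvextend", "-r", "-l", "+100%FREE", lv_path],
    ]


def grow(pv_device: str, lv_path: str, dry_run: bool = True) -> None:
    """Print each command; only execute when dry_run is False."""
    for cmd in grow_commands(pv_device, lv_path):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Example (hypothetical names): grow("/dev/sdb", "/dev/vg0/root")
```

No partition table has to be rewritten in this layout, which is the step that tends to leave a VM unbootable.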

I’m giving up on Galera as it didn’t handle reboots appropriately.

It worked before and, fortunately, it works fine for people at work, but I’m switching to plain primary/replica replication with my old scripts for switchover and failover. I have some software lying around that I wrote to keep track of MariaDB’s status.
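For illustration only (this is not the software mentioned above), a status check for a replica could look at one row of SHOW REPLICA STATUS, passed in as a dict. The column names are the ones MariaDB reports. The function name and the 30-second lag threshold are assumptions.

```python
def replica_problems(status: dict, max_lag_seconds: int = 30) -> list[str]:
    """Return reasons a replica looks unfit to promote (empty list = healthy)."""
    problems = []
    if status.get("Slave_IO_Running") != "Yes":
        problems.append("IO thread not running")
    if status.get("Slave_SQL_Running") != "Yes":
        problems.append("SQL thread not running")
    lag = status.get("Seconds_Behind_Master")
    if lag is None:
        # MariaDB reports NULL here when replication is not running.
        problems.append("lag unknown")
    elif int(lag) > max_lag_seconds:
        problems.append(f"lag {lag}s exceeds {max_lag_seconds}s")
    return problems
```

A switchover script could refuse to promote any replica for which this list is not empty.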

MinIO node replacement

Works just fine once I set the right owner for the MinIO storage area.

NUC: /ho/cj/St$ mc admin info svea
●  backend01.incandescent.tech:9000
   Uptime: 2 weeks
   Version: 2023-08-09T23:30:22Z
   Network: 3/4 OK
   Drives: 1/1 OK
   Pool: 1st

●  backend02.incandescent.tech:9000
   Uptime: 2 weeks
   Version: 2023-08-09T23:30:22Z
   Network: 3/4 OK
   Drives: 1/1 OK
   Pool: 1st

●  backend03.incandescent.tech:9000
   Uptime: 1 minute
   Version: 2023-08-09T23:30:22Z
   Network: 4/4 OK
   Drives: 1/1 OK
   Pool: 1st

●  backend04.incandescent.tech:9000
   Uptime: 2 weeks
   Version: 2023-08-09T23:30:22Z
   Network: 3/4 OK
   Drives: 1/1 OK
   Pool: 1st

3.6 GiB Used, 10 Buckets, 73,525 Objects

Now Scylla needs to be primed so backend03 joins the fray. When backend02 is replaced I hope to get more insight into why Zabbix throws a max_user_connections error after an SST…

MariaDB Galera

MariaDB Galera was OK to start with, but now that I’m reinstalling backend03 it’s not going great.

Sep 28 20:30:49 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] Stale sst_in_progress file: /var/lib/mysql/sst_in_progress (20230928 18:30:49.649)
Sep 28 20:30:49 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:49.720)
Sep 28 20:30:50 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:50.746)
Sep 28 20:30:50 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:50 0 [Warning] WSREP: last inactive check more than PT6S ago (PT6.00067S), skipping check
Sep 28 20:30:51 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:51.786)
Sep 28 20:30:52 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:52.819)
Sep 28 20:30:53 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:53.840)
Sep 28 20:30:54 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:54.864)
Sep 28 20:30:55 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:55.893)
Sep 28 20:30:56 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:56.938)
Sep 28 20:30:57 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:57.978)
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: WSREP_SST: [INFO] previous SST is not completed, waiting for it to exit (20230928 18:30:59.003)
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: WSREP_SST: [ERROR] previous SST script still running. (20230928 18:30:59.009)
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [ERROR] WSREP: Failed to read 'ready <addr>' from: wsrep_sst_mariabackup --role 'joiner' --address '192.168.2.83' --datadir '/var/lib/mysql/' --parent 1 --progress 0 --binlog 'backend03-bin' --binlog-index 'backend03-bin.index'
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]:         Read: '(null)'
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [ERROR] WSREP: Process completed with error: wsrep_sst_mariabackup --role 'joiner' --address '192.168.2.83' --datadir '/var/lib/mysql/' --parent 1 --progress 0 --binlog 'backend03-bin' --binlog-index 'backend03-bin.index': 114 (Operation already in progress)
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [ERROR] WSREP: Failed to prepare for 'mariabackup' SST. Unrecoverable.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [ERROR] WSREP: SST request callback failed. This is unrecoverable, restart required.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: ReplicatorSMM::abort()
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: Closing send monitor...
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: Closed send monitor.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: gcomm: terminating thread
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: gcomm: joining thread
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: gcomm: closing backend
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: view(view_id(NON_PRIM,52cd15fb-9b29,87) memb {
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]:         be038c1b-8061,0
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: } joined {
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: } left {
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: } partitioned {
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]:         52cd15fb-9b29,0
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]:         82c6ead4-bf0f,0
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: })
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: PC protocol downgrade 1 -> 0
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: view((empty))
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: gcomm: closed
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [Note] WSREP: Flow-control interval: [16, 16]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [Note] WSREP: Received NON-PRIMARY.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [Note] WSREP: Shifting PRIMARY -> OPEN (TO: 8042075)
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [Note] WSREP: New SELF-LEAVE.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [Note] WSREP: Flow-control interval: [0, 0]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [Note] WSREP: Received SELF-LEAVE. Closing connection.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 8042075)
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 0 [Note] WSREP: RECV thread exiting 0: Success
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: recv_thread() joined.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: Closing replication queue.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: Closing slave action queue.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 2023-09-28 18:30:59 2 [Note] WSREP: mariadbd: Terminated.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: 230928 18:30:59 [ERROR] mysqld got signal 11 ;
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: This could be because you hit a bug. It is also possible that this binary
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: or one of the libraries it was linked against is corrupt, improperly built,
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: or misconfigured. This error can also be caused by malfunctioning hardware.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: We will try our best to scrape up some info that will hopefully help
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: diagnose the problem, but since we have already crashed,
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: something is definitely wrong and this may fail.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Server version: 11.1.2-MariaDB-1:11.1.2+maria~ubu2204-log source revision: 9bc25d98209df6810f7a7d5e7dd3ae677a313ab5
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: key_buffer_size=0
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: read_buffer_size=131072
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: max_used_connections=0
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: max_threads=1002
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: thread_count=3
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: It is possible that mysqld could use up to
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2206964 K  bytes of memory
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Hope that's ok; if not, decrease some variables in the equation.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Thread pointer: 0x7f6acc000c68
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Attempting backtrace. You can use the following information to find out
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: where mysqld died. If you see no messages after this, something went
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: terribly wrong...
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: stack_bottom = 0x7f6ae8fb0c68 thread_stack 0x49000
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Printing to addr2line failed
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: mariadbd(my_print_stacktrace+0x32)[0x5575c97de7c2]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: mariadbd(handle_fatal_signal+0x488)[0x5575c92b7cf8]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f6af380b520]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /lib/x86_64-linux-gnu/libc.so.6(abort+0x178)[0x7f6af37f1898]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /usr/lib/libgalera_smm.so(+0x156812)[0x7f6aeb1fb812]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /usr/lib/libgalera_smm.so(+0x6f151)[0x7f6aeb114151]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /usr/lib/libgalera_smm.so(+0x6bdb4)[0x7f6aeb110db4]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /usr/lib/libgalera_smm.so(+0x8a5b1)[0x7f6aeb12f5b1]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /usr/lib/libgalera_smm.so(+0x5f690)[0x7f6aeb104690]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /usr/lib/libgalera_smm.so(+0x47611)[0x7f6aeb0ec611]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: mariadbd(_ZN5wsrep18wsrep_provider_v2611run_applierEPNS_21high_priority_serviceE+0x12)[0x5575c989d3a2]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: mariadbd(+0xd5f191)[0x5575c9571191]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: mariadbd(_Z15start_wsrep_THDPv+0x26b)[0x5575c955f15b]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: mariadbd(+0xcd1906)[0x5575c94e3906]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /lib/x86_64-linux-gnu/libc.so.6(+0x94b43)[0x7f6af385db43]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x44)[0x7f6af38eebb4]
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Trying to get some variables.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Some pointers may be invalid and cause the dump to abort.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Query (0x0): (null)
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Connection ID (thread ID): 2
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Status: NOT_KILLED
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off,hash_join_cardinality=on
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: information that should help you find out what is causing the crash.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: We think the query pointer is invalid, but we will try to print it anyway.
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Query:
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Writing a core file...
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Working directory at /var/lib/mysql
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Resource Limits:
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Limit                     Soft Limit           Hard Limit           Units
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max cpu time              unlimited            unlimited            seconds
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max file size             unlimited            unlimited            bytes
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max data size             unlimited            unlimited            bytes
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max stack size            8388608              unlimited            bytes
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max core file size        0                    0                    bytes
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max resident set          unlimited            unlimited            bytes
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max processes             unlimited            unlimited            processes
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max open files            1073741816           1073741816           files
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max locked memory         8388608              8388608              bytes
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max address space         unlimited            unlimited            bytes
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max file locks            unlimited            unlimited            locks
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max pending signals       14479                14479                signals
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max msgqueue size         819200               819200               bytes
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max nice priority         0                    0
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max realtime priority     0                    0
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Max realtime timeout      unlimited            unlimited            us
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Core pattern: |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
Sep 28 20:30:59 backend03.incandescent.tech docker[186863]: Kernel version: Linux version 5.14.0-284.11.1.el9_2.x86_64 (mockbuild@x64-builder01.almalinux.org) (gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), GNU ld version 2.35.2-37.el9) #1 SMP PREEMPT_DYNAMIC Tue May 9 05:49:00 EDT 2023
Sep 28 20:30:59 backend03.incandescent.tech systemd[1]: mariadbgalera.service: Main process exited, code=exited, status=139/n/a
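Most of that dump is shutdown noise and the crash handler. Filtered down to the [ERROR] lines, the chain is short: a stale sst_in_progress file, then "previous SST script still running", then the failed SST preparation and the abort. A small sketch of such a filter, with a made-up function name, meant to be fed journal lines:

```python
import re

# Level tag as written by mariadbd and the WSREP_SST scripts.
LEVEL = re.compile(r"\[(ERROR|Warning|Note|INFO)\]\s*(.*)")


def errors_only(lines):
    """Return the message text of every [ERROR] line, in order."""
    found = []
    for line in lines:
        match = LEVEL.search(line)
        if match and match.group(1) == "ERROR":
            found.append(match.group(2).strip())
    return found
```

The bracketed PID in `docker[186863]` is not mistaken for a level tag because only the four known level names are matched.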

This is with the following config:

[root@backend03 ~]# cat /etc/containers/mariadbgalera/config/*
[mariadb]
log-bin                    = ON
server-id                  = 3
log-basename               = backend03

wsrep_cluster_address      = gcomm://backend01.incandescent.tech,backend02.incandescent.tech,backend03.incandescent.tech

wsrep_cluster_name         = svealiden
binlog-format              = ROW
default_storage_engine     = InnoDB
innodb_autoinc_lock_mode   = 2
wsrep_on                   = ON
wsrep_log_conflicts        = ON
wsrep_node_address         = 192.168.2.83
wsrep_sst_receive_address  = 192.168.2.83
wsrep_provider             = /usr/lib/libgalera_smm.so
wsrep_provider_options     = ist.recv_addr=192.168.2.83;ist.recv_bind=0.0.0.0;evs.inactive_check_period=PT2S;evs.view_forget_timeout=P15M
wsrep_sst_method           = mariabackup
[mysqld]
skip-external-locking
bind-address                    = 0.0.0.0
expire_logs_days                = 4
gtid-domain-id                  = 10
character-set-server            = utf8mb4
collation-server                = utf8mb4_general_ci
innodb_buffer_pool_size         = 1G
innodb_compression_algorithm    = zlib
innodb_compression_default      = ON
performance_schema              = 1
max_connect_errors              = 1000
max_connections                 = 1000
max_user_connections            = 50

I’m also using a custom SST script with the initial timeout raised to one hour:

    impts=$(parse_cnf sst inno-move-opts "")
    stimeout=$(parse_cnf sst sst-initial-timeout 3600)
    ssyslog=$(parse_cnf sst sst-syslog 0)

The script is put in place with Ansible:

 - name: Copy wsrep script file
   ansible.builtin.copy:
     src: wsrep_sst_mariabackup
     dest: /etc/incandescent/containers/mariadbgalera/config/wsrep_sst_mariabackup
     mode: '755'

The service is started like this:

cjp@workstation:~/incandescent.tech/roles$ cat mariadbcluster/templates/mariadbgalera.service.j2
[Unit]
Description=MariaDB Galera

[Service]
TimeoutStartSec=3600
RestartSec=20
Restart=always
ExecStartPre=-/usr/bin/docker stop mariadbgalera
ExecStartPre=-/usr/bin/docker rm mariadbgalera
ExecStart=/usr/bin/docker run --name mariadbgalera -p {{publicport}}:{{privateport}} -p {{publicportgalera}}:{{privateportgalera}} -p {{publicportist}}:{{privateportist}} -p {{publicportsst}}:{{privateportsst}} -v /etc/incandescent/containers/mariadbgalera/config/wsrep_sst_mariabackup:/usr/bin/wsrep_sst_mariabackup -v /etc/incandescent/containers/mariadbgalera/config:/etc/mysql/conf.d -v /srv/storage/mariadb/data:/var/lib/mysql  -v /srv/storage/mariadbbackups:/backup --env-file /etc/incandescent/containers/mariadbgalera/environment/mariadbgalera.env --cpu-quota=100000 --memory={{memlimit}}m "{{registryhost}}:{{registryport}}/{{image_basename}}:{{image_tagname}}" {% if bootstrap != 0 %}-- --wsrep-new-cluster{% endif %}

[Install]
WantedBy=multi-user.target

Weird. Now I’m seeing lsof taking all CPU:

top - 20:43:54 up  8:04,  1 user,  load average: 1.09, 1.31, 1.25
Tasks: 207 total,   2 running, 205 sleeping,   0 stopped,   0 zombie
%Cpu(s): 10.4 us, 13.3 sy,  0.0 ni, 71.2 id,  0.0 wa,  0.6 hi,  0.6 si,  3.9 st
MiB Mem :   3661.7 total,   1561.0 free,   1003.4 used,   1362.7 buff/cache
MiB Swap:   3584.0 total,   3583.2 free,      0.8 used.   2658.3 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 188514 systemd+  20   0    5424   1616   1372 R  92.7   0.0  10:49.62 lsof -Pnl -i :4444

Steiner’s attack was an order! OK, I’ll try to calm down…

So I went back to rsync as the SST method and yes, lsof stalls there too somehow. I changed the wsrep_sst_rsync script so that it doesn’t run lsof, and now we get past that part.
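One guess at the cause, not verified here: the resource limits in the crash dump show "Max open files 1073741816" inside the container, and some lsof versions are known to crawl when the descriptor limit is that large. If so, capping it with docker run's --ulimit nofile option might also help. Whatever the cause, the check lsof performs (is something listening on the SST port?) can be done without it. Below is a sketch of that idea as a plain TCP connect. It is not what the stock script does, and the function name is made up.

```python
import socket


def port_is_listening(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unreachable: treat all as "not listening".
        return False


# Example: port_is_listening(4444) while the joiner's rsync daemon is up.
```

A connect is more intrusive than reading the socket table, because the daemon sees a short-lived connection, so this is a trade-off rather than a drop-in replacement.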

top - 21:36:40 up  8:57,  1 user,  load average: 1.04, 0.93, 1.16
Tasks: 206 total,   1 running, 205 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.9 us,  1.7 sy,  0.0 ni, 84.1 id,  0.0 wa,  0.9 hi,  6.0 si,  6.4 st
MiB Mem :   3661.7 total,    127.2 free,   1017.9 used,   2787.6 buff/cache
MiB Swap:   3584.0 total,   3583.0 free,      1.0 used.   2643.8 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 194726 systemd+  20   0   10912   3344   2048 S   3.3   0.1   0:35.21 rsync --daemon --no-detach --port 4444 --config /var/lib/mysql/rsync_sst.conf
  26799 root      20   0 1619348  83060  51416 S   1.3   2.2   9:16.91 /usr/sbin/promtail-linux-amd64 -config.file /etc/promtail/promtail.yaml
 194291 root      20   0  719844  17620   7096 S   1.3   0.5   0:03.42 /usr/bin/containerd-shim-runc-v2 -namespace moby -id e4ccec842b4c4f20225b2daf876f55613032eb49a72e5+
     28 root      20   0       0      0      0 S   0.7   0.0   0:55.24 [ksoftirqd/2]

It’s not done yet, so it might still crash like before. Weird how strace shows a lot of SQL statements. With rsync, binary files are transferred. Well, tarballs are pretty great.

Uhm, I’m starting to think rsync got stuck. These are the table sizes on backend02:

4.0K    /srv/storage/mariadb/data/zabbix/history_log.frm
28K     /srv/storage/mariadb/data/zabbix/history_log.ibd
4.0K    /srv/storage/mariadb/data/zabbix/history_str.frm
2.1M    /srv/storage/mariadb/data/zabbix/history_str.ibd
4.0K    /srv/storage/mariadb/data/zabbix/history_text.frm
1.3G    /srv/storage/mariadb/data/zabbix/history_text.ibd
4.0K    /srv/storage/mariadb/data/zabbix/history_uint.frm
328M    /srv/storage/mariadb/data/zabbix/history_uint.ibd

If we go to backend03:

4.0K    /srv/storage/mariadb/data/zabbix/history_log.frm
64K     /srv/storage/mariadb/data/zabbix/history_log.ibd
4.0K    /srv/storage/mariadb/data/zabbix/history_str.frm
15M     /srv/storage/mariadb/data/zabbix/history_str.ibd
4.0K    /srv/storage/mariadb/data/zabbix/history_text.frm
8.0G    /srv/storage/mariadb/data/zabbix/history_text.ibd

Uhm, why is history_text 8G on backend03? No compression? It’s the same binary running on both systems. history_str is also larger by a wide margin. rsync seems to have completed but then we get the crash:
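A possible explanation for the size gap, offered as an untested assumption: the config has innodb_compression_default = ON, and InnoDB page compression saves space by punching holes in the data files, which makes them sparse. If the listings above report allocated blocks (as du does by default) and the copy filled the holes, the same file would look much bigger on the joiner. Comparing apparent size with allocated size on both nodes would confirm or rule this out. A small sketch, with a made-up function name:

```python
import os


def apparent_and_allocated(path: str) -> tuple[int, int]:
    """Return (apparent size, bytes actually allocated on disk) for a file."""
    st = os.stat(path)
    # st_blocks is counted in 512-byte units on Linux, whatever the fs block size.
    return st.st_size, st.st_blocks * 512


# Example: apparent_and_allocated("/srv/storage/mariadb/data/zabbix/history_text.ibd")
```

If backend02 shows a large apparent size with far less allocated, while backend03 shows the two nearly equal, the data is the same and only the holes were lost in transfer.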

Sep 28 21:58:40 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:40 0 [Note] WSREP: 0.0 (5e5dd1122766): State transfer to 2.0 (e4ccec842b4c) complete.
Sep 28 21:58:40 backend03.incandescent.tech docker[194177]: WSREP_SST: [INFO] Extracting binlog files: (20230928 19:58:40.822)
Sep 28 21:58:40 backend03.incandescent.tech docker[194177]: backend01-bin.000018
Sep 28 21:58:40 backend03.incandescent.tech docker[194177]: WSREP_SST: [INFO] Galera co-ords from recovery: 52d278a6-50ad-11ee-8431-2ab12cc70be8:8066709 0 (20230928 19:58:40.857)
Sep 28 21:58:40 backend03.incandescent.tech docker[194177]: WSREP_SST: [INFO] rsync SST completed on joiner (20230928 19:58:40.866)
Sep 28 21:58:40 backend03.incandescent.tech docker[194177]: WSREP_SST: [INFO] Joiner cleanup: rsync PID=263, stunnel PID=0 (20230928 19:58:40.874)
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: WSREP_SST: [INFO] Joiner cleanup done. (20230928 19:58:41.415)
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 3 [Note] WSREP: SST received
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 3 [Note] WSREP: Server status change joiner -> initializing
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 3 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] mariadbd: Aria engine: starting recovery
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: tables to flush: 1 0 (0.0 seconds);
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] mariadbd: Aria engine: recovery done
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] InnoDB: Number of transaction pools: 1
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] InnoDB: Using SSE4.2 crc32 instructions
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] InnoDB: Initializing buffer pool, total size = 1.000GiB, chunk size = 16.000MiB
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] InnoDB: Completed initialization of buffer pool
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] InnoDB: File system buffers for log disabled (block size=512 bytes)
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [ERROR] InnoDB: Upgrade after a crash is not supported. The redo log was created with MariaDB 10.6.15. You must start up and shut down MariaDB 10.7 or earlier.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] InnoDB: Starting shutdown...
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [Note] Plugin 'FEEDBACK' is disabled.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [ERROR] Unknown/unsupported storage engine: InnoDB
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 0 [ERROR] Aborting
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 3 [ERROR] WSREP: sst_received failed: State wait was interrupted
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 1 [ERROR] WSREP: Application received wrong state:
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]:         Received: 00000000-0000-0000-0000-000000000000
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]:         Required: 52d278a6-50ad-11ee-8431-2ab12cc70be8
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 2023-09-28 19:58:41 1 [ERROR] WSREP: Application state transfer failed. This is unrecoverable condition, restart required.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: 230928 19:58:41 [ERROR] mysqld got signal 11 ;
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: This could be because you hit a bug. It is also possible that this binary
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: or one of the libraries it was linked against is corrupt, improperly built,
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: or misconfigured. This error can also be caused by malfunctioning hardware.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: We will try our best to scrape up some info that will hopefully help
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: diagnose the problem, but since we have already crashed,
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: something is definitely wrong and this may fail.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Server version: 11.1.2-MariaDB-1:11.1.2+maria~ubu2204-log source revision: 9bc25d98209df6810f7a7d5e7dd3ae677a313ab5
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: key_buffer_size=134217728
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: read_buffer_size=131072
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: max_used_connections=0
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: max_threads=1002
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: thread_count=2
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: It is possible that mysqld could use up to
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2338036 K  bytes of memory
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Hope that's ok; if not, decrease some variables in the equation.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Thread pointer: 0x7fe694000c68
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Attempting backtrace. You can use the following information to find out
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: where mysqld died. If you see no messages after this, something went
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: terribly wrong...
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: stack_bottom = 0x7fe6ac602c68 thread_stack 0x49000
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Printing to addr2line failed
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: mariadbd(my_print_stacktrace+0x32)[0x5610db8167c2]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: mariadbd(handle_fatal_signal+0x488)[0x5610db2efcf8]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7fe6b665c520]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /lib/x86_64-linux-gnu/libc.so.6(abort+0x178)[0x7fe6b6642898]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /usr/lib/libgalera_smm.so(+0x156812)[0x7fe6ae04c812]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /usr/lib/libgalera_smm.so(+0x6f151)[0x7fe6adf65151]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /usr/lib/libgalera_smm.so(+0x6cde1)[0x7fe6adf62de1]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /usr/lib/libgalera_smm.so(+0x8a5b1)[0x7fe6adf805b1]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /usr/lib/libgalera_smm.so(+0x5f690)[0x7fe6adf55690]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /usr/lib/libgalera_smm.so(+0x47611)[0x7fe6adf3d611]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: mariadbd(_ZN5wsrep18wsrep_provider_v2611run_applierEPNS_21high_priority_serviceE+0x12)[0x5610db8d53a2]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: mariadbd(+0xd5f191)[0x5610db5a9191]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: mariadbd(_Z15start_wsrep_THDPv+0x26b)[0x5610db59715b]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: mariadbd(+0xcd1906)[0x5610db51b906]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /lib/x86_64-linux-gnu/libc.so.6(+0x94b43)[0x7fe6b66aeb43]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x44)[0x7fe6b673fbb4]
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Trying to get some variables.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Some pointers may be invalid and cause the dump to abort.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Query (0x0): (null)
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Connection ID (thread ID): 1
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Status: NOT_KILLED
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on,not_null_range_scan=off,hash_join_cardinality=on
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: information that should help you find out what is causing the crash.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: We think the query pointer is invalid, but we will try to print it anyway.
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Query:
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Writing a core file...
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Working directory at /var/lib/mysql
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Resource Limits:
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Limit                     Soft Limit           Hard Limit           Units
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max cpu time              unlimited            unlimited            seconds
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max file size             unlimited            unlimited            bytes
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max data size             unlimited            unlimited            bytes
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max stack size            8388608              unlimited            bytes
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max core file size        0                    0                    bytes
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max resident set          unlimited            unlimited            bytes
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max processes             unlimited            unlimited            processes
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max open files            1073741816           1073741816           files
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max locked memory         8388608              8388608              bytes
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max address space         unlimited            unlimited            bytes
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max file locks            unlimited            unlimited            locks
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max pending signals       14479                14479                signals
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max msgqueue size         819200               819200               bytes
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max nice priority         0                    0
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max realtime priority     0                    0
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Max realtime timeout      unlimited            unlimited            us
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Core pattern: |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
Sep 28 21:58:41 backend03.incandescent.tech docker[194177]: Kernel version: Linux version 5.14.0-284.11.1.el9_2.x86_64 (mockbuild@x64-builder01.almalinux.org) (gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), GNU ld version 2.35.2-37.el9) #1 SMP PREEMPT_DYNAMIC Tue May 9 05:49:00 EDT 2023
Sep 28 21:58:42 backend03.incandescent.tech systemd[1]: mariadbgalera.service: Main process exited, code=exited, status=139/n/a

I’ll try again tomorrow using Ubuntu 22.04, which is what my backend cluster currently runs, and we’ll see what happens. I hope Scylla behaves better. I’m a-ok with replace-node and all that, but these crashes? Clustered RDBMSs are a pain.

Addendum 1:

Scylla is nice.

root@backend02:~# scylla nodetool status
Datacenter: svealiden
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens       Owns    Host ID                               Rack
UN  192.168.2.81  4.64 MB    1            ?       6be20c62-a7f6-41b9-8924-1de608b5fb49  one
UN  192.168.2.82  4.67 MB    1            ?       c4e0631b-a282-4bd8-866b-aa2f15877f1c  one
UN  192.168.2.83  4.64 MB    1            ?       5a902584-96a2-4a0e-9e79-77ddcaa1f62f  one

From backend03:

[root@backend03 ~]# scylla nodetool status
Datacenter: svealiden
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens       Owns    Host ID                               Rack
UN  192.168.2.81  ?          1            ?       6be20c62-a7f6-41b9-8924-1de608b5fb49  one
UN  192.168.2.82  ?          1            ?       c4e0631b-a282-4bd8-866b-aa2f15877f1c  one
UN  192.168.2.83  6.03 MB    1            ?       365a7f90-1626-46ff-86e2-0d2ebdf3d762  one

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless
[root@backend03 ~]# scylla nodetool status
Datacenter: svealiden
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens       Owns    Host ID                               Rack
UN  192.168.2.81  ?          1            ?       6be20c62-a7f6-41b9-8924-1de608b5fb49  one
UN  192.168.2.82  4.71 MB    1            ?       c4e0631b-a282-4bd8-866b-aa2f15877f1c  one
UN  192.168.2.83  6.03 MB    1            ?       365a7f90-1626-46ff-86e2-0d2ebdf3d762  one

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless
[root@backend03 ~]# scylla nodetool status
Datacenter: svealiden
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens       Owns    Host ID                               Rack
UN  192.168.2.81  4.68 MB    1            ?       6be20c62-a7f6-41b9-8924-1de608b5fb49  one
UN  192.168.2.82  4.71 MB    1            ?       c4e0631b-a282-4bd8-866b-aa2f15877f1c  one
UN  192.168.2.83  6.03 MB    1            ?       365a7f90-1626-46ff-86e2-0d2ebdf3d762  one

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless
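
Rerunning the status command by hand works, but the same check can be scripted. A minimal sketch, assuming the output format shown above; the function name is made up for illustration and it only looks at the two-letter status/state column:

```shell
#!/bin/bash
# Hypothetical helper: read `nodetool status` output on stdin and succeed
# only if there is at least one node line and every node line starts with UN.
all_nodes_up_normal() {
  awk '
    /^[UD][NLJM] / { total++; if ($1 == "UN") ok++ }
    END { exit !(total > 0 && ok == total) }
  '
}

# Usage (not run here):
#   scylla nodetool status | all_nodes_up_normal && echo "all nodes UN"
```

It says nothing about the Load column, so a node still showing "?" for load counts as fine as long as it is UN.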

Addendum 2:

I’m running docker with version 10.11.5 now and it works if I use rsync but this causes a crash just like I saw on 11.1.2 so this line has to do in the config:

wsrep_sst_method           = mariabackup

In the meantime I tried running Zabbix with PostgreSQL and CockroachDB, but Zabbix needs some features that CockroachDB hasn’t implemented, so no dice there. If Galera is too unpredictable I can always use a single master and two slaves and use my scripts for switching between them. Standard replication.

Addendum 3:

I think I might have come across a good combo. It passes my exceptional “can be fixed while I’m drunk” test! Sure, it still requires me to patch the wsrep_sst_rsync script but I did that part while I was sober. Getting the galera cluster from 1 to 3 nodes passed the test. Now for some music!

PiHole weirdness

I kept seeing connections dropping out in Grafana and sometimes I even saw it in the browser with “No data” in all panels.

Things seemed to be in order with pdnsauth and pdnsrecursor so we had pihole as a suspect. Indeed keepalived didn’t think it was stable:

Sep 27 17:40:41 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 17:41:02 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 17:41:26 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 17:42:17 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 17:42:30 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 17:43:32 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 17:43:56 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 17:45:02 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 17:45:26 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 17:47:02 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 17:47:26 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 17:50:32 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 17:50:56 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 17:51:32 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 17:51:56 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 17:53:02 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 17:53:15 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 17:54:47 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 17:55:11 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 17:56:47 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 17:57:11 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 17:58:32 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 17:58:56 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 18:00:32 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 18:00:56 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 18:01:47 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 18:02:11 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 18:03:32 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 18:03:56 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 18:05:17 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 18:05:41 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 18:07:32 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 18:07:56 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 18:10:17 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 18:10:41 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 18:11:02 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 18:11:26 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 18:12:02 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 18:12:26 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 18:13:32 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 18:13:56 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 18:14:47 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 18:15:11 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 18:16:32 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 18:16:56 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 18:18:17 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 18:18:41 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 18:20:17 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 18:20:41 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 18:21:02 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 18:21:26 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 18:22:02 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 18:22:30 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
Sep 27 18:23:02 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 9
Sep 27 18:23:11 runner02.incandescent.tech Keepalived_vrrp[418200]: Script `chk_dns` now returning 0
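
A log like this is easier to judge when summarised. A small sketch that counts failures per hour from journal lines in the format above; the function name is made up, and it treats any "now returning 9" line as a failure. If chk_dns happens to wrap dig, exit status 9 would be dig's "no reply from server" code, but the check script itself is not shown here, so that is only a guess.

```shell
#!/bin/bash
# Hypothetical helper: read keepalived journal lines on stdin and print
# "Mon DD HH:00 <count>" for every hour that saw chk_dns return 9.
failures_per_hour() {
  awk '
    /chk_dns/ && /now returning 9/ {
      split($3, t, ":")                  # $3 is HH:MM:SS
      n[$1 " " $2 " " t[1] ":00"]++
    }
    END { for (h in n) print h, n[h] }
  ' | sort
}

# Usage (not run here):
#   journalctl -u keepalived | failures_per_hour
```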

Now pihole is out and we’re going straight to pdnsrecursor, which checks certain names with pdnsauth and otherwise does the usual DNS resolution. It seems more stable.

HAproxy and Prometheus

Why didn’t my own handcrafted work when HAproxy was the middleman? It worked with curl! Maybe compression? No, turns out that was OK. Some other Accept header? I ended up running tcpdump.

I ended up suspecting the port 80 that Prometheus sends along in the Host header. I tried to reproduce it with curl, but even with the command below curl didn’t send the port number (and so didn’t fail):

curl -sH 'Accept-encoding: gzip' -H "Accept: application/openmetrics-text;version=1.0.0,application/openmetrics-text;version=0.0.1;q=0.75,text/plain;version=0.0.4;q=0.5,*/*;q=0.1" http://networkmon.incandescent.tech:80/ | gunzip -

I changed it in HAproxy instead and it worked.

Note how the port number isn’t shown in the output above, but tcpdump showed us that it is sent along. Anyway, I needed HAproxy’s Consul service discovery to translate the networkmon pointer into a fixed address. This is what I use now:

    acl ACL_networkmon hdr_sub(host) -i networkmon.incandescent.tech
    use_backend networkmon-backend if ACL_networkmon

backend networkmon-backend
    balance roundrobin
    option httpchk HEAD /
    server-template networkmon 1 _networkmon._tcp.service.consul resolvers consul resolve-opts allow-dup-ip resolve-prefer ipv4 check
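
To reproduce the Prometheus request with curl, the Host header can be set by hand, since curl drops the default port from it. A minimal sketch; the host name is the one used above, the function name is made up, and the earlier HAproxy rule is not shown so this only covers the request side:

```shell
#!/bin/bash
# Host and port as used in this post; adjust for another setup.
TARGET_HOST="networkmon.incandescent.tech"
HOST_HEADER="${TARGET_HOST}:80"

# Hypothetical helper: send a request whose Host header includes ":80",
# mimicking what tcpdump showed Prometheus sending, and print the HTTP status.
probe_with_port() {
  curl -s -o /dev/null -w '%{http_code}\n' \
    -H "Host: ${HOST_HEADER}" "http://${TARGET_HOST}/"
}

# Usage (not run here):
#   probe_with_port
```

An ACL that compares the Host header as an exact string would not match networkmon.incandescent.tech:80, while the substring match done by hdr_sub(host) above accepts it with or without the port.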

I’ll post the entire Ansible cookbook soon.

Bad error messages 1

Here is output from my fully functional pdns-recursor instance:

Sep 12 19:48:25 runner02.incandescent.tech docker[429556]: Sep 12 17:48:25 PowerDNS Recursor 4.9.1 (C) 2001-2022 PowerDNS.COM BV
Sep 12 19:48:25 runner02.incandescent.tech docker[429556]: Sep 12 17:48:25 Using 64-bits mode. Built using gcc 10.2.1 20210110 on Aug 25 2023 09:18:15 by root@0b77bb2e4da4.
Sep 12 19:48:25 runner02.incandescent.tech docker[429556]: Sep 12 17:48:25 PowerDNS comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it according to the terms of the GPL version 2.
Sep 12 19:48:25 runner02.incandescent.tech docker[429556]: Sep 12 17:48:25 msg="Enabling IPv4 transport for outgoing queries" subsystem="config" level="0" prio="Notice" tid="0" ts="1694540905.330"
Sep 12 19:48:25 runner02.incandescent.tech docker[429556]: Sep 12 17:48:25 msg="NOT using IPv6 for outgoing queries - add an IPv6 address (like '::') to query-local-address to enable" subsystem="config" level="0" prio="Warning" tid="0" ts="1694540905.330"
Sep 12 19:48:25 runner02.incandescent.tech docker[429556]: Sep 12 17:48:25 msg="Setting access control" subsystem="config" level="0" prio="Info" tid="0" ts="1694540905.331" acl="allow-from" addresses="127.0.0.0/8 10.0.0.0/8 100.64.0.0/10 169.254.0.0/16 192.168.0.0/16 172.0.0.0/8 ::1/128 fc00::/7 fe80::/10"
Sep 12 19:48:25 runner02.incandescent.tech docker[429556]: Sep 12 17:48:25 msg="Will not send queries to" subsystem="config" level="0" prio="Notice" tid="0" ts="1694540905.339" addresses="127.0.0.0/8 10.0.0.0/8 100.64.0.0/10 169.254.0.0/16 192.168.0.0/16 172.16.0.0/12 ::1/128 fc00::/7 fe80::/10 0.0.0.0/8 192.0.0.0/24 192.0.2.0/24 198.51.100.0/24 203.0.113.0/24 240.0.0.0/4 ::/96 ::ffff:0:0/96 100::/64 2001:db8::/32 0.0.0.0 ::"
Sep 12 19:48:25 runner02.incandescent.tech docker[429556]: Sep 12 17:48:25 msg="Redirecting queries" subsystem="config" level="0" prio="Info" tid="0" ts="1694540905.339" addresses="192.168.2.72:8053" recursion="0" zone="svealiden.se"
Sep 12 19:48:25 runner02.incandescent.tech docker[429556]: Sep 12 17:48:25 msg="Redirecting queries" subsystem="config" level="0" prio="Info" tid="0" ts="1694540905.339" addresses="192.168.2.72:8053" recursion="0" zone="incandescent.tech"

The “Will not send queries to” line is my problem because I use 192.168.0.0/16 for my local network. Well, 192.168.0.0/21 actually, but that is a proper subset of 192.168.0.0/16. But guess what? I run my pdns authoritative name servers on this network, which pdns-recursor claims it will not send queries to, and it works fine!

I don’t know what they are referring to and can’t figure it out. I even looked in the source code to no avail. Anyway, it slowed down debugging by a few hours.
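
The subset relation between the two networks is easy to check mechanically. A small sketch in bash, IPv4 only; both function names are made up for illustration:

```shell
#!/bin/bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=.
  local -a o
  read -r -a o <<< "$1"
  echo $(( (o[0] << 24) | (o[1] << 16) | (o[2] << 8) | o[3] ))
}

# Return 0 if the CIDR block in $2 lies entirely inside the CIDR block in $1.
cidr_contains() {
  local outer_ip=${1%/*} outer_len=${1#*/}
  local inner_ip=${2%/*} inner_len=${2#*/}
  # The inner block must have a prefix at least as long as the outer one.
  (( inner_len >= outer_len )) || return 1
  local mask=$(( outer_len == 0 ? 0 : (0xFFFFFFFF << (32 - outer_len)) & 0xFFFFFFFF ))
  # Both addresses must agree on the outer block's network bits.
  (( ($(ip_to_int "$outer_ip") & mask) == ($(ip_to_int "$inner_ip") & mask) ))
}

cidr_contains 192.168.0.0/16 192.168.0.0/21 && echo "/21 is inside /16"
cidr_contains 192.168.0.0/16 192.168.2.72/32 && echo "the forwarder address is inside /16"
```

So the address the recursor forwards to really is covered by the list in the log line.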

Docker and dns

Got pihole<->pdns recursor<->pdns authoritative to work on docker. Had to make them use host network:

/etc/systemd/system/pdnsauth.service:

[Unit]
Description=PowerDNS authoritative DNS server

[Service]
TimeoutStartSec=45
Restart=always
ExecStartPre=-/usr/bin/docker stop pdnsauth
ExecStartPre=-/usr/bin/docker rm pdnsauth
ExecStart=/usr/bin/docker run --name pdnsauth --network host -v /etc/containers/pdns-authoritative/config/pdns.conf:/etc/powerdns/pdns.conf -v /etc/containers/pdns-authoritative/config/named.conf:/etc/named/named.conf -v /etc/containers/pdns-authoritative/zones:/etc/zones --cpu-quota=50000 --memory=256m "dockerregistry.incandescent.tech:1080/pdns-auth-48:4.8.1"

[Install]
WantedBy=multi-user.target

/etc/systemd/system/pdnsrecursor.service:

[Unit]
Description=PowerDNS recursive DNS server

[Service]
TimeoutStartSec=45
Restart=always
ExecStartPre=-/usr/bin/docker stop pdnsrecursor
ExecStartPre=-/usr/bin/docker rm pdnsrecursor
ExecStart=/usr/bin/docker run --network host --name pdnsrecursor -v /etc/containers/pdns-recursor/config/recursor.conf:/etc/powerdns/recursor.conf -v /etc/containers/pdns-recursor/config/dnshosts:/etc/hosts --cpu-quota=30000 --memory=256m "dockerregistry.incandescent.tech:1080/pdns-recursor-49:4.9.1"

[Install]
WantedBy=multi-user.target

/etc/systemd/system/pihole.service:

[Unit]
Description=PiHole

[Service]
TimeoutStartSec=60
RestartSec=5s
Restart=always
ExecStartPre=-/usr/bin/docker stop pihole
ExecStartPre=-/usr/bin/docker rm pihole
ExecStart=/usr/bin/docker run --name pihole --network host -v "/srv/storage/pihole/etc-pihole:/etc/pihole" -v "/srv/storage/pihole/etc-dnsmasq.d:/etc/dnsmasq.d" --restart=unless-stopped --hostname pihole --env-file /etc/containers/pihole/environment/pihole.env --cpu-quota=50000 --memory=2048m "dockerregistry.incandescent.tech:1080/pihole:2023.05.2"

[Install]
WantedBy=multi-user.target

/etc/containers/pdns-authoritative/config/pdns.conf:

local-address=0.0.0.0,::
local-port=8053
launch=bind
bind-config=/etc/named/named.conf
webserver-address=0.0.0.0
allow-axfr-ips=192.168.0.0/21,172.0.0.0/8,10.0.0.0/8
api=yes
api-key=SECRETAPI
default-ttl=3600
webserver=yes
webserver-password=SECRETWEB
webserver-allow-from=192.168.0.0/21,172.0.0.0/8,10.0.0.0/8
loglevel=6
include-dir=/etc/powerdns/pdns.d

/etc/containers/pdns-recursor/config/recursor.conf:

allow-from=127.0.0.0/8, 10.0.0.0/8, 100.64.0.0/10, 169.254.0.0/16, 192.168.0.0/16, 172.16.0.0/12, ::1/128, fc00::/7, fe80::/10
forward-zones=svealiden.se=192.168.2.73:8053
local-port=7053
local-address=0.0.0.0
webserver=yes
webserver-address=0.0.0.0
webserver-allow-from=192.168.0.0/16,172.0.0.0/8,10.0.0.0/8
webserver-password=SECRETWEB
webserver-port=8082
dnssec=off
export-etc-hosts=yes
log-common-errors=yes
loglevel=7
dont-throttle-netmasks=192.168.0.0/21,172.0.0.0/8,10.0.0.0/8

/etc/containers/pihole/environment/pihole.env:

PROXY_LOCATION=192.168.2.73
FTLCONF_REPLY_ADDR4=192.168.2.73
PIHOLE_DNS_=192.168.2.73#7053
TZ=Europe/Stockholm
WEBPASSWORD=SECRETWEBPIHOLE
QUERY_LOGGING=True
INTERFACE=ens18

Tests are run from runner03 (192.168.2.73):

root@runner03:~# dig mx svealiden.se @192.168.2.73 -p 8053

; <<>> DiG 9.18.12-0ubuntu0.22.04.2-Ubuntu <<>> mx svealiden.se @192.168.2.73 -p 8053
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 31309
;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 2
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;svealiden.se.                  IN      MX

;; ANSWER SECTION:
svealiden.se.           3600    IN      MX      10 mail.svealiden.se.
svealiden.se.           3600    IN      MX      20 mail2.svealiden.se.

;; ADDITIONAL SECTION:
mail.svealiden.se.      3600    IN      A       192.0.2.3

;; Query time: 0 msec
;; SERVER: 192.168.2.73#8053(192.168.2.73) (UDP)
;; WHEN: Thu Aug 31 17:57:03 UTC 2023
;; MSG SIZE  rcvd: 100

root@runner03:~# dig mx svealiden.se @192.168.2.73 -p 7053

; <<>> DiG 9.18.12-0ubuntu0.22.04.2-Ubuntu <<>> mx svealiden.se @192.168.2.73 -p 7053
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 48796
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;svealiden.se.                  IN      MX

;; ANSWER SECTION:
svealiden.se.           3225    IN      MX      20 mail2.svealiden.se.
svealiden.se.           3225    IN      MX      10 mail.svealiden.se.

;; Query time: 0 msec
;; SERVER: 192.168.2.73#7053(192.168.2.73) (UDP)
;; WHEN: Thu Aug 31 17:57:11 UTC 2023
;; MSG SIZE  rcvd: 84

root@runner03:~# dig mx svealiden.se @192.168.2.73

; <<>> DiG 9.18.12-0ubuntu0.22.04.2-Ubuntu <<>> mx svealiden.se @192.168.2.73
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26372
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;svealiden.se.                  IN      MX

;; ANSWER SECTION:
svealiden.se.           3221    IN      MX      20 mail2.svealiden.se.
svealiden.se.           3221    IN      MX      10 mail.svealiden.se.

;; Query time: 4 msec
;; SERVER: 192.168.2.73#53(192.168.2.73) (UDP)
;; WHEN: Thu Aug 31 17:57:15 UTC 2023
;; MSG SIZE  rcvd: 84
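
The three queries can be wrapped in a loop to compare the layers at a glance. A minimal sketch using the server address and ports from the configs above (8053 authoritative, 7053 recursor, 53 pihole); the function and variable names are made up:

```shell
#!/bin/bash
SERVER="192.168.2.73"

# Port-to-layer mapping as configured in this post.
declare -A LAYERS=(
  [8053]="pdns authoritative"
  [7053]="pdns recursor"
  [53]="pihole"
)

# Hypothetical helper: ask each layer for the same MX records and print
# the sorted answers on one line per layer.
check_chain() {
  local port
  for port in 8053 7053 53; do
    printf '%-20s ' "${LAYERS[$port]}"
    dig +short mx svealiden.se "@${SERVER}" -p "$port" | sort | tr '\n' ' '
    echo
  done
}

# Usage (not run here):
#   check_chain
```

If all three lines list the same two MX records, the chain is answering consistently.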

Thin volumes in Deb-based systems

I’ve tested this with Ubuntu 22.04.2 (I even did a separate installation in case my cloned version was bad somehow) and Debian 12, and both fail to mount /dev/gluster/smb01 at boot as a THIN volume. They’re a-ok with a regular volume but not a thin volume. It mounts just fine after boot, so I’ve done this:

ramfs                     ramfs           0     0     0    - /run/credentials/systemd-tmpfiles-setup.service
/dev/mapper/gluster-smb01 ext4         4.9G   24K  4.6G   1% /mnt
tmpfs                     tmpfs        392M     0  392M   0% /run/user/0
root@deb12:~# systemctl status domount.service
● domount.service
     Loaded: loaded (/etc/systemd/system/domount.service; enabled; preset: enabled)
     Active: active (exited) since Thu 2023-08-03 13:44:16 EDT; 23s ago
    Process: 453 ExecStart=/etc/mountdrives.sh (code=exited, status=0/SUCCESS)
   Main PID: 453 (code=exited, status=0/SUCCESS)
        CPU: 24ms

Aug 03 13:44:16 deb12 systemd[1]: Starting domount.service...
Aug 03 13:44:16 deb12 systemd[1]: Finished domount.service.
root@deb12:~# cat /etc/systemd/system/domount.service
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/etc/mountdrives.sh

[Install]
WantedBy=multi-user.target
root@deb12:~# cat /etc/mountdrives.sh
#!/bin/bash

mount -a

If you’re not following along, it’s basically rc.local but as a systemd unit that simply runs “mount -a” after boot, and then everything is fine. I guess I should include fstab. Nah, I’ll just include the LVM-related lines:

/dev/mapper/gluster-test        /srv/storage    ext4    defaults    0       0
/dev/mapper/gluster-smb01       /mnt            ext4    defaults,nofail 0   0

And LVM stuff:

root@deb12:~# lvs
  LV    VG      Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  smb01 gluster twi-aotz-- 5.00g             0.00   10.64
  test  gluster -wi-ao---- 5.00g
root@deb12:~# vgs
  VG      #PV #LV #SN Attr   VSize   VFree
  gluster   1   2   0 wz--n- <50.00g 39.98g
root@deb12:~# pvs
  PV         VG      Fmt  Attr PSize   PFree
  /dev/sdb   gluster lvm2 a--  <50.00g 39.98g

I’ve tried modifying initramfs, udev and so on but no luck. I’m busy writing Ansible stuff so I’m not going to hunt this down further, but it seems like a pretty big oversight not to be able to mount thin volumes on boot.
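
One possible variation on mountdrives.sh, under the unverified assumption that the device node for the thin volume simply appears later than the boot-time mount attempt: wait for the path before running mount -a. The function name and the timeout are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: wait up to $2 seconds (default 10) for path $1 to exist.
# Returns 0 as soon as it shows up, 1 if it never does.
wait_for_path() {
  local path=$1 tries=$(( ${2:-10} * 2 ))
  while (( tries-- > 0 )); do
    [ -e "$path" ] && return 0
    sleep 0.5
  done
  return 1
}

# Usage in /etc/mountdrives.sh (not run here):
#   wait_for_path /dev/mapper/gluster-smb01 30 && mount -a
```

This keeps the workaround but makes it less dependent on where in the boot sequence the unit happens to run.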

Linux Workstation with Yubikey Protection

I’m impressed with Yubikeys and have tinkered a bit with setting up a Linux workstation to use them. The pièce de résistance is requiring two of three Yubikeys to decrypt a system disk. Adding the Yubikey as a second factor for authentication and locking user sessions when a Yubikey is removed are also useful.

Majority unlock for LUKS-encrypted disks

LUKS1 allows us to have up to 8 passwords (key slots) for a single disk; LUKS2 raises the limit to 32, but let’s stick to 8 here. Let’s think of how we might use this to allow any two users from a set of three to start the computer by unlocking the encrypted drive. We have three people: A, B and C. They have a password each:

  • A => passwordA
  • B => passwordB
  • C => passwordC

To require two users to unlock the disk we make every slot passphrase a concatenation of two passwords:

  • Slot 1 => passwordApasswordB
  • Slot 2 => passwordApasswordC
  • Slot 3 => passwordBpasswordC

The users would have to remember their internal ordering here since we don’t have a slot with passwordCpasswordB, for instance. LUKS tries a provided password against all slots, so if two users enter their passwords (in the right order) the disk will be unlocked. Now we add Yubikeys to the mix, which helps a bit since that allows the scripts to order things based on the serial of each user’s Yubikey. I’ve swiped a lot of code from this suite of tools: https://github.com/cornelinux/yubikey-luks

Indeed I just installed it, replaced /usr/share/yubikey-luks/ykluks-keyscript and made my own enroll-script so this is little more than a hack layered on top of yubikey-luks.
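
The ordering idea can be sketched on its own: sort the serials numerically and list every pair with the lower serial first, so each pair maps to exactly one slot. This is only an illustration with a made-up function name, using the serials that appear in the enrollment run further down.

```shell
#!/bin/bash
# Print every unordered pair of the given serials, lowest serial first.
serial_pairs() {
  local -a sorted
  mapfile -t sorted < <(printf '%s\n' "$@" | sort -n)
  local i j
  for (( i = 0; i < ${#sorted[@]}; i++ )); do
    for (( j = i + 1; j < ${#sorted[@]}; j++ )); do
      echo "${sorted[i]}+${sorted[j]}"
    done
  done
}

serial_pairs 24130422 19652688 23882290
# Prints:
#   19652688+23882290
#   19652688+24130422
#   23882290+24130422
```

Three keys give three pairs; four keys would give six and five keys ten.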

To enroll three Yubikeys we first have to enable challenge-response mode, which is basically just “run incoming data through HMAC-SHA1 with a secret key and return the output”.

ykpersonalize -2 -ochal-resp -ochal-hmac -ohmac-lt64 -oserial-api-visible

To then enroll:

#!/bin/bash

if [ -z "$1" ]; then
  echo "You must provide a disk name as an argument. Example: /dev/sda3." 1>&2
  exit 1
fi

DISK=$1

if [ "$(id -u)" -ne 0 ]; then
  echo "You must be root." 1>&2
  exit 1
fi

declare -a STEPS=(0 1 2)

declare -a KEYSERIAL
declare -a KEYPASS

for INDX in ${STEPS[@]};
do
  echo "Index: $INDX"
  COUNTER=10
  while [ "$COUNTER" -gt 1 ];
  do
    sleep 0.5
    COUNTER=$((COUNTER-1))
    if ykinfo -n$INDX -q -2;
    then
      echo "Success!"
      break
    else
      echo "No YubiKey found."
    fi
  done

  SERIAL=$(ykinfo -s -n$INDX | cut -d ' ' -f2)
  echo "Enter challenge(password) for key $SERIAL: "
  read -s PW
  KEYSERIAL[$INDX]="$SERIAL"
  KEYPASS[$SERIAL]=$(printf %s "$PW" | ykchalresp -n$INDX -2 -i- 2>/dev/null || true)
done

# Sort numerically so the order matches the -lt comparison in the key script
SORTEDSERIALS=$(for K in "${!KEYPASS[@]}";
do
  echo "$K";
done | sort -n)

declare -a PASSWORDS
INDX=0
for K in $SORTEDSERIALS;
do
  echo "$K => ${KEYPASS[$K]}";
  PASSWORDS[$INDX]=${KEYPASS[$K]};
  INDX=$((INDX+1))
done

declare -a COMBINATIONS
COMBINATIONS[0]="${PASSWORDS[0]}${PASSWORDS[1]}"
COMBINATIONS[1]="${PASSWORDS[0]}${PASSWORDS[2]}"
COMBINATIONS[2]="${PASSWORDS[1]}${PASSWORDS[2]}"

OLD=$(/lib/cryptsetup/askpass "Please provide an existing passphrase. This is NOT the passphrase you just entered, this is the passphrase that you currently use to unlock your LUKS encrypted drive:")

SLOTS=$(seq 1 3)
for SLOT in $SLOTS;
do
  # Slots 1-3 get the three combinations; the for loop advances SLOT by itself
  printf '%s\n' "$OLD" "${COMBINATIONS[$SLOT-1]}" "${COMBINATIONS[$SLOT-1]}" | cryptsetup --key-slot="$SLOT" luksAddKey "$DISK" 2>&1
done

This is for a total of 3 users with their own keys. It could be made to handle 4 (six pairs), but 5 keys give ten pairs, which is too many combinations to fit into 8 slots. Execution looks like this:

root@yktest2:~# ./ykluks-enroll.sh /dev/sda3
Index: 0
1
Success!
Enter challenge(password) for key 24130422:  [password entered]
Index: 1
1
Success!
Enter challenge(password) for key 19652688:  [password entered]
Index: 2
1
Success!
Enter challenge(password) for key 23882290:  [password entered]

Now we need to introduce a key script /usr/share/yubikey-luks/ykluks-keyscript:

#!/bin/bash
#
#

message()
{
    if [ -x /bin/plymouth ] && plymouth --ping; then
        plymouth message --text="$*"
    else
        echo "$@" >&2
    fi
    return 0
}

# source for log_*_msg() functions, see LP: #272301
if [ -e /scripts/functions ] ; then
	. /scripts/functions
else
	. /usr/share/initramfs-tools/scripts/functions
fi

if [ -z "$cryptkeyscript" ]; then
	cryptkey="Unlocking the disk $cryptsource ($crypttarget)\\nEnter passphrase: "
	if [ -x /bin/plymouth ] && plymouth --ping; then
    	cryptkeyscript="plymouth ask-for-password --prompt" cryptkey=$(printf '%s' "$cryptkey")
    else
        cryptkeyscript="/lib/cryptsetup/askpass"
    fi
fi

check_yubikey_present="$(ykinfo -n0 -q -2)"
check_yubikey2_present="$(ykinfo -n1 -q -2)"

if [ "$check_yubikey_present" = "1" ]; then
  N0=$(ykinfo -n0 -s | cut -d ' ' -f 2)

  if [ "$check_yubikey2_present" = "1" ]; then
    N1=$(ykinfo -n1 -s | cut -d ' ' -f 2)
    if [ "$N0" -lt "$N1" ];
    then
	declare -a ORDER=(0 1)
    else
	declare -a ORDER=(1 0)
    fi
  else
    PW="$($cryptkeyscript "Please enter disk password: ")"
    printf '%s' "$PW"
  fi
else
  PW="$($cryptkeyscript "Please enter disk password: ")"
  printf '%s' "$PW"
fi

FINALPW=""
for INDX in "${ORDER[@]}";
do
	# Keep only the number from "serial: NNNN", as done for N0 and N1 above
	SERIAL=$(ykinfo -n$INDX -s | cut -d ' ' -f 2)
	PW="$($cryptkeyscript "Please enter challenge for YubiKey $SERIAL: ")"
	R="$(printf %s "$PW" | ykchalresp -n$INDX -2 -i- 2>/dev/null || true)"
	message "Retrieved the response from Yubikey $SERIAL"
	FINALPW="$FINALPW$R"
done

printf '%s' "$FINALPW"

exit 0

Oh, right! I had to add bash to initramfs since my use of arrays isn’t compatible with dash which is what Ubuntu typically includes. So add this to /usr/share/initramfs-tools/hooks/yubikey-luks:

cp /usr/bin/bash "${DESTDIR}/bin/bash"

Then run update-initramfs -u

Yubikey U2F on authentication

Install libpam-u2f and pamu2fcfg packages:

apt install libpam-u2f pamu2fcfg

Then add this line to the bottom of /etc/pam.d/common-auth:

auth 	required pam_u2f.so authfile=/etc/u2f_mappings cue

Include nouserok to allow users without a Yubikey to log in, i.e. to apply the requirement only to users listed in /etc/u2f_mappings: https://developers.yubico.com/pam-u2f/

Which brings us to adding U2F signatures to the authfile. The user can run this:

cjp@yktest2:~$ pamu2fcfg 
Enter PIN for /dev/hidraw2: 
cjp:77hsMUYzPD0poXbu51/TWGW6roJ31F35G01JoiEskczwxqOvzb5zTgLsnWWo2nO0MmZ6L7erxJz2DufhQDuCs9GEQ==,Wrg4zmgQedALIQCBYTAxoIq/bd/Se2tqtOvVn6JdQmezN05Gt3qLmFGvMA7iXV6u2OHN/mQosg/46/LyIoY9gnow==,es256,+presence

And then the admin can add the generated line to /etc/u2f_mappings.
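
The admin step can be scripted. A minimal sketch, assuming the authfile keeps one line per user with the user name before the first colon; the function name is made up and it refuses to add a second line for a user who already has one:

```shell
#!/bin/bash
# Hypothetical helper: append a pamu2fcfg line ($1) to the authfile ($2,
# default /etc/u2f_mappings) unless that user already has an entry.
# Returns 2 for a malformed line, 1 for a duplicate user, 0 on success.
add_mapping() {
  local line=$1 file=${2:-/etc/u2f_mappings}
  local user=${line%%:*}
  if [ -z "$user" ] || [ "$user" = "$line" ]; then
    echo "not a user:key line" >&2
    return 2
  fi
  if [ -f "$file" ] && cut -d: -f1 "$file" | grep -qxF "$user"; then
    echo "$user already present in $file" >&2
    return 1
  fi
  printf '%s\n' "$line" >> "$file"
}

# Usage (not run here), with the line the user generated:
#   add_mapping 'cjp:...,es256,+presence'
```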

Lock on Yubikey removal

SUBSYSTEM=="usb", ACTION=="remove", RUN+="/usr/bin/loginctl lock-sessions"

I can’t get it to trigger based on ATTRS, which seems reasonable since the device is already disconnected when the rule runs. Also, loginctl lock-sessions only works with some display managers, but it does work with gdm3 + GNOME.
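
A possible way to narrow the rule down, untested here: match on ENV properties rather than ATTRS, on the assumption that udev still has properties such as ID_VENDOR_ID available for remove events. 1050 is Yubico's USB vendor ID. The function name and the rules file name are made up; running udevadm monitor --environment --udev while unplugging the key shows which properties are really present.

```shell
#!/bin/bash
# Hypothetical helper: print a removal rule that matches on the vendor ID
# property (assumed to be present on remove events) instead of ATTRS.
yubikey_lock_rule() {
  local vendor=${1:-1050}
  printf 'ACTION=="remove", SUBSYSTEM=="usb", ENV{ID_VENDOR_ID}=="%s", RUN+="/usr/bin/loginctl lock-sessions"\n' "$vendor"
}

# Usage (not run here):
#   yubikey_lock_rule > /etc/udev/rules.d/85-yubikey-lock.rules
#   udevadm control --reload-rules
```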