Linux Notes: Real world problems and solutions

  1. The information presented here is intended for educational use by qualified computer technologists.
  2. The information presented here is provided free of charge, as-is, with no warranty of any kind.
  3. Edit: 2023-03-04

Real-world Linux Problems

1) We cannot install or update via YUM

We have two CentOS-7 platforms: one for development and one for production (comment: two platforms may not be enough when using Linux; see the IBM-Managed warning after this section). The recommended approach is to first install (or update) software on the development box. If testing for the next few days (to weeks) proves that everything is working properly then we would repeat the procedure on the production box. This also keeps both platforms more-or-less in sync.

I wanted to install the tree utility so I logged onto our DVLP platform where I entered this command:

sudo yum install tree

... which worked properly.

Then I repeated this command on our PROD platform which failed with numerous errors associated with file /usr/libexec/urlgrabber-ext-down which is a python script. What was worse was this: you could not execute firewall-cmd or most yum commands including yum check-update. Investigating further, I noticed that someone had installed python3 then updated the symbolic link so that typing a python command pulls up python3 rather than python2 (most Linux administrator utilities in 2018 require Python 2.7)

There are only two ways out of this problem (remember that this is an active business system).

  1. modify the first line (the shebang or sharp-bang line) of broken system scripts from this #!/usr/bin/python to this #!/usr/bin/python2
    -or-
  2. modify the symbolic link for "python" and point it back at the symbolic link "python2" which already points to "python2.7". This will restore the system to its previous functionality but could break something if newer customer scripts required python3 to be the default. So before modifying the symbolic link, modify the shebang line of customer scripts like this:
    #!/usr/bin/python to this #!/usr/bin/python3

Tip: if this is an emergency, just make minimal changes. For now, just modify the scripts for whatever is broken (e.g. yum or firewall-cmd). But you will eventually need to put everything back to a pristine state. If your stuff needs python3 then you need to rely upon shebang. I have no idea why the Linux maintainers didn't do this for their scripts requiring Python2. They broke their own golden rule.

# 1) partial example of a system with two versions of python
# 2) notice that "python" is pointing to "python2"
# 3) notice that "python2" is pointing to "python2.7"
# 4) utilities requiring python2 (like yum and firewall-cmd)
#    should say so on the shebang line of those scripts
#
$ cd /usr/bin
$ ls pytho* -la
lrwxrwxrwx. 1 root root     7 Jan 12 15:25 python -> python2
lrwxrwxrwx. 1 root root     9 Dec 20  2016 python2 -> python2.7
-rwxr-xr-x. 1 root root  7136 Nov  5  2016 python2.7
lrwxrwxrwx. 1 root root     9 Apr 12  2017 python3 -> python3.4
-rwxr-xr-x. 2 root root 11312 Jan 17  2017 python3.4
...snip...
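Before repointing the "python" link it helps to know which scripts still carry the generic shebang. The following is only a sketch: the function name is made up for illustration, the default folder is an assumption, and it only inspects the first line of each file.

```shell
#!/bin/sh
# List files in a folder whose first line is the generic python shebang
# (no version suffix). These are the scripts whose behavior changes when
# the "python" symbolic link is repointed.
# Assumption: the function name and default folder are illustrative only.
list_generic_python_shebangs() {
    dir="${1:-/usr/bin}"
    for f in "$dir"/*; do
        [ -f "$f" ] && [ -r "$f" ] || continue
        # look only at the start of the file; strip NUL bytes from binaries
        first=$(head -c 64 "$f" 2>/dev/null | tr -d '\0' | head -n 1)
        case "$first" in
            '#!/usr/bin/python'|'#!/usr/bin/python '*|'#! /usr/bin/python'|'#! /usr/bin/python '*)
                echo "$f" ;;
        esac
    done
}
```

Typical use would be "list_generic_python_shebangs /usr/bin" and "list_generic_python_shebangs /usr/libexec", while "readlink -f /usr/bin/python" shows where the link finally lands.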

2) Never use a graphical console to update Linux

I have experienced several instances where updating software through the graphical interface fails for some reason then breaks the graphical interface (or the whole system). It should not surprise anyone that updating the gnome-session, or any of its dependencies, might disturb the very session that is running yum or rpm.

So if you are on the graphical system console (which is almost always a VGA monitor) and want to move to non-graphical session console before running yum or rpm, try one of the following keystrokes:

key press     description                                  notes
CTRL ALT F1   switch to terminal 1 (graphical interface)   only graphical when runlevel >= 5
CTRL ALT F2   switch to terminal 2 (/dev/tty2)             text only
CTRL ALT F3   switch to terminal 3 (/dev/tty3)             text only
CTRL ALT F4   switch to terminal 4 (/dev/tty4)             text only
CTRL ALT F5   switch to terminal 5 (/dev/tty5)             text only
CTRL ALT F6   switch to terminal 6 (/dev/tty6)             text only

The only other way to safely disable graphics is to lower the runlevel of your system to 3. (but only do this if you are certain that you won't kill some process currently needed by your customers). Alternatively, use ssh to log into your system via the network then execute yum on that session.

update: Even though CentOS-7 does not use "/etc/inittab", and the text notes contained within say to do everything with systemctl, the following commands worked for me from the console as well as a network connection:

$ runlevel	# display current run level
runlevel N 5
$ init 3	# console #1 switches over to text mode
$ runlevel
runlevel 5 3
$ init 5	# console #1 switches back to graphics mode
$ runlevel
runlevel 3 5
Caveat: never init to a number below 3 over the network because that will kill the network so you WILL NOT be able to restore runlevel remotely
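For anyone who prefers the systemctl route mentioned in the inittab notes, each SysV runlevel has a systemd target that stands in for it. The helper below is a small sketch of that mapping (the function name is hypothetical); the systemctl lines are shown as comments because they need root and a systemd host.

```shell
#!/bin/sh
# Map a SysV runlevel number to the systemd target that replaces it.
runlevel_to_target() {
    case "$1" in
        0)     echo "poweroff.target" ;;
        1)     echo "rescue.target" ;;
        2|3|4) echo "multi-user.target" ;;
        5)     echo "graphical.target" ;;
        6)     echo "reboot.target" ;;
        *)     echo "unknown runlevel: $1" >&2; return 1 ;;
    esac
}

# Usage on a systemd host (as root; not executed here):
#   systemctl get-default                         # target used at boot
#   systemctl isolate "$(runlevel_to_target 3)"   # same effect as "init 3"
#   systemctl isolate "$(runlevel_to_target 5)"   # same effect as "init 5"
```

The same caveat applies: isolating rescue.target over the network will cut off your own session.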

3) Using Windows to access a Linux remote

The self-help blogs really fall down on this one because the only secure way to do this is to tunnel x-sessions over SSH. But whenever anyone on a self-help blog asks how to do this only using SSH, some idiot will chime in with a procedure on how to do it using VNC, RealVNC, TigerVNC or Vino which are all insecure.

To make matters worse, setting up a remote graphical session is almost impossible (at least under certain circumstances like Windows -> Red-Hat/CentOS) because GNOME3 contains 3-d extensions not found in Windows clients. The best way out of this is to setup Red-Hat/CentOS on a machine at the client end then use it to connect to the desired Linux platform.
 
comment: some conspiracy-minded people think this change was deliberately done to stop support professionals from using Windows as their default platform to support all others. They might be correct.

Xming

  • xming is a simple tool which is used in conjunction with a terminal emulator like: Tera Term or PuTTY

CygWin and CygWin/X

  • documentation: https://x.cygwin.com/docs/
    • https://x.cygwin.com/docs/ug/setup.html (jump to chapter 2.15)
    • x11 documentation tends to only speak of servers at each end (not true client server as one would normally think).
    • On the client side you must do this:
      • start an x11-server
      • connect through to the far-end
      • start a client over there which will open one, or more, x11 sessions back to your local server
  • Caveat: do not install cygwin/x without first reading the build instructions (a full build will not produce what you want and will waste your time and bandwidth)
  • connecting:
    action: start your local x-server (on the start menu)
          : task will appear on your horizontal task bar
    
    action: start xterm
          : associated with the x-server icon on the task bar
    
    type  : ssh -X name@fully-qualified-domain-name 
    	  (replacing "-X" with "-Y" is even better)
    action: wait for the password prompt then enter it
    
    type  : xterm &
    action: a new window should open
    
    type  : /gnome-weather &
    action: a graphical weather app will pop up	
    
    type  : /gnome-session --disable-acceleration-check &
    action: a new window manager should open (but does not)
          : 1) the --disable-acceleration-check switch was added to gnome starting with version 3.16
          : 2) without it you may see something like this: 
    	   "Oh no! Something has gone wrong.
    	    A problem has occurred and the system can't recover.
    	    Please log out and try again."
          : 3) gnome3 requires advanced graphics so may never work this way;
    	   try a desktop other than gnome 
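When "xterm &" on the far end does nothing, the first thing worth checking is whether ssh actually set up forwarding for the session. With "ssh -X" or "ssh -Y" the remote shell normally gets a DISPLAY value such as localhost:10.0. A minimal check, with a made-up function name, might look like this:

```shell
#!/bin/sh
# Report whether this shell session has a DISPLAY to send X11 clients to.
# Run it on the remote host right after logging in with "ssh -X" or "ssh -Y".
x11_forwarding_status() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "DISPLAY=$DISPLAY (X11 forwarding looks active)"
    else
        echo "DISPLAY is not set (X11 forwarding is not active)"
        return 1
    fi
}
```

If DISPLAY is empty, likely causes are a local x-server that is not running or X11 forwarding being disabled in the remote sshd configuration.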

4) Recovering a failed YUM update (2018-01-xx)

  • I recently had a yum update fail on the graphical interface (see note #2 above regarding why you should never do this)
  • YUM was in the middle of updating ~ 1200 items when the GUI collapsed. I waited 8-hours then rebooted. The system came up then attempted to start a graphics console which failed (could see an arrow cursor and nothing else)
  • I typed CTRL ALT F2 and was able to login as root (from here I reran YUM)

5) a recent YUM update broke our development box (2018-01-xx)

  • we use sqlcmd from msodbcsql (ms odbc sql) to access an old database (SQL Server 2005) running on an old OS (Windows Server 2003). The platform is located in another city and province; they never installed mandatory patches; I have no faith in them upgrading any time soon; we have been able to connect for 14-months.
  • a recent yum update on our development platform broke msodbcsql (the production platform continues to work)
  • checking logs on our production platform shows that we can connect to the Windows box only when using the sslv2 protocol ???
  • testing on PROD
    # notes:
    # 1) well-known port 1433 is reserved for Microsoft SQL Server
    # 2) SQL Server 2005 appears to support ssl2 but nothing higher
    # 3) contrary to popular belief, as soon as you specify a
    #    username and password in your ODBC connect string (via
    #    sqlcmd) then the initial handshake will be encrypted
    #
    # this passes
    #
    openssl s_client -debug -state -connect ip-address:1433 -ssl2
    #
    # this fails
    #
    openssl s_client -debug -state -connect ip-address:1433 -ssl3
  • Platform Differences
    Platform     CentOS Notes                          OpenSSL version  OpenSSL Notes
    Production   CentOS 7.3 (built 14-months ago)      OpenSSL-1.0.1e   ssl2 is supported
    Development  CentOS 7.4 (yum updated 2018-01-29)   OpenSSL-1.0.2k   ssl2 has been disabled prior to build
  • The simplest way out (at this time) is to build a fully-functional new version of OpenSSL-1.0.2k (in a local folder) then copy the binary to "/usr/bin" after renaming the old version (just being paranoid here). A good friend provided me with this link on the Ubuntu site (a Debian flavor) which seems to work properly on CentOS-7
  • https://wiki.openssl.org/index.php/Compilation_and_Installation
  • https://askubuntu.com/questions/893155/simple-way-of-enabling-sslv2-and-sslv3-in-openssl
  • Building a new version of OpenSSL (only in your own folder for now)
    wget https://openssl.org/source/openssl-1.0.2k.tar.gz
    tar -xvf openssl-1.0.2k.tar.gz
    cd openssl-1.0.2k/
    # --prefix will make sure that make install copies the files locally instead of system-wide
    # --openssldir will make sure that the binary will look in the regular system location for openssl.cnf
    # no-shared builds a mostly static binary
    ./config --prefix=`pwd`/local --openssldir=/usr/lib/ssl enable-ssl2 enable-ssl3 no-shared
    make depend
    make
    #
    # these next two steps are not required if openssl-1.0.2k already exists on your system.
    #
    make -i install
    sudo cp local/bin/openssl /usr/local/bin/
    #
    # test the newly created binary like so: ./apps/openssl s_client -debug -state -connect ip-address:1433 -ssl2
    # ...remembering that many other destinations will no longer accept ssl2
    # then rename the old binary (paranoid): mv /usr/bin/openssl /usr/bin/openssl-old
    # then copy the new binary: cp ./apps/openssl /usr/bin/openssl
    # 
  • Optional steps:
    1. for proper sslv23 handshaking (especially true when you only have an older ODBC connect string with no way to specify ssl parameters) you need to also include the switch no-tls-1-2-client
    2. building with the no-shared switch is necessary for testing your binary in a non-standard location but will result in the program being ~ 6 times larger (around 3.4 MB). Changing to switch shared will result in a binary size of ~ 600 KB.
Caveat: the procedure just given will only fix the OpenSSL CLI. Note that msodbcsql will still be broken because that software calls routines in the shared libraries. To fix msodbcsql you (supposedly) need to do one of the following:
  1. fully install an older version of OpenSSL (libraries and all) in a secondary location then ensure all scripts invoking sqlcmd look there
    • I built an older version of OpenSSL from source code then installed it in /opt/oldopenssl
    • all scripts starting sqlcmd first define LD_LIBRARY_PATH to point to /opt/oldopenssl/lib
    • although strace proves that msodbcsql is first looking in a secondary location, msodbcsql still does not work
  2. completely replace the new version of OpenSSL (libraries and all) with an older version
    • playing with yum downgrade openssl* has not yet worked but I think I may be close
  3. reinstall the previous OS (CentOS-7.2 in this case)
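The LD_LIBRARY_PATH idea from option 1 can be kept out of the general environment by wrapping only the commands that need the older libraries. This is a sketch, not a fix: as noted above msodbcsql still did not work for me this way. The folder /opt/oldopenssl/lib comes from the text; the function name and the override variable are hypothetical.

```shell
#!/bin/sh
# Run one command with an alternate OpenSSL library folder placed first on
# the dynamic linker search path.
# Assumption: OLD_OPENSSL_LIB and with_old_openssl are illustrative names.
OLD_OPENSSL_LIB="${OLD_OPENSSL_LIB:-/opt/oldopenssl/lib}"

with_old_openssl() {
    # The variable is set only for the child process, so the rest of the
    # session keeps using the system libraries.
    LD_LIBRARY_PATH="${OLD_OPENSSL_LIB}${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" "$@"
}
```

A script would then call "with_old_openssl sqlcmd ..." with its usual arguments instead of exporting the variable for everything else it runs.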

6) a recent update broke our production box (2018-06-xx)

One of our developers was experiencing problems developing a new LDAP-based application. So he invoked YUM to update LDAP on our production box. The big problem here is that the update was done in a careless way (without reading all the release notes). So the LDAP update also updated OpenSSL for the whole system so now we can no longer connect to that older Microsoft platform in Montreal. (see: item 5 above)

It now appears that we will need to install a third (older) CentOS platform whose only purpose would be to reach through to the older Microsoft platform. This platform would need to be modified so that it could never be updated.

7) Something is overwriting file "/etc/resolv.conf"

This problem is so weird that I'll stick to bullet points

  • Last month I set up a new CentOS-7 platform for use in a project we will turn up in Feb of 2019
  • I logged onto the console then used a GUI session to setup several network connections which included three corporate DNS references.
  • From this point on, logging into that platform was slow (10 second delays). This included using the "su" command once you were logged in.
  • One of my peers in a city 100km away discovered a typo in file "/etc/resolv.conf" which he fixed using a non-GUI login. The delays disappeared.
  • However, just logging into the console with my GUI session, or rebooting the box, caused the manual repair to be overwritten (10 second delays were back)
  • One needs to remember that Linux began its life as a personal computing platform and many programmers who work in this ecosystem still see it that way. This means there are all kinds of special hooks put in to support the GUI user.
  • If you drop this text "centos networkmanager overwrites resolv.conf" into a Google search then you will get a bunch of hits like this:
    https://ma.ttias.be/centos-7-networkmanager-keeps-overwriting-etcresolv-conf/
    where we learn that this has been going on since CentOS-6.
  • Apparently you are not supposed to enter DNS addresses into the GUI dialog for each NIC. If you do, the Network Manager will continue to copy this information from NIC config files then overwrite "/etc/resolv.conf"
  • There are two ways you can fix this problem:
    1. use an editor to modify the appropriate network settings file(s) which is not recommended in case you make a typo
      • just "/etc/sysconfig/network-scripts/ifcfg-eno1" in my case
    2. use the GUI to remove all DNS references from all your active network configs then use an editor to edit "/etc/resolv.conf".
      • My file now looks like this:
        # NOT Generated by NetworkManager
        # /etc/resolv.conf
        options timeout:1
        options attempts:1
        options rotate
        options no-check-names
        search on.bell.ca
        nameserver 142.182.48.71
        nameserver 142.182.48.105
        #nameserver 142.113.87.152
      • you might consider making a copy of this file like so:
        cd /etc
        cp resolv.conf resolv.conf-copy
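To see which NIC config files are feeding DNS entries to the Network Manager, a quick grep over the ifcfg files is usually enough. The sketch below assumes the CentOS-7 layout under /etc/sysconfig/network-scripts; the function name is made up for illustration.

```shell
#!/bin/sh
# Print every DNS-related setting found in the ifcfg-* files of a folder.
# DNS1=, DNS2=, ... are the entries the GUI dialog writes for each NIC;
# PEERDNS= controls whether DHCP-supplied servers are accepted.
list_ifcfg_dns() {
    dir="${1:-/etc/sysconfig/network-scripts}"
    grep -H -E '^(DNS[0-9]+|PEERDNS)=' "$dir"/ifcfg-* 2>/dev/null
}
```

Any file it lists is a candidate for the overwrite described above; no output means no DNS lines were found in that folder.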

8) Our console device is totally dead (2019-08-xx)

We are running CentOS-7.2 on two HP-ML370-g5 servers (one PROD, one DVLP) and both have been running for 30 and 24 months respectively without a reboot. These are older hardware platforms so I have been preparing to cut the whole thing over to two newer servers (HP-DL385p-gen8) next month. I just noticed I can't access the console on PROD.

command       result                     expected
CTRL ALT F1   screen turns solid blue    should be GUI mode
CTRL ALT F2   screen turns solid green   should be text mode
CTRL ALT F3   screen turns solid green   should be text mode

I need to point out that we can do anything else we want via a remote ssh terminal session over the network. In fact, the customers are unaware of the fact that anything is wrong.

I've tried everything (short of rebooting) including replacing the monitor and restarting various services (eg. "systemctl restart gdm.service") but it seems that the VGA port is locked up somehow.

SUGGESTION: every system admin must ensure that every system has at least one external network port configured -AND- that the firewall has been configured to permit ssh2 connections so you will be able to manage your platform if your VGA console is FUBARed.
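One way to confirm the firewall half of that suggestion is to look for "ssh" in the list of services firewalld already permits. The helper below is a sketch with a hypothetical name; it only inspects a list handed to it, and the real firewall-cmd calls are shown as comments because they need root.

```shell
#!/bin/sh
# Succeed when "ssh" appears as a whole word in a space-separated list of
# service names, such as the output of "firewall-cmd --list-services".
firewall_allows_ssh() {
    for s in $1; do
        [ "$s" = "ssh" ] && return 0
    done
    return 1
}

# Usage on the real host (as root; not executed here):
#   firewall_allows_ssh "$(firewall-cmd --list-services)" || {
#       firewall-cmd --permanent --add-service=ssh
#       firewall-cmd --reload
#   }
#   systemctl is-enabled sshd      # confirm the daemon starts at boot
```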

9) The old system won't reboot but some files are needed (2019-09-xx)

This is a continuation of item-8 after a 1 month delay. Okay so the good news is this: we have acquired a replacement server and copied all necessary files to it. Since everything appeared to be running properly on the new server, I finished the day by rebooting the old server to see if my VGA port was still broken. The VGA port was not defective but now the system only boots part way then drops into emergency text mode offering a few useless suggestions before presenting a root password prompt. Sometime later I got a call saying "we missed some files on the old server". Oops! So I tried rebooting again:

  • booting begins normally
  • I can see a nice solid gray GUI screen with a spinning white cursor so this was not a hardware problem
  • then the console crapped back to "text-only mode" with a prompt to choose between logging in as root or just continuing the boot process
    • I ran a few logs which did not help so I typed "exit" to allow the boot to continue
  • the console flipped to colored confetti in GUI mode with a red-orange spinning cursor; this would be okay if I could login over the network (I thought)
  • then the console crapped back to "text-only mode" with a prompt indicating to choose between logging in as root or just continuing the process
    • this system is not yet up; we cannot connect via the network
    • I can see the file system including the files which were missed
  • since this was an emergency, and my files were visible, I decided to try copying to a USB stick (a.k.a. thumb drive) but had never tried this before from the command line (it happens automatically in GUI mode).

How To Mount a USB stick (thumb drive) without GUI support
(also works with a 1-TB drive on a USB cable)

  • before you install a thumb drive, first inspect the contents of /dev like so:
    ---------- method 1 ----------
    cd /dev
    ls -ls sd*
    ---------- method 2 ----------
    lsblk # list block structured devices
  • you will see an "sd" device for every hard drive:
    sda    name of disk #1
    sda1   name of partition #1 (if one exists)
    sda2   name of partition #2 (if one exists)
    sda3   name of partition #3 (if one exists)
    sdb    name of disk #2
    sdb1   name of partition #1 (if one exists)
    sdb2   name of partition #2 (if one exists)
  • So on a system with only one hard drive, it is likely that inserting a USB stick will cause Linux to discover the device as "sdb" and its partition (if there is one) as "sdb1"
  • The following steps assume you are inserting a USB stick that was formatted as FAT32 via Windows then was discovered by Linux as sdb/sdb1
    mkdir -p /media/usb			# my mount point (use any name you wish)
    mount -t vfat /dev/sdb1 /media/usb
    ls -la /media/usb
    -----------------------------------
    cd /home/neil
    cp -p * /media/usb			# copy with preserve attributes 
    -----------------------------------
    umount /media/usb
  • for many operations (like: "cp -t") it may make more sense to first reformat the USB stick (or USB hard-drive) with a Linux file system like ext4
    ---------- step 1a ----------
    <<< format whole device (deletes any partitions) >>>
    mkfs.ext4 /dev/sdb              # formats whole device
    mount /dev/sdb /media/usb
    ---------- step 1b ----------
    <<< format partition #1 >>>
    mkfs.ext4 /dev/sdb1             # only format partition #1
    mount /dev/sdb1 /media/usb      # mount the partition that was just formatted
    ---------- step 2 -----------
    <<< common >>>
    set -ve                         # v = verbose (echo each line); e = stop on error
    ls -la /media/usb
    rsync -aX /etc /media/usb/      # a = archive; X = extended attributes (creates /media/usb/etc)
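The "inspect /dev before you insert the stick" step can be turned into a before-and-after comparison so there is no guessing about which name the stick received. This is a sketch; the function name is hypothetical and it simply compares two lists of names.

```shell
#!/bin/sh
# Print the names that appear in the "after" list but not in the "before"
# list (one name per line in each list).
new_block_devices() {
    before="$1"
    after="$2"
    printf '%s\n' "$after" | while IFS= read -r name; do
        printf '%s\n' "$before" | grep -qxF -- "$name" || printf '%s\n' "$name"
    done
}

# Usage (not executed here):
#   before=$(lsblk -ln -o NAME)
#   ... insert the USB stick and wait a few seconds ...
#   after=$(lsblk -ln -o NAME)
#   new_block_devices "$before" "$after"     # e.g. prints sdb and sdb1
```

Checking the result before running mkfs is cheap insurance against formatting the wrong disk.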

10) Now it will boot after this fix (2019)

This is a continuation of items-8-9. I messed around with the old server (~ 30 minutes each day) following the on-screen suggestions after the GUI drops back to text mode during boot. Here is one of the messages presented to me:

Error initializing authority: Could not connect: No such file or directory (g-io-error-quark, 1)

I began Googling various pieces of the above phrase including "(g-io-error-quark, 1)" which took me to this link at askubuntu.com (even though this is a CentOS problem). That article implicates erroneous entries in "/etc/fstab". Apparently any mount failure during boot is considered fatal even though the basic root directory ( "/" ) is in good shape. So I used a text editor to disable my last line of "/etc/fstab" then rebooted. The system came right up.

p.s. that one line I disabled in fstab was pointing at a disk which had been unmounted and deslotted shortly after the first boot 30 months ago.
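A stale fstab entry like that one can be spotted long before the next reboot. The sketch below (hypothetical function name) reports entries whose device node no longer exists. It only looks at entries written as /dev/... paths; UUID= and LABEL= entries, comments and blank lines are skipped.

```shell
#!/bin/sh
# Report fstab entries whose /dev/... device node is missing.
missing_fstab_devices() {
    fstab="${1:-/etc/fstab}"
    while read -r device mountpoint rest; do
        case "$device" in
            /dev/*)
                [ -e "$device" ] || echo "missing: $device (mount point $mountpoint)" ;;
        esac
    done < "$fstab"
}
```

For non-essential volumes, adding the "nofail" mount option to the fstab line is another way to keep a missing disk from stopping the boot, though I would test that on a spare machine first.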

Caveat: I have seen one situation where log files written to files under "/var/log" had filled the associated partition (some files like wtmp and others under "/var/log/gdm" can grow forever if your system hasn't been rebooted for a while). Type "df -h" to inspect disk free space. If near full, and if an emergency, you might consider deleting some of the larger log files before you reboot. Use this command to display files larger than 2MB

find /var/log -size +2M -exec ls -la {} \; 
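The "near full" check can also be scripted so it is easy to run from cron or before a reboot. This is a sketch with a made-up function name: it reads the output of "df -P" on standard input and prints the mount points at or above a percentage limit. Mount points containing spaces are not handled.

```shell
#!/bin/sh
# Read "df -P" output on stdin; print "mountpoint use%" for every file
# system whose use percentage is at or above the limit (default 90).
full_filesystems() {
    limit="${1:-90}"
    awk -v limit="$limit" '
        NR > 1 {
            use = $5
            sub(/%/, "", use)
            if (use + 0 >= limit) print $6, $5
        }'
}

# Usage (not executed here):
#   df -P | full_filesystems 90
```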

11) Doing better backups for faster system recovery (2019-09-xx)

  • If your business requires 100% uptime then you will probably need to go to some sort of cloud-based solution
    • you could completely outsource to companies like Amazon Web Services ( https://aws.amazon.com ) which will make you a short term hero but eventually put yourself out of work
    • or you could build your own cloud solution using products like OpenStack
  • If you are not ready to jump to a cloud then you should harden your existing stuff
  • I thought I was doing due diligence when it comes to doing backups (and restores) but my real-world items 8-10 (above) prove that I was not
  • For most businesses in 2019, I should not need to mention that hardware is now relatively inexpensive -AND- operating systems almost free (at least this is the case for the open-source variety) so you could install more stand-by platforms if you can't afford to lose your primary system for too long (say 15 minutes).
  • Prior to 2019-09-xx we only ran two systems, PROD (which is our production platform) and DVLP (which is our development and qualification platform). We were doing daily backups to magnetic media on a 14-day rotation but here you can see the big problem: losing PROD means one of the following:
    • recover the platform from magnetic media (will take a very long time)
    • build a new platform from optical media then apply changes from magnetic media
    • apply changes from magnetic media to DVLP then divert your customers there (but then you would still need to develop a plan to go back)
    • or a fourth option described next
  • Since then, I have installed four more systems (two local; two very remote) then use rsync to copy (twice a day) changes to backup locations from which we can do rapid restores by just copying

New Block Diagram

 PROD  (Linux)    DVLP  (Linux)    OpenVMS 
+-------------+  +-------------+  +-------+
| primary     |  | primary     |  | PROD  |
+-------------+  +-------------+  +-------+

+-------------+  +-------------+  +-------+
| local copy  |  | local copy  |  | DVLP  |
+-------------+  +-------------+  +-------+

+-------------+  +-------------+
| remote copy |  | remote copy |
+-------------+  +-------------+ 
  • All Linux systems are currently running CentOS-7.9 with Apache and MariaDB
    • primary Linux systems employ rsync to copy to local copy (same data facility) several times a day
    • primary Linux systems employ rsync to copy to remote copy (a different city more than 100-km away) several times a day
    • having a local stand by can provide peace of mind when you wonder if the next YUM update might break something
    • unlike Amazon or Alibaba, these systems do very little between 21:00 and 8:00
    • this scheme is also useful when migrating to newer server hardware
  • All OpenVMS platforms are running ver 8.4 (itanium2)
    • these machines previously did daily backups to tape which were delivered off site (M-F, excluding holidays)
    • Now, these machines copy their backups into a folder on "DVLP Linux primary" which are then rsync'd to local standby and remote standby every day
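The copy scheme in the diagram boils down to one rsync command per destination, run from cron on each primary. The sketch below only prints the commands so they can be reviewed first; the function name, host names and paths are placeholders, not our real ones.

```shell
#!/bin/sh
# Print one rsync command per backup destination.
# Assumption: source path and destinations are hypothetical placeholders.
print_backup_commands() {
    src="$1"
    shift
    for dest in "$@"; do
        echo "rsync -aX $src $dest"
    done
}

# Usage (not executed here):
#   print_backup_commands /var/www local-copy:/backup remote-copy:/backup
# A crontab entry that runs a backup script twice a day could look like:
#   0 6,18 * * * /usr/local/bin/backup.sh
```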

12) Python3 caching is currently broken on most Linux distros running SELinux (2019-10-xx)

caveat: this problem covers web applications using Python3 directly (i.e. when not using Django or WSGI)

  • first off, read up on Python3 caching (compiled bytecode is stored in "__pycache__" folders)
  • now imagine an Apache process running Python3 script /var/www/cgi-bin/file (without a file extension)
  • Until the problem is fixed you only have a few options:
    1. place SELinux in permissive mode.
      1. this isn't as bad as it sounds provided this is done temporarily; test your web-services via Apache then ensure that Apache has compiled-cached all the Python3 scripts; then shift SELinux back into enforcing mode
      2. remember to do this every time you update any Python3 scripts -or- do major Python3 upgrades (like from 3.6 to 3.8)
    2. inspect the suggestions SELinux has written to /var/log/messages then craft your own temporary fix (these suggestions are always written; even when SELinux is in permissive mode)
    3. live with the problem (Python3 will run like Python2) but remember that every transaction may consume an additional 10-15 ms since you will be always compiling but never caching

update: as of 2020-04-30 I have not seen any movement on this problem for CentOS-7.7 but I have heard rumors of a beta RPM for CentOS-8.
update: on 2020-05-13 this patch was available from the CentOS repositories: libselinux-python3.x86_64 (ver2.5-15.el7)
comment: perhaps it is not unreasonable to wait 8 months for a patch on unsupported software (CentOS). Although "Cent" supposedly means "community enterprise", companies requiring faster service would be advised to move to RHEL along with a support agreement.

13) FUBAR with USB Audio (2019-11-xx)

test #1

  • build a CentOS-7.7 system using recipe "Server with GUI"
  • log into the GUI console with any priv account other than root
    • insert any USB audio device (I tried these two):
      • Logitech S150 Digital USB Stereo Speakers
      • Ugreen USB to Audio
    • audio testing from any software app (including Gnome audio settings) produces no audio
  • now log out then back in as root
    •  audio testing now works properly

test #2

  • build a CentOS-7.7 system using recipe "Gnome Desktop"
  • log into the GUI console with any priv account other than root
    • insert any USB audio device
    • audio testing works

comments:

  • USB storage devices (thumb-drives, hard drives, and DVD/CD drives) always work properly (they are owned by the GUI session logged into the console) so I wonder why this doesn't always happen with USB Audio devices.
  • After you get control of your audio device from Gnome Audio Settings, audio streaming from internet radio stations only work properly from Google Chrome (78) but not Firefox (68)
  • installing Google Chrome:
  • Firefox Linux updates:
    • Firefox version 78.11.0 now works (tested 2021-06-30 with CentOS-7.9) but fails on Rocky Linux 8.7 (tested 2022-12-22)

14) One LVM volume is too big, the other too small

  • Our production CentOS systems have been up and running for 560 days with not too many difficulties
  • Our primary PROD and DVLP platforms are implemented on HP DL385p_gen8 servers with 8-drives configured as a single RAID-60 volume of 1-TB of storage space
  • I was a Linux newbie when I installed CentOS-7 on these machines so went with the suggested partitioning and LVM (a big mistake). This means that the LVM representing slash (a.k.a. root) is sitting at 50-GB whilst the LVM representing slash-home is sitting at 950-GB (this would be okay if we were in a university with a lot of interactive users; but we only have 3 interactive accounts and 6 SAMBA shares)
  • The problem here is that our MariaDB database (an alternate fork of MySQL) has grown to the point that we've only got 30% free space on the root LVM
    caveat: I was a little wiser when I set up the 4 backup systems (2 local, 2 remote). On these machines I instructed the installer to do a 50-50 split between root and slash-home.
  • According to this document it should be easy to free up space on one LVM then apply it to the other LVM while the system is running
    https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_and_managing_logical_volumes/assembly_modifying-logical-volume-size-configuring-and-managing-logical-volumes
    but we have a problem. Apparently you cannot use lvreduce on an xfs formatted volume. Note that xfs is what I see when I type either one of these commands:
    df -Th              # disk free (human units; fs Type)
    mount | grep mapper # display mount points (1)
    mount | grep centos # display mount points (2)
  • Also ignore all internet advice claiming you can reduce the size of an xfs volume because it is not possible. "xfs" is really fast because it was only made to grow.
  • At this point we've got three or four options:
    1. shutdown the application then take the database offline; move the database from the root LVM to the slash-home LVM; restart the application
      note: I've got a few example procedures which involve about 12 steps when SELinux is present
    2. Backup the LVM volume associated with "/dev/mapper/centos-home" then delete it. Create a smaller version then restore your backup into it
      note: the application does not need to be taken offline
      Step-1
      a) log on as root                           # do not first log on under /home then su
      b) ensure no interactive users              # command: w
      c) stop-disable cron jobs needing "/home"   #
      d) stop the SAMBA service                   # command: sudo service smb stop

      Step-2
      mkdir /hack                                 # create tmp folder (if sufficient space here)
      rsync -aX /home /hack                       # copy everything (-X = includes SELinux stuff)
      ---------------
      du -a /home                                 # check the src file count
      du -a /hack/home                            # verify the dst file count (do they match?)

      Step-3
      umount /dev/mapper/centos-home              # un-mount this volume

      Step-4
      lvremove /dev/mapper/centos-home            # DANGER (now past the point of no return)

      Step-5
      lvcreate -L 400GB -n home centos            # create a smaller replacement LVM
                                                  # (keep 100 GB free just in case)

      Step-6
      mkfs.xfs -f /dev/centos/home                # format the volume with xfs
                                                  # note1: here the "-f" switch means "force"
                                                  # note2: "-t xfs" assumed (but not allowed with -f)

      Step-7
      mount -a                                    # mount everything in "/etc/fstab"
                                                  # -OR TYPE-
                                                  # mount /dev/mapper/centos-home /home

      Step-8
      lvextend -r -L500GB /dev/mapper/centos-root # extend this volume to 500 GB; use "-r" to resize
                                                  # without "-r" you will only see the new size in "lvs"
                                                  # with "-r" you will also see the new size in "df -h"

      Step-9
      rsync -aX /hack/home /                      # restore everything including SELinux stuff
      du -a /hack/home                            # check the src file count
      du -a /home                                 # verify the dst file count (do they match?)

      Step-10
      sudo service smb start                      # start Samba
      sudo service smb status
    3. Both our LVM volumes are sitting on the same RAID-60 volume so having two LVM volumes is surely redundant. We will back up the LVM volume associated with "/home" then delete it via lvremove. We also need to delete the "/home" mount point (it was created with the "-p" switch) via rmdir. Now just restore the backup into "/home"
      Caveat: remember to edit "/etc/fstab" then disable the line where "/dev/mapper/centos-home" was mounted as "/home". Why?
      The system may not properly boot to a functional state without this step. See tip-10 above
      note: the application does not need to be taken offline
    4. shut down the application then take the database offline; take the production server offline; take the production backup server offline; swap node names and IP addresses; bring everything back up
      note: this will always be the default action if any of the above operations fail for whatever reason
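
Option-2 compares two "du -a" listings by eye (once after the backup and once after the restore). The following is a minimal Python sketch of the same check, not part of the original procedure; the function names are hypothetical and it compares path names only (not sizes, ownership or SELinux labels).

```python
import os


def list_entries(root):
    """Return the set of paths (relative to root) of every file and directory under root."""
    entries = set()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            entries.add(os.path.relpath(full, root))
    return entries


def compare_trees(src, dst):
    """Return (missing_in_dst, extra_in_dst) as two sorted lists of relative paths."""
    src_entries = list_entries(src)
    dst_entries = list_entries(dst)
    missing = sorted(src_entries - dst_entries)
    extra = sorted(dst_entries - src_entries)
    return missing, extra
```

For Step-2 this would be called as compare_trees("/home", "/hack/home") and for Step-9 as compare_trees("/hack/home", "/home"); two empty lists mean both trees hold the same set of names. Run it as root so that every directory is readable.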
Preliminary Steps
  • I downloaded a copy of Oracle VM VirtualBox from here ( https://www.oracle.com/virtualization/virtualbox/ ) then installed it on my PC
  • I next created a Red-Hat virtual machine (make sure your virtual hard disk is at least 100-GB in size for this 2-LVM experiment) then installed CentOS-7.7 into it with the default options (about 50-GB was assigned to each LVM)
    • Testing Option-2
      • preliminary tests worked as I can log in with my non-root user account
  • I next installed CentOS-7.7 into a surplus server (HP DL385p_gen8 with single 1-TB disk volume) in my lab
    • Testing Option-2 (attempt 1)
      • preliminary tests worked as I can log in with my non-root user account
      • secondary tests failed (I cannot start my Chrome browser or listen to any audio from NPR radio stations); this appears to be an SELinux problem since the event logs are full of those kinds of messages
        • logged back on as root then ran this SELinux command which (I think) fixed my account problems because my Chrome browser now works properly
          restorecon -rv /home
          but it would have been better to not rely on something like this which is done after-the-fact
        • I realized that I should have been using the "-X" switch in all my rsync operations so I updated the table above
      • I will repeat the total Option-2 experiment tomorrow
    • Testing Option-2 (attempt 2)
      • preliminary tests worked as I can log in with my non-root user account
      • secondary tests passed as my non-root account can use the browser with streaming audio (yay!)

Next Steps

  • time to try this on our DVLP box (I need to do it on a Saturday when no interactive users are locking files in slash-home)
  • 2020-08-15: executed option-2 (above) on node "kawc4n" (DVLP); appears to be 100% successful; the whole procedure took a little less than 2-hours because I had to restore 205-GB over a 1-Gb/s Ethernet
  • 2020-08-22: executed option-2 (above) on node "kawc0f" (PROD); appears to be 100% successful; the whole procedure took a little less than 1-hour because the contents of slash-home was only 5-GB so I first performed an rsync backup to a local folder
  • conversions complete!

15) Yum is failing to initialize on one system of six (2020-09-xx)

I am running 6 servers (one PROD, one DVLP, two local shadows, two remote shadows) but YUM is failing to initialize on the oldest unit (both PROD and DVLP were built in 2018 as CentOS-7.5 then YUM updated to CentOS-7.6)

Now inspect the following (pay attention to the red text - especially the last one just before the final prompt)
[root@kawc0f /]# yum makecache fast
Loaded plugins: fastestmirror, langpacks
Determining fastest mirrors
epel/x86_64/metalink                                     |  16 kB  00:00:00
Could not retrieve mirrorlist ...
... https://mirrors.iuscommunity.org/mirrorlist?repo=ius-centos7&arch=x86_64&protocol=http
error was 14: HTTPS Error 404 - Not Found

One of the configured repositories failed (Unknown), and yum doesn't have enough cached
data to continue. At this point the only safe thing yum can do is fail. There are a few
ways to work "fix" this:

    1. Contact the upstream for the repository and get them to fix the problem.

    2. Reconfigure the baseurl/etc. for the repository, to point to a working upstream. This is
       most often useful if you are using a newer distribution release than is supported by the
       repository (and the packages for the previous distribution release still work).

    3. Run the command with the repository temporarily disabled
           yum --disablerepo=<repoid> ...

    4. Disable the repository permanently, so yum won't use it by default. Yum will then just
       ignore the repository until you permanently enable it again or use --enablerepo for
       temporary usage:

           yum-config-manager --disable <repoid>
       or
           subscription-manager repos --disable=<repoid>

    5. Configure the failing repository to be skipped, if it is unavailable. Note that yum will
       try to contact the repo. when it runs most commands, so will have to try and fail each
       time (and thus. yum will be be much slower). If it is a very temporary problem though,
       this is often a nice compromise:

           yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true

Cannot find a valid baseurl for repo: ius/x86_64
[root@kawc0f /]#

Now Google the red quoted phrase "Cannot find a valid baseurl for repo: ius/x86_64" which returns hits like this:
https://github.com/iusrepo/announce/issues/18
where we learn that many of the repositories have moved from the ".org" domain to the ".io" domain. Apparently support for ".org" ended in April-2020.
Back in the day you would need to execute something like this:
yum install -y https://centos7.iuscommunity.org/ius-release.rpm

but this fixed my problem:
yum install -y https://repo.ius.io/ius-release-el7.rpm

16) The future of CentOS (the invisible hand of the market?)

Comments:
  • Scientific Linux (SL) first appeared in 2004 and was popular among scientists working at FermiLab, CERN and DESY to just name three of many. With the success of CentOS many organizations were convinced to swap CentOS for Scientific Linux. For example CERN (the home of the LHC) began favoring CentOS in 2015 although SL is still supported at CERN as of Dec-2020. Red Hat made an end-of-life announcement for Scientific Linux in 2019 (before Red Hat was acquired by IBM).
  • When it comes to Linux, my employer uses RHEL for all customer-facing production platforms and CentOS for everything else (this includes everything from application development, user acceptance, hands-on Linux training, etc). CentOS was also being used as an on-ramp for driving UNIX projects onto RHEL platforms. It appears that IBM has thrown a monkey-wrench into those plans. I have no idea what the future holds but history can be instructive. Recall that when Michael Widenius and others didn't like where SUN was taking MySQL, they created MariaDB (that decision seems prescient after Oracle acquired SUN; then promised the EU not to kill MySQL; then slowed MySQL bug fixes for more than a year until they noticed that the Linux community was preferentially installing MariaDB).

17) All four GUI consoles are locked up

caveat: all work here was done from a non-GUI session (usually a network connection)

  • Due to the COVID-19 pandemic, I only take a trip into the office once a week (usually Friday afternoons) just to ensure the environment is secure.
  • We are running four CentOS servers (all: HP DL385p Gen8; one PROD; one DVLP; a hot standby for each)
  • On my Friday afternoon routines I always check the drive LEDs but never check the consoles which are usually dark (in energy-saving mode)
  • During my walk-through today:
    • I noticed PROD-console was not dark AND contained a solid charcoal-colored background with some green OK messages which (I think) came from systemctl. I was able to get a non-GUI login prompt after hitting the CTRL-ALT-2 combo (3 and 4 worked as well)
    • I noticed DVLP-console was not dark AND contained a solid-black background with some white text messages which (I think) came from dmesg. I was able to do the CTRL-ALT-2 thing here as well
    • The two hot standby units only presented text-prompts (no GUI).
      • Logging onto them then typing startx did not bring up a GUI. Typing systemctl get-default returned "graphical.target". Typing systemctl isolate graphical.target did nothing.
    • All units had not been running for more than 400 days.
  • Step-1 (hot standby units)
    • I tried both yum check-update and yum update but nothing was offered (perhaps these Linux instances were too old)
    • since no humans were logged into these, I used yum upgrade to bring them up from CentOS-7.7 to CentOS-7.9 then rebooted
    • Still no GUI on the consoles (even after logging in then typing startx) so I typed this:
      1. yum groups install "GNOME Desktop" (why wasn't this stuff updated during the OS upgrade?)
      2. systemctl isolate multi-user.target (equivalent to setting runlevel=3 on UNIX boxes)
      3. systemctl isolate graphical.target (equivalent to setting runlevel=5 on UNIX boxes)
    • Now the GUI auto-magically appeared on both consoles.
  • Step-2 (DVLP and PROD)
    • People logged on here so a reboot was not possible.
    • So I repeated steps 2-3 above which moved TTY1 out of GUI mode but could not put it back.
    • Next, I repeated steps 1-3 which brought back the console GUI.

going forward

  • We employ graphical consoles for the odd time that we prefer to do something quickly (like making changes to the software firewall or reconfiguring a NIC). But this is a data center where console devices usually do not exist. On top of that, it is becoming apparent to me that GUI consoles are more trouble than they are worth so I'm going to permanently move these systems from graphical.target to multi-user.target with the hope that a simple startx will be all that is required for occasional graphical support at the console:
    systemctl isolate multi-user.target        # this temporary command makes immediate changes
    systemctl set-default multi-user.target    # this permanent command will affect the next reboot
  • caveat: remember that you will need to log out twice. Once from GUI mode then once from text mode

18) Finally solved a slow-response problem (2021-06-18)

  • I have been running six CentOS systems for the past few years (I run rsync multiple times a day to keep local-copy and remote-copy systems reasonably up-to-date)
    humans here    no humans here    no humans here
    PROD           local-copy        remote-copy
    DVLP           local-copy        remote-copy
  • DVLP remote-copy has never worked properly from an interactive point of view although rsync jobs to it are fine
    • symptoms:
      • after typing either "su -" or "sudo" I only see a password prompt after 5-10 seconds (this delay is not seen after a reboot)
      • typing "top -d 0.5" refreshes the display every 3-4 seconds rather than twice a second (this happens all the time)
      • lots of canary messages from rtkit-daemon in /var/log/messages like this one:
        Jun 4 16:46:00 bfdc0e rtkit-daemon[1021]: The canary thread is apparently starving. Taking action.
    • solution
      • for a time I thought the canary messages were associated with a bad USB device so I had someone unplug the console mouse and keyboard but this did nothing (the remote location is 160 km away)
      • for a time I thought it might be a BIOS problem since all my working machines employed a BIOS from 2014 whilst this one was from 2016. HP's BIOS release notes contain a lot of references to timing issues associated with AMD processors so I decided to hack (er, play)
        • I typed the 'lscpu' command so I could see which cores were where
        • next, I disabled all cores associated with the second CPU and this fixed my slow-response problems.
        • use the man pages to learn how to disable CPU cores or review the lines in this BASH script: cpu_control.sh
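
The core-disabling step can be sketched in Python. This is not the cpu_control.sh script mentioned above (its contents are not shown here). It is a hedged illustration that assumes the machine-readable output of "lscpu -p=CPU,SOCKET" and the standard sysfs switch "/sys/devices/system/cpu/cpuN/online"; the function names are hypothetical.

```python
def cpus_on_socket(lscpu_parse_output, socket_id):
    """Return the logical CPU numbers on socket_id, given `lscpu -p=CPU,SOCKET` output."""
    cpus = []
    for line in lscpu_parse_output.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the commented header lscpu emits
        cpu, socket = line.split(",")[:2]
        if int(socket) == socket_id:
            cpus.append(int(cpu))
    return cpus


def offline_commands(cpus):
    """Build (but do not run) the shell commands that take each logical CPU offline."""
    return [f"echo 0 > /sys/devices/system/cpu/cpu{n}/online" for n in cpus]
```

The generated commands must be run as root, and writing 1 to the same file brings a CPU back online. The change does not survive a reboot, so it is reasonably safe to experiment with.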

19) Updating an old installation in stages (2021-10-xx)

caveat: this problem is ongoing (so the following proposed solution is untested)

  • I have two old systems running CentOS-7.3 which appear to be too old to upgrade via yum
    (I have never seen this happen before so do not yet know what is going on; but I am seeing a lot of errors mentioning PROTECTED MULTILIB VERSIONS)
  • these two commands fail as yum tries to update directly to CentOS-7.9
    yum upgrade
    yum upgrade --skip-broken
  • I am going to attempt to update this system in stages as described here: https://digitolle.wordpress.com/2017/10/26/how-to-upgrade-centos-to-a-specific-version/
    where we modify this file: /etc/yum.repos.d/CentOS-Base.repo
  • steps:
    1. first visit this site to see what you need: https://vault.centos.org/ (when updating to 7.4 you need to specify 7.4.1708)
    2. modify yum config files to only point at the centos vault:
      su -
      cd /etc/yum.repos.d
      cp CentOS-Base.repo CentOS-Base.repo-old
      vi CentOS-Base.repo
      # just comment the mirror list AND uncomment the baseurl in four places
      [base]
      ...
      #mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os
      baseurl=http://vault.centos.org/$releasever/os/$basearch/
      ...
      [updates]
      ...
      #mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=updates
      baseurl=http://vault.centos.org/$releasever/updates/$basearch/
      ...
      [extras]
      ...
      #mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=extras
      baseurl=http://vault.centos.org/$releasever/extras/$basearch/
      ...
      [centosplus]
      ...
      #mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=centosplus
      baseurl=http://vault.centos.org/$releasever/centosplus/$basearch/
      ...
    3. now type this:
      yum upgrade --releasever=7.4.1708                  
    4. reboot then repeat all these steps for each required version (the yum config file will be clobbered each time)
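
Since the repo file is clobbered at every stage, the edit in step-2 could be scripted rather than repeated in vi. This is a minimal sketch and as untested against a real system as the rest of this section. It assumes the stock file carries commented lines of the form "#baseurl=http://mirror.centos.org/centos/..." and the function name is hypothetical.

```python
import re


def point_repo_at_vault(repo_text):
    """Comment out every mirrorlist line and enable a vault.centos.org baseurl instead."""
    out = []
    for line in repo_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("mirrorlist="):
            out.append("#" + stripped)  # disable the mirror list
        elif re.match(r"#\s*baseurl=", stripped):
            url = re.sub(r"^#\s*baseurl=", "", stripped)
            # assumption: the stock URL uses mirror.centos.org/centos/
            url = url.replace("mirror.centos.org/centos/", "vault.centos.org/")
            out.append("baseurl=" + url)  # enable the vault URL
        else:
            out.append(line)  # leave everything else untouched
    return "\n".join(out) + "\n"
```

Read "/etc/yum.repos.d/CentOS-Base.repo", pass the text through this function, then write the result back (after making the backup copy shown in step-2).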

20) rpm hack to get mpack + munpack on CentOS-7

  • I am attempting to move an inbound email processing interface from OpenVMS to CentOS-7
  • The old application is dependent upon munpack (mime-unpack) which is not available on CentOS-7 but was available on CentOS-6, UNIX and Windows
  • To make matters worse, if you drop these two quoted words: "centos 7" "munpack" into a google search, you will be referred to a Red Hat developer site where you are instructed to use uudecode which can only be used on pieces of an email after it has been pulled apart (so uudecode does not come close to being a drop-in replacement for munpack)
  • Another promising program is ripmime but what do you do if you really want to stick with munpack?
  • I decided to first look at the application source files which were first published in UNIX. Then I looked at the CentOS-6 rpm file to see if it would be easier to modify it for use with CentOS-7. And that is when I discovered two binary executable files which can be used as-is:
    • I tested them on my CentOS-7 and they worked without any complaint
    • even though the command "munpack -?" only shows switches '-f' and '-q', the program supports '-t' (text try harder) which means this implementation is really version 1.6-2b which is seen on some OpenVMS systems
  • Here are my raw notes:
    caveat: ('el6' means 'enterprise linux 6' so the rpm is for CentOS-6 or RHEL-6):
    ================================================================================
    title  : mpack_notes.txt
    author : Neil Rieck
    created: 2021-11-17
    edit   : 2021-11-18 
    notes  : the 'munpack' utility is also found here
    platform: CentOS-7
    stanzas:
    1) play with file mpack-1.6.tar.gz
    2) play with file mpack-1.6-2.el6.rf.x86_64.rpm 
    ================================================================================
    
    1) mpack-1.6.tar.gz (mpack + munpack for UNIX + Windows)
    ------------------------------------------------------------
    tar -tvf mpack-1.6.tar.gz                           # list verbosely (f=mpack-1.6.tar.gz)
    tar -xvf mpack-1.6.tar.gz                           # extract verbosely (f=mpack-1.6.tar.gz)
                                                        # note: creates folder 'mpack-1.6'
    tar -xvf mpack-1.6.tar.gz -C yada                   # place output in folder 'yada'
    ================================================================================
    2) mpack-1.6-2.el6.rf.x86_64.rpm (mpack + munpack for CentOS-6)
    -------------------------------------------------------------------
    mkdir mpack_rpm_hack                                # create directory
    cp mpack-1.6-2.el6.rf.x86_64.rpm mpack_rpm_hack     # copy file to folder
    cd mpack_rpm_hack                                   # move into folder
    rpm2cpio mpack-1.6-2.el6.rf.x86_64.rpm | cpio -idmv # extract contents
    tree --charset="ascii"                              # see the mess
    .
    |-- mpack-1.6-2.el6.rf.x86_64.rpm
    `-- usr
        |-- bin
        |   |-- mpack
        |   `-- munpack
        `-- share
            |-- doc
            |   `-- mpack-1.6
            |       |-- Changes
            |       |-- INSTALL
            |       |-- README.mac
            |       `-- README.unix
            `-- man
                `-- man1
                    |-- mpack.1.gz
                    `-- munpack.1.gz

    7 directories, 9 files
    [neil@kawc4n mpack_rpm_hack]$
    ./usr/bin/munpack -? # test the binary as-is
    munpack version 1.6 # yay!
    usage: munpack [-f] [-q] [-C directory] [files...]

21) procmail problems with SELinux on CentOS-7

  • I am using procmail v3.22 which is very old
  • I just wrote an application where procmailrc starts a Python3 app which:
    • opens a connection to a relational database (MySQL - MariaDB) on port 3306
    • then sends an email reply on port 25
  • The application does not work with SELinux in "enforcing mode" but does work in "permissive mode" (obviously)
  • I am seeing messages in /var/log/messages from setroubleshoot indicating problems with:
    • some scripts not being executable (as far as SELinux is concerned)
    • ports 3306 and 25 being blocked (as far as SELinux is concerned)
  • I have installed optional modules so that "man procmail_selinux" works but it now appears that I will need to write my own SELinux module for this application
  • here are two interim solutions: link

22) cannot resolve host names after a CentOS-7 upgrade

  • CAVEAT: this problem is specific to my personal "Virtual Private Server" hosted by IONOS
  • Since this is just my hobby site, I had not done a "yum update" for over two years.
  • So I typed "sudo yum update" (which brought me up to CentOS-7.9) then rebooted.
  • But now I cannot do any more updates because this system cannot resolve host names.
  • I first checked file "/etc/resolv.conf" which was blank (oops!)
  • steps:
    <sr> [my-root-prompt]
    <ur> ifconfig
    <sr> ens192: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
             inet 74.208.23.87 netmask 255.255.255.255 broadcast 74.208.23.87
             inet6 fe80::250:56ff:fe0a:4fab prefixlen 64 scopeid 0x20<link>
             ether 00:50:56:0a:4f:ab txqueuelen 1000 (Ethernet)
             RX packets 1822 bytes 254353 (248.3 KiB)
             RX errors 0 dropped 0 overruns 0 frame 0
             TX packets 1214 bytes 2218180 (2.1 MiB)
             TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
    [my-root-prompt]
    <ur> ip addr
    <sr> ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
             link/ether 00:50:56:0a:4f:ab brd ff:ff:ff:ff:ff:ff
             inet 74.208.23.87/32 brd 74.208.23.87 scope global dynamic ens192
                valid_lft 41794sec preferred_lft 41794sec
             inet6 fe80::250:56ff:fe0a:4fab/64 scope link
                valid_lft forever preferred_lft forever
    [my-root-prompt]
    <ur> cd /etc/sysconfig/network-scripts/
    <sr> [my-root-prompt]
    <ur> ls -l *ens192*
    <sr> -rw-r--r-- 1 root root 148 Mar 5 08:00 ifcfg-ens192
    [my-root-prompt]
    <ur> cat ifcfg-ens192
    <sr> BOOTPROTO=dhcp               (note: since DHCP we need DNS info from the provider)
         DEVICE=ens192
         DHCPV6C=yes
         DHCPV6C_OPTIONS="-nw"
         IPV6_AUTOCONF=yes
         IPV6INIT=yes
         NM_CONTROLLED=no
         ONBOOT=yes
         TYPE=Ethernet
    [my-root-prompt]
    <ur> vim ifcfg-ens192             (edit desired file with vim or nano)
         PEERDNS=yes                  (add on this line)
    <sr> [my-root-prompt]
    <ur> systemctl restart network.service
    <sr> [my-root-prompt]
    <ur> nslookup ibm.com
    <sr> Server:  212.227.123.16
         Address: 212.227.123.16#53

         Non-authoritative answer:    (success)
         Name:    ibm.com
         Address: 23.35.139.245
         Name:    ibm.com
         Address: 2600:1407:21:282::3831
         Name:    ibm.com
         Address: 2600:1407:21:28f::3831
    [my-root-prompt]
    ----------------------------------------- optional
    <ur> cd /etc
    <sr> [my-root-prompt]
    <ur> cp resolv.conf resolv.conf-copy
    [my-root-prompt]
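
The "blank resolv.conf" symptom is easy to test for before starting an update. This is a minimal sketch with a hypothetical function name; it only looks for "nameserver" entries and ignores "search" and "options" lines.

```python
def nameservers(resolv_conf_text):
    """Return the list of nameserver addresses found in the text of a resolv.conf file."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", ";")):
            continue  # blank line or comment
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers
```

An empty list from nameservers(open("/etc/resolv.conf").read()) reproduces the problem described above. After the PEERDNS fix and a network restart, the list should contain whatever DNS server the provider's DHCP hands out.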

23) yum errors on two of four platforms (2022-06-06)

I have four identical platforms all running CentOS-7 (last updated 4 months ago). This is what I see when I execute 'yum update'
Note that 'yum clean' fails the same way

[root@kawc3v ~]# yum clean
error: rpmdb: BDB0113 Thread/process 23380/139868776736832 failed: BDB1507 Thread died in Berkeley DB library
error: db5 error(-30973) from dbenv->failchk: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery
error: cannot open Packages index using db5 -  (-30973)
error: cannot open Packages database in /var/lib/rpm
CRITICAL:yum.main:

And here's the fix:

mv /var/lib/rpm/__db* /tmp/
rpm --rebuilddb
yum clean all       

comment: 'rpm --rebuilddb' is undocumented under 'man rpm' so look for it under 'man rpmdb' and use it sparingly

24) Run an old  32-bit program on WINE (2022-08-06)

Okay so this is a weird problem which might not affect too many others.

  • Two decades ago we were provided with two licensed copies of Comm-press-2000 (purchased from bTrade) which we must use to encrypt files before sending them to one other company.
  • One version was for Windows-NT (to be used for testing) whilst the other was for production use on Solaris-8 for SPARC
  • You would think the spec would have been updated long ago to SFTP but this never happened (and the other company wants a stupid amount of money to make any changes to the existing interface)
  • Problem: we currently maintain two Solaris-8 SPARC servers in our data center (one primary; one backup) for no other purpose than encrypting files via Comm-press-2000 before FTPing them to the other company. We want to get rid of them.
  • Solution-1 (easy):
    • Install Wine-4.0.4 from the epel repository (epel must be already enabled):
      <sr> $
      <ur> sudo yum install epel-release
      <sr> ...verbiage...
      $
      <ur> sudo yum list wine.*
      <sr> Installed Packages
      Available Packages
      wine.x86_64
      <ur> sudo yum install wine.x86_64
      <sr> ...verbiage...
      $
    • Next I tried to run one of the Comm-press-2000 programs:
      <sr> $
      <ur> wineconsole compress.exe
      <sr> wine: Bad EXE format for Z:\home\neil\wine\compress.exe.
      $

      which failed because WINE was expecting a 64-bit windows program
    • Time for a little hacking:
      <sr> $
      <ur> which wine
      <sr> /usr/bin/wine
      $
      <ur> ls -la /usr/bin/wine*
      <sr> lrwxrwxrwx. 1 root root      22 Aug  8 11:30 /usr/bin/wine -> /etc/alternatives/wine
      -rwxr-xr-x. 1 root root   11408 Apr 21  2020 /usr/bin/wine64
      -rwxr-xr-x. 1 root root 2106864 Apr 21  2020 /usr/bin/wine64-preloader
      -rwxr-xr-x. 1 root root    1949 Apr 21  2020 /usr/bin/wineboot
      -rwxr-xr-x. 1 root root    1949 Apr 21  2020 /usr/bin/winecfg
      -rwxr-xr-x. 1 root root    1949 Apr 21  2020 /usr/bin/wineconsole
      -rwxr-xr-x. 1 root root    1949 Apr 21  2020 /usr/bin/winedbg
      -rwxr-xr-x. 1 root root  190200 Apr 21  2020 /usr/bin/winedump
      -rwxr-xr-x. 1 root root    1949 Apr 21  2020 /usr/bin/winefile
      -rwxr-xr-x. 1 root root   95099 Apr 21  2020 /usr/bin/winemaker
      -rwxr-xr-x. 1 root root    1949 Apr 21  2020 /usr/bin/winemine
      -rwxr-xr-x. 1 root root    1949 Apr 21  2020 /usr/bin/winepath
      lrwxrwxrwx. 1 root root      32 Aug  8 11:30 /usr/bin/wine-preloader -> /etc/alternatives/wine-preloader
      lrwxrwxrwx. 1 root root      28 Aug  8 11:30 /usr/bin/wineserver -> /etc/alternatives/wineserver
      -rwxr-xr-x. 1 root root  490768 Apr 21  2020 /usr/bin/wineserver64
      $
      I see wine64 but where are the 32-bit programs?
    • So then I tried this:
      <sr> $
      <ur> sudo yum install wine32-release.noarch # adds a new repository
      <sr> ...verbiage...
      $
      <ur> sudo yum install wine.i686
      <sr> ...verbiage...
      $
      <ur> wineconsole compress one two
      <sr> $

      which worked.
  • Solution-2 (harder)
    • Build a 32-bit only version of Wine-4.0
    • From everything I've read so far, I only needed to build a 32-bit only version of WINE-4
    • My first attempt at doing this failed so it took some time to learn how to build a 32-bit version of hello_world.c which I have documented here:
  • see more of my WINE Notes
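
Before blaming WINE for a "Bad EXE format" message, it can help to confirm whether the Windows program really is 32-bit. This is a minimal Python sketch that reads the Machine field of the PE header (0x014c is i386 and 0x8664 is x86-64); the function name is hypothetical and only those two machine types are named.

```python
import struct

MACHINE_NAMES = {0x014C: "32-bit (i386)", 0x8664: "64-bit (x86-64)"}


def pe_machine(data):
    """Describe the target machine of a Windows PE image, given its leading bytes."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # the 4-byte little-endian value at 0x3C is the offset of the PE header
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if len(data) < pe_offset + 6 or data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("no PE signature (possibly a 16-bit DOS/NE program)")
    # the 2-byte Machine field follows the 4-byte signature
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_NAMES.get(machine, f"unknown machine type 0x{machine:04x}")
```

Typical use would be pe_machine(open("compress.exe", "rb").read(4096)); reading the first 4 KiB is an assumption that covers the usual case where the PE header sits near the start of the file.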

25) Can't restart Apache with new certificate files (2022-08-27)

  • Every August, I am tasked to renew a multi-domain SSL certificate then install it on 8 servers running in our department
  • The servers are all identically configured so I cannot explain why the update procedure failed on only one of them (CentOS) but now Apache is down and will not start
  • Here is an extract from /var/log/messages
    Aug 26 14:55:32 kawc3v systemd: Stopping The Apache HTTP Server...
    Aug 26 14:55:33 kawc3v systemd: Stopped The Apache HTTP Server.
    Aug 26 14:55:33 kawc3v systemd: Starting The Apache HTTP Server...
    Aug 26 14:55:33 kawc3v httpd: AH00526: Syntax error on line 114 of /etc/httpd/conf.d/ssl.conf:
    Aug 26 14:55:33 kawc3v httpd: SSLCertificateKeyFile: file '/etc/pki/tls/private/kawc96_20220822.key' does not exist or is empty
    Aug 26 14:55:33 kawc3v systemd: httpd.service: main process exited, code=exited, status=1/FAILURE
    Aug 26 14:55:33 kawc3v systemd: Failed to start The Apache HTTP Server.
    Aug 26 14:55:33 kawc3v systemd: Unit httpd.service entered failed state.
    Aug 26 14:55:33 kawc3v systemd: httpd.service failed.
    
  • Now I need to point out that the key file exists, has the correct protection bits, and both group and owner are set to root
  • I wasted a little more time than I should have on this one until I remembered that corporate security had installed an alternate logger which was programmed to throw away SELinux messages (which means I was not seeing any)
  • So here is the two line solution (first su to root or prefix with sudo as required)
    restorecon -F -R -v /etc/pki/tls/certs
    restorecon -F -R -v /etc/pki/tls/private 
  • What does restorecon do? It uses the SELinux database to see what the files in those directories should be set to then sets them. 
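
When SELinux messages are being discarded, the label on a file can still be inspected directly. This is a minimal Python sketch (Linux only) that reads the "security.selinux" extended attribute; the function names are hypothetical and comparing the result against a known-good file in the same directory is a suggestion, not something the original troubleshooting did.

```python
import os


def parse_selinux_context(context):
    """Split 'user:role:type:level' into a dict (the level may itself contain colons)."""
    user, role, setype, level = context.split(":", 3)
    return {"user": user, "role": role, "type": setype, "level": level}


def file_selinux_type(path):
    """Return the SELinux type of path by reading its security.selinux attribute."""
    raw = os.getxattr(path, "security.selinux")  # raises OSError if no label is present
    return parse_selinux_context(raw.decode().rstrip("\0"))["type"]
```

If file_selinux_type() returns a different type for the new key file than for last year's working key file, that difference is what restorecon repairs. The same information is shown by "ls -Z".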

26) Cannot install packages into python-3.6.8 (2022-08-29)

Facts:
  • Official TLS protocol support changed this year after the final retirement of IE11 at our company
    • supported: TLSv1.2 and TLSv1.3 (everything else is unsupported)
  • Python-3.6.8 is officially no longer supported as of December of 2021 (official list)
  • We are currently running pip-21.3.1 which is the highest version possible with python-3.6 (trying to install anything higher fails with the python version check)
  • The Python package library has been moved from https://pypi.python.org/ to https://pypi.org/ (not sure when this happened)
    • the first site now redirects to the second but they have different certificates
    • pip-21.3.1 uses the first site as its index (all newer versions of pip use the second site)
    • if you are sitting behind a proxy server like me, the proxy server complains when one site redirects to another with a different certificate. Also, it appears that pip-21.3.1 is trying to use TLSv1.1 to communicate with my employer's proxy server
Here are two scripts for two different python implementations on the same platform:
  • Some systems may need to run these scripts as root or sudo (especially if python was installed via yum or dnf)
  • Notice that I specified "http:" to communicate with my proxy server (no 's' means no 'ssl')
#!/usr/bin/bash
# title  : pip36_install_p1.sh
# author : Neil Rieck
# created: 2022-09-08
# usage  : pip36_install_p1.sh <package>
python3.6 -m pip install "${1}" \
 --index-url https://pypi.org/simple \
 --trusted-host pypi.org \
 --trusted-host files.pythonhosted.org \
 --proxy http://proxy.your-domain.com:8083

#!/usr/bin/bash
# title  : pip39_install_p1.sh
# author : Neil Rieck
# created: 2022-09-08
# usage  : pip39_install_p1.sh <package>
python3.9 -m pip install "${1}" \
 --trusted-host pypi.org \
 --trusted-host files.pythonhosted.org \
 --proxy http://proxy.your-domain.com:8083
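
The two wrapper scripts differ only in the interpreter and the explicit index, so the same idea can be expressed as one Python helper that builds the argument list. This is a minimal sketch; the function name is hypothetical and the proxy URL in the usage note is the same placeholder used in the scripts.

```python
def pip_install_command(python, package, proxy, index_url=None):
    """Build the argv list for a proxied 'pip install' without running it."""
    cmd = [python, "-m", "pip", "install", package]
    if index_url:
        # only the older pip needs to be told about the new index
        cmd += ["--index-url", index_url]
    for host in ("pypi.org", "files.pythonhosted.org"):
        cmd += ["--trusted-host", host]
    cmd += ["--proxy", proxy]
    return cmd
```

Pass the result to subprocess.run(cmd, check=True) to execute it, for example pip_install_command("python3.6", "paramiko", "http://proxy.your-domain.com:8083", index_url="https://pypi.org/simple"). Building a list instead of a shell string avoids quoting problems with unusual package specifiers.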
 

27) Need a second version of python (2022-09-10)

Okay so everyone knows that system tools on CentOS-7 (like yum and firewall-cmd) require Python-2.7 so I suppose this section should say "third" rather than "second" because our system is already hosting 2.7 and 3.6.8

Today I need to write a Python SFTP program based upon pysftp which is based upon paramiko but I cannot install paramiko on python-3.6.8 (I am seeing deprecated library alerts indicating that 3.6.8 is too old).

27a) Installing python-3.8.13

 CentOS-7.9 doesn't offer a better python via rpm so I tried an easy install of python-3.8.13 as seen here

1) sudo yum install centos-release-scl
2) sudo yum list rh-python3\*
3) sudo yum install rh-python38
4) scl enable rh-python38 bash
5) python3 --version
Python 3.8.13

Notes:

  1. reference: https://wiki.centos.org/AdditionalResources/Repositories/SCL
  2. most yum installs place python under /usr/bin
  3. this yum install placed python38 under /opt
    • which is why you need to execute step-4 every time any process wants to use this version of python-3.8.13
    • I never tried executing the interpreter directly from /opt (as you might do from within a shebang) but it could work
  4. I could not install paramiko in this version of python so I used yum to remove it

27b) Building python-3.9.13 from source code

This works -AND- I can now install paramiko

========================================================================================
title  : python3.x_build_on_centos7.txt
author : Neil Rieck
created: 2022-08-31
edit   : 2022-09-08 
links  :
1) https://www.python.org/downloads/source
2) https://docs.python.org/3/using/unix.html      (<<< how to build)
3) https://docs.python.org/3/using/configure.html (<<< how to build)
notes  :
1) this is an experimental build of python-3.9.13 on CentOS-7
2) Hopefully pip for python-3.9.13 works better with our corporate proxy server
3) python-3.9.13 allows me to install paramiko while python-3.6.8 does not 
========================================================================================
file linkage BEFORE the install:

[neil@kawc4n ~]$ ls -lad /usr/bin/python*
lrwxrwxrwx. 1 root root     7 Jun  5 08:56 /usr/bin/python -> python2
lrwxrwxrwx. 1 root root     9 Jun  5 08:56 /usr/bin/python2 -> python2.7
-rwxr-xr-x. 1 root root  7144 Nov 16  2020 /usr/bin/python2.7
-rwxr-xr-x. 1 root root  1835 Nov 16  2020 /usr/bin/python2.7-config
lrwxrwxrwx. 1 root root    16 Jun  5 08:56 /usr/bin/python2-config -> python2.7-config
lrwxrwxrwx. 1 root root     9 Jun  7 07:13 /usr/bin/python3 -> python3.6
-rwxr-xr-x. 2 root root 11328 Nov 16  2020 /usr/bin/python3.6
-rwxr-xr-x. 2 root root 11328 Nov 16  2020 /usr/bin/python3.6m
lrwxrwxrwx. 1 root root    14 Jun  5 08:56 /usr/bin/python-config -> python2-config
[neil@kawc4n ~]$

file linkage AFTER the install:

[neil@kawc4n ~]$ ls -lad /usr/bin/python*
lrwxrwxrwx. 1 root root     7 Jun  5 08:56 /usr/bin/python -> python2
lrwxrwxrwx. 1 root root     9 Jun  5 08:56 /usr/bin/python2 -> python2.7
-rwxr-xr-x. 1 root root  7144 Nov 16  2020 /usr/bin/python2.7
-rwxr-xr-x. 1 root root  1835 Nov 16  2020 /usr/bin/python2.7-config
lrwxrwxrwx. 1 root root    16 Jun  5 08:56 /usr/bin/python2-config -> python2.7-config
lrwxrwxrwx. 1 root root     9 Jun  7 07:13 /usr/bin/python3 -> python3.6
-rwxr-xr-x. 2 root root 11328 Nov 16  2020 /usr/bin/python3.6
-rwxr-xr-x. 2 root root 11328 Nov 16  2020 /usr/bin/python3.6m
-rwxr-xr-x. 1 root root 16328 Sep  7 16:09 /usr/bin/python3.9
-rwxr-xr-x. 1 root root  3073 Sep  7 16:12 /usr/bin/python3.9-config
lrwxrwxrwx. 1 root root    14 Jun  5 08:56 /usr/bin/python-config -> python2-config
[neil@kawc4n ~]$ 
========================================================================================
steps:

1 ) sudo yum install epel-release
2 ) sudo yum update
3 ) sudo yum -y groupinstall "Development Tools"
4 ) sudo yum -y install openssl-devel bzip2-devel libffi-devel xz-devel
5 ) gcc --version
6a) # get source file via pc <-----------------------------------------------+- pick one
    visit: https://www.python.org/downloads/                                 |
    then download file Python-3.9.13.tgz to your pc                          |
    then ftp it to the server                                                |
6b) # get source file via wget <---------------------------------------------+
    https_proxy=https://proxy.your-domain.com:8083 \
    wget https://www.python.org/ftp/python/3.9.13/Python-3.9.13.tgz
7 ) tar xvf Python-3.9.13.tgz
8 ) cd Python-3.9*/
9a) # recipe-a (generates a large binary) <----------------------------------+- pick one
    # caveat: the default destination is: '/usr/local/bin/python3.9'         |
    #         but the next line will put it in: '/usr/bin/python3.9'         |
    ./configure --prefix=/usr --enable-optimizations | tee nsr_39_step1.txt  |
9b) # recipe-b (generates a small binary) <----------------------------------+
    # caveat: the default destination is: '/usr/local/bin/python3.9'
    #         but the next line will put it in: '/usr/bin/python3.9'
    sudo ./configure --prefix=/usr --enable-optimizations --enable-shared \
      | tee nsr_39_step1.txt
    # sudo ldconfig (do this step after the make step)
10) # caveat: altinstall allows multiple versions of python to coexist
    #         install does not allow multiple versions of python to coexist
    #         so type carefully
    sudo make altinstall | tee nsr_39_step2.txt
11) # optional (required if you enabled shared libraries)
    sudo ldconfig
=========================================================================
20) # test executable
    python3.9
    exit()
21) # display our packages
    python3.9 -m pip list --trusted-host pypi.org
    Package    Version
    ---------- -------
    pip        22.0.4
    setuptools 58.1.0
22) # upgrade pip
    sudo python3.9 -m pip install --upgrade pip \
      --trusted-host pypi.org \
      --trusted-host files.pythonhosted.org \
      --proxy http://proxy.your-domain.com:8083
23) # display our packages (again)
    python3.9 -m pip list --trusted-host pypi.org
    Package    Version
    ---------- -------
    pip        22.2.2 <<<
    setuptools 58.1.0
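
A source build of Python quietly skips optional modules when their headers are missing, so it may be worth checking the result before relying on pip. The sketch below is an illustration only and not part of the original steps: the module-to-package mapping is my reading of the -devel packages in step 4, and the names OPTIONAL_MODULES and missing_modules are hypothetical. Save it to a file and run it with python3.9.

```python
import importlib
import sys

# Standard-library modules that need the -devel packages from step 4.
# The pairing below is an assumption based on that step.
OPTIONAL_MODULES = {
    "ssl": "openssl-devel",
    "bz2": "bzip2-devel",
    "ctypes": "libffi-devel",
    "lzma": "xz-devel",
}


def missing_modules(modules=OPTIONAL_MODULES):
    """Return {module: package} for every module that fails to import."""
    missing = {}
    for name, package in modules.items():
        try:
            importlib.import_module(name)
        except ImportError:
            missing[name] = package
    return missing


if __name__ == "__main__":
    print("interpreter:", sys.version.split()[0])
    gaps = missing_modules()
    for name, package in gaps.items():
        print(f"missing {name}: was {package} installed before ./configure?")
    if not gaps:
        print("all optional modules imported")
```

A missing ssl module is the one to watch for, because pip cannot reach pypi.org over https without it.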

28) DNF tweak for Rocky Linux (2022-11-21)

For CentOS-7 systems sitting behind a corporate proxy server, you need to add one line to file /etc/yum.conf
proxy=http://proxy.your-domain.com:8083
For Rocky-8 systems sitting behind a corporate proxy server, you need to add two lines to file /etc/dnf/dnf.conf
proxy=http://proxy.your-domain.com:8083
sslverify=false 

caveats:

  • the company-facing side of our proxy server only accepts connections over http, never https
  • our proxy server requires that we connect on port 8083. Your port will probably be different.
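
If you maintain several machines, the same edit can be scripted. What follows is a minimal sketch under stated assumptions: the function name ensure_settings is hypothetical, the settings are expected to live in the [main] section, and the script only prints the result instead of writing to /etc (that step needs root and is left to you).

```python
def ensure_settings(conf_text, settings, section="[main]"):
    """Return conf_text with every key=value from settings present in section."""
    lines = conf_text.splitlines()
    if section not in (line.strip() for line in lines):
        lines.append(section)
    start = next(i for i, line in enumerate(lines) if line.strip() == section)
    # the section ends at the next "[...]" header or at the end of the file
    end = next(
        (i for i in range(start + 1, len(lines)) if lines[i].strip().startswith("[")),
        len(lines),
    )
    remaining = dict(settings)
    for i in range(start + 1, end):
        key = lines[i].split("=", 1)[0].strip()
        if key in remaining:
            # overwrite an existing setting in place
            lines[i] = f"{key}={remaining.pop(key)}"
    # append the settings that were not already present
    lines[end:end] = [f"{key}={value}" for key, value in remaining.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "[main]\ngpgcheck=1\n"  # stand-in for the text of /etc/dnf/dnf.conf
    wanted = {"proxy": "http://proxy.your-domain.com:8083", "sslverify": "false"}
    print(ensure_settings(sample, wanted), end="")
```

Running the function twice on the same text gives the same result, so it is safe to repeat. For CentOS-7 pass only the proxy key and the text of /etc/yum.conf.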

Neil Rieck
Waterloo, Ontario, Canada.