Linux Notes: Real-world problems

  1. The information presented here is intended for educational use by qualified computer technologists.
  2. The information presented here is provided free of charge, as-is, with no warranty of any kind.
Edit: 2019-02-04

Real-world Linux Problems

Note: all these problems are associated with CentOS-7.2 or higher

1) We cannot install or update software via YUM on one of our CentOS-7 platforms

We have two Linux platforms: one for development and a second for production (comment: two platforms may not be enough when using Linux; see the IBM-Managed warning after this section). The recommended approach is to first install (or update) software on the development box. If testing over the next few days (to a few weeks) proves that everything is working properly, then we repeat the procedure on the production box. This also keeps both platforms more-or-less in sync.

I wanted to install the tree utility so I logged onto the development box where I entered this command:

	sudo yum install tree

... which worked properly.

Then I repeated this command on the production platform, where it failed with numerous errors associated with the file /usr/libexec/urlgrabber-ext-down, which is a Python script. Worse, "firewall-cmd" and most yum commands (including "yum check-update") would no longer run either. Investigating further, I noticed that someone had installed python3 and then changed the symbolic link so that the python command pulled up python3 rather than python2 (most Linux utilities in 2018 require Python 2.7).

There are only two ways out of this problem (remember that this is an active business system).

  1. modify the first line (the shebang or sharp-bang line) of broken system scripts from this "#! /usr/bin/python" to this "#! /usr/bin/python2"
  2. modify the symbolic link for "python" so that it points at the symbolic link "python2", which in turn probably points to an executable like "python2.7". This will restore the system to its previous functionality but could break something if newer customer scripts required python3 to be the default. So before modifying the symbolic link, modify the shebang line of those customer scripts from this:
    "#! /usr/bin/python" to this "#! /usr/bin/python3"

Tip: if this is an emergency, just make minimal changes. For now, modify only the scripts for whatever is broken (e.g. yum or firewall-cmd). But you will eventually need to put everything back to a pristine state. If your stuff needs python3 then you need to rely upon the shebang line. I have no idea why the Linux maintainers didn't do this for their scripts requiring Python2.
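The shebang edit described above is easy to get wrong by hand, so here is a minimal sketch of it in Python. The function name pin_shebang and its default argument are my own choices for illustration, not part of any system tool. It only rewrites a first line that names the generic /usr/bin/python; a script already pinned to python2 or python3 is returned unchanged.

```python
import re

# Matches a first line such as "#! /usr/bin/python" or "#!/usr/bin/python -tt",
# but not one that already names python2 or python3.
_GENERIC_SHEBANG = re.compile(r"^#!\s*/usr/bin/python(?=\s|$)")


def pin_shebang(text, interpreter="/usr/bin/python2"):
    """Return the script text with a generic python shebang pinned to `interpreter`.

    `interpreter` is an assumption: use /usr/bin/python2 for broken system
    scripts, or /usr/bin/python3 for customer scripts that need Python 3.
    """
    first, newline, rest = text.partition("\n")
    if not _GENERIC_SHEBANG.match(first):
        return text  # already pinned, or not a generic python script
    # Replace only the interpreter path; options such as "-tt" are preserved.
    pinned = _GENERIC_SHEBANG.sub(lambda m: "#! " + interpreter, first, count=1)
    return pinned + newline + rest
```

To apply it, read the script, pass the text through pin_shebang, and write the result back. Keep a copy of the original file first, because a later package update may replace the script anyway.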

#
# 1) this is an example of a good working setup with two versions of python installed
# 2) notice (on the first result line) that python is pointing to python2
#    and python2 is pointing to python2.7
# 3) if yum and firewall-cmd require python2 then you would have thought the
#    developers would have modified the shebang lines of those utilities
#
$ cd /usr/bin
$ ls pytho* -la
lrwxrwxrwx. 1 root root     7 Jan 12 15:25 python -> python2
lrwxrwxrwx. 1 root root     9 Dec 20  2016 python2 -> python2.7
-rwxr-xr-x. 1 root root  7136 Nov  5  2016 python2.7
lrwxrwxrwx. 1 root root     9 Apr 12  2017 python3 -> python3.4
-rwxr-xr-x. 2 root root 11312 Jan 17  2017 python3.4
lrwxrwxrwx. 1 root root    17 Apr 12  2017 python3.4-config -> python3.4m-config
-rwxr-xr-x. 2 root root 11312 Jan 17  2017 python3.4m
-rwxr-xr-x. 1 root root   173 Jan 17  2017 python3.4m-config
-rwxr-xr-x. 1 root root  3366 Jan 17  2017 python3.4m-x86_64-config
lrwxrwxrwx. 1 root root    16 Apr 12  2017 python3-config -> python3.4-config
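
The same check can be scripted. This is a small sketch (the function name link_chain is mine) that follows a symbolic link one hop at a time, so you can see every link between "python" and the real executable rather than only the final target.

```python
import os


def link_chain(path, limit=16):
    """Follow symbolic links one hop at a time and return every path visited."""
    chain = [path]
    # `limit` guards against a loop of links pointing at each other.
    while os.path.islink(path) and len(chain) <= limit:
        target = os.readlink(path)
        # A relative target is interpreted relative to the link's own directory.
        path = os.path.normpath(os.path.join(os.path.dirname(path), target))
        chain.append(path)
    return chain
```

On a box set up like the listing above, link_chain("/usr/bin/python") should return /usr/bin/python, /usr/bin/python2 and /usr/bin/python2.7 in that order. If python3 shows up anywhere in the chain, you have found the cause of the broken yum.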

2) Never use a graphical console to update Linux software

I have experienced several instances where updating software through the graphical interface failed for some reason and then broke the graphical interface (or the whole system). It should not surprise anyone that updating gnome-session, or any of its dependencies, might disturb the very session that is running yum or rpm.

So if you are on the system console (which is almost always a VGA monitor) and want to move to a non-graphical session, try one of the following keystrokes:

key press      description                                       notes
CTRL ALT F2    switch to terminal 2 (/dev/tty2)                  text only
CTRL ALT F3    switch to terminal 3 (/dev/tty3)                  text only
CTRL ALT F4    switch to terminal 4 (/dev/tty4)                  text only
CTRL ALT F5    switch to terminal 5 (/dev/tty5)                  text only
CTRL ALT F6    switch to terminal 6 (/dev/tty6)                  text only
CTRL ALT F1    switch to terminal 1 (the graphical interface)    only graphical when runlevel > 3

The only other way to safely disable graphics is to lower the runlevel of your system to 3 (multi-user.target under systemd), but only do this if you are certain that you won't kill some process currently needed by your customers. Alternatively, use ssh to log into your system via the network, then execute yum in that session.
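
If updates are run from a wrapper script, a guard like the following sketch can refuse to continue from a graphical session. The function name looks_graphical is hypothetical. It relies on the XDG_SESSION_TYPE, DISPLAY and WAYLAND_DISPLAY environment variables, and it errs on the side of caution: an ssh session started with X forwarding also sets DISPLAY and will be reported as graphical.

```python
import os


def looks_graphical(env=None):
    """Best-effort guess at whether this shell lives inside a graphical session."""
    env = os.environ if env is None else env
    # systemd-logind normally sets this to "x11" or "wayland" for desktop
    # logins and to "tty" for text consoles and ssh logins.
    if env.get("XDG_SESSION_TYPE") in ("x11", "wayland"):
        return True
    # Fall back to the display variables set by X11 and Wayland sessions.
    return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))
```

A wrapper would call looks_graphical() first and exit with a message telling the operator to switch to a text terminal or an ssh session before running yum.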

3) Using Windows to access a Linux remote

The self-help blogs really fall down on this one because the only secure way to do this is to tunnel X sessions over SSH. But whenever anyone on a self-help blog asks how to do this using only SSH, some idiot will chime in with a procedure on how to do it using VNC, RealVNC, TigerVNC or Vino, which are all insecure.

To make matters worse, setting up a remote graphical session is almost impossible (at least under certain circumstances like Windows -> RedHat/CentOS) because GNOME3 contains 3-d extensions not found in most Windows clients. The best way out of this is to set up RedHat/CentOS on a machine at the client end, then use it to connect to the desired Linux platform.
 
comment: some conspiracy-minded people think this change was deliberately done to stop support professionals from using Windows as their default platform to support all others.

Xming

CygWin and CygWin/X

4) Recovering a failed YUM update (2018-01-xx)

5) a recent (2018-01-xx) YUM update broke our development box (ouch!)

Caveat: the procedure just given will only fix the OpenSSL CLI. Note that msodbcsql will still be broken because that software calls routines in the shared libraries. To fix msodbcsql you (supposedly) need to do one of the following:
  1. fully install an older version of OpenSSL (libraries and all) in a secondary location then ensure all scripts invoking sqlcmd look there
    • I built an older version of OpenSSL from source code then installed it in /opt/oldopenssl
    • all scripts starting sqlcmd first define LD_LIBRARY_PATH to point to /opt/oldopenssl/lib
    • although strace proves that msodbcsql is looking in the secondary location first, msodbcsql still does not work
  2. completely replace the new version of OpenSSL (libraries and all) with an older version
    • playing with yum downgrade openssl* has not yet worked but I think I may be close
  3. reinstall the previous OS (CentOS-7.2 in this case)
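
For option 1, this is a sketch of how a Python wrapper might set LD_LIBRARY_PATH before starting sqlcmd. The path /opt/oldopenssl/lib comes from the notes above; the function name env_with_old_openssl is mine. Remember that on our system this approach still did not make msodbcsql work, so treat it only as an illustration of how the variable is set.

```python
import os

OLD_OPENSSL_LIB = "/opt/oldopenssl/lib"  # secondary install location from the notes


def env_with_old_openssl(base=None, libdir=OLD_OPENSSL_LIB):
    """Return a copy of the environment with `libdir` first in LD_LIBRARY_PATH."""
    env = dict(os.environ if base is None else base)
    existing = env.get("LD_LIBRARY_PATH", "")
    # Avoid a trailing ":" because an empty entry means the current directory.
    env["LD_LIBRARY_PATH"] = libdir + (":" + existing if existing else "")
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run when launching sqlcmd, which leaves the environment of the calling process untouched.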

6) a recent (2018-06-xx) update broke our production box (a major pain!)

One of our developers was experiencing problems developing a new LDAP-based application, so he invoked YUM to update LDAP on our production box. The big problem here is that the update was done in a careless way (without reading all the release notes): the LDAP update also updated OpenSSL for the whole system, so now we can no longer connect to that older Microsoft platform in Montreal (see the previous note, item 5).

It now appears that we will need to install a third CentOS platform whose only purpose would be to reach through to the older Microsoft platform. This platform would need to be modified so that it could never be updated.

7) Something is overwriting file "/etc/resolv.conf"

This problem is so weird that I'll stick to bullet points

8) Our console device is totally dead

We are running CentOS-7.2 on two HP-DL385-g6 servers (one PROD, one DVLP) and both have been running for 24 months without a reboot. These are older hardware platforms, so I have been preparing to cut the whole thing over to two new servers (HP-DL385p-gen8) next month. I just noticed I can't access one of the older servers via the console.

command        result
CTRL ALT F1    screen turns solid blue
CTRL ALT F2    screen turns solid green
CTRL ALT F3    screen turns solid green

I've tried everything (short of rebooting) including a restart of various services (eg. "systemctl restart gdm.service") but it seems that the graphics card itself is locked up somehow.

I need to point out that we can do anything else we want via a remote ssh terminal session over the network. In fact, the customers are unaware of the fact that anything is wrong. So let me advise everyone to ensure that at least one network port is configured -AND- that the firewall has been configured to allow ssh connections, so you will be able to manage your platform if your VGA console is fubared.
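
A periodic check from another machine can confirm that the ssh path really is open before you need it. The sketch below (function names are mine) connects to the ssh port and reads the identification line that an SSH server sends on connect. It assumes that line arrives first and in one read, which is the usual case with OpenSSH, and it does not attempt to log in.

```python
import socket


def parse_banner(data):
    """Return the SSH identification line from the first bytes received, or None."""
    line = data.split(b"\n", 1)[0].strip()
    return line.decode("ascii", "replace") if line.startswith(b"SSH-") else None


def ssh_banner(host, port=22, timeout=5.0):
    """Connect to host:port and return the SSH banner, or None if unreachable."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            data = conn.recv(255)
    except OSError:
        # Refused, filtered by the firewall, timed out, or name not resolved.
        return None
    return parse_banner(data)
```

A result of None means either that nothing answered on that port or that whatever answered was not an SSH server; both deserve a look while the console still works.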


Neil Rieck
Kitchener - Waterloo - Cambridge, Ontario, Canada.