OpenVMS Notes: "Alpha to Itanium" Porting Diaries

edit: 2024-03-25
Executive Summary
  • Platforms like PDP, VAX, and Alpha employ a firmware-based BIOS where you boot the system with console commands like "b dua0" (boot from disk dua0)
  • All modern computers, including Itanium, now employ a firmware-software hybrid technology, known as UEFI, which is executed by an internal management processor (some sort of x86 thingy that is always running as long as power is available, even if the server is powered off)
  • You can do a lot of stuff from a locally connected VGA terminal and keyboard but this does not include installing OpenVMS on a disk for the first time. That activity must be done over the serial port, or iLO, which acts as terminal device OPA0:
    serial: this activity requires a serial null-modem cable connected between your Itanium and your laptop (or a VT100-compatible terminal if you can find one). Connecting a serial cable to your laptop may require the purchase of a third-party dongle (USB-to-DB9). See my notes below.
    CAVEAT: You do not want to be doing this for the first time on a weekend, or national holiday, or during an emergency. Learn how to do it now, then make many notes for future reference.
    iLO: no special hardware is required to connect to an iLO port, but you need to set the iLO address, connect the port to your network, then reach it from another computer on that network.

The Name Game

A very brief history of Itanium

Project-01 : HP rx2800-i2

Introduction (my first Itanium)

Day 1 (well, just a few hours one afternoon)

BIOS - EFI - UEFI

So what's in your Itanium?

rx2660
Itanium rx2650
(drives are vertical; beer is optional)

UEFI over an iLO DB-9 console port

Wasted Time (caused by me)

Initial Setup

legend for my examples:
	<sr>	system response (what the system displays)
	<ur>	user response (what you should type or do)
	(text)	some action is described
	<enter>	hit the "enter" key
----------------------------------------------------------
<ur>	(connect a serial line to the iLO port then hit <enter>)
<sr>	MP login:
<ur>	Administrator
<sr>	MP password:
<ur>	******** (found on a pullout tag on the front of your chassis)
<sr>	Hewlett-Packard Integrated Lights-Out 3 for Integrity
	(C) Copyright 1999-2013 Hewlett-Packard Development Company, L.P.
		MP Host Name: kawc0x
		iLO MP Firmware Revision 01.55.02
	MP MAIN MENU:
		 CO: Console
		VFP: Virtual Front Panel
		 CM: Command Menu
		 CL: Console Log
		 SL: Show Event Logs
		 HE: Main Help Menu
		 X: Exit Connection
	[kawc0x]</> hpiLO->
<ur>	co
<sr>	(iLo system connects to system Itanium console)

	[Use Ctrl-B or ESC-( to return to MP main menu.]

	- - - - - - - - - - Prior Console Output - - - - - - - - - -
	1,0,0,0 5400006301E10000 0000000000000000 EVN_BOOT_START
	***********************************************************
	* ROM Version : 01.95
	* ROM Date    : Fri Feb 01 03:54:28 PST 2013
	***********************************************************
	1,0,0,0 3400083701E10000 000000000002000C EVN_BOOT_CELL_JOINED_PD
	1,0,0,0 340000B101E10000 0000003C0205000C EVN_MEM_DISCOVERY
	1,0,0,0 1400002601E10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
	1,0,1,0 1400002605E10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
	1,0,2,0 1400002609E10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
	1,0,3,0 140000260DE10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
	1,0,1,1 1400002607E10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
	1,0,2,1 140000260BE10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
	1,0,3,1 140000260FE10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
	1,0,0,1 1400002603E10000 000000000006000C EVN_BOOT_CPU_LATE_TEST_START
	1,0,0,0 5400020701E10000 000000000011000C EVN_EFI_START

	Press Ctrl-C now to bypass loading option ROM UEFI drivers.

	1,0,0,0 3400008101E10000 000000000007000C EVN_IO_DISCOVERY_START
	1,0,0,0 5400020B01E10000 0000000000000006 EVN_EFI_LAUNCH_BOOT_MANAGER

	(C) Copyright 1996-2010 Hewlett-Packard Development Company, L.P.

	Note, menu interfaces might only display on the primary console device.
	The current primary console device is:
	Serial PcieRoot(0x30304352)/Pci(0x1C,0x5)/Pci(0x0,0x5)
	The primary console can be changed via the 'conconfig' UEFI shell command.

	Press:  ENTER  -  Start boot entry execution
		B / b  -  Launch Boot Manager (menu interface)
		D / d  -  Launch Device Manager (menu interface)
		M / m  -  Launch Boot Maintenance Manager (menu interface)
		S / s  -  Launch UEFI Shell (command line interface)
		I / i  -  Launch iLO Setup Tool (command line interface)

	*** User input can now be provided ***

	Automatic boot entry execution will start in 7 second(s).
<ur>	s
<sr>	Searching for devices.
	HP Smart Array P410i Controller     (version 5.78)  1 Logical Drive
	Currently the controller is in RAID mode
	Launching UEFI Shell.
	UEFI Shell version 2.10 [2.0]
	Current running mode 1.1.2
	Device mapping table
	  fs0    :Removable HardDisk - Alias hd9a0b blk0
		  PcieRoot(0x30304352)/Pci(0x1,0x0)/Pci(0x0,0x0)/Scsi(0x0,0x0)/HD(...)
	  fs1    :Removable HardDisk - Alias hd9a0d blk1
		  PcieRoot(0x30304352)/Pci(0x1,0x0)/Pci(0x0,0x0)/Scsi(0x0,0x0)/HD(...)
	  blk0   :Removable HardDisk - Alias hd9a0b fs0
		  PcieRoot(0x30304352)/Pci(0x1,0x0)/Pci(0x0,0x0)/Scsi(0x0,0x0)/HD(...)
	  blk1   :Removable HardDisk - Alias hd9a0d fs1
		  PcieRoot(0x30304352)/Pci(0x1,0x0)/Pci(0x0,0x0)/Scsi(0x0,0x0)/HD(...)
	  blk2   :BlockDevice - Alias (null)
		  PcieRoot(0x30304352)/Pci(0x1F,0x2)/Sata(0x0,0x0,0x0)
	  blk3   :Removable HardDisk - Alias (null)
		  PcieRoot(0x30304352)/Pci(0x1,0x0)/Pci(0x0,0x0)/Scsi(0x0,0x0)/HD(...)
	  blk4   :Removable BlockDevice - Alias (null)
		  PcieRoot(0x30304352)/Pci(0x1,0x0)/Pci(0x0,0x0)/Scsi(0x0,0x0)
	Press ESC in 1 seconds to skip startup.nsh, any other key to continue.
	Shell>
--------------------------------------------------------------------------------------
Caveat: this listing actually shows two initialized volumes (file systems are prefixed
with "fs") and five block structured devices (prefixed with "blk"). On a brand new
system (with no optical media present in the DVD drive) you will likely see one or
more "blk" entries and zero "fs" entries. Notice how fs0 lists blk0 as an alias and
vice versa. Also notice the other alias names (hd9a0b and hd9a0d).
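
To make the fs-to-blk relationship concrete, here is a minimal Python sketch that reads the entry lines of a "map" listing and pairs each file system with the block device it is an alias of. It assumes the line layout shown in my listing above (device paths are left out); the names parse_map and volumes are my own inventions and other firmware revisions may format the table differently.

```python
import re

# Entry lines copied from the "map" listing above (device paths omitted).
MAP_LISTING = """\
fs0    :Removable HardDisk - Alias hd9a0b blk0
fs1    :Removable HardDisk - Alias hd9a0d blk1
blk0   :Removable HardDisk - Alias hd9a0b fs0
blk1   :Removable HardDisk - Alias hd9a0d fs1
blk2   :BlockDevice - Alias (null)
blk3   :Removable HardDisk - Alias (null)
blk4   :Removable BlockDevice - Alias (null)
"""

# name, device kind, alias list
ENTRY = re.compile(r"^\s*(fs\d+|blk\d+)\s*:(.*?) - Alias (.*)$")


def parse_map(text):
    """Return {device: [aliases]} for each fs/blk entry in a map listing."""
    devices = {}
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            name, _kind, aliases = match.groups()
            devices[name] = [] if aliases.strip() == "(null)" else aliases.split()
    return devices


def volumes(devices):
    """Pair each file system (fsN) with the block device(s) it aliases."""
    return {
        name: [alias for alias in aliases if alias.startswith("blk")]
        for name, aliases in devices.items()
        if name.startswith("fs")
    }


if __name__ == "__main__":
    print(volumes(parse_map(MAP_LISTING)))
```

An empty result from volumes() corresponds to the brand new system case described above: block devices are present but nothing has been initialized yet.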

Booting the OpenVMS ISO DVD from UEFI (without the boot manager)

<sr>	Shell>
<ur>	map
<sr>	(system will display known devices; init'd media will appear with fs# entries)
<ur>	(insert VMS media into the DVD drive)
	map -r
<sr>	("-r" forces the map command to rescan all the hardware) 
	(the system will display known devices; "fs0:" now refers to the DVD)
<ur>	fs0:
<sr>	fs0:\>
<ur>	ls
<sr>	(should get a directory listing)
<ur>	cd efi\boot
<sr>	fs0:\efi\boot>
<ur>	bootia64
<sr>	(OpenVMS boots then displays the familiar 8-line installation menu)
<ur>	(use the menu to install OpenVMS on the desired drive)
======================================================
Once OpenVMS is installed, eject the DVD then reboot
The system will reboot to UEFI but not OpenVMS (huh?)

Booting OpenVMS from the disk you just built (with the boot manager)

<sr>	Shell>
<ur>	map
<sr>	(displays known devices; in my case fs0: now refers to the OpenVMS disk)
<ur>	fs0:
<sr>	fs0:\>
<ur>	bcfg boot add 1 \EFI\VMS\VMS_LOADER.EFI "OpenVMS 8.4"
<sr>	fs0:\>
<ur>	exit
<sr>	system will drop into the UEFI menu
<ur>	B
<sr>	system will display boot menu
<ur>	choose the first entry (if you entered "add 1" above)
<sr>	system will boot OpenVMS
--------------------------------------------------------------------------------------
Question: So what just happened? Answer: The boot configuration command (BCFG) looked
at my currently selected device (fs0) then used associated drive information
(1459B1241-18EB-11E5-9F2B-AA000400FEFF in my case) to make an entry into position #1
of the boot configuration table. Previous entries were pushed down by one position.
Coles Notes Summary: we now boot by UUID rather than drive letter
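
The "pushed down by one position" behaviour can be pictured with a tiny Python model. This is only a sketch of the bookkeeping, assuming 1-based positions as I described them above; the existing entries in the example are hypothetical, and the real boot table also records the device path and GUID of the volume, which this model ignores.

```python
def bcfg_add(boot_table, position, loader, description):
    """Insert an entry at a 1-based position; later entries move down by one."""
    entry = {"loader": loader, "description": description}
    updated = list(boot_table)          # leave the caller's table untouched
    updated.insert(position - 1, entry)
    return updated


if __name__ == "__main__":
    # Hypothetical existing entries, for illustration only.
    table = [
        {"loader": "(internal)", "description": "UEFI Shell"},
        {"loader": "(internal)", "description": "Core LAN"},
    ]
    table = bcfg_add(table, 1, r"\EFI\VMS\VMS_LOADER.EFI", "OpenVMS 8.4")
    for number, entry in enumerate(table, start=1):
        print(number, entry["description"])
```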

Day 2 (Installing OpenVMS)

Day 3

Day 4

Day 5 (playing with RAID)

Day 6 (adding two more drives)

Day 7 (Simplistic Benchmarks)

Day 8

Day 9 (Began porting software)

  1. Today I created directories on the Itanium to replicate our development environment on the Alpha (we do not have any commercial source code versioning tools or source code repositories)
    [dvlp]		- files like:	yada.bas	(VMS-BASIC sources)
    				yada.com	(DCL build scripts)
    				yada.c		(DEC-C sources)
    [dvlp.cxx]	- files like:	yada.cxx	(DEC-C++ sources)
    [dvlp.fil]	- files like:	yada.rec	(record maps)
    				yada.opn	(file opens)
    [dvlp.fdl]	- files like:	yada.fdl	(RMS fdl files used in ISAM tuning)
    [dvlp.flb]	- files like:	yada.flb	(FMS form libraries)
    				yada.fms	(FMS forms)
    [dvlp.fun]	- files like:	yada.fun	(VMS-BASIC standalone functions)
    				yada.sub	(VMS-BASIC standalone sub programs)
    [dvlp.inc]	- files like:	yada.inc	(VMS-BASIC standalone includes)
    [dvlp.mar]	- files like:	yada.mar	(macro assembler stuff)
    [dvlp.sql]	- files like:	yada.sql	(SQL files to perform MariaDB maintenance)
    				yada.com	(DCL scripts to perform MariaDB maintenance)
    [src]		- production copies of everything found in/under [dvlp]
    [data]		- files like:	yada.dat	(RMS/ISAM production data files)
  2. Next, I copied over all our source code from the Alpha then prepared to run the build scripts
     
  3. Some very old (not yet modernized) BASIC programs reference VMS run-time libraries (STARLET.OLB) but the sources do not pick up declarations from BASIC$STARLET.TLB. These programs pick up their external symbol declarations via an include, [.inc]vms_external.inc, but expect to have symbols resolved at link time where we associate them with "[.mar]symbols.obj" which is created from DCL by invoking the macro assembler like so:
     
            $mac [.mar]symbols.mar

    Even though we are now on a different architecture CPU, this command worked without modification.
    comment: I have got to find the time to get rid of this old crud
     
  4. Many of our smaller programs only contain a few external functions found in [.fun] which are compiled every time we rebuild. Some larger programs (which are continually being rebuilt) expect to find the object files associated with precompiled functions inside our custom library ICSIS_FUNCTIONS.OLB (anyone modifying an external function is responsible for updating the library). We have been using a home-brewed DCL script to create/maintain this OLB for the past 15 years, starting on VAX then moving to Alpha. On Itanium this script only works once (library creation) but fails when replacing an individual module on a subsequent pass (you will get a weird message about your OLB not being an ELF library).
    •  the fix: modify the library maintenance script replacing "libr/cre ICSIS_FUNCTIONS.OLB" with "libr/cre/obj ICSIS_FUNCTIONS.OLB"
       
  5. I invoked a few scripts which rebuild all 200+ programs. There were ~25 failures, all due to poor housekeeping (i.e. they also could not be rebuilt this way on Alpha)
    • the fix: minor modifications to ~25 programs which then were rebuilt without incident
       
  6. I have one program which builds with some very strange linker messages on Itanium. This area of the code is a place where I employ sys$cmexec to call a small function which sets/changes a logical name in table LNM$SYSTEM_TABLE (Why do it this way? It is a supervisor-mode logical name which means I must first change over to executive mode; if this stand-alone program used a CLI then I could just use LIB$SET_LOGICAL; in hindsight the CLI-based approach might have been less painful).
    • the fix:
      • modify the called code changing from a BASIC FUNCTION to a BASIC SUB
      • change one declaration in the calling code

Day 10

Day 11 (Apache woes : part 1)

Day 12 (Apache patch)

Day 13 (my FTP woes)

I think it is safe to say that most modern systems push messages between them using SFTP (which is based upon SSH) or SOAP. However, older interfaces connected to other non-OpenVMS systems may require FTP transfers and this can be a problem. Why?

  • almost all out-of-the-can software can not make sense of VMS/OpenVMS file listings as viewed from FTP (remote systems expect to see a UNIX display)
    • this is especially true of almost all graphical FTP tools available for PCs (Attachmate's Reflection for UNIX and OpenVMS is one notable exception)
  • scripting is virtually identical on all implementations of SFTP
  • scripting is implementation specific with FTP (most remote systems expect to see UNIX)
  • the only things that FTP and SFTP have in common are three letters (this prevents others from using an SFTP client to connect to our FTP server)
  • while FTP supports ASCII-mode transfers, SFTP does not
  • FTP scripting will allow you to include a password in the script, this was removed from SFTP for obvious reasons

Okay so it turns out that we have two separate FTP-related issues which may force us to a third-party stack (MultiNet)

  1. Some foreign systems connect to our OpenVMS system via FTP and expect to see a UNIX-like file listing. These are programmatic transfers written using Microsoft's C# making use of the FTP API built into .NET -AND- we have had no luck getting those programmers to change their code (they push-pull between various systems of which we are only one)
    • both TCPware and MultiNet provide logical names (eg. multinet_ftp_unix_style_by_default is one of many) to adjust the view as seen by connecting clients. This is done on a per-account basis by inserting ftp-specific logical names into the receiving account's LOGIN.COM script.
    • TCPIP Services for OpenVMS does not have anything like this.
      • TCPIP Services does employ a system-wide logical name (TCPIP$FTPD_UNIX_DISPLAY) but it does not do what you might think (it only allows filenames with multiple dots to be displayed that way)
  2. We are required to do programmatic semi-intelligent FTP transfers to other sites so need an FTP API or FTP scripting capabilities. Here is the pseudo code:
    1. prepare to push a message file to a remote system
    2. connect to the remote system via FTP
    3. look to see if file "YADA.txt" exists
      1. If NO, then push our message
      2. If YES, then the remote system has not picked up previous file so don't overwrite it but try again in 60 seconds
      3. If still YES, then raise an alarm
    4. If this interface were rewritten today we would demand to push uniquely named files of the form "YADA-ccyymmddhhmmss.txt" but would still need to deal with problems like: "full destination disk", "unable to connect", and "receiving end not processing their inbound message files"
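
For anyone who would rather see the pseudo code above as something runnable, here is a minimal sketch using Python's standard ftplib module. It is not what we run on OpenVMS (our programs use BASIC and C); the function names, the return strings and the 60-second default are my own assumptions, and the caller is expected to pass in an FTP session that is already connected and logged in.

```python
import io
import time
from ftplib import error_perm


def remote_file_exists(ftp, name):
    """True if `name` appears in the remote directory's name list."""
    try:
        return name in ftp.nlst()
    except error_perm:
        # Some servers answer an empty directory with a 550 error.
        return False


def push_message(ftp, name, payload, retry_delay=60, sleep=time.sleep):
    """Push `payload` as `name` unless a previous file is still waiting.

    `ftp` is an already connected and logged-in ftplib.FTP object (or
    anything offering nlst() and storbinary()).
    Returns "sent" when the file was uploaded, or "alarm" when the old
    file was still present after one retry.
    """
    if remote_file_exists(ftp, name):
        sleep(retry_delay)                      # give the far end time to pick it up
        if remote_file_exists(ftp, name):
            return "alarm"                      # never overwrite an unprocessed file
    ftp.storbinary("STOR " + name, io.BytesIO(payload))
    return "sent"
```

Typical use would be to open a session with ftplib.FTP(host), call login(), then call push_message(ftp, "YADA.txt", data) and raise whatever alarm you use when it returns "alarm". Problems like "unable to connect" or "full destination disk" surface as ftplib exceptions which the caller still needs to catch.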

(our) Alpha Environment

Our existing AlphaServer DS20e employs the TCPware stack

  1. many of our inbound FTP accounts employ special ftp-specific logical names to make our system appear UNIX-like
  2. TCPware provides an "FTP Library" which is an FTP API used by high-level languages (BASIC and C in our case) to do outbound programmatic FTP transfers
  3. Some outbound FTP transfers still employ the programmatic scripting (referred to as a "take" file)

(my) Itanium Options

  1. get a TCPware license for this Itanium
    • minus: extra money
    • plus: very few (perhaps no) changes required in our programs or scripts
    • minus: TCPware is stuck at IPv4 and will never be upgraded to IPv6 according to the vendor (not currently a problem but could be within a year or two)
  2. get a MultiNet license for this Itanium
    • minus: extra money
    • plus: like TCPware, MultiNet supports special logical names to modify how FTP clients see our system
    • plus: supports both IPv4 and IPv6 (unlike TCPware)
    • minus: FTP Library is gone so no FTP API to do FTP from high level languages (which means that any program doing FTP will need to be rewritten using simple scripts)
    • minus: does not support programmatic scripting (a "/take" switch exists but works like the "/input" switch found in TCPIP Services)
      • any solution developed here would need to be really dumb
  3. stay with TCPIP Services for OpenVMS
    • plus: free with OpenVMS license
    • plus: supports both IPv4 and IPv6 (unlike TCPware)
    • minus: no logical names (per-account or system-wide) to present UNIX-like displays to FTP clients
      • we would probably need to get a second UNIX or WINDOWS box which would be designated as the destination for FTP clients. This could get messy
    • minus: no support for programmatic FTP scripting
      • any solution developed here would need to be really dumb:
        1. invoke script to connect to remote system to look for presence of desired file, then exit
        2. make some decision
        3. invoke a different script to do the next step

Day 14 (human-caused problems with SFTP)

  1. Received a trial license from Process Software then downloaded and installed MultiNet
  2. As usual, I was interrupted when configuring SSH which led to some delays in getting SFTP to work properly.
    • I believe this very misleading error message
              Disconnected; protocol error (Protocol error: packet too long
      would have happened with any stack from any vendor which is why I have documented the problem here.
  3. I think I found a problem with FTP UNIX mode. The MultiNet versions of the process logicals are supposed to work the same way as they do in TCPware but they don't. Check out this stub from the LOGIN.COM script of one of the accounts needing this:
    $! logicals for TCPware
    $       DEFINE TCPWARE_FTP_UNIX_STYLE_BY_DEFAULT        TRUE
    $       DEFINE TCPWARE_FTP_DISALLOW_UNIX_STYLE          FALSE
    $       DEFINE TCPWARE_FTP_UNIX_STYLE_CASE_INSENSITIVE  TRUE
    $       DEFINE TCPWARE_FTP_UNIX_YEAR_OLD_FILES          TRUE
    $       DEFINE TCPWARE_FTP_STRIP_VERSION                TRUE
    $! logicals for MultiNet (our Itanium box)
    $       DEFINE MULTINET_FTP_UNIX_STYLE_BY_DEFAULT       TRUE
    $!~~~   DEFINE MULTINET_FTP_DISALLOW_UNIX_STYLE         FALSE   (bug - must be undefined)
    $       DEFINE MULTINET_FTP_UNIX_STYLE_CASE_INSENSITIVE TRUE
    $       DEFINE MULTINET_FTP_UNIX_YEAR_OLD_FILES         TRUE
    $       DEFINE MULTINET_FTP_STRIP_VERSION               TRUE
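
The two stanzas above differ only in their prefix and in the one logical that has to stay undefined on MultiNet. If you maintain many accounts, a small generator keeps the two lists from drifting apart. This is just a sketch in Python (run wherever convenient, then paste the output into LOGIN.COM); the function name and the table layout are my own, and the logical names are only the ones shown above.

```python
# Suffixes and values taken from the LOGIN.COM stub above.
SETTINGS = {
    "FTP_UNIX_STYLE_BY_DEFAULT": "TRUE",
    "FTP_DISALLOW_UNIX_STYLE": "FALSE",
    "FTP_UNIX_STYLE_CASE_INSENSITIVE": "TRUE",
    "FTP_UNIX_YEAR_OLD_FILES": "TRUE",
    "FTP_STRIP_VERSION": "TRUE",
}

# Logicals that must stay undefined on a given stack (the bug noted above).
MUST_BE_UNDEFINED = {"MULTINET": {"FTP_DISALLOW_UNIX_STYLE"}}


def login_com_lines(stack):
    """Return the DCL lines for one stack ("TCPware" or "MultiNet")."""
    prefix = stack.upper()
    skip = MUST_BE_UNDEFINED.get(prefix, set())
    lines = ["$! logicals for " + stack]
    for suffix, value in SETTINGS.items():
        if suffix in skip:
            continue
        lines.append("$       DEFINE {}_{} {}".format(prefix, suffix, value))
    return lines


if __name__ == "__main__":
    for stack in ("TCPware", "MultiNet"):
        print("\n".join(login_com_lines(stack)))
```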

Day 15 (gSOAP)

Day 16 (problem with the Itanium Linker)

$! link-1 (does not work on Itanium)
$! yields: 1 error and 4 warnings
$ link  /exec=qsrv.exe -
 [dvlpc.qsrv.objs]qsrv.obj, -
 [dvlpc.qsrv.objs]cgi.obj, -
 [dvlpc.qsrv.objs]cdescriptor.obj, -
 [dvlpc.qsrv.objs]cquery.obj, -
 [dvlpc.qsrv.objs]cstatus.obj, -
 csmis$dvlp:ICSIS_FUN_LIBRARY.OLB/library, -
 mysql055_root:[lib.ia64]libclientlib.olb/library, -
 mysql055_root:[lib.ia64]libdbug.olb/library, -
 mysql055_root:[lib.ia64]libmysys.olb/library, -
 mysql055_root:[lib.ia64]libsql.olb/library, -
 mysql055_root:[lib.ia64]libstrings.olb/library, -
 mysql055_root:[lib.ia64]libvio.olb/library, -
 mysql055_root:[lib.ia64]libyassl.olb/library, -
 sys$share:LIBZSHR.EXE/share

$! link-2 (does work on Itanium)
$! yields: 0 errors and 0 warnings
$ link  /exec=qsrv.exe -
 [dvlpc.qsrv.objs]qsrv.obj, -
 [dvlpc.qsrv.objs]cgi.obj, -
 [dvlpc.qsrv.objs]cdescriptor.obj, -
 [dvlpc.qsrv.objs]cquery.obj, -
 [dvlpc.qsrv.objs]cstatus.obj, -
 sys$input/opt
 csmis$dvlp:ICSIS_FUN_LIBRARY.OLB/library
 mysql055_root:[lib.ia64]libclientlib.olb/library
 mysql055_root:[lib.ia64]libdbug.olb/library
 mysql055_root:[lib.ia64]libmysys.olb/library
 mysql055_root:[lib.ia64]libsql.olb/library
 mysql055_root:[lib.ia64]libstrings.olb/library
 mysql055_root:[lib.ia64]libvio.olb/library
 mysql055_root:[lib.ia64]libyassl.olb/library
 sys$share:LIBZSHR.EXE/share

Day 17 (Apache woes : part 2)

Overview

New Problem

Curiosity Killed The Cat (satisfaction brought it back)

Quick Fix (2015-11-13)

Final Fix (2015-11-27)

Day 18 (important miscellaneous stuff)

Third Party Software (we can't live without)

Other

Day 19 (supported drives limitations)

Day 20 (drive limitations removed)

Day 21 (the cutover)


What a long strange trip this has been...

Summary (so far)

Project-02 : rx2660

Day 1 (fetch units; initial power up tests)

rx2660
Itanium rx2660
Front Panel LEDs
 

Day 2 (installing drives; installing OpenVMS)

Day 3 (updating firmware and other stuff)

Day 4 (optional: Installing Linux on Itanium)

Common Stuff

Comparing UEFI to EFI

At first blush you might think that the graphics-based setup associated with EFI systems may be of some use to you but you would be wrong.

Monitoring RAID disks with MSA

===========================================================================
title   : neil_msa_20181109.txt
purpose : detailed notes from a RAID adventure on 2018-11-09
platform: OpenVMS (Itanium) on an rx2660 hosting a P400 RAID controller
msa-docs: HP OpenVMS System | Management Utilities Reference | Manual: M-Z
          Order Number: BA555-90009
overview: DISK-5 (which is part of logical unit-1 on my RAID) has just been
	    replaced but the new disk needs to be coaxed back into action
facts   : DISK-1 is the only member of UNIT-0 (so no RAID)
	: DISKS 2-to-7 are members of UNIT-1 (RAID-5)
	: DISK-8 is the only member of UNIT-2 (so no RAID) but is currently
	    deslotted so I can perform some hot backups
	: I may reconfig this array sometime in the future so do not take
	    these settings as gospel
notes   : 1) use the MSA utility if a RAID controller is present; otherwise
	     use the SAS utility
	: 2) "MSA$UTIL.EXE" is the OpenVMS utility required to do this. So use
	     these DCL statements to create two symbolic DCL commands:
	       $ MSA == "$sys$system:msa$util.exe"	(MSA works with all my machines)
	       $ SAS :== $sys$system:sas$util.exe	(SAS works with my P400 on the rx2600)
							(SAS does not work with my rx2660)
							(SAS does not work with my P410 on the rx2800)
	: 3) in the MSA utility, physical DISKs begin numbering from "1"
	: 4) in the MSA utility, logical UNITs begin numbering from "0"
	: 5) eject the bad disk; wait 10 seconds
	: 6) insert the new disk; wait up to 30 seconds
	     (I have proved this by continually executing "show disk 5")
=========================================================================
KAWC09(DVLP)::Neil>				! this is my DCL prompt
KAWC09(DVLP)::Neil> msa				!
MSA> set cont *
MSA> show unit 1
Unit 1:
 In PDLA mode, Unit 1 is Lun 1.
 Cache status : enabled
 Max Boot Partition: Unknown
 Volume status : VOLUME using interim recovery mode
 1 Disk(s) Failed or Removed:			<<< whoops
  Disk 5: (SCSI bus 1, SCSI id 3)		<<< whoops
 6 Data Disk(s) used by lun 1:
  Disk 7: Partition 0; (SCSI bus 1, SCSI id 1)
  Disk 6: Partition 0; (SCSI bus 1, SCSI id 2)
  Disk 5: Partition 255; (SCSI bus 1, SCSI id 3)
  Disk 4: Partition 0; (SCSI bus 1, SCSI id 4)
  Disk 3: Partition 0; (SCSI bus 1, SCSI id 5)
  Disk 2: Partition 0; (SCSI bus 1, SCSI id 6)
 Spare physical drives:
  No spare drives are designated.
 Logical Volume Raid Level: RAID 1. Mirroring
 stripe_size=128.0KB
 Logical Volume Capacity : 410.10 [440.35] GB
MSA>
================================================================
force the new DISK back into the existing UNIT
================================================================
MSA> scan all
MSA> start recover
Rebuild operation is triggered for Units.
Issue SHOW UNIT command to know the status of the Units.
MSA> show unit 1
Unit 1:
 In PDLA mode, Unit 1 is Lun 1.
 Cache status : enabled
 Max Boot Partition: Unknown
 Volume status : VOLUME is currently recovering	<<< better
 ( Percentage complete 8 )			<<< better
 6 Data Disk(s) used by lun 1:
  Disk 7: Partition 0; (SCSI bus 1, SCSI id 1)
  Disk 6: Partition 0; (SCSI bus 1, SCSI id 2)
  Disk 5: Partition 0; (SCSI bus 1, SCSI id 3)	<<< this thing is now back
  Disk 4: Partition 0; (SCSI bus 1, SCSI id 4)
  Disk 3: Partition 0; (SCSI bus 1, SCSI id 5)
  Disk 2: Partition 0; (SCSI bus 1, SCSI id 6)
 Spare physical drives:
  No spare drives are designated.
 Logical Volume Raid Level: RAID 1. Mirroring
 stripe_size=128.0KB
 Logical Volume Capacity : 410.10 [440.35] GB
MSA> exit
================================================================
some time later
================================================================
KAWC09(DVLP)::Neil> msa
MSA> set cont *
MSA> show unit 1
Unit 1:
 In PDLA mode, Unit 1 is Lun 1.
 Cache status : enabled
 Max Boot Partition: Unknown
 Volume status : VOLUME OK			<<< cool
 6 Data Disk(s) used by lun 1:
  Disk 7: Partition 0; (SCSI bus 1, SCSI id 1)
  Disk 6: Partition 0; (SCSI bus 1, SCSI id 2)
  Disk 5: Partition 0; (SCSI bus 1, SCSI id 3)
  Disk 4: Partition 0; (SCSI bus 1, SCSI id 4)
  Disk 3: Partition 0; (SCSI bus 1, SCSI id 5)
  Disk 2: Partition 0; (SCSI bus 1, SCSI id 6)
 Spare physical drives:
  No spare drives are designated.
 Logical Volume Raid Level: RAID 1. Mirroring
 stripe_size=128.0KB
 Logical Volume Capacity : 410.10 [440.35] GB
MSA>
MSA> exit
KAWC09(DVLP)::Neil>
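
The three "show unit 1" displays above differ mainly in their "Volume status" line, so that is the line worth watching if you capture the output to a file. Here is a minimal Python sketch that pulls the status and the rebuild percentage out of captured text. It only knows the status phrases that appear in my own transcripts on this page (the controller very likely has others, which this sketch reports as "unknown"), and the function names are my own.

```python
import re

# Status phrases seen in the transcripts on this page; not an exhaustive list.
STATUS = re.compile(
    r"Volume status\s*:\s*VOLUME\s+"
    r"(OK|failed|is currently recovering|using interim recovery mode)"
)
PERCENT = re.compile(r"Percentage complete\s+(\d+)")


def unit_health(show_unit_text):
    """Extract volume status and rebuild percentage from 'show unit' output."""
    status = STATUS.search(show_unit_text)
    percent = PERCENT.search(show_unit_text)
    return {
        "status": status.group(1) if status else "unknown",
        "percent_complete": int(percent.group(1)) if percent else None,
    }


def needs_attention(health):
    """Anything other than 'OK' deserves a look (including 'unknown')."""
    return health["status"] != "OK"


if __name__ == "__main__":
    sample = "Volume status : VOLUME is currently recovering ( Percentage complete 8 )"
    print(unit_health(sample), needs_attention(unit_health(sample)))
```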

Reconfiguring RAID for OpenVMS (P400 on rx2660)

overview:

  1. I am now working on an OpenVMS system where the user data disks are protected by a RAID scheme while the system disk is not (yikes!). Since I intend to retire in three years, I want these systems to be more easily maintainable (swap a bad drive on the fly - then coax with MSA rather than taking the system down)
  2. Up until this year, I was under the impression that bootable OpenVMS system disks were difficult to copy because the UEFI/EFI files where stored in a FAT-32 partition (as they are on most modern UNIX and Linux systems). I was wrong (read on to learn more)
  3. Our current RAID looks like this:
    Disk (physical)   1        2 3 4 5 6 7   8
    Units (logical)   0        1             2
    Configuration     JBOD     RAID-1+0      JBOD
    Application       SYSTEM   USER1         Emergency SYSTEM
    but I want this:
    Disk (physical)   1 2      3                      4 5      6                      7                   8
    Units (logical)   0        hot spare for unit 0   1        hot spare for unit 1   2                   3
    Configuration     RAID-1                          RAID-1                          JBOD                JBOD
    Application       SYSTEM                          USER1                           target for hot      target for hot
                                                                                      backup of USER1     backup of SYSTEM

steps

first copy the data disk (RAID to single drive):
  1. use the $backup utility to copy dka1 to dka2 (either "from DCL after you boot the OpenVMS DVD" -or- "from OpenVMS provided the system is quiescent")
    $ mount dka1:/over=ident
    $ init  dka2: abcdef
    $ mount dka2:/for
    $ backup/image/ignore=(noback,inter) dka1: dka2:
  2. reboot the DVD (or OpenVMS system disk) but be sure to enter ORCA (Option ROM Configuration for Arrays) before the media is accessed.
    1. delete all unit definitions (do not do this unless you have good backups)
    2. move the new USER1 drive from slot 8 to slot 7
    3. move the old SYSTEM drive from slot 1 to slot 8
    4. create UNIT-0
      • assign disks 1 + 2 configured as RAID-1+0
        • I only want RAID-1 but this controller does not offer that choice
        • since we are only using two drives the controller will only implement mirroring (RAID-1)
      • assign disk 3 as a hot spare (type 's' on the disk you want)
    5. create UNIT-1
      • assign disks 4 + 5 configured as RAID-1+0
        • I only want RAID-1 but this controller does not offer that choice
        • since we are only using two drives the controller will only implement mirroring (RAID-1)
      • assign disk 6 as a hot spare
    6. create UNIT-2
      • assign disk 7 as JBOD (just a bunch of drives)
    7. create UNIT-3
      • assign disk 8 as JBOD (just a bunch of drives)

now copy both JBOD drives back to two RAID sets

  1. exit the ORCA utility then allow the DVD to continue to boot
    • you will see dots displayed on the VGA terminal for the next 5-minutes
    • now all output will only be seen on the serial console or iLO port (never on the VGA terminal)
  2. At the menu, press "8" to spawn to a process with a DCL prompt
    $ show device d						! view the disks (should see four)
    $!------------------------------------------------------
    $ mount dka3:/over=ident				! mount the copied SYSTEM volume
    $ init  dka0:/stru=5 abcdef				! init new SYSTEM volume; ODS5 is optional
    $ mount dka0:/foreign					!
    $ backup/image/ignore=(noback) dka3: dka0:		! "noback" is required for page + swap files
    $ dismount dka0:					!
    $ mount dka0:/over=ident				!
    $ set def dka0:[sys0.syscommon.sysmgr]			!
    $ @ BOOT_OPTIONS					! repair entries in the UEFI/EFI boot manager
    $ dismount dka0:					! this RAID volume is ready for use
    $!------------------------------------------------------
    $ mount dka2:/over=ident				! mount the copied USER1 volume
    $ init  dka1:/stru=5 abcdef				! init new USER1 volume; ODS5 is optional
    $ mount dka1:/foreign					!
    $ backup/image/ignore=(noback,inter) dka2: dka1:	! "noback" just in case
    $ dismount dka1:					!
    $ mount dka1:/over=ident				!
    $ show dev/full dka1:					! -+- do these appear to be similar in size?
    $ show dev/full dka2:					! -+
    $ dismount dka1:					! this RAID volume is ready for use
    $ dismount dka2:					!
    $ dismount dka3:					!
    $!------------------------------------------------------
    $! caveat: if you want to make disk-8 (unit-3) available for an emergency boot disk,
    $! then you must mount the disk then execute script SYS$MANAGER:BOOT_OPTIONS on that disk
    $! to repair (or create) the boot entry for the emergency disk  
  3. Comment: contrary to popular belief, the OpenVMS system disk is not partitioned. There is no FAT-32 partition to hold the UEFI/EFI boot files. However, this file "SYS$SYSDEVICE:[VMS$COMMON.SYS$LDR]SYS$EFI.SYS" is a special file which can be found, read, then executed by the EFI/UEFI boot loader.

update

Today I noticed a few missing files on unit-1 (USER1) so I fetched the hot swap disk from our remote site then swapped disk-7 (a.k.a. unit-2).
I tried mounting the disk like so: mount dkb2:/over=ident/noassist but got a medium offline error message. Anyway, here is the fix

$ mount dkb2:/over=ident/noassist
%MOUNT-F-ERROR, medium offline
$ msa
msa> set cont
msa> show disks
...snip... (this worked as expected)
msa> sho unit 2
Unit 2:
 In PDLA mode, Unit 2 is Lun 2.
 Cache status : enabled
 Max Boot Partition: Unknown
 Volume status : VOLUME failed			<<<---***
 1 Data Disk(s) used by lun 2:
  Disk 7: Partition 0; (SCSI bus 1, SCSI id 1)
 Spare physical drives:
  No spare drives are designated.
 Logical Volume Raid Level: RAID 0 or JBOD. No fault tolerance
 stripe_size=128.0KB
 Logical Volume Capacity : 136.70 [146.78] GB
msa> accept unit 2
Media has been changed on unit 2
msa> exit
$ mount dkb2:/over=ident/noassist
%MOUNT-I-MOUNTED CSMIS$USER1 mounted on _KAWC09$DKA2:
$

Reconfiguring RAID for OpenVMS (P410i on rx2800-i2)

overview:

  1. I wanted to put my newly acquired MSA skills to good use by hardening our production system (rx2800-i2). That system has two RAID sets and I wanted to add two HOT-SPARES from MSA$UTIL so I would not need to take the system down (it has been running continuously for over 36-months)
  2. Our current RAID looks like this:
    Disk (physical)   1 2      3 4      5         6                  7       8
    Units (logical)   0        1        2         3                  empty   empty
    Configuration     RAID-1   RAID-1   JBOD      JBOD
    Application       SYSTEM   USER1    UTILITY   emergency SYSTEM

    but I want this
    Disk (physical)   1 2      3 4      5         6                  7                      8
    Units (logical)   0        1        2         3                  hot spare for unit 0   hot spare for unit 1
    Configuration     RAID-1   RAID-1   JBOD      JBOD
    Application       SYSTEM   USER1    UTILITY   emergency SYSTEM

Steps

  1. I ordered two new 300 GB drives from HP then slotted them (this was done with the system running)
  2. I had no problems assigning disk-7 as a spare to unit-0 like so:
    MSA> set cont *
    MSA> set unit 0 /spare=7
  3. I was not able to assign disk-8 as a spare to unit-1 as the following display will show
    MSA> set cont *
    MSA> set unit 1 /spare=8
    This logical drive has been configured with a later version of
    an Array Configuration Utility.
    It is not safe to use this program to make changes.
    Adding or modification of Raid Unit failed.
    MSA> exit
    $
  4. I know that RAID unit-0 was built by HP in Singapore because that's the way it was delivered to Canada. I know that RAID unit-1 was built from ORCA (Option ROM Configuration for Arrays) by me three years ago. So why has HP allowed the firmware to get ahead of program MSA$UTIL ???
    update: this alert (for an rx2660) shows that HP is aware of the problem
  5. Apparently I only have two ways forward
    1. Reboot the system then use the firmware tool (ORCA) to assign disk-8 to unit-1
    2. Partially shut down the system (being sure to close all the files on the VMS volume associated with unit-1)
      • use $BACKUP to copy all of unit-1 to disk-8 (temporarily configured as JBOD)
      • use MSA$UTIL to delete the definition of unit-1
      • use MSA$UTIL to recreate unit-1
      • use $BACKUP to copy all of disk-8 to unit-1
      • use MSA$UTIL to add disk-8 to unit-1 as a hot-spare

Reconfig common

If you happened to use the ADD command on the RAID controller to create a new unit-disk association that didn't exist before, then remember to use either one of these DCL command sequences to make OpenVMS aware of it

!------------------------
$RUN SYS$SYSTEM:SYSGEN
    AUTOCONFIGURE ALL
    EXIT
!------------------------
$RUN SYS$SYSTEM:SYSMAN
    IO AUTOCONFIGURE
    EXIT

Building an iLO Maintenance Network

Weird networking problem with MultiNet-5.5

hardware: rx2660
os: OpenVMS-8.4
stack: MultiNet-5.5
Problem: okay so this is a weird one.

External Links


Neil Rieck
Waterloo, Ontario, Canada.