OpenVMS Notes: System Tuning

  1. The information presented here is intended for educational use by qualified OpenVMS technologists.
  2. The information presented here is provided free of charge, as-is, with no warranty of any kind.

edit: 2018-03-31 (updated but this page is now totally incomprehensible; will rewrite ASAP)

A Very Simple Overview of OpenVMS Memory

  1. VMS and OpenVMS are multi-user virtual memory operating systems.

    • Question: What is virtual memory?
    • Answer: It is technology which enables a computer to split your memory address space between electronic memory (RAM) and magnetic memory (DISK).
    • Question: Why would engineers go to all this trouble?
    • Answers:
      • In the mid-1970s, when DEC was designing the VMS operating system for the VAX hardware architecture, RAM (memory) was far more expensive than DISK (storage).
      • It was well known that once most programs finish initializing, they tend to run 10% of their code 90% of the time. So wouldn't it be neat if the seldom-used 90% of the code was moved out to DISK? This would free up RAM for other processes.
      • And what about program data like large arrays? For example, think about a professional astronomy simulation with 100,000 entries for nearby stars; you only need to reference stars in the direction you are looking so perhaps "currently" unused star data could be moved out to DISK as well.
    • Question: Memory is now much cheaper (and therefore more abundant) so is this technology still necessary today?
    • Answers:
      • Yes, but first a brief history is required with the explanation.
      • A 16-bit CPU can address a maximum memory size of 64 Kilobytes (2^16 = 65,536).
        • These 16-bit systems accomplished quite a bit by swapping processes between memory and disk.
        • Memory mapping schemes allowed these 16-bit PDP computers to support memory sizes of 18-bit, 22-bit, and 24-bit.
        • The operating system could see this memory but individual processes could not.
        • 24-bit sounds like a lot but it only represents 16 Megabytes (2^24 = 16,777,216)
      • The 32-bit VAX was developed in the mid 1970s to replace the 16-bit PDP.
      • A 32-bit CPU can address a maximum memory size of 4 Gigabytes (2^32 = 4,294,967,296)
      • In the mid 1970s, 4 GB of RAM was so unbelievably expensive that no commercial computers were ever built to host that amount.
        • comment: PCs in the early 2000s routinely supported the installation of 4 GB of memory; by 2010 we were seeing new PCs host 12 GB
      • The 32-bit VAX-11/780, code-named "Star", was introduced on 25 October 1977 at DEC's Annual Meeting of Shareholders. It was the first computer to implement the 32-bit VAX architecture
      • According to books like DEC is Dead, Long Live DEC and Showstopper! The Breakneck Race to Create Windows NT, development of a successor 64-bit technology began at DEC in the mid 1980s. BTW, some of this technology made its way to Microsoft in 1988 after Dave Cutler was pushed from DEC
      • The 64-bit Alpha 21064 (also known as the EV4) was introduced by DEC in November 1992.
      • In theory, a 64-bit CPU can address a maximum "memory size" of 18 Exabytes (2^64 = 18,446,744,073,709,551,616). However, machines based upon the DEC Alpha architecture can be manufactured to support 8 Terabytes (2^43 = 8,796,093,022,208). Here, I am talking about electronic physical memory combined with disk-based virtual memory.
      • So just as no one ever imagined a computer hosting 4 GB of 32-bit RAM, no one today can imagine owning a 64-bit computer holding 8 TB to 18 EB of RAM. On top of this, most programs still usually will only run 10% of the code 90% of the time so virtual memory technology will allow active processes to access the underused memory of other processes.
  2. VAX pages are always 512 bytes in size. Why? This was the industry-standard size of a disk sector in the mid-1970s, so a memory page of 512 bytes mapped directly to a disk sector of 512 bytes. By 2011 the standard size of a disk sector had increased to 4096 bytes and will probably get to 8192 within a decade.
     
  3. Alpha and Itanium make a distinction between pages and pagelets. Pagelets (smaller than a page) are always 512 bytes but "pages" are larger and vary between machines of the same type. Here are three ways to determine the size of a "memory page" on your machine:

    1.  
      $ mcr sysgen
      SYSGEN>  SHOW WSMAX
      Parameter Name            Current    Default     Min.       Max.   Unit  Dynamic
      --------------            -------    -------   -------    -------  ----  -------
      WSMAX                      100000       8192      1024  134217728 Pagelets   
       internal value              6250        512        64    8388608 Pages      
      SYSGEN>  EXIT
      $ write sys$output 100000/6250*512
      8192
      $


    2.  
      $ show memory/phy
      System Memory Resources on 20-JUN-2010 06:22:01.58
      
      Physical Memory Usage (pages):     Total        Free      In Use    Modified
        Main Memory (128.00MB)           16384        3977       11327        1080
      $ write sys$output 128 * 1024 * 1024 / 16384
      8192
      $  

    3.  
      $ write sys$output f$getsyi("PAGE_SIZE")
      8192
      $
  4. An excerpt from OpenVMS 8.3 System Manager's Manual, Volume 2 about pages and pagelets

    Pages and Pagelets

    On VAX systems, the operating system allocates and deallocates memory for processes in units called pages. A page on a VAX system is 512 bytes. Some system parameter values are allocated in units of pages.

    On Alpha and I64 systems, some system parameter values are allocated in units of pages, while others are allocated in units of pagelets. Both Alpha and I64 support a variety of page sizes. The OpenVMS operating system currently uses 8KB (8192 bytes) pages on Alpha and I64 systems. A pagelet is a 512-byte unit of memory. One Alpha or I64 pagelet is the same size as one VAX page. On an Alpha or I64 system with a page size of 8KB, 16 pagelets equal one page. When reviewing parameter values, especially those parameters related to memory management, be sure to note the units required for each parameter. Showing Parameter Values with SYSMAN and Showing Parameter Values with SYSGEN explain how to show parameter values and their units of allocation.

  5. Sysgen parameter WSMAX places an upper limit on the amount of RAM (physical memory) that any single process in the system can use, no matter what method was used to create the process (see next section).

  6. If you add more physical memory to any VMS or OpenVMS system, your system will not see any of it until you execute the autogen script.
    @sys$update:autogen { more parameters expected on this line }

    Caveats:
    1. Never execute this script unless you know what you are doing -OR- can experiment with a non-production platform
    2. You might think you can run this program without hurting your system but doing so might prevent your system from rebooting at some future time.
    3. More information about autogen can be found in the "Retuning The System (via AUTOGEN)" section below
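
The page and pagelet arithmetic from items 3 and 4 above can be collected into a few helper functions. This is only an illustrative Python sketch (the function names are my own invention and nothing here runs on OpenVMS itself); the numbers come from the listings above, and the 8192-byte default is the page size those listings report.

```python
PAGELET_BYTES = 512  # a pagelet is always 512 bytes (the size of one VAX page)


def page_size_from_wsmax(pagelets: int, pages: int) -> int:
    """Method 1: compare SYSGEN's pagelet value with its internal page value."""
    return pagelets // pages * PAGELET_BYTES


def page_size_from_memory(total_mb: int, total_pages: int) -> int:
    """Method 2: divide physical memory in bytes by the number of pages."""
    return total_mb * 1024 * 1024 // total_pages


def pagelets_to_pages(pagelets: int, page_bytes: int = 8192) -> int:
    """Convert pagelets to whole pages, rounding up (an assumption)."""
    pagelets_per_page = page_bytes // PAGELET_BYTES
    return -(-pagelets // pagelets_per_page)  # ceiling division


print(page_size_from_wsmax(100000, 6250))  # WSMAX figures from method 1
print(page_size_from_memory(128, 16384))   # SHOW MEMORY figures from method 2
print(pagelets_to_pages(100000))           # WSMAX expressed in pages
```

Rounding up is a guess on my part, chosen because a partly used page still occupies a whole page of memory.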

Process Creation

VMS processes are created in two primary ways.
  1. Logging in with a username + password. A program called sys$system:loginout.exe will fetch your authorized parameters for your account from SYSUAF (SYStem User Authorization File) then use them to create your process.
    • caveat: many SYSUAF parameters can be overridden by the associated sysgen minimums (the PQL_M parameters) when the sysgen value is higher.
      Here is a short (incomplete) list:
      SYSUAF      SYSGEN            Description
      ---------   ---------------   -----------
      WSdefault   PQL_MWSDEFAULT    working set default (the starting size of physical memory)
      WSquota     PQL_MWSQUOTA      working set quota (the guaranteed size that the process can grow to)
      WSextent    PQL_MWSEXTENT     working set extent (the size that a process can grow to on a lightly loaded system)
      Pgflquo     PQL_MPGFLQUOTA    the maximum amount of virtual memory (this is the 90%)
  2. Detached processes can be created without referencing the UAF and might be started like this real-world example (parameters referenced here were defined elsewhere in the script)
    $ RUN SYS$SYSTEM:LOGINOUT -
    /PROCESS_NAME=FTP_LISTENER -
    /NOACCOUNTING -
    /NOAUTHORIZE -
    /DETACHED -
    /INPUT=TCPWARE:FTP_LISTENER.COM -
    /OUTPUT='LOGFILE' -
    /ERROR=NLA0: -
    /PRIORITY=5 -
    /PRIVILEGES=(NOSAME,ACNT,ALTPRI,BYPASS,DETACH,GROUP,LOG_IO,NETMBX,-
    SYSPRV,SHARE,TMPMBX,WORLD,EXQUOTA,PRMMBX,OPER,CMKRNL,-
    CMEXEC,SYSNAM,OPER,PHY_IO,SETPRV,PSWAPM) -
    /UIC=[1,3] -
    /AST_LIMIT='ASTLM' -
    /BUFFER_LIMIT='BYTLM' -
    /ENQUEUE_LIMIT='ENQLM' -
    /EXTENT='WSEXTENT' -
    /FILE_LIMIT='FILLM' -
    /IO_BUFFERED='BIOLM' -
    /IO_DIRECT='DIOLM' -
    /MAXIMUM_WORKING_SET='WSQUOTA' -
    /PAGE_FILE='PAGEFILE' -
    /QUEUE_LIMIT='TQELM' -
    /RESOURCE_WAIT -
    /SUBPROCESS_LIMIT='PRCLM' -
    /WORKING_SET='WSLIMIT'
    $!
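
The override caveat in item 1 can be pictured as a two-step rule: a working set value from the UAF is raised to the system-wide minimum if that minimum is larger, and the result can never exceed WSMAX. The Python sketch below is a simplified model of that rule as the manual excerpts later on this page describe it; the function name and the sample PQL value are hypothetical.

```python
def effective_limit(uaf_value: int, system_minimum: int, wsmax: int) -> int:
    """Raise the UAF value to the system minimum, then cap it at WSMAX."""
    return min(max(uaf_value, system_minimum), wsmax)


# WSextent 16000 comes from a sample UAF record; 32768 is a made-up
# PQL_MWSEXTENT value and 100000 is a made-up WSMAX, both in pagelets.
print(effective_limit(16000, 32768, 100000))   # system minimum wins
print(effective_limit(200000, 32768, 100000))  # capped at WSMAX
```

Only the working set values are modelled here; page file quota is not limited by WSMAX.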

System User Authorization File (SYSUAF)

Here is a sample UAF entry for my non-priv account:
$ set def sys$system
$ run authorize
UAF>show ns_rieck

Username: NS_RIECK                         Owner:  NSR_N123119_TEK
Account:  ADMCSM                           UIC:    [346,305] ([NS_RIECK])
CLI:      DCL                              Tables: 
Default:  CSMIS$USER3:[ADMCSM.NS_RIECK]
LGICMD:   CSMIS$COM:LOGIN.COM
Flags:    DisPwdDic DisPwdHis
Primary days:   Mon Tue Wed Thu Fri        
Secondary days:                     Sat Sun
Primary   000000000011111111112222  Secondary 000000000011111111112222
Day Hours 012345678901234567890123  Day Hours 012345678901234567890123
Network:  ##### Full access ######            ##### Full access ######
Batch:    ##### Full access ######            ##### Full access ######
Local:    -----  No access  ------            -----  No access  ------
Dialup:   -----  No access  ------            -----  No access  ------
Remote:   -----  No access  ------            -----  No access  ------
Expiration:            (none)    Pwdminimum:  8   Login Fails:     0
Pwdlifetime:           (none)    Pwdchange:   3-OCT-2003 16:03 
Last Login:  9-DEC-2004 17:45 (interactive), 30-SEP-2002 13:55 (non-interactive)
Maxjobs:         0  Fillm:       300  Bytlm:        99000
Maxacctjobs:     0  Shrfillm:      0  Pbytlm:           0
Maxdetach:       0  BIOlm:       100  JTquota:       8192
Prclm:          10  DIOlm:       100  WSdef:         2000
Prio:            4  ASTlm:       100  WSquo:         4000
Queprio:         0  TQElm:        10  WSextent:     16000
CPU:        (none)  Enqlm:       400  Pgflquo:      99000
Authorized Privileges: 
  GROUP        NETMBX       TMPMBX       WORLD
Default Privileges: 
  GROUP        NETMBX       TMPMBX       WORLD
UAF> 

Notes for some UAF Parameters:

  • WSDEF (working set default) is the initial amount of physical memory allocated to a new process
     
  • WSQUO (working set quota) is the amount of physical memory guaranteed to an active process
     
  • WSEXTENT (working set extent) is the amount of physical memory an active process can grow to if OpenVMS determines that free memory is available AND is tuned to allow borrowing.
    Official Documentation Error Warning: OpenVMS 8.2 System Manager's Manual Volume 1 (dated: 2005)

    WSEXT specifies the maximum size to which a user's physical memory usage can grow, independent of the system load. This enlargement of the physical memory for a user is accomplished by the Adjust Working Set Limit ($ADJWSL) system service, and is normally done for the user by the operating system in response to heavy page faulting by the user. WSEXTENT is a nondeductible quota. This value should always be greater than or equal to WSQUO. The value is controlled by the system parameter WSMAX. Note that PQL_MWSEXTENT overrides the account's value for WSEXTENT if PQL_MWSEXTENT is larger than WSEXTENT. 
    The phrase "independent of the system load" is wrong, since only a lightly loaded system will allow processes to borrow pages up to WSEXT. Once the free list gets too small, the system will no longer allow processes to borrow and may even begin to trim processes back to WSQUO. The following reference is more accurate:
    HP OpenVMS System Management Utilities Reference Manual: A-L (dated 2010)

    [WSEXT] Specifies the working set maximum. This represents the maximum amount of physical memory allowed to the process. The system provides memory to a process beyond its working set quota only when it has excess free pages. The additional memory is recalled by the system if needed. The value is an integer equal to or greater than WSQUOTA. By default, the value is 16384 pagelets on Alpha and Integrity server systems. The value cannot be greater than WSMAX. This quota value replaces smaller values of PQL_MWSEXTENT.
  • PGFLQUO (page file quota) is the maximum amount of space that this process may allocate in the Page File. This will place a maximum limit on the amount of virtual memory for that process.
    • You will not usually experience problems if you increase this value on a few problem accounts (generating large annual reports; compiling large programs) but be aware that "you will be robbing Peter to pay Paul" if your page file isn't large enough. Also, setting PGFLQUO high doesn't guarantee your problem accounts will have full access to the page file. Other processes may have already reserved a portion of it before your problem process started.
    • Processes which actually require a large amount of virtual memory but only have a small amount of physical memory (WSQUO) will probably cause the system to thrash. It is for this reason that Automatic Working Set Adjustment and BORROWING should always be enabled so that you use physical memory when it is available
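
The relationships between the working set values in the notes above (WSQUO no larger than WSEXTENT, WSEXTENT no larger than WSMAX) lend themselves to a quick sanity check. The Python sketch below is illustrative only: the function names are hypothetical, the rule that WSDEF should not exceed WSQUO is my own assumption, and WSMAX = 100000 is used purely as an example value.

```python
PAGELET_BYTES = 512


def check_working_set(wsdef: int, wsquo: int, wsextent: int, wsmax: int) -> list:
    """Return a list of ordering problems (empty when the values are consistent)."""
    problems = []
    if wsdef > wsquo:
        problems.append("WSDEF is larger than WSQUO")
    if wsquo > wsextent:
        problems.append("WSQUO is larger than WSEXTENT")
    if wsextent > wsmax:
        problems.append("WSEXTENT is larger than WSMAX")
    return problems


def pagelets_to_mb(pagelets: int) -> float:
    """Express a pagelet count in megabytes (1 MB = 1024 * 1024 bytes)."""
    return pagelets * PAGELET_BYTES / (1024 * 1024)


print(check_working_set(2000, 4000, 16000, wsmax=100000))    # NS_RIECK values
print(check_working_set(2000, 20000, 200000, wsmax=100000))  # developer-sized values
print(pagelets_to_mb(16000))                                 # WSextent of 16000 pagelets
```

A reported problem is not necessarily fatal; as noted elsewhere on this page, a value above WSMAX simply has no effect.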
Compilers always require a lot of resources. A few of our Alpha BASIC programs consist of more than 60k lines so the memory allotment for developers must be higher like so:
$ set def sys$system
$ run authorize
UAF>show neil

Username: NEIL                             Owner:  NSR_N123119_ADM
Account:  ADMCSM                           UIC:    [346,1] ([NEIL])
CLI:      DCL                              Tables: 
Default:  CSMIS$USER3:[ADMCSM.NEIL]
LGICMD:   CSMIS$ADM:LOGIN.COM
Flags:  DisPwdDic DisPwdHis
Primary days:   Mon Tue Wed Thu Fri        
Secondary days:                     Sat Sun
No access restrictions
Expiration:            (none)    Pwdminimum:  8   Login Fails:     0
Pwdlifetime:           (none)    Pwdchange:   4-DEC-2006 15:02 
Last Login: 19-JUN-2010 19:03 (interactive), 18-JUN-2010 23:05 (non-interactive)
Maxjobs:         0  Fillm:       300  Bytlm:      2000000
Maxacctjobs:     0  Shrfillm:      0  Pbytlm:           0
Maxdetach:       0  BIOlm:      1024  JTquota:       8192
Prclm:          20  DIOlm:      1024  WSdef:         2000
Prio:            4  ASTlm:       100  WSquo:        20000
Queprio:         0  TQElm:        10  WSextent:    200000
CPU:        (none)  Enqlm:      2000  Pgflquo:    2000000 <-- yikes
Authorized Privileges: 
  ACNT         ALLSPOOL     ALTPRI       AUDIT        BUGCHK       BYPASS
  CMEXEC       CMKRNL       DIAGNOSE     DOWNGRADE    EXQUOTA      GROUP
  GRPNAM       GRPPRV       IMPERSONATE  IMPORT       LOG_IO       MOUNT
  NETMBX       OPER         PFNMAP       PHY_IO       PRMCEB       PRMGBL
  PRMMBX       PSWAPM       READALL      SECURITY     SETPRV       SHARE
  SHMEM        SYSGBL       SYSLCK       SYSNAM       SYSPRV       TMPMBX
  UPGRADE      VOLPRO       WORLD
Default Privileges: 
  ACNT         ALLSPOOL     ALTPRI       AUDIT        BUGCHK       BYPASS
  CMEXEC       CMKRNL       DIAGNOSE     DOWNGRADE    EXQUOTA      GROUP
  GRPNAM       GRPPRV       IMPERSONATE  IMPORT       LOG_IO       MOUNT
  NETMBX       OPER         PFNMAP       PHY_IO       PRMCEB       PRMGBL
  PRMMBX       PSWAPM       READALL      SECURITY     SETPRV       SHARE
  SHMEM        SYSGBL       SYSLCK       SYSNAM       SYSPRV       TMPMBX
  UPGRADE      VOLPRO       WORLD
Identifier                         Value           Attributes
  DFU_ALLPRIV                      %X8001001D      
UAF>

SYSGEN (System Generation)

My Free List

	0	364		1376
	+-------+-------+-------+-------+------->>> growth direction
	|	|	|	|	|
	+-------+-------+-------+-------+------->>>
		|		|
		+ FREELIM	+ FREEGOAL
		  BORROWLIM
		  GROWLIM

	Notes:
	1) both the SWAPPER and AWSA continually monitor these four parameters
	2) when "free pages" fall below FREELIM, the SWAPPER will fight to get back to FREEGOAL
	3) AWSA will allow a process to borrow pages past WSquota (limited by WSextent)
	4) On smaller memory systems (like VAX), BORROWLIM and GROWLIM are not set so low  
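
Note 2 above can be reduced to one small calculation: nothing happens while the free list is at or above FREELIM, and once it drops below, the swapper's target is FREEGOAL. The Python sketch below is a simplified model under that reading, not a description of the swapper's actual code; the function name is hypothetical and the values 364 and 1376 are the ones in the diagram.

```python
def pages_to_reclaim(free_pages: int, freelim: int, freegoal: int) -> int:
    """Pages the swapper would try to recover; 0 while free memory is adequate."""
    if free_pages >= freelim:
        return 0
    return freegoal - free_pages


FREELIM, FREEGOAL = 364, 1376  # values from the diagram above

for free in (1000, 364, 300):
    print(free, pages_to_reclaim(free, FREELIM, FREEGOAL))
```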

SYSGEN Parameters

caveat: if you execute sysgen commands from a character-cell terminal, the output will be presented in a 24-line scrolling window which means you will not see everything. The following hack will temporarily divert output to a file named scratch.txt

$ def/user sys$output scratch.txt	! temporarily divert output to file
$ mcr sysgen				! activate the program
SYSGEN>  SHO /MAJOR			! execute a command (no output seen here)
SYSGEN>  SHO AW				! 
SYSGEN>  EXIT				!
$ type scratch.txt			! display file contents

Parameters in use: Active
Parameter Name            Current    Default    Min.         Max. Unit       Dynamic
--------------            -------    -------   -------    ------- ----       -------
GBLSECTIONS                   850        250        80      65535 Sections   
MAXPROCESSCNT                 346         32        12      16384 Processes  
MULTITHREAD                     2          1         0        256 KThreads   
SMP_CPUS                       -1         -1         0         -1 CPU bitmas 
SYSMWCNT                     4003       2048       512    1048576 Pagelets   
 internal value               251        128        32      65536 Pages      
BALSETCNT                     344         30         8      16384 Slots      
WSMAX                      100000       4096      1024    8388608 Pagelets   
 internal value              6250        256        64     524288 Pages      
NPAGEDYN                 13524992    1048576    163840         -1 Bytes      
PAGEDYN                   3416064     524288     65536         -1 Bytes      
FILE_CACHE                      0          0         0        100 Percent    
S2_SIZE                         0          0         0         -1 MBytes     
FREELIM                       364         32        16         -1 Pages      
LOCKIDTBL                    4455       3840      1792   16776959 Entries    
RESHASHTBL                   4096         64         1   16777216 Entries    
PFCDEFAULT                     64         64         0       2032 Pagelets   D
 internal value                 4          4         0        127 Pages      D
GBLPAGES                  1787424      65536     10240         -1 Pagelets   D
 internal value            111714       4096       640         -1 Pages      D
QUANTUM                        20         20         2      32767 10Ms       D
PFRATL                          0          0         0         -1 Flts/10Sec D
PFRATH                          8          8         0         -1 Flts/10Sec D
WSINC                        2400       2400         0         -1 Pagelets   D
 internal value               150        150         0         -1 Pages      D
WSDEC                        4000       4000         0         -1 Pagelets   D
 internal value               250        250         0         -1 Pages      D
FREEGOAL                     1376        200        16         -1 Pages      D
GROWLIM                       364         63         0         -1 Pages      D
BORROWLIM                     364        300         0         -1 Pages      D
IO_PREFER_CPUS                 -1         -1         0         -1 CPU bitmas D
LCKMGR_MODE                     0          0         0        255 CPU Count  D
LCKMGR_CPUID                    0          0         0         -1 CPU Id     D
Parameter Name            Current    Default    Min.         Max. Unit       Dynamic
--------------            -------    -------   -------    ------- ----       -------
AWSMIN                        512        512         0         -1 Pagelets   D
 internal value                32         32         0         -1 Pages      D
AWSTIME                        20         20         1         -1 10Ms       D
$ 

A very simple overview:

  1. When a process starts, lots of parameters are loaded from that account's SYSUAF record which was created via the authorize utility. Three of them are named WSDEF, WSQUO, and WSEXT
     
  2. A process is initially allowed WSDEF pages of physical memory (RAM) and can demand more by faulting up to WSQUO (this amount is guaranteed if AUTOGEN was done properly "and" no one was making manual changes to the SYSGEN parameters). If the system is lightly loaded, then a heavily faulting process may be allowed to borrow more pages up to WSEXTENT.
     
    1. SYSUAF values are only read when a process starts. If you change them the user will need to log off then back in for them to take effect
       
    2. WSMAX places an upper limit on process growth. Setting WSDEF, WSQUO, or WSEXT larger than WSMAX will have no effect.
       
  3. What does faulting mean? A page fault occurs when a process attempts to access a page of its virtual memory that is not currently in its working set. If the missing page can still be found in memory (on the free list) then bringing this page back to your process is called soft faulting. If the missing page must be loaded from disk (either the pagefile or the image file of the program being run) then bringing this page back into memory is called hard faulting.
     
  4. OpenVMS divides memory into paged and non-paged. Why would some memory be non-paged? Think about a hardware device doing a DMA transfer into, or out of, a memory buffer. You would never want this memory to be paged out (or swapped out). The following discussion only deals with paged memory.
     
  5. How does OpenVMS know when to allow a working set to grow (or shrink)? This is done by AWSA (automatic working set adjustment) which is described below.
     
  6. How does OpenVMS know when to allow a working set to BORROW (grow past WSQUOTA toward WSEXTENT)? Free pages in OpenVMS are maintained in a structure known as the page cache or FREE LIST. When the system has more than GROWLIM free pages, actively "faulting" processes (those faulting at a rate higher than PFRATH) can acquire more pages up to a maximum of WSEXTENT. On the flip side, when the system gets busy and the FREE LIST shrinks below FREELIM, the SWAPPER will begin to trim back processes (extended processes first, then low-faulting processes, i.e. those below PFRATL). This can continue and may even involve trimming dormant processes (idle for longer than DORMANTWAIT) back to their working set defaults (WSDEF). After this the system will resort to swapping processes out entirely. The system will continue until the FREE LIST grows back to FREEGOAL.

              $mon sys (to watch the FREE LIST grow or shrink)

    Note: GROWLIM and BORROWLIM work together to enable or disable borrowing. The borrow feature is first enabled when the system has more than BORROWLIM pages and then disabled when the system has fewer than GROWLIM pages. Together they provide a form of hysteresis to prevent the system from acting erratically. (e.g. quickly switching from borrowing to trimming). Also note that FREELIM = GROWLIM = BORROWLIM which is a common AUTOGEN setting for systems with lots of memory (e.g. 512 MB or greater)
     
  7. Pages removed from a working set go to one of two lists. Unmodified pages, such as program code and read-only data (constants), go to the FREE LIST, while modified pages (program variable data) go to a smaller structure known as the MODIFIED LIST. This so-called dirty page cache must be written to the PAGEFILE before its pages can be given to another process.
     
  8. When a process needs more pages but is denied, it can get more by giving up some existing pages back to the FREE LIST. If it needs a page later and the desired page is still on the FREE LIST, it can get it back by "soft faulting". However, if the page has been given to someone else, the process will need to "hard fault" the data from disk.
     
  9. Busy systems may want to enable Automatic Working Set Adjustment (AWSA) in both directions (expansion and contraction). The value of PFRATL has been defaulted to zero for many years now but can be enabled to ensure that more free memory is usually available (at the expense of more faulting).
     
  10. SYSMWCNT sets the working set size of the system. If this value is too large, VMS will use too much physical memory; if this value is too small, user processes will be given too much memory which means the OS will become somewhat crippled.
     
  11. You should really read a tuning-guide before you mess with these parameters. That said, if you must hack then check out the following online help:
    $mcr sysgen
    SYSGEN>help sys freelim
    SYSGEN>help sys freegoal
    SYSGEN>help sys growlim
    SYSGEN>help sys borrowlim
    SYSGEN>help sys awstim
    SYSGEN>help sys awsmin
    SYSGEN>help sys wsinc
    SYSGEN>help sys wsdec
    SYSGEN>help sys pfratl
    SYSGEN>help sys pfrath
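
The GROWLIM/BORROWLIM hysteresis described in the note under item 6 can be modelled as a switch with two thresholds: borrowing turns on when the free list rises above BORROWLIM and turns off only when it falls below GROWLIM. The Python sketch below follows that description and nothing more; the class name is hypothetical, and the thresholds 63 and 300 are the Default column values from the SYSGEN listing.

```python
class BorrowGate:
    """Two-threshold switch: on above BORROWLIM, off below GROWLIM."""

    def __init__(self, growlim: int, borrowlim: int) -> None:
        self.growlim = growlim
        self.borrowlim = borrowlim
        self.enabled = False  # assume borrowing starts out disabled

    def update(self, free_pages: int) -> bool:
        """Feed in the current free-list size; return whether borrowing is on."""
        if free_pages > self.borrowlim:
            self.enabled = True
        elif free_pages < self.growlim:
            self.enabled = False
        # Between the two thresholds the previous state is kept.
        return self.enabled


gate = BorrowGate(growlim=63, borrowlim=300)
for free in (400, 200, 50, 200, 301):
    print(free, gate.update(free))
```

With GROWLIM equal to BORROWLIM, as on the system shown on this page, the band between the thresholds disappears and the switch follows the free-list size directly.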

Automatic Working Set Adjustment (AWSA)

The following discussion refers to the sysgen parameters shown in the "SYSGEN Parameters" listing above

  • automatic working set adjustment runs every AWSTIME (20 x 10 ms = 200 ms).
  • if a process has been faulting at a rate higher than PFRATH (8 faults per 10 seconds) and other system conditions are right (like light loading, which results in a larger free list), the process might be given an additional WSINC (2400) pagelets
  • if a process has at least AWSMIN (512) pagelets and has been faulting at a rate lower than PFRATL, it will lose WSDEC (4000) pagelets.
    • Because PFRATL is set to zero on my system, automatic working set reduction is disabled.
    • when automatic working set reduction is enabled (usually only on small memory systems), it is not a good idea to have GROWLIM = BORROWLIM
  • FREELIM = GROWLIM = BORROWLIM which is a common AUTOGEN setting for systems with lots of memory (e.g. 512 MB or greater)
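
The bullets above describe one adjustment pass, which can be sketched as a single function. This Python model is a deliberate simplification (the real executive considers more conditions); the class and argument names are hypothetical, all sizes are in pagelets, and the defaults are the values from the SYSGEN listing on this page.

```python
from dataclasses import dataclass


@dataclass
class Awsa:
    pfrath: int = 8     # faults per 10 seconds above which a process may grow
    pfratl: int = 0     # faults per 10 seconds below which a process may shrink
    wsinc: int = 2400   # growth step
    wsdec: int = 4000   # shrink step
    awsmin: int = 512   # floor for automatic reduction

    def adjust(self, ws_limit, fault_rate, wsquota, wsextent, borrowing):
        """Return the new working set limit after one AWSTIME interval."""
        ceiling = wsextent if borrowing else wsquota
        if fault_rate > self.pfrath:
            # Grow by WSINC, but never past the ceiling (and never shrink here).
            return max(ws_limit, min(ws_limit + self.wsinc, ceiling))
        if fault_rate < self.pfratl and ws_limit > self.awsmin:
            return max(ws_limit - self.wsdec, self.awsmin)
        return ws_limit


awsa = Awsa()
print(awsa.adjust(2000, 20, wsquota=4000, wsextent=16000, borrowing=True))
print(awsa.adjust(2000, 20, wsquota=4000, wsextent=16000, borrowing=False))
print(awsa.adjust(6000, 0, wsquota=4000, wsextent=16000, borrowing=True))
```

Because no fault rate can be lower than zero, the default PFRATL of 0 means the shrink branch is never taken, which matches the remark that reduction is disabled on this system.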

OpenVMS Processes

  • MAXPROCESSCNT specifies the maximum number of concurrent active processes
  • BALSETCNT specifies the maximum number of memory-resident processes
  • MAXPROCESSCNT must always be at least 2 greater than BALSETCNT
  • If a system contains 500 active processes and BALSETCNT is set to 400, then 98 processes (500 - 400 - 2) will be in a swapped-out state. This is OK if they're batch jobs but will be bad news if they are interactive users.
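
The swapped-out arithmetic above is easy to get wrong by two, so here it is as a small Python sketch. The function name is hypothetical, and the constant 2 is taken from the "at least 2 greater" rule; I am assuming it stands for two processes that never occupy a balance set slot.

```python
NO_SLOT_NEEDED = 2  # assumption: two processes never need a balance set slot


def swapped_out(active_processes: int, balsetcnt: int) -> int:
    """Estimate how many processes cannot be memory-resident at once."""
    return max(0, active_processes - NO_SLOT_NEEDED - balsetcnt)


print(swapped_out(500, 400))  # the example in the bullet above
print(swapped_out(64, 62))    # MAXPROCESSCNT exactly 2 above BALSETCNT
```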

OpenVMS Swapper

  • The OpenVMS swapper serves two functions, trimming and swapping.
  • When the FREE LIST drops below FREELIM, the swapper will
    • start trimming back processes that have borrowed pages, possibly all the way to WSQUO. Idle processes may be trimmed back to WSDEF.
    • Processes idle for longer than DORMANTWAIT might be swapped out.
  • When the number of active processes exceeds BALSETCNT, the swapper will need to move processes in and out of memory in order to meet the needs of the round-robin scheduler. When virtual memory systems swap in this fashion it's usually not pretty, because swapped-out processes come back into memory with their working sets trimmed to WSDEF, which is many times too small. This is one reason why some system managers will avoid trimming to WSDEF and just go straight to swapping.

Pool

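  • Non-paged Dynamic Pool is the operating system's heap for data that must always stay resident in RAM (I/O request packets, lock structures, driver buffers). It is comparable to what MALLOC provides in "C", except that it serves the operating system rather than user programs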
  • The initial size (in bytes) of non-paged dynamic pool is set by NPAGEDYN but more space can be allocated as required
  • Paged Dynamic pool is used to hold RMS file structures, logical names, etc.

Retuning The System (via AUTOGEN)

Introduction

  • invoke the tuning script using one of the following:
    • @sys$update:autogen help (to get complete help)
    • @sys$update:autogen yada (to get shorter parameter help)
When To Retune
  • mandatory
    • whenever you turn up a new system
    • whenever you add or remove memory
  • optional
    • whenever workload changes drastically
    • whenever you suspect a resource-related problem
How To Retune (overview)
  • run system in production mode for at least 24 hours
  • retune and reboot
  • run the system for 7 days
  • retune and reboot
How To Retune (details)
  • optional: make manual changes in file "sys$system:modparams.dat" (this file is always read by autogen)
  • invoke the tuning script using one of the following:
    • @sys$update:autogen help (to get complete help)
    • @sys$update:autogen yada (to get shorter parameter help)
    • @sys$update:autogen savparams  setparams  check_feedback (if the system has been running for 24 hours)
    • @sys$update:autogen getdata  setparams  check_feedback (if the system has NOT been running for 24 hours)
  • now inspect the autogen report to make sure you haven't made some fatal errors:
    • $type/page sys$system:agen$params.report
  • now reboot like so:
    • @sys$update:autogen reboot

Manually Retuning The System (via SYSGEN)

Overview and Warnings
  1. Warning: manual tuning without AUTOGEN is for experts only. I have witnessed two occasions where someone (not me) made system parameter changes to OpenVMS which resulted in the system being unbootable. The only way to fix this is to do a "conversational boot" of VMS which drops the console into SYSGEN on the way up. At this point the operator needs to make manual SYSGEN changes to make things right before continuing with the boot process which happens as soon as you exit SYSGEN.
     
  2. Warning: the safest tuning method is to place your parameter changes in file "sys$system:modparams.dat" then invoke the AUTOGEN script. This will ensure that dangerous settings and/or parameter conflicts are written to the AUTOGEN report file titled "sys$system:agen$params.report"
     
  3. That said, there are some situations where it will not be convenient, or possible, to reboot the system. After all, this isn't a PC operating system.
Manual Tuning
  • non-dynamic parameters can be changed by using SYSGEN but they won't take effect until after a reboot.
  • dynamic parameters can be changed on a running system without rebooting. Here is an example:
    $ mcr sysgen
    SYSGEN>  SHO PFRATL
    Parameter Name           Current Default    Min.    Max. Unit       Dynamic
    --------------           ------- ------- ------- ------- ----       -------
    PFRATL                         0       0       0      -1 Flts/10Sec D
    SYSGEN>  SET PFRATL   2
    SYSGEN>  SHO PFRATL     
    Parameter Name           Current Default    Min.    Max. Unit       Dynamic
    --------------           ------- ------- ------- ------- ----       -------
    PFRATL                         2       0       0      -1 Flts/10Sec D
    SYSGEN>  WRITE ACTIVE    ! writes dynamic parameters to the active system in memory
    SYSGEN>  WRITE CURRENT   ! writes all parameters to the system parameter file on disk
    SYSGEN>  EXIT
    $
  • if you don't perform any write commands, all changes you make will be lost
  • if you performed WRITE CURRENT but not WRITE ACTIVE , your changes won't take place until the next reboot
  • if you performed WRITE ACTIVE but not WRITE CURRENT, your changes will take effect immediately but will be lost at the next reboot
  • if you performed WRITE ACTIVE and WRITE CURRENT, dynamic changes will occur immediately and all changes will be put into place on the next reboot
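
The four cases above are easier to keep straight with a toy model of the three copies involved: SYSGEN's work area, the active values in memory, and the current values on disk. The Python sketch below is only a mental model built from the bullets above (the class and method names are hypothetical and it does not reproduce SYSGEN's real behaviour in detail).

```python
class SysgenModel:
    """Toy model of SYSGEN's work area, ACTIVE values and CURRENT values."""

    def __init__(self, active, current, dynamic):
        self.active = dict(active)    # what the running system uses
        self.current = dict(current)  # parameter file on disk, read at boot
        self.dynamic = set(dynamic)   # names that can change without a reboot
        self.work = dict(active)      # SYSGEN's private work area

    def set(self, name, value):
        self.work[name] = value       # SET only touches the work area

    def write_active(self):
        for name in self.dynamic:     # only dynamic parameters change now
            self.active[name] = self.work[name]

    def write_current(self):
        self.current = dict(self.work)

    def reboot(self):
        self.active = dict(self.current)
        self.work = dict(self.active)


start = {"PFRATL": 0, "WSMAX": 100000}
model = SysgenModel(start, start, dynamic={"PFRATL"})
model.set("PFRATL", 2)
model.write_active()
print(model.active["PFRATL"])  # changed immediately
model.reboot()
print(model.active["PFRATL"])  # lost, because WRITE CURRENT was skipped
```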

An example MODPARAMS.DAT from one small system with almost no users 

!
! SYS$SYSDEVICE:[SYS0.SYSEXE]MODPARAMS.DAT
! Created during installation of OpenVMS AXP V7.2-1  8-JUL-1999 12:38:47.03
!
VAXCLUSTER=0             !
SCSNODE="KAWC99"         ! our webserver on the public internet
SCSSYSTEMID=15335        ! 14.999 (14 x 1024 +  999 = 15335)
AUTO_DLIGHT_SAV = 1      !
!
maxprocesscnt   = 64     !
balsetcnt       = 62     ! all our systems MUST HAVE room for at least 40 processes
!
wsmax           = 200000 !
tty_buf         = 132    !
TTY_TYPAHDSZ    = 4096   ! was 78
TTY_ALTYPAHD    = 4096   ! was 200
TTY_ALTALARM    = 2000   ! was 64
! TTY_DEFCHAR2  = 528386 x enable system password on terminals
TTY_DEFCHAR2    = 4098   !

Notes:
  1. MAXPROCESSCNT must be at least 2 units higher than BALSETCNT
  2. BALSETCNT determines how many processes can be resident in RAM at one time. If MAXPROCESSCNT were 99 and BALSETCNT were 33, then as many as 99-33-2=64 processes "could" be swapped out (waiting for a turn from the processor). You can reduce one form of swapping by making sure that MAXPROCESSCNT is exactly 2 above BALSETCNT
  3. Autogen will tune this system to ensure that 64 processes have 200,000 virtual memory pagelets. It will do this by making sure the PAGEFILE is large enough. Remember that individual account settings in UAF may be set up to use much less. This way, some larger processes will now be able to borrow past their WSQUOTA (provided WSEXTENT and Pgflquo are set high enough)
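
The arithmetic in the file comments and in note 2 can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and the formulas are the ones quoted above (area x 1024 + node, and MAXPROCESSCNT - BALSETCNT - 2).

```python
def scssystemid(area: int, node: int) -> int:
    """Convert the area.node address from the file comment to SCSSYSTEMID."""
    return area * 1024 + node


def max_swapped_out(maxprocesscnt: int, balsetcnt: int) -> int:
    """Upper bound on swapped-out processes, per the arithmetic in note 2."""
    if maxprocesscnt < balsetcnt + 2:
        raise ValueError("MAXPROCESSCNT must be at least BALSETCNT + 2")
    return maxprocesscnt - balsetcnt - 2


print(scssystemid(14, 999))     # 15335, matches SCSSYSTEMID in the file
print(max_swapped_out(99, 33))  # 64, the example from note 2
print(max_swapped_out(64, 62))  # 0, the values used in this MODPARAMS.DAT
```

With the values from this file (64 and 62) the result is zero, which is the point of keeping MAXPROCESSCNT exactly 2 above BALSETCNT.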

Fixing a recent problem with MariaDB

legend:	"KAWC96::Neil>" is my DCL prompt on node KAWC96
	<sr>		system response
	<ur>		user response
-------------------------------------------------------------------------------
<sr>	KAWC96::Neil>
<ur>	sho sys/proc=*maria*
<sr>	OpenVMS V8.4  on node KAWC96    1-AUG-2015 09:03:06.97   Uptime  159 11:30:31
	  Pid    Process Name    State  Pri      I/O       CPU       Page flts  Pages
	0000B849 MariaDB_Server  HIB      5*********   3 23:50:55.77  47838479  18366 M 
	KAWC96::Neil>
<ur>	sh proc/id=0000B849/acc
<sr>
	 1-AUG-2015 09:03:30.32   User: MYSQL055_SRV     Process ID:   0000B849
	                          Node: KAWC96           Process name: "MariaDB_Server"

	Accounting information:
	 Buffered I/O count:1632136192  Peak working set size:     300000
	 Direct I/O count:   179876061  Peak virtual size:        1155520
	 Page faults:         47838479  Mounted volumes:                0
	 Images activated:           3
	 Elapsed CPU time:          3 23:50:55.77
	 Connect time:            158 18:03:14.59
	KAWC96::Neil>
-------------------------------------------------------------------------------
<ur>	mcr sysgen
<sr>	SYSGEN>
<ur>	SHO WSMAX
<sr>
	Parameter Name            Current    Default     Min.       Max.   Unit  Dynamic
	--------------            -------    -------   -------    -------  ----  -------
	WSMAX                      300000      32767      1024  134217728 Pagelets   
	 internal value             18750       2048        64    8388608 Pages      
	SYSGEN>  EXIT
	KAWC96::Neil>
-------------------------------------------------------------------------------
<ur>	set def sys$system
<sr>	KAWC96::Neil>
<ur>	r authorize
<sr>	UAF>
<ur>	sho MYSQL055_SRV
<sr>	Username: MYSQL055_SRV                     Owner:  ADMCSM
	Account:  MYSQL055                         UIC:    [37776,5] ([MYSQL055_SRV])
	CLI:      DCL                              Tables: DCLTABLES
	Default:  MYSQL055_ROOT:[MYSQL_SERVER]
	LGICMD:   MYSQL055_ROOT:[VMS]LOGIN.COM
	Flags:  DisCtlY Restricted DisWelcome DisNewMail DisMail DisReport
	              DisReconnect DisPwdDic DisPwdHis
	Primary days:   Mon Tue Wed Thu Fri        
	Secondary days:                     Sat Sun
	No access restrictions
	Expiration:            (none)    Pwdminimum:  8   Login Fails:     0
	Pwdlifetime:         90 00:00    Pwdchange:      (pre-expired) 
	Last Login:            (none) (interactive), 23-FEB-2015 15:00 (non-interactive)
	Maxjobs:         0  Fillm:      1000  Bytlm:       200000
	Maxacctjobs:     0  Shrfillm:      0  Pbytlm:           0
	Maxdetach:       0  BIOlm:      1000  JTquota:       4096
	Prclm:           8  DIOlm:      1000  WSdef:        16384
	Prio:            4  ASTlm:      2200  WSquo:        32768
	Queprio:         4  TQElm:        10  WSextent:     65536
	CPU:        (none)  Enqlm:      5000  Pgflquo:    2000000
	Authorized Privileges: 
	  GROUP        NETMBX       SYSLCK       TMPMBX       WORLD
	Default Privileges: 
	  CMKRNL       GROUP        NETMBX       SYSLCK       TMPMBX       WORLD
	UAF> exit
	%UAF-I-NOMODS, no modifications made to system authorization file
	%UAF-I-NAFNOMODS, no modifications made to network proxy database
	%UAF-I-RDBNOMODS, no modifications made to rights database
	KAWC96::Neil> 

Analysis:

  • the DCL command "$sho proc/acc" provides accounting information on a currently running process
    • This information is also written to the accounting file provided VMS accounting is currently enabled ($set acc/on)
      and data is being collected ($set acc/ena=image/ena=detach or $set acc/ena which defaults to all)
  • facts about this process:
    • has been up for 158 days
    • has consumed almost 4 days of CPU time
    • Peak working set size: 300000
    • Peak virtual size: 1155520
    • page faults: ~ 47.8 M
  • physical memory
    • notice (from sysgen) that WSMAX is set to 300000
    • the peak working set size (300000) is equal to WSMAX, which means that this process has used all the physical memory granted to it by OpenVMS. It could use more but is being limited by WSMAX
  • virtual memory
    • notice (from authorize) that this process has been granted 2 million pagelets in the pagefile (Pgflquo: 2000000) but $sho proc/acc indicates we are only using ~ 1.16 million
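
To put the numbers from the analysis into familiar units, here is a small Python sketch. It assumes a pagelet is 512 bytes and that there are 16 pagelets per page (the SYSGEN output above shows 300000 pagelets as 18750 pages). It also assumes the "Peak" values from $sho proc/acc are in pagelets. The function names are hypothetical.

```python
PAGELET_BYTES = 512      # assumption: one pagelet is 512 bytes
PAGELETS_PER_PAGE = 16   # implied by SYSGEN: 300000 pagelets = 18750 pages


def pagelets_to_pages(pagelets: int) -> int:
    return pagelets // PAGELETS_PER_PAGE


def pagelets_to_mib(pagelets: int) -> float:
    return pagelets * PAGELET_BYTES / (1024 * 1024)


wsmax = 300000           # from SYSGEN
peak_ws = 300000         # Peak working set size
peak_virtual = 1155520   # Peak virtual size
pgflquo = 2000000        # from AUTHORIZE

print(pagelets_to_pages(wsmax))                  # 18750
print(round(pagelets_to_mib(peak_ws), 1))        # 146.5 (MiB of physical memory)
print(round(pagelets_to_mib(peak_virtual), 1))   # 564.2 (MiB of virtual memory)
print(peak_ws >= wsmax)                          # True: the process has hit WSMAX
print(round(peak_virtual / pgflquo * 100, 1))    # 57.8 (% of Pgflquo used at peak)
```

So the process is capped at roughly 146 MiB of physical memory while its virtual size peaked at well under its pagefile quota, which is why the solution below only touches WSMAX.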

Solution:

  • virtual memory
    • ignore virtual memory settings (pagefile-related stuff)
  • physical memory
    • conservative path
      • double the size of WSMAX from 300k to 600k
      • caveats:
        • This single change will most likely increase the sizes of the pagefile and swapfile.
        • If these files are on your system disk then you might wish to inspect the amount of free space before you invoke AUTOGEN.
        • If you are tight for free space then consider reducing both MAXPROCESSCNT and BALSETCNT before you invoke AUTOGEN.
    • liberal path
      • triple the size of WSMAX from 300k to 900k
      • see the caveats listed under the conservative path
  • invoke the autogen script in sys$update
  • inspect the resultant report for errors and warnings
  • reboot if you think it safe to do so
  • before you restart MariaDB (which uses the account MYSQL055_SRV) use authorize to do the following:
    • increase WSQUO to (WSMAX * .5) or (WSMAX * .75) or WSMAX
    • increase WSEXT to WSMAX
    • on most Alpha or Itanium implementations with lots of memory, you should always set WSDEF to WSQUO
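
The candidate account settings for each path can be worked out ahead of time. This is a minimal Python sketch of the arithmetic in the list above; the function name and the dictionary keys are hypothetical, and nothing here modifies SYSUAF.

```python
def working_set_plan(wsmax: int) -> dict:
    """Candidate SYSUAF working set values for a given WSMAX (pagelets)."""
    return {
        "WSEXTENT": wsmax,                                        # WSEXT = WSMAX
        "WSQUOTA_candidates": [wsmax // 2, wsmax * 3 // 4, wsmax],  # 50%, 75%, 100%
    }


for label, new_wsmax in (("conservative", 600_000), ("liberal", 900_000)):
    print(label, working_set_plan(new_wsmax))
```

Whichever WSQUOTA candidate you pick, the last bullet above suggests setting WSDEF to the same value on machines with lots of memory.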

Future

  • after a week of collecting new data consider using AUTHORIZE to change the selected account in SYSUAF.
    • WSEXT affects how much physical memory can be used when the system is lightly loaded
    • WSQUO affects how much physical memory can be used when the system is running normally
    • if you have lots of memory, then certain accounts like APACHE$WWW and MYSQL055_SRV should always have WSDEF set to the same value as WSQUO

A quick word about Alpha and Itanium under OpenVMS-8.4

caveat: newer systems have a lot more memory. For example look at this short list of big systems I've worked on:

Platform            Purchased Memory Size   Final (Upgraded) Memory Size
--------            ---------------------   ----------------------------
VAX 8550            32 MB                   80 MB
AlphaServer DS20e   1 GB                    4 GB
rx2800-i2           64 GB                   64 GB

I do not know when this changed, but tuning OpenVMS-8.4 for Alpha or OpenVMS-8.4 for Itanium is a lot different than tuning VMS-6.2 for VAX

  • In the VAX days you might set the system-wide parameter WSMAX higher, then raise WSDEFAULT, WSQUOTA, and WSEXTENT (in SYSUAF) for the process you wish to grant more physical memory
  • With OpenVMS-8.4 it would appear that setting WSMAX higher also sets various PQL settings higher, and these override settings found in SYSUAF (provided PQL settings are higher)
  • Look at this Alpha where I set WSMAX = 600000 in file SYS$SYSTEM:MODPARAMS.DAT then invoked SYS$UPDATE:AUTOGEN
    • PQL_DWSEXTENT ends up being equal to WSMAX which makes sense; hand out memory when it is available
    • PQL_DWSQUOTA ends up being about 2% of WSMAX
    • PQL_DWSDEFAULT ends up being about 1% of WSMAX
    • these last two will increase slowly based upon observed system stats via the FEEDBACK mechanism as well as other parameters like MAXPROCESSCNT and BALSETCNT. For example, I have seen PQL_DWSQUOTA being 55% of WSMAX on other Alphas
KAWC99::Neil> mcr sysgen
SYSGEN>  SHO WSMAX
Parameter Name            Current    Default     Min.       Max.   Unit  Dynamic
--------------            -------    -------   -------    -------  ----  -------
WSMAX                      600000      32767      1024  134217728 Pagelets   
 internal value             37500       2048        64    8388608 Pages      
SYSGEN>  SHO PQL
Parameter Name            Current    Default     Min.       Max.   Unit  Dynamic
--------------            -------    -------   -------    -------  ----  -------
[snip]
PQL_DPGFLQUOTA             131072     131072      4096         -1 Pagelets   D
 internal value              8192       8192       256         -1 Pages      D
PQL_MPGFLQUOTA              32768       4096      4096         -1 Pagelets   D
 internal value              2048        256       256         -1 Pages      D
[snip]
PQL_DWSDEFAULT               5760       4096      2048         -1 Pagelets   
 internal value               360        256       128         -1 Pages      
PQL_MWSDEFAULT               5760       2048      2048         -1 Pagelets   
 internal value               360        128       128         -1 Pages      
PQL_DWSQUOTA                11520       8192      4096         -1 Pagelets   D
 internal value               720        512       256         -1 Pages      D
PQL_MWSQUOTA                11520       4096      4096         -1 Pagelets   D
 internal value               720        256       256         -1 Pages      D
PQL_DWSEXTENT              600000      32767      8192         -1 Pagelets   D
 internal value             37500       2048       512         -1 Pages      D
PQL_MWSEXTENT              600000       8192      8192         -1 Pagelets   D
 internal value             37500        512       512         -1 Pages      D
[snip]
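
The percentages quoted above can be checked against the listing. A short Python sketch using the "Current" pagelet values shown for KAWC99 (the helper name percent_of_wsmax is hypothetical):

```python
wsmax = 600000  # WSMAX, pagelets

# "Current" values copied from the SHO PQL listing, in pagelets
pql = {
    "PQL_DWSDEFAULT": 5760,
    "PQL_DWSQUOTA": 11520,
    "PQL_DWSEXTENT": 600000,
}


def percent_of_wsmax(value: int, wsmax: int) -> float:
    return round(value * 100 / wsmax, 2)


for name, value in pql.items():
    print(f"{name:15} {value:7} {percent_of_wsmax(value, wsmax):6.2f}% of WSMAX")
```

This gives 0.96% for PQL_DWSDEFAULT, 1.92% for PQL_DWSQUOTA, and 100% for PQL_DWSEXTENT.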

two more points

  • inspect PQL_DPGFLQUOTA, which will raise the page file quota of every process whose SYSUAF setting of Pgflquo is lower. The higher of these two values sets the amount of virtual memory available to each process. Some system managers who tried to reduce Pgflquo in certain accounts (to reduce oversubscription of the pagefile) probably noticed that their changes did nothing
  • On every Alpha and Itanium I've ever looked at, sysgen parameter PFRATL is set to 0 while PFRATH is set to 8
    • With PFRATL set to zero there will be no automatic working set reduction on processes via AWSA. However, the SWAPPER will always trim working sets back to WSdefault or PQL_DWSDEFAULT (whichever is higher). This always happens before a process is swapped out.
    • With PFRATH set to 8 the system will hand out memory very liberally
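
The "whichever is higher" behaviour described in these two points can be modelled in one line. This Python sketch only restates the description above and is not a statement about how the kernel implements it. The function name is hypothetical, the value 50000 is made up, and the other numbers are borrowed from the listings on this page (which come from two different nodes) purely for illustration.

```python
def effective_value(uaf_value: int, pql_value: int) -> int:
    """The higher of the SYSUAF value and the matching PQL value wins."""
    return max(uaf_value, pql_value)


pql_dpgflquota = 131072   # PQL_DPGFLQUOTA from the SHO PQL listing

# An account with a generous Pgflquo keeps its own value
print(effective_value(2_000_000, pql_dpgflquota))   # 2000000

# Lowering Pgflquo below the PQL value (hypothetical 50000) changes nothing
print(effective_value(50_000, pql_dpgflquota))      # 131072

# Swapper trim target: WSdef 16384 vs PQL_DWSDEFAULT 5760
print(effective_value(16_384, 5_760))               # 16384
```

The second call shows why reducing Pgflquo in SYSUAF appears to do nothing once it drops below the PQL value.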

One sysgen command still does not work properly in 2018

I performed this test using SYSGEN on OpenVMS-8.4 I64 (patch kit 1200) on 2018-03-30

The following two sysgen commands DO NOT return the same number of items (the second command returns more items)
KAWC99::Neil> mcr sysgen
SYSGEN> SHO /PQL	! show all PQL parameters by category
[...snip...]
SYSGEN> SHO PQL		! show all PQL parameters by name
[...snip...]

Neil Rieck
Waterloo, Ontario, Canada.