Australia - Updated: 24-SEP-2003

IMPORTANT NOTICE

The online distribution of OpenVMS and related product patches is being migrated to the HP ITRC (Information Technology Resource Center) patch distribution site. The new ITRC patch server will allow OpenVMS customers to take advantage of many enhanced features for patch searching and distribution.

Beginning August 1, 2003, publicly available patches for OpenVMS and related layered products will be available from the HP ITRC web site at

http://itrc.hp.com/service/patch/mainPage.do

The same patches will still be available from the existing patch server in Colorado Springs (http://www.support.compaq.com/patches/) through the end of October 2003, to give customers sufficient time to update their bookmarks and make the transition to the HP ITRC web site.

ECO kits will also be available by raw FTP from (ftp://ftp.itrc.hp.com/).

PLEASE UPDATE YOUR BOOKMARKS AND REGISTER ON THE NEW SITE NOW

Note: if you're having trouble connecting to the ITRC site, please delete any cookies for "itrc.hp.com" from your browser and try again. Report any difficulties with or suggestions to MrVMS

Sydney Customer Support Centre OpenVMS ECO information
    Updated: 24-SEP-2003 (Use your browser's Reload button to ensure you're viewing the most recent version)

ALPSHAD09_061 Alpha V6.1 Volume Shadowing ECO Summary

To obtain this kit please call the Customer Support Centre or use the FTP site


    
    
    
    Copyright (c) Digital Equipment Corporation 1994, 1997.  All rights reserved.
    
    PRODUCT:    Volume Shadowing for OpenVMS Alpha
    
    OP/SYS:     OpenVMS Alpha
    
    SOURCE:     Digital Equipment Corporation
    
    ECO INFORMATION:
    
         ECO Kit Name:  ALPSHAD09_061
         ECO Kits Superseded by This ECO Kit:  ALPSHAD07_061
                                               AXPSHAD06_061   (AXPSHAD)
                                               AXPSHAD04_061
                                               AXPSHAD02_061   (CSCPAT_2045)
                                               AXPSHAD01_061
                                               AXPSHAD01_015
                                               ALPSYS12_061
         ECO Kit Approximate Size:  5904 Blocks
         Kit Applies To:  OpenVMS Alpha V6.1, V6.1-1H1, V6.1-1H2
         System Reboot Necessary:  Yes
    
    
    NOTES:  If you install the ALPSHAD09_061 remedial kit, ALPSYS17_061
            should also be installed on your system in order to avoid a
            problem in PROCESS_MANAGEMENT which could cause your system
            to hang.  This problem will be fixed in a future version of
            the ALPSHAD remedial kit.
    
            When you install the ALPSHAD09_061 remedial kit you must also
            install the ALPSHAD10_061 or later remedial kit before rebooting
            your system.  If you install the ALPSHAD09_061 kit without also
            installing ALPSHAD10_061 or a later SHADOW kit, your system may
            experience the MERGE problem or the SHADZEROMBR bugcheck problem,
            both of which were resolved in the ALPSHAD10_061 remedial kit.
    
            Future OpenVMS Alpha V6.1 kits that are issued for facilities
            included in the ALPSHAD09_061 kit will not install unless the
            ALPSHAD09_061 kit is installed on your system first.  It is
            highly recommended that the complete ALPSHAD09_061 remedial kit
            be installed as soon as possible.  Installation of individual
            images from the ALPSHAD09_061 remedial kit is not supported and
            could result in unpredictable system behavior.
    
            If you have a mixed-architecture cluster, and have not
            previously installed a shadowing kit, you must install this kit
            on the Alpha nodes as well as the VAX version of this kit on
            the VAX nodes of the cluster BEFORE you bring up both types of
            systems in a cluster again.  If both kits are not installed,
            you may not be able to create shadow sets.
    
            If you have previously installed a shadowing kit then you do
            not need to install the VAX version of this kit at this time as
            long as the shadowing kit installed on the VAX nodes of the
            cluster is VAXSHAD04_061 or later.
    
            Working configurations that contain SCSI shadow sets on
            dissimilar controllers may no longer work.
    
    
    ECO KIT SUMMARY:
    
    An ECO kit exists for Volume Shadowing on OpenVMS Alpha V6.1 through
    V6.1-1H2.  This kit addresses the following problems:
    
    Problems Addressed in the ALPSHAD09_061 Kit for OpenVMS Alpha V6.1,
    V6.1-1H1, V6.1-1H2:
    
      o  Shadowing crash immediately upon booting system with shadowed system
         disk, in SHSB$READ_SCB.
    
      o  A two member shadowset with member index 0 a copy target and index 1
         the only source member experiences a node failure on a node serving
         the disks.  The source member goes "available". The source index is
         never PACKACKed (Packet Acknowledgment) and the system remains with
         the set hung in mount verification forever.
    
      o  If Shadowing tries to mark a block bad on all disks because it is
         bad on the source(s) and encounters an error, it may return an
         incorrect status to the user.  The status will be SS$_NORMAL for
         MSCP devices and may be SS$_UNSUPPORTED for non-MSCP devices (as
         determined by routine SHSB$CHECK_MSCP).  An SS$_NORMAL status is
         misleading because it indicates that all blocks were correctly
         marked bad, and SS$_UNSUPPORTED does not appear to be a valid
         return status for shadowing I/Os.
    
      o  Removing a Disk Copy Data (DCD) copy target and adding it back again
         causes the source of the DCD copy to change.  This can cause the
         copy to be non-assisted if the alternate source isn't on the same
         controller.
    
      o  If a DCD copy is interrupted by a mini-merge, the copy will restart
         at 0% copied (LBN 0) rather than continuing from where it left off.
         DCD copies should restart at the last copied LBN after being
         interrupted by a mini-merge.
    
      o  Failures to start copies or restart copies, usually after a
         node halt, shutdown or reboot.  Additional symptoms observed include
         inconsistent values for HBS_CIP when compared to SHADOW_MAX_COPY,
         negative values for HBS_CIP, and copies that should have continued
         being started over from the beginning.
    
      o  Demote CMPL to CMPW for #SS$_* to prevent incorrect status handling.
    
      o  TPU would output SPR text if a user pressed CTRL/C during the
         compile of TPU code that contained errors.  Users often do this when
         they accidentally try to compile non-TPU code or their procedure has
         many coding errors in it.
    
         This problem is corrected in OpenVMS Alpha V6.2.
    
      o  If a three member Shadowset has its index zero member as a copy
         target and all three members also require a MERGE, then when the
         COPY completes the MERGE does not take place.  The LBN for the just
         completed COPY (the last LBN on the disk) is passed as the MERGE
         starting LBN.  So it completes without doing any IO.
    
      o  When MONITOR is run on a terminal with more than 24 lines, MONITOR
         still uses only 24 lines.  For several classes (PROCESS, DISK, and
         CLUSTER), it would be nice if MONITOR could use the additional
         lines.  This ECO provides support for the PROCESS class - the one
         that could use it most.
    
         This feature was provided in OpenVMS Alpha V6.2.
    
      o  Specifying MONITOR RMS with the /PERCENT qualifier will cause
         MONITOR to unexpectedly terminate with an ACCVIO.
    
         This problem is corrected in OpenVMS Alpha V6.2.
    
      o  Specifying the DISK class to MONITOR can result in unexpected side
         effects to the display.  When the MONITOR DISK command is issued on
         a system with DFS (DECdfs for OpenVMS Systems) devices mounted, only
         the first three characters of the DFS name are displayed correctly.
         Instead of the fourth character, the low byte of the unit number is
         output.  It is often displayed as a non-printable character or as
         an escape sequence (in which case it may cause terminal lock-ups,
         reset terminal characteristics, and so on).
    
      o  Due to an inadequate synchronization mechanism, the MONITOR DISK
         or MONITOR CLUSTER command can go into an infinite loop on
         multi-processor machines.
    
         This problem is corrected in OpenVMS Alpha V6.2.
    
      o  A DCD copy is not always performed when it would be valid to do
         one.  This results in a non-assisted FULL copy operation, which
         takes much longer.
    
      o  Event Flag not set when completion AST also specified on $ENQ.
    
      o  A problem would occur if a satellite were to crash and then attempt
         to boot back into the cluster (in a SCSI CLUSTER). The physical
         device would be unavailable to the satellite so that it would never
         be allowed to boot back into the cluster.
    
         This problem is corrected in OpenVMS Alpha V6.2.
    
      o  On multi-interconnect clusters, there is a window which will allow a
         lock remaster operation to complete without all interested nodes
         pointing to the new master.  This usually results in a number of
         nodes crashing with LOCKMGRERR bugchecks.  The situation is only
         possible after a node CLUEXITs.  Other required conditions are that
         the node which CLUEXITs must have a LOCKDIRWT of zero, such that a
         partial lock rebuild occurs after the CLUEXIT.  If a SS$_NODELEAVE
         error is returned for a node which is to participate in the
         remaster, we must stop the remaster from completing, and allow the
         lock rebuild to clean things up.
    
      o  A SET SECURITY or SET ACL command on volumes in the cluster places
         a high I/O load on the server process.  This exhausts paged pool,
         and AUDIT_SERVER goes into an RWPAG state.
    
         This problem is corrected in OpenVMS Alpha V6.2.
    
      o  A field in the IRP that is used during Volume Processing was not
         initialized in clones of USER IOs.  If an error occurs, the code
         that determines the severity of the error can be misled by data in
         these fields.  It can fail to locate the error and return the IO as
         successful.  Since we also return a zero Byte count the User would
         see an Incomplete Segmented Transfer error.  The fix is to
         initialize the field when the clone is allocated.
    
      o  Listings are sometimes difficult to follow because there are varied
         format conventions used and some comments are misleading or missing.
    
         This problem is corrected in OpenVMS Alpha V6.2.
    
      o  Certain applications calling $AUDIT_EVENT with AST's turned off will
         be interrupted when $AUDIT_EVENT returns to caller.
    
         This problem is corrected in OpenVMS Alpha V6.2.
    
      o  Code relies on page being present when trying to release spinlock
         and if the system is paging heavily, this might not be the case.
    
         This problem is corrected in OpenVMS Alpha V6.2.
    
      o  Repeating wakeups from $SCHDWK show an accumulating drift over time.
    
         ENGINEERING NOTE:  This problem is *not* fixed in this ECO kit.
                            It will be addressed in a future ECO kit.
    
         This problem is corrected in OpenVMS Alpha V6.2.
    
      o  COPY and/or BACKUP of a DISK to a TMSCP-Served TAPE will fail when
         the tape device is placed in a MV state.  The failure does not occur
         if the same task is performed locally.
    
         COPY will fail with: "SYSTEM-F-TAPEPOSLOST, magnetic tape position lost".
    
         BACKUP will fail with:  "-SYSTEM-F-DATALOST, data lost".
    
         This problem is corrected in OpenVMS Alpha V6.2.
    
      o  To transition an OpenVMS process from the virtual balance set to the
         real balance set, the SPTE's (system page table entries) which
         describe its process PTE pages (process page table pages) need to be
         copied from saved memory back into the real balance slot from whence
         they originally came.  This makes the process' P0 and P1 space
         accessible again.  SPTE's for the process page table pages
         describing the undefined area between P0 and P1 must be represented
         by pre-initialized null values (actually, ERKW DZERO-type values).
         When this undefined void area is exactly zero pages (i.e., P0 and P1
         are tangent), the VBSS$READ_OPT2_VBSM routine takes the wrong
         branch, causing a VBSSERR bugcheck.  This fix adds a test for this
         case and takes the correct branch.
    
         This problem is corrected in OpenVMS Alpha V6.2.
    
      o  When a process is switched from a real balance slot to a virtual
         balance slot, the allocation fails, causing a VBSSERR bugcheck.
    
         This problem is corrected in OpenVMS Alpha V6.2.
    
      o  When process quota (BYTLM) is returned to a process for a created
         system global section, the returned quota value is now computed
         correctly.
    
         This problem is corrected in OpenVMS Alpha V6.2.
    
      o  System crashes due to corrupted PTE entries.  The corruption appears
         to be Global Section Table Entries pointing to Global Section
         Descriptors.
    
         The problem occurs only if 3276 GBLSECTIONS is exceeded.  To check
         the number of Global Sections currently in use add the following
         values:
    
           o  SDA> VALIDATE QUEUE EXE$GL_GSDSYSFL !global sections
    
           o  SDA> VALIDATE QUEUE EXE$GL_GSDDELFL !delete pending global sections
    
           o  SDA> VALIDATE QUEUE EXE$GL_GSDGRPFL !group global sections
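    
         As a rough illustration of that check, the sketch below adds three
         queue counts and compares the total with the 3276 figure quoted
         above.  The function names and the sample counts are hypothetical;
         real counts would have to be read from the output of the three SDA
         commands.

```python
# Threshold quoted in the text above; everything else here is illustrative.
GBLSECTIONS_THRESHOLD = 3276


def total_global_sections(system: int, delete_pending: int, group: int) -> int:
    """Add the entry counts reported for the three global section queues."""
    return system + delete_pending + group


def exceeds_threshold(total: int, threshold: int = GBLSECTIONS_THRESHOLD) -> bool:
    """Return True when the total is above the threshold named in the text."""
    return total > threshold


if __name__ == "__main__":
    # Hypothetical counts, not taken from any real system.
    total = total_global_sections(system=3100, delete_pending=12, group=200)
    print(total, exceeds_threshold(total))
```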
    
      o  Devices can remain allocated to processes that no longer exist.  The
         device remains unusable until the system is rebooted.
    
      o  If a previously shadowed disk is mounted with a MOUNT/OVER=SHADOW
         command and a new shadow set is created using this disk, OpenVMS
         Alpha will attempt to create the old shadow set using the old
         physical device names.
    
      o  The system crashes with a NOBVPVCB bugcheck.  The crash occurs on
         the kernel stack with MTAAACP.EXE as the current image.
    
      o  The system crashes with an XQPERR while dismounting a MAD drive.
    
      o  SUBTRACED errors not correctly determined for images installed
         /HEADER_RESIDENT.
    
         This problem is corrected in OpenVMS Alpha V6.2.
    
      o  Users of RDB V6.1 may get ILLIOFUNC errors when doing IO to a Host
         Based Shadowset whose members are served.
    
      o  The user will see a large number of the shadow copies being done by
         OpenVMS rather than the controller, even when both disks are on the
         same controller and the controller has DCD capabilities.
    
      o  System hang when I/Os pending to a shadow set do not complete.
    
      o  In previous shadow kits two new fields were added to the IRP data
         structure for shadow write logging information.  This new IRP
         definition size conflicted with the IRP sizes of other images on the
         system that were not part of the SHADOW kits.  This conflict could
         cause a variety of errors including fatal bugchecks.  This fix
         changes the IRP definitions back to the SSB versions and also adds
         some special definitions to the SHDRIVER for the new IRP fields.
    
      o  Fatal bugcheck from data structure corruption due to the value 10
         HEX being added to the corrupted field.  Crashes are of various
         types including node and cluster crashes, crashes due to invalid UCB
         addresses, invalid VCB addresses, invalid member IDs, invalid number
         of devices etc.
    
    Problems Addressed in the ALPSHAD07_061 Kit for OpenVMS Alpha V6.1,
    V6.1-1H1, V6.1-1H2:
    
    NOTE:  Although this kit contains previous fixes that may be applied
           to OpenVMS Alpha V1.5, beginning with the AXPSHAD06_061 ECO kit,
           there will be no new fixes included for OpenVMS Alpha V1.5.  If
           your system is running OpenVMS Alpha V1.5 and you are experiencing
           the problems listed in the PROBLEMS ADDRESSED IN AXPSHAD06_061 KIT
           FOR OPENVMS AXP V6.1 below, it is strongly recommended that you
           upgrade to OpenVMS Alpha V6.1 as soon as possible.
    
      o  Fatal bugchecks from data structure corruption may occur due to the
         addition of the value 10 HEX to the corrupted field.  Crashes are of
         various types and include node and cluster crashes, crashes due to
         invalid UCB addresses, invalid VCB addresses, invalid member IDs,
         and invalid number of devices.
    
      o  There is a race condition possible when a CFCB (Cache File Control
         Block) is being deleted due to XQP action and cache space is being
         reclaimed from a LIMBO file.
    
      o  Under certain conditions, a fork lock used by the virtual I/O cache
         may be created with an incorrect length.  This results in
         unsynchronized data access which can cause corruption.
    
      o  When a satellite node in a SCSI cluster crashes, the MSCP server
         marks the physical device as offline which prevents the satellite
         node from being able to boot back into the cluster.
    
    Problems Addressed in the AXPSHAD06_061 Kit for OpenVMS Alpha V6.1:
    
      o  Incorrect information in Register 6 and Register 7 causes the system
         to crash with a REGCORDET register corruption bugcheck.
    
      o  If the system manager fails to set the value of the ALLOCLASS SYSGEN
         parameter and then attempts to use shadowing, a shadow volume can be
         created, but new members cannot be added to the shadow set.  No
         error messages are received until an attempt is made to add a second
         member to the shadow set.  Using the following DCL 'MOUNT' command,
         the following error messages appear:
    
              $ MOUNT/SYSTEM DSA500 /SHADOW=DKB400 ALPHAVMS015
              %MOUNT-I-SHDWMEMFAIL, DKB400 failed as a member of the shadow set
              -SYSTEM-F-INCSHAMEM, incompatible shadow set member.
    
         "Incompatible" is not a true statement of the problem.  It is
         actually due to "missing allocation class," or "incorrect allocation
         class."
    
      o  I/O to a shadow set may become stalled if a shadow set member is
         dismounted at the same time from multiple nodes within a cluster.
    
      o  MOUNT will not add shadow set members unless they are either MSCP or
         SCSI.
    
      o  Shadow set member expulsion is currently based on the time it takes
         for a fork and wait and a PACKACK (Packet Acknowledgment) to
         complete rather than the actual time transpired.  On some devices,
         particularly SCSI devices, where a PACKACK can take approximately
         one minute, the timeout was much too long.  Using the default value
         of 20 (seconds) for SHADOW_MBR_TMO would actually mean that it would
         take 20 minutes to expel a member that is experiencing errors from a
         SCSI shadowset.
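    
         The arithmetic behind the 20-minute figure can be sketched as
         follows.  The sketch assumes, as a simplification, that each unit
         of SHADOW_MBR_TMO was consumed by one fork-and-wait plus PACKACK
         cycle instead of by one second of elapsed time; the function name
         is hypothetical.

```python
def expulsion_seconds(shadow_mbr_tmo: int, seconds_per_cycle: float) -> float:
    """Elapsed time when the timeout is counted in cycles, not in seconds.

    Simplified model: one unit of the timeout is used up per
    fork-and-wait plus PACKACK cycle, however long that cycle takes.
    """
    return shadow_mbr_tmo * seconds_per_cycle


if __name__ == "__main__":
    # Default SHADOW_MBR_TMO of 20 with a PACKACK of roughly one minute.
    elapsed = expulsion_seconds(20, 60)
    print(elapsed / 60, "minutes")
```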
    
      o  SHDRIVER loss of synchronization may result in a crash where SHADDETINCON
         is triggered by the check at the end of MATCH_MASTER_SCB.  This
         consistency check expects SHAD$W_DEVSTS_PASSIVE_MV_CNTR to be zero,
         but it is not.  Another symptom is that the virtual unit
         UCB$W_RWAITCNT is zero.  Shadow set member counts of zero may also
         be seen.
    
      o  Crashes may occur in EXPEL_PACKACK_ANY with connections broken to
         all members and IRP$L_SHD_LOCK_FR5 = 1 (packack retries exhausted).
    
      o  All members of a shadow set become inaccessible at the same time and
         remain inaccessible for a period of time greater than "shadow member
         timeout" (SHADOW_MBR_TMO or SHADOW_SYS_TMO) seconds but less than
         MVTIMEOUT seconds.  All members subsequently become accessible
         within seconds of each other but not at exactly the same time.  This
         results in all but one member being expelled from the shadow set.
    
         This often occurs when changing HSJ microcode and all members are
         connected to the same HSJ.  When brought back online, polling will
         cause the devices to be found seconds apart which will result in all
         but one member being expelled.
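    
         The timing window described in this item can be written as a
         simple predicate.  This is only a sketch of the condition as
         stated above, not of the driver logic; the function name and the
         sample values are hypothetical.

```python
def in_expulsion_window(outage_seconds: float,
                        shadow_tmo_seconds: float,
                        mvtimeout_seconds: float) -> bool:
    """True when an outage is longer than the shadow member timeout
    (SHADOW_MBR_TMO or SHADOW_SYS_TMO) but shorter than MVTIMEOUT."""
    return shadow_tmo_seconds < outage_seconds < mvtimeout_seconds


if __name__ == "__main__":
    # Hypothetical values: a 5-minute outage, a 120-second member
    # timeout and a 1-hour mount verification timeout.
    print(in_expulsion_window(300, 120, 3600))
```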
    
      o  All members of the set must be checked to see if they meet the
         criteria of being MSCP.  The original design did not allow for
         having no index zero member.
    
      o  In a cluster, using $PROCESS_SCAN explicitly or implicitly with the
         DCL command, SHOW USER, sometimes causes a system crash due to an
         ACCVIO in kernel mode or an IVSSRVRQST bugcheck.
    
      o  When a node with a SCSI bus boots, it resets the SCSI bus.  In a
         multi-host SCSI cluster, this can cause the other node to experience
         I/O failures.  Normally, this results in a brief mount verification.
         The I/O is retried, succeeds, and there is no serious consequence.
         However, if the other node is in the process of booting and the
         system disk is a shadow set, the system will crash.
    
      o  PGFIPLHI bugcheck in the SHADOW_SERVER process at the REMQUE in
         K_GET_COPYSHAD_IRP.  On OpenVMS Alpha, the PC is A0E and the VA is 274.
    
      o  A double-deallocation crash may occur as the result of MOUNT not
         properly initializing the MTL pointer.  This error causes the
         pointer to have a stale value as a result of 2 calls to SYS$VMOUNT
         from a single program.  The problem will not happen as a result of
         DCL commands, since the cells are initialized at image activation.
         The stale pointer will only cause a problem if the system is unable
         to allocate space for defining the logical name.
    
      o  If a user attempts to mount a disk that is 100% full and the disk
         was originally initialized with a version of OpenVMS Alpha prior to
         the one currently in use, paged pool can be corrupted.  This leads
         to system crashes.  If the disk is filled AFTER it has been mounted,
         there will not be any problem.
    
      o  Tape devices with stacker/loaders, such as the TF857, may take up to
         6 minutes to Rewind/unload/load the next tape.  A change was made to
         the behavior of MOUNT to take this delay into account.  However, a
         side effect of this change is that non-stacker drives may also wait
         6 minutes before failing.
    
      o  Processes may hang in RWNPG state while waiting for a request for
         NPP (non-paged pool) so large that it cannot be satisfied.
    
      o  A system crash may occur with the current process executing a
         $CHKPRO system service call.  This happens when one routine running
         in user mode is interrupted by a KERNEL mode AST which activates a
         routine that uses the same memory.
    
      o  If a multi-programming application uses a non-homogeneous access
         pattern to a file which is resident in Virtual I/O cache, there is a
         possibility that the size returned in the I/O status block from a
         READ operation will be truncated.
    
         If a clustered application uses a large number of concurrent
         processes to perform file operations consisting of an OPEN, WRITE,
         and CLOSE sequence repetitively on the same data file, data
         corruption may occur.
    
         In a multi-programming environment where a significant amount of NEW
         data from a file is being loaded into the cache concurrently by
         multiple processes, the system may HANG.
    
      o  When a value block or value status block can not be returned,
         SYS$GETLKI returns the error SS$_ILLRSDM.  A correction has been
         made to SYS$GETLKI to now return all other requested information
         and update the wildcard search index.
    
      o  The Audit Server EXCLUDE process list becomes corrupt after a
         SET AUDIT/EXCLUDE=pid command is issued.
    
      o  Data corruption may occur in the file container during the use of
         PATHWORKS.  The corruption can be shown by running CHKDSK on the PC
         container disk.  Using PCDISK to IMPORT and EXPORT files to and from
         the container will show corrupted files when EXPORTed back to OpenVMS.
    
    
    Problems Addressed in AXPSHAD04_061 Kit for OpenVMS Alpha V6.1, V6.1-1H1,
    and V6.1-1H2 only:
    
      o  When booting two or more systems simultaneously from shadowed system
         disks, the systems may appear to hang.  Crashing the systems and
         examining the crash dumps indicates that shadowing driver blocking
         AST routines have not run.
    
      o  When a node runs out of SHADOW_MAX_COPY threads while mounting new
         copy target units, other nodes in the cluster that have available
         SHADOW_MAX_COPY threads will not pick up the copy work.  This
         results in the copy not being started for copy members that are
         added to shadow sets.
    
    Problems Addressed in AXPSHAD02_061 Kit for OpenVMS Alpha V6.1, V6.1-1H1,
    and V6.1-1H2 only:
    
      o  While running a UETP tape test, fatal controller errors occur.  This
         problem is caused by the incorrect interpretation of a TUDRIVER
         status subcode by TMSCP (the tape server).  After the installation
         of this ECO kit, a fatal controller error status is returned to the
         user when this occurs.
    
      o  Shadow sets have separate mount verification done by SHDRIVER,
         instead of the usual system mount verification.  The SHDRIVER mount
         verification has an error updating the volume label on shadow sets
         that have the volume label changed except on the node that issues
         the label change.  Once the devices are in this state, they can not
         be recovered until MVTIMEOUT is reached or a reboot of all affected
         nodes is performed.
    
         This correction enables the behavior of virtual units to be
         consistent with the behavior of physical units.
    
      o  Unnecessary calls to MOUNT verification or host-based volume
         shadowing processing may occur.  On Alpha nodes, these mount
         verification or Host-Based Volume Shadowing processing calls will
         fail, resulting in I/O hangs and, eventually, volume invalid errors.
    
      o  AVAILABLE or OFFLINE status returned from a transfer command does
         not implement the MSCP specification correctly.
    
      o  OpenVMS VAX MSCP Parity with OpenVMS Alpha.  A served disk may
         appear to be ONLINE when it is really OFFLINE.  This occurs because
         the MSCP server's CHECK_SERVICE routine searches the device database
         and incorrectly returns an ONLINE status.
    
      o  There is no synchronization between SHADOW_PROCESSING and
         INVALIDATE_ALL_ENTRIES, which allows these two code threads to
         run simultaneously.  This can cause a system crash due to the
         fact that the SHADOW_PROCESSING thread may remove a member from
         a multimember shadow set and the INVALIDATE_ALL_ENTRIES thread
         is not aware that the member has been removed.  The system
         crash occurs in RESTORE_WLE because no Write Log table
         exists.
    
      o  A problem exists with the SHADOW_SERVER.  Several symptoms
         of this problem are:
    
           +  Undiagnosable hangs in individual copy operations or on
              the entire server
    
           +  Unexpected copy aborts
    
           +  Poor copy performance
    
           +  Shadow set inconsistency
    
         An optional new system logical name, SHAD$COPY_BUFFER_SIZE, has
         also been added.  This system logical name can be used to control
         the buffer size of shadow copies.  SHAD$COPY_BUFFER_SIZE has a
         maximum size of 127 blocks (default) and a minimum size of 31
         blocks.  The size can be changed by using the DEFINE/SYSTEM
         command.
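    
         A minimal sketch of the range check implied above: it validates a
         block count against the 31 to 127 limits and builds the matching
         DEFINE/SYSTEM command line.  The helper name is hypothetical, and
         passing the size as a plain decimal equivalence string is an
         assumption that this summary does not confirm.

```python
MIN_BLOCKS = 31   # minimum size stated in the text
MAX_BLOCKS = 127  # maximum size, which is also the default


def define_command(blocks: int) -> str:
    """Return a DCL command line that sets SHAD$COPY_BUFFER_SIZE.

    Raises ValueError when the size is outside the documented range.
    The decimal value format is an assumption, not taken from the kit.
    """
    if not MIN_BLOCKS <= blocks <= MAX_BLOCKS:
        raise ValueError(
            f"buffer size must be {MIN_BLOCKS} to {MAX_BLOCKS} blocks, got {blocks}"
        )
    return f"$ DEFINE/SYSTEM SHAD$COPY_BUFFER_SIZE {blocks}"


if __name__ == "__main__":
    print(define_command(64))
```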
    
      o  High interrupt stack activity occurs on a node performing a merged
         copy operation.  This could adversely affect configurations using
         HSJ40 controllers with many shadow sets.
    
      o  Data inconsistency may exist between members of a Phase II shadow
         set.  This occurs under very heavy I/O operations to a shadow
         set while the members of that shadow set are undergoing failover
         from one controller to another.
    
      o  Invalid Command status processing of Write History Management
         commands unconditionally puts an entry into the error log.
         This occurs even when there is no actual error.
    
      o  A second shadow server may accidentally be created using the
         startup command procedure.  This results in desynchronization
         of shadow sets.  The startup procedure has been modified so
         that it does not allow multiple servers.
    
      o  When a serving node becomes so busy that it occasionally
         exhausts resource limits, the RWAITCNT for heavily used disks
         gets incremented.  If a client node requests an ONLINE and
         RWAITCNT is bumped, it is rejected by MSCP.  This makes
         MOUNTing devices very difficult.
    
      o  After a system failure, the number of blocks to be rewritten
         is not computed correctly.  This may cause inconsistent data
         between shadow set members.  This occurs during an assisted
         merge when the information regarding which LBNs to include
         is only requested from one shadow set member.
    
      o  A process issuing I/O to a TMSCP tape device may appear to
         hang after a controller failover attempt.  This is caused by
         an incorrect check of the cached data's lost error status,
         which results in an endless loop trying to recover a
         nonexistent error.
    
      o  OpenVMS Alpha systems are unable to reboot an MSCP controller,
         such as an HSC.  This might result in stalled pending I/O
         to MSCP or TMSCP devices.
    
      o  A device may be mounted by an MSCP server, even though a local
         controller could be used.  This situation may still occur after
         the installation of this ECO kit under extreme timing circumstances.
    
      o  When new MSCP server I/O is sent to a device that is RWAITCNT
         stalled and the connection from the driver to the device fails,
         server I/O is posted to the restart queue if it is active.  If
         not, the I/O is incorrectly left on the UCB (Unit Control Block)
         pending queue.  This causes shadow sets to appear to be stalled.
    
         If the connection from the client to the server then fails,
         I/O from the client that has been passed to the driver is
         then allowed to complete.  If this I/O is stalled on the
         pending queue, it completes much later, possibly after
         the client has reissued the stalled I/O.
    
      o  I/O to a shadow set might hang because the shadowing
         driver has no way to disable write logging if the write log
         entries are mismanaged or depleted to the point that the
         shadow set is unusable.
    
      o  An Invalid Exception bugcheck might occur in DUDRIVER during
         I/O request complete processing.
    
      o  In the past, MSCP could only serve 256 disks.  It can now
         serve 512.
    
      o  During disk and tape error recovery, MSCP is unable to perform
         a TMSCP controller reset, which results in a system crash.
    
      o  During the processing of a write-log entry in SHDRIVER, a
         register value may be improperly maintained if the system
         is low on nonpaged pool.  This will cause a system crash
         with an INVEXCEPTN Bugcheck within SHSB$GET_WLE_TABLE in
         module SHDSUBS when the entry is resumed.
    
      o  In the past, Volume Shadowing checked device IDs and the
         maximum logical block numbers (LBNs).  Volume Shadowing
         now checks for geometries and maximum LBNs.  This
         enables devices like the RZ28 and RZ28B to operate in
         the same shadow set.  Even though their device IDs differ,
         their geometries and maximum LBNs will match when configured
         on like controllers.
    
         NOTE:  If this remedial kit is installed across a VMScluster
                system, SCSI shadow sets that are configured across
                different controller types are not supported and will
                no longer work.
    
      o  After approximately 18 hours of operation, some OPCOM
         messages that should be logged are skipped.
    
      o  If two members of a three-member shadow set are
         simultaneously removed, either intentionally or in
         a failover situation, the system may hang or fail.
    
      o  System crashes might occur during virtual I/O cache (VIOC)
         expansion under the following circumstances:
    
           +  Multiple processes (or processors) are accessing the same
              file concurrently;
    
           +  The cache space for that file was being expanded;
    
           +  That expansion caused the need for a new hash table
              structure.
    
      o  When subjected to a high I/O load and multiple failures,
         the write logging (minimerge) and shadowing synchronization
         subsystems become unreliable.
    
      o  Unreliable shadow subsystem behavior and shadow-set hangs
         result from VMScluster nodes failing to relinquish shadow-set
         resources.
    
      o  The TMSCP server bugchecks in TMSCP$FIND_UQB when a command
         that refers to a specific unit is processed and that unit
         does not have the Server Local Unit Number (SLUN) bit set.
    
         The fix contained in this ECO kit will cause the bugcheck
         to occur in TUDRIVER instead of the TMSCP server.
    
      o  I/O may stall to a served shadow-set member.  Load balancing
         makes this condition more likely.
    
      o  System crashes may occur during processing of stale I/O in
         Host-Based Volume Shadow Sets.  This I/O does not properly
         reflect changes in shadow set configuration like removal of
         members and changes in the write-logging state.
    
      o  Shadow set members may be inconsistent after the failure
         of a node accessing a shadow set served by an Alpha node.
         The amount of corrupted data depends on previous I/O
         operations to the shadow set.
    
    Problems Addressed in AXPSHAD01_061 Kit for OpenVMS Alpha V6.1 only:
    
      o  In Volume Shadowing for OpenVMS Alpha V6.1, minimerge
         functionality across mixed architecture VMSclusters was disabled.
         In order to reestablish the minimerge functionality, install this
         kit across any VMScluster that contains an OpenVMS Alpha V6.1 node.
    
         After installation of this kit, the entire cluster must be
         rebooted simultaneously.  Rolling upgrades are *NOT* supported.
    
      o  Mounting an RZ28B disk device with an RZ28 in the same
         shadow set is not allowed and will display the following error:
    
         %MOUNT-I-SHDWMEMFAIL, $1$DUA0 failed as a member of the shadow set
         -SYSTEM-F-INCSHAMEM, incompatible shadow set member
    
         This behavior is seen when RZ28/RZ28B shadow set members are
         connected with a local SCSI (Small Computer System Interface)
         controller.
    
         With this kit, RZ28 and RZ28B devices can be combined in a
         shadow set if they are connected to like controllers.
    
         NOTE:  If this kit is installed across a VMScluster, SCSI
                shadow sets configured across different controller
                types are not supported and will no longer work.
    
         VMSclusters with shadowed SCSI disks and mixed-architecture
         VMSclusters running OpenVMS Alpha V6.1 must apply the kit and reboot
         the entire cluster simultaneously, so that the entire VMScluster is
         running the same version of Volume Shadowing software.
    
      o  In a VMScluster (mixed Alpha/VAX environment), shadow sets served to
         the DEC 3000 Model 300 are reported as MEDOFL.  A DCL command, 'SHOW
         DEVICE/SERVED', from a VAX 6000 Model 400 shows the shadow sets as
         AVAILABLE.
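
     The RZ28/RZ28B membership rule described above can be illustrated
     with a minimal sketch.  It is not the Volume Shadowing code; the
     DiskInfo record, the function names, and the geometry and LBN
     numbers are all hypothetical placeholders chosen only to show the
     difference between the old check (device ID and maximum LBN) and
     the new check (geometry and maximum LBN).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskInfo:
    """Hypothetical description of a candidate shadow set member."""
    device_id: str    # for example "RZ28" or "RZ28B"
    geometry: tuple   # (sectors, tracks, cylinders); placeholder values
    max_lbn: int      # highest logical block number; placeholder value


def compatible_before_kit(a, b):
    """Old rule from the text: device IDs and maximum LBNs must match."""
    return a.device_id == b.device_id and a.max_lbn == b.max_lbn


def compatible_after_kit(a, b):
    """New rule from the text: geometries and maximum LBNs must match."""
    return a.geometry == b.geometry and a.max_lbn == b.max_lbn


# Placeholder numbers only; these are NOT real RZ28 or RZ28B parameters.
rz28 = DiskInfo("RZ28", (10, 4, 100), 3999)
rz28b = DiskInfo("RZ28B", (10, 4, 100), 3999)

print(compatible_before_kit(rz28, rz28b))
print(compatible_after_kit(rz28, rz28b))
```

     Under the old rule the differing device IDs alone reject the pair.
     Under the new rule the pair is accepted whenever the geometry and
     maximum LBN agree, which the text says is the case only when the
     devices are configured on like controllers.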
    
    
    Problems addressed in ALPSYS12_061:
    
    o   $AUDIT_EVENT unconditionally enables AST processing when
         certain applications calling $AUDIT_EVENT (with ASTs turned
        off) are interrupted.
    
    o   Image hangs with Kernel mode ASTs disabled after calling
        $AUDIT_EVENT.  The image could not be removed from the system.
    
    
    RELATED ARTICLES:
    
    Detailed articles describing the problems listed above may exist in the
    STORAGE and OPENVMS database(s).  To view these articles, open the
    appropriate product database and perform a query using either of
    the following search strings: 'ALPSHAD09_061' or 'ALPSHAD'.
    
    
    ECO KIT ORDERING INSTRUCTIONS:
    
    If after an evaluation you wish to obtain this kit, request it
    electronically using the appropriate Advanced Electronic Services
    (AES) Service Tool.  If you are not familiar with how to request
    kits electronically, open the DIA, WIS or DSNLINK database and
    review the article entitled:
    
         [AES] How To Electronically Request ECO Kits Using Service Tools
    
    
    INSTALLATION NOTES:
    
    If you are using the Shadowing option, it is highly recommended that
    this kit be installed.
    
      o  When you install the ALPSHAD09_061 remedial kit you must also
         install the ALPSHAD10_061 or later remedial kit before rebooting
         your system.  Installing the ALPSHAD09_061 kit without installing
         the ALPSHAD10_061 or later kit could lead to system instability.
    
      o  Future OpenVMS Alpha V6.1 kits that are issued for facilities
         included in the ALPSHAD09_061 kit will not install unless the
         ALPSHAD09_061 kit is installed on your system first.  It is highly
         recommended that the complete ALPSHAD09_061 remedial kit be
         installed as soon as possible.  Installation of individual images
         from the ALPSHAD09_061 remedial kit is not supported and could
         result in unpredictable system behavior.
    
      o  This kit *MUST* be installed on every Alpha in a mixed-architecture
         VMScluster, and the VAX version of this kit *MUST* be installed on
         every VAX system in the cluster BEFORE any systems are rebooted
         into the VMScluster.  If both kits are not installed, shadow sets
         cannot be created.
    
      o  Working configurations that contain SCSI shadow sets on dissimilar
         controllers may no longer work.
    
      o  VMSclusters with shadowed SCSI disks and mixed-architecture
         VMSclusters running OpenVMS Alpha V6.1 must apply the kit and reboot
         the entire cluster simultaneously.  In these cases, rolling upgrades
         are not supported.
    
    For more information, please see the Problem Description section of the
    Cover Letter/Release Notes supplied with this kit.
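
     The installation rules above can be expressed as a pre-reboot
     checklist.  The sketch below is only an illustration under stated
     assumptions: the cluster inventory is a hand-built list of
     (node name, architecture, installed kits), the kit naming pattern
     ALPSHADnn_061 is inferred from the two kit names in this summary,
     and the VAX kit name is passed in by the caller because this
     summary does not give it.

```python
import re

# Assumed naming pattern, inferred from ALPSHAD09_061 and ALPSHAD10_061.
ALPHA_KIT = re.compile(r"^ALPSHAD(\d+)_061$")


def alpha_node_ready(installed_kits):
    """True if ALPSHAD09_061 and ALPSHAD10_061 or later are both installed."""
    levels = set()
    for kit in installed_kits:
        match = ALPHA_KIT.match(kit)
        if match:
            levels.add(int(match.group(1)))
    return 9 in levels and any(level >= 10 for level in levels)


def nodes_blocking_reboot(cluster, vax_kit_name):
    """Return the names of nodes that are not yet ready to be rebooted.

    cluster is a hypothetical inventory: (name, architecture, kits) tuples.
    vax_kit_name is supplied by the caller; it is not stated in this summary.
    """
    blocking = []
    for name, architecture, kits in cluster:
        if architecture == "Alpha":
            ready = alpha_node_ready(kits)
        else:
            ready = vax_kit_name in kits
        if not ready:
            blocking.append(name)
    return blocking
```

     An empty result means every node has its kits in place, at which
     point the whole cluster can be rebooted together as the notes
     require.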
      
      ==========================================================================
      |                     Table of Kit Image Information                     |
      +----------------------------+----------+-----------------+--------------+
      |                            | Overall  | Image File      | Image Link   |
      | Image Name                 | Checksum | Identification  | Date/Time    |
      +----------------------------+----------+-----------------+--------------+
      | IO_ROUTINES.EXE            | 05B8DD0B | X-3             | 25-AUG-1995  |
      |                                       |                 | 07:12:56.26  |
      +----------------------------+----------+-----------------+--------------+
      | LOCKING.EXE                | 80AF322C | X-3             | 25-AUG-1995  |
      |                                       |                 | 07:16:18.66  |
      +----------------------------+----------+-----------------+--------------+
      | MONITOR_TV.EXE             | DD6FCCCE | X-4             |  7-AUG-1995  |
      |                                       |                 | 12:27:02.01  |
      +----------------------------+----------+-----------------+--------------+
      | MOUNTSHR.EXE               | D1911EAD | ALPHA  X5SC-E5N | 25-AUG-1995  |
      |                                       |                 | 06:43:46.43  |
      +----------------------------+----------+-----------------+--------------+
      | MSCP.EXE                   | CD860CE6 | X-3             | 25-AUG-1995  |
      |                                       |                 | 07:27:50.08  |
      +----------------------------+----------+-----------------+--------------+
      | MTAAACP.EXE                | DF6C9F6B | X-7             | 25-AUG-1995  |
      |                                       |                 | 07:25:30.37  |
      +----------------------------+----------+-----------------+--------------+
      | SECURITY.EXE               | 97131880 | X-5             | 25-AUG-1995  |
      |                                       |                 | 07:10:12.22  |
      +----------------------------+----------+-----------------+--------------+
      | SHADOW_SERVER.EXE          | 8D18C1EC | X-20            | 14-NOV-1995  |
      |                                       |                 | 19:16:26.37  |
      +----------------------------+----------+-----------------+--------------+
      | SPISHR.EXE                 | 1E5C5132 | ALPHA  X5SC-E5N | 25-AUG-1995  |
      |                                       |                 | 07:25:27.94  |
      +----------------------------+----------+-----------------+--------------+
      | SYS$CLUSTER.EXE            | 29D2B9B0 | X-3             | 25-AUG-1995  |
      |                                       |                 | 07:33:34.67  |
      +----------------------------+----------+-----------------+--------------+
      | SYS$DUDRIVER.EXE           | 2727A822 | X-3             | 25-AUG-1995  |
      |                                       |                 | 07:09:09.59  |
      +----------------------------+----------+-----------------+--------------+
      | SYS$SHDRIVER.EXE           | 87ECD92B | X-3             | 14-NOV-1995  |
      |                                       |                 | 19:16:07.44  |
      +----------------------------+----------+-----------------+--------------+
      | SYS$TUDRIVER.EXE           | 7B996B72 | X-3             | 25-AUG-1995  |
      |                                       |                 | 07:19:25.66  |
      +----------------------------+----------+-----------------+--------------+
      | SYS$VCC.EXE                | 0E180CCE | X-3             | 25-AUG-1995  |
      |                                       |                 | 07:17:44.26  |
      +----------------------------+----------+-----------------+--------------+
      | SYS$VCC_MON.EXE            | B55A07C6 | X-3             | 25-AUG-1995  |
      |                                       |                 | 07:18:38.30  |
      +----------------------------+----------+-----------------+--------------+
      | TMSCP.EXE                  | BED32FF1 | X-3             | 25-AUG-1995  |
      |                                       |                 | 07:30:03.16  |
      +----------------------------+----------+-----------------+--------------+
      | VMS$REMEDIAL_ID.EXE        | A8A24B8E | V1.0            | 29-AUG-1995  |
      |                                       |                 | 11:13:21.50  |
      +----------------------------+----------+-----------------+--------------+
      | VPM.EXE                    | 36404649 | X-5             | 25-AUG-1995  |
      |                                       |                 | 07:25:39.00  |
      +----------------------------+----------+-----------------+--------------+
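
     The table above can be turned into a lookup for post-installation
     checks.  The following is a minimal sketch that assumes the table
     text has been saved exactly as shown; the function names are
     illustrative, and how the observed checksums are collected from a
     running system is deliberately left out.

```python
def parse_image_table(text):
    """Map image name -> (checksum, identification, link date, link time)."""
    images = {}
    last = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip rule lines such as +----+ and =====
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) == 4 and cells[0].upper().endswith(".EXE"):
            # First row of an entry: name, checksum, identification, date.
            last = cells[0]
            images[last] = [cells[1], cells[2], cells[3], ""]
        elif len(cells) == 3 and cells[0] == "" and last is not None:
            # Continuation row: only the link time is filled in.
            images[last][3] = cells[2]
    return {name: tuple(values) for name, values in images.items()}


def checksum_mismatches(expected, observed):
    """Return sorted image names whose observed checksum differs or is missing."""
    return sorted(
        name
        for name, values in expected.items()
        if observed.get(name, "").upper() != values[0]
    )


# Two entries copied from the table above, used as sample input.
SAMPLE = """\
+----------------------------+----------+-----------------+--------------+
| IO_ROUTINES.EXE            | 05B8DD0B | X-3             | 25-AUG-1995  |
|                                       |                 | 07:12:56.26  |
+----------------------------+----------+-----------------+--------------+
| MOUNTSHR.EXE               | D1911EAD | ALPHA  X5SC-E5N | 25-AUG-1995  |
|                                       |                 | 06:43:46.43  |
+----------------------------+----------+-----------------+--------------+
"""

expected = parse_image_table(SAMPLE)
print(expected["IO_ROUTINES.EXE"])
```

     Any name returned by checksum_mismatches points to an image that
     does not match this kit, for example because a later kit replaced
     it or because the installation did not complete.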
    