Australia - Updated: 24-SEP-2003

IMPORTANT NOTICE

The online distribution of OpenVMS and related product patches is being migrated to the HP ITRC (Information Technology Resource Center) patch distribution site. The new ITRC patch server will allow OpenVMS customers to take advantage of many enhanced features for patch searching and distribution.

Beginning August 1, 2003, publicly available patches for OpenVMS and related layered products will be available from the HP ITRC web site at

http://itrc.hp.com/service/patch/mainPage.do

The same patches will still be available from the existing patch server in Colorado Springs (http://www.support.compaq.com/patches/) through the end of October 2003, to give customers sufficient time to update their bookmarks and make the transition to the HP ITRC web site.

ECO kits will also be available by raw FTP from ftp://ftp.itrc.hp.com/.

PLEASE UPDATE YOUR BOOKMARKS AND REGISTER ON THE NEW SITE NOW

Note: if you're having trouble connecting to the ITRC site, please delete any cookies for "itrc.hp.com" from your browser and try again. Report any difficulties or suggestions to MrVMS.

Sydney Customer Support Centre OpenVMS ECO information
    Updated: 24-SEP-2003 (Use your browser's Reload button to ensure you're viewing the most recent version)

VMS73_XFC-V0300 (Alpha V7.3) XFC ECO Summary

To obtain this kit, please call the Customer Support Centre or use the FTP site.


    
    
    [OpenVMS] VMS73_XFC-V0300 (Alpha V7.3) XFC ECO Summary
    
    New Kit Date:       04-JUN-2003
    Modification Date:  Not Applicable
    Modification Type:  NEW KIT
    
    Copyright (c) Hewlett-Packard Company 2001,2002,2003.  All rights reserved.
    
    OP/SYS:     OpenVMS Alpha
    
    COMPONENT:  XFC
    
    SOURCE:     Hewlett-Packard Company
    
    ECO INFORMATION:
    
         ECO Kit Name:  VMS73_XFC-V0300
                        DEC-AXPVMS-VMS73_XFC-V0300--4.PCSI
         ECO Kits Superseded by This ECO Kit:  VMS73_XFC-V0200
         ECO Kit Approximate Size:  8832 Blocks
         Kit Applies To: OpenVMS Alpha V7.3
         System/Cluster Reboot Necessary:  Yes
         Rolling Re-boot Supported:  Yes
         Installation Rating:  INSTALL_2
                                 2 : To be installed by all customers using the following
                                     feature(s):  XFC
         Kit Dependencies:
    
           The following remedial kit(s) must be installed BEFORE
           installation of this kit:
    
             VMS73_UPDATE-V0100
    
           In order to receive all the corrections listed in this
           kit, the following remedial kits should also be installed:
    
             None
    
    
    ECO KIT SUMMARY:
    
    An ECO kit exists for XFC components on OpenVMS Alpha V7.3.
    This kit addresses the following problems:
    
    PROBLEMS ADDRESSED IN VMS73_XFC-V0300 KIT
    
    
         O  Multiple XFC bug fixes and enhancements have been made:
    
               -  Files written by a DFS client to a disk drive served by a
                  cluster node can end up with stale data on the cluster
                  nodes not serving the drive.
    
               -  CPU spinwait bugchecks.  Some conditions (large numbers of
                  non-cached I/Os) can result in a very long internal XFC
                  queue.  On very large systems, searching this queue can
                  take 30 or more seconds.  A suggested workaround was to
                  limit the XFC cache to 4 or 5 GB.  This is no longer
                  necessary.

                  XFC was inadvertently using the FILSYS and SCS spinlocks
                  in the wrong order.  The MTAACP (mag tape ACP) also uses
                  both spinlocks, which can result in a deadlock and a
                  subsequent CPUSPINWAIT bugcheck.  This problem does not
                  show up with backup, but only when doing file system
                  access to a tape drive (e.g. copy x.x mta0:) and then
                  only if the timing is just right.

                  It was possible for XFC file truncate processing to take
                  enough time to result in a spinwait bugcheck.
    
               -  Volume depose speedup.  A volume dismount requires that all
                  files in the cache for that volume be deposed from the
                  cache (on the current node).  This operation was operating
                  at about 1 file per second resulting in very long times to
                  free memory.  In addition, the code deposed the first file
                  synchronously which could cause noticeable delays for the
                  dismount.
    
               -  Minimum cache size enforced.  XFC would allow any values
                  for VCC_MAX_CACHE including zero.  The result was either
                  caching being disabled cluster-wide or a memory management
                  bugcheck on the local node during boot.  This fix ensures
                  that about 5 MB of memory is always allocated to XFC
                  allowing the node to boot (there is also a message output
                  on the console).
    
               -  ASSERTFAIL bugcheck copying file to spooled device on
                  standalone nodes.  XFC assumed that all file deletes
                  passed through XFC allowing XFC to properly depose the
                  cache.  On standalone nodes only, this assumption lead to
                  XFC attempting to release a lock it didn't own and
                  crashing with an ASSERTFAIL bugcheck.  This typically
                  showed up while attempting to copy to a spooled device.
                  This does not occur on nodes in clusters.
    
               -  Performance data not being updated.  XFC was not calling
                  routine pms_std$end_rq() prior to completing disk I/Os.
                  This resulted in performance data collectors seeing I/O
                  starts, but not I/O completions.
    
               -  Corrupt LRU queue after truncate.  During I/O completion,
                  XFC cleans up structures associated with the I/O including
                  adjusting positions of extents (ECBs) in the LRU queue.
                  Occasionally, these elements have either been deallocated
                  or used for another I/O which results in a bugcheck.  This
                  is an extremely rare event.  It has been seen at one
                  internal site almost a year ago and at 3 customer sites.
    
                  The XFC truncate code had an implicit assumption that
                  there would not be active I/Os on the file.  The code
                  neglected to account for either XFC readahead I/Os or
                  asynchronous I/Os issued prior to the call to truncate.
                  The XFC truncate code was completely rewritten to properly
                  synchronize with concurrent I/Os to the file being
                  truncated.
    
               -  Public counters overflow.  The XFC public counters used by
                  the DCL command 'SHOW MEMORY/CACHE' were stored in
                  unsigned longwords limiting the maximum counts to
                  approximately 4 billion.  These counters have been
                  increased to unsigned quadwords.  In addition, the public
                  interface to the internal counters (CACHE$GET_STAT()) has
                  been enhanced to return up to 8 bytes of data for each of
                  these counters.
    
               -  ASSERTFAIL bugchecks in XFC lock processing.  If a write
                  happens for a file which is in read sharing mode, XFC
                  attempts to convert the File Arbitration Lock (FAL) from
                  PR mode (caching cluster-wide) to PW mode (caching locally
                  only).  If this conversion fails, then XFC moves the FAL
                  to CW mode and starts a thread to move the FAL back to a
                  caching mode.  This thread is called a FAL up conversion.
                  During this sequence, it was possible for a blocking AST
                  on the FAL to fire.  It would also lead to a FAL up
                  conversion being started.  If the timing were just right,
                  then two FAL up conversions could be in progress.  One of
                  the two would find the FAL in the wrong state and bugcheck
                  (ASSERTFAIL).
    
               -  ASSERTFAIL in routine XfcLockIsFALHeld () or
                  XfcLockReleaseFALViaEX ().  Under some conditions, it was
                  possible that a file truncate operation could happen while
                  an I/O was in progress.  The truncate operation would
                  leave data in the cache for the file, but with the XFC file
                  arbitration lock in a state not allowing valid data.  XFC
                  crashed with an ASSERTFAIL bugcheck when this
                  inconsistency was discovered.  This has been fixed by a
                  complete rewrite of the XFC truncate processing.
    
               -  Volume and file latencies incorrectly calculated.  XFC
                  provides statistics on average access latencies (via the
                  XFC SDA extensions and the CACHE$GET_STATVOL system
                  service).
    
                  It does this by accumulating the total latency times as
                  the accesses are completed and then, when the average is
                  requested, dividing by the number of accesses.
                  Unfortunately, the access counts include accesses for
                  which the latency could not be determined (because the
                  access began on one CPU and finished on another:  the
                  per-CPU cycle counter is used to determine the elapsed
                  time) and, therefore, were not included in the accumulated
                  latency value.  XFC's statistics gathering fields already
                  include counts of the accesses not counted in the latency
                  accumulations.  So, the change is to include those counts
                  in the calculations.
    
               -  Improved performance of non-cached I/Os.  XFC was adding
                  overhead to I/Os which weren't being cached - for example
                  very large I/Os (6000 blocks).  This extra overhead has
                  been removed.
    
               -  XFC SDA extension enhancements
    
                  1.  Help for XFC SDA extension has been updated.
    
                  2.  The SDA command XFC SHOW FILE now displays the
                      file name.  In addition, the output of the SDA command
                      XFC SHOW FILE /BRIEF is sorted by volume.
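
The latency correction described in the list above can be illustrated with a small arithmetic sketch in Python. This is not XFC code; the function and parameter names are hypothetical. It assumes the fix amounts to dividing the accumulated latency only by the number of accesses whose latency was actually measured, which is one reading of "include those counts in the calculations".

```python
def average_latency_old(total_latency, access_count):
    """Pre-fix behavior as described: divide by every access, timed or not."""
    if access_count == 0:
        return 0.0
    return total_latency / access_count


def average_latency_fixed(total_latency, access_count, untimed_count):
    """Assumed fix: exclude accesses whose latency could not be measured
    (for example, started on one CPU and finished on another)."""
    timed = access_count - untimed_count
    if timed <= 0:
        return 0.0
    return total_latency / timed


# Hypothetical figures: 10 accesses, 2 of them untimed, 800 cycles accumulated.
print(average_latency_old(800, 10))       # 80.0  (understates the latency)
print(average_latency_fixed(800, 10, 2))  # 100.0
```

With untimed accesses in the denominator, the reported average is lower than the latency actually observed for the timed accesses.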
    
    
    
              Images Affected:[SYSLIB]ALPHA_XFC$SDA.EXE
                              [SYS$LDR]SYS$XFCACHE.EXE
                              [SYS$LDR]SYS$XFCACHE_MON.EXE
                              [SYSLIB]XFC$SDA.EXE
                              [SYS$LDR]SYS$XFCACHE.DSF
                              [SYSLIB]SYS$XFCACHE.STB
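
The public counter overflow listed among the fixes above comes down to fixed-width unsigned arithmetic. The following Python sketch is illustrative only; the helper name and masks are assumptions and are not part of XFC or the CACHE$GET_STAT() interface. It shows why a longword counter wraps at roughly 4 billion while a quadword counter does not.

```python
LONGWORD_MASK = 0xFFFFFFFF            # 32-bit unsigned longword
QUADWORD_MASK = 0xFFFFFFFFFFFFFFFF    # 64-bit unsigned quadword


def bump(counter, increment, mask):
    """Add to a fixed-width unsigned counter, wrapping on overflow."""
    return (counter + increment) & mask


near_limit = 0xFFFFFFFF - 1  # 4294967294, just under the longword maximum

print(bump(near_limit, 5, LONGWORD_MASK))  # 3 (the counter has wrapped)
print(bump(near_limit, 5, QUADWORD_MASK))  # 4294967299
```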
    
    
    
    
    
    
    
    PROBLEMS ADDRESSED IN VMS73_XFC-V0200 KIT
    
    
         O  Because of numerous problems reported  against  XFC  in  V7.3,
            customers  were instructed to disable XFC for V7.3 until these
            issues could be addressed.  This kit fixes all  problems  with
            XFC  reported  by customers as of 16 July 2002.  Once this kit
            is installed XFC can be  safely  re-enabled.   Note  that  all
            these issues were corrected for OpenVMS V7.3-1.
    
            Specific issues addressed are:
    
              1.  Process or System hangs
    
                   o  XFC internal structures describing cached files  could
                      be deleted while an active operation for that file was
                      stalled.   The  stalled  operation  would   never   be
                      restarted.   This would result in processes being left
                      in RWAST with other processes waiting for  release  of
                      file system locks.
    
                   o  If a blocking  AST  was  processed  on  the  XFC  File
                      Arbitration  lock  while  a readahead was in progress,
                      the readahead would be dismissed, but the blocking AST
                      was  not  restarted.   This would result in subsequent
                      I/Os to this file stalling.
    
                   o  In a cluster with both VIOC and XFC nodes, XFC  queues
                      a  NL  (null) to CW (concurrent write) lock conversion
                      without specifying a lock conversion  priority.   This
                      results  in  a  deadlock.  This is often seen with the
                      file  arbitration  lock  for  SYSUAF.DAT,  making   it
                      impossible to log into the cluster.
    
                   o  Under some conditions, a readahead I/O is not properly
                      cleaned  up  after  an I/O completes with an error.  A
                      subsequent close of the file will  hang,  waiting  for
                      the readahead to complete.
    
                   o  If a file lock transition is in progress at the time a
                      deaccess  was started, it is possible for the deaccess
                      not to  be  restarted.   This  results  in  a  process
                      hanging in RWAST state.
    
                   o  Under very low memory conditions, XFC could get into a
                      state in which there was not enough memory to make any
                      progress on I/Os.   This  fix  increases  the  default
                      amount  of  memory allocated to XFC at boot time to be
                      sufficient to make progress regardless of  how  little
                      free memory is available at any time.
    
                   o  The server process for host-based  RAID  can  hang  in
                      mutex  wait  state.  This is caused by the logical I/O
                      processing of XFC not recognizing that it is  safe  to
                      allow the logical I/O to proceed.
    
    
              2.  Stale data in files
    
                  In a cluster, there are several situations under which XFC
                  could  leave  stale  data  in  the  cache for a file.  The
                  symptom  would  be  that  a  file  would  appear  to  have
                  different  data  when  viewed  from  different  nodes in a
                  cluster.
    
                   o  File truncate processing would leave data in the cache
                      for  a  file, and the file arbitration lock in a state
                      only appropriate for no valid data.
    
                   o  Under some conditions, QIO write updates to files read
                      using paging I/O would not be seen on other nodes in a
                      cluster.
    
    
              3.  File corruption
    
                   o  Large  files  copied  from  IDE  CD  drives  could  be
                      corrupt.   This  is the result of XFC not honoring the
                      maximum I/O size specified by the driver.
    
                   o  Updates to files on a DFS-served device are not  being
                      seen.   The  DFS  server  bypasses the cache for write
                      I/Os.  XFC was  missing  these  writes  and  therefore
                      leaves stale data in the cache for these files.
    
                   o  An example program that uses fastio to implement  file
                      copy  fails  to  copy  the last block of a file copied
                      from a device that did not support fastio  (e.g.   IDE
                      CD or a RAM disk).
    
    
              4.  System Crashes
    
                   o  Under heavy loads, XFC would use a  fork  block  twice
                      resulting  in  a corrupt fork queue and a system crash
                      (typically a system service exception).
    
                   o  If a blocking AST for a file arbitration lock fired at
                      exactly  the  same time as that file was being deposed
                      for dismount, then XFC would eventually bugcheck  with
                      either  an XFC ASSERTFAIL bugcheck or a system service
                      exception.
    
                   o  In a cluster, XFC could retain data for a  file  on  a
                      node  that  was  caching  the  file  when the file was
                      deleted on a node not caching the file.  This resulted
                      in  stale  data  in  cache after the file was deleted.
                      Under some conditions, XFC would call the file  system
                      with  the  old,  incorrect  highwater mark information
                      resulting in an XQPERR bugcheck.
    
                   o  Readahead I/Os are sometimes issued after a  file  has
                      been  deaccessed.   If the readahead requires either a
                      file system mapping operation or a window  turn,  then
                      the  system  could  crash  with  either a NOTWCBIRP or
                      STRNOTWCB bugcheck.
    
                   o  XFC uses an uninitialized variable as the saved IPL to
                      a  spinlock  release  routine.   On  some  systems (in
                      particular AS400),  this  results  in  a  machinecheck
                      bugcheck.
    
                   o  XFC would crash attempting to do reads  or  writes  to
                      devices not supporting fastio.
    
    
              5.  Halt during boot.
    
                  If XFC detects system configurations that will  not  allow
                  XFC   to  run  properly,  XFC  will  halt  the  CPU  [with
                  PAL_HALT()].  This causes  CPUSPINWAIT  bugchecks  on  SMP
                  systems  that  take  extra  effort  to  diagnose.  XFC now
                  prints a message on the console and adjusts the parameters
                  MPW_HILIMIT and FREEGOAL to allow booting.
    
              6.  Set time flushes cache for system disk.
    
                  The $settime() system service would result in XFC flushing
                  all cached data for the system disk.
    
              7.  RAM disk caching state incorrect.
    
                  By default, XFC mounts RAM disks NOCACHE.   However,  this
                  was not happening in some circumstances.  XFC now disables
                  XFC caching for locally mounted RAM disks.
    
              8.  RAM disk performance enhancement.
    
                  I/O to a RAM disk is noticeably slowed when XFC is  turned
                  on  for  a  system even if the disk were mounted /NOCACHE.
                  This has now been corrected.
    
              9.  File sharing performance enhancement.
    
                  If an application writes to a file open on multiple  nodes
                  in  a  cluster,  XFC  stops caching for that file.  In the
                  past, XFC would not resume caching for that file  even  if
                  all  the  write  accessors  closed the file.  XFC will now
                  move a file back to caching when write accessors close the
                  file.
    
             10.  Cache hit performance improvement.
    
                  The performance of cache hits has been improved.
    
             11.  Spurious XFCACHE-W-DATALOSS messages to OPCOM.
    
                  XFC incorrectly assumes that reads  beginning  beyond  the
                  file  high  water  mark  are an integral number of blocks.
                  This results in the following problems:
    
                   o  XFCACHE-W-DATALOSS messages to OPCOM when the XFC  I/O
                      completion  code  discovered  that the number of bytes
                      copied did not equal the number of bytes requested.
    
                   o  Potential corruption in user program  space  when  the
                      bytes beyond the end of the user buffer were zeroed.
    
    
             12.  The system can crash with an  ASSERTFAIL,  "System  ASSERT
                  failure detected" bugcheck.
    
                  Crashdump Summary Information:
                  ------------------------------
                  Bugcheck Type:     ASSERTFAIL, System ASSERT failure detected
                  Current Process:   CTM$_0005000C
                  Current Image:     $3$DKC203:[SYS1.SYSCOMMON.][SYSEXE]COPY.EXE
                  Failing PC:        FFFFFFFF.802EC6E0    XFCLOCKISFALHELD_C+00D60
                  Failing PS:        30000000.00000804
                  Module:            SYS$XFCACHE    (Link Date/Time: 11-JUN-2002 20:07:42.49)
                  Offset:            000086E0
    
                  The bugcheck might be seen doing a backup  copy  operation
                  on  large,  busy systems.  It might also be seen on memory
                  tight systems doing DCL copies.
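
Item 11 above (spurious XFCACHE-W-DATALOSS messages) can be pictured with a short Python model. This is a sketch under stated assumptions, not the XFC implementation: the function names are hypothetical, memory is modelled as a bytearray whose first 100 bytes are the user buffer, and the 512-byte disk block size is the only OpenVMS fact used.

```python
BLOCK_SIZE = 512  # bytes per disk block


def zero_fill_rounded(memory, offset, nbytes):
    """Flawed model: assume the read is a whole number of blocks and
    zero that many bytes, even if the caller asked for fewer."""
    rounded = -(-nbytes // BLOCK_SIZE) * BLOCK_SIZE   # round up to a block
    end = min(offset + rounded, len(memory))
    memory[offset:end] = bytes(end - offset)
    return end - offset


def zero_fill_exact(memory, offset, nbytes):
    """Corrected model: zero only the bytes that were requested."""
    memory[offset:offset + nbytes] = bytes(nbytes)
    return nbytes


# 100-byte user buffer followed by unrelated data (0xAA filler).
memory = bytearray(b"\xAA" * 1024)
copied = zero_fill_rounded(memory, 0, 100)
print(copied == 100)        # False: byte count mismatch (the DATALOSS symptom)
print(memory[100] == 0xAA)  # False: bytes past the user buffer were zeroed

memory = bytearray(b"\xAA" * 1024)
copied = zero_fill_exact(memory, 0, 100)
print(copied == 100)        # True
print(memory[100] == 0xAA)  # True: neighbouring data is untouched
```

The two printed mismatches in the flawed model correspond to the two symptoms listed in item 11: the byte-count discrepancy and the overwrite beyond the end of the user buffer.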
    
    
                Images Affected:[SYS$LDR]SYS$XFCACHE.EXE
                                [SYS$LDR]SYS$XFCACHE.DSF
                                [SYS$LDR]SYS$XFCACHE.STB
                                [SYS$LDR]SYS$XFCACHE_MON.EXE
                                [SYS$LDR]SYS$XFCACHE_MON.DSF
                                [SYS$LDR]SYS$XFCACHE_MON.STB
                                [SYSLIB]XFC$SDA.EXE
    
    
    
    
    KIT INSTALLATION RATING:
    
         The following kit installation rating, based upon current CLD
         information, is provided to serve as a guide to which customers
         should apply this remedial kit.  (Reference attached Disclaimer of
         Warranty and Limitation of Liability Statement)
    
         INSTALLATION RATING:
    
           2 : To  be  installed  by   all  customers  using  the  following
               feature(s):  XFC
    
    
    INSTALLATION INSTRUCTIONS:
    
         This kit requires a system reboot.  HP strongly recommends that a
         reboot be performed immediately after kit installation to avoid
         system instability.
    
         If you have other nodes in your OpenVMS cluster, they must also be
         rebooted in order to make use of the new image(s).  If it is not
         possible or convenient to reboot the entire cluster at this time, a
         rolling re-boot may be performed.
    
         Install this kit with the POLYCENTER Software installation utility
         by logging into the SYSTEM account, and typing the following at the
         DCL prompt:
    
         PRODUCT INSTALL VMS73_XFC /SOURCE=[location of Kit]
    
         The kit location may be a tape drive, CD, or a disk directory that
         contains the kit.
    
         Additional help on installing PCSI kits can be found by typing
         HELP PRODUCT INSTALL at the system prompt.
    
    
    Special Installation Instructions:
    
         Scripting of Answers to Installation Questions
    
              During installation, this kit will ask and require user
              response to several questions.  If you wish to automate the
              installation of this kit and avoid having to provide responses
              to these questions, you must create a DCL command procedure
              that includes the following definitions and commands:
    
               -  $ DEFINE/SYS NO_ASK$BACKUP TRUE
    
               -  $ DEFINE/SYS NO_ASK$REBOOT TRUE
    
               -  Add the following qualifiers to the PRODUCT INSTALL
                  command and add that command to the DCL procedure.
    
                    /PROD=DEC/BASE=AXPVMS/VER=V3.0
    
    
               -  De-assign the logicals assigned
    
              For example, a sample command file to install the
              VMS73_XFC-V0300 kit would be:
    
              $
              $ DEFINE/SYS NO_ASK$BACKUP TRUE
              $ DEFINE/SYS NO_ASK$REBOOT TRUE
              $!
              $ PROD INSTALL VMS73_XFC/PROD=DEC/BASE=AXPVMS/VER=V3.0
              $!
              $ DEASSIGN/SYS NO_ASK$BACKUP
              $ DEASSIGN/SYS NO_ASK$REBOOT
              $!
              $ exit
    All trademarks are the property of their respective owners.
      
      ==========================================================================
      |                     Table of Kit Image Information                     |
      +----------------------------+----------+-----------------+--------------+
      |                            | Overall  | Image File      | Image Link   |
      | Image Name                 | Checksum | Identification  | Date/Time    |
      +----------------------------+----------+-----------------+--------------+
      | XFC$SDA.EXE                | D1900A8F | V1.0            | 7-MAR-2003   |
      |                            |          |                 | 15:31:52.34  |
      +----------------------------+----------+-----------------+--------------+
      | SYS$XFCACHE.EXE            | 7CBF1E46 | V1.0            | 7-MAR-2003   |
      |                            |          |                 | 15:29:50.83  |
      +----------------------------+----------+-----------------+--------------+
      | SYS$XFCACHE_MON.EXE        | 57F4D29E | V1.0            | 7-MAR-2003   |
      |                            |          |                 | 15:30:50.01  |
      +----------------------------+----------+-----------------+--------------+
      | XFC$SDA.EXE                | 3CB2A1D9 | V1.0            | 7-MAR-2003   |
      |                            |          |                 | 15:31:47.50  |
      +----------------------------+----------+-----------------+--------------+
    
    