HP Virtual Server Environment Management for Integrity Version 4.0 Release Notes > Chapter 4 Known Issues

Global Workload Manager


Limitations

  • Localization Behaviors   Starting with the 4.0 release, gWLM has been localized into several languages. However:

    • When managing a gwlmagent version prior to 4.0, some error messages are still displayed in English.

    • Events reported to HP SIM are reported in English regardless of the browser locale.

    • A change in the browser locale setting is reflected in:

      • VSEMgmt software at the start of the next user-interface action

      • HP SIM at the next logon

    • The properties files for gwlmagent and gwlmcmsd are parsed as English, regardless of the locale setting. Do not use commas where English would use periods (for example, write 0.5 rather than 0,5).

    • Some items are always in English:

      • Start-up messages from gwlmagent and gwlmcmsd

      • Log files

      • Messages from initial configuration
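
The note above about commas in the gwlmagent and gwlmcmsd properties files can be checked mechanically. The following is a minimal sketch, not an HP-supplied tool: the function name find_decimal_commas is hypothetical, and it assumes the files use plain key=value lines. It lists the lines whose value looks like a number written with a comma so that you can review them.

```shell
# Hypothetical helper (not part of gWLM): print, with line numbers, every
# non-comment line of a properties file whose value is a number written
# with a comma, such as "key=0,5". Returns non-zero if no such line exists.
find_decimal_commas() {
    # $1 = path to a properties file (assumed to use key=value lines)
    grep -n -E '^[^#=]*=[[:space:]]*[0-9]+,[0-9]+[[:space:]]*$' "$1"
}
```

For example, on an HP-UX CMS you might run find_decimal_commas /etc/opt/gwlm/conf/gwlmcms.properties and inspect any lines it reports.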

  • Unable to Manage Partitions with Inactive Cells or Deconfigured Cores  gWLM does not support management of partitions with either inactive cells or deconfigured cores. (gWLM may incorrectly try to allocate and assign those unavailable resources.)

    Workaround  Configure the cores and activate the cells.

  • Unable to Build a Single Shared Resource Domain   The following message might be displayed in the HP SIM interface to gWLM:

    Unable to build a single shared resource domain from the set of specified hosts: myhostA.mydomain.com myhostB.mydomain.com

    Workaround   This message indicates that there are no supported resource-sharing mechanisms available between the specified hosts. The message can occur if:

    • You have specified hosts in different complexes.

    • You have specified hosts in different nPartitions in a complex when there are no iCAP usage rights to share between the nPartitions.

    If you receive this message:

    • Inspect the /var/opt/gwlm/gwlmagent.log.0 files on the indicated managed nodes for error messages.

    • If partitions have been renamed, restarting the agents in the complex might correct the problem.

  • gWLM's Secure Communications Requires Perl  To secure communications on a gWLM managed node, /opt/perl/bin/perl (version D.5.8.0.D or later) must be on the system.

  • Outdated Size for an nPartition When Using Nested Partitions. (This issue applies to gWLM 2.x and 3.x agents.)   When monitoring an SRD with virtual partitions inside an nPartition on an HP-UX 11i v1 system, the monitored size of the nPartition can be outdated.

    Workaround   No action is necessary. Ignore the value displayed for nPartition size in an SRD with nested partitions on an HP-UX 11i v1 system.

  • Compatibility with HP Integrity Virtual Machines   Global Workload Manager version 4.0 is not compatible with HP Integrity VM A.02.00 or earlier. If you want to manage virtual machines using gWLM version 4.0, HP recommends that you upgrade to HP Integrity VM version A.03.00 or later.

    If you do not upgrade, you may see messages such as:

    Unable to deploy SRD 'name': A VM encountered with no size

    or

    Unable to deploy SRD 'name':guestCpuSetEntitlement (): hpvm_nonvm_cpu_set_entitlement (HPVM_NONVM, (100.000000,100.000000),FALSE) failed: (0,90)

    Workaround   Upgrade to HP Integrity VM version A.03.00 or later if possible.

    If you cannot upgrade from HP Integrity Virtual Machines version A.01.20, you must install the gWLM agent version A.02.00.00 on your VM Host.

    If you cannot upgrade from HP Integrity Virtual Machines version A.02.00, either install gWLM agent version A.02.50.00 or use gWLM version 4.0 to manage only virtual machines with entitlements specified in percentages. (That is, do not manage virtual machines with entitlements specified in CPU cycles.)

    To obtain older versions of the gWLM agent, and for assistance with this configuration, contact HP at the following email address: .

  • Compatibility with PRM and WLM   You cannot use gWLM with either Process Resource Manager (PRM) or Workload Manager (WLM) to manage the same system at the same time. Attempts to do so result in a message indicating that a lock is being held by whichever application is actually managing the system. To use gWLM in this situation, first turn off the application holding the lock.

    For PRM, enter the following commands:

    # /opt/prm/bin/prmconfig -d
    # /opt/prm/bin/prmconfig -r

    For WLM, enter the following command:

    # /opt/wlm/bin/wlmd -k

  • Compatibility with Global Instant Capacity   For information on restrictions when using gWLM with Global Instant Capacity, visit http://docs.hp.com/en/vse.html and locate the white paper Using Global Workload Manager with Global Instant Capacity.

  • Rare Incompatibility with Virtual Partitions   Depending on workload characteristics, gWLM can migrate CPU resources rapidly. This frequent migration can potentially, although very rarely, produce a race condition, causing the virtual partition to crash. It can also produce a panic, resulting in one or more of the following messages:

    No Chosen CPU on the cell-cannot proceed with NB PDC.

    or

    PDC_PAT_EVENT_SET_MODE(2) call returned error

    Workaround   Upgrading to vPars A.03.04 resolves this issue.

    With earlier versions of vPars, you can work around this issue as follows: Assign (using path assignment) at least one CPU per cell as a bound CPU to at least one virtual partition. (It can be any virtual partition.) This ensures that there is no redesignation on CPU migrations. For example, if you have four cells (0, 1, 2, 3), each with four CPUs (10, 11, 12, 13), and four virtual partitions (vpar1, vpar2, vpar3, vpar4), you could assign 0/1x to vpar1, 1/1x to vpar2, 2/1x to vpar3, and 3/1x to vpar4, where x is 0, 1, 2, or 3.
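
The assignment pattern in the example above (one bound CPU per cell, each given to a different virtual partition) can be enumerated with a short loop. This is only an illustrative sketch: the function name bound_cpu_plan is hypothetical, the vparN names and cell/CPU numbering are taken from the example, and the script prints a plan rather than changing any vPars configuration. See the vPars documentation for the actual assignment commands.

```shell
# Print one "partition path" pair per cell, pairing CPU 1x of each cell
# (hardware path cell/1x) with a different virtual partition.
# The vparN names and the cell/CPU numbering follow the example in the text.
bound_cpu_plan() {
    # $1 = number of cells, $2 = x, the last digit of the CPU number (0-3)
    cell=0
    while [ "$cell" -lt "$1" ]; do
        echo "vpar$((cell + 1)) $cell/1$2"
        cell=$((cell + 1))
    done
}

# Plan for four cells, using CPU 10 on each cell.
bound_cpu_plan 4 0
```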

  • Upgrade of Partition-Based SRDs Requires Rediscovery   If you are using gWLM and you have either of the following types of partition-based SRDs, and you have upgraded the gWLM agents in the partitions from gWLM A.01.x to gWLM 4.0, you cannot add other partitions in the same complex to the SRD:

    • A vPars-based SRD inside an nPartition

    • An nPartition-based SRD using iCAP

    Workaround   Use the following procedure on the CMS to reestablish the SRD:

    1. With the SRD deployed, rediscover the SRD. For a vPars-based SRD, enter the following command:

      # gwlm discover --type=vpar \
       --file=/tmp/myfile.xml hosts

      For an nPartition-based SRD, enter the following command:

      # gwlm discover --type=npar \
       --file=/tmp/myfile.xml hosts

      In these commands, replace hosts with a space-separated list of the partitions in the SRD.

    2. Make the following adjustments to the /tmp/myfile.xml file, as explained in gwlmxml(4):

      • Ensure that the mode attribute for the sharedResourceDomain element is set to the desired value (Managed or Advisory):

        mode="Managed"

      • Ensure that the interval attribute for the sharedResourceDomain element is set to the desired value:

        interval="x"

      • Ensure that the ticapMode attribute for the sharedResourceDomain element is set to all if you want gWLM to allocate TiCAP when needed:

        ticapMode="all"

      • Ensure that the workloadReference entries in the compartment definitions are correct, and adjust the names in the workload definitions themselves. For example, you might have host.OTHER.2 instead of host.OTHER.

    3. Import the file to re-create the SRD:

      # gwlm import --file=/tmp/myfile.xml --clobber

      Because the SRD was already deployed, the new SRD definition is deployed on import, taking the place of the original SRD.

  • Workloads in gWLM Do Not Follow Associated Serviceguard Packages  With the exception of virtual machines, a workload can be managed by gWLM in only one deployed SRD at a time. As a result, if a workload is directly associated with a Serviceguard package (using the selector in the Workload Definition dialog), gWLM can manage it on only one of the hosts on which it may potentially run. However, management of such a workload may disrupt the Virtualization Manager and Capacity Advisor tracking of the workload utilization between cluster members. Thus, it is recommended that you not directly manage a workload associated with a Serviceguard package.

    Workaround  For all hosts to which a workload associated with a Serviceguard package might fail over, you must apply a policy to an enclosing operating system instance (virtual partition or nPartition). You can use a gWLM conditional policy to change the resource allocation depending on which packages are present. This enables you to control the resource allocation of the enclosing operating system instance and still monitor the workload via Virtualization Manager.

  • Host Name Aliases Are Not Supported   Host name aliases are not supported by gWLM. Only canonical DNS host names (fully qualified domain names) are supported.

    Workaround   Use only canonical DNS names when configuring gWLM through either HP SIM or an XML file used with the gwlm command.

Major Issues

  • Instant Capacity B.11.*.08.03.00.* Incompatible with gWLM  Instant Capacity version B.11.11.08.03.00.* (for HP-UX 11i v1), Instant Capacity version B.11.23.08.03.00.* (for HP-UX 11i v2), and Instant Capacity version B.11.31.08.03.00.* (for HP-UX 11i v3) are not compatible with gWLM.

    Workaround  Upgrade the gWLM-managed system to Instant Capacity version B.11.*.08.03.01.*.

  • Installing a Newer gWLM Agent on an Older CMS Makes System Unsupported   It is possible to install a newer gWLM agent on a CMS that uses an earlier version of gWLM. For example, you can install the A.04.00.07 agent on a system with CMS version A.02.00.00.x. However, this configuration is invalid and leaves the VSE Management CMS software unusable. Starting with gWLM A.04.00.07, the CMS software validates that the agent software version is within the past two major releases and does not exceed the current release.

    Workaround   Update the CMS version. This update also installs the corresponding agent. (Because gWLM requires all managed nodes in an SRD to have the same agent version, you must update the agents on any other managed nodes that could be in an SRD that includes the CMS.) For information about performing this update, see the VSE Management Software Installation and Update Guide.
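
The version rule described above can be illustrated with a small comparison function. This is a sketch of one reading of that rule, not the CMS's actual validation code: the function name agent_version_ok is hypothetical, versions are assumed to have the form A.MM.mm.pp, and "within the past two major releases" is assumed to mean that the agent's major number is at most two lower than the CMS's.

```shell
# Hypothetical check: succeed if the agent version does not exceed the CMS
# version and is at most two major releases older; fail otherwise.
agent_version_ok() {
    # $1 = agent version, $2 = CMS version, both of the form A.MM.mm.pp
    awk -v agent="$1" -v cms="$2" 'BEGIN {
        split(agent, a, "."); split(cms, c, ".")
        # The agent must not be newer than the CMS (major, minor, patch).
        for (i = 2; i <= 4; i++) {
            if (a[i] + 0 > c[i] + 0) exit 1
            if (a[i] + 0 < c[i] + 0) break
        }
        # The agent must be at most two major releases behind the CMS.
        if (c[2] - a[2] > 2) exit 1
        exit 0
    }'
}
```

With this reading, agent_version_ok A.04.00.07 A.02.00.00 fails (the configuration described in this issue), while agent_version_ok A.02.00.00 A.04.00.07 succeeds.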

  • gWLM Fails to Start with Certain Time Zone Settings  gwlmcmsd and gwlmagent can fail to start with certain time zone settings. The following message is displayed in the gwlmagent.log.0 file or the gwlmcmsd.log.0 file when you attempt to invoke either daemon:

    Unable to call method, 'main', with signature, 
    '([Ljava/lang/String;)V', in class, 'com/hp/gwlm/node/Node'.
    Exception in thread "main"
    

    Workaround   Use Java 1.5.0.12 or later.

  • gWLM Commands Core Dump   Attempts to run gwlm commands result in core dumps when /var is full.

    Workaround   Make space available in /var.

  • Unable to Create New Native Thread   A message containing the following text might be displayed:

    ... unable to create new native thread

    Workaround   This problem occurs because the following kernel parameters are set too low:

    • max_thread_proc

      Set max_thread_proc to at least 256.

    • nkthread

      Set nkthread to allow for your max_thread_proc value as well as the number of threads needed by all the other processes on the system.
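
The sizing guidance above is simple arithmetic, sketched below. The function names are hypothetical, and the count of threads used by other processes is a figure you must estimate for your own system.

```shell
# Smallest nkthread that covers one process using max_thread_proc threads
# plus the threads of all other processes on the system.
min_nkthread() {
    # $1 = max_thread_proc value, $2 = threads needed by all other processes
    echo $(( $1 + $2 ))
}

# Succeeds if the given max_thread_proc value meets the minimum of 256.
max_thread_proc_ok() {
    [ "$1" -ge 256 ]
}
```

For example, min_nkthread 256 1200 prints 1456. On HP-UX 11i v2 and later, kernel tunables are normally inspected and set with kctune; check the manpage for your release before changing either value.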

Minor Issues

  • “Force Undeploy” Link Requires gWLM Agents to be Running to Undeploy Certain SRDs   If any gWLM agents in an SRD with nested partitions are not running when you click the Force Undeploy link in the Shared Resource Domain view, you are redirected to the SRD Modify screen so you can restart the gWLM agents and complete a clean undeploy. If the gWLM agents cannot be started or are not reachable, you cannot use the undeploy operation from the gWLM interface in SIM.

    Workaround   Use the gwlm command: gwlm undeploy --srd=SRD_name --force

  • Starting Management of Monitored Workloads with pset Compartments   If you attempt to manage a set of monitored workloads by applying a policy and managing them with pset compartments, you may get the following error when attempting to complete the Workload & Policies step of the Manage Systems & Workloads Wizard:

    The value '0' specified for 'Total Size' must be a positive integer value.

    This message is displayed when you attempt to manage a set of pset compartments that require more cores than are available on the managed node. A pset has a minimum size of one core, so you need at least as many cores as workloads you are attempting to manage. The Total Size field cannot be calculated when there are not enough resources on the system to manage the set of monitored workloads in pset compartments.

    Workaround   You can manage the workloads using compartments based on fss groups (which have a smaller minimum size) or add resources to the partition or SRD to enable the pset minimum size requirements to be met.
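
The pset sizing rule above (one core minimum per pset, so at least as many cores as workloads) can be expressed as a quick pre-check. This is a minimal sketch with a hypothetical function name; the counts are supplied by you.

```shell
# Succeeds if a node has enough cores to give every workload its own pset
# (each pset needs at least one core).
psets_fit() {
    # $1 = number of workloads to place in psets, $2 = cores on the node
    [ "$2" -ge "$1" ]
}
```

For example, psets_fit 5 4 fails, which corresponds to the situation in which the Total Size error is displayed.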

  • Constant Use of TiCAP   Global Workload Manager can activate TiCAP if needed to satisfy SRD policies. To avoid unnecessary consumption of TiCAP, you must have a sufficient number of CPUs with permanent licenses available. If your SRD is larger than this amount, TiCAP is consumed to meet the needs of the SRD.

    Workaround   Deactivate TiCAP resources prior to creating an SRD. Any TiCAP resources that are active at this time are included in the SRD and, therefore, are consumed whenever the SRD is deployed.

  • Cell-Local Processors and iCAP Environment   Using cell-local processors with virtual partitions inside an nPartition that uses Instant Capacity (iCAP) leads to failure of the icod_modify command.

    Workaround   Do not assign CPUs using cell specifications. Consider assigning CPUs to the virtual partitions using a hardware path.

    Alternatively, to use cell-local processors, update to vPars A.04.04 on HP-UX 11i v2 (B.11.23) or to vPars A.05.01 on HP-UX 11i v3 (B.11.31).

  • Multiple SRDs in a Complex Allowed to Use TiCAP   Global Workload Manager allows multiple SRDs in a complex to use TiCAP; it should prevent this situation from occurring.

    Workaround  Do not configure SRDs in this manner.

  • Making a Configuration Change to a Large SRD is Slow   Changes made to the configuration of a large SRD that is deployed might take a long time (several minutes) to take effect.

    Workaround   There is no workaround. The amount of time needed to complete a change depends on the time it takes to communicate with all the compartments in the SRD.

  • Events for gWLM CPU Migration Can Affect HP SIM CMS Performance   The HP products System Fault Management (SFM) and Event Monitoring Service (EMS hardware monitors in particular) generate events, or indications, when CPUs are migrated. Depending on workload characteristics, gWLM can migrate CPUs rapidly. Over time, this frequent migration can result in a high enough number of events that the performance of the HP SIM CMS is adversely affected.

    Workaround   The following options are available as workarounds:

    Option 1

    For systems managed by gWLM that are running HP-UX 11i v3, install the patches PHCO_36126 and PHSS_36078. (These patches are included in the September 2007 Operating Environment Update Release.) A fix to EMS hardware monitors is available with the September 2007 Operating Environment Update Release. Even with these patches and fixes, there is still one event generated for each change in CPU count.

    For systems managed by gWLM that are running HP-UX 11i v2, upgrade to the June 2007 Operating Environment Update Release.

    Option 2

    Upgrade to HP SIM C.05.01.00.01.xx on the CMS. By default, this version of HP SIM does not subscribe to these events, so they do not degrade its performance.

    Option 3

    If you want to subscribe to events, set up automatic event purging in HP SIM.

    For more information about any of these workarounds, see the HP SIM documentation (available from http://www.hp.com/go/hpsim).

  • CMS is Slow to Respond   The CMS is slow to respond.

    Workaround   Time a gwlm list command on the CMS. If it takes more than 10 seconds, perform the following steps:

    1. In the file /etc/opt/gwlm/conf/gwlmcms.properties (HP-UX) or install-path\VirtualServerEnvironment\conf\gwlmcms.properties (Windows), increase the CMS database cache size by increasing the value of the com.hp.gwlm.cms.cachesize property by 25%. (The cache is more memory efficient if the size is near a power of 2. If your target cache size is close to a power of 2, round it up to the next power. For example, if your target cache size is 60,000, round it up to 66,000.)

    2. Stop and restart gwlmcmsd using the following commands.

      NOTE: Stopping gwlmcmsd disables Virtualization Manager and Capacity Advisor.
      # gwlmcmsd --stop 
      # gwlmcmsd
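
The cache sizing arithmetic in step 1 can be sketched as follows. The function names are hypothetical, and the starting value of 48000 used in the example is an assumption chosen so that a 25% increase gives the 60,000 target mentioned in step 1. Note that the exact power of 2 above 60,000 is 65,536; the 66,000 figure in step 1 is a rounded value near it.

```shell
# Increase a cache size by 25%, rounding up to a whole number.
increase_25_percent() {
    echo $(( ($1 * 125 + 99) / 100 ))
}

# Smallest power of 2 that is greater than or equal to the argument.
next_power_of_2() {
    p=1
    while [ "$p" -lt "$1" ]; do
        p=$((p * 2))
    done
    echo "$p"
}
```

For example, increase_25_percent 48000 prints 60000, and next_power_of_2 60000 prints 65536. You would then set com.hp.gwlm.cms.cachesize to a value at or near that result.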
      

  • Deleting Workloads Takes a Long Time   Once a request to delete a workload is issued, it can take a long time (several minutes) to complete the deletion.

    Workaround   Remove old historical monitoring and configuration data from the gWLM database by entering the following command:

    # gwlm history --truncate=<CCYY/MM/DD>

    If you prefer not to trim the database, you can delete multiple workloads simultaneously using the gwlm delete command.

    For more information, see gwlm(1M).
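
Because truncating the history removes data, it can be worth validating the cutoff date before passing it to the command. The following is a minimal sketch with a hypothetical function name. It checks only the CCYY/MM/DD shape and basic month and day ranges, not the number of days in each month.

```shell
# Succeeds if the argument has the form CCYY/MM/DD with a month of 01-12
# and a day of 01-31. Does not check the length of individual months.
is_ccyymmdd() {
    case "$1" in
        [0-9][0-9][0-9][0-9]/[0-9][0-9]/[0-9][0-9]) ;;
        *) return 1 ;;
    esac
    # Extract month and day, stripping one leading zero from each.
    mm=${1#*/}; mm=${mm%/*}; mm=${mm#0}
    dd=${1##*/}; dd=${dd#0}
    [ "$mm" -ge 1 ] && [ "$mm" -le 12 ] && [ "$dd" -ge 1 ] && [ "$dd" -le 31 ]
}
```

For example, is_ccyymmdd 2007/09/30 succeeds, while is_ccyymmdd 2007-09-30 and is_ccyymmdd 2007/13/01 fail.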

  • Integrity VM Prevents Discovery of psets and fss Groups   When the gWLM agent is installed on a system that has Integrity VM installed, discovery operations report only Integrity VM compartments even if psets and fss groups are present.

    Workaround   To discover psets or fss groups on the system, you must remove Integrity VM.

  • Only Workloads with Managed Siblings Can be Added to SRDs with Nested Partitions   Using the gWLM command-line interface, you cannot add a workload to an SRD that has nested partitions unless a sibling of that workload is already managed in that SRD.

    Workaround   This is not an issue when you use the gWLM interface in HP SIM. Simply follow the instructions in Step 1 of the Manage Systems and Workloads wizard (reached by selecting Create->Shared Resource Domain), and select the set of hosts to include in a single SRD.

  • Unable to Remove Workload from Nested Partitions SRD   When attempting to remove the last (default) fss group from an SRD with nested partitions, you might encounter a message that includes the following text:

    Unable to remove workload workload_name: Attempting to remove a compartment with an unachievably low Fixed policy size. Increase the Fixed policy resource amount and try again.

    Workaround   Undeploy the SRD and delete it. Then create a new SRD without the fss group that you were trying to remove.

  • Combining psets and Virtual Partitions   When using psets on virtual partitions, assigning CPUs to virtual partitions by either path or cell specification can result in processes losing their processor set affiliations when CPUs are removed.

    Workaround  Two workarounds are available:

    • Do not assign CPUs to virtual partitions by either path or cell specification.

    • Set the gWLM policy minimum for pset 0 (the default/OTHER workload) to be greater than or equal to the sum of path-specific CPUs and cell-specific CPUs.
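
The second workaround above reduces to a single comparison, sketched here with a hypothetical function name. The CPU counts are values you take from your own vPars configuration.

```shell
# Succeeds if the policy minimum for pset 0 is at least the number of CPUs
# assigned by path plus the number assigned by cell specification.
pset0_min_ok() {
    # $1 = pset 0 policy minimum, $2 = path-specific CPUs, $3 = cell-specific CPUs
    [ "$1" -ge $(( $2 + $3 )) ]
}
```

For example, with two path-specific CPUs and one cell-specific CPU, pset0_min_ok 3 2 1 succeeds and pset0_min_ok 2 2 1 fails.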

  • Configurations with Psets Nested in Virtual Partitions Rejected With vPars version < 4.0   gWLM does not support nesting psets in virtual partitions when the vPars version is earlier than vPars A.04.00. Earlier gWLM versions did not always reject such configurations, but gWLM 4.0 does. As a result, configurations that you used with gWLM 2.x or gWLM 3.x can be rejected when you begin using gWLM 4.0 agents. Given such a configuration, if the SRD is undeployed before the agents are upgraded, redeployment of the SRD fails with an error message. If the SRD is left deployed while the agents are upgraded, the agents cannot restore SRD operations. SIM events are also generated to report the validation failure.

    Workaround  There are two workarounds:

    • Update to vPars A.04.00 or later.

    • Update your configurations so that psets are not nested in virtual partitions.

  • "dangerous REALTIME job" Messages in syslog   If you install gWLM A.03.00.00 on a system with Integrity VM A.02.00 installed, you get messages of the following form in syslog:

    vm_fssagt[2461]: dangerous REALTIME job 2686 gwlmagent

    In place of gwlmagent, you might see parstatus, HPUXChildWrap, or wbemexec.

    Workaround   You can safely ignore this message. These processes are not real-time processes. (If you prefer, you can upgrade to Integrity VM A.03.00, which correctly identifies these processes and does not produce this message.)

  • Information Error During Shutdown   You may see a message similar to the following:

    Information Error during shutdown. The unbinding of objects in the registry may have failed, and the workload management lock has not been released. Associated Exception com.hp.gwlm.common.JniPlatformException: prm_ctrl_rel_cfg_lock failed because vm_fssagt:8343 is the lock owner

    Workaround   You can safely ignore this message.

  • Managing fss Groups on Systems with psets Restricts fss groups   When a system has psets, gWLM uses only pset 0 for fss groups. gWLM is able to manage CPUs that are allocated only to pset 0.

    Workaround   There is no workaround; this is simply how fss groups are implemented on a system with psets. You can continue with your fss groups inside pset 0 (leaving the other psets unmanaged), manage using psets instead (ignoring fss groups), or remove all the psets (other than pset 0) using the following command:

    # psrset -d all

  • Discovery Does Not Show Current Information for Stopped Virtual Machines   Global Workload Manager discovery does not always report current information for stopped virtual machines. Specifically, when a virtual machine is stopped and the number of vCPUs is changed, gWLM discovery does not show the changed number of vCPUs. Instead, it shows the number of vCPUs from the virtual machine's most recent start.

    Workaround   Start the virtual machines before performing discovery.

  • Multiple Network Interface Cards   As a client/server application, gWLM is more sensitive than other types of applications to the network configuration of your host. It supports management only within a single network domain. For example, if your CMS host has multiple network interface cards that are connected to multiple distinct networks, gWLM requires that the fully qualified host name resolve to the IP address that is reachable by the gWLM agents to be managed.

    This issue is most often a concern when a host is connected to both of the following items:

    • A corporate LAN/WAN via one network interface card and IP address

    • A second, private internal network and private IP address for communicating with a certain other set of hosts (such as cluster members)

    Global Workload Manager attempts to detect and report network configuration issues that can cause undesirable behavior, but in some cases this detection occurs in a context that can be reported only into a log file.

    Workaround   If you encounter some unexpected behavior (such as a gWLM agent that fails to update or report the status of its workloads), inspect the /var/opt/gwlm/gwlmagent.log.0 file on the host for errors.

  • Incorrectly Configured Host Name or IP Address  You may see the following message in a log file (gwlmagent.log.0 or gwlmcmsd.log.0):

    Unable to determine the network address and/or hostname
    of the current host. This indicates a mis-configured network and/or a host
    name resolution issue for this host. For troubleshooting information, see the
    VSE Management Software Release Notes and search for this message. 
    

    The most common cause for this error is a problem in the host name configuration file in /etc/hosts (or equivalent on Windows) or incorrect settings of the /etc/nsswitch.conf file (HP-UX only).

    Background information  gWLM is not a simple client/server application. It involves:

    • Multiple managed-node “servers” (the set of gWLM agents in an SRD are all peer servers that cooperatively manage the SRD)

    • The CMS management server handling configuration and monitoring

    Under normal operation, all of these components need complete connectivity. At a minimum, gWLM requires that each host have a primary IP address/host name that is reachable from every other interacting gWLM component--the CMS and all gWLM agents in a single SRD. (gWLM agents in multiple SRDs need not have connectivity within undeployed SRDs.)

    By default, gWLM uses the primary IP address/host name for a given host. However, you can set up a management LAN, as discussed in the HP Global Workload Manager User's Guide, to use other IP addresses/host names.

    Workaround  Correct the configuration of the host so that:

    • The primary fully qualified domain name can be properly resolved (by DNS or by configuration files)

    • The IP address and primary fully qualified domain name are consistent for the host—and do not resolve to a local-host address (for example, 127.0.0.1)

    The procedure below explores one way to check the host's configuration.

    1. Run the vseassist tool to perform initial network configuration checks.

    2. To validate proper configuration on HP-UX, try the following steps:

      1. Get the current host name using the hostname command:

        [mysystem#1] > hostname
        mysystem

      2. Get the IP address configured for the host using nslookup:

        [mysystem#2] > nslookup mysystem
        Trying DNS
        Name:    mysystem.mydomain.com
        Address:  15.11.100.17

      3. Verify that /etc/hosts has the same name configured for the address. Note that the first name should be the fully qualified domain name, and any aliases are listed afterward.

        [mysystem#3] > grep 15.11.100.17 /etc/hosts
        15.11.100.17    mysystem.mydomain.com mysystem

      4. Verify that the reverse lookup of the IP address returns the same fully qualified domain name as configured in /etc/hosts.

        [mysystem#4] > nslookup 15.11.100.17
        Trying DNS
        Name:    mysystem.mydomain.com
        Address:  15.11.100.17
        

    Fix any issues by editing /etc/hosts. For additional information, see:

    • The HP-UX IP Address and Client Management Administrator's Guide, available online at http://docs.hp.com.

    • The BIND 9 Administrator Reference Manual, available from the Internet Systems Consortium at http://www.isc.org/sw/bind/arm93.

    • The Windows documentation.
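
The /etc/hosts check in the procedure above (the fully qualified domain name must come first, with aliases after it) and the local-host address check can be scripted. The following is a minimal sketch with hypothetical function names. It examines only a hosts file and an address string, and does not replace the nslookup checks against DNS.

```shell
# Succeeds if a line for the given IP address in a hosts file lists the
# expected fully qualified domain name as the first name.
hosts_fqdn_first() {
    # $1 = IP address, $2 = expected FQDN, $3 = hosts file (default /etc/hosts)
    awk -v ip="$1" -v fqdn="$2" '
        $1 == ip { found = 1; if ($2 == fqdn) ok = 1 }
        END      { exit !(found && ok) }
    ' "${3:-/etc/hosts}"
}

# Succeeds if the address is in the 127.x.x.x local-host range.
is_loopback() {
    case "$1" in
        127.*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

Using the values from the procedure, hosts_fqdn_first 15.11.100.17 mysystem.mydomain.com succeeds when /etc/hosts contains the line shown in step 3, and is_loopback 15.11.100.17 fails, as it should for a correctly configured host.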

  • Error During Discovery of Compartments   The following message might be displayed when you use the Manage New Systems wizard or the gwlm discover command:

    Error during discovery of compartments.

    In addition, the /var/opt/gwlm/gwlmagent.log.0 file contains the following message:

    com.hp.gwlm.common.PlatformException: /usr/sbin/parstatus -w exited with a non-zero exit status. Captured stderr is: Error: Unable to get the local partition number.

    Workaround   This is most likely due to having an outdated version of the nPartition Provider software. Global Workload Manager uses a command that is made available by the nPartition Provider, which is typically in every version of HP-UX, to determine system capabilities.

    You can also use the /opt/vse/bin/vseassist command to diagnose the issue.

    Install the latest nPartition software, even if you are not using nPartitions.

    For HP-UX 11i v1, use version B.11.11.01.03.01.01 or later.

    For HP-UX 11i v2 on HP 9000 servers, use version B.11.23.01.03.01.01 or later.

    For HP-UX 11i v2 on HP Integrity servers, use version B.11.23.01.04 or later.

    You can find the nPartition Provider at the following locations:

    • The quarterly AR CD starting May 2005

    • The Software Depot website http://software.hp.com

  • Modifying Java While gWLM is Running  gWLM does not support any actions (including the use of update-ux) that remove, overwrite, or otherwise modify the version of Java that gWLM is using in a managed node or CMS that is part of a deployed SRD.

    Workaround  Undeploy an SRD before taking any actions that affect the version of Java that gWLM is using on systems that are part of the SRD. If you used update-ux, be sure to:

    • Restart the CMS daemon on the CMS

      Using the command-line interface: /opt/gwlm/bin/gwlmcmsd

      Using the HP SIM interface: Select the menus Configure -> Configure VSE Agents -> Start gWLM CMS Daemon

    • Restart the agent on the managed nodes

      Using the command-line interface: /opt/gwlm/bin/gwlmagent

      Using the HP SIM interface: Select the menus Configure -> Configure VSE Agents -> Start gWLM Agent

  • Configuration of Agent and CMS Not Synchronized   Occasionally, a gWLM agent and the gWLM CMS disagree on whether an SRD is actually deployed. This can occur when you use Ctrl-C to interrupt a gwlm deploy or undeploy command. It can also occur if there are errors saving a gWLM configuration: The configuration is deployed and then saved to the gWLM configuration repository. If the deploy occurs but the save fails, the gWLM agent sees the SRD as deployed while the CMS sees it as undeployed.

    Workaround   Use the --force option with gwlm deploy or gwlm undeploy to synchronize the agent and the CMS.

    For example, run the following command to force both the agent and the CMS to consider the SRD as deployed, substituting the name of your SRD for SRD:

    # gwlm deploy --srd=SRD --force

    For more information about the gwlm command, see gwlm(1M).

  • Missing or Unexpected Historical Data (System Clocks Differ)   You might have no historical data available for graphing, even though you are certain an SRD was deployed for the time period in question.

    A related issue occurs when you select a time period where you expect high system activity, but the graph shows limited activity. Similarly, you might expect very little activity for a time period, but the graph shows lots of activity.

    Workaround   Check that the system clocks on the CMS and on all the systems in the SRD are synchronized. If the clocks differ significantly, gWLM might not be able to match the data from the managed nodes with the time period you are trying to graph.
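
Once you have a clock reading from the CMS and from a managed node, comparing them is simple arithmetic. The sketch below uses hypothetical function names and assumes both readings are expressed as seconds since the epoch. How you collect the readings, and what skew counts as significant, are left to you; the 60-second limit in the example is an assumption, not a documented gWLM threshold.

```shell
# Print the absolute difference, in seconds, between two clock readings
# given as seconds since the epoch.
clock_skew_seconds() {
    # $1 = CMS time, $2 = managed-node time
    d=$(( $1 - $2 ))
    if [ "$d" -lt 0 ]; then
        d=$(( 0 - d ))
    fi
    echo "$d"
}

# Succeeds if the skew between two readings is within the allowed limit.
clocks_in_sync() {
    # $1 = CMS time, $2 = managed-node time, $3 = allowed skew in seconds
    [ "$(clock_skew_seconds "$1" "$2")" -le "$3" ]
}
```

For example, clock_skew_seconds 1000 1300 prints 300, so clocks_in_sync 1000 1300 60 fails.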

  • Missing Historical Data (gWLM CMS Daemon/Service Restarted)  You may have blank sections of a historic report for a workload, or you may see the following error message when displaying historic data for a workload:

    There is no gWLM historical data for the workload MyWorkload.wkld. The
    workload has never been managed by gWLM, or the historical data has been
    removed.
    

    Because of caching of gWLM historic data in HP SIM, if the gWLM CMS daemon/service is restarted after initially viewing historic data, the interface incorrectly reports that there is no data available to view or fails to load portions of the data.

    Workaround  

    1. Log out of HP SIM.

    2. Log in to HP SIM again.

    3. Generate the historic report again.

  • Real-Time Data is Currently Loading   You might see the following message when trying to view real-time reports:

    Real-time data is currently loading, please wait... You might also verify that the remote node is running and SRDs have been deployed.

    Workaround   Normally, this condition is only temporary. If it persists, check that the gwlmagent daemon is running on the remote nodes. If it is running, stop and restart it. If the condition still persists, undeploy and redeploy the SRD.

  • Data Missing in Real-time Monitoring   Global Workload Manager might not display monitoring updates for an SRD on the command line or through the graphical interface in HP SIM. This can happen when attempts to re-form an SRD time out, leaving the SRD in a state where the agent on each of its managed nodes must be restarted. It can also happen when a managed node is down, is hung, or is not running gwlmagent.

    If the managed node is down or gwlmagent is not running, you will see the following message:

    The gWLM agent process on the host is not running -- start the agent and retry.

    If the managed node is hung, or the SRD needs all its agents to be restarted, symptoms can include:

    • Output from the gwlm monitor command omitting data for some SRDs

    • The Shared Resource Domain View in HP SIM showing multiple SRDs with the critical error "SRD data is currently stale".

    Workaround   If an SRD does not provide real-time monitoring over a sustained period of time, restart the gWLM agent on each managed node in the SRD.

    In the case of a hung SRD member, while real-time monitoring of that SRD is blocked, the other SRDs continue to manage resources. However, the real-time monitoring of other SRDs may be blocked due to the hung SRD member. To restore monitoring of the other SRDs:

    1. Undeploy the SRD containing the hung member. This may require using the --force option to the gwlm undeploy command.

    2. Restart gwlmcmsd to clear the blocked monitoring, using the following commands on the CMS:

       # gwlmcmsd --stop
       # gwlmcmsd
       
    3. Create a new SRD to replace the undeployed one, leaving out the hung SRD member.

    4. Once the hung SRD member has been restored to normal operation, undeploy the replacement SRD and redeploy the original SRD to return to the original state.
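
    Steps 1 and 2 of this procedure can be sketched as a small shell wrapper to run on the CMS. The commands are the ones shown above. The DRY_RUN variable and the function names are assumptions made for this example, and the sketch always passes --force, which the procedure says is only sometimes needed.

```shell
# run CMD [ARG...]
# With DRY_RUN=1 (the default in this sketch) the command is only printed,
# so nothing changes until you have reviewed the output.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# clear_blocked_monitoring SRD_NAME   (hypothetical helper)
clear_blocked_monitoring() {
    # Step 1: undeploy the SRD that contains the hung member.
    run gwlm undeploy --srd="$1" --force || return 1
    # Step 2: stop the CMS daemon, then start it again.
    run gwlmcmsd --stop || return 1
    run gwlmcmsd
}
```

    Calling clear_blocked_monitoring with an SRD name prints the three commands. Setting DRY_RUN=0 runs them instead.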

  • "Input date format error:null" When Creating Advanced Reports  In the HP SIM interface to gWLM, you may see the following message when attempting to create an advanced report:

    Input date format error:null
    

    This message is generated when a text field for a date is empty, even though the field is not displayed on the screen.

    Workaround  Select other report types until an empty date field appears. Enter a valid date in that field and then re-select the original report type.

  • Sample Missing at Start or End of gwlmreport Output   A report from gwlmreport is based on a report period that starts at midnight the day the report begins and ends at midnight the day the report ends. Any samples that overlap midnight at the start or end of the report period are excluded from the report.

    Workaround   There is no workaround, but you should be aware of this behavior.
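
    The exclusion rule can be illustrated with a small check. This is a sketch of the rule as described above, not gwlmreport's actual implementation. The function name and the use of epoch-second timestamps are assumptions.

```shell
# sample_in_report SAMPLE_START SAMPLE_END PERIOD_START PERIOD_END
# Succeeds only if the sample lies entirely inside the report period.
# A sample that straddles the midnight boundary at either end fails the
# check, which mirrors the exclusion described above.
sample_in_report() {
    [ "$1" -ge "$3" ] && [ "$2" -le "$4" ]
}
```

    For a one-day period from 86400 to 172800, a sample running from 86300 to 86600 straddles the starting midnight and is rejected, while a sample from 86400 to 86700 is accepted.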

  • Error Using The Secure gWLM Communications Tool  When using the Secure gWLM Communications tool in the HP SIM interface to gWLM, you may get the following error messages:

    ERROR: gwlmimportkey failed to import key for
    hostname-certificate-file on hostname: keytool error: 
    java.lang.Exception:
    Input not an X.509 certificate
    unable to correctly import the server key
    
    ERROR: Task 'Secure gWLM Communications' terminating.
    

    These messages are displayed when the communications certificate file hostname-certificate-file is corrupted or not valid.

    Workaround  

    1. Delete the hostname-certificate-file specified in the error message from the following location on the CMS:

      • HP-UX:

        /etc/opt/gwlm/certs/hostname-certificate-file

      • Windows:

        C:\Program Files\HP\Virtual Server Environment\conf\certs\hostname-certificate-file (although a different path may have been selected at installation)

    2. Run the Secure gWLM Communications tool again.

  • Setting Policy Weights to Zero Results in Skewed Allocation   This issue affects only gWLM A.02.00.00.x agents.

    Policy weights help gWLM determine resource allocations when excess resources exist. With the weights for all the policies used in an SRD set to the same value, resources should be allocated evenly to the associated workloads. However, setting the weights to zero for all the policies in an SRD results in a single workload being allocated all the excess resources.

    Workaround   Instead of zero, use a weight value of one.
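
    To see why equal nonzero weights give an even split, consider a simple proportional model. This is only an illustration and is not gWLM's allocation algorithm. The function name and the model itself are assumptions.

```shell
# split_excess EXCESS WEIGHT...
# Prints one share per weight, proportional to the weights. Integer division
# is used, so any remainder is left unallocated in this simplified model.
# An all-zero weight set is rejected because a proportional split is
# undefined when the weights sum to zero.
split_excess() (
    excess=$1; shift
    total=0
    for w in "$@"; do total=$(( total + w )); done
    if [ "$total" -eq 0 ]; then
        echo "all weights are zero; use a weight of one instead" >&2
        exit 1
    fi
    for w in "$@"; do echo $(( excess * w / total )); done
)
```

    With an excess of 6 and weights 1 1 1, each workload receives 2. With weights 0 0 0 the model has no defined answer, which is why the sketch rejects that input.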

  • Workload with Fixed Policy Gets More Resources Than Requested   In an SRD with nested partitions, assigning fixed policies where the sum of the fixed values is less than the minimum of the parent compartment can result in workloads getting more resources than specified in the fixed policies.

    Workaround   Set up the fixed policies so that the number of CPUs requested is greater than or equal to the minimum number of CPUs required by the parent compartment.
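
    The condition in this workaround is simple arithmetic and can be checked before you assign the policies. The following is a minimal sketch. The function name is an assumption, and the values are whole numbers of CPUs that you supply yourself.

```shell
# fixed_policies_cover_minimum PARENT_MIN FIXED...
# Succeeds if the fixed-policy values sum to at least the minimum number
# of CPUs required by the parent compartment.
fixed_policies_cover_minimum() (
    parent_min=$1; shift
    sum=0
    for f in "$@"; do sum=$(( sum + f )); done
    [ "$sum" -ge "$parent_min" ]
)
```

    For a parent minimum of 4, fixed values of 1 and 2 fail the check (3 is less than 4), while 2 and 2 pass.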

  • Convergence Rate and OwnBorrow/Utilization Policies   This issue affects gWLM A.02.00.00.x agents.

    The convergence rate value that you (optionally) specify when defining a policy affects only custom policies. OwnBorrow and utilization policies are not affected.

    Workaround   There is no workaround. However, this issue is addressed starting in gWLM A.02.00.01.x agents.

  • Custom Metrics Lost on Redeploy   Custom policies use metric values that you provide via the gwlmsend command. If you redeploy an SRD that has a custom policy, the most recent value for the policy's metric is lost. In this situation, gWLM bases its allocations on the minimum request specified in the workload's policy. The workload can also receive any CPU resources that remain after all the policies have been satisfied.

    Workaround   Update the metric values for all your custom policies immediately after a redeploy.

  • Multiple SRDs Based on Virtual Partitions Can Occur   Typically, gWLM does not allow you to create multiple SRDs based on virtual partitions on a single nPartition or system at the same time. However, when multiple gWLM users are deploying SRDs at almost the same time, gWLM might inadvertently allow such multiple SRDs.

    Workaround   Delete one of the SRDs and then remanage the workloads from the deleted SRD by placing them in the remaining SRD.

  • Only One SRD is Allowed to be Deployed   You might see a message similar to the following:

    Error trying to deploy SRD, mysystem.vpar.000 to mysystem2.mydomain.com. SRD, mysystem2.fss.000 is already deployed. Only one SRD is allowed to be deployed.

    Workaround   Undeploy the SRD using the --force option with the gwlm undeploy command, and restart gwlmagent on the managed node.

  • SRD Deployment Times Out and Displays a Blank Screen  You need to configure gWLM to work with hosts on multiple LANs if you attempt to deploy an SRD and both of the following occur:

    • gWLM times out and displays a blank screen

    • There are events from each managed node similar to the following event:

      gWLM Agent MySystem.MyDomain.com
      Information Unable to manage the following hosts:
      Associated Exception Unable to manage the following hosts: MySystem.MyDomain.com: The gWLM agent 
      process on the host is not running -- start the agent and retry.

    Workaround  Read the HP Global Workload Manager User's Guide section on Using gWLM with Hosts on Multiple LANs.

  • Application Hangs in fss group   On HP-UX 11i v2 (B.11.23), an application inside an fss group might hang when running in a single-processor virtual partition, nPartition, or system.

    Workaround  Install patch PHKL_33052.

  • Scripts Not Placed in Correct Workloads   With compartments based on psets or fss groups, gWLM allows you to place scripts in the compartments using application records with alternate names. This works only if the shell or interpreter being used is listed in the file /etc/shells. Typically, perl is not in this file. So, perl scripts (and any other scripts based on shells or interpreters not listed in /etc/shells) are not properly placed.

    Executables are not affected by this issue.

    Workaround   Add /opt/perl/bin/perl, and any other needed shells or interpreters, to the file /etc/shells. Global Workload Manager will recognize the added shells or interpreters within 30 seconds.
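
    The edit to /etc/shells can be sketched as follows. The function name and the optional second parameter are assumptions made for this example. The parameter lets you try the function on a copy of the file first.

```shell
# add_shell_entry INTERPRETER_PATH [SHELLS_FILE]
# Appends INTERPRETER_PATH to the shells file unless an identical line is
# already present. SHELLS_FILE defaults to /etc/shells.
add_shell_entry() (
    path=$1
    file=${2:-/etc/shells}
    if grep -F -x -q -- "$path" "$file" 2>/dev/null; then
        exit 0                  # already listed; nothing to do
    fi
    echo "$path" >> "$file"
)
```

    For example, add_shell_entry /opt/perl/bin/perl adds the perl path once, and running it again leaves the file unchanged.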

    NOTE: Because the full pathname is not required for the script, a rogue user could gain access to compartments based on psets or fss groups that would otherwise not be accessible, by using the name of the script for new scripts or wrappers.

  • Processes Moved to Default pset or Default fss Group   All process placement with the gwlmplace command on a managed node is lost if:

    • The managed node is rebooted.

    • The local gwlmagent daemon is restarted.

    • You undeploy the current SRD.

    In these cases, processes are placed according to any application records or user records that apply. If no records exist, nonroot processes are placed in the default pset or default fss group; root processes are left where they are.

    Workaround   To maintain the process placements across redeploys, use gWLM's application records or user records when creating or editing your workload definitions in gWLM.

  • Process Placement Using psrset Is Ignored   When gWLM is managing the psets on a system, every process on the system has to go in a workload. gWLM places the processes according to application records or user records specified when you create or edit a workload definition. If no records exist, the processes are subject to the placement rules, which are explained in the online help topic "pset / fss group tips" in the section "Precedence of placement techniques."

    If you use the psrset command to place processes in psets, gWLM is likely to move the processes to the default pset.

    Workaround   To maintain the placement of a process, use gWLM's application records or user records when creating or editing your workload definitions in gWLM. If using records is not practical, use the gwlmplace command. However, you will have to use gwlmplace after each redeploy of an SRD to put the processes back in the desired workloads.

  • Unable to Remove Abandoned fss Groups   fss groups created by gWLM can become abandoned and cannot be easily removed. This situation can occur for various reasons. For example, a second CMS might be used to manage an SRD based on fss groups, perhaps because the original CMS went down. This can leave the SRD with fss groups that you cannot remove.

    Workaround  Using the HP SIM interface, you can create a new SRD that automatically integrates the existing fss groups.

    Alternatively, you can remove the fss groups, in which case you have several options. If you have PRM installed, enter the following command:

    # /opt/prm/bin/prmconfig -r

    If you do not have PRM installed, use the following procedure:

    1. Run discovery:

      # /opt/gwlm/bin/gwlm discover host --file=myfile.xml \
       --type=fss

      where host is the system with the fss groups.

    2. Import myfile.xml into the configuration repository:

      # /opt/gwlm/bin/gwlm import --file=myfile.xml

    3. Determine the SRD name by running the following command and checking the output for names that include host:

      # /opt/gwlm/bin/gwlm list

      For example, the name might be host.fss.xyz, where x, y, and z are each a digit from 0 to 9.

    4. Deploy the SRD:

      # /opt/gwlm/bin/gwlm deploy --srd=host.fss.xyz

    5. Undeploy the SRD:

      # /opt/gwlm/bin/gwlm undeploy --srd=host.fss.xyz

    The fss groups should now be gone from the system. However, their workload definitions are still in the gWLM configuration repository. You can remove those definitions and the SRD definition by using the gWLM interface in HP SIM. Select Tools->VSE Management, then click the Shared Resource Domain tab. Select the SRD with the fss groups, and then select Delete->Shared Resource Domain.
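
    The procedure for systems without PRM can be sketched as a pair of shell functions. The gwlm commands are the ones shown in the steps above. The DRY_RUN variable and the function names are assumptions made for this example. Step 3 (finding the SRD name with gwlm list) is left as a manual step between the two functions.

```shell
GWLM=/opt/gwlm/bin/gwlm

# run CMD [ARG...]
# With DRY_RUN=1 (the default in this sketch) the command is only printed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# discover_and_import HOST FILE
# Steps 1 and 2: discover the fss groups on HOST, then import the result.
discover_and_import() {
    run "$GWLM" discover "$1" --file="$2" --type=fss || return 1
    run "$GWLM" import --file="$2"
}

# deploy_then_undeploy SRD_NAME
# Steps 4 and 5: deploy the SRD found in step 3, then undeploy it.
deploy_then_undeploy() {
    run "$GWLM" deploy --srd="$1" || return 1
    run "$GWLM" undeploy --srd="$1"
}
```

    Review the printed commands first, then set DRY_RUN=0 to run them.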

  • Sizes/Allocations Less Than Policy Minimums for Virtual Machines   The sizes or allocations for virtual machines in a deployed SRD can appear to be less than their policy minimums.

    Workaround   Wait a few minutes, since it can take several minutes for gWLM to recognize a virtual machine transition between the states of off and on.

  • Negative Current Size for NONVM   If the CPUs on a VM Host are oversubscribed when you deploy an SRD on that host, gWLM shows current size for NONVM as a negative value.

    Workaround   Two options are available:

    • Adjust the entitlements of those virtual machines that are on so that the CPUs are not oversubscribed.

    • Stop one or more virtual machines until those still on do not oversubscribe the CPUs.

  • Unmanaging a Virtual Machine That Is On Leaves SRD Undeployed   When attempting to unmanage a virtual machine that is started, the SRD can be undeployed, even though the following message is displayed:

    The virtual machine VM_name on host hostname is on but does not have an associated gWLM policy. Please turn the virtual machine off, or apply a gWLM policy to provide the necessary resources.

    Workaround   Turn off the virtual machine and redeploy the SRD that contained it.

  • Log File Extensions Other than .log.0, .log.1, and .log.2   Global Workload Manager is designed to use the file extensions .log.0, .log.1, and .log.2 for its log files. Java file locking is used to ensure that only one gWLM process is updating a log file at any given time. Starting with Java 1.4.2.06, the file locking allows the creation of files with extensions of the form .log.0.n, where n is some integer.

    Workaround   If you are using Java version 1.4.2.06 or later and you want to check the logs for errors, use the following command to list the log files in order of modification time, with the most recently updated files last:

    # /bin/ls -ltr /var/opt/gwlm/*log*

    You can then use /usr/bin/tail to view messages in recently updated log files.

    If you are sending the log files to HP Support, create a tar file using the following commands:

    # cd /
    # tar cvf /tmp/gwlmlogs4support.tar var/opt/gwlm/*log*

    Then send the /tmp/gwlmlogs4support.tar file to HP Support.
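
    Finding the most recently updated log file can also be scripted. The following is a minimal sketch. The function name is an assumption, and the optional directory parameter exists only so that the function can be tried outside /var/opt/gwlm.

```shell
# newest_gwlm_log [LOG_DIR]
# Prints the most recently modified file matching *log* in LOG_DIR
# (default: /var/opt/gwlm). Names of the form .log.0.n also match.
newest_gwlm_log() (
    dir=${1:-/var/opt/gwlm}
    ls -td "$dir"/*log* 2>/dev/null | head -n 1
)
```

    For example, /usr/bin/tail "$(newest_gwlm_log)" shows the last lines of that file.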

  • Advanced Reports Cannot Process Workloads with Spaces at Start/End of Name   Starting with gWLM A.03.00.00, workload names could contain spaces. However, the gwlmreport utility, which generates advanced reports, cannot process workload names that start or end with spaces.

    Workaround  Rename your workloads to not start or end with spaces.
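
    A workload name can be checked for leading or trailing spaces with a short shell test. The function name is an assumption made for this example. How you obtain the list of workload names is up to you.

```shell
# has_edge_space NAME
# Succeeds if NAME starts or ends with a space character.
has_edge_space() {
    case $1 in
        " "*|*" ") return 0 ;;
        *)         return 1 ;;
    esac
}
```

    A name such as "web tier" passes through unflagged because its only space is in the middle, while " web" and "web " are flagged.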

  • gwlmreport ovpafeed --dataversion Problems  Running either of the following commands:

    • gwlmreport ovpafeed

    • gwlmreport ovpafeed --dataversion=4.0

    results in an error.

    Workaround  Use the following commands to set up the feed and extract data:

    • gwlmreport ovpafeed --setup --dataversion=3.0

    • gwlmreport ovpafeed --dataversion=3.0

  • Error When Securing Communications   You may see a message similar to the following one when attempting to secure gWLM communications:

    keytool error: java.lang.Exception: Key password must be at least 6 characters
    unable to create keystore /etc/opt/gwlm/certstor/gwlm.keystore
    unable to create the gwlm keystore at /opt/gwlm/bin/gwlmsslconfig line 184.
    

    Workaround   Try securing communications again.

© 2007–2008 Hewlett-Packard Development Company, L.P.