HP XC System Software : Administration Guide
Version 3.0

HP Part Number: 5991-4850

Published: January 2006

Abstract

This document describes the tools and procedures necessary to administer, monitor, and maintain an HP XC system.


Table of Contents

About This Document
Intended Audience
Document Organization
HP XC Information
For More Information
Supplementary Information
Manpages
Related Information
Typographic Conventions
HP Encourages Your Comments
1 HP XC Administration Environment
Determining the Installation Type
Understanding Nodes, Roles, and Services
Administrator Passwords
Secure Shell (ssh)
Understanding the HP XC Command Environment
HP XC Command Set
Interpreting the nodelist Parameter
Executing a Command on Multiple Nodes
Understanding the Configuration and Management Database
Networking
Linux Virtual Server for HP XC System Alias
Network Time Protocol
Network Address Translation
Network Information Service
Local Storage
Understanding and Maintaining the File System
General File System Layout
Systemwide Directory
HP XC File System Layout
HP XC Service Configuration Files
Log Files
Modulefiles
Software Distribution
Recommended Administrative Tasks
2 Starting Up and Shutting Down the HP XC System
Understanding the Node States
Starting Up the HP XC System
Starting All Nodes
Determining Which Nodes Require Imaging
Imaging and Starting Nodes
Restarting a Node for Imaging
Shutting Down the HP XC System
Shutting Down One or More Nodes
Determining a Node's Power Status
Locating a Given Node
Disabling and Enabling a Node
3 Managing System Services
HP XC System Services
Displaying Services Information
Displaying All Services
Displaying the Nodes That Provide a Specified Service
Displaying the Services Provided by a Specified Node
Restarting a Service
Stopping a Service
Adding a Service
Understanding the roles_services.ini File
Understanding the .ini File for a Service
Adding a Service to an Existing Role
Creating a Role and Adding a Service to It
Global System Services
4 Managing Licenses
License Manager and License File
Determining If the License Manager Is Running
Starting and Stopping the License Manager
Starting the License Manager
Stopping the License Manager
Restarting the License Manager
5 Managing the Configuration and Management Database
Accessing the Configuration and Management Database
Querying the Configuration and Management Database
Displaying Configuration Details
Displaying the Nodes That Provide a Specified Service
Finding and Setting System Attribute Values
Backing Up the Configuration Database
Restoring the Configuration Database from a Backup File
Archiving Metrics Data from the Configuration Database
Restoring the Metrics Data from an Archive File
Purging Metrics Data from the Configuration Database
Dumping the Configuration Database
6 Monitoring the System
Monitoring Strategy
Monitoring Tools
Commands for Monitoring Node Status
Nagios
Nan Notification Aggregator and Delimiter
Supermon
The syslog and syslog-ng Services
The collectl Utility
Displaying System Environment Data
Displaying System Statistics
Displaying System Sensors from the Command Line
Monitoring Processor Usage and Load from the Command Line
Monitoring Memory from the Command Line
Monitoring Paging and Swap Data from the Command Line
System Monitoring with the Nagios GUI
Logging Node Events
Understanding the Event Logging Structure
The syslog-ng.conf Rules File
7 Network Administration
Network Address Translation Administration
Network Time Protocol Service
8 Distributing Software Throughout the System
Overview of the Image Replication and Distribution Environment
Adding Software or Modifying Files on the Golden Client
Installing Additional RPMs from the HP XC System Software Installation DVD
Using File Overrides to the Golden Image
Using Per-Node Service Configuration
Determining Which Nodes Will Be Imaged
The Golden Image Checksum
Updating the Golden Image
The cluster_config Utility
The updateimage Command
Exclusion Files
Ensuring That the Golden Image Is Current
Propagating the Golden Image to All Nodes
Using the Full Imaging Installation
Using the si_updateclient Utility
Using the cexec Command
Maintaining Service Configuration Globally
9 Opening an IP Port in the Firewall
Open Ports
Opening Ports in the Firewall
Opening a Temporary Port in the Firewall
Opening an IP Port in the Firewall Persistently
10 Connecting to a Remote Console
Console Management Facility
Accessing a Remote Console
11 Managing Local User Accounts
HP XC User and Group Accounts
General Procedures for Administering Local User Accounts
Adding a Local User Account
Modifying a Local User Account
Deleting a Local User Account
Configuring the ssh Keys for a User
Changing the Root Password
Synchronizing the NIS Database
12 Managing SLURM
Overview of SLURM
Configuring SLURM
Configuring SLURM System Interconnect Support
Configuring SLURM Servers
Configuring Nodes in SLURM
Configuring SLURM Partitions
Configuring SLURM Features
Propagating Resource Limits
Restricting User Access to Nodes
Job Accounting
Using the sacct Command
Disabling Job Accounting
Configuring Job Accounting
Monitoring SLURM
Draining Nodes
Configuring the SLURM Epilog Script
SLURM Daemon Log Maintenance
13 Managing LSF
Administering Standard LSF
Administering LSF-HPC
Integration of LSF-HPC with SLURM
Installation of LSF-HPC on SLURM
LSF-HPC Startup and Shutdown
Controlling the LSF-HPC Service
Load Indexes and Resource Information
Launching Jobs with LSF-HPC
Monitoring and Controlling LSF-HPC Jobs
Job Accounting
LSF-HPC Failover
LSF-HPC Monitoring
Enhancing LSF-HPC
Configuring an External Virtual Host Name for LSF-HPC on HP XC Systems
LSF Daemon Log Maintenance
14 Managing Modulefiles
15 Mounting File Systems
Overview of the Network File System on the HP XC System
Understanding the Global fstab File
Mounting Internal File Systems Throughout the HP XC System
Understanding the csys Utility in the Mounting Instructions
Mounting Internal File Systems
Mounting Remote File Systems
Understanding the Mounting Instructions
Mounting a Remote File System
16 Using Diagnostic Tools
Using the sys_check Utility
Using the ovp Utility for System Verification
Using the dgemm Utility to Analyze Performance
Using the System Interconnect Diagnostic Tools
HP XC Diagnostic Tools for the Myrinet System Interconnect
Using Diagnostic Tools for the Quadrics System Interconnect
Using Diagnostic Tools for the Gigabit Ethernet System Interconnect
17 Troubleshooting
System Interconnect Troubleshooting
Myrinet System Interconnect Troubleshooting
Quadrics System Interconnect Troubleshooting
InfiniBand System Interconnect Troubleshooting
SLURM Troubleshooting
SLURM Configuration Issues
SLURM Run-Time Troubleshooting
LSF-HPC Troubleshooting
18 Servicing the HP XC System
Adding a Node
Replacing a Client Node
Replacing a System Interconnect Board in a CP6000 System
A Installing LSF-HPC for SLURM into an Existing Standard LSF Cluster
Assumptions
Requirement
Sample Case
HP XC Preparation
Installing LSF-HPC
Perform Post Installation Tasks
Configuring the LSF Alias
Starting LSF on the HP XC System
Sample Running Jobs
Troubleshooting
B Installing Standard LSF on a Subset of Nodes
Requirements
Assumptions
Sample Case
Instructions
Glossary
Index
© 2003 Hewlett-Packard Development Company, L.P.