HP XC System Software: User's Guide
Version 3.1

HP Part Number: 5991-7400

Published: November 2006

Abstract

This document provides information about the HP XC System Software Version 3.1 user and programming environment.


Table of Contents

About This Document
Intended Audience
New and Changed Information in This Edition
Typographic Conventions
HP XC and Related HP Products Information
Related Information
Manpages
HP Encourages Your Comments
1 Overview of the User Environment
System Architecture
HP XC System Software
Operating System
Node Platforms
Node Specialization
Storage and I/O
File System
System Interconnect Network
Network Address Translation (NAT)
Determining System Configuration Information
User Environment
LVS
Modules
Commands
Application Development Environment
Parallel Applications
Serial Applications
Run-Time Environment
SLURM
Load Sharing Facility (LSF-HPC)
Standard LSF
How LSF-HPC and SLURM Interact
HP-MPI
Components, Tools, Compilers, Libraries, and Debuggers
2 Using the System
Logging In to the System
LVS Login Routing
Using the Secure Shell to Log In
Overview of Launching and Managing Jobs
Introduction
Getting Information About Queues
Getting Information About Resources
Getting Information About System Partitions
Launching Jobs
Getting Information About Your Jobs
Stopping and Suspending Jobs
Resuming Suspended Jobs
Performing Other Common User Tasks
Determining the LSF Cluster Name and the LSF Execution Host
Getting System Help and Information
3 Configuring Your Environment with Modulefiles
Overview of Modules
Supplied Modulefiles
Modulefiles Automatically Loaded on the System
Viewing Available Modulefiles
Viewing Loaded Modulefiles
Loading a Modulefile
Loading a Modulefile for the Current Session
Automatically Loading a Modulefile at Login
Unloading a Modulefile
Viewing Modulefile-Specific Help
Modulefile Conflicts
Creating a Modulefile
4 Developing Applications
Application Development Environment Overview
Compilers
MPI Compiler
Examining Nodes and Partitions Before Running Jobs
Interrupting a Job
Setting Debugging Options
Developing Serial Applications
Serial Application Build Environment
Building Serial Applications
Developing Parallel Applications
Parallel Application Build Environment
Building Parallel Applications
Developing Libraries
Designing Libraries for the CP4000 Platform
5 Submitting Jobs
Overview of Job Submission
Submitting a Serial Job Using LSF-HPC
Submitting a Serial Job with the LSF bsub Command
Submitting a Serial Job Through SLURM Only
Submitting a Parallel Job
Submitting a Non-MPI Parallel Job
Submitting a Parallel Job That Uses the HP-MPI Message Passing Interface
Submitting a Parallel Job Using the SLURM External Scheduler
Submitting a Batch Job or Job Script
Submitting a Job from a Host Other Than an HP XC Host
Running Preexecution Programs
6 Debugging Applications
Debugging Serial Applications
Debugging Parallel Applications
Debugging with TotalView
7 Monitoring Node Activity
Installing the Node Activity Monitoring Software
Using the xcxclus Utility to Monitor Nodes
Plotting the Data from the xcxclus Datafiles
Using the xcxperf Utility to Display Node Performance
Plotting the Node Performance Data
Running Performance Health Tests
8 Tuning Applications
Using the Intel Trace Collector and Intel Trace Analyzer
Building a Program – Intel Trace Collector and HP-MPI
Running a Program – Intel Trace Collector and HP-MPI
The Intel Trace Collector and Analyzer with HP-MPI on HP XC
Installation Kit
HP-MPI and the Intel Trace Collector
Visualizing Data – Intel Trace Analyzer and HP-MPI
9 Using SLURM
Introduction to SLURM
SLURM Utilities
Launching Jobs with the srun Command
The srun Roles and Modes
Using the srun Command with HP-MPI
Using the srun Command with LSF-HPC
Monitoring Jobs with the squeue Command
Terminating Jobs with the scancel Command
Getting System Information with the sinfo Command
Job Accounting
Fault Tolerance
Security
10 Using LSF-HPC
Information for LSF-HPC
Overview of LSF-HPC Integrated with SLURM
Differences Between LSF-HPC and LSF-HPC Integrated with SLURM
Job Terminology
Using LSF-HPC Integrated with SLURM in the HP XC Environment
Useful Commands
Job Startup and Job Control
Preemption
Submitting Jobs
LSF-SLURM External Scheduler
How LSF-HPC and SLURM Launch and Manage a Job
Determining the LSF Execution Host
Determining Available System Resources
Examining System Core Status
Getting Information About the LSF Execution Host Node
Getting Host Load Information
Examining System Queues
Getting Information About the lsf Partition
Getting Information About Jobs
Getting Job Allocation Information
Examining the Status of a Job
Viewing the Historical Information for a Job
Translating SLURM and LSF-HPC JOBIDs
Working Interactively Within an Allocation
LSF-HPC Equivalents of SLURM srun Options
11 Advanced Topics
Enabling Remote Execution with OpenSSH
Running an X Terminal Session from a Remote Node
Using the GNU Parallel Make Capability
Example Procedure 1
Example Procedure 2
Example Procedure 3
Local Disks on Compute Nodes
I/O Performance Considerations
Shared File View
Private File View
Communication Between Nodes
Using MPICH on the HP XC System
Using MPICH with SLURM Allocation
Using MPICH with LSF Allocation
A Examples
Building and Running a Serial Application
Launching a Serial Interactive Shell Through LSF-HPC
Running LSF-HPC Jobs with a SLURM Allocation Request
Example 1. Two Cores on Any Two Nodes
Example 2. Four Cores on Two Specific Nodes
Launching a Parallel Interactive Shell Through LSF-HPC
Submitting a Simple Job Script with LSF-HPC
Submitting an Interactive Job with LSF-HPC
Submitting an HP-MPI Job with LSF-HPC
Using a Resource Requirements String in an LSF-HPC Command
Glossary
Index

List of Examples

5-1 Submitting a Job from the Standard Input
5-2 Submitting a Serial Job Using LSF-HPC
5-3 Submitting an Interactive Serial Job Using LSF-HPC Only
5-4 Submitting an Interactive Serial Job Using LSF-HPC and the LSF-SLURM External Scheduler
5-5 Submitting a Non-MPI Parallel Job
5-6 Submitting a Non-MPI Parallel Job to Run One Task per Node
5-7 Submitting an MPI Job
5-8 Submitting an MPI Job with the LSF-SLURM External Scheduler Option
5-9 Using the External Scheduler to Submit a Job to Run on Specific Nodes
5-10 Using the External Scheduler to Submit a Job to Run One Task per Node
5-11 Using the External Scheduler to Submit a Job That Excludes One or More Nodes
5-12 Using the External Scheduler to Launch a Command in Parallel on Ten Nodes
5-13 Using the External Scheduler to Constrain Launching to Nodes with a Given Feature
5-14 Submitting a Job Script
5-15 Submitting a Batch Script with the LSF-SLURM External Scheduler Option
5-16 Submitting a Batch Job Script That Uses a Subset of the Allocation
5-17 Submitting a Batch Job Script That Uses the srun --overcommit Option
5-18 Environment Variables Available in a Batch Job Script
8-1 The vtjacobic Example Program
8-2 C Example – Running the vtjacobic Example Program
9-1 Simple Launch of a Serial Program
9-2 Displaying Queued Jobs by Their JobIDs
9-3 Reporting on Failed Jobs in the Queue
9-4 Terminating a Job by Its JobID
9-5 Cancelling All Pending Jobs
9-6 Sending a Signal to a Job
9-7 Using the sinfo Command (No Options)
9-8 Reporting Reasons for Downed, Drained, and Draining Nodes
10-1 Job Allocation Information for a Running Job
10-2 Job Allocation Information for a Finished Job
10-3 Using the bjobs Command (Short Output)
10-4 Using the bjobs Command (Long Output)
10-5 Using the bhist Command (Short Output)
10-6 Using the bhist Command (Long Output)
10-7 Launching an Interactive MPI Job
10-8 Launching an Interactive MPI Job on All Cores in the Allocation
© 2003 Hewlett-Packard Development Company, L.P.