HP Capacity Advisor Version 4.0 User's Guide > Chapter 3 Key Capacity Advisor Concepts

Trends and Forecasts


Understanding the trends in collected utilization data can provide insight into possible future requirements. These potential future requirements can be used to generate forecasts for planning. HP Capacity Advisor provides tools for analyzing utilization data to calculate trends from the data and to combine existing utilization data with projected trends to produce forecasts.

Determining Trends

Determining trends from collected utilization data can be a challenging task. Accurate trend analysis requires adequate historical data and an understanding of the cyclic nature of the data being analyzed as well as any special events that might be found in the historical data.

  • Trends are frequently small values, on the order of a few percent or a fraction of a percent per month.

  • The cyclic data can easily be orders of magnitude greater than the trend (heavy calculations the day before payroll distribution, floods of users logging on after work on the East Coast, and so on).

  • Special events can also be orders of magnitude greater than the trend (seasonal promotions, once-per-year calculations such as taxes).

Any algorithmic analysis must be able to deal with these problems. To provide data for a linear regression, HP Capacity Advisor aggregates points based on known business cycles to deal with cyclic patterns, and excludes points to deal with special events.

Aggregation of Points in Business Period Bins

To reduce the impact of cyclic changes in the historical data, a user-specified business period is used to break the data into time-interval based “bins” and each bin is then represented by a single point. The point can be the average, the peak, or the 90th percentile of the data (90% of the points are less than the value). A bin will not be used unless the percent of points within the bin that are valid exceeds the threshold you have specified.
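The binning described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Capacity Advisor's implementation: the function name `aggregate_bins`, its parameters, the `(timestamp, value, is_valid)` sample layout, and the nearest-rank percentile rule are all hypothetical choices made for the example.

```python
import math
from datetime import datetime, timedelta

def aggregate_bins(samples, start, period, method="average", validity_threshold=0.5):
    """Group (timestamp, value, is_valid) samples into business-period bins.

    Returns a list of (bin_start, aggregate) pairs for the bins whose
    fraction of valid points exceeds validity_threshold.
    All names here are illustrative assumptions.
    """
    bins = {}
    for ts, value, is_valid in samples:
        index = (ts - start) // period  # which business period the sample falls in
        bins.setdefault(index, []).append((value, is_valid))

    points = []
    for index in sorted(bins):
        entries = bins[index]
        valid = sorted(v for v, ok in entries if ok)
        # A bin is used only if its share of valid points exceeds the threshold.
        if not valid or len(valid) / len(entries) <= validity_threshold:
            continue
        if method == "average":
            agg = sum(valid) / len(valid)
        elif method == "peak":
            agg = valid[-1]
        elif method == "percentile90":
            # Nearest-rank 90th percentile (one of several common definitions).
            agg = valid[math.ceil(0.9 * len(valid)) - 1]
        else:
            raise ValueError(f"unknown method: {method}")
        points.append((start + index * period, agg))
    return points
```

Each returned pair stands for one business period; bins with too few valid points are simply absent from the result, which is why a report can end up with fewer aggregate points than business periods.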

IMPORTANT: A trend will not be calculated unless at least two bins with an adequate percentage of valid points exist within the range of data being analyzed.

Choosing an Appropriate Business Interval

It is crucial to have a significant amount of data for analysis. Choosing an appropriate business interval with a data collection period that is long enough helps to ensure that you have enough data for a useful analysis. For example, a business period of 1 week and data collection period of 1 month provides only four aggregate data points. This is insufficient to provide meaningful results.

To improve results, for this example, use a business interval of 1 day with a data collection of 1 month to provide 30 data points, or use a business interval of 1 week with a data collection of 6 months to provide 26 data points. Modifying the business interval and/or the data collection period gives you more flexibility in arriving at a significant amount of data for analysis.
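The arithmetic behind these examples is a simple division. The sketch below assumes a month of 30 days and six months of 182 days; the function name `max_aggregate_points` is hypothetical and is used only for illustration.

```python
def max_aggregate_points(collection_days, business_interval_days):
    """Upper bound on aggregate points: whole business intervals in the collection period."""
    return collection_days // business_interval_days

print(max_aggregate_points(30, 7))    # 1 month of data, 1-week interval  -> 4
print(max_aggregate_points(30, 1))    # 1 month of data, 1-day interval   -> 30
print(max_aggregate_points(182, 7))   # ~6 months of data, 1-week interval -> 26
```

The result is an upper bound, because intervals that fail the validity threshold do not contribute a point.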

Exclusion of Points

You can set the report period to exclude a special event or mark the time period invalid to exclude points collected during that period from a trend analysis.

Factors That Affect Data Validity

Within any data collection period, events can occur in the polled systems that affect the quality of data available during that time period. Capacity Advisor identifies data points that could adversely affect the quality and validity of report results.

The following are examples of events that Capacity Advisor can recognize (and disregard) as potential sources of invalid points:

  • System downtime during the collection period.

  • Out-of-the-ordinary activity designated by you. You can manually designate time periods as invalid when you know resource usage has been outside the norm that you want to consider in your capacity planning. (See “The Graph” section of the Profile Viewer help topic in Capacity Advisor Help for hints on how to do this.)

  • Partial collection from a virtual machine or a VM host. When Capacity Advisor is unable to apply a correction that accounts for all activity on a VM host, it marks any partial data collection as invalid.

How This Relates to Setting a Validity Threshold

The Validity Threshold that you set should reflect your tolerance for obtaining a sufficient amount of valid data in the collection period that you designate. If the reports that you run show that the given validity threshold is not obtainable for the designated time period, this may indicate that many of the data points in the designated collection period are invalid.

In this case, you can choose a lower Validity Threshold with the understanding that the report outcome may be a less reliable indicator of probable resource usage, or you can select a different or longer data collection period to improve the likelihood of obtaining a sufficient percentage of valid points for a good report.

Linear Regression

The linear regression is based on a least squares fit that minimizes the sum of the squares of the vertical offsets between each of the aggregate points and the trend line that describes them.
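For reference, an ordinary least squares fit of a line to the aggregate points can be written as follows. This is the textbook closed-form solution, shown as a sketch; the function name and the choice of plain Python lists are assumptions, and the source does not describe how Capacity Advisor implements the fit internally.

```python
def least_squares_line(xs, ys):
    """Fit y = slope * x + intercept by minimizing the sum of squared vertical offsets."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two points, with matching x and y lengths")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

Here `xs` would be the bin times (for example, days since the start of the report) and `ys` the aggregate value of each bin.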

TIP: Regressions performed over small data sets are not always meaningful and can be misleading. Any trend analysis based on fewer than a dozen aggregate points should be carefully compared with the historical data to see if it "makes sense." The maximum number of data points for the trend analysis is the total time for the report divided by the business period; the actual number can be lower, because business periods are excluded if they do not meet the validity criteria.

Because the trend is reported as an annual growth rate, it is best to have more than a year of historical data before trying to analyze trends.

Error Analysis

You can choose to include error analysis in the report. The following error value is available:

r-squared: r² is the square of the correlation coefficient (r), and is used in the 'goodness of fit' analysis of trend estimations. r is a value between -1 and +1; values of r approaching -1 or +1 (and therefore values of r² approaching 1) indicate that the trend line represents the data increasingly well.
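As an illustration, r² can be computed directly from the aggregate points using the standard definition of the Pearson correlation coefficient. The function name is hypothetical, and this sketch is not taken from the product.

```python
def r_squared(xs, ys):
    """Square of the Pearson correlation coefficient between xs and ys."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two points, with matching x and y lengths")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when all x or all y values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)
```

Points that lie exactly on a straight line give a value of 1, while points with no linear relationship give a value near 0.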

Forecast Calculations

HP Capacity Advisor forecasting allows you to combine a range of historical data (the forecast data range) with a predicted trend (the annual projected growth rate) to produce a forecast model. The forecast model can be used to provide an estimate of future utilization.

The Forecast Model Hierarchy

The forecast model can be specified at four different levels within Capacity Advisor, with more specific forecast models overriding more general models, as indicated in the following table:

Table 3-5 Forecast Models

Forecast                    Description                                       Overrides
--------------------------  ------------------------------------------------  --------------------------
Global Forecast             Applies to all workloads in Capacity Advisor for  Nothing
                            which a more specific forecast is not provided.
Workload Forecast           Applies to a specific workload in Capacity        Global
                            Advisor unless a more specific forecast is
                            provided.
Scenario Forecast           Applies to all workloads within a Capacity        Global, Workload
                            Advisor scenario for which a more specific
                            forecast is not provided.
Scenario Workload Forecast  Applies to a specific workload within a Capacity  Global, Workload, Scenario
                            Advisor scenario.
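
The override order in Table 3-5 amounts to picking the most specific forecast that has been defined. A minimal sketch, assuming each level is either a forecast object or `None` when not set (the function and parameter names are hypothetical):

```python
def effective_forecast(global_fc, workload_fc=None, scenario_fc=None,
                       scenario_workload_fc=None):
    """Return the most specific forecast that is defined, per Table 3-5."""
    # Checked from most specific to most general.
    for forecast in (scenario_workload_fc, scenario_fc, workload_fc, global_fc):
        if forecast is not None:
            return forecast
    return None
```

For example, a workload with its own forecast uses that forecast outside a scenario, but a scenario forecast takes precedence when the workload is evaluated inside that scenario.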

 

Forecast Data Range

The forecast data range defines the historical data that is combined with the annual projected growth rate to produce the forecast model. The forecast data range can be specified as:

  • A fixed interval ending on a specific date

  • A fixed interval beginning on a specific date

  • The time interval between two dates

  • A fixed interval ending on the last full day of data collection
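
Each of the four forms resolves to a concrete start and end date. The sketch below shows one way to express that; the function `resolve_data_range`, its parameters, and the treatment of the end points are assumptions made for illustration and do not reflect the product's interface.

```python
from datetime import date, timedelta

def resolve_data_range(interval=None, start=None, end=None, last_full_day=None):
    """Return (start, end) for one of the four ways to specify a data range."""
    if interval is None and start is not None and end is not None:
        return start, end                                # between two dates
    if interval is not None and end is not None and start is None:
        return end - interval, end                       # interval ending on a date
    if interval is not None and start is not None and end is None:
        return start, start + interval                   # interval beginning on a date
    if interval is not None and last_full_day is not None:
        return last_full_day - interval, last_full_day   # interval ending on last full day
    raise ValueError("unsupported combination of arguments")
```

The first three forms give a range that stays fixed in time; the last form moves forward as new data is collected.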

Annual Projected Growth Rate

The annual projected growth rate is specified in percent and can be positive for increasing utilization, negative for decreasing utilization or zero for no change. No change is the default. Separate rates can be specified for memory and CPU growth.

Combining the Data Range with the Annual Growth Rate

The forecast is applied point-by-point to the historical data within the range you have specified. It is applied linearly, so that a point 1 year from the starting point of a forecast is the result of the full growth rate being applied to the data. The data within the range you have provided is used to “tile” the future: the data set is repeated until the desired end point is reached, and each point receives the portion of the growth rate appropriate to its distance in time from the starting point of the forecast.
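
The tiling can be sketched as follows. This is a hedged illustration rather than the product's algorithm: it assumes evenly spaced samples, a 365-day year, growth measured from the first forecast point, and growth applied as a multiplicative factor. The function name `tile_forecast` and its parameters are hypothetical.

```python
def tile_forecast(history, step_days, annual_growth_pct, horizon_days):
    """Repeat `history` into the future, scaling each point by linear growth.

    history           -- utilization values sampled every step_days
    annual_growth_pct -- projected growth per year, in percent (may be negative)
    horizon_days      -- how far into the future to forecast
    """
    if not history:
        raise ValueError("history must contain at least one point")
    forecast = []
    n_points = int(horizon_days // step_days)
    for i in range(n_points):
        elapsed_years = (i * step_days) / 365.0
        # Linear application: the full rate is reached exactly one year out.
        factor = 1.0 + (annual_growth_pct / 100.0) * elapsed_years
        # Tiling: wrap around the historical data set as often as needed.
        forecast.append(history[i % len(history)] * factor)
    return forecast
```

With a 10% annual growth rate, a historical value of 10.0 that recurs exactly one year into the forecast becomes 11.0; with a growth rate of zero, the forecast is the historical data repeated unchanged.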

© 2006-2008 Hewlett-Packard Development Company, L.P.