
How to generate forecasts with prediction_length > 64? #40

@clevilll

Hi,

I have time-series data and split it into train and test sets (keeping the test set unseen) by slicing the DataFrame from the end. I ran your pipeline on data_train and tried to forecast a horizon as long as data_test, without success, as shown below:

#-----------------------------------------------------------
# Libs
#-----------------------------------------------------------
# for plotting, run: pip install pandas matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import torch
from chronos import ChronosPipeline

#-----------------------------------------------------------
# LOAD THE DATASET
#-----------------------------------------------------------

df = pd.read_csv('https://raw.githubusercontent.com/amcs1729/Predicting-cloud-CPU-usage-on-Azure-data/master/azure.csv')
df['timestamp'] =  pd.to_datetime(df['timestamp'])
data = df.rename(columns={'min cpu': 'min_cpu',
                          'max cpu': 'max_cpu',
                          'avg cpu': 'avg_cpu',})



# Data preparation
# ==============================================================================
sliced_df = data[['timestamp', 'avg_cpu']].copy()

# Convert data from Hz to MHz
# ==============================================================================
sliced_df['avg_cpu_Mhz'] = sliced_df['avg_cpu'] / 1000000
sliced_df

# Configuration
# ==============================================================================
name_columns='avg_cpu_Mhz'
lags=288
steps=288
n_backtest=3

step_size = steps * n_backtest
data_train = sliced_df[:-step_size]
data_test  = sliced_df[-step_size:] #unseen

# Pipeline
# ==============================================================================
pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-small",
    device_map="cuda",
    torch_dtype=torch.bfloat16,
)

# context must be either a 1D tensor, a list of 1D tensors,
# or a left-padded 2D tensor with batch as the first dimension
context = torch.tensor(data_train['avg_cpu_Mhz'].values)
prediction_length = 64 #len(data_test) #12

forecast = pipeline.predict(
    context,
    prediction_length,
    num_samples=288, #20,
    temperature=1.0,
    top_k=50,
    top_p=1.0,
) # forecast shape: [num_series, num_samples, prediction_length]

but the result is as follows:

# visualize the forecast
forecast_index = range(len(data_train), len(data_train) + prediction_length)
low, median, high = np.quantile(forecast[0].numpy(), [0.1, 0.5, 0.9], axis=0)

plt.figure(figsize=(8, 4))
plt.plot(data_train['avg_cpu_Mhz'], color="royalblue", label="historical train data")
plt.plot(data_test['avg_cpu_Mhz'] , color="navy",      label="historical test data", linestyle='dashed')
plt.plot(forecast_index, median,    color="tomato",    label="median forecast")
plt.fill_between(forecast_index, low, high, color="tomato", alpha=0.3, label="80% prediction interval")

plt.title('Chronos forecast result')
plt.ylabel(' CPU usage [MHz]',   fontsize=15)
plt.xlabel('Timestamp', fontsize=15)
plt.legend()
plt.grid()
plt.show()

[Plot: Chronos forecast result]

  • How can I configure the arguments of the predict() method to forecast autoregressively over the unseen data_test? (See the sketch below for what I have in mind.)
  • Why is prediction_length recommended to be <= 64?
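For reference, this is the kind of rolling loop I have in mind (a minimal sketch, not tested: it assumes that feeding the median of each 64-step forecast back into the context is a reasonable way to extend the horizon; the helper name rolling_forecast and the chunking scheme are mine, not part of the Chronos API):

import numpy as np
import torch

def rolling_forecast(pipeline, context, total_length, chunk=64, num_samples=20):
    """Forecast `total_length` steps by repeatedly predicting <=64-step chunks
    and appending the median of each chunk back to the context."""
    context = context.clone()
    medians = []
    remaining = total_length
    while remaining > 0:
        horizon = min(chunk, remaining)
        # forecast shape: [num_series, num_samples, horizon]
        forecast = pipeline.predict(context, horizon, num_samples=num_samples)
        median = np.quantile(forecast[0].numpy(), 0.5, axis=0)
        medians.append(median)
        # feed the point forecast back in as if it were observed history
        context = torch.cat([context, torch.tensor(median, dtype=context.dtype)])
        remaining -= horizon
    return np.concatenate(medians)

# e.g. to cover the whole unseen test window:
# preds = rolling_forecast(pipeline, context, total_length=len(data_test))

This obviously collapses the uncertainty at each step (only the median is fed back), so prediction intervals for later chunks would be too narrow. I am mainly asking whether there is a built-in or recommended way to do this with predict().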
