
Conversation


@abdulfatir abdulfatir commented Nov 3, 2025

Issue #, if available: #354

Description of changes: This PR adds Chronos2Pipeline.embed to enable users to easily extract embeddings from the last encoder layer. The API and behavior are similar to what Chronos and Chronos-Bolt provide.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@abdulfatir abdulfatir added the run-eval Run evaluation CI workflow label Nov 3, 2025
Comment on lines +550 to +561
def encode(
self,
context: torch.Tensor,
context_mask: torch.Tensor | None = None,
group_ids: torch.Tensor | None = None,
future_covariates: torch.Tensor | None = None,
future_covariates_mask: torch.Tensor | None = None,
num_output_patches: int = 1,
future_target: torch.Tensor | None = None,
future_target_mask: torch.Tensor | None = None,
output_attentions: bool = False,
):
Contributor

I wish the diff would be more helpful here: is the body of this simply moved from forward?

Contributor Author

Yes, the first (encoding) portion from forward has been factored out into encode.
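The refactor described above can be sketched in plain Python (a toy illustration with made-up names and trivial math, not the actual Chronos-2 code): forward delegates its first step to encode, and an embed-style helper reuses encode without running the rest of the model.

```python
class ToyModel:
    """Toy stand-in for a model whose forward pass is split into
    an encoder step and a decoder step (illustrative only)."""

    def encode(self, context):
        # Stand-in for the encoder: one "hidden state" per input element.
        return [x * 2.0 for x in context]

    def forward(self, context):
        # forward now delegates the encoding step to encode(),
        # then applies the "decoder" (here: a trivial sum).
        hidden = self.encode(context)
        return sum(hidden)

    def embed(self, context):
        # An embed-style helper returns the last encoder-layer
        # output directly, without running the decoder.
        return self.encode(context)


model = ToyModel()
model.embed([1.0, 2.0])    # encoder output only: [2.0, 4.0]
model.forward([1.0, 2.0])  # full pass: 6.0
```

The benefit of the split is that forward and embed share one code path for encoding, so the two cannot drift apart.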

or 2-dimensional of shape (n_variates, history_length). The history_lengths may be different across elements; left-padding
will be applied, if needed.
batch_size
The batch size used for generating embeddings. Note that the batch size here means the total number of time series which are input into the model.
Contributor

I'm not sure this is clear to me: does the batch_size refer to the .shape[0] of the tensors being processed? Or does it span the variates dimension as well? (.shape[1]) I suppose it's the latter, given the docstring for the dataset class:

batch_size
The batch size for training the model. Note that the batch size here means the number of time series, including target(s) and
covariates, that are input into the model. If your data has multiple target and/or covariates, the effective number of time series
tasks in a batch will be lower than this value.

I see this is pretty much the description of batch_size everywhere (here, predict methods, dataset class). Maybe the confusion comes from "total number of time series" instead of "total number of variates", or something like that. But this could also be addressed separately.

Contributor Author

Internally, there's no notion of a variate dimension in the model: only batch and time (patch) axes. The batch_size here refers to the maximum items x (co)-variates per batch. Open to suggestions on a better docstring.
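Under that reading, the effective number of forecasting tasks per batch shrinks with the number of (co)variates. A minimal sketch of the arithmetic (tasks_per_batch is a hypothetical helper for illustration, not part of the library):

```python
def tasks_per_batch(batch_size: int, n_targets: int, n_covariates: int) -> int:
    """How many forecasting tasks fit in one batch when every
    target and covariate series counts toward batch_size."""
    variates_per_task = n_targets + n_covariates
    return batch_size // variates_per_task


# e.g. batch_size=256 with 1 target and 3 covariates per task:
# only 64 tasks actually fit in a batch.
tasks_per_batch(256, 1, 3)
```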

Contributor Author

@abdulfatir abdulfatir left a comment

Thanks @lostella.
