+
Skip to content

Conversation

poopsiclepooding
Copy link

@poopsiclepooding poopsiclepooding commented Sep 22, 2025

What does this implement/fix? Explain your changes.

This code implements fABBA transform as found in https://github.com/nla-group/fABBA/tree/master .
fABBA is a time series symbolic representation transform, it compresses and digitizes the time series
into series of symbols.
Have added a python implementation of fABBA along with example usage.

Depends on #8959 for the unequal length output handling, #8959 should be merged first.

Does your contribution introduce a new dependency? If yes, which one?

No new dependency.

What should a reviewer concentrate their feedback on?

Any formatting or major code issues. Any errors with particular combination of hyperparameters

Did you add any tests for the change?

Haven't added any particular test. The example provided illustrates the working of algorithm. If any test suggestions are there I can add them.

Any other comments?

Original fABBA implementation was really helpful to write this - https://github.com/nla-group/fABBA/tree/master

For all contributions
  • I've added myself to the list of contributors with any new badges I've earned :-)
    How to: add yourself to the all-contributors file in the sktime root directory (not the CONTRIBUTORS.md). Common badges: code - fixing a bug, or adding code logic. doc - writing or improving documentation or docstrings. bug - reporting or diagnosing a bug (get this plus code if you also fixed the bug in the PR).maintenance - CI, test framework, release.
    See here for full badge reference
  • Optionally, for added estimators: I've added myself and possibly to the maintainers tag - do this if you want to become the owner or maintainer of an estimator you added.
    See here for further details on the algorithm maintainer role.
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.
For new estimators
  • I've added the estimator to the API reference - in docs/source/api_reference/taskname.rst, follow the pattern.
  • I've added one or more illustrative usage examples to the docstring, in a pydocstyle compliant Examples section.

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@fkiraly fkiraly changed the title [ENH] Added fABBA Transform [ENH] fABBA Transform Sep 22, 2025
Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Welcome to open source!

Some comments above.

Further questions/requests:

@fkiraly fkiraly added module:transformations transformations module: time series transformation, feature extraction, pre-/post-processing enhancement Adding new functionality labels Sep 22, 2025
@poopsiclepooding
Copy link
Author

Will look into code formatting.
The implementation in original branch doesn't work for multiple time series of different lengths so I had changed that.
Also I wanted option to export symbols as int(labels of clusters) rather than just char so implemented that as well.
I used the original repo as reference for main algorithms.

@fkiraly
Copy link
Collaborator

fkiraly commented Sep 22, 2025

I used the original repo as reference for main algorithms.

I see - we should cite it and the paper then.

@poopsiclepooding
Copy link
Author

Should I add the repo citation below the paper citation or somewhere else as well?

@fkiraly
Copy link
Collaborator

fkiraly commented Sep 22, 2025

Should I add the repo citation below the paper citation or somewhere else as well?

Once in References, and then from the main text, I would say.

@chenxinye
Copy link

chenxinye commented Sep 25, 2025

Hi @fkiraly and @poopsiclepooding,

Thanks @poopsiclepooding for implementing it, later I will also update my repo to support length variability for time series.

I think my paper [1] has mentioned this method can be easily extended to arbtrary length time series:

I suggest the citations can be:
[1] X. Chen, 2024. Joint symbolic aggregate approximation of time series
[2] X. Chen and S. Güttel., 2024. fABBA: A Python library for the fast symbolic approximation of time series. Journal of Open Source Software
[3] X. Chen and S. Güttel. 2023. An Efficient Aggregation Method for the Symbolic Representation of Temporal Data. ACM Trans. Knowl. Discov. Data
[4] S. Elsworth and S. Güttel. 2023. ABBA: adaptive Brownian bridge-based symbolic aggregation of time serie

Thanks,
Xinye

@poopsiclepooding
Copy link
Author

Thanks a lot @chenxinye . I will add those citations

@fkiraly
Copy link
Collaborator

fkiraly commented Sep 27, 2025

The failures seem to indicate that a single time series is getting transformed intio multiple. Is this expeted behaviour? If yes, we need to set tags differently.

@poopsiclepooding
Copy link
Author

Yes that is expected behaviour. If we set partition hyperparameter then time series is broken into parts before apply fABBA and each part's symbolic representation is returned. What should I change the tags to?

@fkiraly
Copy link
Collaborator

fkiraly commented Sep 30, 2025

there is currently no valid tag for this behaviour, we will have to look into it on framework level. May I suggest to skip the failing tests for now via tests:skip_by_namd and raise an issue that explicitly refers to the issue:

if an mtype is passed that can only represent equal length, and the transform changes it to unequal length, the transform tries to convert it back to the original mtype which fails. We need to have a design decision on this.

@fkiraly
Copy link
Collaborator

fkiraly commented Oct 10, 2025

@poopsiclepooding, I have updated the framework to handle transformers that turn series into unequal length, here: #8959

I will merge #8959 into this Pr and the tests for the fabba transformer should pass now.

FYI, I changed the name to FABBA since classes need to start with capital letters.

@fkiraly
Copy link
Collaborator

fkiraly commented Oct 10, 2025

I think the example is now failing, see the failing tests.

@poopsiclepooding
Copy link
Author

I think the example is now failing, see the failing tests.

Yes fixed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement Adding new functionality module:transformations transformations module: time series transformation, feature extraction, pre-/post-processing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载