Support external annotations files to allow selective loading and avoid memory issues

We're working on [PechaData](https://github.com/PechaData), a multilingual Buddhist corpus project in collaboration with [bdrc.io](http://bdrc.io/) and [pecha.org](http://pecha.org/). As a format, STAM is a dream for our use case, and we're starting to build on top of it, with a mechanism to update annotation coordinates when the base text changes.

However, our dataset includes many large texts (>10 MB .txt files) with multiple annotation layers, which are often larger than the text itself. We are concerned about performance if all annotations have to be loaded into memory even when we only need a couple of annotation sets. For example, we have a file with 15 annotation sets, including POS tags and dependencies, but we only need the text and the annotations for the table of contents.

Have you considered externalizing annotations in separate files, like the .ann files of BrAT, or do you have another way to load annotations selectively? We thought about patching STAM ourselves, but we would much prefer a solution coming from its creators.
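To make the request concrete, here is a rough sketch of the behaviour we have in mind. It is plain Python and does not use STAM's API or file format: the directory layout (`text.txt` plus one JSON file per annotation set under `annotations/`), the `begin`/`end`/`label` fields, and the function names `load_selected` and `build_demo_corpus` are all hypothetical, chosen only for illustration.

```python
import json
import tempfile
from pathlib import Path


def load_selected(corpus_dir, wanted_sets):
    """Read the base text plus only the requested annotation sets.

    Hypothetical layout: <corpus_dir>/text.txt and one file per set at
    <corpus_dir>/annotations/<set name>.json. Sets that are not requested
    are never opened, so they cost no memory.
    """
    corpus_dir = Path(corpus_dir)
    text = (corpus_dir / "text.txt").read_text(encoding="utf-8")
    annotations = {}
    for name in wanted_sets:
        path = corpus_dir / "annotations" / f"{name}.json"
        with path.open(encoding="utf-8") as f:
            annotations[name] = json.load(f)
    return text, annotations


def build_demo_corpus(corpus_dir):
    """Write a tiny demo corpus with two annotation sets (illustration only)."""
    corpus_dir = Path(corpus_dir)
    (corpus_dir / "annotations").mkdir(parents=True, exist_ok=True)
    (corpus_dir / "text.txt").write_text("chapter one\nchapter two\n", encoding="utf-8")
    sets = {
        "toc": [
            {"begin": 0, "end": 11, "label": "chapter"},
            {"begin": 12, "end": 23, "label": "chapter"},
        ],
        "pos": [{"begin": 0, "end": 7, "label": "NOUN"}],
    }
    for name, items in sets.items():
        out = corpus_dir / "annotations" / f"{name}.json"
        out.write_text(json.dumps(items), encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_corpus(tmp)
        # Only the table-of-contents set is read; "pos" stays on disk.
        text, anns = load_selected(tmp, ["toc"])
        for a in anns["toc"]:
            print(a["label"], repr(text[a["begin"]:a["end"]]))
```

The point is only that each annotation set lives in its own file and is opened on demand; how this would map onto STAM's data model is exactly the question above.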

Thanks a lot for your work!

Support external annotations files to allow selective loading and avoid memory issues #21

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Support external annotations files to allow selective loading and avoid memory issues #21

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions