
Support external annotations files to allow selective loading and avoid memory issues #21

@ngawangtrinley

We're working on PechaData, a multilingual Buddhist corpus project in collaboration with bdrc.io and pecha.org. As a format, STAM is a dream for our project, and we're starting to build on top of it, with a mechanism to update annotation coordinates when the base text is updated.

However, our dataset includes many large texts (>10 MB .txt), each with multiple annotation layers that are often larger than the text file itself. We are concerned about performance if all annotations have to be loaded into memory even when we only need a couple of annotation sets. For example, we have a file with 15 annotation sets, including POS tags and dependencies, but we only need the text and the annotations for the table of contents.

Have you considered externalizing annotations into separate files, like the .ann files of brat, or do you have another way to load annotations selectively? We thought about patching STAM ourselves, but we would much prefer a solution coming from the creators.
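
To make the request concrete, here is a minimal sketch of the selective loading we have in mind. It is plain Python and does not use the STAM API. The file layout (`base.txt` plus one `<set>.annotations.json` per annotation set), the function names, and the annotation fields are all hypothetical and only there for illustration.

```python
import json
import tempfile
from pathlib import Path


def load_selected(corpus_dir, wanted_sets):
    """Read the base text plus only the requested annotation set files."""
    corpus_dir = Path(corpus_dir)
    text = (corpus_dir / "base.txt").read_text(encoding="utf-8")
    annotations = {}
    for name in wanted_sets:
        # Hypothetical layout: one JSON file per annotation set.
        path = corpus_dir / f"{name}.annotations.json"
        annotations[name] = json.loads(path.read_text(encoding="utf-8"))
    return text, annotations


def demo():
    """Build a tiny corpus with two annotation sets and load only one."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "base.txt").write_text(
            "Chapter one. Chapter two.", encoding="utf-8"
        )
        (root / "toc.annotations.json").write_text(
            json.dumps([{"begin": 0, "end": 11, "label": "heading"}]),
            encoding="utf-8",
        )
        (root / "pos.annotations.json").write_text(
            json.dumps([{"begin": 0, "end": 7, "label": "NOUN"}]),
            encoding="utf-8",
        )
        # Only the table-of-contents set is read; the POS file is never opened.
        return load_selected(root, ["toc"])
```

With a layout like this, the cost of opening a text would depend on the annotation sets actually requested rather than on everything attached to it.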

Thanks a lot for your work!
