A StreamExtractor capable of extracting RDF items from a list of files. Each file is expected to contain a single RDF item.
Inspired by https://fs2.io/#/getstarted/example
- Value parameters:
- charset
Charset used to decode the contents of the requested files
- concurrentItems
Maximum number of items to be extracted and parsed for RDF in parallel. Set it to 1 for sequential execution. Higher values do not necessarily translate into performance improvements, so tune this setting with care.
- files
List of files to be processed, represented by their paths
- format
Format of the RDF data arriving from the Stream. The Extractor expects all data items to share the same format.
- inference
Inference of the RDF data arriving from the Stream. The Extractor expects all data items to share the same inference.
- Note:
StreamExtractor's type parameter is set to String, since data read from files is interpreted as Strings
- Companion:
- object
- Source:
- FileExtractor.scala
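The effect of concurrentItems described above can be illustrated with a minimal sketch that uses only the Scala standard library. It is not the extractor's actual implementation, which builds on fs2 streams. The names `BoundedParallelism` and `parseAll` are hypothetical, and the sketch returns results in input order, which the real extractor may or may not do.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

// Hypothetical helper, not part of the documented API.
object BoundedParallelism {
  // Applies `parse` to every item, with at most `concurrentItems` running at once.
  // With concurrentItems = 1 the items are processed sequentially.
  def parseAll[A, B](items: List[A], concurrentItems: Int)(parse: A => B): List[B] = {
    require(concurrentItems >= 1, "concurrentItems must be at least 1")
    // The fixed-size pool is what caps the degree of parallelism.
    val pool = Executors.newFixedThreadPool(concurrentItems)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      val futures = items.map(item => Future(parse(item)))
      // Future.sequence keeps the results in the order of the input list.
      Await.result(Future.sequence(futures), Duration.Inf)
    } finally pool.shutdown()
  }
}
```

A pool larger than the number of available cores, or larger than what the disk can serve, adds scheduling overhead without adding throughput, which is one reason high values may not help.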
Value members
Inherited methods
Check the user-controlled inputs to this extractor, preventing its creation if they are invalid
- Throws:
- IllegalArgumentException
On invalid extractor parameters
- Inherited from:
- StreamExtractor
- Source:
- StreamExtractor.scala
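As a rough illustration of how such a check can prevent construction, the sketch below relies on Scala's `require`, which throws IllegalArgumentException when its condition is false. The case class `FileExtractorInputs`, the method name `checkInputs` and the two conditions are assumptions made for this example; the conditions the real extractor enforces are not listed on this page.

```scala
import java.nio.charset.Charset
import java.nio.file.Path

// Hypothetical container for the user-controlled inputs.
final case class FileExtractorInputs(
    files: List[Path],
    charset: Charset,
    concurrentItems: Int
)

// Hypothetical validation logic; the real checks may differ.
object InputChecks {
  // Each failed `require` raises IllegalArgumentException with the given message.
  def checkInputs(in: FileExtractorInputs): Unit = {
    require(in.files.nonEmpty, "At least one file is required")
    require(in.concurrentItems >= 1, "concurrentItems must be at least 1")
  }
}
```

Calling a check like this from the constructor body means an invalid instance is never handed to the caller.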
Concrete fields
Get the initial input stream by taking the list of files, reading the bytes of each one, and decoding them according to charset
- Note:
Parallelism in file reading is attempted via prefetch
- Source:
- FileExtractor.scala
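The read-then-decode step can be sketched with the Java NIO API alone. This is a simplified stand-in, not the extractor's code: it uses a plain `Iterator` instead of an fs2 stream and does no prefetching. `FileDecoding`, `readItem` and `inputItems` are hypothetical names.

```scala
import java.nio.charset.Charset
import java.nio.file.{Files, Path}

// Hypothetical helper, not part of the documented API.
object FileDecoding {
  // Reads all bytes of one file and decodes them with the given charset.
  def readItem(file: Path, charset: Charset): String =
    new String(Files.readAllBytes(file), charset)

  // Lazily maps each path to its decoded contents: one String per file.
  // A file is only read when the iterator reaches it.
  def inputItems(files: List[Path], charset: Charset): Iterator[String] =
    files.iterator.map(readItem(_, charset))
}
```

Because each file holds a single RDF item, every element produced this way corresponds to exactly one item to be parsed downstream.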
Inherited fields
The initial inputStream, transformed through toDataItems to get a stream of RDF Items
- Inherited from:
- StreamExtractor
- Source:
- StreamExtractor.scala