org.ragna.comet.stream.extractors.kafka

Type members

Classlikes

case class KafkaExtractor[K, V](configuration: KafkaExtractorConfiguration, format: DataFormat, inference: InferenceEngine, concurrentItems: Int, itemTimeout: Option[FiniteDuration])(implicit keyDeserializer: Deserializer[IO, K], valueDeserializer: Deserializer[IO, V], toRdfElement: V => RDFElement) extends StreamExtractor[V]

StreamExtractor capable of extracting RDF items from an Apache Kafka stream of items

Kafka streams are operated as FS2 streams thanks to the FS2-Kafka library

Type parameters:
K

Type to which kafka-stream keys are deserialized

V

Type to which kafka-stream values are deserialized

Value parameters:
concurrentItems

Maximum number of items to be parsed for RDF in parallel (to parallelize the pulling of items from Kafka, see configuration.concurrentItems)

configuration

Configuration supplied to this extractor to interact with a Kafka stream

format

Format of the RDF data arriving from the stream; the extractor expects all data items to share the same format

inference

Inference of the RDF data arriving from the stream; the extractor expects all data items to share the same inference

keyDeserializer

Deserializer used to map the incoming stream keys to instances of type K

valueDeserializer

Deserializer used to map the incoming stream values to instances of type V

Throws:
KafkaException

When the underlying Kafka consumer cannot be built or the connection fails

Note:

Although incoming stream items can be deserialized into any format V (as long as a KafkaDeserializer is provided for type V), they'll eventually be converted to RDFElements in order to parse RDF from them

Companion:
object
Source:
KafkaExtractor.scala
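
Based on the signature above, constructing an extractor might look as follows. This is a sketch, not a verified snippet: the `DataFormat` and `InferenceEngine` member names (`Turtle`, `NoInference`) and the `String => RDFElement` conversion (`toRdfElement`) are assumptions not confirmed by this page, and the implicit deserializers are the plain `String` instances provided by FS2-Kafka.

```scala
import scala.concurrent.duration._
import cats.effect.IO
import fs2.kafka.Deserializer

// Assumed: keys and values are both deserialized to String, and the
// String values contain Turtle-serialized RDF.
val config = KafkaExtractorConfiguration(topic = "rdf-data")

val extractor = KafkaExtractor[String, String](
  configuration = config,
  format = DataFormat.Turtle,            // assumed member name
  inference = InferenceEngine.NoInference, // assumed member name
  concurrentItems = 4,                   // parse up to 4 items in parallel
  itemTimeout = Some(10.seconds)         // give up on an item after 10s
)(
  // Explicitly supplying the implicit parameter list for clarity;
  // in practice these would usually be resolved implicitly.
  keyDeserializer = Deserializer[IO, String],
  valueDeserializer = Deserializer[IO, String],
  toRdfElement = str => toRdfElement(str) // assumed conversion helper
)
```

Since `toRdfElement` is an implicit `V => RDFElement`, choosing `V = String` keeps the conversion trivial; a custom `V` (e.g. a case class decoded from Avro) would need its own conversion into an RDFElement before parsing.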
sealed case class KafkaExtractorConfiguration(topic: String, server: String, port: Int, groupId: String, offsetReset: AutoOffsetReset, autoCommit: Boolean, sessionTimeout: FiniteDuration, commitRecovery: CommitRecovery, commitTimeout: FiniteDuration, commitInterval: FiniteDuration, pollTimeout: FiniteDuration, pollInterval: FiniteDuration, concurrentItems: Int)

Configuration and parameters related to how the program should interact with a Kafka stream (stream origin location, committing, polling, etc.)

Defaults are provided for most data, yet a target topic is required

Value parameters:
autoCommit

Whether to automatically commit offsets periodically (see commitInterval) or not

commitInterval

Time to wait before performing an offset commit operation

commitRecovery

Strategy to be followed when a commit operation fails

commitTimeout

Time to wait for offsets to be committed before raising an error

concurrentItems

Maximum number of items to be extracted and parsed for RDF in parallel (set it to 1 for sequential execution; bear in mind that high values won't necessarily translate into performance improvements unless you know what you are doing)

groupId

Id of the group to which the underlying Kafka consumer belongs

offsetReset

Strategy to be followed when there are no available offsets to consume on the consumer group

pollInterval

Time interval between poll operations

pollTimeout

How long a stream polling operation is allowed to block

port

Port on which the server is broadcasting

server

Hostname or IP address of the server broadcasting data

sessionTimeout

Consumer timeout for connecting to Kafka or processing incoming items; some Kafka servers may require session timeouts within certain limits

topic

Name of the topic from which data will be extracted

Note:

Commit-behaviour settings (commitRecovery, commitTimeout, commitInterval) are meaningless if autoCommit is set to false

Some restrictions are in place for configuration instances: invalid values for the configuration data (e.g., an invalid port, or zero-length timeouts/intervals) will not be accepted (see testConfiguration)

Companion:
object
Source:
KafkaExtractorConfiguration.scala
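
Since defaults are provided for everything except the topic, configurations can be built with named arguments. The sketch below assumes the defaults are ordinary default argument values on the case class; the `AutoOffsetReset` import comes from FS2-Kafka, and the broker hostname is an illustrative placeholder.

```scala
import scala.concurrent.duration._
import fs2.kafka.AutoOffsetReset

// Minimal configuration: only the target topic is required,
// every other field falls back to the library's defaults.
val minimal = KafkaExtractorConfiguration(topic = "rdf-data")

// Overriding a few selected defaults:
val tuned = KafkaExtractorConfiguration(
  topic = "rdf-data",
  server = "broker.example.org",          // placeholder hostname
  port = 9092,                            // standard Kafka port
  groupId = "comet-consumers",
  offsetReset = AutoOffsetReset.Earliest, // replay from earliest offset
  autoCommit = true,                      // commit offsets periodically...
  commitInterval = 5.seconds,             // ...every 5 seconds
  concurrentItems = 2                     // pull 2 items in parallel
)
```

Per the note above, the commit-related fields only take effect when autoCommit is enabled, and invalid values (such as a negative port or zero-length intervals) are rejected by the configuration's own validation.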

Companion object with utils used in Kafka extractor configurations

Companion:
class
Source:
KafkaExtractorConfiguration.scala