Overview of Ingest 2 requirements #1077

@SimonCockx

A fundamental part of the digital reporting pipeline that the Rune DSL supports is ingestion: the process of reading serial data in any format, such as FpML, and creating a corresponding CDM object that represents the same data in a uniform way. In the past, modellers defined this transformation from serial data to a CDM object using synonyms. However, three problems have caused us to rethink how these transformations are defined.

  1. Synonyms are non-transparent. Figuring out where a particular CDM field comes from often requires deep knowledge and analysis of the synonym structures. Certain features, such as string-based matching and implicit skipping of synonym levels, make a transformation hard to navigate, read and understand. Additionally, their syntax is inconsistent with similar features such as functions and reporting rules, which adds to their obscurity.
  2. Synonyms are error-prone. Since synonyms use string-based matching at runtime, typos and copy-paste errors can easily sneak in, especially when a model is refactored or the serial schema changes version.
  3. Synonyms are closed source. Although their syntax is part of the open-source Rune DSL, the synonym code generator is closed source. To make the CDM truly open source, ingestion should be too.
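
The second problem can be illustrated with a small Python analogy. This is only a sketch of the general failure mode, not of how synonyms are implemented: the path strings, `lookup_by_string` and `TradeHeader` are hypothetical names invented for the example. A misspelled path in a string-based lookup just yields "no match", whereas the same typo against a typed model is an error (and a static type checker would flag it before the code runs).

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical flattened serial document: path -> value.
serial_doc = {"trade/tradeHeader/tradeDate": "2024-01-15"}


def lookup_by_string(doc: dict, path: str) -> Optional[str]:
    """String-based matching: an unknown path is not an error, just 'no match'."""
    return doc.get(path)


@dataclass
class TradeHeader:
    """Hypothetical typed counterpart of the same data."""
    trade_date: str


header = TradeHeader(trade_date="2024-01-15")

# A typo in the path ("tradeDat") goes unnoticed: the lookup returns None.
silently_missing = lookup_by_string(serial_doc, "trade/tradeHeader/tradeDat")

# The same typo against the typed model raises immediately.
try:
    _ = header.trade_dat  # misspelled on purpose
    typo_detected = False
except AttributeError:
    typo_detected = True
```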

To address these problems, ingestion has been split into two separate tasks:

  1. XSD importing: by directly representing the serial schema (e.g., XSD) as a Rune model, deserialization becomes trivial, spelling errors can be caught, and type errors can be detected during modelling.
  2. Translation by functions: by using functions instead of synonyms, the transformation is consistent with other pipeline steps, in which no implicit steps exist. Additionally, it becomes possible to navigate the transformation by clicking through function definitions, increasing readability and transparency of the transformation. Since functions are fully open source, this also addresses the third problem.
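
As a rough Python analogy of this split (not Rune syntax, and not the actual generated code), the sketch below first deserializes XML into a type that mirrors the serial schema field for field, and then translates it with a plain, explicit function. `SourceTrade`, `TargetTrade`, `deserialize`, `translate_trade` and the XML element names are all hypothetical stand-ins for an imported schema type, a CDM type, and the two steps.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class SourceTrade:
    """Stands in for a type imported directly from the serial schema."""
    trade_date: str
    notional: float


@dataclass
class TargetTrade:
    """Stands in for a CDM type."""
    date: str
    quantity: float


def deserialize(xml_text: str) -> SourceTrade:
    # Step 1: mechanical, field-for-field mapping, because the model mirrors the schema.
    root = ET.fromstring(xml_text)
    return SourceTrade(
        trade_date=root.findtext("tradeDate"),
        notional=float(root.findtext("notional")),
    )


def translate_trade(src: SourceTrade) -> TargetTrade:
    # Step 2: every mapping is an explicit expression that a reader can navigate to.
    return TargetTrade(date=src.trade_date, quantity=src.notional)


xml_text = "<trade><tradeDate>2024-01-15</tradeDate><notional>1000000</notional></trade>"
result = translate_trade(deserialize(xml_text))
```

Because `translate_trade` only refers to declared fields, renaming a field in either model surfaces every affected mapping instead of silently dropping it.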

The two new parts of ingestion are illustrated in the graph below.

graph LR
    A["Serial format
    (JSON/XML/CSV)"] -->|"deserialization"| B["Imported Rune model
    (FpML)"]
    A -->|"ingestion"| C["CDM"]
    B -->|"translate"| C
    C -->|"reporting"| D["DRR"]
    D -->|"projection"| E["Imported Rune model
    (ISO 20022)"]
    E -->|"serialization"| F["Serial format
    (XML)"]
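
Read as code, the graph says that the direct "ingestion" edge is the composition of "deserialization" and "translate", and that the full pipeline chains the remaining stages after it. The Python sketch below only mirrors that shape: `compose` and `make_stage` are hypothetical helpers, and each placeholder stage does nothing but record its name.

```python
from functools import reduce
from typing import Any, Callable

Stage = Callable[[Any], Any]


def compose(*stages: Stage) -> Stage:
    """Chain stages left to right: the output of one is the input of the next."""
    def run(value: Any) -> Any:
        return reduce(lambda acc, stage: stage(acc), stages, value)
    return run


def make_stage(name: str) -> Stage:
    """Placeholder stage that appends its name to a trace list."""
    def stage(trace: list) -> list:
        return trace + [name]
    return stage


# Ingestion is the two new steps taken together.
ingestion = compose(make_stage("deserialization"), make_stage("translate"))

# The rest of the pipeline follows the remaining edges of the graph.
pipeline = compose(
    ingestion,
    make_stage("reporting"),
    make_stage("projection"),
    make_stage("serialization"),
)

trace = pipeline([])
```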

By moving from synonyms to functions and expressions, however, we lose some of the expressiveness and reusability that synonym overriding provided. This issue outlines the missing requirements and provides a solution proposal for each of them.

Requirements and proposals

See sub-issues below.

Sub-issues


Labels: enhancement (New feature or request); subject: code generation (This issue is about code generation); subject: model validation (This issue is about validation of Rosetta models, such as the type system); subject: syntax (This issue is about the syntax of Rosetta)
