Description
A fundamental part of the digital reporting pipeline that the Rune DSL supports is ingestion - the process of reading serial data in any format, such as FpML, and creating a corresponding CDM object that represents the same data in a uniform way. In the past, modellers could define this transformation from serial data to a CDM object by using synonyms. However, three problems have caused us to rethink the way in which we can define these transformations.
- Synonyms are non-transparent. Figuring out where a particular CDM field comes from often requires deep knowledge and analysis of the synonym structures. Certain features, such as string-based matching and implicit skipping of synonym levels, make a transformation hard to navigate, read and understand. Additionally, their syntax is inconsistent with similar features such as functions and reporting rules, adding to their obscurity.
- Synonyms are error-prone. Since synonyms use string-based matching at runtime, typos and copy-paste errors can easily sneak in, especially during refactoring of a model or version changes of the serial schema.
- Synonyms are closed source. Although their syntax is part of the open-source Rune DSL, the synonym code generator is closed source. To make the CDM truly open source, ingestion should be too.
To address these problems, ingestion has been split into two separate tasks:
- XSD importing: by directly representing the serial schema (e.g., XSD) as a Rune model, deserialization becomes trivial, spelling errors can be caught, and type errors can be detected during modelling.
- Translation by functions: by using functions instead of synonyms, the transformation is consistent with other pipeline steps, in which no implicit steps exist. Additionally, it becomes possible to navigate the transformation by clicking through function definitions, increasing readability and transparency of the transformation. Since functions are fully open source, this also addresses the third problem.
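As a rough illustration of this two-step split, here is a minimal Python sketch (not Rune DSL, and not the actual generated code). The types `SerialTrade` and `TargetTrade`, their fields, and the sample XML are all hypothetical stand-ins: the first plays the role of a type imported from a serial schema, the second the role of a CDM type. The point is only that deserialization becomes a mechanical mapping onto a typed model, while translation is an ordinary, navigable function with no implicit steps.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


# Hypothetical stand-in for a type imported from a serial schema (e.g. an XSD).
@dataclass
class SerialTrade:
    trade_id: str
    notional: float
    currency: str


# Hypothetical stand-in for a type in the target model (e.g. the CDM).
@dataclass
class TargetTrade:
    identifier: str
    quantity: float
    unit: str


def deserialize(xml_text: str) -> SerialTrade:
    """Step 1: map serial data one-to-one onto the imported, typed model."""
    root = ET.fromstring(xml_text)
    return SerialTrade(
        trade_id=root.findtext("tradeId"),
        notional=float(root.findtext("notional")),
        currency=root.findtext("currency"),
    )


def translate(src: SerialTrade) -> TargetTrade:
    """Step 2: an explicit function from the imported model to the target model.

    Every target field names its source attribute directly, so a misspelled
    attribute is flagged by a type checker or IDE instead of silently
    producing an empty field at runtime.
    """
    return TargetTrade(
        identifier=src.trade_id,
        quantity=src.notional,
        unit=src.currency,
    )


def ingest(xml_text: str) -> TargetTrade:
    """Ingestion is the composition of deserialization and translation."""
    return translate(deserialize(xml_text))


SAMPLE = (
    "<trade><tradeId>T-1</tradeId>"
    "<notional>1000000</notional><currency>EUR</currency></trade>"
)
```

Calling `ingest(SAMPLE)` walks through both steps. In the real pipeline, both the imported model and the translation would be written in Rune and checked during modelling rather than in Python.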
The graph below shows where these two parts, deserialization and translation, sit in the overall pipeline.
graph LR
A["Serial format
(JSON/XML/CSV)"] -->|"deserialization"| B["Imported Rune model
(FpML)"]
A -->|"ingestion"| C["CDM"]
B -->|"translate"| C
C -->|"reporting"| D["DRR"]
D -->|"projection"| E["Imported Rune model
(ISO 20022)"]
E -->|"serialization"| F["Serial format
(XML)"]
By moving from synonyms to functions and expressions, however, we lose some expressiveness and reusability through synonym overriding. This issue outlines the missing requirements and provides a solution proposal for each of them.
Requirements and proposals
See sub-issues below.