PRIMAT is an open-source (Apache License 2.0) toolbox for the definition and execution of privacy-preserving record linkage (PPRL) workflows. It offers components for data owners and the central linkage unit that implement state-of-the-art PPRL methods, including Bloom-filter-based encoding and hardening techniques, LSH-based blocking, metric space filtering, post-processing, and more.
PRIMAT is developed by the Database Group of the University of Leipzig, Germany.
- Task of identifying records in different databases referring to the same person
- Protection of sensitive personal information
- Applications in medicine & healthcare, national security and marketing analysis
- Guarantee privacy by minimizing disclosure risk
- Scalability to millions of records
- High linkage quality
- PPRL tool covering the entire PPRL life-cycle
- Flexible definition and execution of PPRL workflows
- Comparative evaluation of PPRL approaches
- Modules for both the data owner and the trusted linkage unit
| Component/Module | Function/Feature | Status |
|---|---|---|
| Data generator & corruptor | Data generation | Implemented |
| | Data corruption | Planned |
| Data cleaning | Split/merge/remove attributes | Implemented |
| | Replace/remove unwanted values | Implemented |
| | OCR transformation | Implemented |
| Encoding | Bloom filter encoding & hardening | Implemented |
| | Support of alternative encoding schemes | Planned |
| Matching | Standard & LSH-based blocking, metric space filtering | Implemented |
| | Threshold-based classification | Implemented |
| | Post-processing | Implemented |
| | Multi-threaded execution | Implemented |
| | Distributed matching | Integration outstanding |
| | Multi-party support, match cluster management | In development |
| | Incremental matching | In development |
| Evaluation | Measures for assessing quality & scalability | Implemented |
| | Masked match result visualization | Integration outstanding |
- Java 11+
- Maven
- Ubuntu (recommended)
The data owner application consists of components for pre-processing (data cleaning and standardization) and for Bloom-filter-based encoding of records containing person-related data.
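To make the Bloom-filter-based encoding mentioned above more concrete, here is a minimal sketch of the general technique: a value is split into character bigrams, and each bigram sets `k` bit positions in a filter of length `m` via double hashing. This is not PRIMAT's API; the class name, method names, padding character, and the choice of MD5 and SHA-1 as hash functions are assumptions made for illustration only. Real deployments typically use keyed hash functions with a secret shared among the data owners, plus additional hardening.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.BitSet;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Illustrative sketch; all names here are hypothetical and not part of PRIMAT's API.
final class BloomFilterSketch {

    // Normalize a value (trim, lower-case) and split it into padded character bigrams.
    static Set<String> bigrams(String value) {
        String padded = "_" + value.trim().toLowerCase(Locale.ROOT) + "_";
        Set<String> grams = new LinkedHashSet<>();
        for (int i = 0; i + 2 <= padded.length(); i++) {
            grams.add(padded.substring(i, i + 2));
        }
        return grams;
    }

    // Interpret the first four bytes of a message digest as an int.
    static int hash(String algorithm, String token) {
        try {
            byte[] digest = MessageDigest.getInstance(algorithm)
                    .digest(token.getBytes(StandardCharsets.UTF_8));
            return ByteBuffer.wrap(digest).getInt();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encode a value into a Bloom filter of m bits, setting k positions per bigram.
    // Position i is (h1 + i * h2) mod m (double hashing).
    static BitSet encode(String value, int m, int k) {
        BitSet filter = new BitSet(m);
        for (String gram : bigrams(value)) {
            long h1 = hash("MD5", gram);
            long h2 = hash("SHA-1", gram);
            for (int i = 0; i < k; i++) {
                filter.set((int) Math.floorMod(h1 + i * h2, (long) m));
            }
        }
        return filter;
    }

    public static void main(String[] args) {
        BitSet a = encode("smith", 1024, 10);
        BitSet b = encode("smyth", 1024, 10);
        System.out.println(a.cardinality() + " and " + b.cardinality() + " bits set");
    }
}
```

Because similar strings share most of their bigrams, their filters share most of their set bits, which is what allows the linkage unit to compare encoded records without seeing the plaintext values.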
To run the data owner application, execute the following command in the primat directory (where `pom.xml` is located):

    mvn clean javafx:run -Dprimat.mainClass=dbs.pprl.toolbox.data_owner.gui.DataOwnerApp

The linkage unit application provides linkage functionality, in particular blocking, similarity calculation, classification, and post-processing. It also includes evaluation facilities for comparing different PPRL workflows in terms of quality (recall, precision, F-measure) and scalability (runtime, reduction ratio).
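As a rough illustration of similarity calculation followed by threshold-based classification, the sketch below compares two bit vectors with the Dice coefficient and declares a match when the similarity reaches a threshold. The Dice coefficient is a common choice for Bloom filters, but its use here is an assumption; the class and method names are hypothetical and do not reflect PRIMAT's API.

```java
import java.util.BitSet;

// Illustrative sketch; names are hypothetical and not part of PRIMAT's API.
final class DiceSimilaritySketch {

    // Dice coefficient: 2 * |a AND b| / (|a| + |b|), where |x| is the number of set bits.
    // Two empty filters are defined here to have similarity 0.
    static double dice(BitSet a, BitSet b) {
        int total = a.cardinality() + b.cardinality();
        if (total == 0) {
            return 0.0;
        }
        BitSet common = (BitSet) a.clone(); // clone so that the input is not modified
        common.and(b);
        return 2.0 * common.cardinality() / total;
    }

    // Threshold-based classification: a pair is a match if its similarity reaches the threshold.
    static boolean isMatch(BitSet a, BitSet b, double threshold) {
        return dice(a, b) >= threshold;
    }

    // Helper that builds a BitSet from explicit bit positions.
    static BitSet bits(int... positions) {
        BitSet set = new BitSet();
        for (int p : positions) {
            set.set(p);
        }
        return set;
    }

    public static void main(String[] args) {
        BitSet a = bits(1, 2, 3, 4);
        BitSet b = bits(3, 4, 5, 6);
        System.out.println("similarity = " + dice(a, b));
        System.out.println("match at 0.8: " + isMatch(a, b, 0.8));
    }
}
```

The threshold trades precision against recall: a higher value accepts fewer pairs and tends to reduce false matches, while a lower value tends to reduce missed matches.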
To run the linkage unit application, execute the following command in the primat directory (where `pom.xml` is located):

    mvn clean javafx:run -Dprimat.mainClass=dbs.pprl.toolbox.lu.gui.LinkageUnitApp

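The quality and scalability measures named in the linkage unit description follow standard definitions, which can be sketched as below. This is not PRIMAT's evaluation code; the class and method names are hypothetical, and the reduction ratio assumes a linkage between two databases, where the number of possible record pairs is the product of their sizes.

```java
// Illustrative sketch of standard linkage measures; names are hypothetical, not PRIMAT's API.
final class LinkageMetricsSketch {

    // Precision: fraction of reported matches that are true matches.
    static double precision(long truePositives, long falsePositives) {
        long reported = truePositives + falsePositives;
        return reported == 0 ? 0.0 : (double) truePositives / reported;
    }

    // Recall: fraction of true matches that were reported.
    static double recall(long truePositives, long falseNegatives) {
        long actual = truePositives + falseNegatives;
        return actual == 0 ? 0.0 : (double) truePositives / actual;
    }

    // F-measure: harmonic mean of precision and recall.
    static double fMeasure(double precision, double recall) {
        double sum = precision + recall;
        return sum == 0.0 ? 0.0 : 2.0 * precision * recall / sum;
    }

    // Reduction ratio: share of the sizeA * sizeB possible pairs that blocking
    // or filtering removed from comparison (both sizes assumed to be positive).
    static double reductionRatio(long candidatePairs, long sizeA, long sizeB) {
        return 1.0 - (double) candidatePairs / ((double) sizeA * (double) sizeB);
    }

    public static void main(String[] args) {
        double p = precision(80, 20);
        double r = recall(80, 20);
        System.out.println("precision = " + p);
        System.out.println("recall = " + r);
        System.out.println("f-measure = " + fMeasure(p, r));
        System.out.println("reduction ratio = " + reductionRatio(1_000, 1_000, 1_000));
    }
}
```

Computing precision and recall requires ground truth about which record pairs are true matches, so these measures are typically evaluated on labeled or synthetically generated data.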