The purpose of this paper is to analyze and implement incremental updates of the data lineage storage in the software tool Manta Flow. The work is based on a study of the current data lineage storage in Manta Flow, a survey of existing approaches to incremental updates in version control systems and to incremental backups in databases, the analysis and design of a new incremental update solution for Manta Flow, and a subsequent prototype implementation with performance testing.
The resulting prototype can be deployed into the existing Manta Flow product, reducing the time complexity of updates to the data lineage storage by orders of magnitude.