Phi 21 worked with a boutique startup that provides best-in-class solutions in the big data and data integration space. One of its flagship products is an analytics automation platform with extensive capabilities to ingest and process large volumes of data from different sources and to build complex data pipelines. A key feature on the roadmap was the ability to collect and store metadata and lineage information, giving users complete traceability. Phi 21 was engaged to provide a generic solution for managing metadata and lineage details.
Based on extensive research and past experience with similar requirements, the decision was to define a generic type system covering the most commonly used types of data sources, which could later be extended to include additional types. Once the type system was formalized, Phi 21 evaluated whether to use available open-source technologies or build an in-house solution. After evaluating the options, it was decided to build the solution on open-source Apache Atlas, which provided all the required capabilities and was easy to integrate through its REST APIs.
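To make the idea of a generic type system concrete, the following is a minimal sketch of how one data source type might be registered with Atlas through its v2 REST API (`POST /api/atlas/v2/types/typedefs`). The type name `generic_file_source`, its attributes, and the base URL are hypothetical and chosen only for illustration; they are not the type system defined in this engagement. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

# Assumption: a local Atlas instance on its default port.
ATLAS_BASE_URL = "http://localhost:21000"


def build_attribute_def(name, type_name="string", optional=True):
    """Build one Atlas attribute definition with single cardinality."""
    return {
        "name": name,
        "typeName": type_name,
        "isOptional": optional,
        "cardinality": "SINGLE",
        "isUnique": False,
        "isIndexable": False,
    }


def build_entity_def(name, attributes, super_types=("DataSet",)):
    """Build an entity type definition.

    Extending Atlas's built-in DataSet type lets instances take part in
    lineage as the inputs or outputs of Process entities.
    """
    return {
        "category": "ENTITY",
        "name": name,
        "typeVersion": "1.0",
        "superTypes": list(super_types),
        "attributeDefs": [build_attribute_def(**attr) for attr in attributes],
    }


def build_typedefs_request(entity_defs, base_url=ATLAS_BASE_URL):
    """Wrap entity definitions in a typedefs payload and an unsent POST request."""
    payload = {
        "enumDefs": [],
        "structDefs": [],
        "classificationDefs": [],
        "entityDefs": entity_defs,
    }
    request = urllib.request.Request(
        url=f"{base_url}/api/atlas/v2/types/typedefs",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return payload, request


# Hypothetical type: a file-based data source with two extra attributes.
file_source_def = build_entity_def(
    name="generic_file_source",
    attributes=[
        {"name": "path", "optional": False},
        {"name": "format"},
    ],
)
payload, request = build_typedefs_request([file_source_def])
print(json.dumps(payload, indent=2))
```

Sending the request would additionally require authentication and a call such as `urllib.request.urlopen(request)`, both omitted here. Supporting a new kind of data source then amounts to adding another entity definition to the list, which is one way the extensibility described above could be realized.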