About Data Lineage: my notes. Or, more accurately, links.

  1. https://en.wikipedia.org/wiki/Data_lineage
  2. https://en.wikipedia.org/wiki/Meta-data_management
  3. https://en.wikipedia.org/wiki/ISO/IEC_11179, a repository standard, and http://aristotlemetadata.com, an implementation of the standard at Apache.
  4. https://en.wikipedia.org/wiki/Big_data
  5. A white paper, “Lineage Tracing for General Data Warehouse Transformations”, from the Wikipedia article on data lineage.
  6. Practical Lineage Tracing in Data Warehouses, by Cui & Widom. They pose a scenario and show how storage and inversion allow lineage to be captured and queried. The problem is defined exclusively in SQL.
  7. Data lineage demystified, at dataversity.net. This is easy to read and mainly talks about why. It is a puff piece for ASG, who have tools in this market and have published a white paper; they host it at whitepapers.dataversity.net, and I have mirrored it here…. ASG advertise their Enterprise Intelligence and Data Lineage solutions, hosted here…, published here … and mirrored here. All this material is strong on the why and on the coverage of the problem, weaker on explaining how the tool's taps work and how its reports meet business needs. They published a case study with 5 examples of the use of their tool.
  8. The architectural scope of data governance, a blog on Informatica’s site by Bob Kerel. He argues this is a process and needs a framework. He categorises the heterogeneity of the sources and repos, and classifies the problems as profiling, discovery, semantics (aka glossaries) and management/lineage. I wonder if the semweb people have anything to offer.
  9. Supporting Fine-Grained Data Lineage in a Database Visualization Environment, by Woodruff & Stonebraker. Is this the paper that invented “lazy macro”?
  10. https://www.timmitchell.net/post/2016/05/06/etl-data-lineage/

I used a picture from here as the featured picture, as graphs are (it would seem) a good visualisation of data lineage.
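To make the graph idea a little more concrete, here is a minimal sketch of lineage held as a directed graph, where an edge means "this dataset feeds that one". It is not taken from any of the papers or tools above; the dataset names and the helpers `build_upstream` and `trace_lineage` are all invented for illustration, and a real tool would track far more (columns, transformations, timestamps).

```python
from collections import defaultdict

# Hypothetical dataset names, invented purely for illustration.
# Each pair reads (source, target): the source feeds the target.
EDGES = [
    ("crm.customers", "staging.customers"),
    ("web.orders", "staging.orders"),
    ("staging.customers", "warehouse.sales_fact"),
    ("staging.orders", "warehouse.sales_fact"),
    ("warehouse.sales_fact", "reports.monthly_revenue"),
]


def build_upstream(edges):
    """Map each target dataset to the set of datasets that feed it directly."""
    upstream = defaultdict(set)
    for source, target in edges:
        upstream[target].add(source)
    return upstream


def trace_lineage(upstream, dataset):
    """Return every dataset that feeds `dataset`, directly or indirectly."""
    seen = set()
    stack = [dataset]
    while stack:
        current = stack.pop()
        for parent in upstream.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


UPSTREAM = build_upstream(EDGES)

if __name__ == "__main__":
    for name in sorted(trace_lineage(UPSTREAM, "reports.monthly_revenue")):
        print(name)
```

Asking for the lineage of the report walks the edges backwards until it runs out of parents, which is roughly the question a lineage picture answers by eye.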

See also

  1. http://ilpubs.stanford.edu:8090/525/1/2001-5.pdf
  2. http://ilpubs.stanford.edu:8090/403/1/1999-47.pdf


The more I look at this, the more I wonder: why graphs and not ELHs?


Dave Technology
