Ryan Gross
1 min read · May 30, 2019

I agree with you that simply versioning all of the code won’t be enough. If we think of data pipelines as compilers that convert data into an executable function, then the data is the “code” in this analogy. When you discuss versioning code, you are talking about the “compiler” code that does the data processing. I also get the idea of collaboratively editing metadata and reviewing it using git PRs; in fact, we have done this on a client project before.

When you ask for a git for metadata, it seems like you are looking for something more. I think what you are looking for is similar to DVC (Data Version Control). Have you looked at that project?

Written by Ryan Gross

Emerging Tech & Data Leader at Credera | Interested in how people & machines learn, and how to bring them together.
