dataset field level Lineage support? #1519

Closed
clojurians-org opened this issue Dec 19, 2019 · 3 comments
Labels
accepted: An issue that is confirmed as a bug by the DataHub Maintainers.
feature-request: Request for a new feature to be added.

Comments

@clojurians-org
Contributor

I want to parse SQL to extract dataset field-level lineage using open-source tools:
queryparser [https://github.com/uber/queryparser] for [hive, vertica, presto]
hssqlppp [https://github.com/JakeWheat/hssqlppp] for [oracle, postgresql]

I have built the Kafka Avro message and want to push it,
but I can't find a way to express lineage at the dataset field level.

The only related information I can find is the upstreams field in MetadataChangeEvent.avsc.

What changes do I need to make to support this requirement?

@keremsahin1 self-assigned this Dec 19, 2019
@keremsahin1
Contributor

Hi @clojurians-org

DataHub doesn't support field-level lineage at this point; we only support dataset-level lineage. You can see the upstreams or downstreams of a specific dataset, such as a MySQL table. Field-level lineage is on our roadmap, but we can't give a timeline estimate. It seems we have a fair number of requests for this feature, so we'll definitely consider it while defining our roadmap.

Thanks
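
As a rough illustration of what dataset-level lineage means here, the sketch below stores each dataset's upstreams and derives the downstreams by inverting that map. The dataset names and the `downstreams_of` helper are hypothetical and are not DataHub's actual model or API.

```python
from collections import defaultdict

# Hypothetical dataset names, used only for illustration.
# Each key is a dataset; each value lists its direct upstream datasets.
UPSTREAMS = {
    "mysql.orders_summary": ["mysql.orders", "mysql.customers"],
    "hive.orders_report": ["mysql.orders_summary"],
}


def downstreams_of(upstreams):
    """Invert an upstream map to get each dataset's direct downstreams."""
    result = defaultdict(list)
    for dataset, parents in upstreams.items():
        for parent in parents:
            result[parent].append(dataset)
    return dict(result)


DOWNSTREAMS = downstreams_of(UPSTREAMS)
```

Only whole datasets appear as nodes in this picture, which is why a column-to-column relationship cannot be expressed in it.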

@keremsahin1 added the feature-request label Dec 21, 2019
@clojurians-org
Contributor Author

I've already finished table-level field extraction by parsing Hive SQL, if you're interested.
I will add Oracle stored procedure parsing later.

Field-level lineage is very easy to support with backend parsing, but it needs a UI model to display it.

  cd contrib/metadata-etl
  cat metadata_sample/hive_1.sql | bin/lineage_hive_generator.hs

https://github.com/clojurians-org/simple-datahub/blob/master/contrib/metadata-etl/bin/lineage_hive_generator.hs
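
A minimal sketch of the distinction discussed above, assuming a SQL parser has already produced a mapping from target columns to source columns. The `Field` type, the sample query, and the `to_dataset_lineage` helper are hypothetical; this is not the output format of lineage_hive_generator.hs or of DataHub.

```python
from typing import NamedTuple


class Field(NamedTuple):
    dataset: str
    column: str


# Hypothetical parser output for a query such as:
#   INSERT INTO report (order_id, amount_usd)
#   SELECT o.id, o.amount * r.rate
#   FROM orders o JOIN rates r ON o.ccy = r.ccy
FIELD_LINEAGE = {
    Field("report", "order_id"): [Field("orders", "id")],
    Field("report", "amount_usd"): [
        Field("orders", "amount"),
        Field("rates", "rate"),
    ],
}


def to_dataset_lineage(field_lineage):
    """Collapse field-level edges into dataset-level upstreams."""
    upstreams = {}
    for target, sources in field_lineage.items():
        parents = upstreams.setdefault(target.dataset, set())
        parents.update(source.dataset for source in sources)
    return {dataset: sorted(parents) for dataset, parents in upstreams.items()}


DATASET_LINEAGE = to_dataset_lineage(FIELD_LINEAGE)
```

Collapsing to the dataset level drops the information about which source column feeds which target column, which is the part that would need its own model and UI to display.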

@keremsahin1 assigned hshahoss and unassigned keremsahin1 Jan 24, 2020
@mars-lan added the accepted label Mar 29, 2020
@mars-lan
Contributor

Let's concentrate the discussion in this issue: #1731

4 participants