feat: Atlas Lineage support #1103
Merged
9 commits
ffe4b0b atlas_lineage | :tada: Initial commit. (mgorsk1)
fa91fb7 atlas_lineage | :rotating_light: Removing linter warnings. (mgorsk1)
bf894ad atlas_lineage | :ok_hand: Updating code due to code review changes. (mgorsk1)
059033f atlas_lineage | :rewind: Reverting changes. (mgorsk1)
9fcc099 atlas_lineage | :fire: Removing code or files. (mgorsk1)
cbc04d9 atlas_lineage | :bug: Fixing a bug. (mgorsk1)
ef15e80 atlas_lineage | :recycle: Refactoring code. (mgorsk1)
e7ddcd5 atlas_lineage | :bug: Fixing a bug. (mgorsk1)
9ee4031 atlas_lineage | :bug: Fixing a bug. (mgorsk1)
@@ -0,0 +1,2 @@
# Copyright Contributors to the Amundsen project.
# SPDX-License-Identifier: Apache-2.0

@@ -0,0 +1,245 @@
# Copyright Contributors to the Amundsen project.
# SPDX-License-Identifier: Apache-2.0
import abc
import re
from typing import Any, Dict, Optional


class AtlasStatus:
    ACTIVE = "ACTIVE"
    DELETED = "DELETED"


class AtlasCommonParams:
    qualified_name = 'qualifiedName'
    guid = 'guid'
    attributes = 'attributes'
    relationships = 'relationshipAttributes'
    uri = 'entityUri'


class AtlasCommonTypes:
    bookmark = 'Bookmark'
    user = 'User'
    reader = 'Reader'


class AtlasTableTypes:
    table = 'Table'


class AtlasDashboardTypes:
    metadata = 'Dashboard'
    group = 'DashboardGroup'
    query = 'DashboardQuery'
    chart = 'DashboardChart'
    execution = 'DashboardExecution'


class AtlasKey(abc.ABC):
    """
    Class for unification of entity keys between Atlas and Amundsen ecosystems.

    Since Atlas can be populated both by tools from the 'Atlas world' (like the Apache Atlas Hive hook/bridge) and by
    Amundsen Databuilder (and each approach renders unique identifiers differently), we need such a class
    to serve as a unification layer.
    """

    def __init__(self, raw_id: str, database: Optional[str] = None) -> None:
        self._raw_identifier = raw_id
        self._database = database

    @property
    def is_qualified_name(self) -> bool:
        """
        Property assessing whether raw_id is a qualified name.

        :returns: True if raw_id matches the qualified name regex
        """
        return bool(self.atlas_qualified_name_regex.match(self._raw_identifier))

    @property
    def is_amundsen_key(self) -> bool:
        """
        Property assessing whether raw_id is an amundsen key.

        :returns: True if raw_id matches the amundsen key regex
        """
        return bool(self.amundsen_key_regex.match(self._raw_identifier))

    def get_details(self) -> Dict[str, str]:
        """
        Collect as many details as possible from the key (either qualified name or amundsen key)

        :returns: dictionary of entity properties derived from key
        """
        if self.is_qualified_name:
            return self._get_details_from_qualified_name()
        elif self.is_amundsen_key:
            return self._get_details_from_key()
        else:
            raise ValueError(f'Value is neither valid qualified name nor amundsen key: {self._raw_identifier}')

    def _get_details(self, pattern: Any) -> Dict[str, str]:
        """
        Helper function collecting data from regex match

        :returns: dictionary of matched regex groups with their values
        :raises KeyError: if the raw identifier does not match the pattern
        """
        match = pattern.match(self._raw_identifier)

        # match() returns None when nothing matches, so calling groupdict() directly would raise
        # AttributeError. Raise KeyError explicitly, since the callers below translate it into ValueError.
        if match is None:
            raise KeyError(self._raw_identifier)

        return match.groupdict()

    def _get_details_from_qualified_name(self) -> Dict[str, str]:
        """
        Collect as many details as possible from the qualified name

        :returns: dictionary of entity properties derived from qualified name
        """
        try:
            return self._get_details(self.atlas_qualified_name_regex)
        except KeyError:
            raise ValueError(f'This is not valid qualified name: {self._raw_identifier}')

    def _get_details_from_key(self) -> Dict[str, str]:
        """
        Collect as many details as possible from the amundsen key

        :returns: dictionary of entity properties derived from amundsen key
        """
        try:
            return self._get_details(self.amundsen_key_regex)
        except KeyError:
            raise ValueError(f'This is not valid amundsen key: {self._raw_identifier}')

    @property
    @abc.abstractmethod
    def atlas_qualified_name_regex(self) -> Any:
        """
        Regex for validating qualified name (and collecting details from qn parts)

        :returns: compiled regex with named groups for the qualified name parts
        """
        pass

    @property
    @abc.abstractmethod
    def amundsen_key_regex(self) -> Any:
        """
        Regex for validating amundsen key (and collecting details from key parts)

        :returns: compiled regex with named groups for the amundsen key parts
        """
        pass

    @property
    @abc.abstractmethod
    def qualified_name(self) -> str:
        """
        Properly formatted qualified name

        :returns: qualified name string
        """
        pass

    @property
    @abc.abstractmethod
    def amundsen_key(self) -> str:
        """
        Properly formatted amundsen key

        :returns: amundsen key string
        """
        pass


class AtlasTableKey(AtlasKey):
    @property
    def atlas_qualified_name_regex(self) -> Any:
        return re.compile(r'^(?P<schema>.*?)\.(?P<table>.*)@(?P<cluster>.*?)$', re.X)

    @property
    def amundsen_key_regex(self) -> Any:
        return re.compile(r'^(?P<database>.*?)://(?P<cluster>.*)\.(?P<schema>.*?)\/(?P<table>.*?)$', re.X)

    @property
    def qualified_name(self) -> str:
        if not self.is_qualified_name:
            spec = self._get_details_from_key()

            schema = spec['schema']
            table = spec['table']
            cluster = spec['cluster']

            return f'{schema}.{table}@{cluster}'
        else:
            return self._raw_identifier

    @property
    def amundsen_key(self) -> str:
        if self.is_qualified_name:
            spec = self._get_details_from_qualified_name()

            schema = spec['schema']
            table = spec['table']
            cluster = spec['cluster']

            return f'{self._database}://{cluster}.{schema}/{table}'
        elif self.is_amundsen_key:
            return self._raw_identifier
        else:
            raise ValueError(f'Value is neither qualified name nor amundsen key: {self._raw_identifier}')


class AtlasColumnKey(AtlasKey):
    @property
    def atlas_qualified_name_regex(self) -> Any:
        return re.compile(r'^(?P<schema>.*?)\.(?P<table>.*?)\.(?P<column>.*?)@(?P<cluster>.*?)$', re.X)

    @property
    def amundsen_key_regex(self) -> Any:
        return re.compile(r'^(?P<database>.*?)://(?P<cluster>.*)\.(?P<schema>.*?)\/(?P<table>.*?)\/(?P<column>.*)$',
                          re.X)

    @property
    def qualified_name(self) -> str:
        if self.is_amundsen_key:
            spec = self._get_details_from_key()

            schema = spec['schema']
            table = spec['table']
            cluster = spec['cluster']
            column = spec['column']

            return f'{schema}.{table}.{column}@{cluster}'
        elif self.is_qualified_name:
            return self._raw_identifier
        else:
            raise ValueError(f'Value is neither qualified name nor amundsen key: {self._raw_identifier}')

    @property
    def amundsen_key(self) -> str:
        if self.is_qualified_name:
            spec = self._get_details_from_qualified_name()

            schema = spec['schema']
            table = spec['table']
            cluster = spec['cluster']
            column = spec['column']

            source = self._database.replace('column', 'table') if self._database else ''

            return f'{source}://{cluster}.{schema}/{table}/{column}'
        elif self.is_amundsen_key:
            return self._raw_identifier
        else:
            raise ValueError(f'Value is neither qualified name nor amundsen key: {self._raw_identifier}')
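
For readers who want to see the two identifier formats side by side, here is a minimal standalone sketch of the table-key conversion. It reuses the two regexes from `AtlasTableKey` verbatim, but the helper names (`table_key_from_qualified_name`, `qualified_name_from_table_key`) and the sample values (`test_schema`, `test_table`, `gold`, `hive`) are assumptions made for illustration and are not part of this PR.

```python
import re

# Regexes copied verbatim from AtlasTableKey in the diff above.
QUALIFIED_NAME = re.compile(r'^(?P<schema>.*?)\.(?P<table>.*)@(?P<cluster>.*?)$', re.X)
AMUNDSEN_KEY = re.compile(r'^(?P<database>.*?)://(?P<cluster>.*)\.(?P<schema>.*?)\/(?P<table>.*?)$', re.X)


def table_key_from_qualified_name(qualified_name: str, database: str) -> str:
    """Hypothetical helper: Atlas qualified name -> Amundsen table key."""
    match = QUALIFIED_NAME.match(qualified_name)
    if match is None:  # match() returns None when the input does not fit the pattern
        raise ValueError(f'This is not valid qualified name: {qualified_name}')
    spec = match.groupdict()
    # The qualified name carries no database, so it has to be supplied separately.
    return f"{database}://{spec['cluster']}.{spec['schema']}/{spec['table']}"


def qualified_name_from_table_key(amundsen_key: str) -> str:
    """Hypothetical helper: Amundsen table key -> Atlas qualified name."""
    match = AMUNDSEN_KEY.match(amundsen_key)
    if match is None:
        raise ValueError(f'This is not valid amundsen key: {amundsen_key}')
    spec = match.groupdict()
    return f"{spec['schema']}.{spec['table']}@{spec['cluster']}"


if __name__ == '__main__':
    print(table_key_from_qualified_name('test_schema.test_table@gold', 'hive'))
    # prints: hive://gold.test_schema/test_table
    print(qualified_name_from_table_key('hive://gold.test_schema/test_table'))
    # prints: test_schema.test_table@gold
```

Note that the conversion is lossy in one direction: the database part of the Amundsen key (`hive` here) is dropped when producing a qualified name, which is presumably why `AtlasKey.__init__` accepts an optional `database` argument.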
Empty file.
@mgorsk1 you don't need to name this `atlas_utils`, as it is clear that the file is under the /utils directory. I'd rename this to `atlas.py` only.

Ok, I'll let @dechoma fix this :D