doc(redshift) - Adding Redshift ingestion quickstart guide #7700

Merged: 3 commits, Mar 31, 2023


treff7es
Contributor

Adding Redshift ingestion quickstart guide
Small fix for BigQuery and Snowflake doc

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

Small fix for BigQuery and Snowflake doc
github-actions bot added the "docs" label (Issues and Improvements to docs) on Mar 28, 2023

## Configure Recipe

5. Navigate to the **Sources** tab and click **Create new source**
Collaborator

This should not be step 5, right?

Collaborator

Should this be a new set of steps, or is that what we are doing?

Collaborator

in other docs
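
For context on the Configure Recipe thread: the source created in the **Sources** tab is ultimately described by a recipe. The sketch below builds one as a plain Python dict, so the shape is easy to inspect. The field names (`host_port`, `database`, `username`, `password`, `include_table_lineage`, `profiling.enabled`), the `${SECRET}` reference style, and all example values are assumptions not taken from this PR; check them against the Redshift source documentation before relying on them.

```python
import json


def build_redshift_recipe(host_port, database, username, password_secret):
    """Return a recipe-shaped dict for a Redshift source.

    All field names here are assumptions for illustration, not confirmed
    by this PR.
    """
    return {
        "source": {
            "type": "redshift",
            "config": {
                "host_port": host_port,
                "database": database,
                "username": username,
                # Reference a stored secret by name instead of a literal password.
                "password": "${" + password_secret + "}",
                "include_table_lineage": True,
                "profiling": {"enabled": True},
            },
        },
    }


# Hypothetical values for illustration only.
recipe = build_redshift_recipe(
    host_port="example-cluster.example.com:5439",
    database="dev",
    username="datahub_user",
    password_secret="REDSHIFT_PASSWORD",
)
print(json.dumps(recipe, indent=2))
```

Keeping the password as a secret reference rather than a literal means the recipe can be shared or reviewed without exposing credentials.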

* [BigQuery Resource Viewer](https://cloud.google.com/bigquery/docs/access-control#bigquery.resourceViewer) -> This role is for Table-Level Lineage and Usage extraction
* [Logs View Accessor](https://cloud.google.com/bigquery/docs/access-control#bigquery.dataViewer) -> This role is for Table-Level Lineage and Usage extraction
* [BigQuery Data Viewer](https://cloud.google.com/bigquery/docs/access-control#bigquery.dataViewer) -> This role is for Profiling
* [BigQuery Read Session User](https://cloud.google.com/bigquery/docs/access-control#bigquery.readSessionUser) -> This role is for Profiling
Collaborator

Nice -- thank you for catching this!
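
The role list in the diff above pairs each BigQuery role with the feature it enables. A minimal sketch of that mapping, useful for working out which roles to request for a given set of features, might look like the following. The role names and purposes come from the diff; the feature keys (`lineage_and_usage`, `profiling`) and the helper `roles_for` are hypothetical names introduced here.

```python
# Role display name -> feature it supports, as stated in the diff.
ROLE_PURPOSE = {
    "BigQuery Resource Viewer": "lineage_and_usage",
    "Logs View Accessor": "lineage_and_usage",
    "BigQuery Data Viewer": "profiling",
    "BigQuery Read Session User": "profiling",
}


def roles_for(features):
    """Return the sorted role names needed for the requested features."""
    return sorted(
        role for role, purpose in ROLE_PURPOSE.items() if purpose in features
    )


print(roles_for({"profiling"}))
print(roles_for({"lineage_and_usage", "profiling"}))
```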

---
# Configuring Your Redshift Connector to DataHub

Now that you have created a DataHub user in Redshift in [the prior step](setup.md), it's now time to set up a connection via the DataHub UI.
Collaborator

Change "it's now time to set up" to "it's time to set up"; the second "now" is not needed.


* **Usage statistics** to help you understand recent query activity
* **Table-level lineage** (where available) to automatically define interdependencies between datasets
* **Table- and column-level profile statistics** to help you understand the shape of the data
Collaborator

"to help you understand relationships between Tables and Columns"

## Redshift Prerequisites

1. Connect to your Amazon Redshift cluster using an SQL client such as SQL Workbench/J or Amazon Redshift Query Editor with your Admin user.
2. Create a [User](https://docs.aws.amazon.com/redshift/latest/gsg/t_adding_redshift_user_cmd.html) if you don't have one already.
Collaborator

"Create a Redshift User" that will be used to perform the metadata extraction
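
To make the "Create a Redshift User" step concrete, here is a rough sketch that renders the SQL an admin might run for a dedicated extraction user. The specific statements (`SYSLOG ACCESS UNRESTRICTED` and the two `pg_catalog` grants) are an assumption about what metadata extraction needs and do not appear in this PR; the user name and the `<password>` placeholder are hypothetical. Verify them against the Redshift source documentation.

```python
from typing import List


def redshift_setup_statements(user: str) -> List[str]:
    """Render SQL for a dedicated metadata-extraction user.

    The grants below are assumed for illustration, not confirmed by this PR.
    """
    # The name is interpolated into SQL, so accept only simple identifiers.
    if not user.isidentifier():
        raise ValueError("user must be a simple identifier: %r" % user)
    return [
        # '<password>' is a placeholder; never hard-code a real password.
        "CREATE USER %s PASSWORD '<password>';" % user,
        "ALTER USER %s WITH SYSLOG ACCESS UNRESTRICTED;" % user,
        "GRANT SELECT ON pg_catalog.svv_table_info TO %s;" % user,
        "GRANT SELECT ON pg_catalog.svl_user_info TO %s;" % user,
    ]


statements = redshift_setup_statements("datahub_user")
for statement in statements:
    print(statement)
```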

@@ -52,6 +52,9 @@ In order to configure ingestion from Snowflake, you'll first have to ensure you
grant references on all views in database identifier($db_var) to role datahub_role;
grant references on future views in database identifier($db_var) to role datahub_role;

-- Assign privileges to extract lineage and usage statistics from Snowflake by executing the below query.
grant imported privileges on database snowflake to role datahub_role;
Collaborator

thank you!

jjoyce0510 (Collaborator) left a comment

Minor comments; overall looks great! Merged your static assets PR.

jjoyce0510 merged commit 86960ad into datahub-project:master on Mar 31, 2023