refactor(configs): Simplify Kafka Topic name configurations + docs #5198

Merged
7 changes: 7 additions & 0 deletions docker/datahub-gms/env/docker.env
@@ -32,6 +32,13 @@ UI_INGESTION_DEFAULT_CLI_VERSION=0.8.38

# Uncomment to configure kafka topic names
# Make sure these names are consistent across the whole deployment
# METADATA_CHANGE_PROPOSAL_TOPIC_NAME=MetadataChangeProposal_v1
# FAILED_METADATA_CHANGE_PROPOSAL_TOPIC_NAME=FailedMetadataChangeProposal_v1
# METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME=MetadataChangeLog_Versioned_v1
# METADATA_CHANGE_LOG_TIMESERIES_TOPIC_NAME=MetadataChangeLog_Timeseries_v1
# PLATFORM_EVENT_TOPIC_NAME=PlatformEvent_v1
# DATAHUB_USAGE_EVENT_NAME=DataHubUsageEvent_v1
# Deprecated!
# METADATA_AUDIT_EVENT_NAME=MetadataAuditEvent_v4
# METADATA_CHANGE_EVENT_NAME=MetadataChangeEvent_v4
# FAILED_METADATA_CHANGE_EVENT_NAME=FailedMetadataChangeEvent_v4
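
The comments in this env file ask that topic names stay consistent across the whole deployment. Below is a minimal sketch of how that could be checked, assuming each component's environment has already been loaded into a map. The class `TopicNameConsistency` and its `conflicts` method are hypothetical helpers written for illustration; they are not part of DataHub.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Illustrative sketch only; not part of the DataHub codebase. */
final class TopicNameConsistency {

    private TopicNameConsistency() {}

    /**
     * Returns every variable that is set to more than one distinct value
     * across the given components, mapped to the set of values seen.
     * The outer map is keyed by component name (for example "datahub-gms").
     */
    static Map<String, Set<String>> conflicts(Map<String, Map<String, String>> envByComponent) {
        Map<String, Set<String>> valuesByVariable = new TreeMap<>();
        for (Map<String, String> env : envByComponent.values()) {
            for (Map.Entry<String, String> entry : env.entrySet()) {
                valuesByVariable
                    .computeIfAbsent(entry.getKey(), key -> new TreeSet<>())
                    .add(entry.getValue());
            }
        }
        // Keep only variables whose value differs between components.
        valuesByVariable.values().removeIf(values -> values.size() < 2);
        return valuesByVariable;
    }

    public static void main(String[] args) {
        // Hypothetical deployment where one component overrides a topic name.
        Map<String, Map<String, String>> deployment = Map.of(
            "datahub-gms", Map.of("PLATFORM_EVENT_TOPIC_NAME", "PlatformEvent_v1"),
            "datahub-mae-consumer", Map.of("PLATFORM_EVENT_TOPIC_NAME", "CustomPlatformEvent_v1"));
        System.out.println("Inconsistent topic variables: " + conflicts(deployment));
    }
}
```

A variable that is set in only one component is not reported, since an unset variable falls back to that component's default; a stricter check would also need to compare against those defaults.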
2 changes: 1 addition & 1 deletion docker/datahub-mae-consumer/env/docker-without-neo4j.env
@@ -14,7 +14,7 @@ ENTITY_REGISTRY_CONFIG_PATH=/datahub/datahub-mae-consumer/resources/entity-regis

# Uncomment to configure topic names
# Make sure these names are consistent across the whole deployment
# KAFKA_TOPIC_NAME=MetadataAuditEvent_v4
# METADATA_AUDIT_EVENT_NAME=MetadataAuditEvent_v4
# DATAHUB_USAGE_EVENT_NAME=DataHubUsageEvent_v1

# Uncomment and set these to support SSL connection to Elasticsearch
6 changes: 5 additions & 1 deletion docker/datahub-mae-consumer/env/docker.env
@@ -18,8 +18,12 @@ ENTITY_REGISTRY_CONFIG_PATH=/datahub/datahub-mae-consumer/resources/entity-regis

# Uncomment to configure topic names
# Make sure these names are consistent across the whole deployment
# KAFKA_TOPIC_NAME=MetadataAuditEvent_v4
# METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME=MetadataChangeLog_Versioned_v1
# METADATA_CHANGE_LOG_TIMESERIES_TOPIC_NAME=MetadataChangeLog_Timeseries_v1
# PLATFORM_EVENT_TOPIC_NAME=PlatformEvent_v1
# DATAHUB_USAGE_EVENT_NAME=DataHubUsageEvent_v1
# Deprecated!
# METADATA_AUDIT_EVENT_NAME=MetadataAuditEvent_v4

# Uncomment and set these to support SSL connection to Elasticsearch
# ELASTICSEARCH_USE_SSL=
7 changes: 5 additions & 2 deletions docker/datahub-mce-consumer/env/docker.env
@@ -6,8 +6,11 @@ GMS_PORT=8080

# Uncomment to configure kafka topic names
# Make sure these names are consistent across the whole deployment
# KAFKA_MCE_TOPIC_NAME=MetadataChangeEvent_v4
# KAFKA_FMCE_TOPIC_NAME=FailedMetadataChangeEvent_v4
# METADATA_CHANGE_PROPOSAL_TOPIC_NAME=MetadataChangeProposal_v1
# FAILED_METADATA_CHANGE_PROPOSAL_TOPIC_NAME=FailedMetadataChangeProposal_v1
# Deprecated!
# METADATA_CHANGE_EVENT_NAME=MetadataChangeEvent_v4
# FAILED_METADATA_CHANGE_EVENT_NAME=FailedMetadataChangeEvent_v4

# Uncomment and set these to support SSL connection to GMS
# NOTE: Currently GMS itself does not offer SSL support, these settings are intended for when there is a proxy in front
44 changes: 23 additions & 21 deletions docs/how/kafka-config.md
@@ -78,25 +78,21 @@ The following are environment variables you can use to configure topic names use
- (Deprecated) `METADATA_CHANGE_EVENT_NAME`: The name of the metadata change event topic.
- (Deprecated) `METADATA_AUDIT_EVENT_NAME`: The name of the metadata audit event topic.
- (Deprecated) `FAILED_METADATA_CHANGE_EVENT_NAME`: The name of the failed metadata change event topic.
- (Deprecated) `KAFKA_MCE_TOPIC_NAME`: The name of the deprecated topic that an embedded MCE consumer will consume from. This should technically be
the same as `METADATA_CHANGE_EVENT_NAME` and will soon be removed.
- (Deprecated) `KAFKA_FMCE_TOPIC_NAME`: The name of the deprecated topic that failed MCEs will be written to. This should technically be
the same as `FAILED_METADATA_CHANGE_EVENT_NAME` and will soon be removed.
- (Deprecated) `KAFKA_TOPIC_NAME`: The name of the deprecated topic that MAEs are written to. This is used by the MAE consumer when
reading messages. It should contain the same value as `METADATA_AUDIT_EVENT_NAME` and will soon be removed.

### MCE Consumer (datahub-mce-consumer)

- (Deprecated) `KAFKA_MCE_TOPIC_NAME`: The name of the deprecated topic that an embedded MCE consumer will consume from. This should technically be
the same as `METADATA_CHANGE_EVENT_NAME` and will soon be removed and replaced by `METADATA_CHANGE_EVENT_NAME`.
- (Deprecated) `KAFKA_FMCE_TOPIC_NAME`: The name of the deprecated topic that failed MCEs will be written to. This should technically be
the same as `FAILED_METADATA_CHANGE_EVENT_NAME` and will soon be removed and replaced by
`FAILED_METADATA_CHANGE_EVENT_NAME`.
- `METADATA_CHANGE_PROPOSAL_TOPIC_NAME`: The name of the topic for Metadata Change Proposals emitted by the ingestion framework.
- `FAILED_METADATA_CHANGE_PROPOSAL_TOPIC_NAME`: The name of the topic for Metadata Change Proposals emitted when MCPs fail processing.
- (Deprecated) `METADATA_CHANGE_EVENT_NAME`: The name of the deprecated topic that an embedded MCE consumer will consume from.
- (Deprecated) `FAILED_METADATA_CHANGE_EVENT_NAME`: The name of the deprecated topic that failed MCEs will be written to.

### MAE Consumer (datahub-mae-consumer)

- (Deprecated) `KAFKA_TOPIC_NAME`: The name of the deprecated metadata audit event topic. This will soon be removed
and replaced by `METADATA_AUDIT_EVENT_NAME`.
- `METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME`: The name of the topic for Metadata Change Logs that are produced for Versioned Aspects.
- `METADATA_CHANGE_LOG_TIMESERIES_TOPIC_NAME`: The name of the topic for Metadata Change Logs that are produced for Timeseries Aspects.
- `PLATFORM_EVENT_TOPIC_NAME`: The name of the topic for Platform Events (high-level semantic events).
- `DATAHUB_USAGE_EVENT_NAME`: The name of the topic for product analytics events.
- (Deprecated) `METADATA_AUDIT_EVENT_NAME`: The name of the deprecated metadata audit event topic.

### DataHub Frontend (datahub-frontend-react)

@@ -131,12 +127,12 @@ configurations inside your `values.yaml` file.
datahub-gms:
...
extraEnvs:
- name: METADATA_CHANGE_EVENT_NAME
value: "MetadataChangeEvent"
- name: METADATA_AUDIT_EVENT_NAME
value: "MetadataAuditEvent"
- name: FAILED_METADATA_CHANGE_EVENT_NAME
value: "FailedMetadataChangeEvent"
- name: METADATA_CHANGE_PROPOSAL_TOPIC_NAME
value: "CustomMetadataChangeProposal_v1"
- name: METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME
value: "CustomMetadataChangeLogVersioned_v1"
- name: FAILED_METADATA_CHANGE_PROPOSAL_TOPIC_NAME
value: "CustomFailedMetadataChangeProposal_v1"
- name: KAFKA_CONSUMER_GROUP_ID
value: "my-apps-mae-consumer"
....
@@ -150,11 +146,17 @@ datahub-frontend:
# If standalone consumers are enabled
datahub-mae-consumer:
extraEnvs:
- name: KAFKA_TOPIC_NAME
- name: METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME
value: "CustomMetadataChangeLogVersioned_v1"
....
- name: METADATA_AUDIT_EVENT_NAME
value: "MetadataAuditEvent"
datahub-mce-consumer:
extraEnvs:
- name: KAFKA_MCE_TOPIC_NAME
- name: METADATA_CHANGE_PROPOSAL_TOPIC_NAME
value: "CustomMetadataChangeProposal_v1"
....
- name: METADATA_CHANGE_EVENT_NAME
value: "MetadataChangeEvent"
....
```
5 changes: 5 additions & 0 deletions docs/how/updating-datahub.md
@@ -10,6 +10,11 @@ This file documents any backwards-incompatible changes in DataHub and assists pe

### Deprecations

- `KAFKA_TOPIC_NAME` environment variable in **datahub-mae-consumer** and **datahub-gms** is now deprecated. Use `METADATA_AUDIT_EVENT_NAME` instead.
- `KAFKA_MCE_TOPIC_NAME` environment variable in **datahub-mce-consumer** and **datahub-gms** is now deprecated. Use `METADATA_CHANGE_EVENT_NAME` instead.
- `KAFKA_FMCE_TOPIC_NAME` environment variable in **datahub-mce-consumer** and **datahub-gms** is now deprecated. Use `FAILED_METADATA_CHANGE_EVENT_NAME` instead.
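
The three deprecations above amount to a rename table, so an operator could scan a deployment's environment for the old names before upgrading. A minimal sketch of that idea follows; the class `DeprecatedTopicVars` and its `warnings` method are hypothetical and only illustrate the mapping listed here, they are not shipped with DataHub.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; not part of the DataHub codebase. */
final class DeprecatedTopicVars {

    /** Deprecated variable name mapped to its replacement, as listed in the notes above. */
    static final Map<String, String> REPLACEMENTS = new LinkedHashMap<>();

    static {
        REPLACEMENTS.put("KAFKA_TOPIC_NAME", "METADATA_AUDIT_EVENT_NAME");
        REPLACEMENTS.put("KAFKA_MCE_TOPIC_NAME", "METADATA_CHANGE_EVENT_NAME");
        REPLACEMENTS.put("KAFKA_FMCE_TOPIC_NAME", "FAILED_METADATA_CHANGE_EVENT_NAME");
    }

    private DeprecatedTopicVars() {}

    /** Returns one warning per deprecated variable that is present in the given environment. */
    static List<String> warnings(Map<String, String> env) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : REPLACEMENTS.entrySet()) {
            if (env.containsKey(entry.getKey())) {
                result.add(entry.getKey() + " is deprecated; use " + entry.getValue() + " instead");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Check the environment of the current process.
        for (String warning : warnings(System.getenv())) {
            System.err.println(warning);
        }
    }
}
```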


### Other notable Changes
- #5132 Tables in the `snowflake` source are now profiled only if they have been updated within the configured number of days (default: `1`). Update the config `profiling.profile_if_updated_since_days` to match your profiling schedule, or set it to `None` to keep the older behaviour.

@@ -50,11 +50,11 @@ public class MetadataChangeEventsProcessor {

private final Histogram kafkaLagStats = MetricUtils.get().histogram(MetricRegistry.name(this.getClass(), "kafkaLag"));

@Value("${KAFKA_FMCE_TOPIC_NAME:" + Topics.FAILED_METADATA_CHANGE_EVENT + "}")
@Value("${FAILED_METADATA_CHANGE_EVENT_NAME:${KAFKA_FMCE_TOPIC_NAME:" + Topics.FAILED_METADATA_CHANGE_EVENT + "}}")
private String fmceTopicName;

@KafkaListener(id = "${METADATA_CHANGE_EVENT_KAFKA_CONSUMER_GROUP_ID:mce-consumer-job-client}", topics =
"${KAFKA_MCE_TOPIC_NAME:" + Topics.METADATA_CHANGE_EVENT + "}", containerFactory = "kafkaEventConsumer")
"${METADATA_CHANGE_EVENT_NAME:${KAFKA_MCE_TOPIC_NAME:" + Topics.METADATA_CHANGE_EVENT + "}}", containerFactory = "kafkaEventConsumer")
public void consume(final ConsumerRecord<String, GenericRecord> consumerRecord) {
kafkaLagStats.update(System.currentTimeMillis() - consumerRecord.timestamp());
final GenericRecord record = consumerRecord.value();
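
The nested placeholders in this hunk, such as `${METADATA_CHANGE_EVENT_NAME:${KAFKA_MCE_TOPIC_NAME:...}}`, appear intended to prefer the new variable, fall back to the deprecated one, and only then use the built-in default. The sketch below restates that lookup order in plain Java, assuming the usual placeholder convention where the text after the first colon is the default. The class `TopicNameFallback` is hypothetical, and the default `MetadataChangeEvent_v4` is an assumption taken from the commented-out values in the env files, since the value of `Topics.METADATA_CHANGE_EVENT` is not shown in this diff.

```java
import java.util.Map;

/** Illustrative sketch only; not part of the DataHub codebase. */
final class TopicNameFallback {

    private TopicNameFallback() {}

    /**
     * Mirrors the lookup order of "${PRIMARY:${LEGACY:default}}":
     * the primary variable wins, then the legacy variable, then the default.
     */
    static String resolve(Map<String, String> env, String primary, String legacy, String defaultValue) {
        String value = env.get(primary);
        if (value != null) {
            return value;
        }
        value = env.get(legacy);
        if (value != null) {
            return value;
        }
        return defaultValue;
    }

    public static void main(String[] args) {
        // Assumed default; the real constant lives in Topics.METADATA_CHANGE_EVENT.
        String topic = resolve(
            System.getenv(),
            "METADATA_CHANGE_EVENT_NAME",
            "KAFKA_MCE_TOPIC_NAME",
            "MetadataChangeEvent_v4");
        System.out.println("MCE topic: " + topic);
    }
}
```

Under this ordering, a deployment that still sets only `KAFKA_MCE_TOPIC_NAME` keeps its current topic, while setting `METADATA_CHANGE_EVENT_NAME` overrides the legacy value.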