refactor(configs): Simplify Kafka Topic name configurations + docs (d…
jjoyce0510 authored and maggiehays committed Aug 1, 2022
1 parent ea83678 commit 386856c
Showing 7 changed files with 48 additions and 27 deletions.
7 changes: 7 additions & 0 deletions docker/datahub-gms/env/docker.env
@@ -32,6 +32,13 @@ UI_INGESTION_DEFAULT_CLI_VERSION=0.8.38

# Uncomment to configure kafka topic names
# Make sure these names are consistent across the whole deployment
# METADATA_CHANGE_PROPOSAL_TOPIC_NAME=MetadataChangeProposal_v1
# FAILED_METADATA_CHANGE_PROPOSAL_TOPIC_NAME=FailedMetadataChangeProposal_v1
# METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME=MetadataChangeLog_Versioned_v1
# METADATA_CHANGE_LOG_TIMESERIES_TOPIC_NAME=MetadataChangeLog_Timeseries_v1
# PLATFORM_EVENT_TOPIC_NAME=PlatformEvent_v1
# DATAHUB_USAGE_EVENT_NAME=DataHubUsageEvent_v1
# Deprecated!
# METADATA_AUDIT_EVENT_NAME=MetadataAuditEvent_v4
# METADATA_CHANGE_EVENT_NAME=MetadataChangeEvent_v4
# FAILED_METADATA_CHANGE_EVENT_NAME=FailedMetadataChangeEvent_v4
2 changes: 1 addition & 1 deletion docker/datahub-mae-consumer/env/docker-without-neo4j.env
@@ -14,7 +14,7 @@ ENTITY_REGISTRY_CONFIG_PATH=/datahub/datahub-mae-consumer/resources/entity-regis

# Uncomment to configure topic names
# Make sure these names are consistent across the whole deployment
# KAFKA_TOPIC_NAME=MetadataAuditEvent_v4
# METADATA_AUDIT_EVENT_NAME=MetadataAuditEvent_v4
# DATAHUB_USAGE_EVENT_NAME=DataHubUsageEvent_v1

# Uncomment and set these to support SSL connection to Elasticsearch
6 changes: 5 additions & 1 deletion docker/datahub-mae-consumer/env/docker.env
@@ -18,8 +18,12 @@ ENTITY_REGISTRY_CONFIG_PATH=/datahub/datahub-mae-consumer/resources/entity-regis

# Uncomment to configure topic names
# Make sure these names are consistent across the whole deployment
# KAFKA_TOPIC_NAME=MetadataAuditEvent_v4
# METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME=MetadataChangeLog_Versioned_v1
# METADATA_CHANGE_LOG_TIMESERIES_TOPIC_NAME=MetadataChangeLog_Timeseries_v1
# PLATFORM_EVENT_TOPIC_NAME=PlatformEvent_v1
# DATAHUB_USAGE_EVENT_NAME=DataHubUsageEvent_v1
# Deprecated!
# METADATA_AUDIT_EVENT_NAME=MetadataAuditEvent_v4

# Uncomment and set these to support SSL connection to Elasticsearch
# ELASTICSEARCH_USE_SSL=
7 changes: 5 additions & 2 deletions docker/datahub-mce-consumer/env/docker.env
@@ -6,8 +6,11 @@ GMS_PORT=8080

# Uncomment to configure kafka topic names
# Make sure these names are consistent across the whole deployment
# KAFKA_MCE_TOPIC_NAME=MetadataChangeEvent_v4
# KAFKA_FMCE_TOPIC_NAME=FailedMetadataChangeEvent_v4
# METADATA_CHANGE_PROPOSAL_TOPIC_NAME=MetadataChangeProposal_v1
# FAILED_METADATA_CHANGE_PROPOSAL_TOPIC_NAME=FailedMetadataChangeProposal_v1
# Deprecated!
# METADATA_CHANGE_EVENT_NAME=MetadataChangeEvent_v4
# FAILED_METADATA_CHANGE_EVENT_NAME=FailedMetadataChangeEvent_v4

# Uncomment and set these to support SSL connection to GMS
# NOTE: Currently GMS itself does not offer SSL support, these settings are intended for when there is a proxy in front
44 changes: 23 additions & 21 deletions docs/how/kafka-config.md
@@ -78,25 +78,21 @@ The following are environment variables you can use to configure topic names use
- (Deprecated) `METADATA_CHANGE_EVENT_NAME`: The name of the metadata change event topic.
- (Deprecated) `METADATA_AUDIT_EVENT_NAME`: The name of the metadata audit event topic.
- (Deprecated) `FAILED_METADATA_CHANGE_EVENT_NAME`: The name of the failed metadata change event topic.
- (Deprecated) `KAFKA_MCE_TOPIC_NAME`: The name of the deprecated topic that an embedded MCE consumer will consume from. This should technically be
the same as `METADATA_CHANGE_EVENT_NAME` and will soon be removed.
- (Deprecated) `KAFKA_FMCE_TOPIC_NAME`: The name of the deprecated topic that failed MCEs will be written to. This should technically be
the same as `FAILED_METADATA_CHANGE_EVENT_NAME` and will soon be removed.
- (Deprecated) `KAFKA_TOPIC_NAME`: The name of the deprecated topic that MAEs are written to. This is used by the MAE consumer when
reading messages. It should contain the same value as `METADATA_AUDIT_EVENT_NAME` and will soon be removed.

### MCE Consumer (datahub-mce-consumer)

- (Deprecated) `KAFKA_MCE_TOPIC_NAME`: The name of the deprecated topic that an embedded MCE consumer will consume from. This should technically be
the same as `METADATA_CHANGE_EVENT_NAME` and will soon be removed and replaced by `METADATA_CHANGE_EVENT_NAME`.
- (Deprecated) `KAFKA_FMCE_TOPIC_NAME`: The name of the deprecated topic that failed MCEs will be written to. This should technically be
the same as `FAILED_METADATA_CHANGE_EVENT_NAME` and will soon be removed and replaced by
`FAILED_METADATA_CHANGE_EVENT_NAME`.
- `METADATA_CHANGE_PROPOSAL_TOPIC_NAME`: The name of the topic for Metadata Change Proposals emitted by the ingestion framework.
- `FAILED_METADATA_CHANGE_PROPOSAL_TOPIC_NAME`: The name of the topic for Metadata Change Proposals emitted when MCPs fail processing.
- (Deprecated) `METADATA_CHANGE_EVENT_NAME`: The name of the deprecated topic that an embedded MCE consumer will consume from.
- (Deprecated) `FAILED_METADATA_CHANGE_EVENT_NAME`: The name of the deprecated topic that failed MCEs will be written to.

### MAE Consumer (datahub-mae-consumer)

- (Deprecated) `KAFKA_TOPIC_NAME`: The name of the deprecated metadata audit event topic. This will soon be removed
and replaced by `METADATA_AUDIT_EVENT_NAME`.
- `METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME`: The name of the topic for Metadata Change Logs that are produced for Versioned Aspects.
- `METADATA_CHANGE_LOG_TIMESERIES_TOPIC_NAME`: The name of the topic for Metadata Change Logs that are produced for Timeseries Aspects.
- `PLATFORM_EVENT_TOPIC_NAME`: The name of the topic for Platform Events (high-level semantic events).
- `DATAHUB_USAGE_EVENT_NAME`: The name of the topic for product analytics events.
- (Deprecated) `METADATA_AUDIT_EVENT_NAME`: The name of the deprecated metadata audit event topic.

### DataHub Frontend (datahub-frontend-react)

@@ -131,12 +127,12 @@ configurations inside your `values.yaml` file.
datahub-gms:
...
extraEnvs:
- name: METADATA_CHANGE_EVENT_NAME
value: "MetadataChangeEvent"
- name: METADATA_AUDIT_EVENT_NAME
value: "MetadataAuditEvent"
- name: FAILED_METADATA_CHANGE_EVENT_NAME
value: "FailedMetadataChangeEvent"
- name: METADATA_CHANGE_PROPOSAL_TOPIC_NAME
value: "CustomMetadataChangeProposal_v1"
- name: METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME
value: "CustomMetadataChangeLogVersioned_v1"
- name: FAILED_METADATA_CHANGE_PROPOSAL_TOPIC_NAME
value: "CustomFailedMetadataChangeProposal_v1"
- name: KAFKA_CONSUMER_GROUP_ID
value: "my-apps-mae-consumer"
....
@@ -150,11 +146,17 @@ datahub-frontend:
# If standalone consumers are enabled
datahub-mae-consumer:
extraEnvs:
- name: KAFKA_TOPIC_NAME
- name: METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME
value: "CustomMetadataChangeLogVersioned_v1"
....
- name: METADATA_AUDIT_EVENT_NAME
value: "MetadataAuditEvent"
datahub-mce-consumer:
extraEnvs:
- name: KAFKA_MCE_TOPIC_NAME
- name: METADATA_CHANGE_PROPOSAL_TOPIC_NAME
value: "CustomMetadataChangeProposal_v1"
....
- name: METADATA_CHANGE_EVENT_NAME
value: "MetadataChangeEvent"
....
```
5 changes: 5 additions & 0 deletions docs/how/updating-datahub.md
@@ -10,6 +10,11 @@ This file documents any backwards-incompatible changes in DataHub and assists pe

### Deprecations

- `KAFKA_TOPIC_NAME` environment variable in **datahub-mae-consumer** and **datahub-gms** is now deprecated. Use `METADATA_AUDIT_EVENT_NAME` instead.
- `KAFKA_MCE_TOPIC_NAME` environment variable in **datahub-mce-consumer** and **datahub-gms** is now deprecated. Use `METADATA_CHANGE_EVENT_NAME` instead.
- `KAFKA_FMCE_TOPIC_NAME` environment variable in **datahub-mce-consumer** and **datahub-gms** is now deprecated. Use `FAILED_METADATA_CHANGE_EVENT_NAME` instead.
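
The three renames listed here lend themselves to a mechanical pre-upgrade check. The following is only a sketch under assumptions: `DeprecatedTopicEnvCheck` and `warnings` are hypothetical names that do not exist in DataHub, and only the variable names are taken from the list above. It reports each deprecated variable that is set without its replacement.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical pre-upgrade check (not part of DataHub): reports deprecated
// topic-name variables that are set while their replacement is not.
class DeprecatedTopicEnvCheck {

    // Deprecated variable -> replacement, as listed in the deprecation notes.
    static final Map<String, String> REPLACEMENTS = new LinkedHashMap<>();
    static {
        REPLACEMENTS.put("KAFKA_TOPIC_NAME", "METADATA_AUDIT_EVENT_NAME");
        REPLACEMENTS.put("KAFKA_MCE_TOPIC_NAME", "METADATA_CHANGE_EVENT_NAME");
        REPLACEMENTS.put("KAFKA_FMCE_TOPIC_NAME", "FAILED_METADATA_CHANGE_EVENT_NAME");
    }

    // Returns one warning per deprecated variable found without its replacement.
    static List<String> warnings(Map<String, String> env) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : REPLACEMENTS.entrySet()) {
            if (env.containsKey(e.getKey()) && !env.containsKey(e.getValue())) {
                out.add(e.getKey() + " is deprecated; set " + e.getValue() + " instead");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Check the environment of the current process.
        warnings(System.getenv()).forEach(System.out::println);
    }
}
```

Passing the environment in as a map, rather than reading `System.getenv()` inside `warnings`, keeps the check easy to run against a parsed `docker.env` file as well.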


### Other notable Changes
- #5132 Profile tables in `snowflake` source only if they have been updated since configured (default: `1`) number of day(s). Update the config `profiling.profile_if_updated_since_days` as per your profiling schedule or set it to `None` if you want older behaviour.

@@ -50,11 +50,11 @@ public class MetadataChangeEventsProcessor {

private final Histogram kafkaLagStats = MetricUtils.get().histogram(MetricRegistry.name(this.getClass(), "kafkaLag"));

@Value("${KAFKA_FMCE_TOPIC_NAME:" + Topics.FAILED_METADATA_CHANGE_EVENT + "}")
@Value("${FAILED_METADATA_CHANGE_EVENT_NAME:${KAFKA_FMCE_TOPIC_NAME:" + Topics.FAILED_METADATA_CHANGE_EVENT + "}}")
private String fmceTopicName;

@KafkaListener(id = "${METADATA_CHANGE_EVENT_KAFKA_CONSUMER_GROUP_ID:mce-consumer-job-client}", topics =
"${KAFKA_MCE_TOPIC_NAME:" + Topics.METADATA_CHANGE_EVENT + "}", containerFactory = "kafkaEventConsumer")
"${METADATA_CHANGE_EVENT_NAME:${KAFKA_MCE_TOPIC_NAME:" + Topics.METADATA_CHANGE_EVENT + "}}", containerFactory = "kafkaEventConsumer")
public void consume(final ConsumerRecord<String, GenericRecord> consumerRecord) {
kafkaLagStats.update(System.currentTimeMillis() - consumerRecord.timestamp());
final GenericRecord record = consumerRecord.value();
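
The nested placeholders in this hunk, such as `${FAILED_METADATA_CHANGE_EVENT_NAME:${KAFKA_FMCE_TOPIC_NAME:...}}`, give the new variable precedence, then the deprecated one, then the built-in default. A minimal sketch of that lookup order in plain Java, without Spring, follows. `TopicNameResolver` and `resolve` are hypothetical names, and the default string is assumed from the env file comments (`FailedMetadataChangeEvent_v4`) rather than read from `Topics`.

```java
import java.util.Map;

// Hypothetical helper (not part of DataHub): mimics the lookup order of
// "${NEW_NAME:${OLD_NAME:default}}" using a plain map instead of Spring.
class TopicNameResolver {

    // Returns the value of the first key that is present, in the order given,
    // or defaultValue when none of the keys is present.
    static String resolve(Map<String, String> env, String defaultValue, String... keys) {
        for (String key : keys) {
            String value = env.get(key);
            if (value != null) {
                return value;
            }
        }
        return defaultValue;
    }

    public static void main(String[] args) {
        // New name first, deprecated name second, assumed default last.
        String topic = resolve(System.getenv(), "FailedMetadataChangeEvent_v4",
            "FAILED_METADATA_CHANGE_EVENT_NAME", "KAFKA_FMCE_TOPIC_NAME");
        System.out.println(topic);
    }
}
```

With this ordering, a deployment that still sets only `KAFKA_FMCE_TOPIC_NAME` keeps working, while one that sets both variables uses the new name.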