Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(plugins): spring custom plugins #10389

Merged
merged 31 commits into from
May 9, 2024
Merged

Conversation

david-leifker
Copy link
Collaborator

@david-leifker david-leifker commented Apr 26, 2024

Allow custom plugins to use Spring for injection and autoconfiguration of custom plugins.

This is in addition to the ClassGraph (non-Spring) plugin loader. Spring is optional.

Created a single 20MB dependency for plugins to use. Created a separate metadata-io-api module to isolate most of the non-essential classes.

The artifact is io.acryl:datahub-custom-plugin-lib found in the module :metadata-integration:java:custom-plugin-lib

Notes: The Spring context is isolated so all required classes must be present in the packageScan configuration for Spring to find.

Additional Changes:

  • Class graph - optimize class scanning and added cache
  • OpenAPIv3 Spec Generator fixes

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

…om-plugins

# Conflicts:
#	metadata-io/src/main/java/com/linkedin/metadata/service/BusinessAttributeUpdateHookService.java
#	metadata-jobs/pe-consumer/src/main/java/com/datahub/event/hook/BusinessAttributeUpdateHook.java
#	metadata-jobs/pe-consumer/src/test/java/com/datahub/event/hook/BusinessAttributeUpdateHookTest.java
#	metadata-service/configuration/src/main/resources/application.yaml
@github-actions github-actions bot added the devops PR or Issue related to DataHub backend & deployment label Apr 26, 2024
@kevin1chun
Copy link
Contributor

@david-leifker in this change then, the plugin classloader will no longer have access to the core application classloader?
So the plugin jar would need to contain a specific version of spring?

I know we discussed this last time, and it seems like this will help by adding isolation between core application and plugin dependencies. It doesn't help us solve for initializing core components like logging and metrics in the core app though. That could be solved separately, but just wanted to confirm my assumption there

@kevin1chun
Copy link
Contributor

Another question, if we are able to reference the main application spring context from the plugin spring context, can we pull in the metrics reporter exclusively?

@david-leifker
Copy link
Collaborator Author

@david-leifker in this change then, the plugin classloader will no longer have access to the core application classloader? So the plugin jar would need to contain a specific version of spring?

I know we discussed this last time, and it seems like this will help by adding isolation between core application and plugin dependencies. It doesn't help us solve for initializing core components like logging and metrics in the core app though. That could be solved separately, but just wanted to confirm my assumption there

I believe that the classes already imported would be available to plugins. The Spring context is isolated and the packageScan can include pages to load into this isolated context which is used for the plugin. The packages are not restricted to any package location so beans or components from anything can be autowired.

@kevin1chun
Copy link
Contributor

@david-leifker in this change then, the plugin classloader will no longer have access to the core application classloader? So the plugin jar would need to contain a specific version of spring?
I know we discussed this last time, and it seems like this will help by adding isolation between core application and plugin dependencies. It doesn't help us solve for initializing core components like logging and metrics in the core app though. That could be solved separately, but just wanted to confirm my assumption there

I believe that the classes already imported would be available to plugins. The Spring context is isolated and the packageScan can include pages to load into this isolated context which is used for the plugin. The packages are not restricted to any package location so beans or components from anything can be autowired.

Awesome, that's helpful context. So in our case in which we want to reference classes that are contained in the core application classloader, we would just initialize our plugin with the directive to load packages in datahub gms that contain beans that we want.
There's probably some more complexity here though since we would need to understand all of the dependent paths to load as well.

@david-leifker david-leifker marked this pull request as ready for review May 7, 2024 19:12
@@ -105,6 +85,95 @@ public PluginFactory loadPlugins() {
return this;
}

/**
* Memory intensive operation because of the size of the jars. Limit packages, classes scanned,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the limit being applied?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The packages and class names are used in the acceptPackages and acceptClass

new ClassGraph()
.acceptPackages(packageNames.stream().distinct().toArray(String[]::new) .acceptClasses(classNames.stream().distinct().toArray(String[]::new))

IntStream.concat(
IntStream.of(baseClazz.getName().hashCode()),
configs.stream().mapToInt(AspectPluginConfig::hashCode)))
.sum();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, so this is creating a key like
sum ( classLoaders.hashCode..., aspectPluginConfigClass.hashCode... )

Is this guaranteed to be unique?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unique within a given jvm execution. It might leak a bit when/if jars are being constantly replaced since each time that happens we'll be creating orphans. This was primarily added to help with tests where we run multiple concurrent test threads and memory usage was triggering OOM.

# Conflicts:
#	entity-registry/src/main/java/com/linkedin/metadata/aspect/validation/StructuredPropertiesValidator.java
@david-leifker david-leifker merged commit 6ed21bd into master May 9, 2024
41 of 42 checks passed
@david-leifker david-leifker deleted the spring-custom-plugins branch May 9, 2024 19:56
sleeperdeep pushed a commit to sleeperdeep/datahub that referenced this pull request Jun 25, 2024
Co-authored-by: Kevin Chun <kevinchun@netflix.com>
Co-authored-by: Kevin Chun <kevin1chun@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
devops PR or Issue related to DataHub backend & deployment publish-docker release-notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants