Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

package sd treats every endpoint independently, which is inefficient #403

Closed
buptmiao opened this issue Dec 7, 2016 · 5 comments
Closed

Comments

@buptmiao
Copy link
Contributor

buptmiao commented Dec 7, 2016

I'm new to go-kit. I saw the code below when I read the apigateway example. I am confused that every endpoint has a respective subscriber, and every subscriber will start 2 goroutines: one watcher, and one loop. Here is the problem, If I have a microservice with too many endpoints, at client-side the watchers will watch the same key in parallel. Actually, only one watcher
is necessary. I consider this can cost a lot of network bandwidth. So, is there a resolution?

var (
			tags        = []string{}
			passingOnly = true
			endpoints   = addsvc.Endpoints{}
		)
		{
			factory := addsvcFactory(addsvc.MakeSumEndpoint, tracer, logger)
			subscriber := consulsd.NewSubscriber(client, factory, logger, "addsvc", tags, passingOnly)
			balancer := lb.NewRoundRobin(subscriber)
			retry := lb.Retry(*retryMax, *retryTimeout, balancer)
			endpoints.SumEndpoint = retry
		}
		{
			factory := addsvcFactory(addsvc.MakeConcatEndpoint, tracer, logger)
			subscriber := consulsd.NewSubscriber(client, factory, logger, "addsvc", tags, passingOnly)
			balancer := lb.NewRoundRobin(subscriber)
			retry := lb.Retry(*retryMax, *retryTimeout, balancer)
			endpoints.ConcatEndpoint = retry
		}
@peterbourgon
Copy link
Member

Yes, this is an inefficiency in the current design. I tried to solve it once with the concept of an instance, but it didn't work out. I'm open to suggestions! Labeling and renaming the issue accordingly.

@peterbourgon peterbourgon changed the title Is every endpoint has a relative subscriber with a sd. package sd treats every endpoint independently, which is inefficient Dec 7, 2016
@buptmiao
Copy link
Contributor Author

buptmiao commented Dec 7, 2016

I think the problem is caused by the Subscriber interface definition.

type Subscriber interface {
	Endpoints() ([]endpoint.Endpoint, error)
}

The method Endpoints returns a endpoints list, which means that a subscriber can determine the transport layer, and it does not make sense. So, I think the relative definition maybe like this:

type Subscriber interface {
	Instances() ([]string, error)
}
type Balancer interface {
        Instance() (string, error)
}

And put the factory into the retry layer, So, there is a way we can reuse subscriber and balancer for each endpoint.
Then the example maybe like this:

subscriber := consulsd.NewSubscriber(client, logger, "addsvc", tags, passingOnly)
balancer := lb.NewRoundRobin(subscriber)
                {
			factory := addsvcFactory(addsvc.MakeSumEndpoint, tracer, logger)
			retry := lb.Retry(*retryMax, *retryTimeout, balancer, factory)
			endpoints.SumEndpoint = retry
		}
		{
			factory := addsvcFactory(addsvc.MakeConcatEndpoint, tracer, logger)
			retry := lb.Retry(*retryMax, *retryTimeout, balancer, factory)
			endpoints.ConcatEndpoint = retry
		}

@basvanbeek
Copy link
Member

basvanbeek commented Jan 14, 2017

To elaborate more on the inefficiency, the example makes it seem clear cut that @buptmiao's proposal makes a big difference but actually it doesn't. I've provided Peter with real production examples of both angles implemented and the current SD package implementation for services holding multiple middlewares, multiple functions and SD + LB have so far been easier to build under the current API.

We acknowledge potential room for improvement but unfortunately it's not easy to demonstrate and might turn out that the problem is just shifted to a different location rather than solved.

POC's are super welcome!

@yurishkuro
Copy link
Contributor

Seems this is similar to #463. The PR #478 I opened allows implementations to expose string-based API just for service locations. However, to address your issue directly the discovery implementations should change to not implement Subscriber, and therefore not require a Factory, which would be the next change after #478. I think factory is only needed by the Cache that manages endpoints, so the code would change to:

discoverer := consulsd.NewSubscriber(client, /*- factory,*/ logger, "addsvc", tags, passingOnly)
{
	factory := addsvcFactory(addsvc.MakeSumEndpoint, tracer, logger)
        cache := endpoint.NewCache(factory, discoverer, logger)
        balancer := lb.NewRoundRobin(cache)
	retry := lb.Retry(*retryMax, *retryTimeout, balancer /*, - factory*/) 
	endpoints.SumEndpoint = retry
}
{
	factory := addsvcFactory(addsvc.MakeConcatEndpoint, tracer, logger)
        cache := endpoint.NewCache(factory, discoverer, logger)
        balancer := lb.NewRoundRobin(cache)
	retry := lb.Retry(*retryMax, *retryTimeout, balancer /*, - factory*/) 
	endpoints.ConcatEndpoint = retry
}

@peterbourgon
Copy link
Member

I thiiiink this can be closed after #492. Unless the OP objects I'll do that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants