Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Enable request hedging for WebClient #806

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
59 changes: 59 additions & 0 deletions docs/src/main/asciidoc/spring-cloud-commons.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -1055,3 +1055,62 @@ myrandom=${cachedrandom.appname.value}
== Configuration Properties

To see the list of all Spring Cloud Commons related configuration properties please check link:appendix.html[the Appendix page].

== Spring Cloud Request Hedging
=== Introduction
Request hedging is a mechanism to improve performance of idempotent requests with large worst-case response times. It
does this by automatically retrying after the original request has been active for some configurable amount of time,
usually based on a quantile of the request's typical response time. Unlike timing out the request and retrying, request
hedging will keep the original requests and return the first request to succeed. This trades off a small percentage
of additional network traffic to reduce the long-tail latency.

=== Usage
You can configure a `WebClient` to do request hedging by creating a `HedgerPolicy` `@Bean` and use the `@Hedged`
qualifier. The `HedgerPolicy` allows you to specify how long to wait before executing the extra requests, a maximum
number of extra requests to create. Most users will want the `LatencyPercentileHedgerPolicy`, which executes new
requests when the previous ones have exceeded some percentile of the historical latency.

This example shows a typical usage:

====
[source,java,indent=0]
----
@Configuration
public class MyConfiguration {
@Bean
LatencyPercentileHedgerPolicy hedgerPolicy() {
return new LatencyPercentileHedgerPolicy(
3, // Maximum extra requests to execute
0.95 // Percentile of latency to wait before executing an additional request
);
}

@Bean
HedgerPolicyFactory hedgerPolicyFactory(LatencyPercentileHedgerPolicy policy) {
return new LatencyPercentileHedgerPolicyFactory(policy);
}

@Hedged
@Bean
WebClient.Builder webClientBuilder() {
return WebClient.builder();
}
}

public class MyClass {
@Autowired
private WebClient.Builder webClientBuilder;

public Mono<String> doOtherStuff() {
return webClientBuilder.build().get().uri("http://www.example.com/")
.retrieve().bodyToMono(String.class);
}
}
----
====

=== Reporting
Hedging also provides a `HedgerListener`. A hedged service request could involve more than one actual HTTP request. The
listener provides a mechanism to track each individual HTTP request. For example if we wanted to track latencies,
we might want to track each HTTP request separately rather than the unified whole. These listeners can be provided to
the `HedgerPolicyFactory`.
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
/*
* Copyright 2013-2020 the original author or authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.springframework.cloud.client.hedger;

import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Inherited;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

import org.springframework.beans.factory.annotation.Qualifier;

/**
* Annotation for a method invocation that is hedge-able. Hedging requests is a way to retry requests that have long
* worst-case response times. If a request is known to be safe to issue multiple times, we will fire off multiple
* requests, potentially after some initial delay.
*
* This is useful to try to cut down on the long tail latencies of requests. For example, one could set the delay
* to the 95th percentile latency of the downstream service. Typically that will result in a significant reduction of
* 95th percentile latency in exchange for 2-5% traffic increase. (The actual numbers depend on the distribution of
* latencies.)
*
* {@see http://accelazh.github.io/storage/Tail-Latency-Study} for more background on how hedging works.
*
* @author Csaba Kos
* @author Kevin Binswanger
*/
@Target({ ElementType.FIELD, ElementType.PARAMETER, ElementType.METHOD })
@Retention(RetentionPolicy.RUNTIME)
@Documented
@Inherited
@Qualifier
public @interface Hedged {

}
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
/*
* Copyright 2012-2020 the original author or authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.springframework.cloud.client.hedger;

import org.springframework.beans.BeansException;
import org.springframework.beans.factory.config.BeanPostProcessor;
import org.springframework.context.ApplicationContext;
import org.springframework.lang.NonNull;
import org.springframework.web.reactive.function.client.WebClient;

/**
* A {@link BeanPostProcessor} that applies
* {@link HedgerExchangeFilterFunction} filter to all
* {@link WebClient.Builder} instances annotated with {@link Hedged}.
*
* @author Kevin Binswanger
*/
public class HedgedWebClientBuilderBeanPostProcessor implements BeanPostProcessor {
private final HedgerExchangeFilterFunction exchangeFilterFunction;
private final ApplicationContext context;

public HedgedWebClientBuilderBeanPostProcessor(
HedgerExchangeFilterFunction exchangeFilterFunction,
ApplicationContext context
) {
this.exchangeFilterFunction = exchangeFilterFunction;
this.context = context;
}

@Override
public Object postProcessBeforeInitialization(@NonNull Object bean, String beanName)
throws BeansException {
if (bean instanceof WebClient.Builder) {
if (context.findAnnotationOnBean(beanName, Hedged.class) == null) {
return bean;
}
((WebClient.Builder) bean).filter(exchangeFilterFunction);
}
return bean;
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,114 @@
/*
* Copyright 2013-2020 the original author or authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.springframework.cloud.client.hedger;


import java.time.Duration;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import reactor.util.function.Tuple2;

import org.springframework.lang.NonNull;
import org.springframework.web.reactive.function.client.ClientRequest;
import org.springframework.web.reactive.function.client.ClientResponse;
import org.springframework.web.reactive.function.client.ExchangeFilterFunction;
import org.springframework.web.reactive.function.client.ExchangeFunction;

/**
* WebClient filter function that allows for "hedging" requests that take too long: after the specified delay, if
* the original request was not completed yet, it will fire one or more "hedged" requests in hopes that they will
* return sooner than the original request.
*
* This is useful to try to cut down on the long tail latencies of requests. For example, one could set the delay
* to the 95th percentile latency of the downstream service. Typically that will result in a significant reduction of
* 95th percentile latency in exchange for 2-5% traffic increase. (The actual numbers depend on the distribution of
* latencies.)
*
* {@see http://accelazh.github.io/storage/Tail-Latency-Study} for more background on how hedging works.
*
* @author Csaba Kos
* @author Kevin Binswanger
*/
public class HedgerExchangeFilterFunction implements ExchangeFilterFunction {

private static final Log log = LogFactory.getLog(HedgerExchangeFilterFunction.class);

private final HedgerPolicyFactory hedgerPolicyFactory;

public HedgerExchangeFilterFunction(HedgerPolicyFactory hedgerPolicyFactory) {
this.hedgerPolicyFactory = hedgerPolicyFactory;
}

@Override
@NonNull
public Mono<ClientResponse> filter(@NonNull ClientRequest request, ExchangeFunction next) {
HedgerPolicy hedgerPolicy = hedgerPolicyFactory.getHedgingPolicy(request);
HedgerListener[] hedgerListeners = hedgerPolicyFactory.getHedgingListeners(request);
Duration delay = hedgerPolicy.getDelayBeforeHedging(request);
int numHedges = numberOfHedgedRequestsDelayAware(request, hedgerPolicy, delay);
return withSingleMetricsReporting(hedgerListeners, request, next.exchange(request), 0)
.mergeWith(
Flux.range(1, numHedges)
.delayElements(delay)
.flatMap(hedgeNumber -> withSingleMetricsReporting(hedgerListeners, request, next.exchange(request), hedgeNumber)
.onErrorResume(throwable -> {
if (log.isDebugEnabled()) {
log.debug("Hedged request " + hedgeNumber + " to " + request.url() + " failed", throwable);
}
return Mono.empty();
}))
)
.next();
}

private int numberOfHedgedRequestsDelayAware(
ClientRequest request,
HedgerPolicy hedgerPolicy,
Duration delay
) {
if (delay.isNegative()) {
return 0;
}
else {
return hedgerPolicy.getMaximumHedgedRequests(request);
}
}

private Mono<ClientResponse> withSingleMetricsReporting(
HedgerListener[] hedgerListeners,
ClientRequest request,
Mono<ClientResponse> response,
int hedgeNumber
) {
return response
.elapsed()
.doOnSuccess(tuple -> {
for (HedgerListener hedgerListener : hedgerListeners) {
try {
hedgerListener.record(request, tuple.getT2(), tuple.getT1(), hedgeNumber);
}
catch (Exception e) {
log.warn("Hedger listener threw an exception trying to report on attempt " + hedgeNumber, e);
}
}
})
.map(Tuple2::getT2);
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
/*
* Copyright 2013-2020 the original author or authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.springframework.cloud.client.hedger;

import org.springframework.web.reactive.function.client.ClientRequest;
import org.springframework.web.reactive.function.client.ClientResponse;

/**
* Hedging could create multiple requests to a given service where an application previously made one. This provides
* a hedging-aware mechanism to do any special handling necessary. Example uses:
* <ul>
* <li>Track the latency of each individual request instead of the total batch<./li>
* <li>Tag a metric with the request number.</li>
* <li>Track the number of hedged requests made.</li>
* </ul>
* @author Csaba Kos
* @author Kevin Binswanger
*/
@FunctionalInterface
public interface HedgerListener {
/**
* Invoked by {@link HedgerExchangeFilterFunction} on every individual attempt.
* @param request The HTTP request.
* @param response The eventual HTTP response.
* @param elapsedMillis The number of milliseconds elapsed just for this attempt (ignores time spent waiting on
* previous attempts).
* @param requestNumber Which attempt this is. The original will be 0, and the first hedged request will be 1.
*/
void record(ClientRequest request,
ClientResponse response,
long elapsedMillis,
int requestNumber);
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
/*
* Copyright 2013-2020 the original author or authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.springframework.cloud.client.hedger;

import java.time.Duration;

import org.springframework.web.reactive.function.client.ClientRequest;
import org.springframework.web.reactive.function.client.WebClient;

/**
* Controls the parameters for a {@link WebClient.Builder} annotated with {@link Hedged}.
* @author Kevin Binswanger
*/
public interface HedgerPolicy {
/**
* The maximum number of requests that could be made if the original does not succeed in time.
* @param request The HTTP request to be made.
* @return How many extra requests can be made. 0 or less will do no hedging.
*/
int getMaximumHedgedRequests(ClientRequest request);

/**
* How long to wait after each request to execute the next one. When this time is reached, a single extra request
* will be made. Then that duration has to pass *again* before making the second request, and so on.
* @param request The HTTP request to be made.
* @return How long to wait after each request.
*/
Duration getDelayBeforeHedging(ClientRequest request);
}
Loading