
Service Discovery. More Than It Seems. Part 2

In the first episode, we successfully fetched data from Mesos Marathon directly into Spring Cloud beans. We also ran into our first problems, one of which we will analyze in this part of the story.

Let's recall our connection configuration for Marathon:

            scheme: http       #url scheme
            host: marathon     #marathon host
            port: 8080         #marathon port

What problems do we see here? First, we connect without any authorization, which is strange for production use. Second, we can specify only one host and port. In principle, we could try hiding several masters behind a single load balancer or DNS name, but that creates an additional point of failure, which we want to avoid.

Password - whole head!

Two authorization mechanisms are available to us: Basic and Token. Basic authorization is hackneyed because every developer knows it, and its essence is painfully dull. Take the login and password. Glue them together with :. Encode the result in Base64. Add the HTTP header Authorization with the value Basic <Base64>. That's all.
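The Basic recipe above can be sketched with nothing but the JDK. This is only an illustration of the header format: the class BasicAuthHeader and its method are hypothetical names, not part of the Marathon client, and the credentials are the sample ones from our configuration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of the Marathon client: builds the value of
// the Authorization header for Basic authentication.
final class BasicAuthHeader {

    static String of(String username, String password) {
        // Glue login and password with ':' and encode the result in Base64.
        String credentials = username + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        System.out.println("Authorization: " + of("marathon", "mesos"));
    }
}
```

In the real client this work is done by an interceptor, so we never build the header by hand.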

With a token, it is slightly more challenging. It is unavailable in the open-source implementation, but this method is reasonable for those who use DC/OS. For this purpose, we just need to add a slightly different authorization header:

Authorization: token=<auth_token>  

Thus we can add several necessary properties to our configuration:

            token: <dcos_acs_token>
            username: marathon
            password: mesos

From here, we can follow simple priorities. If a token is specified, we use it. Otherwise, we take the login and password and use Basic authorization. If neither is present, we create the client without authorization.

Feign.Builder builder = Feign.builder()  
        .encoder(new GsonEncoder(ModelUtils.GSON))
        .decoder(new GsonDecoder(ModelUtils.GSON))
        .errorDecoder(new MarathonErrorDecoder());

if (!StringUtils.isEmpty(token)) {
    builder.requestInterceptor(new TokenAuthRequestInterceptor(token));
} else if (!StringUtils.isEmpty(username)) {
    builder.requestInterceptor(new BasicAuthRequestInterceptor(username, password));
}

builder.requestInterceptor(new MarathonHeadersInterceptor());

return builder.target(Marathon.class, baseEndpoint);

The Marathon client is implemented with Feign, a declarative HTTP client that can be extended with interceptors. In our case, they add extra HTTP headers to each request. After that, the builder constructs a proxy object for the interface it is given, with the additional behavior attached. The interface should have one or more methods that can be called on the remote side:

public interface Marathon {  
    // Apps
    @RequestLine("GET /v2/apps")
    GetAppsResponse getApps() throws MarathonException;

    // Other methods
}
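To make the idea of a proxy object built from an interface more tangible, here is a minimal sketch using the JDK's java.lang.reflect.Proxy. It is not how the Marathon client is written, and Feign's internals are far richer. The @Line annotation, the DemoApi interface and the ProxyDemo class are hypothetical stand-ins, and the proxy only returns a description of the request instead of sending it.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical stand-in for a request-line annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Line {
    String value();
}

// Hypothetical API interface, shaped like the Marathon interface above.
interface DemoApi {
    @Line("GET /v2/apps")
    String getApps();
}

final class ProxyDemo {

    // Builds a proxy that, instead of sending a request, returns a text
    // description of the request it would send.
    static <T> T target(Class<T> api, String baseUrl) {
        InvocationHandler handler = (proxy, method, args) -> {
            Line line = method.getAnnotation(Line.class);
            if (line == null) {
                throw new UnsupportedOperationException(method.getName());
            }
            String[] parts = line.value().split(" ", 2); // HTTP verb and path
            return parts[0] + " " + baseUrl + parts[1];
        };
        Object instance = Proxy.newProxyInstance(
                api.getClassLoader(), new Class<?>[] {api}, handler);
        return api.cast(instance);
    }

    public static void main(String[] args) {
        DemoApi api = target(DemoApi.class, "http://marathon:8080");
        System.out.println(api.getApps());
    }
}
```

The caller sees an ordinary Java object, while everything about the remote call is derived from the annotated interface.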

So, the warm-up is over; now we will take on a more complex challenge.

Fault-tolerant client

If we have deployed a production installation of Mesos and Marathon, there will be more than one master from which we can read data. Moreover, some of them may happen to be unavailable, broken, or under maintenance. Being unable to obtain information leads to stale data on the client side and, therefore, at some point to wrong decisions. Or, say, during an update of the application software we may not receive the list of instances at all and will be out of service. None of that is good. We need smart client-side load balancing to fix it.

It is logical to use Ribbon as the most appropriate candidate because it is already used for client-side load balancing inside Spring Cloud. We will talk more about balancing strategies in upcoming articles. For now, we limit ourselves to the basic functionality required to solve our problem.

First of all, we need to use the balancer in the Feign client:

Feign.Builder builder = Feign.builder()  
            .client(RibbonClient.builder().lbClientFactory(new MarathonLBClientFactory()).build());

You may ask: what is lbClientFactory, and why should we use our own? In short, this factory constructs the client load balancer. By default, the Feign client lacks an important feature: retrying calls if something goes wrong. To make retries possible, we add a retry handler during object construction:

public static class MarathonLBClientFactory implements LBClientFactory {  
    public LBClient create(String clientName) {
        LBClient client = new LBClientFactory.Default().create(clientName);
        IClientConfig config = ClientFactory.getNamedConfig(clientName);
        client.setRetryHandler(new DefaultLoadBalancerRetryHandler(config));
        return client;
    }
}

Don't worry that our retry handler has the prefix Default. It contains everything we need. Let's configure it.

An application may contain many Feign clients, and the Marathon client is only one of them, so the properties follow this pattern:

<clientName>.ribbon.<propertyName>=<value>


In our case:


All properties for the client load balancer are stored in a configuration manager named Archaius. In our case, these properties are stored in memory, and we can add them on the fly. To do this, we add a helper method setMarathonRibbonProperty to our modified client. In this method, we set the various properties using the following pattern:

ConfigurationManager.getConfigInstance().setProperty(MARATHON_SERVICE_ID_RIBBON_PREFIX + suffix, value);  

Now, before constructing the Feign client, we should initialize them:

setMarathonRibbonProperty("listOfServers", listOfServers);  
setMarathonRibbonProperty("OkToRetryOnAllOperations", Boolean.TRUE.toString());  
setMarathonRibbonProperty("MaxAutoRetriesNextServer", 2);  
setMarathonRibbonProperty("ConnectTimeout", 100);  
setMarathonRibbonProperty("ReadTimeout", 300);  

What is interesting here? First, listOfServers. It is a comma-separated list of all host and port pairs of the Marathon masters. In our case, we add a proxy property that is translated into the Ribbon property:

            listOfServers: m1:8080,m2:8080,m3:8080

So, now every new call to a master will go to one of these servers.
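To show what the listOfServers format encodes, here is a small sketch that splits such a value into host and port pairs. Ribbon does its own parsing, so this is only an illustration of the format, and the class ServerListParser is a hypothetical name.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a listOfServers value into host/port pairs.
final class ServerListParser {

    static List<InetSocketAddress> parse(String listOfServers) {
        List<InetSocketAddress> servers = new ArrayList<>();
        for (String entry : listOfServers.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            int colon = trimmed.lastIndexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Missing port in: " + trimmed);
            }
            String host = trimmed.substring(0, colon);
            int port = Integer.parseInt(trimmed.substring(colon + 1));
            // createUnresolved avoids a DNS lookup; we only want the pair itself.
            servers.add(InetSocketAddress.createUnresolved(host, port));
        }
        return servers;
    }

    public static void main(String[] args) {
        for (InetSocketAddress server : parse("m1:8080,m2:8080,m3:8080")) {
            System.out.println(server.getHostString() + " -> " + server.getPort());
        }
    }
}
```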

We should also not forget to set OkToRetryOnAllOperations to true to enable retries.

The maximum retry count is set in the MaxAutoRetriesNextServer property. Why does it have the NextServer suffix? It's simple: there is also a MaxAutoRetries option, which defines the retry count on the first server before the next one is used. By default, it is 0, which means that after the first failed attempt the client goes to the next candidate immediately. Also remember that MaxAutoRetriesNextServer defines the retry count excluding the first attempt.
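The interplay of the two counters is easier to see in a toy model. The sketch below is a simplification and not Ribbon's actual code: the class RetryModel is hypothetical, servers are picked in plain round-robin order, and a predicate stands in for the real HTTP call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Simplified model of the retry budget, not Ribbon's actual implementation.
final class RetryModel {

    // Returns the servers contacted, in order, until one call succeeds
    // or the retry budget is exhausted.
    static List<String> attempts(List<String> servers,
                                 int maxAutoRetries,
                                 int maxAutoRetriesNextServer,
                                 Predicate<String> isUp) {
        List<String> contacted = new ArrayList<>();
        // The first server plus up to maxAutoRetriesNextServer further servers.
        for (int s = 0; s <= maxAutoRetriesNextServer; s++) {
            String server = servers.get(s % servers.size()); // round-robin for simplicity
            // The first attempt plus up to maxAutoRetries retries on the same server.
            for (int r = 0; r <= maxAutoRetries; r++) {
                contacted.add(server);
                if (isUp.test(server)) {
                    return contacted;
                }
            }
        }
        return contacted; // every attempt failed
    }

    public static void main(String[] args) {
        List<String> servers = List.of("m1:8080", "m2:8080", "m3:8080");
        // m1 and m2 are down, m3 answers.
        System.out.println(attempts(servers, 0, 2, name -> name.startsWith("m3")));
    }
}
```

In this model, with MaxAutoRetries at 0 and MaxAutoRetriesNextServer at 2, a request is tried on at most three servers, once each.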

Finally, to avoid waiting too long on a connection, we should set ConnectTimeout and ReadTimeout within reasonable limits.
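For readers who have not met these two timeouts before, the plain JDK has a pair with the same meaning. The sketch below is only an analogy: it uses HttpURLConnection, which our client does not use, and it assumes the values are milliseconds, as they are for the JDK methods. The class TimeoutDemo is a hypothetical name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical demo of connect and read timeouts at the plain JDK level.
final class TimeoutDemo {

    static HttpURLConnection open(String url, int connectTimeoutMs, int readTimeoutMs) {
        try {
            // openConnection() only creates the object; nothing is sent yet.
            HttpURLConnection connection =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            // Maximum time to establish the connection.
            connection.setConnectTimeout(connectTimeoutMs);
            // Maximum time to wait for data once connected.
            connection.setReadTimeout(readTimeoutMs);
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection connection = open("http://m1:8080/v2/apps", 100, 300);
        System.out.println(connection.getConnectTimeout() + " / " + connection.getReadTimeout());
    }
}
```

A short connect timeout lets the client give up on a dead master quickly and move on to the next one, while the read timeout bounds how long a slow master can hold a request.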


In this part of the series, we've made the Marathon client fault-tolerant and more customizable. Moreover, we use solutions that are already used in Spring Cloud. But we are still far from our goal because the most interesting part is not ready yet.

See you in the next part
