ROR recovery behavior when auth endpoint fails

We’ve had 2 situations where Elasticsearch, for some reason, could no longer talk to the auth endpoint (we use a custom endpoint for both auth and groups_provider).

As far as we can see, the only way to recover from this situation is to restart Elasticsearch. I’m looking for information on how ROR is designed to handle transient auth errors: should it go into a controlled backoff/retry loop, or is a node restart the only way to recover whenever this occurs?

Stacktrace shows: Connection timed out: no further information
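
To illustrate what I mean by a controlled backoff/retry loop, here is a rough sketch in Python. It is not ROR's code and says nothing about how ROR works internally; `call_with_backoff`, `check_auth` and all parameter names are hypothetical, and the delays are arbitrary.

```python
import time


def call_with_backoff(check_auth, max_attempts=5, base_delay=0.5,
                      max_delay=8.0, sleep=time.sleep):
    """Call check_auth() and retry on connection-level failures.

    check_auth is a hypothetical zero-argument callable standing in for
    the request to the external auth endpoint. The wait between attempts
    doubles each time, capped at max_delay seconds.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return check_auth()
        except (ConnectionError, TimeoutError):
            # Out of attempts: surface the original error to the caller.
            if attempt == max_attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

With something like this, a short outage of the auth endpoint would only delay or fail the requests made during the outage, and later requests would succeed once the endpoint is reachable again, without a restart.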

Hello @trondhindenes, we’ll implement this directly in the new core rewrite. It will take 3 more weeks or so.

I’ve just checked, and it works in the new core. There is no need to restart anything. After the LDAP service recovers, all LDAP queries start working again.
