LDAP connection timeout leads to authentication error

I’m using an LDAP backend in my readonlyrest config. Once in a while users are unable to log in to Kibana. The logs indicate that readonlyrest cannot get a connection to the LDAP backend due to a timeout. After several attempts the plugin re-establishes the connection to LDAP and the user is granted access. This frustrates users, because the situation repeats itself every day.
I assembled a testing environment to replicate that behaviour. It appears that the readonlyrest plugin establishes connections during initialization (after Elasticsearch starts, or when the plugin configuration is changed). Then, after a while during which the connections are not used (I tested a 1-hour period), they time out. When a user then tries to authenticate, the plugin in Elasticsearch first tries to use the timed-out connection and gets an exception. After a while it tries to reconnect, but in the meantime the plugin in Kibana is told that access is forbidden.
I’ve tried to fiddle with the available timeout and cache parameters in the readonlyrest config to remediate this behaviour, but without success.

I’ve included a k8s YAML file with the testing deployment and the log files of all pods (elasticsearch, kibana and openldap).

Expected behaviour

The Elasticsearch plugin should try to reconnect to the auth backend before returning an “access denied” response to the client (the Kibana plugin).

Technical details

ROR Version: 1.67.3

Elasticsearch Version: 7.17.1

Logs and config files

Screenshots

{"customer_id": "9fdfc5d6-ebc4-4311-a12b-4b0f1e9d130e", "subscription_id": "e031e519-f2e9-4a01-b815-5d15b49d0665"}

Hi @hrr,

Thanks for reporting this.

In ROR, we use an LDAP connection pool, which is why you see long-lived connections on the LDAP server side.

We’ve improved the health checking of the pool’s connection, so you should not experience the issue anymore. Please, test this pre-build:

ROR 1.68.0-pre11 for 7.17.1

and let us know if the problem is gone.

Unfortunately, the problem still persists. Logs from Elasticsearch:

[2025-12-18T12:59:59,517][ERROR][t.b.r.a.b.Block          ] [elastic-test-1] [1d450c0c-afb4-4455-926d-8ea507636244-1446298784#159411] ldap_group: kibana: ldap_auth rule matching got an error LDAP returned code: timeout [85], cause: result code='85 (timeout)' diagnostic message='The asynchronous operation encountered a client-side timeout after waiting 10001 milliseconds for a response to arrive.'
tech.beshu.ror.accesscontrol.blocks.definitions.ldap.implementations.LdapUnexpectedResult: LDAP returned code: timeout [85], cause: result code='85 (timeout)' diagnostic message='The asynchronous operation encountered a client-side timeout after waiting 10001 milliseconds for a response to arrive.'
        at tech.beshu.ror.accesscontrol.blocks.definitions.ldap.implementations.LdapUnexpectedResult$.apply(UnboundidLdapUsersService.scala:112) ~[?:?]
        at tech.beshu.ror.accesscontrol.blocks.definitions.ldap.implementations.UnboundidLdapDefaultGroupSearchAuthorizationServiceWithServerSideGroupsFiltering.groupsFrom$$anonfun$2(UnboundidLdapDefaultGroupSearchAuthorizationServiceWithServerSideGroupsFiltering.scala:86) ~[?:?]
        at map @ tech.beshu.ror.accesscontrol.blocks.definitions.ldap.implementations.UnboundidLdapConnectionPool.process(UnboundidLdapConnectionPool.scala:44) ~[?:?]
        at flatMap @ tech.beshu.ror.accesscontrol.blocks.definitions.ldap.implementations.UnboundidLdapConnectionPool.process(UnboundidLdapConnectionPool.scala:45) ~[?:?]
        at flatMap @ tech.beshu.ror.accesscontrol.blocks.definitions.ldap.implementations.UnboundidLdapDefaultGroupSearchAuthorizationServiceWithServerSideGroupsFiltering.groupsFrom(UnboundidLdapDefaultGroupSearchAuthorizationServiceWithServerSideGroupsFiltering.scala:75) ~[?:?]
        at map @ tech.beshu.ror.accesscontrol.blocks.definitions.ldap.implementations.UnboundidLdapConnectionPool.process(UnboundidLdapConnectionPool.scala:44) ~[?:?]
        at flatMap @ tech.beshu.ror.accesscontrol.blocks.definitions.ldap.implementations.UnboundidLdapConnectionPool.process(UnboundidLdapConnectionPool.scala:45) ~[?:?]
        at flatMap @ tech.beshu.ror.accesscontrol.blocks.definitions.ldap.implementations.UnboundidLdapUsersService.fetchLdapUser(UnboundidLdapUsersService.scala:66) ~[?:?]
        at runAsync @ tech.beshu.ror.es.IndexLevelActionFilter.handleRequest(IndexLevelActionFilter.scala:205) ~[?:?]
        at flatMap @ tech.beshu.ror.accesscontrol.blocks.definitions.ldap.implementations.UnboundidLdapDefaultGroupSearchAuthorizationServiceWithServerSideGroupsFiltering.doFetchGroupsOf(UnboundidLdapDefaultGroupSearchAuthorizationServiceWithServerSideGroupsFiltering.scala:63) ~[?:?]
        at runAsync @ tech.beshu.ror.es.IndexLevelActionFilter.handleRequest(IndexLevelActionFilter.scala:205) ~[?:?]
        at map @ tech.beshu.ror.utils.TaskOps$.andThen$extension(TaskOps.scala:30) ~[?:?]
        at map @ tech.beshu.ror.accesscontrol.blocks.rules.auth.base.BaseAuthorizationRule.authorizeLoggedUser(BaseAuthorizationRule.scala:101) ~[?:?]
        at map @ tech.beshu.ror.accesscontrol.blocks.rules.auth.base.BaseAuthorizationRule.authorizeLoggedUser(BaseAuthorizationRule.scala:102) ~[?:?]
        at map @ tech.beshu.ror.utils.TaskOps$.measure$extension$$anonfun$1$$anonfun$2(TaskOps.scala:60) ~[?:?]
        at map @ tech.beshu.ror.accesscontrol.blocks.definitions.ldap.implementations.UnboundidLdapAuthenticationService.ldapAuthenticate(UnboundidLdapAuthenticationService.scala:62) ~[?:?]
        at map @ tech.beshu.ror.accesscontrol.blocks.definitions.ldap.implementations.UnboundidLdapAuthenticationService.ldapAuthenticate(UnboundidLdapAuthenticationService.scala:62) ~[?:?]
        at map @ tech.beshu.ror.accesscontrol.blocks.definitions.ldap.implementations.UnboundidLdapConnectionPool.process(UnboundidLdapConnectionPool.scala:44) ~[?:?]
[2025-12-18T13:00:09,683][ERROR][t.b.r.a.b.d.l.i.UnboundidLdapDefaultGroupSearchAuthorizationServiceWithServerSideGroupsFiltering] [elastic-test-1] [1d450c0c-afb4-4455-926d-8ea507636244-1446298784#159411] LDAP getting user groups returned error: [code=85 (timeout), cause=result code='85 (timeout)' diagnostic message='The asynchronous operation encountered a client-side timeout after waiting 10012 milliseconds for a response to arrive.']
[2025-12-18T13:00:09,689][ERROR][t.b.r.a.b.Block          ] [elastic-test-1] [1d450c0c-afb4-4455-926d-8ea507636244-1446298784#159411] ldap_group: kibana-user: ldap_auth rule matching got an error LDAP returned code: timeout [85], cause: result code='85 (timeout)' diagnostic message='The asynchronous operation encountered a client-side timeout after waiting 10012 milliseconds for a response to arrive.'
tech.beshu.ror.accesscontrol.blocks.definitions.ldap.implementations.LdapUnexpectedResult: LDAP returned code: timeout [85], cause: result code='85 (timeout)' diagnostic message='The asynchronous operation encountered a client-side timeout after waiting 10012 milliseconds for a response to arrive.'
        at tech.beshu.ror.accesscontrol.blocks.definitions.ldap.implementations.LdapUnexpectedResult$.apply(UnboundidLdapUsersService.scala:112) ~[?:?]
        at tech.beshu.ror.accesscontrol.blocks.definitions.ldap.implementations.UnboundidLdapDefaultGroupSearchAuthorizationServiceWithServerSideGroupsFiltering.groupsFrom$$anonfun$2(UnboundidLdapDefaultGroupSearchAuthorizationServiceWithServerSideGroupsFiltering.scala:86) ~[?:?]
        at map @ tech.beshu.ror.accesscontrol.blocks.definitions.ldap.implementations.UnboundidLdapConnectionPool.process(UnboundidLdapConnectionPool.scala:44) ~[?:?]
        at flatMap @ tech.beshu.ror.accesscontrol.blocks.definitions.ldap.implementations.UnboundidLdapConnectionPool.process(UnboundidLdapConnectionPool.scala:45) ~[?:?]
        at flatMap @ tech.beshu.ror.accesscontrol.blocks.definitions.ldap.implementations.UnboundidLdapDefaultGroupSearchAuthorizationServiceWithServerSideGroupsFiltering.groupsFrom(UnboundidLdapDefaultGroupSearchAuthorizationServiceWithServerSideGroupsFiltering.scala:75) ~[?:?]
        at liftF @ tech.beshu.ror.accesscontrol.blocks.Block$Lifter.apply(Block.scala:200) ~[?:?]
        at mapBoth @ tech.beshu.ror.accesscontrol.blocks.Block.execute(Block.scala:67) ~[?:?]
        at runAsync @ tech.beshu.ror.es.IndexLevelActionFilter.handleRequest(IndexLevelActionFilter.scala:205) ~[?:?]
        at parMap2 @ tech.beshu.ror.accesscontrol.blocks.rules.tranport.BaseHostsRule.ipMatchesAddress(BaseHostsRule.scala:61) ~[?:?]
        at map @ tech.beshu.ror.accesscontrol.blocks.rules.tranport.BaseHostsRule.$anonfun$3(BaseHostsRule.scala:64) ~[?:?]
        at map @ tech.beshu.ror.accesscontrol.blocks.rules.tranport.BaseHostsRule.$anonfun$3(BaseHostsRule.scala:66) ~[?:?]
        at map @ tech.beshu.ror.accesscontrol.blocks.rules.tranport.BaseHostsRule.$anonfun$3(BaseHostsRule.scala:66) ~[?:?]
        at parMap2 @ tech.beshu.ror.accesscontrol.blocks.rules.tranport.BaseHostsRule.ipMatchesAddress(BaseHostsRule.scala:61) ~[?:?]
        at map @ tech.beshu.ror.accesscontrol.blocks.rules.tranport.BaseHostsRule.ipMatchesAddress(BaseHostsRule.scala:63) ~[?:?]
        at flatMap @ tech.beshu.ror.accesscontrol.blocks.rules.tranport.BaseHostsRule.ipMatchesAddress(BaseHostsRule.scala:67) ~[?:?]
        at map @ tech.beshu.ror.accesscontrol.blocks.rules.tranport.BaseHostsRule.ipMatchesAddress(BaseHostsRule.scala:68) ~[?:?]
        at runAsync @ tech.beshu.ror.es.IndexLevelActionFilter.handleRequest(IndexLevelActionFilter.scala:205) ~[?:?]
        at foldMap @ tech.beshu.ror.boot.ReadonlyRest.runStartingFailureProgram(ReadonlyRest.scala:107) ~[?:?]

The first and second attempts to log in to Kibana failed; only the third one succeeded.

This is a different case.

This time, the request to LDAP failed due to a request timeout (default 10 seconds).
You should consider increasing the request timeout (see docs) or, even better, adding a cache (see docs).
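For readers following along, these are the knobs being discussed. A minimal sketch of the `ldaps` section of `readonlyrest.yml`; the host, DNs and values are placeholders, and the parameter names should be double-checked against the ROR LDAP connector docs for your version:

```yaml
readonlyrest:
  ldaps:
    - name: ldap1
      host: "ldap.example.com"            # placeholder host
      port: 389
      bind_dn: "cn=admin,dc=example,dc=com"
      bind_password: "***"
      search_user_base_DN: "ou=People,dc=example,dc=com"
      search_groups_base_DN: "ou=Groups,dc=example,dc=com"
      connection_pool_size: 10
      connection_timeout_in_sec: 10
      request_timeout_in_sec: 20          # the client-side timeout seen in the logs above
      cache_ttl_in_sec: 60                # caches successful authentications for 60 s
```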

Is it really? This error occurred with two different backends: AD from Azure and OpenLDAP served on-premises. If it is a matter of timeout/cache values, then the defaults don’t work with the two most popular LDAP solutions.

Changing the request timeout only changes how long you have to wait for the error to occur. With request_timeout_in_sec set to 20 s I get the same behaviour:

[2025-12-22T06:14:26,754][ERROR][t.b.r.a.b.Block          ] [elastic-test-1] [012c0c8d-c3bb-4968-9465-00dfdf5d225e-978310386#4564088] ldap_group: kibana: ldap_auth rule matching got an error LDAP returned code: timeout [85], cause: result code='85 (timeout)' diagnostic message='The asynchronous operation encountered a client-side timeout after waiting 20001 milliseconds for a response to arrive.'

I have tried several values here, with the same result.

Caching responses does not help here either. When users actively use Kibana, the problem does not occur: the connections to LDAP are kept alive by regular queries. The problem occurs only after a long period of inactivity, usually overnight.
After that time the connections to LDAP are closed on the server side, which triggers the error for the first user who tries to use Kibana in the morning. Keeping the cache for that long (besides not being effective for user queries) is not good from a security perspective.

I was referring to the “LDAP returned code: timeout [85]”. It means a client-side timeout.
I understand that you are sure (because of the two separate LDAP servers) that the LDAP server was not so busy that it couldn’t handle the request within the given time.

Ok, I will try to reproduce it on my side. I will get back to you when I find something.

Nevertheless, please confirm that after installing the pre-build I sent, you see a different LDAP error.

I see in your logs that it was “LDAP error [81]” before; now it is “LDAP error [85]”. Is that correct?

I have one more pre-build to check.
I haven’t reproduced the problem yet, but I noticed one thing that could be improved (in the context of the k8s-based test env you showed us).

Please, test it on your side:
ROR 1.68.0-pre13 for 7.17.1

Yes, it is. But I can also find error 85 in the logs of an Elasticsearch running readonlyrest version 1.65.1.

Version pre13 works well so far: I’ve managed to log in to Kibana without errors after the night. I’ll do more tests later today, but it looks promising :slight_smile:

Ok, great. I’ve used a different health check to determine connection degradation. It seems that in environments with a proxy (like the k8s one), the previous health check may not be reliable.
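For anyone curious about the general idea behind such a fix: a proxy can keep the TCP session open while the backend has already dropped it, so a passive liveness check lies. The common remedy is a “test on borrow” pool that actively probes each connection with a cheap application-level request before handing it out. This is only a generic sketch of that pattern (all names are hypothetical), not ROR’s actual implementation:

```python
import time

class PooledConnection:
    """Stand-in for an LDAP connection held in a pool."""
    def __init__(self):
        self.created_at = time.monotonic()
        self.alive = True  # simulates whether the backend still honours this session

    def ping(self) -> bool:
        # Stand-in for a real probe, e.g. a cheap LDAP search against the root DSE.
        return self.alive

class Pool:
    """Minimal connection pool that health-checks connections on borrow."""
    def __init__(self, factory):
        self._factory = factory
        self._idle = []

    def borrow(self):
        while self._idle:
            conn = self._idle.pop()
            if conn.ping():          # active health check before handing it out
                return conn
            # Stale connection: silently discard it and try the next one.
        return self._factory()       # pool exhausted: open a fresh connection

    def give_back(self, conn):
        self._idle.append(conn)

pool = Pool(PooledConnection)
c1 = pool.borrow()
pool.give_back(c1)
c1.alive = False                     # simulate the server silently dropping the session
c2 = pool.borrow()                   # the stale connection is discarded, a fresh one is made
assert c2 is not c1
```

The key point is that the client never surfaces the stale connection to the caller; the first login after a quiet night pays only the cost of one probe plus one reconnect, instead of failing outright.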

More tests confirmed that this solution works, at least for me :wink:. Thank you.

Great! It will be released with ROR 1.68.0 (we are going to do the release later this year).