We upgraded ROR today from 1.14.0 to 1.16.10, and we are seeing a significant performance degradation. The node's load increased significantly after the upgrade, and it was apparently no longer able to cope with the incoming bulk requests. After downgrading, things seem to be back to normal.
Does anybody else see this behaviour? Any ideas on how to debug it?
If you have huge bulk requests, there might be one feature killing your performance. I checked the old 1.14.0 sources, and one thing we didn't have back then is sub-request introspection.
The difference between the two versions is that in the newer one the indices rule goes through all the sub-requests and evicts the ones that target non-permitted indices.
The old version would instead just check the main request's indices list (which is a summary of all the indices involved in all the sub-requests) and reason about that alone.
I can prepare a build with a setting that disables this feature, so we can confirm whether this is the reason.
Thanks for the feedback LD. The logging level is already very basic, and we are not using auditing on this cluster. There were no other changes, and we can see the effect clearly by upgrading or downgrading ROR.
Well, I think Simone put his finger on it. @sscarduzio, your idea of a new option to enable/disable the "nested indices" authorization check is a good one, but do you think it would be possible to do something more than an "all or nothing" switch?
I mean a more granular option, based on an index mask?
SubRequest_authorization_check:
  # enabled by default for all requests
  enable: false
  # would not check sub-requests for indices matching these masks;
  # could be "*" for all indices
  indices_mask: ["logstash_sec*", "logstash_hosting*"]
It may cost some work, because a mask could match multiple indices (for example "log*").
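The proposed mask could be applied with ordinary glob matching. Here is a minimal sketch of the decision logic; the config keys mirror the hypothetical proposal above (they are not real ROR settings), and the fnmatch-based matching is an assumption:

```python
from fnmatch import fnmatch

# hypothetical settings mirroring the proposed SubRequest_authorization_check block
config = {
    "enable": True,  # sub-request checking on by default for all requests
    "indices_mask": ["logstash_sec*", "logstash_hosting*"],  # exempt from the check
}

def needs_subrequest_check(index: str) -> bool:
    if not config["enable"]:
        return False  # feature switched off entirely
    # indices matching any mask are exempted from the per-sub-request check
    return not any(fnmatch(index, mask) for mask in config["indices_mask"])

print(needs_subrequest_check("logstash_sec-2018"))   # False: exempted by mask
print(needs_subrequest_check("customer_data-2018"))  # True: still checked
```

Because a mask like "log*" can match many indices, each sub-request's index has to be matched against every mask, which is where the extra work would come from.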
Yep, this sounds reasonable, but first let's see if the "nothing" option brings us back to normality. I sent you a pre-build for 5.2.1 to test this. Let me know how it goes.
I managed to deploy the test version which I got from Simone. There was an issue reading the configuration file which prevented ES from starting up. We're iterating on it. More news ASAP.
Status update: with the latest version (sub-request checking disabled) we still see a load that is about 2-4x higher than what we see with 1.14.0. With plain 1.16.10 it was about a factor of 8-10. That looks better, though a degradation of more than 2x is still quite high. I'll try to get some more statistics.
Great to hear! There have been roughly half a dozen optimisations between 1.16.10 and pre8, and pre9 adds another important one (plus a better implementation of the settings in the readonlyrest.yml file). There's just one more optimisation that I need to benchmark, and then I'll release 1.16.11.
Very nice! I gave it a try, and it does help. We found that sub-request checking is still rather expensive, though.
It would be nice if it were easier to switch on or off via configuration rather than a Java option (unless that again introduces a performance penalty, of course). What do you think?
I agree, the Java option is a really crude way to switch the feature off. But I want to take a step back: the sub-request scanning feature originated when I was trying to make the index_rewrite rule work: we needed sub-request granularity to change every index field in every sub-request.
Of course, that turned out to be a total failure due to the wide and ever-changing set of ActionRequest types within ES, but also, clearly, because of performance.
As of today, the index_rewrite rule has been removed from the documentation, but sub-request checking is still on by default. I would in fact be very keen to disable this feature.
Some background about sub-request checking
Sub-request checking tries to address the scenario where you have a bulk request containing a set of requests that apply to different indices.
Without sub-request checking, the indices rule evaluates the bulk request as a whole: the global set of indices is evaluated, and if even one is prohibited, the whole request is rejected.
With sub-request checking, if the bulk request contains 10 sub-requests, one of which is trying to index/update/delete a forbidden index, only that sub-request is discarded and the other nine remain. The bulk request can then be allowed by ROR.
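The difference between the two behaviours can be sketched roughly as follows. This is a simplified illustration, not ROR's actual code; the permitted-index patterns and the fnmatch-based matching are assumptions:

```python
from fnmatch import fnmatch

ALLOWED = ["logstash_sec*"]  # hypothetical patterns permitted by the indices rule

def permitted(index: str) -> bool:
    return any(fnmatch(index, pat) for pat in ALLOWED)

def check_whole_request(bulk_indices):
    # old (1.14.0-style) behaviour: one forbidden index rejects the whole bulk
    return all(permitted(i) for i in bulk_indices)

def filter_subrequests(subrequests):
    # newer behaviour: evict only the sub-requests targeting forbidden indices
    return [s for s in subrequests if permitted(s["index"])]

bulk = [{"index": "logstash_sec-2018"}, {"index": "secret-index"}]
print(check_whole_request([s["index"] for s in bulk]))  # False: whole bulk rejected
print(filter_subrequests(bulk))  # only the logstash_sec-2018 sub-request survives
```

The per-sub-request scan is what costs extra: instead of one check over the summarized index list, every sub-request in a large bulk has to be inspected individually.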
I will remove this feature from the codebase in 1.16.12 if no major objection arises. For now feel free to disable this behaviour using this java option:
Existing indices:
- log_de_xxxx
- log_en_xxx
- log_lu_ei_xxx

The ROR indices rule allows log_lu_ei*, and in Kibana I use the pattern log*.
Since Kibana queries ES, it will try to display all three indices, but since the logged-in user can only see log_lu_ei_xxx, today Kibana displays only the results obtained from log_lu_ei_xxx.
Is this the work of sub-request checking?
I mean, if this is the functionality of sub-request checking and it gets disabled or removed, will the user get no data displayed in Kibana?
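If I read the scenario right, the observed behaviour comes down to intersecting the indices matched by the Kibana pattern with the indices permitted by the rule. A rough illustration (index names taken from the post above; the fnmatch-based matching is an assumption, not ROR's actual resolution logic):

```python
from fnmatch import fnmatch

existing = ["log_de_xxxx", "log_en_xxx", "log_lu_ei_xxx"]
kibana_pattern = "log*"          # what Kibana asks for
allowed_pattern = "log_lu_ei*"   # what the ROR indices rule permits

# indices the Kibana query expands to
requested = [i for i in existing if fnmatch(i, kibana_pattern)]
# of those, the indices the user is actually allowed to see
visible = [i for i in requested if fnmatch(i, allowed_pattern)]
print(visible)  # ['log_lu_ei_xxx'] -> only this index's results reach Kibana
```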