Performance degradation after upgrading

Hi, all,

we’ve upgraded ROR today from 1.14.0 to 1.16.10, and we see a significant degradation of performance. The load of the node increased significantly since the upgrade, and apparently was no longer able to cope with the incoming bulk requests. After downgrading again things go back to normal it seems.
Does anybody else see such a behavior ? Any idea how to debug this ?

Thanks in advance!

Hi,

interesting feedback.

did you just upgraded RoR, or something else in your stack ?
did you change something in your RoR config section?

also, may it be related to audit and logging activity (there is a huge difference between 1.14.x and 1.16.x

you may try to set log as info or basic only (refer to doc, I think it chaged a bit) and give a try.
LD

Hello @schwicke!

If you have huge bulk requests, there might be one feature killing your performance. I checked the old 1.14.0 sources and a thing we didn’t have back then is the sub-request introspection.

The difference between the two versions is that in the newer one the indices rule goes through all the sub-requests and evicts the ones that are targeting non-permitted indices.
In the old version instead, it would just check the main request indices list and reason about that alone (which is a summary of all the indices involved in all the sub-requests).

I can prepare a build with a setting that disables this feature so we can confirm or not if this is the reason.

Hi, Simone,
worth a try indeed. What you say sound quite reasonable. That would be great! Thanks in advance!

Thanks for the feedback LD. The logging level is already very basic, and we are not using auditing on this cluster. There were no other changed, and we can see the effect nicely by up or downgrading ROR.

Well, I think Simone put the finger on the stuff.
@sscarduzio , your idea about a new option to be able to enable/disable “nested indices” authorization check is a good idea, but do you think it would be possible to do something more than an “all or nothing” way ?

I mean a more granular option, based on indice mask ?

 SubRequest_authorization_check :
    #enabled by default for all request
    enable : false    
    indices_mask : ["logstash_sec*", "logstash_hosting*"]   # would not check subrequest for these matching indices
    # could be "*" for all indices

It may cost some work, because a mask could match multiple indices ( example “log*” )

yep this sounds reasonable, but first let’s see if the “nothing” option brings us back to normality. I sent you a pre build for 5.2.1 to test this. Let me know how it goes.

1 Like

Working on it. Struggling with the expected location of config/readonlyrest.yml
More news asap.

I think I got it working (thanks to Simone!) after all. We need to get some stats now to conclude.

@schwicke , you mean , it works with the special release you got from Simone, or an optimisation you made on your side with the 1.16.10 ?

Fred

I managed to deploy the test version which I got from Simone. There was an issue reading the configuration file which prevented ES to start up. We’re iterating on it. More news asap.

1 Like

Status update: With the latest version (with subrequest checking disabled) we still see a load which is about 2-4x higher than what we see for 1.14.0. With plain 1.1.6.10 it was about a factor 8-10. Looks better though a factor >2 in degradation is still quite high. I’ll try to get some more statistics.

Status update: the latest pre8 release with subrequest checking disabled looks much better. Good job, thanks a lot, Simone!

1 Like

Great to hear! There’s been more or less half a dozen optimisations between 1.16.10 and pre8. But pre9 adds another important optimisation (plus a better implementation of the settings in readonlyrest.yml file). There’s just one more optimisation that I need to benchmark, and then I release 116.11.

So guys, we have a new wildcard matcher that does not use regex, its’ just a tad faster than the previous one… :joy:

old: 6207 ms
new: 665 ms
Gain: 89.28629%

Over 10e7 iterations.

This is massive because it’s called N times for each request. With N=#of strings in indices/actions rule.

Very nice! I gave it a try, it indeed helps. We found that subrequest checking is still rather expensive though.

It would be nice if it was easier to switch it on or off by configuration rather than a Java option (unless that introduces again a performance penalty of cause). What do you think ?

Hey @schwicke! Glad we got to the bottom of this :slight_smile:

And thanks for all the telemetry you were able to circle back to me in the process. Would have been really hard to spot these bottlenecks otherwise.

I agree the java option is really crude way to switch the feature off. But I want to take a step back: the sub-request scanning feature originated when I was trying to make the index_rewrite rule work: we needed sub-request granularity to change every index field in every sub-request.

Of course, that turned out being a total failure due to the wide and ever changing set of ActionRequest types within ES, but also - clearly - for performance.

As of today, index_rewrite rule has been removed from documentation, but sub-request checking is still on by default. And I would be in fact very keen to disable this feature.

Some background about sub-request checking

The subrequest checking is trying to address the scenario when you have a bulk request that contains a set or requests that apply to different indices.

Without the subrequest checking, the indices rule evaluates the bulk request as a whole. So the global set of indices is evaluated and if one is prohibited, the whole request is dropped.

With subrequest checking If the bulk request contains 10 subrequests, one of which is trying to index/update/delete the wrong index, only that subrequest is discarded and the other nine will remain. Now the bulk request can be allowed by ROR.

Anybody against disabling this by default?

  • DISABLED BY DEFAULT
  • LEAVE IT ON BY DEFAULT
  • DELETE FEATURE

0 voters

I will remove this feature from the codebase in 1.16.12 if no major objection arises. For now feel free to disable this behaviour using this java option:

-Dcom.readonlyrest.subreq.introspection=false

mmh @sscarduzio,

maybe I m wrong, but , would it be related if :

-existing indices :
log_de_xxxx
log_en_xxx
log_lu_ei_xxx

  • Ror rule indices allowing : log_lu_ei*

and in kibana I use a pattern log*
since kibana will query ES , it will try to display both 3 indices
and since the logged user can see only log_lu_ei_xxx,
Today kibana react as displaying only results gained from log_lu_ei_xxx .

is it the activity of sub-request checking ?

I mean, if it is the functionality of subrequest checking, and if it is disabled or deleted, the user will get no data display in kibana ?

fred