RoR 1.45.0 with ES 7.17.7 is hitting FORBID rule when the indices should not match

Just upgraded our production stack for the first time in over a year. It’s now running ES 7.17.7, and I downloaded the latest ES plug-in, so that is 1.45.0. Most of our end users are immediately hitting a “FORBID” rule which was intended to keep them from snooping on some sensitive logs in indices named with a specific wildcard pattern. This combination of rules used to work fine in the cluster pre-upgrade, which was RoR 1.18.0 on ES 6.7.1.

I can send the full text of our ACLs privately, but basically this is a set of 2 consecutive rules, like this:

(…couple of simple rules…)

  1. If the index names match “sensitivelogs-*”, and the user is in LDAP Active Directory group CanGetToSensitiveLogs, and the action matches “indices:data/read/*”, then ALLOW them to get the data.

  2. OK, we fell through that rule, so… if the index names match that same pattern, and the action matches that same pattern, then I don’t even care what groups they may or may not be a member of… FORBID this access.

(…and then continue with other “allow” rules that no longer care about the indices name pattern)

In the Elasticsearch logs, I can see it matching on that second rule, and the detailed ACL evaluation HIStory shows “actions->true, indices->true”. So it seems as if the rule is triggering, forbidding them access, except that the dashboards that my end users are actually looking at have nothing to do with the “sensitivelogs-*” indices; they are querying completely unrelated index names. I didn’t notice this symptom after the upgrade, as I was sanity-checking the dashboard rendering, because my username happens to be in that CanGetToSensitiveLogs group, so I didn’t experience any rejection. But most of the rest of the company are not in that group, so they get Forbidden errors and missing data in their Kibana UI.

Does this sound familiar? I know that some months ago you fixed a problem for us with a user being a member of multiple AD groups, which had broken when you implemented something for RoR Enterprise (multi-tenant); you were retrieving the list of LDAP groups but then selecting only one to remember for ACL purposes. You fixed that behavior for RoR PRO non-Enterprise customers like us, back in September 2021. But this seems like a different problem: the ACL rule that is triggering does not involve a match on group membership, only on indices and actions. I do not want it to trigger in this case (when the index name does not match), but it does nonetheless.

This might have something to do with the new Kibana / ES 7.x behavior of using “/_async_search” to go collect the results of each query on the screen in a dashboard, but that shouldn’t matter… the ES query should be either asking for “sensitivelogs-*” or not doing so, unambiguously. In the ES logging line, it says “IDX:<N/A>”, which is interesting; is that where it’s supposed to be logging a specific index name?

Let me know what logs you need or how I can help you recreate the problem. Thanks!

– Jeff Saxe

Hi @JeffSaxe

Let’s start with two basic things:

  1. ES log (the FORBIDDEN one)
  2. your current ROR settings (e.g. readonlyrest.yaml)

Could you please provide them?

Sorry for the delay, Mateusz. My apologies, I guess I found and worked around this issue months ago when I did a test upgrade from 6.x to 7.x on a cloned testing cluster. I found I had to add one small 3-line ACL before my allow-then-forbid pair, and it worked around the problem. But then when I upgraded my production cluster this past weekend, I ran into the problem again and forgot what I had done before! “Me from today” curses “me from months ago” for failing to do his documentation.

Anyway, I will excerpt my ACL rules and logs while obscuring some internal names. The issue has to do with the new (as of ES and Kibana 7) behavior of using the API “_async_search” to help make very-long-running dashboards more responsive. My allow-then-forbid pair looked like this:

- name: "Permit specific group to sensitivelogs"
  ldap_auth:
    name: "ourQIMdomain"
    groups: [ "CanGetToSensitiveLogs" ]
  indices: ["sensitivelogs-*"]
  actions: ["indices:data/read/*"]
  type: allow
  kibana_hide_apps: ["readonlyrest_kbn"]

- name: "Deny everyone else to sensitivelogs"
  indices: ["sensitivelogs-*"]
  actions: ["indices:data/read/*"]
  type: forbid

So the first one intends to confirm that you’re in the right AD group, that you’re asking to perform some kind of “indices:data/read” operation, and that the index name or names you’re asking for are the sensitive ones. If so, great, you’re in. The second one repeats the same matching conditions, except that by then we already know you’re not in the correct AD group, so it forbids the request immediately. But of course, if you’re not even asking for the sensitivelogs indices, then this rule shouldn’t match either, and evaluation should fall through to the next rule in the list. However, right after the 7.x upgrade, my users got 403 Forbidden replies when trying to view indices that they were perfectly well entitled to:

[2022-12-04T19:12:38,427][INFO ][t.b.r.a.l.AccessControlLoggingDecorator] [qim-elastic1-d4] FORBIDDEN by { name: ‘Deny everyone else to sensitivelogs’, policy: FORBID, rules: [actions,indices] req={ ID:1806295291-487367954#2843606, TYP:GetAsyncResultRequest, CGR:<N/A>, USR:jeff.saxe (attempted), BRS:true, KDX:null, ACT:indices:data/read/async_search/get, OA:192.168.48.230/32, XFF:qim-elastic1-kibana.qim.com:5601, DA:192.168.48.230/32, IDX:<N/A>, MET:GET, PTH:/_async_search/FkQ4U1A4NkdBVGRTN01SaWJvNzAyZkEebEtSSHkxUGVRUk9pSzE0Z0JhWFNpUToyODQzMzE2, CNT:<N/A>, HDR:Accept-Charset=utf-8, Authorization=, Host=qim-elastic1-d4:9200, connection=close, content-length=0, user-agent=elasticsearch-js/7.14.0-canary.7 (linux 4.15.0-197-generic-x64; Node.js v14.17.5), x-elastic-client-meta=es=7.14.0p,js=14.17.5,t=7.14.0p,hc=14.17.5, x-elastic-product-origin=kibana, x-forwarded-for:5601=qim-elastic1-kibana.qim.com, x-opaque-id=3efe87d6-51e3-45cb-ada4-fd0b80ec96d0, x-ror-kibana-request-method=post, x-ror-kibana-request-path=/internal/bsearch, HIS:[Kibana Admin super-user for human to edit Kibana security-> RULES:[auth_key_sha256->false]], [Permit specific group to sensitivelogs-> RULES:[ldap_auth->false]], [Deny everyone else to sensitivelogs-> RULES:[kibana_hide_apps->true, actions->true, indices->true]], }

[2022-12-04T19:12:38,528][INFO ][t.b.r.a.l.AccessControlLoggingDecorator] [qim-elastic1-d4] FORBIDDEN by { name: ‘Deny everyone else to sensitivelogs’, policy: FORBID, rules: [actions,indices] req={ ID:1265128958–1012665066#2843636, TYP:DeleteAsyncResultRequest, CGR:<N/A>, USR:jeff.saxe (attempted), BRS:true, KDX:null, ACT:indices:data/read/async_search/delete, OA:192.168.48.230/32, XFF:qim-elastic1-kibana.qim.com:5601, DA:192.168.48.230/32, IDX:<N/A>, MET:DELETE, PTH:/_async_search/FkQ4U1A4NkdBVGRTN01SaWJvNzAyZkEebEtSSHkxUGVRUk9pSzE0Z0JhWFNpUToyODQzMzE2, CNT:<N/A>, HDR:Accept-Charset=utf-8, Authorization=, Host=qim-elastic1-d4:9200, connection=close, content-length=0, user-agent=elasticsearch-js/7.14.0-canary.7 (linux 4.15.0-197-generic-x64; Node.js v14.17.5), x-elastic-client-meta=es=7.14.0p,js=14.17.5,t=7.14.0p,hc=14.17.5, x-elastic-product-origin=kibana, x-forwarded-for:5601=qim-elastic1-kibana.qim.com, x-opaque-id=01c11340-3760-4e82-b554-efd451cf164e, x-ror-kibana-request-method=delete, x-ror-kibana-request-path=/internal/search/ese/FkQ4U1A4NkdBVGRTN01SaWJvNzAyZkEebEtSSHkxUGVRUk9pSzE0Z0JhWFNpUToyODQzMzE2, HIS:[Kibana Admin super-user for human to edit Kibana security-> RULES:[auth_key_sha256->false]], [Permit specific group to sensitivelogs-> RULES:[ldap_auth->false]], [Deny everyone else to sensitivelogs-> RULES:[kibana_hide_apps->true, actions->true, indices->true]], }

These are the new kind of query, an “_async_search”, the first one a GET and the second a DELETE. These appear to be used by the Kibana dashboards to check the status of an async query (with some unique ID in the URL) and then to get rid of that ID after it’s no longer needed. Unfortunately, the structure of this incoming URL doesn’t allow ReadOnlyREST to know what the index name or names are that the previously-submitted async search involved, so it mentally sets “IDX” to a “not available” value. Then when it’s evaluating whether this entire ACL rule should match (and take the allow/forbid action) or fall through, the fact that the IDX is not available makes the “indices” part of the ACL go ahead and match, although I would prefer it not to. Result: the user gets denied.

So, since these “get status or delete on a previously-submitted async” queries come in constantly, and they only make sense if the end user already has a GUID that corresponds to a previous search that was started and was not forbidden, I mentally calculated that it was not a security risk to allow all of these. So I added the following new rule, just above the pair I had:

- name: "huh, weird, need to permit certain stuff on async_search"
  actions: [ "indices:data/read/async_search/delete", "indices:data/read/async_search/get" ]
  type: allow

After this change, the problem is fixed. Both of these specific kinds of async search request, get and delete, are now always allowed (whether they originated with a sensitive log or not) for all users, regardless of group membership. And if a user tries to actually query the sensitive logs without being in the AD group, the FORBID message is now correct:
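To make the before-and-after behaviour concrete, here is a toy model in Python. It is not ReadOnlyREST's implementation, only a sketch under assumptions taken from this thread: blocks are evaluated top to bottom, the first block whose rules all match decides the outcome, and the indices rule matches when the request carries no index information (IDX:<N/A>). The ldap_auth rule is reduced to a plain group check, kibana_hide_apps is left out, `fnmatchcase` stands in for RoR's pattern matching, and the `weblogs-*` index and the final catch-all block are placeholders for the unrelated rules further down the list.

```python
from fnmatch import fnmatchcase

def actions_rule(patterns, request):
    return any(fnmatchcase(request["action"], p) for p in patterns)

def indices_rule(patterns, request):
    requested = request["indices"]
    if not requested:
        # Assumption from this thread: no index context means the rule matches.
        return True
    return any(fnmatchcase(i, p) for i in requested for p in patterns)

def groups_rule(groups, request):
    # Simplified stand-in for ldap_auth: is the user in any listed group?
    return bool(set(groups) & set(request["groups"]))

RULES = {"actions": actions_rule, "indices": indices_rule, "groups": groups_rule}

def evaluate(acl, request):
    """Return (policy, block name) of the first block whose rules all match."""
    for block in acl:
        if all(RULES[name](arg, request) for name, arg in block["rules"].items()):
            return block["type"], block["name"]
    return "forbid", "<no block matched>"

PERMIT = {"name": "Permit specific group to sensitivelogs", "type": "allow",
          "rules": {"groups": ["CanGetToSensitiveLogs"],
                    "indices": ["sensitivelogs-*"],
                    "actions": ["indices:data/read/*"]}}
DENY = {"name": "Deny everyone else to sensitivelogs", "type": "forbid",
        "rules": {"indices": ["sensitivelogs-*"],
                  "actions": ["indices:data/read/*"]}}
ASYNC = {"name": "huh, weird, need to permit certain stuff on async_search",
         "type": "allow",
         "rules": {"actions": ["indices:data/read/async_search/delete",
                               "indices:data/read/async_search/get"]}}
# Placeholder for the later, unrelated allow rules.
CATCH_ALL = {"name": "other allow rules (placeholder)", "type": "allow",
             "rules": {"actions": ["indices:data/read/*"]}}

# Requests from a user who is NOT in the AD group.
get_status = {"action": "indices:data/read/async_search/get",
              "indices": [], "groups": []}
submit_sensitive = {"action": "indices:data/read/async_search/submit",
                    "indices": ["sensitivelogs-paloaltonew-*"], "groups": []}
submit_other = {"action": "indices:data/read/async_search/submit",
                "indices": ["weblogs-*"], "groups": []}  # hypothetical index

before = [PERMIT, DENY, CATCH_ALL]
after = [ASYNC, PERMIT, DENY, CATCH_ALL]

print("before, get status:     ", evaluate(before, get_status))
print("after, get status:      ", evaluate(after, get_status))
print("after, submit sensitive:", evaluate(after, submit_sensitive))
print("after, submit other:    ", evaluate(after, submit_other))
```

In this model the status poll is forbidden by the Deny block before the workaround and allowed by the new block after it, while a submit against the sensitive indices is still forbidden and a submit against an unrelated index falls through to the later allow rules.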

[2022-12-07T12:15:48,970][INFO ][t.b.r.a.l.AccessControlLoggingDecorator] [qim-elastic1-d4] FORBIDDEN by { name: ‘Deny everyone else to sensitivelogs’, policy: FORBID, rules: [actions,indices] req={ ID:974521999–1333034261#39126179, TYP:SubmitAsyncSearchRequest, CGR:<N/A>, USR:jeff.saxe (attempted), BRS:true, KDX:null, ACT:indices:data/read/async_search/submit, OA:192.168.48.231/32, XFF:qim-elastic1-kibana.qim.com:5601, DA:192.168.48.230/32, IDX:sensitivelogs-paloaltonew-*, MET:POST, PTH:/sensitivelogs-paloaltonew-*/_async_search, CNT:<OMITTED, LENGTH=979.0 B> , HDR:Accept-Charset=utf-8, Authorization=, Host=qim-elastic1-d4:9200, connection=close, content-length=979, content-type=application/json, user-agent=elasticsearch-js/7.14.0-canary.7 (linux 4.15.0-197-generic-x64; Node.js v14.17.5), x-elastic-client-meta=es=7.14.0p,js=14.17.5,t=7.14.0p,hc=14.17.5, x-elastic-product-origin=kibana, x-forwarded-for:5601=qim-elastic1-kibana.qim.com, x-opaque-id=12369058-0922-4827-ae83-96714541fdb9, x-ror-kibana-request-method=post, x-ror-kibana-request-path=/internal/bsearch, HIS:[Kibana Admin super-user for human to edit Kibana security-> RULES:[auth_key_sha256->false] RESOLVED:[indices=sensitivelogs-paloaltonew-*]], [huh, weird, need to permit certain stuff on async_search-> RULES:[actions->false] RESOLVED:[indices=sensitivelogs-paloaltonew-*]], [Permit specific group to sensitivelogs-> RULES:[ldap_auth->false] RESOLVED:[indices=sensitivelogs-paloaltonew-*]], [Deny everyone else to sensitivelogs-> RULES:[actions->true, indices->true] RESOLVED:[indices=sensitivelogs-paloaltonew-2022.02,sensitivelogs-paloaltonew-2022.10,sensitivelogs-paloaltonew-2022.05,sensitivelogs-paloaltonew-2022.12,sensitivelogs-paloaltonew-2022.01,sensitivelogs-paloaltonew-2021.12,sensitivelogs-paloaltonew-2022.06,sensitivelogs-paloaltonew-2022.08,sensitivelogs-paloaltonew-2022.07,sensitivelogs-paloaltonew-2022.11,sensitivelogs-paloaltonew-2022.04,sensitivelogs-paloaltonew-2022.03,sensitivelogs-paloaltonew-2022.09]], }

The action is now an “async_search/submit”, and the IDX has the correct names (both the wildcard as taken from the URL, and the RESOLVED as expanded to the individual names), and now all the conditions of the ACL can be evaluated for matching, and it correctly determines that I should be forbidden. If I put myself in the “Can Get to” AD group, then my original first ACL also still matches fine, and I’m allowed to see them.

So all in all, I don’t think this is really a problem with ReadOnlyREST; you could call it user error and growing pains with 7.x. Some possible outcomes:

  1. You could change the RoR ACL matching logic so that when a rule wants to match on “indices” but the request’s index name is not available, the rule doesn’t match rather than matching. I don’t know, though… you might have already thought about this situation and found that it would be safer, and confuse fewer customers, to leave the logic the way it is right now! Certainly if you make this change, you’d want to explain the reasoning in your release notes.
  2. I could change my Deny ACL so that, instead of matching a wildcard of all actions beginning “indices:data/read”, it drops the trailing asterisk and spells out all the ones I want to deny. However, that’s more troublesome, and it runs the risk that some new variety of data/read action that I didn’t think of gets introduced in a future ES version and slips through my deny net. So it’s safer for me to keep denying “data/read/*”.
  3. You could add some kind of warning about this issue in the Kibana docs or example ACL rules. Or we could just rely on your other customers Google searching and stumbling across this forum post, and solving it themselves the same way I did. At least it’s not causing me any further pain.
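The trade-off in option 2 can be illustrated with a few lines of Python. This is only a sketch: `fnmatchcase` is a stand-in for however RoR matches action patterns, the enumerated list is an arbitrary example, and `indices:data/read/some_future_api` is a made-up action name representing something a later ES version might introduce.

```python
from fnmatch import fnmatchcase

WILDCARD_DENY = ["indices:data/read/*"]

# Example of an explicit list that leaves out the async get/delete actions.
ENUMERATED_DENY = [
    "indices:data/read/search",
    "indices:data/read/async_search/submit",
]

def denied(action, patterns):
    """True if the action matches any pattern in the deny list."""
    return any(fnmatchcase(action, p) for p in patterns)

future_action = "indices:data/read/some_future_api"  # hypothetical name

for action in ["indices:data/read/async_search/get", future_action]:
    print(action,
          "| wildcard:", denied(action, WILDCARD_DENY),
          "| enumerated:", denied(action, ENUMERATED_DENY))
```

The enumerated list stops catching the async status polls, which is the goal, but it equally fails to catch the made-up future action, whereas the wildcard catches both.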

– JeffS

1 Like

Hi Jeff, sorry for the long delay! Your issue is now understood, and it needs some work to reproduce and fix. Will bump it up through the backlog!

@JeffSaxe I’ve been analyzing the above-mentioned problem, but I’m not sure I can reproduce your configuration. To see if we are on the same page, I’ve created a PR in our ror-sandbox project: [RORDEV-788] reproduction by coutoPL · Pull Request #8 · beshu-tech/ror-sandbox · GitHub

It’s a Docker-based solution that allows us to quickly set up a simple cluster with one node of ES with ROR and one node of Kibana with ROR (with specified versions). We can try to reproduce your case in an isolated environment, with the configuration reduced to a minimum and Kibana sample data. Please take a look (instructions are in the PR’s description).


My generic view on the problem:

Yes, we have thought about this. The async search API is pretty similar to the Scroll API. It can be described like this:

  1. run an action and get back a handle that acts, e.g., like an SQL cursor
  2. use that handle to get the next result

In the case of the Async Search API, these are submit and get; in the case of the Scroll API, they are _search (with the scroll flag) and _search/scroll (to get the next page of results).

In 1. we have an explicit index context (index name in request path or body).
In 2. there is no explicit index context (we know that getting the next batch of data is related to the index, but we only have an ID, not the index name itself).
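The difference between the two steps can be sketched with a small path check in Python. This is an illustration only, not ROR's request parsing: it looks at the URL path alone (the request body, which can also name indices, is ignored), it relies on the fact that Elasticsearch index names cannot start with an underscore, and `SOME_ID` is a placeholder for the real async search ID.

```python
def explicit_indices(path):
    """Return the index expressions named in a request path, or [] if none."""
    segments = [s for s in path.split("/") if s]
    # API endpoints start with "_"; index names are not allowed to.
    if not segments or segments[0].startswith("_"):
        return []
    return segments[0].split(",")

# (step 1, step 2) request paths for each API
FLOWS = {
    "async search": ("/sensitivelogs-*/_async_search", "/_async_search/SOME_ID"),
    "scroll": ("/sensitivelogs-*/_search?scroll=1m", "/_search/scroll"),
}

for api, (step1, step2) in FLOWS.items():
    print(api, "step 1:", explicit_indices(step1))
    print(api, "step 2:", explicit_indices(step2))
```

For both APIs, step 1 yields the index pattern and step 2 yields an empty list, which corresponds to the IDX:<N/A> seen in the logs.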

And now we can think about how the indices rule behaves. By default, the rule matches when a request doesn’t involve indices. And the async_search/get and async_search/delete requests are treated by ROR as requests that are not related to indices (they lack the explicit index context mentioned above).


Let’s go back to your case. Based on the above, it means that:

  • GET /_async_search/{ID}
  • DELETE /_async_search/{ID}

should be matched by:

- name: "Permit specific group to sensitivelogs"
  ldap_auth:
    name: "ourQIMdomain"
    groups: [ "CanGetToSensitiveLogs" ]
  indices: ["sensitivelogs-*"]
  actions: ["indices:data/read/*"]
  type: allow
  kibana_hide_apps: ["readonlyrest_kbn"]

and this is the main thing I don’t understand in your report because it looks like you claim it is not matched.

My ror-sandbox configuration shows that it works as I described.
But maybe I missed something. Could you please take a look?

Thanks in advance