If simple obfuscation or hashing were enough, it would be fine, but here is what I am facing.
Use case:
multiple documents, multiple indices
a common field type, "lastname"
Kibana and bulk requests must not see the content of "lastname" (GDPR rule), but they need to correlate documents based on "lastname" (for security and tracking purposes)
if the correlation reveals an issue, an authorisation request is made to identify this "lastname".
In my use case, I cannot just hide "lastname"; I need to transform it into a hash so I can still correlate information.
And if I get authorisation to reveal "lastname", I must be able to recover the original value from the hash (a hash cannot be decoded as such, hence the lookup via a restricted index described below).
Here is the approach I have chosen:
in my Logstash filter, I use the prune, fingerprint and clone plugins.
I use fingerprint to hash the value of the field, then I use prune and clone to store:
in the public index: all fields except the original "lastname", which is replaced by "hash_lastname"
in the restricted index, as an update: the original "lastname" and "hash_lastname"
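For reference, a minimal sketch of such a pipeline, using the standard fingerprint, clone and prune plugins. The index names, the key variable, and the routing on [type] (which the clone plugin sets to the clone name in its legacy/default mode) are assumptions to adapt:

```
filter {
  # Keyed hash (HMAC-SHA256): the same input always yields the same digest,
  # so documents remain correlatable, but the digest cannot be brute-forced
  # without the key.
  fingerprint {
    source => "lastname"
    target => "hash_lastname"
    method => "SHA256"
    key    => "${FINGERPRINT_KEY}"   # keep the key out of the pipeline file
  }

  # Duplicate the event; the clone will be routed to the restricted index.
  clone {
    clones => ["restricted"]
  }

  # Public copy: strip the clear-text lastname, keep only the hash.
  if [type] != "restricted" {
    prune {
      blacklist_names => ["^lastname$"]
    }
  }
}

output {
  if [type] == "restricted" {
    elasticsearch { index => "restricted-lastnames" }   # hypothetical name
  } else {
    elasticsearch { index => "public-events" }          # hypothetical name
  }
}
```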
If users read messages from the public index, they do not see the original "lastname" content, only a hash.
Scripts can then correlate multiple searches based on the hash.
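The correlation-without-disclosure property can be illustrated with a short Python sketch; the key, names and documents are made up for illustration:

```python
import hashlib
import hmac

# Hypothetical key; in production this comes from a secret store, not the code.
KEY = b"ingest-secret"

def hash_lastname(lastname: str) -> str:
    """Keyed hash: identical inputs produce identical digests, so documents
    correlate, but the value cannot be reversed without the restricted index."""
    return hmac.new(KEY, lastname.lower().encode("utf-8"), hashlib.sha256).hexdigest()

# Public index holds only the hash; restricted index maps hash -> original.
public_docs = [
    {"hash_lastname": hash_lastname("Durand"), "event": "login"},
    {"hash_lastname": hash_lastname("Durand"), "event": "export"},
]
restricted = {hash_lastname("Durand"): "Durand"}

# Correlation works on the hash alone...
assert public_docs[0]["hash_lastname"] == public_docs[1]["hash_lastname"]
# ...and only a superuser with access to the restricted index recovers the name.
assert restricted[public_docs[0]["hash_lastname"]] == "Durand"
```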
If an authorisation request is raised, a "superuser" gets access to the restricted index and searches for the hash to retrieve the original "lastname".
It is quite complicated and eats resources on the Logstash side.
If it were possible to replace a field value on the fly with a hash in RoR, based on a rule, it would be useful, but I guess performance would be dramatically decreased.
I agree that the best way to do this is to create an extra field at ingest time (for performance reasons), which is exactly what you are doing.
This, in conjunction with the fields rule, would certainly help you, in the sense that you can write two blocks: one for superusers who see all the fields, and one for regular users and scripts that excludes the non-hashed sensitive fields.
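A sketch of what those two ACL blocks could look like, assuming ReadonlyREST's fields (FLS) rule in blacklist mode with the `~` prefix; the credentials, block names and index patterns are placeholders:

```yaml
readonlyrest:
  access_control_rules:

    # Superusers: full access, all fields visible, restricted index included.
    - name: "superusers"
      auth_key: "admin:dev"              # placeholder credentials
      indices: ["public-*", "restricted-*"]

    # Regular users and scripts: public index only; the clear-text
    # lastname is hidden, hash_lastname stays visible for correlation.
    - name: "regular_users"
      auth_key: "user:dev"               # placeholder credentials
      indices: ["public-*"]
      fields: ["~lastname"]              # blacklist mode: hide this field only
```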
So yes, I think you are all set in your use case. Thanks for sharing, finally a very concrete and sensible application of GDPR.