elasticsearch - Faceting on part of a string
Let's say I've got documents in an index. One of the fields is a url, like...
{"url": "server1/some/path/a.doc"},
{"url": "server1/some/otherpath/b.doc"},
{"url": "server1/some/c.doc"},
{"url": "server2/a.doc"},
{"url": "server2/some/path/b.doc"}
I'm trying to extract counts of the paths from the search results, presumably with one query per branch.
e.g.:
initial query: server1: 3, server2: 2
server1 query: some: 3
server1/some query: path: 1, otherpath: 1
Now, I can broadly see two ways to approach this, and I'm not a great fan of either.
Option 1: scripting. MVEL seems limited to mathematical operations (at least I can't find a string split in the docs), so this would have to be done in Java. That's possible, but it feels like a lot of overhead if there are a lot of records.
Option 2: store the path parts alongside the document...
{"url": ..., "parts": ["1|server1","2|some","3|path"]},
{"url": ..., "parts": ["1|server1","2|some","3|otherpath"]},
{"url": ..., "parts": ["1|server1","2|some"]},
{"url": ..., "parts": ["1|server2"]},
{"url": ..., "parts": ["1|server2","2|some","3|path"]}
This way I could do something like: for urls starting with 'server1/some', facet on the parts starting with 3|. This feels horribly hackish.
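To make option 2 concrete, here is a minimal Python sketch of the pre-processing step, plus a purely local stand-in for the "filter by url prefix, then facet on parts one level deeper" query. The helper names `path_parts` and `facet_children` are made up for illustration, and the sketch assumes the last segment of each url is a file name that should not become a part (as in the example documents above). It does not talk to Elasticsearch; it only shows which counts the facet would be expected to return.

```python
from collections import Counter


def path_parts(url):
    """Return depth-tagged directory segments, e.g. ["1|server1", "2|some"]."""
    # Assumption: the final segment is a file name and is not indexed as a part.
    segments = url.split("/")[:-1]
    return [f"{depth}|{name}" for depth, name in enumerate(segments, start=1)]


def facet_children(urls, prefix):
    """Local stand-in for: restrict to urls under `prefix`, facet on the next depth."""
    depth = len(prefix.split("/")) + 1 if prefix else 1
    tag = f"{depth}|"
    counts = Counter()
    for url in urls:
        if prefix and not url.startswith(prefix + "/"):
            continue  # document would not match the prefix query
        for part in path_parts(url):
            if part.startswith(tag):
                counts[part[len(tag):]] += 1
    return dict(counts)


urls = [
    "server1/some/path/a.doc",
    "server1/some/otherpath/b.doc",
    "server1/some/c.doc",
    "server2/a.doc",
    "server2/some/path/b.doc",
]

print(facet_children(urls, ""))              # top-level servers
print(facet_children(urls, "server1"))       # children of server1
print(facet_children(urls, "server1/some"))  # children of server1/some
```

In a real setup the counting would of course be left to ES; the point of the sketch is only the shape of the `parts` values and the depth tag the client has to compute.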
What's a good way to do this? I can do as much pre-processing as required, but I need the counts to come from ES, as it's the count of results for the query that's important.
Given a doc with url /a/b/c, have a multivalued field url and input (using preprocessing) the values: /a, /a/b, /a/b/c
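The preprocessing for that multivalued field could look roughly like the sketch below. `path_prefixes` and the field name `url_paths` are hypothetical names chosen for the example; the only thing taken from the answer is that /a/b/c expands to /a, /a/b and /a/b/c.

```python
def path_prefixes(url):
    """Expand a path into all of its leading prefixes, shortest first."""
    leading = "/" if url.startswith("/") else ""
    segments = [s for s in url.split("/") if s]
    return [leading + "/".join(segments[:i]) for i in range(1, len(segments) + 1)]


def to_document(url):
    # "url_paths" is an illustrative field name, not one defined in the question.
    return {"url": url, "url_paths": path_prefixes(url)}


print(path_prefixes("/a/b/c"))
print(to_document("server1/some/c.doc"))
```

Faceting on such a field returns every prefix at every depth, so the client would still have to pick out the prefixes it cares about.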
edit
When you want to constrain this to showing counts for paths of a certain depth, you could design multiple multivalued fields as described above. Each field would represent a particular depth.
The ES client should contain the logic to decide which depth (and thus which field) to query for facets.
It still feels like a hack though, and indeed, without control of the data you could end up with lots of fields like this.
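A rough sketch of that client-side logic, under some assumptions: the per-depth fields are named `path_depth_1`, `path_depth_2`, ... (hypothetical names), they are indexed without analysis so a term query matches the whole prefix, and the file name is dropped as in the question's example. The request body uses the `facets` key with a terms facet; on versions where facets have been replaced by aggregations, the same shape would go under `aggs` instead.

```python
def depth_fields(url, field_prefix="path_depth_"):
    """Build one field per depth, each holding the path prefix at that depth."""
    # Assumption: the last segment is a file name and gets no field of its own.
    segments = url.split("/")[:-1]
    return {
        f"{field_prefix}{i}": "/".join(segments[:i])
        for i in range(1, len(segments) + 1)
    }


def children_request(parent, field_prefix="path_depth_"):
    """Build a search body that counts the direct children of `parent`."""
    depth = len(parent.split("/")) if parent else 0
    if parent:
        # Restrict to documents whose prefix at this depth equals the parent.
        query = {"term": {f"{field_prefix}{depth}": parent}}
    else:
        query = {"match_all": {}}
    return {
        "size": 0,  # only the counts are needed, not the hits
        "query": query,
        "facets": {
            "children": {"terms": {"field": f"{field_prefix}{depth + 1}"}}
        },
    }


print(depth_fields("server1/some/path/a.doc"))
print(children_request("server1/some"))
```

The facet terms come back as full prefixes (for example server1/some/path), so the client would strip the parent part before displaying them.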