elasticsearch - Faceting on part of a string


Let's say I've got documents in an index. One of the fields is a url, like...

{"url": "server1/some/path/a.doc"},
{"url": "server1/some/otherpath/b.doc"},
{"url": "server1/some/c.doc"},
{"url": "server2/a.doc"},
{"url": "server2/some/path/b.doc"}

I'm trying to extract counts of paths from the search results, presumably with one query per branch.

E.g.:

initial query:
    server1: 3
    server2: 2
server1 query:
    some: 3
server1/some query:
    path: 1
    otherpath: 1

Now, I can broadly see two ways to approach this, and I'm not a great fan of either.

Option 1: scripting. MVEL seems limited to mathematical operations (at least I can't find a string split in the docs), so this would have to be done in Java. That's possible, but it feels like a lot of overhead if there are a lot of records.

Option 2: store the path parts alongside the document...

{"url": ..., "parts": ["1|server1","2|some","3|path"]},
{"url": ..., "parts": ["1|server1","2|some","3|otherpath"]},
{"url": ..., "parts": ["1|server1","2|some"]},
{"url": ..., "parts": ["1|server2"]},
{"url": ..., "parts": ["1|server2","2|some","3|path"]}

This way I could do something like: for urls starting with 'server1/some', facet on the parts starting with 3|. This feels horribly hackish.
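For reference, the pre-processing behind option 2 could look roughly like the sketch below. It assumes the last segment of the url is always a file name and should be left out, which matches the sample documents above; the function name `url_to_parts` is made up for illustration.

```python
def url_to_parts(url):
    """Return depth-tagged directory segments such as '1|server1'."""
    segments = [s for s in url.split("/") if s]
    directories = segments[:-1]  # assumption: the final segment is a file name
    return [f"{depth}|{name}" for depth, name in enumerate(directories, start=1)]


# Enrich a document before indexing it.
doc = {"url": "server1/some/path/a.doc"}
doc["parts"] = url_to_parts(doc["url"])
```

The depth prefix is what lets a facet be restricted to a single level of the tree, at the cost of encoding structure inside a string.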

What's the best way to do this? I can do as much pre-processing as required, but I need the counts to come from ES, as it's the count of results for the query that is important.

Given a doc with url /a/b/c, have a multivalued field for the url and (using preprocessing) fill it with the values: /a, /a/b, /a/b/c
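A minimal sketch of that preprocessing step, assuming urls are split on "/" and empty segments are ignored; `path_prefixes` is a hypothetical helper name, not part of any Elasticsearch client:

```python
def path_prefixes(url):
    """Return every ancestor path of url, shortest first, plus url itself."""
    segments = [s for s in url.split("/") if s]
    return ["/" + "/".join(segments[:i]) for i in range(1, len(segments) + 1)]


prefixes = path_prefixes("/a/b/c")  # ['/a', '/a/b', '/a/b/c']
```

Each value in the list is then indexed as one entry of the multivalued field, so a facet on that field counts documents per path prefix.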

Edit

When you want to constrain the counts to paths of a certain depth, you could design multiple multivalued fields as described above, with each field representing a particular depth.

The ES client should contain the logic to decide which depth (and therefore which field) to query for facets.

It still feels like a hack though, and indeed, without control of the data you could end up with lots of fields like this.
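As a rough illustration of the per-depth variant, the sketch below builds one field per depth and shows how a client might pick the field to facet on. The field names `path_depth_1`, `path_depth_2`, ... are invented for this example, and it assumes the client always wants counts one level below the currently selected path.

```python
FIELD_TEMPLATE = "path_depth_{}"  # hypothetical field naming scheme


def split_path(path):
    """Split a path on '/' and drop empty segments."""
    return [s for s in path.split("/") if s]


def prefixes_by_depth(url):
    """Map each per-depth field name to the url prefix of that depth."""
    segments = split_path(url)
    return {
        FIELD_TEMPLATE.format(depth): "/" + "/".join(segments[:depth])
        for depth in range(1, len(segments) + 1)
    }


def facet_field_for(selected_path):
    """Pick the field holding paths one level below the selected path."""
    return FIELD_TEMPLATE.format(len(split_path(selected_path)) + 1)


fields = prefixes_by_depth("/a/b/c")
```

With this layout, a user who has drilled down to /a would facet on the field returned by `facet_field_for("/a")`, combined with whatever filter restricts results to urls under /a.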

