hibernate - Add a Tika bridge to a newly defined field in a FieldBridge


I have an entity which points to binary data through a numeric identifier (binid). A utility class can provide a binary stream for a given id. The target is to index that binary stream, i.e. the file.

The concept is to create a bridge on the binary data identifier field. Inside the bridge I call the utility class, which gets the stream, and create a new field from the given stream. I'd like that stream to be indexed/analyzed by the Tika bridge.

I use a FieldBridge without LuceneOptions. Additionally, I cannot annotate the entity class, so I use the programmatic API.

So far it looks like this:

    import java.lang.annotation.ElementType;

    import org.hibernate.search.annotations.Factory;
    import org.hibernate.search.bridge.builtin.TikaBridge;
    import org.hibernate.search.cfg.SearchMapping;

    public class SearchMappingFactory {

        @Factory
        public SearchMapping getSearchMapping() {
            SearchMapping mapping = new SearchMapping();
            mapping.entity(Attachment.class)
                .indexed()
                .property("id", ElementType.FIELD)
                    .documentId()
                .property("name", ElementType.FIELD)
                    .field()
                .property("description", ElementType.FIELD)
                    .field()
                .property("binid", ElementType.FIELD)
                    .field()
                        .name("attachmentfile")
                        .bridge(AttachmentContentSearchBridge.class)
                // attempt to define an additional bridge for the generated field
                .property("content", ElementType.FIELD)
                    .field()
                        .bridge(TikaBridge.class);
            return mapping;
        }
    }

And the bridge:

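    import java.io.InputStreamReader;
    import java.io.Reader;

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.hibernate.search.bridge.FieldBridge;
    import org.hibernate.search.bridge.LuceneOptions;

    public class AttachmentContentSearchBridge implements FieldBridge {

        @Override
        public void set(String name, Object value, Document document, LuceneOptions luceneOptions) {
            // resolve the numeric id to the binary stream
            Reader reader = new InputStreamReader(MyBinUtil.getStreamForId((Integer) value));
            // I'd like to attach the Tika bridge to this field, but I can't
            Field field = new Field("content", reader);
            document.add(field);
        }
    }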

Let's start with the bridge. It's quite simple; the problem is that I cannot define a bridge for the newly created field content, and that is the major problem I have.

I tried to solve it by adding a content field to the mapping, where I can define a bridge. The definition is accepted and the application starts and works, but the indexed content has no keywords :(

Please give me advice on how to define a TikaBridge for a field created within a FieldBridge.

Thank you for your time reading this; I hope you can help.

If you stream the data via an id and a custom util class, you cannot use the @TikaBridge annotation. The documentation of the annotation suggests that it works for binary data fields or String/URL fields. In the latter case the String/URL is used to load the binary data.

In your case you have to re-implement what happens in org.hibernate.search.bridge.builtin.TikaBridge.

The interesting parts are:

    public void set(String name, Object value, Document document, LuceneOptions luceneOptions) {
        if ( value == null ) {
            throw new IllegalArgumentException( "null cannot be passed to Tika bridge" );
        }
        InputStream in = null;
        try {
            in = getInputStreamForData( value );

            Metadata metadata = metadataProcessor.prepareMetadata();
            ParseContext parseContext = parseContextProvider.getParseContext( name, value );

            StringWriter writer = new StringWriter();
            WriteOutContentHandler contentHandler = new WriteOutContentHandler( writer );

            Parser parser = new AutoDetectParser();
            parser.parse( in, contentHandler, metadata, parseContext );
            luceneOptions.addFieldToDocument( name, writer.toString(), document );

            // allow for optional indexing of metadata by the user
            metadataProcessor.set( name, value, document, luceneOptions, metadata );
        }
        catch ( Exception e ) {
            throw propagate( e );
        }
        finally {
            closeQuietly( in );
        }
    }

You need an input stream of the data. Then create the Tika parser and pass it the stream together with an output StringWriter into which Tika can write the extracted text. In the end you need to add the extracted data as a new field using LuceneOptions.
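Applied to the bridge from the question, the steps above could be sketched roughly as follows. This is a minimal sketch, not the built-in implementation: it assumes Hibernate Search, Lucene and Tika are on the classpath, it skips the metadata processing that TikaBridge performs, and the `MyBinUtil` class in the block is only a stand-in for the asker's stream utility (written here as `MyBinUtil.getStreamForId`). The `extractText` helper is a name introduced for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.StringWriter;

import org.apache.lucene.document.Document;
import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.sax.WriteOutContentHandler;
import org.hibernate.search.bridge.FieldBridge;
import org.hibernate.search.bridge.LuceneOptions;
import org.xml.sax.SAXException;

public class AttachmentContentSearchBridge implements FieldBridge {

    @Override
    public void set(String name, Object value, Document document, LuceneOptions luceneOptions) {
        if (value == null) {
            return; // no binary id, nothing to index
        }
        // Resolve the id to the binary stream; the stream is closed automatically.
        try (InputStream in = MyBinUtil.getStreamForId((Integer) value)) {
            String text = extractText(in);
            // Index the extracted text under the field name configured in the mapping.
            luceneOptions.addFieldToDocument(name, text, document);
        } catch (IOException | SAXException | TikaException e) {
            throw new IllegalStateException("Could not extract text for binary id " + value, e);
        }
    }

    // Hypothetical helper: lets Tika detect the file type and write the plain text to a String.
    static String extractText(InputStream in) throws IOException, SAXException, TikaException {
        StringWriter writer = new StringWriter();
        Parser parser = new AutoDetectParser();
        parser.parse(in, new WriteOutContentHandler(writer), new Metadata(), new ParseContext());
        return writer.toString();
    }
}

// Stand-in for the asker's utility class (assumption); the real one returns the stored file.
final class MyBinUtil {
    static InputStream getStreamForId(Integer id) {
        throw new UnsupportedOperationException("replace with the real lookup");
    }
}
```

With a bridge along these lines, the text is indexed under the name given in the mapping (attachmentfile in the question), so the additional content property with TikaBridge would presumably no longer be needed.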

