hibernate - Add a Tika bridge to a newly defined field in a FieldBridge
I have an entity which points to binary data by a numeric identifier (binid). A utility class can provide a binary stream for a given id. The target is to index that binary stream, i.e. the file.
The concept is to create a bridge on the binary data identifier field. Inside the bridge I call the utility class, which gets the stream, and create a new field from the given stream. I'd like that stream to be indexed/analyzed by the Tika bridge.
I use a FieldBridge without LuceneOptions. Additionally, I cannot annotate the entity class, so I use the programmatic API.
So far it looks like this:
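    import java.lang.annotation.ElementType;

    import org.hibernate.search.annotations.Factory;
    import org.hibernate.search.bridge.builtin.TikaBridge;
    import org.hibernate.search.cfg.SearchMapping;

    public class SearchMappingFactory {

        @Factory
        public SearchMapping getSearchMapping() {
            SearchMapping mapping = new SearchMapping();
            mapping.entity(Attachment.class)
                .indexed()
                .property("id", ElementType.FIELD)
                    .documentId()
                .property("name", ElementType.FIELD)
                    .field()
                .property("description", ElementType.FIELD)
                    .field()
                // the numeric id is handed to the custom bridge, which loads the stream
                .property("binid", ElementType.FIELD)
                    .field()
                        .name("attachmentfile")
                        .bridge(AttachmentContentSearchBridge.class)
                // attempt to define an additional bridge for the field created in the bridge
                .property("content", ElementType.FIELD)
                    .field()
                        .bridge(TikaBridge.class);
            return mapping;
        }
    }

And the bridge: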
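    import java.io.InputStreamReader;
    import java.io.Reader;

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.hibernate.search.bridge.FieldBridge;
    import org.hibernate.search.bridge.LuceneOptions;

    public class AttachmentContentSearchBridge implements FieldBridge {

        @Override
        public void set(String name, Object value, Document document, LuceneOptions luceneOptions) {
            // resolve the numeric id to the binary stream via the utility class
            Reader reader = new InputStreamReader(MyBinUtil.getStreamForId((Integer) value));
            Field field = new Field("content", reader); // I'd like to add the Tika bridge here, but can't
            document.add(field);
        }
    }

Let's start with the bridge. It's quite simple; the problem is that I cannot define a bridge for the newly created field content, and that is the major problem I have.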
I tried to solve this by adding the content field to the mapping, where I can define a bridge. The definition is accepted and the application starts and works, but the indexed content has no keywords :(
Please give me advice on how to define the TikaBridge for a field created within a FieldBridge.
Thank you for your time reading this; I hope you can help.
If you stream the data via an id and a custom util class, you cannot use the @TikaBridge annotation. The documentation of the annotation suggests that it works on binary data fields or on String/URL fields. In the latter case the String/URL is used to load the binary data.
In your case you have to re-implement what happens in org.hibernate.search.bridge.builtin.TikaBridge.
The interesting parts are:
    public void set(String name, Object value, Document document, LuceneOptions luceneOptions) {
        if ( value == null ) {
            throw new IllegalArgumentException( "null cannot be passed to Tika bridge" );
        }
        InputStream in = null;
        try {
            in = getInputStreamForData( value );

            Metadata metadata = metadataProcessor.prepareMetadata();
            ParseContext parseContext = parseContextProvider.getParseContext( name, value );

            StringWriter writer = new StringWriter();
            WriteOutContentHandler contentHandler = new WriteOutContentHandler( writer );

            Parser parser = new AutoDetectParser();
            parser.parse( in, contentHandler, metadata, parseContext );
            luceneOptions.addFieldToDocument( name, writer.toString(), document );

            // allow for optional indexing of metadata by the user
            metadataProcessor.set( name, value, document, luceneOptions, metadata );
        }
        catch ( Exception e ) {
            throw propagate( e );
        }
        finally {
            closeQuietly( in );
        }
    }

You need to get the input stream for your data, create a Tika parser, and pass it the stream together with an output StringWriter that Tika can write the extracted data to. In the end you need to add the extracted data as a new field using luceneOptions.
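To make the control flow easier to see, here is a minimal sketch of it using only the JDK. It is not Hibernate Search or Tika code: `StreamSource`, `TextExtractor`, `ContentBridgeSketch` and `ContentBridgeDemo` are hypothetical names invented for illustration. `StreamSource` stands in for the utility class that resolves an id to a stream, `TextExtractor` stands in for the Tika parsing step, and a plain `Map` stands in for the Lucene document.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the utility class that maps a binary id to a stream.
interface StreamSource {
    InputStream open(int binId) throws IOException;
}

// Hypothetical stand-in for the Tika parsing step: stream in, plain text out.
interface TextExtractor {
    String extract(InputStream in) throws IOException;
}

// Follows the same steps as the set() excerpt: null check, open the stream,
// extract text, store the text under the field name, always close the stream.
final class ContentBridgeSketch {
    private final StreamSource source;
    private final TextExtractor extractor;

    ContentBridgeSketch(StreamSource source, TextExtractor extractor) {
        this.source = source;
        this.extractor = extractor;
    }

    // The Map is an assumption standing in for the Lucene Document.
    void set(String name, Object value, Map<String, String> document) {
        if (value == null) {
            throw new IllegalArgumentException("null cannot be passed to the bridge");
        }
        // try-with-resources closes the stream even if extraction fails
        try (InputStream in = source.open((Integer) value)) {
            document.put(name, extractor.extract(in));
        } catch (IOException e) {
            throw new IllegalStateException("Unable to index binary data for id " + value, e);
        }
    }
}

final class ContentBridgeDemo {
    static Map<String, String> run() {
        // Fake binary store: every id yields a small UTF-8 payload.
        StreamSource source = id ->
            new ByteArrayInputStream(("content of binary " + id).getBytes(StandardCharsets.UTF_8));
        // Trivial extractor; a real bridge would run the Tika parser here instead.
        TextExtractor extractor = in -> new String(in.readAllBytes(), StandardCharsets.UTF_8);

        Map<String, String> document = new HashMap<>();
        new ContentBridgeSketch(source, extractor).set("attachmentfile", 7, document);
        return document;
    }
}
```

In a real bridge the extractor step would wrap the Tika calls from the excerpt (parser, content handler, metadata, parse context), and the extracted text would be added through luceneOptions.addFieldToDocument rather than put into a map.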