hibernate - Add Tika bridge to a newly defined field in a FieldBridge
I have an entity which holds a numeric identifier (binId) pointing to binary data. A utility class can provide the binary stream for a given id. The target of the indexing is that binary stream - a file.

The concept is to create a bridge for the binary data identifier field. Inside the bridge I'll call the utility class, get the stream, and create a new field for the given stream. I'd like the stream to be indexed/analyzed by the Tika bridge.

I use a FieldBridge without LuceneOptions. Additionally, I cannot annotate the entity class, so I use the programmatic API.
So far it looks like this:
    public class SearchMappingFactory {

        @Factory
        public SearchMapping getSearchMapping() {
            SearchMapping mapping = new SearchMapping();
            mapping.entity(Attachment.class)
                .indexed()
                .property("id", ElementType.FIELD)
                    .documentId()
                .property("name", ElementType.FIELD)
                    .field()
                .property("description", ElementType.FIELD)
                    .field()
                .property("binId", ElementType.FIELD)
                    .field()
                        .name("attachmentFile")
                        .bridge(AttachmentContentSearchBridge.class)
                .property("content", ElementType.FIELD) // try to define an additional bridge
                    .field()
                        .bridge(TikaBridge.class);
            return mapping;
        }
    }
And the bridge:
    public class AttachmentContentSearchBridge implements FieldBridge {

        @Override
        public void set(String name, Object value, Document document, LuceneOptions luceneOptions) {
            Reader reader = new InputStreamReader(MyBinUtil.getStreamForId((Integer) value));
            Field field = new Field("content", reader);
            // I'd like to add the Tika bridge here, but I can't
            document.add(field);
        }
    }
Let's start with the bridge. It's quite simple; the problem is that I cannot define a bridge for the newly created content field - that is the major problem I've got.
I tried to solve it by adding the content field to the mapping, so that I can define a bridge on it. The definition is accepted, the application starts and works, but the indexed content has no keywords :(
Please give me advice on how to define a TikaBridge for a field created within a FieldBridge.

Thank you for taking the time to read this, and I hope you can help.
If you stream the data via an id and a custom util class, you cannot use the @TikaBridge annotation. The documentation of the annotation suggests it works on binary data fields or on String/URI fields. In the latter case the String/URI is used to load the binary data.
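For context, here is a minimal sketch of how the annotation is normally applied, assuming an entity that either holds the binary data itself or a String/URI pointing to it (the Song/mp3 names are purely illustrative). This is exactly what is not possible in your case, since the entity only carries the numeric binId and cannot be annotated:

    import java.net.URI;

    import org.hibernate.search.annotations.Field;
    import org.hibernate.search.annotations.Indexed;
    import org.hibernate.search.annotations.TikaBridge;

    @Indexed
    public class Song {

        // the annotated property holds the binary data itself ...
        @Field
        @TikaBridge
        private byte[] mp3Data;

        // ... or a String/URI that the built-in bridge uses to load the data
        @Field
        @TikaBridge
        private URI mp3Location;
    }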
In your case you have to re-implement what happens in org.hibernate.search.bridge.builtin.TikaBridge.

The interesting parts are:
    public void set(String name, Object value, Document document, LuceneOptions luceneOptions) {
        if ( value == null ) {
            throw new IllegalArgumentException( "null cannot be passed to the Tika bridge" );
        }
        InputStream in = null;
        try {
            in = getInputStreamForData( value );

            Metadata metadata = metadataProcessor.prepareMetadata();
            ParseContext parseContext = parseContextProvider.getParseContext( name, value );

            StringWriter writer = new StringWriter();
            WriteOutContentHandler contentHandler = new WriteOutContentHandler( writer );

            Parser parser = new AutoDetectParser();
            parser.parse( in, contentHandler, metadata, parseContext );
            luceneOptions.addFieldToDocument( name, writer.toString(), document );

            // allow for optional indexing of metadata by the user
            metadataProcessor.set( name, value, document, luceneOptions, metadata );
        }
        catch ( Exception e ) {
            throw propagate( e );
        }
        finally {
            closeQuietly( in );
        }
    }
You need the input stream of your data, create a Tika parser, and pass the stream together with an output StringWriter into which Tika can write the extracted text. In the end you need to add the extracted data as a new field using LuceneOptions.
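Putting both parts together, a minimal sketch of such a bridge could look like the following. It assumes that MyBinUtil.getStreamForId returns a readable InputStream for the given binId and that Tika is on the classpath; metadata handling and error handling are simplified compared to the built-in bridge:

    import java.io.InputStream;
    import java.io.StringWriter;

    import org.apache.lucene.document.Document;
    import org.apache.tika.metadata.Metadata;
    import org.apache.tika.parser.AutoDetectParser;
    import org.apache.tika.parser.ParseContext;
    import org.apache.tika.parser.Parser;
    import org.apache.tika.sax.WriteOutContentHandler;
    import org.hibernate.search.bridge.FieldBridge;
    import org.hibernate.search.bridge.LuceneOptions;

    public class AttachmentContentSearchBridge implements FieldBridge {

        @Override
        public void set(String name, Object value, Document document, LuceneOptions luceneOptions) {
            if ( value == null ) {
                return; // nothing to index for this attachment
            }
            InputStream in = null;
            try {
                // resolve the binary stream from the numeric identifier (binId)
                in = MyBinUtil.getStreamForId( (Integer) value );

                // let Tika detect the format and extract the plain text
                StringWriter writer = new StringWriter();
                WriteOutContentHandler contentHandler = new WriteOutContentHandler( writer );
                Parser parser = new AutoDetectParser();
                parser.parse( in, contentHandler, new Metadata(), new ParseContext() );

                // add the extracted text under the field name configured in the mapping
                // ("attachmentFile"), honouring the index/store/analyze settings
                luceneOptions.addFieldToDocument( name, writer.toString(), document );
            }
            catch ( Exception e ) {
                throw new RuntimeException( "Unable to index attachment binId " + value, e );
            }
            finally {
                if ( in != null ) {
                    try { in.close(); } catch ( Exception ignored ) { }
                }
            }
        }
    }

With something like this in place, the extra content property and its TikaBridge entry should no longer be needed in the programmatic mapping; mapping the binId property to the attachmentFile field with this bridge would cover the extraction.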