java - How come Flume-NG HDFS sink does not write to file when the number of events equals or exceeds the batchSize? -


I am trying to configure Flume so that my logs roll hourly or when they reach the default HDFS block size (64 MB). Below is my current configuration:

imp-agent.channels.imp-ch1.type = memory
imp-agent.channels.imp-ch1.capacity = 40000
imp-agent.channels.imp-ch1.transactionCapacity = 1000

imp-agent.sources.avro-imp-source1.channels = imp-ch1
imp-agent.sources.avro-imp-source1.type = avro
imp-agent.sources.avro-imp-source1.bind = 0.0.0.0
imp-agent.sources.avro-imp-source1.port = 41414

imp-agent.sources.avro-imp-source1.interceptors = host1 timestamp1
imp-agent.sources.avro-imp-source1.interceptors.host1.type = host
imp-agent.sources.avro-imp-source1.interceptors.host1.useIP = false
imp-agent.sources.avro-imp-source1.interceptors.timestamp1.type = timestamp

imp-agent.sinks.hdfs-imp-sink1.channel = imp-ch1
imp-agent.sinks.hdfs-imp-sink1.type = hdfs
imp-agent.sinks.hdfs-imp-sink1.hdfs.path = hdfs://mynamenode:8020/flume/impressions/yr=%y/mo=%m/d=%d/logger=%{host}s1/
imp-agent.sinks.hdfs-imp-sink1.hdfs.filePrefix = impr
imp-agent.sinks.hdfs-imp-sink1.hdfs.batchSize = 10
imp-agent.sinks.hdfs-imp-sink1.hdfs.rollInterval = 3600
imp-agent.sinks.hdfs-imp-sink1.hdfs.rollCount = 0
imp-agent.sinks.hdfs-imp-sink1.hdfs.rollSize = 66584576

imp-agent.channels = imp-ch1
imp-agent.sources = avro-imp-source1
imp-agent.sinks = hdfs-imp-sink1

My intention with the configuration above is for Flume to write to HDFS in batches of 10 events and to roll the file being written hourly. What I am seeing is that the data appears to be held in memory until the file rolls after 1 hour, since the file stays under 64 MB. Are there settings I should tweak to get the desired behavior?

To answer my own question: Flume is in fact writing the data to HDFS in batches. The file length reported while the file is open lags behind the real size, because the current block is still in the process of being written to.
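One way to confirm this on a live cluster is to read the open file back directly instead of trusting the listed length. The concrete paths below are hypothetical examples derived from the `hdfs.path` pattern in the configuration above; substitute whatever `.tmp` file the sink currently has open:

# The open .tmp file may show a length of 0 (or a stale length)
# even though events have already been flushed into it.
hdfs dfs -ls /flume/impressions/yr=13/mo=09/d=01/logger=myhost-s1/

# Reading the file back shows the events that are really there.
hdfs dfs -cat /flume/impressions/yr=13/mo=09/d=01/logger=myhost-s1/impr.1378000000000.tmp | wc -l

# fsck with -openforwrite lists files that are currently open for
# writing, including the in-progress block.
hdfs fsck /flume/impressions -openforwrite -files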

