How come Flume-NG HDFS sink does not write to file when the number of events equals or exceeds the batchSize?
I am trying to configure Flume so that my logs roll hourly or when they reach the default HDFS block size (64 MB). Below is my current configuration:
imp-agent.channels = imp-ch1
imp-agent.sources = avro-imp-source1
imp-agent.sinks = hdfs-imp-sink1

imp-agent.channels.imp-ch1.type = memory
imp-agent.channels.imp-ch1.capacity = 40000
imp-agent.channels.imp-ch1.transactionCapacity = 1000

imp-agent.sources.avro-imp-source1.channels = imp-ch1
imp-agent.sources.avro-imp-source1.type = avro
imp-agent.sources.avro-imp-source1.bind = 0.0.0.0
imp-agent.sources.avro-imp-source1.port = 41414
imp-agent.sources.avro-imp-source1.interceptors = host1 timestamp1
imp-agent.sources.avro-imp-source1.interceptors.host1.type = host
imp-agent.sources.avro-imp-source1.interceptors.host1.useIP = false
imp-agent.sources.avro-imp-source1.interceptors.timestamp1.type = timestamp

imp-agent.sinks.hdfs-imp-sink1.channel = imp-ch1
imp-agent.sinks.hdfs-imp-sink1.type = hdfs
imp-agent.sinks.hdfs-imp-sink1.hdfs.path = hdfs://mynamenode:8020/flume/impressions/yr=%y/mo=%m/d=%d/logger=%{host}s1/
imp-agent.sinks.hdfs-imp-sink1.hdfs.filePrefix = impr
imp-agent.sinks.hdfs-imp-sink1.hdfs.batchSize = 10
imp-agent.sinks.hdfs-imp-sink1.hdfs.rollInterval = 3600
imp-agent.sinks.hdfs-imp-sink1.hdfs.rollCount = 0
imp-agent.sinks.hdfs-imp-sink1.hdfs.rollSize = 66584576
My intention with the configuration above is to write to HDFS in batches of 10 events and to roll the file being written every hour. What I am seeing is that all of the data appears to be held in memory until the file rolls after 1 hour, since the file stays under 64 MB. Are there settings I should be tweaking in order to get the desired behavior?
To answer my own question: Flume is in fact writing the data to HDFS in batches. The file length is reported incorrectly while the file is open, because the block currently in the process of being written to is not yet reflected in the reported size.
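For anyone who wants the hourly roll to be the only trigger, a minimal sketch (assuming the standard Flume NG HDFS sink property names, and note that they are case-sensitive) is to zero out the count- and size-based roll conditions so that only the time-based roll fires:

imp-agent.sinks.hdfs-imp-sink1.hdfs.rollInterval = 3600
imp-agent.sinks.hdfs-imp-sink1.hdfs.rollCount = 0
imp-agent.sinks.hdfs-imp-sink1.hdfs.rollSize = 0

With hdfs.rollSize = 0 the sink never rolls on size, so a file can exceed one HDFS block; keeping the original 66584576 value instead preserves the "roll hourly or at roughly block size, whichever comes first" behavior.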