
Spring Batch Reading Files Incompletely (A Detailed Guide to Reading Multi-Line Records with Spring Batch)


A few words up front: I am 「境里婆娑」. I am still the same young man I once was, not changed in the slightest; time is merely a test, and the conviction planted in my heart has not faded. The young man before you still wears the same face as at the start, and no amount of hardship ahead will make him retreat.

I write this blog to share and learn together with everyone. If you are interested in Java, feel free to follow me and we can learn together.

Preface: at work you may run into files where a single record read by Spring Batch spans multiple lines, or where one file mixes several different record formats. There is no need to worry: Spring Batch already exposes the hooks for this, and with a small amount of customization it is easy to handle.

Reading Records That Span Multiple Lines

When a flat file does not follow the standard format, you can handle it by implementing the record-separation strategy interface RecordSeparatorPolicy. Non-standard flat files come in several flavors, for example records spanning multiple lines, records beginning with a particular character, or records ending with a particular character.
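For reference, the contract we will implement has three methods. The sketch below mirrors the interface as a local declaration so it compiles without the Spring Batch jar (in the framework it lives in org.springframework.batch.item.file.separator), with a trivial one-line-per-record policy as an example implementation:

```java
// Local mirror of Spring Batch's RecordSeparatorPolicy contract,
// declared here only so the snippet compiles without the framework jar.
interface RecordSeparatorPolicy {
    // true once `record` (possibly several physical lines joined) is complete
    boolean isEndOfRecord(String record);

    // called on the fully assembled record before it goes to the LineMapper
    String postProcess(String record);

    // called on the partial record before the next physical line is appended
    String preProcess(String record);
}

public class PolicyContractDemo {
    public static void main(String[] args) {
        // Trivial policy: every physical line is already a complete record
        RecordSeparatorPolicy simple = new RecordSeparatorPolicy() {
            public boolean isEndOfRecord(String record) { return true; }
            public String postProcess(String record) { return record; }
            public String preProcess(String record) { return record; }
        };
        System.out.println(simple.isEndOfRecord("412222 201 tom 2020-02-27"));
    }
}
```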

The example below uses a file in which every two lines make up one record:

412222 201 tom 2020-02-27

china

412453 203 tqm 2020-03-27

us

412222 205 tym 2020-05-27

jap

The default separation policies, SimpleRecordSeparatorPolicy and DefaultRecordSeparatorPolicy, cannot handle such a file. We can implement the RecordSeparatorPolicy interface ourselves to define a custom policy, MulitiLineRecordSeparatorPolicy.
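Conceptually, FlatFileItemReader keeps appending physical lines to the current record until the policy reports it complete. A framework-free sketch of that loop (simplified, not the framework's actual source; the space-count check stands in for the custom policy):

```java
import java.util.Arrays;
import java.util.Iterator;

public class JoinLoopSketch {
    // Count the space delimiters in a candidate record
    static int countSpaces(String s) {
        int count = 0;
        for (char c : s.toCharArray()) {
            if (c == ' ') count++;
        }
        return count;
    }

    // Simplified version of the loop the reader runs for each record:
    // keep appending physical lines until the record holds the expected
    // number of delimiters. The appended " " plays the role of preProcess(),
    // keeping the joined physical lines properly separated.
    static String readRecord(Iterator<String> lines, int expectedDelimiters) {
        String record = lines.next();
        while (countSpaces(record) != expectedDelimiters && lines.hasNext()) {
            record = record + " " + lines.next();
        }
        return record;
    }

    public static void main(String[] args) {
        Iterator<String> file = Arrays.asList(
                "412222 201 tom 2020-02-27",
                "china").iterator();
        System.out.println(readRecord(file, 4));
    }
}
```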

The class diagram of the core components used when reading multi-line records is shown below:

[Figure: class diagram of the components used to read multi-line records]

In this class diagram, only MulitiLineRecordSeparatorPolicy and CommonFieldSetMapper are custom implementations; all the other components ship with Spring Batch.

MulitiLineRecordSeparatorPolicy: responsible for identifying one complete record in the file. In this implementation, once four delimiters (spaces, matching the sample data) have been read, the assembled line is treated as one complete record.

/**
 * @author shuliangzhao
 * @date 2020/12/6 13:05
 */
public class MulitiLineRecordSeparatorPolicy implements RecordSeparatorPolicy {

    private String delimiter = " ";
    private int count = 0;

    public int getCount() {
        return count;
    }

    public void setCount(int count) {
        this.count = count;
    }

    public String getDelimiter() {
        return delimiter;
    }

    public void setDelimiter(String delimiter) {
        this.delimiter = delimiter;
    }

    @Override
    public boolean isEndOfRecord(String record) {
        return countDelimiter(record) == count;
    }

    private int countDelimiter(String record) {
        String temp = record;
        int index;
        int count = 0;
        while ((index = temp.indexOf(delimiter)) != -1) {
            temp = temp.substring(index + delimiter.length());
            count++;
        }
        return count;
    }

    @Override
    public String postProcess(String record) {
        return record;
    }

    @Override
    public String preProcess(String record) {
        // Append the delimiter before the reader concatenates the next
        // physical line, so the joined record stays properly separated
        return record + delimiter;
    }
}

delimiter: the field separator used when reading.

count: the total number of delimiters; when the number of delimiters contained in a given string equals this value, the string is considered one complete record.
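In other words, isEndOfRecord boils down to counting delimiter occurrences in the assembled record. A standalone illustration of that counting logic (mirroring the policy's countDelimiter, assuming a single-character delimiter):

```java
public class DelimiterCountDemo {
    // Same counting approach as the policy's countDelimiter(): walk the
    // string, advancing past each occurrence of the (single-char) delimiter.
    static int countDelimiter(String record, String delimiter) {
        String temp = record;
        int index;
        int count = 0;
        while ((index = temp.indexOf(delimiter)) != -1) {
            temp = temp.substring(index + delimiter.length());
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // The two physical lines joined into one logical record: 4 spaces,
        // so with count = 4 the record is complete
        System.out.println(countDelimiter("412222 201 tom 2020-02-27 china", " "));
        // The first physical line alone: only 3 spaces, record incomplete
        System.out.println(countDelimiter("412222 201 tom 2020-02-27", " "));
    }
}
```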

1. Job configuration for reading multi-line records

The job for reading multi-line records is set up with Java-based configuration as follows:

/**
 * Reads records that span multiple lines
 * @author shuliangzhao
 * @date 2020/12/6 13:38
 */
@Configuration
@EnableBatchProcessing
public class MulitiLineConfiguration {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    private PartitonMultiFileProcessor partitonMultiFileProcessor;

    @Autowired
    private PartitionMultiFileWriter partitionMultiFileWriter;

    @Bean
    public Job mulitiLineJob() {
        return jobBuilderFactory.get("mulitiLineJob")
                .start(mulitiLineStep())
                .build();
    }

    @Bean
    public Step mulitiLineStep() {
        return stepBuilderFactory.get("mulitiLineStep")
                .<CreditBill, CreditBill>chunk(12)
                .reader(mulitiLineRecordReader())
                .processor(partitonMultiFileProcessor)
                .writer(partitionMultiFileWriter)
                .build();
    }

    @Bean
    @StepScope
    public MulitiLineRecordReader mulitiLineRecordReader() {
        return new MulitiLineRecordReader(CreditBill.class);
    }
}
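A side note on chunk(12): the step reads and processes items one at a time, then hands them to the writer in batches of up to 12. A framework-free sketch of that chunk-oriented loop (a hypothetical helper for intuition, not Spring Batch code):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

public class ChunkLoopSketch {
    // Simplified chunk-oriented processing: read items one by one, process
    // each, and deliver them to the writer in batches of `chunkSize`.
    static <I, O> List<List<O>> run(Iterator<I> reader,
                                    Function<I, O> processor,
                                    int chunkSize) {
        List<List<O>> writes = new ArrayList<>();
        List<O> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == chunkSize) {
                writes.add(chunk);          // stands in for writer.write(chunk)
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) writes.add(chunk); // final partial chunk
        return writes;
    }

    public static void main(String[] args) {
        List<Integer> items = List.of(1, 2, 3, 4, 5);
        // Chunk size 2: two full batches plus one partial batch
        System.out.println(run(items.iterator(), i -> i * 10, 2));
    }
}
```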

2. The reader for multi-line records

The full MulitiLineRecordReader:

/**
 * @author shuliangzhao
 * @date 2020/12/6 13:09
 */
public class MulitiLineRecordReader extends FlatFileItemReader {

    public MulitiLineRecordReader(Class clz) {
        setResource(CommonUtil.createResource("D:\\aplus\\muliti\\muliti.csv"));
        String[] names = CommonUtil.names(clz);

        DefaultLineMapper defaultLineMapper = new DefaultLineMapper();
        CommonFieldSetMapper commonFieldSetMapper = new CommonFieldSetMapper();
        commonFieldSetMapper.setTargetType(clz);
        defaultLineMapper.setFieldSetMapper(commonFieldSetMapper);

        DelimitedLineTokenizer delimitedLineTokenizer = new DelimitedLineTokenizer();
        delimitedLineTokenizer.setFieldSetFactory(new DefaultFieldSetFactory());
        delimitedLineTokenizer.setNames(names);
        delimitedLineTokenizer.setDelimiter(" ");
        defaultLineMapper.setLineTokenizer(delimitedLineTokenizer);

        MulitiLineRecordSeparatorPolicy mulitiLineRecordSeparatorPolicy = new MulitiLineRecordSeparatorPolicy();
        mulitiLineRecordSeparatorPolicy.setCount(4);
        mulitiLineRecordSeparatorPolicy.setDelimiter(" ");
        setRecordSeparatorPolicy(mulitiLineRecordSeparatorPolicy);
        setLineMapper(defaultLineMapper);
    }
}

3. Custom FieldSetMapper

The custom CommonFieldSetMapper:

/**
 * @author shuliangzhao
 * @date 2020/12/4 22:14
 */
public class CommonFieldSetMapper<T> implements FieldSetMapper<T> {

    private Class<? extends T> type;

    @Override
    public T mapFieldSet(FieldSet fieldSet) throws BindException {
        try {
            T t = type.newInstance();
            Field[] declaredFields = type.getDeclaredFields();
            if (declaredFields != null) {
                for (Field field : declaredFields) {
                    field.setAccessible(true);
                    if (field.getName().equals("id")) {
                        continue;
                    }
                    String name = field.getType().getName();
                    if (name.equals("java.lang.Integer")) {
                        field.set(t, fieldSet.readInt(field.getName()));
                    } else if (name.equals("java.lang.String")) {
                        field.set(t, fieldSet.readString(field.getName()));
                    } else if (name.equals("java.util.Date")) {
                        field.set(t, fieldSet.readDate(field.getName()));
                    } else {
                        field.set(t, fieldSet.readString(field.getName()));
                    }
                }
                return t;
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }

    public void setTargetType(Class<? extends T> type) {
        this.type = type;
    }
}

4. The processor for multi-line records

The full PartitonMultiFileProcessor:

@Component
@StepScope
public class PartitonMultiFileProcessor implements ItemProcessor<CreditBill, CreditBill> {

    @Override
    public CreditBill process(CreditBill item) throws Exception {
        CreditBill creditBill = new CreditBill();
        creditBill.setAcctid(item.getAcctid());
        creditBill.setAddress(item.getAddress());
        creditBill.setAmout(item.getAmout());
        creditBill.setDate(item.getDate());
        creditBill.setName(item.getName());
        return creditBill;
    }
}

5. The writer for multi-line records

The full PartitionMultiFileWriter:

@Component
@StepScope
public class PartitionMultiFileWriter implements ItemWriter<CreditBill> {

    @Autowired
    private CreditBillMapper creditBillMapper;

    @Override
    public void write(List<? extends CreditBill> items) throws Exception {
        if (items != null && items.size() > 0) {
            items.forEach(item -> creditBillMapper.insert(item));
        }
    }
}
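The CreditBill entity itself never appears in the article. Judging from the accessors the processor uses (acctid, address, amout, date, name), a minimal version could look like this; the field types are assumptions:

```java
import java.util.Date;

// Minimal CreditBill matching the accessors used in the processor above;
// the field names come from the article, the types are assumptions
// (the "amout" spelling is kept as-is from the original code).
public class CreditBill {
    private Integer acctid;   // account id, e.g. 412222
    private Integer amout;    // amount, e.g. 201
    private String name;      // holder name, e.g. "tom"
    private Date date;        // bill date
    private String address;   // e.g. "china"

    public Integer getAcctid() { return acctid; }
    public void setAcctid(Integer acctid) { this.acctid = acctid; }
    public Integer getAmout() { return amout; }
    public void setAmout(Integer amout) { this.amout = amout; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public Date getDate() { return date; }
    public void setDate(Date date) { this.date = date; }
    public String getAddress() { return address; }
    public void setAddress(String address) { this.address = address; }

    public static void main(String[] args) {
        CreditBill bill = new CreditBill();
        bill.setAcctid(412222);
        bill.setName("tom");
        System.out.println(bill.getAcctid() + " " + bill.getName());
    }
}
```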

With that, we have completed reading a file whose records span multiple lines.

For the complete source of all the code above, see GitHub: (https://github.com/FadeHub/spring-boot-learn/blob/master/spring-boot-springbatch/src/main/java/com/sl/config/MulitiLineConfiguration.java)
