Spring Batch reads files incompletely (the most detailed walkthrough online of reading multi-line records with Spring Batch)
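A note up front: I am 「境里婆娑」. I am still the boy I used to be, without the slightest change; time is only a test, and the conviction planted in my heart has not faded at all. The boy in front of you still has the face he started with, and he does not back down however many hardships lie ahead.
I write this blog to share what I learn so that we can study and exchange ideas together. If you are interested in Java, feel free to follow me and we can learn together.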
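Preface: At work you may run into files in which a single record spans several lines, or files that mix several record formats. There is no need to worry: Spring Batch already provides the extension points for these cases, and a small amount of customization is enough to handle them.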
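Reading files whose records span multiple lines
When a flat file does not follow the standard format, it can be read by implementing the record separator policy interface RecordSeparatorPolicy. Non-standard flat files come in several forms, for example records that span multiple lines, records that start with a specific character, or records that end with a specific character.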
In the example below, every two lines represent one record:
412222 201 tom 2020-02-27
china
412453 203 tqm 2020-03-27
us
412222 205 tym 2020-05-27
jap
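The built-in record separator policies SimpleRecordSeparatorPolicy and DefaultRecordSeparatorPolicy cannot handle this kind of file, because both treat a physical line as a complete record (DefaultRecordSeparatorPolicy continues onto the next line only for an unterminated quote or a trailing continuation marker). We can implement the RecordSeparatorPolicy interface to define a custom policy, MulitiLineRecordSeparatorPolicy.
The class diagram of the core components used when reading multi-line records is as follows: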

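In this class diagram, only MulitiLineRecordSeparatorPolicy and CommonFieldSetMapper are custom implementations; all other components ship with Spring Batch.
MulitiLineRecordSeparatorPolicy: decides where a complete record ends. In this implementation, a record is considered complete once the text read so far contains the configured number of delimiters (four space delimiters in this example).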
import org.springframework.batch.item.file.separator.RecordSeparatorPolicy;

/**
 * Treats a record as complete once it contains exactly `count` delimiters.
 * @author shuliangzhao
 * @date 2020/12/6 13:05
 */
public class MulitiLineRecordSeparatorPolicy implements RecordSeparatorPolicy {
    private String delimiter = " ";
    private int count = 0;
    public int getCount() {
        return count;
    }
    public void setCount(int count) {
        this.count = count;
    }
    public String getDelimiter() {
        return delimiter;
    }
    public void setDelimiter(String delimiter) {
        this.delimiter = delimiter;
    }
    @Override
    public boolean isEndOfRecord(String record) {
        return countDelimiter(record) == count;
    }
    // Counts how many times the configured delimiter occurs in the text read so far.
    private int countDelimiter(String record) {
        String temp = record;
        int index;
        int found = 0;
        while ((index = temp.indexOf(delimiter)) != -1) {
            temp = temp.substring(index + delimiter.length());
            found++;
        }
        return found;
    }
    @Override
    public String postProcess(String record) {
        return record;
    }
    @Override
    public String preProcess(String record) {
        return record;
    }
}
    
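delimiter: the delimiter that separates fields in the file being read.
count: the total number of delimiters; when the given string contains exactly this many delimiters, it is treated as a complete record.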
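To make the count-based rule concrete, here is a minimal plain-Java sketch of how a record gets assembled. It is not Spring Batch code: `RecordAssemblyDemo`, `countDelimiter`, and `assemble` are hypothetical names, and the loop is only an approximation of what `FlatFileItemReader` does, namely appending each new line to `preProcess(record)` with no separator in between. Under that assumption, the sample file exactly as printed above reaches four spaces only if the delimiter is re-inserted at the line break (for example in `preProcess`) or if the second line of each record starts with a space; the sketch shows both a working and a failing case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Plain-Java simulation; none of these names come from Spring Batch.
class RecordAssemblyDemo {

    // Counts non-overlapping occurrences of a non-empty delimiter.
    static int countDelimiter(String record, String delimiter) {
        int count = 0;
        int index = record.indexOf(delimiter);
        while (index != -1) {
            count++;
            index = record.indexOf(delimiter, index + delimiter.length());
        }
        return count;
    }

    // Mimics the assumed reader loop: keep appending physical lines until the
    // accumulated text contains exactly `expected` delimiters.
    static List<String> assemble(List<String> lines, String delimiter, int expected,
                                 UnaryOperator<String> preProcess) {
        List<String> records = new ArrayList<>();
        int i = 0;
        while (i < lines.size()) {
            String record = lines.get(i++);
            while (countDelimiter(record, delimiter) != expected) {
                if (i >= lines.size()) {
                    throw new IllegalStateException(
                            "End of input before record complete: " + record);
                }
                record = preProcess.apply(record) + lines.get(i++);
            }
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "412222 201 tom 2020-02-27", "china",
                "412453 203 tqm 2020-03-27", "us",
                "412222 205 tym 2020-05-27", "jap");
        // Re-inserting the delimiter at the line break yields three complete records.
        System.out.println(assemble(lines, " ", 4, r -> r + " "));
        try {
            // Gluing lines together unchanged never produces exactly four spaces.
            assemble(lines, " ", 4, r -> r);
        } catch (IllegalStateException e) {
            System.out.println("identity preProcess failed: " + e.getMessage());
        }
    }
}
```

If the continuation lines in your own file do not start with a space, returning `record + delimiter` from `preProcess` is one way to get the same effect in the real policy. Treat that as a suggestion to verify against your data, not as part of the original implementation.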
1. Job configuration for reading multi-line records
The Java-based (JavaBean) configuration of the job is as follows:
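import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

/**
 * Reads a file whose records span multiple lines.
 * @author shuliangzhao
 * @date 2020/12/6 13:38
 */
@Configuration
@EnableBatchProcessing
public class MulitiLineConfiguration {
    @Autowired
    private JobBuilderFactory jobBuilderFactory;
    @Autowired
    private StepBuilderFactory stepBuilderFactory;
    @Autowired
    private PartitonMultiFileProcessor partitonMultiFileProcessor;
    @Autowired
    private PartitionMultiFileWriter partitionMultiFileWriter;
    @Bean
    public Job mulitiLineJob() {
        return jobBuilderFactory.get("mulitiLineJob").start(mulitiLineStep()).build();
    }
    @Bean
    public Step mulitiLineStep() {
        return stepBuilderFactory.get("mulitiLineStep")
                .<CreditBill, CreditBill>chunk(12)
                .reader(mulitiLineRecordReader())
                .processor(partitonMultiFileProcessor)
                .writer(partitionMultiFileWriter)
                .build();
    }
    @Bean
    @StepScope
    public MulitiLineRecordReader mulitiLineRecordReader() {
        return new MulitiLineRecordReader(CreditBill.class);
    }
}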
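The `chunk(12)` setting in the step controls how many items are grouped into each write. As a rough plain-Java illustration (not Spring Batch code; `ChunkLoopDemo` and `run` are hypothetical names), the loop below reads up to `chunkSize` items, processes each one, and hands the resulting list to the writer in a single call, which is the general shape of chunk-oriented processing. Transactions, skipping, and retry are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Simplified illustration of chunk-oriented processing; not Spring Batch code.
class ChunkLoopDemo {

    // Runs read/process/write in chunks and returns the number of write calls.
    static <I, O> int run(Iterator<I> reader, Function<I, O> processor,
                          Consumer<List<O>> writer, int chunkSize) {
        int writes = 0;
        while (reader.hasNext()) {
            // Read phase: collect at most chunkSize items.
            List<I> items = new ArrayList<>();
            while (items.size() < chunkSize && reader.hasNext()) {
                items.add(reader.next());
            }
            // Process phase: transform each item individually.
            List<O> outputs = new ArrayList<>();
            for (I item : items) {
                outputs.add(processor.apply(item));
            }
            // Write phase: one writer call per chunk.
            writer.accept(outputs);
            writes++;
        }
        return writes;
    }

    public static void main(String[] args) {
        List<Integer> input = new ArrayList<>();
        for (int i = 1; i <= 30; i++) {
            input.add(i);
        }
        // 30 items with a chunk size of 12 are written as 12 + 12 + 6.
        int writes = ChunkLoopDemo.<Integer, String>run(
                input.iterator(),
                n -> "item-" + n,
                chunk -> System.out.println("writing " + chunk.size() + " items"),
                12);
        System.out.println("write calls: " + writes);
    }
}
```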
    
2. Reader for multi-line records
MulitiLineRecordReader is shown in full below:
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.DefaultFieldSetFactory;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;

/**
 * @author shuliangzhao
 * @date 2020/12/6 13:09
 */
public class MulitiLineRecordReader extends FlatFileItemReader {
    public MulitiLineRecordReader(Class clz) {
        setResource(CommonUtil.createResource("D:\\aplus\\muliti\\muliti.csv"));
        String[] names = CommonUtil.names(clz);
        // Map each tokenized record onto an instance of the target class.
        DefaultLineMapper defaultLineMapper = new DefaultLineMapper();
        CommonFieldSetMapper commonFieldSetMapper = new CommonFieldSetMapper();
        commonFieldSetMapper.setTargetType(clz);
        defaultLineMapper.setFieldSetMapper(commonFieldSetMapper);
        // Split each assembled record on spaces and name the tokens.
        DelimitedLineTokenizer delimitedLineTokenizer = new DelimitedLineTokenizer();
        delimitedLineTokenizer.setFieldSetFactory(new DefaultFieldSetFactory());
        delimitedLineTokenizer.setNames(names);
        delimitedLineTokenizer.setDelimiter(" ");
        defaultLineMapper.setLineTokenizer(delimitedLineTokenizer);
        // A record is complete once it contains four space delimiters.
        MulitiLineRecordSeparatorPolicy mulitiLineRecordSeparatorPolicy = new MulitiLineRecordSeparatorPolicy();
        mulitiLineRecordSeparatorPolicy.setCount(4);
        mulitiLineRecordSeparatorPolicy.setDelimiter(" ");
        setRecordSeparatorPolicy(mulitiLineRecordSeparatorPolicy);
        setLineMapper(defaultLineMapper);
    }
}
3. Custom FieldSetMapper
    
The custom CommonFieldSetMapper is as follows:
import java.lang.reflect.Field;
import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;

/**
 * Maps a FieldSet onto the target type by matching token names to field names.
 * @author shuliangzhao
 * @date 2020/12/4 22:14
 */
public class CommonFieldSetMapper<T> implements FieldSetMapper<T> {
    private Class<? extends T> type;
    @Override
    public T mapFieldSet(FieldSet fieldSet) throws BindException {
        try {
            T t = type.getDeclaredConstructor().newInstance();
            for (Field field : type.getDeclaredFields()) {
                field.setAccessible(true);
                // The id field is not populated from the file.
                if (field.getName().equals("id")) {
                    continue;
                }
                // Pick the FieldSet accessor that matches the declared field type.
                String name = field.getType().getName();
                if (name.equals("java.lang.Integer")) {
                    field.set(t, fieldSet.readInt(field.getName()));
                } else if (name.equals("java.lang.String")) {
                    field.set(t, fieldSet.readString(field.getName()));
                } else if (name.equals("java.util.Date")) {
                    field.set(t, fieldSet.readDate(field.getName()));
                } else {
                    field.set(t, fieldSet.readString(field.getName()));
                }
            }
            return t;
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }
    public void setTargetType(Class<? extends T> type) {
        this.type = type;
    }
}
4. Processor for multi-line records
    
PartitonMultiFileProcessor is shown in full below:
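import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.stereotype.Component;

@Component
@StepScope
public class PartitonMultiFileProcessor implements ItemProcessor<CreditBill, CreditBill> {
    @Override
    public CreditBill process(CreditBill item) throws Exception {
        // Copy the fields of the item that was read into a new CreditBill.
        CreditBill creditBill = new CreditBill();
        creditBill.setAcctid(item.getAcctid());
        creditBill.setAddress(item.getAddress());
        creditBill.setAmout(item.getAmout());
        creditBill.setDate(item.getDate());
        creditBill.setName(item.getName());
        return creditBill;
    }
}
5. Writer for multi-line records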
    
PartitionMultiFileWriter is shown in full below:
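import java.util.List;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.ItemWriter;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

@Component
@StepScope
public class PartitionMultiFileWriter implements ItemWriter<CreditBill> {
    @Autowired
    private CreditBillMapper creditBillMapper;
    @Override
    public void write(List<? extends CreditBill> items) throws Exception {
        // Insert every item of the chunk through the mapper.
        if (items != null && !items.isEmpty()) {
            items.forEach(item -> creditBillMapper.insert(item));
        }
    }
}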
    
With that, we have finished processing a file whose records span multiple lines.
To look at all of the code above in more detail, head over to GitHub: (https://github.com/FadeHub/spring-boot-learn/blob/master/spring-boot-springbatch/src/main/java/com/sl/config/MulitiLineConfiguration.java)




