Spring Batch Basics

This article is an introduction to Spring Batch. We look at its core components and how they work together, and then at code snippets for the main class, the controller class, and the configuration class.

 


Spring Batch is a lightweight framework from the Spring family, designed for building scalable and robust batch applications. Such applications often have to process enormous amounts of data, typically by reading it from a source (such as a CSV file), processing it (business logic), and writing it to another destination (such as a database).

Spring Batch is a useful, high-performing tool for this kind of work. It builds on the POJO-based development approach of the core Spring Framework, so anyone starting with Spring Batch should already have a good understanding of the Spring Framework.

 

1. Advantages of Spring Batch

  • Spring Batch provides reusable functions that help us process large volumes of records, including logging, tracing, transaction management, and job processing statistics.
  • Spring Batch allows us to build scalable batch applications through optimization and partitioning techniques.
  • Spring Batch is straightforward to adopt, both for a new application and for upgrading the batch processing of an existing project.
  • Many applications need batch jobs, and Spring Batch, with its robust framework and support, can solve many of the challenges IT teams face daily.
  • Spring Batch’s JobRepository stores metadata about each run, which makes it easy to inspect JobInstance, JobExecution, and StepExecution records (we discuss these components later in this article).

 

2. Components of Spring Batch:

The diagram below depicts the key concepts that make up the core of Spring Batch. In short:

  1. A Job has one or more steps, and each step has exactly one ItemReader, one ItemProcessor, and one ItemWriter.
  2. A JobLauncher is required to launch the Job, and a JobRepository is required to store metadata about the running process.
  3. A Job is associated with one or more JobInstances, and each JobInstance is defined by the JobParameters used to start the Job.
  4. Each run of a JobInstance is a JobExecution, and each run of a Step is a StepExecution.

The following sections look at each of these components.

[Diagram: Spring Batch core components]

2.1. Job:

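In batch processing, a Job encapsulates the entire batch process. We can configure it via XML or via Java-based configuration (annotations). The Java configuration that defines the Job bean is shown in section 8.3; once defined, the Job can be injected wherever it needs to be launched, for example into a controller:

@RestController
public class SpringBatchJobController {

    @Autowired
    JobLauncher jobLauncher;

    // The Job bean defined in the Java configuration (section 8.3).
    @Autowired
    Job processJob;

    @RequestMapping("/invokejob")
    public String handle() throws Exception {
        // The launch logic goes here; the complete method is shown in section 8.2.
        return "Batch job invoked";
    }
}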

 

2.2. Job Instances:

A JobInstance is a logical job run. Consider, for example, the beginning-of-day (BOD) job of a trading application. The job runs once every morning on a schedule, and each day’s run is tracked separately, which means there is one logical JobInstance per day.

 

2.3. Job Parameters:

Each JobInstance is started with a set of parameters, known as JobParameters, which also distinguish it from other instances of the same Job. In the BOD example, the date for which the job runs is a JobParameter: the runs for July 23 and July 24 have different JobParameters and are therefore different JobInstances. The time at which the job is triggered can also be passed as a JobParameter.
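The link between JobParameters and JobInstance can be sketched in plain Java. This is a simplified model, not the Spring Batch API: the class InstanceRegistry, its method instanceFor, the job name bodJob, and the parameter name runDate are all assumptions made for illustration. It shows that the same job name with the same parameters maps to the same logical instance, while a different date yields a new one.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model (not Spring Batch code): a JobInstance is identified
// by the job name plus the parameters it was started with.
class InstanceRegistry {

    private final Map<List<Object>, Long> instances = new HashMap<>();
    private long nextId = 1;

    // Returns the id of the instance for this name/parameter combination,
    // creating a new instance only if the combination has not been seen before.
    long instanceFor(String jobName, Map<String, String> parameters) {
        List<Object> key = List.of(jobName, Map.copyOf(parameters));
        Long existing = instances.get(key);
        if (existing != null) {
            return existing;
        }
        long id = nextId++;
        instances.put(key, id);
        return id;
    }

    public static void main(String[] args) {
        InstanceRegistry registry = new InstanceRegistry();
        long july23 = registry.instanceFor("bodJob", Map.of("runDate", "07-23"));
        long july23Again = registry.instanceFor("bodJob", Map.of("runDate", "07-23"));
        long july24 = registry.instanceFor("bodJob", Map.of("runDate", "07-24"));
        System.out.println("same instance for same date: " + (july23 == july23Again));
        System.out.println("same instance for new date: " + (july23 == july24));
    }
}
```

This is also why the controller in section 8.2 adds the current time as a parameter: every request then carries different JobParameters and so starts a new JobInstance.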

 

2.4. Job Execution:

A JobExecution is a single attempt to run a JobInstance. A JobInstance can have several JobExecutions (for example, a failed run followed by a restart), but only one of them can complete successfully. The JobExecution also tracks what happened during the run, such as the current status, the exit status, and the start and end times. For a successful run, the exit status is COMPLETED.
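In Spring Batch, a JobInstance whose execution failed can be run again, while one that has already completed cannot be restarted. The following plain-Java sketch models that rule under stated assumptions: InstanceHistory, Status, and run are hypothetical names that are not part of Spring Batch, and a BooleanSupplier stands in for the real work. In the framework itself, the refusal surfaces as the JobInstanceAlreadyCompleteException that appears in the JobLauncher interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Simplified model (not Spring Batch code) of one JobInstance and its executions.
class InstanceHistory {

    enum Status { COMPLETED, FAILED }

    private final List<Status> executions = new ArrayList<>();

    // Runs one attempt. The supplier stands in for the real work and
    // reports whether that attempt succeeded.
    Status run(BooleanSupplier work) {
        if (executions.contains(Status.COMPLETED)) {
            throw new IllegalStateException("instance already completed");
        }
        Status status = work.getAsBoolean() ? Status.COMPLETED : Status.FAILED;
        executions.add(status);
        return status;
    }

    // Read-only view of all attempts made so far.
    List<Status> executions() {
        return List.copyOf(executions);
    }

    public static void main(String[] args) {
        InstanceHistory history = new InstanceHistory();
        System.out.println(history.run(() -> false)); // first attempt fails
        System.out.println(history.run(() -> true));  // restart succeeds
        try {
            history.run(() -> true);                  // a further attempt is refused
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```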

 

3. Step

A Step is a sequential phase of a batch job. A step can, for example, read an object, process a list of objects, or delete a record from a table. Each Job has one or more steps. A step has exactly one ItemReader, one ItemProcessor, and one ItemWriter, and it contains all the information required to define and control the actual batch processing.
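How a chunk-oriented step drives its three components can be sketched with a plain-Java loop. This is an illustrative model, not Spring Batch’s implementation: ChunkLoop, runStep, and the use of Supplier, UnaryOperator, and Consumer as stand-ins for ItemReader, ItemProcessor, and ItemWriter are assumptions. The loop reads until the reader returns null, drops items that the processor maps to null, and hands the writer one chunk at a time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

// Illustrative model of chunk-oriented processing (not Spring Batch code).
class ChunkLoop {

    // reader: returns the next item, or null when the input is exhausted
    // processor: returns the transformed item, or null to filter it out
    // writer: receives one chunk (list of processed items) at a time
    // Returns the number of chunks handed to the writer.
    static <T> int runStep(Supplier<T> reader, UnaryOperator<T> processor,
                           Consumer<List<T>> writer, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be at least 1");
        }
        int chunksWritten = 0;
        boolean exhausted = false;
        while (!exhausted) {
            // Read and process up to chunkSize items for this chunk.
            List<T> chunk = new ArrayList<>();
            for (int i = 0; i < chunkSize; i++) {
                T item = reader.get();
                if (item == null) {
                    exhausted = true;
                    break;
                }
                T processed = processor.apply(item);
                if (processed != null) {
                    chunk.add(processed);
                }
            }
            // Write the chunk only if something survived processing.
            if (!chunk.isEmpty()) {
                writer.accept(chunk);
                chunksWritten++;
            }
        }
        return chunksWritten;
    }

    public static void main(String[] args) {
        Iterator<String> input = List.of("a", "skip", "b", "c").iterator();
        int chunks = ChunkLoop.<String>runStep(
            () -> input.hasNext() ? input.next() : null,
            item -> item.equals("skip") ? null : item.toUpperCase(),
            chunk -> System.out.println("writing " + chunk),
            2);
        System.out.println("chunks written: " + chunks);
    }
}
```

The chunk size plays the same role as the argument to chunk(1) in the configuration in section 8.3: it controls how many items are read before the writer is called.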

 

3.1. StepExecution

Just as a Job has JobExecutions, a Step has StepExecutions: a StepExecution represents a single attempt to execute a Step. Like a JobExecution, a StepExecution stores information such as the current and exit statuses and the start and end times.

 

4. JobRepository

The JobRepository takes care of all CRUD (create, read, update, and delete) operations on batch metadata and so ensures persistence. It does this for the JobLauncher, Job, and Step. When a Job is first launched, a JobExecution is obtained from the JobRepository, and during the run the StepExecution and JobExecution instances are persisted to the repository.

The Spring annotation @EnableBatchProcessing takes care of setting up the repository automatically.

 

4.1. JobLauncher

JobLauncher is a simple interface for launching a Job with a set of JobParameters. The following interface definition is from the library itself:

package org.springframework.batch.core.launch;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersInvalidException;
import org.springframework.batch.core.repository.JobExecutionAlreadyRunningException;
import org.springframework.batch.core.repository.JobInstanceAlreadyCompleteException;
import org.springframework.batch.core.repository.JobRestartException;

public interface JobLauncher {
    JobExecution run(Job job, JobParameters jobParameters)
            throws JobExecutionAlreadyRunningException, JobRestartException,
                   JobInstanceAlreadyCompleteException, JobParametersInvalidException;
}

 

5. ItemReader

An ItemReader is associated with a Step and represents the retrieval of input for that Step. It reads sequentially, one item at a time, and returns null when it has no more items to provide.

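import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.NonTransientResourceException;
import org.springframework.batch.item.ParseException;
import org.springframework.batch.item.UnexpectedInputException;

public class SBReader implements ItemReader<String> {
    // Sample in-memory input, assumed here for illustration only.
    private final String[] messages = {"first item", "second item"};
    private int index = 0;

    @Override
    public String read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
        // Return the next item, or null once the input is exhausted.
        return index < messages.length ? messages[index++] : null;
    }
}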

 

6. ItemProcessor

An ItemProcessor is associated with a Step and represents the business processing of each item provided by the ItemReader. Once processing completes, the resulting item is passed on to the ItemWriter. If the item is not valid, the processor returns null, and that item is not written.

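import org.springframework.batch.item.ItemProcessor;

public class SBProcessor implements ItemProcessor<String, String> {
    @Override
    public String process(String item) throws Exception {
        // Example transformation (an assumption); returning null would filter the item out.
        return item.toUpperCase();
    }
}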

 

7. ItemWriter

An ItemWriter is associated with a Step and represents the output of that Step. It writes one chunk of items at a time and knows only the items passed to it in the current call; it is not aware of the items that will follow.

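import java.util.List;
import org.springframework.batch.item.ItemWriter;

public class SBWriter implements ItemWriter<String> {
    @Override
    public void write(List<? extends String> items) throws Exception {
        // Spring Batch 4 signature; as a placeholder, each item of the chunk is printed.
        items.forEach(System.out::println);
    }
}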

 

8. Code Snippets

The next article covers the entire setup of a basic Spring Batch job with a read-process-write flow. The snippets below come from that project and give an idea of the actual setup. The next article also shows the H2 database setup and how to view the tables that Spring Batch creates during a run. We use Spring Boot to build the application, but this is optional; you can build it without Spring Boot.

8.1. Main Class:

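import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
@EnableBatchProcessing
public class SpringBootBatchBasicApplication {

    public static void main(String[] args) {
        SpringApplication.run(SpringBootBatchBasicApplication.class, args);
    }

}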

 

8.2. Controller:

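import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class SpringBatchJobController {

    @Autowired
    JobLauncher jobLauncher;

    @Autowired
    Job processJob;

    @RequestMapping("/invokejob")
    public String handle() throws Exception {

        // The current time makes the parameters unique, so each request starts a new JobInstance.
        JobParameters jobParameters = new JobParametersBuilder().addLong("time", System.currentTimeMillis())
            .toJobParameters();
        jobLauncher.run(processJob, jobParameters);

        return "Batch job invoked";
    }
}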

 

8.3. Config

package com.javadevjournal.springbootbatch.config;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecutionListener;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import com.javadevjournal.springbootbatch.listener.SpringBarchJobCompletionListener;
import com.javadevjournal.springbootbatch.step.SBProcessor;
import com.javadevjournal.springbootbatch.step.SBReader;
import com.javadevjournal.springbootbatch.step.SBWriter;

@Configuration
public class SpringBatchConfig {

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    @Bean
    public Job processJob() {
        return jobBuilderFactory.get("processJob")
            .incrementer(new RunIdIncrementer()).listener(listener())
            .flow(orderStep1()).end().build();
    }

    @Bean
    public Step orderStep1() {
        return stepBuilderFactory.get("orderStep1").<String, String>chunk(1)
            .reader(new SBReader()).processor(new SBProcessor())
            .writer(new SBWriter()).build();
    }

    @Bean
    public JobExecutionListener listener() {
        return new SpringBarchJobCompletionListener();
    }

}

 

Summary

In this post, we introduced Spring Batch and covered the following points:

  • The basics of Spring Batch and its advantages.
  • The core components of Spring Batch.
  • How these components are connected and work together.
  • A few code snippets from an actual Spring Batch project.

The source code is available on our GitHub repository.