1. First, add the required dependencies to pom.xml:
<dependency>
    <groupId>org.mybatis.spring.boot</groupId>
    <artifactId>mybatis-spring-boot-starter</artifactId>
    <version>2.1.4</version>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>2.3.9</version>
</dependency>
2. Configure the data source for MyBatis and Hive, for example in application.yml:
spring:
  datasource:
    url: jdbc:hive2://localhost:10000/default
    username: hive
    password: hive
    driver-class-name: org.apache.hive.jdbc.HiveDriver
mybatis:
  mapper-locations: classpath:mapper/*.xml
3. Define the insert SQL statement in the mapper XML, for example:
<insert id="batchInsertData">
    insert into table_name (id, name, age) values
    <foreach collection="list" item="item" separator=",">
        (#{item.id}, #{item.name}, #{item.age})
    </foreach>
</insert>
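
The mapper XML needs a Java interface and a row type to bind to, and neither is shown in these steps. A minimal sketch follows, assuming `long`, `String`, and `int` as the types of `id`, `name`, and `age` (the types are hypothetical; only the names `Data`, `DataMapper`, `batchInsertData`, and the three property names come from the example itself):

```java
import java.util.List;

// Hypothetical row type. The field types are assumptions; only the property
// names id, name and age are taken from the mapper XML.
class Data {
    private final long id;
    private final String name;
    private final int age;

    Data(long id, String name, int age) {
        this.id = id;
        this.name = name;
        this.age = age;
    }

    // MyBatis resolves #{item.id}, #{item.name} and #{item.age} through these getters.
    public long getId() { return id; }
    public String getName() { return name; }
    public int getAge() { return age; }
}

// Mapper interface for <insert id="batchInsertData">.
// A single List parameter without @Param is exposed to the XML under the
// name "list", which is what collection="list" refers to.
// The XML file's namespace must equal this interface's fully qualified name,
// and the interface must be registered with MyBatis (for example with
// @Mapper on the interface or @MapperScan on a configuration class).
interface DataMapper {
    void batchInsertData(List<Data> list);
}
```

In a real project each type would normally be `public` and live in its own file; they are package-private here only so the sketch fits in one file.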
4. In the service, write the batch insert method, for example:
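import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class DataService {
    @Autowired
    private DataMapper dataMapper;

    public void batchInsertData(List<Data> dataList) {
        // Number of rows sent to Hive in one INSERT statement
        int batchSize = 1000;
        int totalCount = dataList.size();
        // Round up so that a final, smaller batch is not dropped
        int batchCount = totalCount / batchSize + (totalCount % batchSize == 0 ? 0 : 1);

        for (int i = 0; i < batchCount; i++) {
            int startIndex = i * batchSize;
            // The last batch ends at totalCount rather than at a full batchSize
            int endIndex = Math.min((i + 1) * batchSize, totalCount);

            // subList returns a view of dataList, so no rows are copied
            List<Data> subList = dataList.subList(startIndex, endIndex);
            dataMapper.batchInsertData(subList);
        }
    }
}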
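
The chunking done by the service can also be pulled out into a small reusable helper, which makes it easy to test without Spring or Hive. This is only a sketch of the same idea (consecutive slices of at most `batchSize` elements); the class name `BatchPartitioner` and its method are hypothetical and not part of MyBatis or Spring:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a list into consecutive batches.
final class BatchPartitioner {
    private BatchPartitioner() {}

    // Returns views of source, each holding at most batchSize elements.
    // The views share storage with source, so nothing is copied.
    static <T> List<List<T>> partition(List<T> source, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int size = source.size();
        List<List<T>> batches = new ArrayList<>();
        int start = 0;
        while (start < size) {
            // Compute the end index in long arithmetic to avoid int overflow
            int end = (int) Math.min((long) start + batchSize, size);
            batches.add(source.subList(start, end));
            start = end;
        }
        return batches;
    }
}
```

With such a helper, the service loop would reduce to iterating over `BatchPartitioner.partition(dataList, 1000)` and passing each batch to `dataMapper.batchInsertData`.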
5. Call the batch insert method from the controller:
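import java.util.ArrayList;
import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class DataController {
    @Autowired
    private DataService dataService;

    @PostMapping("/data/batchInsert")
    public String batchInsertData() {
        List<Data> dataList = new ArrayList<>();
        // Initialize the data
        ...
        dataService.batchInsertData(dataList);
        return "success";
    }
}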
6. Run the application and send a POST request to http://localhost:8080/data/batchInsert to batch insert the data into the Hive database.
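
Because the endpoint is declared with `@PostMapping`, it has to be called with an HTTP POST; a plain browser visit issues a GET instead. One way to trigger it from Java is sketched below, assuming Java 11 or later for `java.net.http.HttpClient`. The class name `BatchInsertClient` and its methods are hypothetical; the path and port are the ones used in the steps here.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical client for the /data/batchInsert endpoint.
final class BatchInsertClient {
    private BatchInsertClient() {}

    // Builds a POST request with an empty body, since the controller
    // method takes no request parameters.
    static HttpRequest buildRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/data/batchInsert"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // Sends the request and returns the response body as text.
    // Requires the application to be running at baseUrl.
    static String trigger(String baseUrl) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(buildRequest(baseUrl), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

Calling `BatchInsertClient.trigger("http://localhost:8080")` against the running application sends the request and returns whatever the controller responds with.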
Example batch-processing code for bulk inserting millions of rows into a Hive database with MyBatis + Spring Boot

Original source: https://www.cveoy.top/t/topic/fodG Copyright belongs to the author. Do not repost or scrape.
