DataX 二次开发支持 Oracle 更新数据
创始人
2024-03-30 01:44:27
0

文章目录

        • 1、原理
        • 2、源码修改
          • 2.1 OracleWriter注释对writeMode的限制
          • 2.2 WriterUtil,增加oracle逻辑
          • 2.3 CommonRdbmsWriter.Task修改
          • 2.4 测试

前文回顾:
《DataX 及 DataX-Web 安装使用详解》
《DataX 源码调试及打包》
《DataX-Web 源码调试及打包》

目前很多主流数据库支持 on duplicate key update(当主键冲突update数据)模式,DataX 也支持通过配置 writeMode 来配置写入模式。但是目前Oracle只支持insert配置项

如何适配 Oracle 数据库 on duplicate key update 模式,今天大佬超就来带你进行二次开发。


1、原理

writeMode 的 insert、replace、update 配置项底层采用 INSERT INTOREPLACE INTO/INSERT INTO … ON DUPLICATE KEY UPDATE 语句:

其中 insert into 当主键/唯一性索引冲突时会写不进去冲突的行;后两者没有遇到主键/唯一性索引冲突时与 insert into 行为一致,遇到冲突时会用新行替换原有行所有字段。

oracle 不支持类似 MySQL的 REPLACE INTOINSERT … ON DUPLICATE KEY UPDATE,所以只支持 insert 配置项。要实现此功能,需要利用 Oracle 的 merge 语句,先来看下 merge 语法。

MERGE INTO [target-table] A USING [source-table sql] B 
ON([conditional expression] and [...]...) 
WHEN MATCHED THEN[UPDATE sql] 
WHEN NOT MATCHED THEN [INSERT sql]

merge 语法其实就是存在就更新,不存在就插入。

示例:

MERGE INTO USERS A USING ( SELECT 18 AS "ID",'chaodev' AS "USER_ID" FROM DUAL ) TMP 
ON (TMP."ID" = A."ID" AND TMP."USER_ID" = A."USER_ID" ) 
WHEN MATCHED THEN 
UPDATE SET "USER_NAME" = '大佬超',"USER_PHONE" = '18000000000',"LASTUPDATETIME" = SYSDATE 
WHEN NOT MATCHED THEN 
INSERT ("ID","USER_ID","USER_NAME","USER_PHONE","LASTUPDATETIME") VALUES(18,'chaodev','大佬超','18000000000',SYSDATE)

所以最终实现原理就是:更改datax 的 oraclewriter 源码,通过 merge into 语句,实现 UPSERT 语义。


2、源码修改

涉及修改的类和方法如下:

oraclewriter包

com.alibaba.datax.plugin.writer.oraclewriter.OracleWriter:修改允许用户配置 writeMode。

plugin-dbms-util包

com.alibaba.datax.plugin.rdbms.writer.util.WriterUtil:增加oralce的逻辑代码。

com.alibaba.datax.plugin.rdbms.writer.CommonRdbmsWriter:CommonRdbmsWriter.Task类替换 startWriteWithConnection() 、doBatchInsert() 和fillPreparedStatement() 方法。


2.1 OracleWriter注释对writeMode的限制

在这里插入图片描述


2.2 WriterUtil,增加oracle逻辑

在这里插入图片描述

增加oracle的逻辑代码,如下

/**
* 新增oracle update模块
* @author 程序员大佬超
* @date 20221202
* @param columnHolders
* @param valueHolders
* @param writeMode
* @param dataBaseType
* @param forceUseUpdate
* @return
*/
public static String getWriteTemplate(List columnHolders, List valueHolders,String writeMode, DataBaseType dataBaseType, boolean forceUseUpdate)
{String mode = writeMode.trim().toLowerCase();boolean isWriteModeLegal = mode.startsWith("insert") || mode.startsWith("replace") || mode.startsWith("update");if (!isWriteModeLegal) {throw DataXException.asDataXException(DBUtilErrorCode.ILLEGAL_VALUE,String.format("您所配置的 writeMode:%s 错误. 因为DataX 目前仅支持replace,update 或 insert 方式. 请检查您的配置并作出修改.", writeMode));}String writeDataSqlTemplate;if (forceUseUpdate || mode.startsWith("update")) {if (dataBaseType == DataBaseType.MySql || dataBaseType == DataBaseType.Tddl) {writeDataSqlTemplate = new StringBuilder().append("INSERT INTO %s (").append(StringUtils.join(columnHolders, ",")).append(") VALUES(").append(StringUtils.join(valueHolders, ",")).append(")").append(onDuplicateKeyUpdateString(columnHolders)).toString();}else if (dataBaseType == DataBaseType.Oracle) {writeDataSqlTemplate = new StringBuilder().append(onMergeIntoDoString(writeMode, columnHolders, valueHolders)).append("INSERT (").append(StringUtils.join(columnHolders, ",")).append(") VALUES(").append(StringUtils.join(valueHolders, ",")).append(")").toString();}else {throw DataXException.asDataXException(DBUtilErrorCode.ILLEGAL_VALUE,String.format("当前数据库不支持 writeMode:%s 模式.", writeMode));}}else {//这里是保护,如果其他错误的使用了update,需要更换为replaceif (writeMode.trim().toLowerCase().startsWith("update")) {writeMode = "replace";}writeDataSqlTemplate = new StringBuilder().append(writeMode).append(" INTO %s (").append(StringUtils.join(columnHolders, ",")).append(") VALUES(").append(StringUtils.join(valueHolders, ",")).append(")").toString();}return writeDataSqlTemplate;
}

调用的方法新增如下,主要就是拼接merge语句:

public static String onMergeIntoDoString(String merge, List columnHolders, List valueHolders) {String[] sArray = getStrings(merge);StringBuilder sb = new StringBuilder();sb.append("MERGE INTO %s A USING ( SELECT ");boolean first = true;boolean first1 = true;StringBuilder str = new StringBuilder();StringBuilder update = new StringBuilder();for (String columnHolder : columnHolders) {if (Arrays.asList(sArray).contains(columnHolder)) {if (!first) {sb.append(",");str.append(" AND ");} else {first = false;}str.append("TMP.").append(columnHolder);sb.append("?");str.append(" = ");sb.append(" AS ");str.append("A.").append(columnHolder);sb.append(columnHolder);}}for (String columnHolder : columnHolders) {if (!Arrays.asList(sArray).contains(columnHolder)) {if (!first1) {update.append(",");} else {first1 = false;}update.append(columnHolder);update.append(" = ");update.append("?");}}sb.append(" FROM DUAL ) TMP ON (");sb.append(str);sb.append(" ) WHEN MATCHED THEN UPDATE SET ");sb.append(update);sb.append(" WHEN NOT MATCHED THEN ");return sb.toString();
}public static String[] getStrings(String merge) {merge = merge.replace("update", "");merge = merge.replace("(", "");merge = merge.replace(")", "");merge = merge.replace(" ", "");return merge.split(",");
}

2.3 CommonRdbmsWriter.Task修改

修改 startWriteWithConnection() 方法

/**
* 更改适配oracle update
* @author 程序员大佬超
* @date 20221202
* @param recordReceiver
* @param taskPluginCollector
* @param connection
*/
public void startWriteWithConnection(RecordReceiver recordReceiver, TaskPluginCollector taskPluginCollector, Connection connection)
{this.taskPluginCollector = taskPluginCollector;List mergeColumns = new ArrayList<>();if (this.dataBaseType == DataBaseType.Oracle && !"insert".equalsIgnoreCase(this.writeMode)) {LOG.info("write oracle using {} mode", this.writeMode);List columnsOne = new ArrayList<>();List columnsTwo = new ArrayList<>();String merge = this.writeMode;String[] sArray = WriterUtil.getStrings(merge);for (String s : this.columns) {if (Arrays.asList(sArray).contains(s)) {columnsOne.add(s);}}for (String s : this.columns) {if (!Arrays.asList(sArray).contains(s)) {columnsTwo.add(s);}}int i = 0;for (String column : columnsOne) {mergeColumns.add(i++, column);}for (String column : columnsTwo) {mergeColumns.add(i++, column);}}mergeColumns.addAll(this.columns);// 用于写入数据的时候的类型根据目的表字段类型转换this.resultSetMetaData = DBUtil.getColumnMetaData(connection,this.table, StringUtils.join(mergeColumns, ","));// 写数据库的SQL语句calcWriteRecordSql();List writeBuffer = new ArrayList<>(this.batchSize);int bufferBytes = 0;try {Record record;while ((record = recordReceiver.getFromReader()) != null) {if (record.getColumnNumber() != this.columnNumber) {// 源头读取字段列数与目的表字段写入列数不相等,直接报错throw DataXException.asDataXException(DBUtilErrorCode.CONF_ERROR,String.format("列配置信息有错误. 因为您配置的任务中,源头读取字段数:%s 与 目的表要写入的字段数:%s 不相等. 请检查您的配置并作出修改.",record.getColumnNumber(),this.columnNumber));}writeBuffer.add(record);bufferBytes += record.getMemorySize();if (writeBuffer.size() >= batchSize || bufferBytes >= batchByteSize) {doBatchInsert(connection, writeBuffer);writeBuffer.clear();bufferBytes = 0;}}if (!writeBuffer.isEmpty()) {doBatchInsert(connection, writeBuffer);writeBuffer.clear();}}catch (Exception e) {throw DataXException.asDataXException(DBUtilErrorCode.WRITE_DATA_ERROR, e);}finally {writeBuffer.clear();DBUtil.closeDBResources(null, null, connection);}
}

修改 doBatchInsert() 方法

/**
* 更改适配oracle update
* @author 程序员大佬超
* @date 20221202
* @param connection
* @param buffer
* @throws SQLException
*/
protected void doBatchInsert(Connection connection, List buffer)throws SQLException
{PreparedStatement preparedStatement = null;try {connection.setAutoCommit(false);preparedStatement = connection.prepareStatement(this.writeRecordSql);if (this.dataBaseType == DataBaseType.Oracle && !"insert".equalsIgnoreCase(this.writeMode)) {String merge = this.writeMode;String[] sArray = WriterUtil.getStrings(merge);for (Record record : buffer) {List recordOne = new ArrayList<>();for (int j = 0; j < this.columns.size(); j++) {if (Arrays.asList(sArray).contains(this.columns.get(j))) {recordOne.add(record.getColumn(j));}}for (int j = 0; j < this.columns.size(); j++) {if (!Arrays.asList(sArray).contains(this.columns.get(j))) {recordOne.add(record.getColumn(j));}}for (int j = 0; j < this.columns.size(); j++) {recordOne.add(record.getColumn(j));}for (int j = 0; j < recordOne.size(); j++) {record.setColumn(j, recordOne.get(j));}preparedStatement = fillPreparedStatement(preparedStatement, record);preparedStatement.addBatch();}}else {for (Record record : buffer) {preparedStatement = fillPreparedStatement(preparedStatement, record);preparedStatement.addBatch();}}preparedStatement.executeBatch();connection.commit();}catch (SQLException e) {LOG.warn("回滚此次写入, 采用每次写入一行方式提交. 因为: {}", e.getMessage());connection.rollback();doOneInsert(connection, buffer);}catch (Exception e) {throw DataXException.asDataXException(DBUtilErrorCode.WRITE_DATA_ERROR, e);}finally {DBUtil.closeDBResources(preparedStatement, null);}
}

修改 fillPreparedStatement() 方法

/**
* 更改适配oracle update
* @author 程序员大佬超
* @date 20221202
* @param preparedStatement
* @param record
* @return
* @throws SQLException
*/
protected PreparedStatement fillPreparedStatement(PreparedStatement preparedStatement, Record record)throws SQLException
{for (int i = 0; i < record.getColumnNumber(); i++) {int columnSqltype = this.resultSetMetaData.getMiddle().get(i);preparedStatement = fillPreparedStatementColumnType(preparedStatement, i,columnSqltype, record.getColumn(i));}return preparedStatement;
}

2.4 测试

重新打包后,测试job

{"job": {"setting": {"speed": {"channel": 3,"byte": 1048576},"errorLimit": {"record": 0,"percentage": 0.02}},"content": [{"reader": {"name": "mysqlreader","parameter": {"username": "root","password": "123456","column": ["`id`","`user_id`","`user_password`","`user_name`","`user_phone`","`email`","`nick_name`","`head_url`","`sex`","`state`","`create_time`","`create_user`","`lastUpdateTime`"],"splitPk": "","connection": [{"table": ["users"],"jdbcUrl": ["jdbc:mysql://127.0.0.1:3306/im"]}]}},"writer": {"name": "oraclewriter","parameter": {"username": "yxc","password": "123456","column": ["\"ID\"","\"USER_ID\"","\"USER_PASSWORD\"","\"USER_NAME\"","\"USER_PHONE\"","\"EMAIL\"","\"NICK_NAME\"","\"HEAD_URL\"","\"SEX\"","\"STATE\"","\"CREATE_TIME\"","\"CREATE_USER\"","\"LASTUPDATETIME\""],"writeMode": "update(\"ID\",\"USER_ID\")","connection": [{"table": ["USERS"],"jdbcUrl": "jdbc:oracle:thin:@//192.168.157.142:1521/orcl"}]}}}]}
}

注:writeMode的update括号里字段要加上\"

查看运行日志,可以看到已经正确拼接了MERGE语句:
在这里插入图片描述

在这里插入图片描述

任务执行成功。切换 insert 和 update 模式检验,无异常。



更多技术干货,请持续关注程序员大佬超。
原创不易,转载请注明出处。

相关内容

热门资讯

银河麒麟V10SP1高级服务器... 银河麒麟高级服务器操作系统简介: 银河麒麟高级服务器操作系统V10是针对企业级关键业务...
【NI Multisim 14...   目录 序言 一、工具栏 🍊1.“标准”工具栏 🍊 2.视图工具...
不能访问光猫的的管理页面 光猫是现代家庭宽带网络的重要组成部分,它可以提供高速稳定的网络连接。但是,有时候我们会遇到不能访问光...
Android|无法访问或保存... 这个问题可能是由于权限设置不正确导致的。您需要在应用程序清单文件中添加以下代码来请求适当的权限:此外...
AWSECS:访问外部网络时出... 如果您在AWS ECS中部署了应用程序,并且该应用程序需要访问外部网络,但是无法正常访问,可能是因为...
北信源内网安全管理卸载 北信源内网安全管理是一款网络安全管理软件,主要用于保护内网安全。在日常使用过程中,卸载该软件是一种常...
AWSElasticBeans... 在Dockerfile中手动配置nginx反向代理。例如,在Dockerfile中添加以下代码:FR...
AsusVivobook无法开... 首先,我们可以尝试重置BIOS(Basic Input/Output System)来解决这个问题。...
ASM贪吃蛇游戏-解决错误的问... 要解决ASM贪吃蛇游戏中的错误问题,你可以按照以下步骤进行:首先,确定错误的具体表现和问题所在。在贪...
​ToDesk 远程工具安装及... 目录 前言 ToDesk 优势 ToDesk 下载安装 ToDesk 功能展示 文件传输 设备链接 ...