前言:
本文档主要描述通过Oracle Grid Infrastructure Agents (XAG)基于Oracle RAC实现GoldenGate(OGG)软件高可用的实施操作
环境信息:
源端 | 目标端 | ||
节点一IP 节点二IP | 192.168.1.84 192.168.1.86 | 节点一IP 节点二IP | 192.168.1.200 192.168.1.210 |
VIP | 192.168.1.88 | VIP | 192.168.1.204 |
数据库版本 | 11.2.0.4 | 数据库版本 | 12.2.0.1 |
OGG版本 | 19.1.0.0.4 | OGG版本 | 19.1.0.0.4 |
XAG版本 | 10.2 | XAG版本 | 7.1.0 |
EXTRACT进程 | E_BSS_1 | REPLICAT进程 | R_ING_1 |
PUMP进程 | P_BSR_1 |
源端OGG高可用配置:
1 配置vip资源:
--添加vip资源,这里使用集群新建的network 2网络,可以使用默认的network 1,同网段即可
#network is the network number that you want to use.
#ip is the IP address provided by your system administrator for the new Application VIP. This IP address must be in the same subnet as determined above.
#gg_vip_source is the name of the application VIP that you will create.
/u01/app/11.2.0/grid/bin/appvipcfg create -network=2 \-ip=192.168.1.88 \-vipname=gg_vip_source \-user=root--root授权Oracle用户有启动vip的权限
/u01/app/11.2.0/grid/bin/crsctl setperm resource gg_vip_source -u user:oracle:r-x
--Oracle用户启动资源
/u01/app/11.2.0/grid/bin/crsctl start resource gg_vip_source
--验证资源
[oracle@rac1 ~]$ /u01/app/11.2.0/grid/bin/crsctl status resource gg_vip_source
NAME=gg_vip_source
TYPE=app.appvip_net2.type
TARGET=ONLINE
STATE=ONLINE on rac2
2 配置ACFS资源:
--配置ACFS,在节点一操作即可
--添加vol卷
su -grid
ASMCMD> volcreate -G DATA -s 5G acfsvol
--查看acfs生成的卷组
su - grid
ASMCMD> volinfo -G DATA acfsvol
Diskgroup Name: DATA
Volume Name: ACFSVOLVolume Device: /dev/asm/acfsvol-119State: ENABLEDSize (MB): 5120Resize Unit (MB): 32Redundancy: UNPROTStripe Columns: 4Stripe Width (K): 128Usage: Mountpath:
--挂载/ogg 文件系统
--确认asm下面生成acfs卷组
# ls /dev/asm
--创建挂载目录
su - root
mkdir /ogg
--格式划acfs
su - root
# /sbin/mkfs -t acfs /dev/asm/acfsvol-119
--加入集群管理
su - root
/u01/app/11.2.0/grid/bin/srvctl add filesystem -d /dev/asm/acfsvol-119 -g 'DATA' -v ACFSVOL -m /ogg -u oracle
--挂载
/u01/app/11.2.0/grid/bin/srvctl start filesystem -d /dev/asm/acfsvol-119
--查看是否online以及mount
su - grid
crsctl stat res -t |grep -i acfs
ora.data.acfsvol.acfs
ora.registry.acfs
df -h
--如果没有online,再手动mount
su - root
# mount.acfs -o all
--授予目录权限
su - root
# chown oracle.oinstall /ogg
3 安装ogg软件:
--安装ogg软件,要安装在acfs所在的目录/ogg里面
cd fbo_ggs_Linux_x64_shiphome/
cd Disk1/
./runInstaller
4 ogg开启ddl支持:
sqlplus / as sysdba
@marker_setup.sql
@ddl_setup.sql
@role_setup.sql
GRANT GGS_GGSUSER_ROLE TO ogg_owner;
@ddl_enable.sql
@?/rdbms/admin/dbmspool.sql
@ddl_pin ogg_owner
5 配置MGR进程:
--需要注意配置的参数是autostart,因为高可用是依靠mgr去自动启动进程的
GGSCI (rac1) 3> edit params mgr
port 7809
DYNAMICPORTLIST 7940-8100
AUTOSTART ER *
AUTORESTART ER *,RETRIES 10, WAITMINUTES 1, RESETMINUTES 60
PURGEOLDEXTRACTS /ogg/dirdat/*,USECHECKPOINTS,MINKEEPDAYS 5
LAGREPORTHOURS 1
LAGINFOMINUTES 30
LAGCRITICALMINUTES 45
--启动mgr进程
GGSCI (rac1) 5> start mgr
6 源端添加表级附加日志:
--连接数据库
GGSCI>dblogin userid ogg password ogg
--添加表级附加日志
GGSCI>add trandata test.*
--显示日志添加情况
GGSCI>info trandata test.*
7 配置EXTRACT进程:
--添加extract进程
add extract E_BSS_1 tranlog,begin now,threads 2
add exttrail ./dirdat/es,extract E_BSS_1,megabytes 1000
--需要注意的是要通过tns去连接数据库,而不是ORACLE_SID方式,并且连接的tns需要可以访问到全部节点,最好连scan_ip
--配置ogg连接数据库的tns
ORCL =(DESCRIPTION =(ADDRESS = (PROTOCOL = TCP)(HOST = rac-scan)(PORT = 1521))(CONNECT_DATA =(SERVER = DEDICATED)(SERVICE_NAME = orcl)))
--配置extract进程
GGSCI (rac1) 3> edit params e_bss_1
extract E_BSS_1
SETENV (ORACLE_HOME = "/u01/app/oracle/product/11.2.0/db_1")
SETENV (NLS_LANG = "AMERICAN_AMERICA.ZHS16GBK")
userid ogg@orcl, password ogg
exttrail ./dirdat/es
gettruncates
TRANLOGOPTIONS INCLUDEREGIONID,EXCLUDEUSER ogg
TRANLOGOPTIONS BUFSIZE 2048000
TRANLOGOPTIONS DBLOGREADER,DBLOGREADERBUFSIZE 2048000
DISCARDFILE ./dirrpt/e_bss_1.dsc,APPEND,MEGABYTES 1000
DISCARDROLLOVER AT 6:00
REPORTROLLOVER AT 6:00
REPORTCOUNT EVERY 1 HOURS,RATE
DDL INCLUDE MAPPED , OBJTYPE 'TABLE' &
INCLUDE MAPPED OBJTYPE 'INDEX'
DDLOPTIONS ADDTRANDATA RETRYOP RETRYDELAY 10 MAXRETRIES 10
DDLOPTIONS REPORT
FETCHOPTIONS MISSINGROW ABEND
STATOPTIONS REPORTFETCH
WARNLONGTRANS 1H,CHECKINTERVAL 10m
DYNAMICRESOLUTION
TABLE test.test;
8 配置PUMP进程:
--添加投递进程
add extract P_BSR_1,exttrailsource ./dirdat/es
add RMTTRAIL ./dirdat/rs,ext P_BSR_1,megabytes 1000
--需要注意的是这里远程的IP是目标端的VIP,这样才能在failover之后,依然可以正常投递
--配置投递进程参数
GGSCI (rac1) 3> edit params P_BSR_1
extract P_BSR_1
rmthost 192.168.1.204, mgrport 7809
rmttrail ./dirdat/rs
passthru
DISCARDFILE ./dirrpt/p_bsr_1.dsc,APPEND,MEGABYTES 1000
DISCARDROLLOVER AT 6:00
REPORTROLLOVER AT 6:00
REPORTCOUNT EVERY 1 HOURS,RATE
TABLE TEST.TEST;
9 添加源端OGG高可用服务实例:
++++++++++++++++++++++++++++++++++++++++++++++++参数说明
agctl add goldengate --help
Adds Goldengate instance to Oracle Clusterware. Instance name--gg_home GoldenGate home directory--instance_type {source | target | dual}--server_pool Name of the Server pool where instance can be run--nodes [,...] List of nodes where instance can be run--vip_name Virtual IP name--network Network number--ip IP address--user Operating System user name that owns the instance--group Name of the group to which the Operating System user belongs--oracle_home ORACLE_HOME location--databases [,...] List of database instance dependencies--db_services [,...] List of database service dependencies--filesystems [,...] List of file system resource dependencies--attribute =[,...] Attributes that can be applied--environment_vars [,...] Additional environment variables to set--monitor_extracts [,...] List of EXTRACT groups to monitor--monitor_replicats [,...] List of REPLICAT groups to monitor--critical_extracts [,...] List of critical EXTRACT groups to monitor and failover--critical_replicats [,...] List of critical REPLICAT groups to monitor and failover--dataguard_autostart Start GoldenGate on DataGuard role transition to PRIMARY--jagent_autostart Start JAgent on GoldenGate startup
--agctl添加高可用source_ogg实例,oracle用户执行以及管理
--注意这里源端VIP不是必需的,因为源端不需要用到VIP
--注意这里的database也可以不用设置,这里的设置主要是强关联database,这样当节点的database关闭会触发ogg failover
agctl add goldengate source_ogg --gg_home /ogg \--instance_type source \--nodes rac1,rac2 \--vip_name gg_vip_source \--filesystems ora.data.acfsvol.acfs\--databases ora.orcl.db \--oracle_home /u01/app/oracle/product/11.2.0/db_1 \--monitor_extracts E_BSS_1,P_BSR_1
--查看实例配置
[oracle@rac1 ogg]$ agctl config goldengate source_ogg
Instance name: source_ogg
Application GoldenGate location is: /ogg
Goldengate MicroServices Architecture environment: no
GoldenGate instance type is: source
EXTRACT groups to monitor: E_BSS_1,P_BSR_1
REPLICAT groups to monitor:
Critical EXTRACT groups:
Critical REPLICAT groups:
Autostart on DataGuard role transition to PRIMARY: no
Autostart JAgent: no
Configured to run on Nodes: rac1 rac2
ORACLE_HOME location is: /u01/app/oracle/product/11.2.0/db_1
Databases needed: ora.orcl.db
File System resources needed: ora.data.acfsvol.acfs
VIP name: gg_vip_source
10 启动源端OGG高可用服务实例:
--开启source_ogg实例,oracle用户执行
[oracle@rac1 ogg]$ agctl start goldengate source_ogg --node rac1
--会把mgr进程启动,mgr会把extract进程,pump进程启动
[oracle@rac1 ogg]$ ./ggsci
Oracle GoldenGate Command Interpreter for Oracle
Version 19.1.0.0.4 OGGCORE_19.1.0.0.0_PLATFORMS_191017.1054_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Oct 17 2019 23:13:12
Operating system character set identified as UTF-8.
Copyright (C) 1995, 2019, Oracle and/or its affiliates. All rights reserved.
GGSCI (rac1) 1> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING E_BSS_1 00:00:00 00:00:03
EXTRACT RUNNING P_BSR_1 00:00:00 00:00:00
--查看集群服务状态,当前服务在节点一
crsctl stat res -t
Cluster Resources
--------------------------------------------------------------------------------
gg_vip_source1 ONLINE ONLINE rac1
ora.LISTENER_SCAN1.lsnr1 ONLINE ONLINE rac1
ora.bkrac1-vip.vip1 ONLINE ONLINE rac1
ora.bkrac2-vip.vip1 ONLINE ONLINE rac2
ora.cvu1 ONLINE ONLINE rac1
ora.oc4j1 ONLINE ONLINE rac2
ora.orcl.db1 ONLINE ONLINE rac1 Open 2 ONLINE ONLINE rac2 Open
ora.rac1.vip1 ONLINE ONLINE rac1
ora.rac2.vip1 ONLINE ONLINE rac2
ora.scan1.vip1 ONLINE ONLINE rac1
xag.source_ogg.goldengate1 ONLINE ONLINE rac1
GGSCI (rac1) 2>
--关闭source_ogg实例,用oracle用户执行
agctl stop goldengate source_ogg
--手动切换source_ogg实例到去其他节点,用oracle用户执行
agctl relocate goldengate source_ogg --node rac2
目标端OGG高可用配置:
1 配置VIP,ACFS,OGG软件安装参照源端配置即可
2 配置MGR进程:
--需要注意配置的参数是autostart,因为高可用是依靠mgr去自动启动进程的
GGSCI (rac1) 3> edit params mgr
port 7809
DYNAMICPORTLIST 7940-8100
AUTOSTART ER *
AUTORESTART ER *,RETRIES 10, WAITMINUTES 1, RESETMINUTES 60
PURGEOLDEXTRACTS /ogg/dirdat/*,USECHECKPOINTS,MINKEEPDAYS 5
LAGREPORTHOURS 1
LAGINFOMINUTES 30
LAGCRITICALMINUTES 45
--启动mgr进程
GGSCI (rac1) 5> start mgr
3 配置replicat应用进程:
--创建检查表
GGSCI> dblogin userid ogg password ogg
GGSCI > add checkpointtable ogg.rep_ogg_ckpt
--添加应用进程
GGSCI> add replicat R_ING_1,exttrail ./dirdat/rs,checkpointtable ogg.rep_ogg_ckpt
--需要注意的是要通过tns去连接数据库,而不是ORACLE_SID方式,并且连接的tns需要可以访问到全部节点,最好连scan_ip
--配置ogg连接数据库的tns
ORCL =(DESCRIPTION =(ADDRESS = (PROTOCOL = TCP)(HOST = rac-scan)(PORT = 1521))(CONNECT_DATA =(SERVER = DEDICATED)(SERVICE_NAME = orcl)))
--配置应用进程
GGSCI (rac1) 3> edit params r_ing_1
REPLICAT R_ING_1
SETENV (ORACLE_HOME = "/oracle/app/oracle/product/12.2/db_1")
SETENV (NLS_LANG = "AMERICAN_AMERICA.AL32UTF8")
userid ogg@orcl, password ogg
ASSUMETARGETDEFS
DBOPTIONS DEFERREFCONST
DBOPTIONS SUPPRESSTRIGGERS
DISCARDFILE ./dirrpt/r_ing_1.dsc, APPEND, MEGABYTES 1000
DISCARDROLLOVER AT 6:00
REPERROR (DEFAULT, ABEND)
REPERROR (24344, DISCARD)
gettruncates
DDL INCLUDE MAPPED , OBJTYPE 'TABLE' &
INCLUDE MAPPED OBJTYPE 'INDEX'
DDLOPTIONS REPORT
4 添加目标端OGG高可用服务实例:
--agctl添加高可用target_ogg实例,用oracle用户执行以及管理
--注意这里的database也可以不用设置,这里的设置主要是强关联database,这样当节点的database关闭会触发ogg failover
agctl add goldengate target_ogg --gg_home /ogg \--instance_type target \--nodes rac1,rac2 \--vip_name gg_vip_source \--filesystems ora.mgmt.acfsvol.acfs \--databases ora.orcl.db \--oracle_home /oracle/app/oracle/product/12.2/db_1 \
--monitor_replicats R_ING_1
--查看实例配置
[oracle@rac1 ogg]$ agctl config goldengate target_ogg
GoldenGate location is: /ogg
GoldenGate instance type is: target
Configured to run on Nodes: rac1 rac2
ORACLE_HOME location is: /oracle/app/oracle/product/12.2/db_1
Databases needed: ora.orcl.db
File System resources needed: ora.mgmt.acfsvol.acfs
VIP name: gg_vip_source
EXTRACT groups to monitor: E_BSS_1,P_BSR_1
REPLICAT groups to monitor: R_ING_1
Critical EXTRACT groups:
Critical REPLICAT groups:
Autostart on DataGuard role transition to PRIMARY: no
Autostart JAgent: no
5 启动目标端OGG高可用服务实例:
--开启source_ogg实例,用oracle用户执行
[oracle@rac1 ogg]$ agctl start goldengate target_ogg --node rac1
--会把mgr进程启动,mgr会把extract进程,pump进程启动
[oracle@rac1 ogg]$ ./ggsci
[oracle@rac1 ogg]$ ./ggsci
Oracle GoldenGate Command Interpreter for Oracle
Version 19.1.0.0.4 OGGCORE_19.1.0.0.0_PLATFORMS_191017.1054_FBO
Linux, x64, 64bit (optimized), Oracle 12c on Oct 18 2019 01:38:51
Operating system character set identified as UTF-8.
Copyright (C) 1995, 2019, Oracle and/or its affiliates. All rights reserved.
GGSCI (rac1) 1> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
REPLICAT RUNNING R_ING_1 00:00:00 00:00:00
--查看集群服务状态,当前服务在节点一
crsctl stat res -t
Cluster Resources
--------------------------------------------------------------------------------
gg_vip_source1 ONLINE ONLINE rac1 STABLE
ora.LISTENER_SCAN1.lsnr1 ONLINE ONLINE rac2 STABLE
ora.MGMTLSNR1 ONLINE ONLINE rac1 169.254.179.17 192.168.2.10,STABLE
ora.asm1 ONLINE ONLINE rac1 Started,STABLE2 ONLINE ONLINE rac2 Started,STABLE3 OFFLINE OFFLINE STABLE
ora.cvu1 ONLINE ONLINE rac2 STABLE
ora.mgmtdb1 ONLINE ONLINE rac1 Open,STABLE
ora.orcl.db1 ONLINE ONLINE rac1 Open,HOME=/oracle/app/oracle/product/12.2/db_1,STABLE2 ONLINE ONLINE rac2 Open,HOME=/oracle/app/oracle/product/12.2/db_1,STABLE
ora.qosmserver1 ONLINE ONLINE rac2 STABLE
ora.rac1.vip1 ONLINE ONLINE rac1 STABLE
ora.rac2.vip1 ONLINE ONLINE rac2 STABLE
ora.scan1.vip1 ONLINE ONLINE rac2 STABLE
xag.target_ogg.goldengate1 ONLINE ONLINE rac1 STABLE
--------------------------------------------------------------------------------
GGSCI (rac1) 2>
--关闭source_ogg实例,用oracle用户执行
agctl stop goldengate target_ogg
--手动切换source_ogg实例到去其他节点,用oracle用户执行
agctl relocate goldengate target_ogg --node rac2
测试OGG高可用配置:
1 源端OGG节点主机重启:
--在源端开启会话,模拟表操作
declarev_count number(10):=0;
beginfor i in 1..1000000 loopinsert into test.test select * from dba_objects where rownum<4;delete from test.test where rownum<2;update test.test set owner='aaa'where rownum<2;v_count:=v_count+1;if v_count>=1 THENcommit;v_count:=0;end if;end loop;commit;
end;
/
--当前源端ogg所在节点主机rac1
[oracle@rac1 ~]$ agctl status goldengate source_ogg
Goldengate instance 'source_ogg' is running on rac1
--重启主机rac1
--ogg实例服务会自动转移到节点二rac2
[oracle@rac2 ogg]$ agctl status goldengate source_ogg
Goldengate instance 'source_ogg' is running on rac2
[oracle@rac2 ogg]$ ./ggsci
Oracle GoldenGate Command Interpreter for Oracle
Version 19.1.0.0.4 OGGCORE_19.1.0.0.0_PLATFORMS_191017.1054_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Oct 17 2019 23:13:12
Operating system character set identified as UTF-8.
Copyright (C) 1995, 2019, Oracle and/or its affiliates. All rights reserved.
GGSCI (rac2) 1> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING E_BSS_1 00:00:02 00:00:10
EXTRACT RUNNING P_BSR_1 00:00:00 00:00:02
GGSCI (rac2) 2>
--目标端进程无感知
--源端和目标端两边的数据同步一致
2 目标端OGG节点主机重启:
--当前目标端ogg所在节点主机rac1
[oracle@rac1 ogg]$ agctl status goldengate target_ogg
Goldengate instance 'target_ogg' is running on rac1
--重启主机rac1
--ogg实例服务会自动转移到节点二rac2
[oracle@rac2 ~]$ agctl status goldengate target_ogg
Goldengate instance 'target_ogg' is running on rac2
GGSCI (rac2) 1> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
REPLICAT RUNNING R_ING_1 00:00:00 00:00:08
GGSCI (rac2) 2>
--源端的投递进程会短暂abend,因为vip要从目标节点一漂移到节点二,所以会短暂的连接不上,但由于配置了autorestart参数,所以mgr进程会尝试重新start投递进程
GGSCI (rac2) 11> info all
--出现abend
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING E_BSS_1 00:00:02 00:00:06
EXTRACT ABENDED P_BSR_1 00:00:00 00:01:39
--自己恢复正常
GGSCI (rac2) 12> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING E_BSS_1 00:00:02 00:00:00
EXTRACT RUNNING P_BSR_1 00:00:00 00:00:04
--源端和目标端两边的数据同步一致
source:
SQL> select count(*) from test.test;
COUNT(*)
----------294962
SQL>
target:
select count(*) from test1.test
COUNT(*)
----------294962