diff --git a/.vscode/settings.json b/.vscode/settings.json new file mode 100644 index 0000000000000000000000000000000000000000..9d14cfb2c108e9f0823938972fa2ac5c6d0ce9d0 --- /dev/null +++ b/.vscode/settings.json @@ -0,0 +1,3 @@ +{ + "ansible.python.interpreterPath": "/bin/python" +} \ No newline at end of file diff --git a/README.md b/README.md index 5d676c1ce816bf1fd3dae932036eab0a689e18db..10f21f19110375fe1cd92a7ef24c90226e44a355 100644 --- a/README.md +++ b/README.md @@ -25,6 +25,7 @@ 1. 暂不支持 DCF 模式的集群部署。 1. 暂不支持离线部署。 1. 暂不支持多地容灾部署。 +1. 暂不提供部署后的数据库或数据库用户的创建流程。 # 优势特点 diff --git a/ansible.cfg b/ansible.cfg index 89309017093754ab31ae8dac775e03934c8d30f4..47aa338d93d57019d4b6540ab76594630b48feee 100644 --- a/ansible.cfg +++ b/ansible.cfg @@ -10,7 +10,7 @@ become = True host_key_checking = False deprecation_warnings = False callback_whitelist = profile_tasks,timer,profile_roles -; display_skipped_hosts = False +display_skipped_hosts = False stdout_callback = yaml [callback_log_plays] diff --git a/docs/00-how-to.md b/docs/00-how-to.md index b87f6e608a18192318009b52aa4fb5dc9485614a..904da7e2525a63e58ba12edad5d6efcfdaac09fb 100644 --- a/docs/00-how-to.md +++ b/docs/00-how-to.md @@ -9,18 +9,18 @@ master 组仅可以配置 1 台机器。follower 可以多台。cascade 可选可为空。 ``` -[opengauss_master] +[opengauss_primary] 192.168.56.11 -[opengauss_follower] +[opengauss_standby] 192.168.56.12 [opengauss_cascade] 192.168.56.13 [opengauss:children] -opengauss_master -opengauss_follower +opengauss_primary +opengauss_standby opengauss_cascade ``` diff --git a/docs/01-ansible-in-docker.md b/docs/01-ansible-in-docker.md index 1983441e5e5956c579c27104f9311b4b796361d8..84af9fe9fb64b2c54fde79c74869752fbbb58f61 100644 --- a/docs/01-ansible-in-docker.md +++ b/docs/01-ansible-in-docker.md @@ -15,6 +15,6 @@ docker exec -it ansible-for-opengauss byobu # 声明 -我个人比较喜爱使用 fish 作为默认的 shell,以及使用 byobu 作为单窗口多终端的工作环境。如果您不喜欢,可以自行修改 Dockerfile 的内容。 +我个人比较喜爱使用 [fish](https://fishshell.com/) 作为默认的 shell,以及使用 
[byobu](https://www.byobu.org/documentation) 作为单窗口多终端的工作环境。如果您不喜欢,可以自行修改 Dockerfile 的内容。 -接下来你可以阅读[详细配置](02-pre-set.md),仅需按分组编排服务器节点的角色,即可开始全自动化部署。 \ No newline at end of file +接下来你可以阅读[详细配置](02-pre-set.md),仅需按分组编排服务器节点的角色,即可开始全自动化部署。 diff --git a/docs/02-pre-set.md b/docs/02-pre-set.md index 43c4380b83941e2bd0e630f38944356efa6eda15..7ac1ddf8d81d25174c407f36da4935804e42e4d2 100644 --- a/docs/02-pre-set.md +++ b/docs/02-pre-set.md @@ -6,11 +6,11 @@ ``` ; 主服务器组,仅设置 1 个目标机。 -[opengauss_master] +[opengauss_primary] 192.168.56.11 ; 从服务器组,可设置若干个或留空。 -[opengauss_follower] +[opengauss_standby] 192.168.56.12 ; 级联服务器组,可设置若干个或留空。 @@ -19,8 +19,8 @@ ; 以上 3 个分组的合并组,勿动。 [opengauss:children] -opengauss_master -opengauss_follower +opengauss_primary +opengauss_standby opengauss_cascade ; 机器的 SSH 信息,请根据你的实际情况修改。 @@ -33,47 +33,51 @@ ansible_ssh_pass=vagrant ansible_ssh_port=22 ``` -## 修改默认运行值。 +## 根据部署需求,修改变量文件。 -本项目的默认配置参数,存放在 `/workdir/roles/opengauss/defaults/main.yml`,你可以参考这个文件的内容,根据实际需要做一些定制。 +首先拷贝默认变量配置文件 -***但不建议直接修改它,以考虑对不同的服务器仓库进行维护。*** +``` +cp /workdir/roles/opengauss/defaults/main.yml /workdir/inventories/opengauss/group_vars/all.yml +``` -建议的自定义方法,是将 `/workdir/roles/opengauss/defaults/main.yml` 拷贝到 `/workdir/inventories/opengauss/group_vars/opengauss.yml`,再进行编辑。 +然后使用你习惯的编辑器修改 `/workdir/inventories/opengauss/group_vars/all.yml` 里的内容。 -部分变量是可以替换或扩展的。例如 +内容可以按官方文档的要求、建议,或你的实际需求,进行增删改,保留数据结构即可。 + +譬如目标机器是有额外数据盘的,挂载在 `/opengauss_data` 路径下,那么你可以修改 ``` -# Sysctl 的配置,可自行扩展。 -opengauss_sysctl: - net.ipv4.tcp_retries1: 5 - net.ipv4.tcp_syn_retries: 5 +# 安装目录 +opengauss_home: /opt/openGauss ``` -你可以改成 +为 ``` -# Sysctl 的配置,可自行扩展。 -opengauss_sysctl: - net.ipv4.tcp_retries1: 5 - net.ipv4.tcp_syn_retries: 3 - net.ipv4.tcp_synack_retries: 5 +# 安装目录 +opengauss_home: /openGauss ``` -通过 `roles/pre-tasks/tasks/vars_combine.yml` 的处理后,我们可以得到一组合并后的变量 +我们的脚本会通过 `/workdir/roles/pre-tasks/tasks/vars_combine.yml` 的处理后,替换默认变量并放置在 `combined_vars` 数组内。 ``` combined_vars: - opengauss_sysctl: - 
net.ipv4.tcp_retries1: 5 - net.ipv4.tcp_syn_retries: 3 - net.ipv4.tcp_synack_retries: 5 + opengauss_home: /openGauss ``` 整个部署任务,都会大量使用 combined_vars 里的变量。 # 使用自定义的 cluster_config.xml -如果你需要手动定制集群,这里也是支持的,只需要把写好的 `cluster_config.xml` 改名为 `cluster_config.xml.j2`,存放到 `/workdir/inventories/opengauss/templates/cluster_config.xml.j2`,部署时会优先使用你的自定义配置。 +如果你需要手动定制集群,这里也是支持的。 + +前提: + + 1. 确保 `/workdir/inventories/opengauss/hosts.ini` 的角色编排,和你的 `cluster_config.xml` 内容一致。 + + 2. 已有节点的编号顺序,在 `cluster_config.xml` 需要严格保持一致。 + +然后把写好的 `cluster_config.xml` 改名为 `cluster_config.xml.j2`,存放到 `/workdir/inventories/opengauss/templates/cluster_config.xml.j2`,部署时会优先使用你的自定义配置。 -接下来就可以[开始部署](03-deploy.md) \ No newline at end of file +接下来就可以 [开始部署](03-deploy.md)。 \ No newline at end of file diff --git a/docs/03-deploy.md b/docs/03-deploy.md index 73aac7be2edd71098b3d3eb49eccb8642d54f70d..fc1509d24536ada677d539fe11f295f90afc8776 100644 --- a/docs/03-deploy.md +++ b/docs/03-deploy.md @@ -31,6 +31,12 @@ 这里的 `pansible` 是我预置的命令别名,对应的是 `ansible-playbook`。 + 同时,你可以登入到主节点,切换到 root 去查看部署过程中的日志,了解进度。 + + ``` + tail -n100 -f /var/log/omm/**/*.log + ``` + 1. 
部署过程中自动生成的公私钥,以及账号密码,存放在 `/workdir/inventories/opengauss/credentials` ``` @@ -57,16 +63,16 @@ 对应的 hosts.ini 分组编排内容 ``` -[opengauss_master] +[opengauss_primary] 192.168.56.11 -[opengauss_follower] +[opengauss_standby] [opengauss_cascade] [opengauss:children] -opengauss_master -opengauss_follower +opengauss_primary +opengauss_standby opengauss_cascade ``` @@ -77,17 +83,17 @@ opengauss_cascade 对应的 hosts.ini 分组编排内容 ``` -[opengauss_master] +[opengauss_primary] 192.168.56.12 -[opengauss_follower] +[opengauss_standby] 192.168.56.13 [opengauss_cascade] [opengauss:children] -opengauss_master -opengauss_follower +opengauss_primary +opengauss_standby opengauss_cascade ``` @@ -98,18 +104,18 @@ opengauss_cascade 对应的 hosts.ini 分组编排内容 ``` -[opengauss_master] +[opengauss_primary] 192.168.56.14 -[opengauss_follower] +[opengauss_standby] 192.168.56.15 [opengauss_cascade] 192.168.56.16 [opengauss:children] -opengauss_master -opengauss_follower +opengauss_primary +opengauss_standby opengauss_cascade ``` @@ -120,18 +126,18 @@ opengauss_cascade 对应的 hosts.ini 分组编排内容 ``` -[opengauss_master] +[opengauss_primary] 192.168.56.17 -[opengauss_follower] +[opengauss_standby] 192.168.56.18 192.168.56.19 [opengauss_cascade] [opengauss:children] -opengauss_master -opengauss_follower +opengauss_primary +opengauss_standby opengauss_cascade ``` diff --git a/docs/04-expansion.md b/docs/04-expansion.md index 5a9807ff070b2d7443e3b916080a0d52f4d99732..06ef8c4bf138932a45bfcf0329f07fceeafbc39a 100644 --- a/docs/04-expansion.md +++ b/docs/04-expansion.md @@ -12,19 +12,19 @@ ### 请注意: -### 如果是对 1 主 1 备进行扩容,会增加一些部署 Cluster Manager 的流程,请确保扩容前没有数据库读写操作。 +### 如果是对 1 主 1 备进行扩容,会增加一些部署 Cluster Manager 的流程,耗时较长。请确保扩容前没有数据库读写操作。 -### 目前已支持从任意节点数量开始往上扩容。 +### 扩容级联(Cascade)节点的前提是需要有备机(Standby)节点,即 `opengauss_standby` 组必须有分配机器。 假设原编排为 1 主 1 备。 ``` ; 主服务器组,仅设置 1 个目标机。 -[opengauss_master] +[opengauss_primary] 192.168.56.11 ; 从服务器组,可设置若干个或留空。 -[opengauss_follower] +[opengauss_standby] 192.168.56.12 ``` @@ -32,11 +32,11 @@ 
``` ; 主服务器组,仅设置 1 个目标机。 -[opengauss_master] +[opengauss_primary] 192.168.56.11 ; 从服务器组,可设置若干个或留空。 -[opengauss_follower] +[opengauss_standby] 192.168.56.12 192.168.56.14 192.168.56.16 @@ -52,4 +52,10 @@ 再次执行 `pansible 01-deploy.yml`。 +同时,你可以登入到主节点,切换到 root 去查看部署过程中的日志,了解进度。 + +``` +tail -n100 -f /var/log/omm/**/*.log +``` + ![扩容结果](imgs/23-10-13_1155_909.png) \ No newline at end of file diff --git a/docs/imgs/intro_000.png b/docs/imgs/intro_000.png new file mode 100644 index 0000000000000000000000000000000000000000..0f26fbab403872cf5fcd6860f73899b2bd230386 Binary files /dev/null and b/docs/imgs/intro_000.png differ diff --git a/docs/imgs/intro_002.png b/docs/imgs/intro_002.png new file mode 100644 index 0000000000000000000000000000000000000000..728c22bfbc2d328c10cdd54cd0b0a508a7309819 Binary files /dev/null and b/docs/imgs/intro_002.png differ diff --git a/docs/imgs/intro_003.png b/docs/imgs/intro_003.png new file mode 100644 index 0000000000000000000000000000000000000000..af450a6b08578216feff379125b53fd7f0fd1cc2 Binary files /dev/null and b/docs/imgs/intro_003.png differ diff --git a/docs/imgs/intro_report.png b/docs/imgs/intro_report.png new file mode 100644 index 0000000000000000000000000000000000000000..3033893e0127f381bb19c14ad0ac46e4e85723ed Binary files /dev/null and b/docs/imgs/intro_report.png differ diff --git a/docs/intro.md b/docs/intro.md new file mode 100644 index 0000000000000000000000000000000000000000..8d072b247c334ffbb757aa7764ffb2b2aed38ac0 --- /dev/null +++ b/docs/intro.md @@ -0,0 +1,234 @@ +大家好,今天我们为大家推荐一套基于 Ansible 开发的,自动化部署及扩容 openGauss 的脚本工具:Ansible for openGauss(以下简称 AFO)。 + +通过它,我们只需简单修改一些配置文件,即可快速部署多种架构模式的 openGauss,以及对已有架构进行自动化扩容。下面我们就请这套工具的贡献者,上海联空网络技术有限公司(以下简称“联空网络”)的李海滨,给大家讲解它的设计理念和优点。 + +# 开发背景 + +Hi,大家好,我是来自联空网络品质安全中心的运维工程师,李海滨。 + +我们联空网络是一家专注于互联网医疗领域的公司,国内多家百强医院都是我们公司的客户。面对日益增长的国产信创需求,我们的医院客户开始关注我们联空的软硬件产品,是否能与国产信创产品适配。为此我们积极响应,投入专业团队,对相关信创软硬件做可行性研判。 + +在深入了解国产数据库的过程中,我们接触到了其中一款产品:海量数据库。在向海量数据的工程师们请教后,得知它的上游是开源数据库
openGauss。openGauss 是一款开源的关系型数据库管理系统,具有高性能、高可用性以及卓越的扩展能力。于是我构想,我们可以为研发团队提供 openGauss 环境,让他们基于 openGauss 做代码适配。这样,我们的软件也就可以同样适配海量数据库了。 + +为了方便反复部署测试,我拿出了擅长的 Ansible,为 openGauss 写了一套自动化部署工具,以简化其安装、配置和管理过程。 + +# 解决部署痛点 + +如果你曾按照官方文档部署过一套 openGauss,你会发现不论是单点还是集群,其实还需要做不少的前期工作。例如要根据 CPU 和操作系统,下载对应版本的安装包。又需要根据不同的 Linux 操作系统,做一些额外配置。手工部署在这里不仅低效,而且容易有错漏。如果是多节点的部署,手工部署的弱势会被进一步放大。 + +我开发这套 Ansible 脚本的目标,就是尽可能地覆盖部署前、部署中和部署后的场景,并且把手工部署过程中遇到的一些坑,也通过自动化来解决掉。 + +例如在 openEuler 20.03 系统里部署 openGauss 5.0,你会遇到 readline-devel 这个依赖包的版本是 8,而 openGauss 5.0 需要的是 libreadline.so.7,从而出现报错。我查找到解决方法后,加入到部署流程中,自动帮大家把这个坑给填了。 + +总结下来,目前我们这个工具能实现以下功能: + +1. 提供一个专属的 ansible-docker 目录,只要控制机可运行 docker,即可运行一个 Ansible 容器,适配 AFO 的执行。 +1. 自动适配 CPU 架构和操作系统(已支持 CentOS 和 openEuler 20.03),自动下载对应的 openGauss 安装包,自动对操作系统做适配修改。 +1. 实现从单节点到多节点的多种架构模式的一次性部署。 +1. 使用本工具部署的单点或集群,还能通过添加服务器,再次运行实现平滑扩容。实测可直接从单主扩容到 1 主 2 备 2 级联。 +1. 允许用户自定义一些变量,例如指定部署目录,指定 openGauss 版本(5.0 或以上),指定 sysctl 的参数配置。 +1. 免除手工部署中的问答环节,自动生成相关密码,自动填写。最后生成部署报告。 + +# 效率是如何提升的?
+ +#### 以配置网卡 MTU 为例,3 台服务器的执行时间如下: + +``` +Saturday 18 November 2023 20:54:50 +0800 (0:00:06.777) 0:01:03.288 ***** + +TASK [opengauss : Config MTU in /etc/sysconfig/network-scripts/ifcfg-enp0s8] ***************************************************************************************************************** +changed: [192.168.56.11] +changed: [192.168.56.14] +changed: [192.168.56.15] +Saturday 18 November 2023 20:54:50 +0800 (0:00:00.517) 0:01:03.807 ***** +``` + +上一个任务的结束时间是 1 分 3 秒 288,Ansible 在不到 0.6 秒的时间里完成了 3 台服务器的网卡 MTU 修改。因为,它是并行操作的。 + +#### 以生成 cluster_config.xml 为例 + +3 台服务器的节点,手工写 cluster_config.xml,需要花费多少时间,各位可以自己计时看看。 + +而我们利用 Ansible 的模板功能,可以在 1 秒内生成该文件,编排非常清晰,一目了然。 + +从这么一个服务器分组编排文件 `inventories/opengauss/hosts.ini` + +``` +[opengauss_primary] +192.168.56.11 + +; 备机,可设置若干个或留空。 +[opengauss_standby] +192.168.56.15 + +; 级联机,可设置若干个或留空。前提是 opengauss_standby 组不为空。 +[opengauss_cascade] +192.168.56.14 + +; 以上 3 个分组的合并组,勿动。 +[opengauss:children] +opengauss_primary +opengauss_standby +opengauss_cascade + +; 备节点分组,总数不可大于 8。 +[opengauss_replicas:children] +opengauss_standby +opengauss_cascade + +; 机器的 SSH 信息,请根据你的实际情况修改。 +[opengauss:vars] +; ssh 用户名,如果不是 root 用户,请确保它有 sudo 权限。 +ansible_ssh_user=vagrant +; ssh 密码 +ansible_ssh_pass=vagrant +; ssh 端口 +ansible_ssh_port=22 + +``` + +生成以下配置内容,仅需 1 秒钟。 + +``` + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +``` + +我们甚至考虑到了在机房里使用小尺寸显示器查看 cluster_config.xml 的场景,对 xml 做了换行处理,避免出现单行过长的问题。 + +# 自动化思路 + +AFO 主要的工作流如下: + +1. 对部署环境的信息进行采集。 +1. 将用户自定义的变量和脚本默认变量合并,优先使用自定义变量。 +1. 检查是否已部署 openGauss,如无,进入初次部署流程。 +1. 如果已部署 openGauss,则检查用户是否有添加新节点,进入节点扩容流程。 +1. 节点扩容流程里又分 2 步,先扩容备机节点,再扩容级联节点。因为级联节点只连接备机。 +1. 如果集群节点数量为 3 或以上,且未部署 CM,则进入 CM 部署流程。 +1. 
最后,检查确认集群已正常运行,生成部署报告。 + +# 结果展示 + +以 5 台 VirtualBox 虚拟机为例,统一 8U+16GB 配置,都在 1 块物理 SATA 盘上读写。 + +另外,虚拟机都已经提前安装好相关依赖包,openGauss 安装包也已经下载到本地。排除网络下载速度的不确定因素。 + + +#### 单节点,4 分 40 秒(下图右下角)完成部署。 + +![1p](imgs/intro_000.png) + +#### 1 主 1 备 1 级联,11 分 07 秒(下图右下角)完成部署。 + +![1p1s1c](imgs/intro_002.png) + +#### 从 1 主,扩容为 1 主 1 备 1 级联,23 分(下图右下角)完成部署。 + +扩容模式需要额外的流程,因此耗时比直接部署 3 节点的要多。 + +![1p1s1c](imgs/intro_003.png) + +最后生成部署报告 + +![report](imgs/intro_report.png) + +# 项目 git 库地址 + +这套工具已经收录在 openGauss 社区的官方代码库,欢迎大家前往下载试用,并向我们多多提出宝贵意见。 + +### https://gitee.com/opengauss/ansible-for-opengauss + +# 后话 + +上海联空网络科技有限公司,致力于为医疗行业提供全面、高效的互联网解决方案。作为众多知名医院的软件供应商,我们凭借丰富的行业经验和卓越的技术实力,赢得了广泛的客户认可和口碑。 + +我们的团队由一批经验丰富、技术精湛的专家组成,他们在互联网医疗领域具有深厚的积累和专业的知识。通过不断创新和完善,我们为医院客户提供了一系列优质、高效的软件服务,帮助医疗机构实现数字化转型和信息化升级,提高医疗质量和效率。 + +未来,我们也将继续投入更多资源,为 openGauss 等开源技术贡献力量,同时为医疗行业提供更多创新、高效的解决方案,推动互联网医疗的进一步发展。感谢您的关注和支持,期待与您共同探讨互联网医疗的未来发展。 \ No newline at end of file diff --git a/inventories/opengauss/cluster_config.xml b/inventories/opengauss/cluster_config.xml deleted file mode 100644 index 7964169725ae199409803f5c3bc70cc4c4f292a6..0000000000000000000000000000000000000000 --- a/inventories/opengauss/cluster_config.xml +++ /dev/null @@ -1,88 +0,0 @@ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - diff --git a/inventories/opengauss/group_vars/opengauss.yml b/inventories/opengauss/group_vars/opengauss.yml deleted file mode 100644 index 036acd3ca43e75762212e2d1dabefd9f0307e22e..0000000000000000000000000000000000000000 --- a/inventories/opengauss/group_vars/opengauss.yml +++ /dev/null @@ -1,5 +0,0 @@ -# ansible_python_interpreter: "python3" - -opengauss_version: 5.0.0 - -# global_pkg_mirror: https://mirrors.aliyun.com diff --git a/inventories/opengauss/hosts.ini b/inventories/opengauss/hosts.ini index 3a59a1e3aa49abe0ffa0ea81faa9740fd9bcf580..ad3e43775f1a6e3f03f0f8608541254265af0e87 100644 --- 
a/inventories/opengauss/hosts.ini +++ b/inventories/opengauss/hosts.ini @@ -1,24 +1,26 @@ ; 主机,仅设置 1 个目标机。 -[opengauss_master] +[opengauss_primary] 192.168.56.11 ; 备机,可设置若干个或留空。 -[opengauss_follower] -192.168.56.12 +[opengauss_standby] +192.168.56.14 +;192.168.56.15 -; 级联机,可设置若干个或留空。 +; 级联机,可设置若干个或留空。前提是 opengauss_standby 组不为空。 [opengauss_cascade] -;192.168.56.13 +;192.168.56.11 +192.168.56.12 ; 以上 3 个分组的合并组,勿动。 [opengauss:children] -opengauss_master -opengauss_follower +opengauss_primary +opengauss_standby opengauss_cascade ; 备节点分组,总数不可大于 8。 [opengauss_replicas:children] -opengauss_follower +opengauss_standby opengauss_cascade ; 机器的 SSH 信息,请根据你的实际情况修改。 diff --git a/roles/opengauss/README.md b/roles/opengauss/README.md index 17ab66584f5a070540fafa90e4adf817e838d26b..406c9a681160bed24019cee29cf7a062a1ffd3ed 100644 --- a/roles/opengauss/README.md +++ b/roles/opengauss/README.md @@ -48,18 +48,18 @@ master 组仅可以配置 1 台机器。follower 可以多台。cascade 可选可为空。 ``` -[opengauss_master] +[opengauss_primary] 192.168.56.11 -[opengauss_follower] +[opengauss_standby] 192.168.56.12 [opengauss_cascade] 192.168.56.13 [opengauss:children] -opengauss_master -opengauss_follower +opengauss_primary +opengauss_standby opengauss_cascade ``` diff --git a/roles/opengauss/defaults/main.yml b/roles/opengauss/defaults/main.yml index 8eb539e7a14b398ebed2a72f2718ba8c83d20255..afd3da09474d6a150f8636a9fd75511d0cf4c19b 100644 --- a/roles/opengauss/defaults/main.yml +++ b/roles/opengauss/defaults/main.yml @@ -14,7 +14,7 @@ og_disable_history: false # 具体请看 https://docs.opengauss.org/zh/docs/{{ opengauss_version }}/docs/InstallationGuide/%E5%88%9B%E5%BB%BAXML%E9%85%8D%E7%BD%AE%E6%96%87%E4%BB%B6.html opengauss_paths: gaussdbAppPath: "{{ opengauss_home }}/install/app" - gaussdbLogPath: "/var/log/" + gaussdbLogPath: "/var/log" tmpMppdbPath: "{{ opengauss_home }}/tmp" corePath: "{{ opengauss_home }}/corefile" gaussdbToolPath: "{{ opengauss_home }}/install/om" @@ -38,7 +38,7 @@ opengauss_env: lookup( 'password', 
inventory_dir + '/credentials/opengauss_ca_file_pass', - chars=['ascii_letters', 'digits', 'punctuation'], + chars=['ascii_letters', 'digits'], length=8 ) }} @@ -94,8 +94,9 @@ opengauss_download: # 默认初始值 og_hostname: "og{{ inventory_hostname | ipaddr('int') }}" og_all_nodes: "{{ groups['opengauss'] | sort }}" -og_master: "{{ groups['opengauss_master'] | first }}" +og_master: "{{ groups['opengauss_primary'] | first }}" og_replicas: "{{ groups['opengauss_replicas'] | sort }}" og_upload_path: "/opt/software/openGauss" og_cm_enabled: "{{ (groups['opengauss'] | count) > 2 }}" og_expansion: false +og_preinstall: false diff --git a/roles/opengauss/tasks/deploy/cluster_check.yml b/roles/opengauss/tasks/deploy/cluster_check.yml index f8d7909500c5427042513d35bd7fe2a23a91dd41..e3b0172093972b3e658a8a69a476bb2a4a4dbb2d 100644 --- a/roles/opengauss/tasks/deploy/cluster_check.yml +++ b/roles/opengauss/tasks/deploy/cluster_check.yml @@ -14,30 +14,17 @@ run_once: true rescue: - - name: Import pre tasks + - name: Import OS config tasks ansible.builtin.import_tasks: - file: pre_tasks.yml + file: os/main.yml become_user: root - - name: Import runtime tasks + - name: Import cluster deploy tasks ansible.builtin.import_tasks: - file: deploy/runtime_groups.yml - run_once: true - - - name: Import deploy tasks - ansible.builtin.import_tasks: - file: deploy/main.yml + file: deploy/cluster_deploy.yml delegate_to: "{{ og_master }}" + become_user: "{{ og_user }}" run_once: true - vars: - cm_nodes: "{{ (og_all_nodes | count > 2) | ternary(og_all_nodes, '') | sort }}" - dn_nodes: "{{ groups['opengauss_dn'] | sort }}" - ep_nodes: "{{ (groups['opengauss_ep'] is defined) | ternary(groups['opengauss_ep'], '') | sort }}" - og_cm_enabled: "{{ og_all_nodes | count > 2 }}" - -# - name: Repeat tasks -# ansible.builtin.include_tasks: -# file: deploy/cluster_check.yml - name: Mission aborted when: @@ -45,7 +32,7 @@ - "gs_detail.stdout | regex_search(keywords, multiline=True, ignorecase=True)" vars: 
keywords: >- - "(repair|down)" + "(repair|down|disconnect|build)" run_once: true block: - name: Abort tasks when cluster is unstable diff --git a/roles/opengauss/tasks/deploy/cluster_config_generate.yml b/roles/opengauss/tasks/deploy/cluster_config_generate.yml new file mode 100644 index 0000000000000000000000000000000000000000..a61db079cd32429107edd7f4afb087da7dd2b6bc --- /dev/null +++ b/roles/opengauss/tasks/deploy/cluster_config_generate.yml @@ -0,0 +1,42 @@ +- name: "Create local file {{ inventory_dir + '/cluster_config.xml' }}" + ansible.builtin.template: + src: "{{ item }}" + dest: "{{ inventory_dir }}/cluster_config.xml" + owner: "{{ og_user }}" + group: "{{ og_group }}" + mode: "0644" + lstrip_blocks: true + with_first_found: + - "{{ inventory_dir }}/templates/cluster_config.xml.j2" + - "cluster_config.xml.j2" + delegate_to: localhost + become: false + +- name: "Update {{ og_master + ':' + og_upload_path + '/cluster_config.xml' }}" + ansible.builtin.copy: + src: "{{ inventory_dir }}/cluster_config.xml" + dest: "{{ og_upload_path }}/" + owner: "{{ og_user }}" + group: "{{ og_group }}" + mode: "0644" + backup: true + +- name: Tips + ansible.builtin.debug: + msg: | + Starting pre-install. + Please wait... 
+ when: "og_preinstall" + +# 预部署脚本 +- name: "Run {{ og_upload_path + '/script/gs_preinstall' }}" + ansible.builtin.command: >- + {{ og_upload_path }}/script/gs_preinstall \ + -U {{ og_user }} \ + -G {{ og_group }} \ + -X {{ og_upload_path }}/cluster_config.xml \ + --non-interactive + changed_when: false + become_user: root + when: "og_preinstall" + delegate_to: "{{ og_master }}" diff --git a/roles/opengauss/tasks/deploy/cluster_deploy.yml b/roles/opengauss/tasks/deploy/cluster_deploy.yml new file mode 100644 index 0000000000000000000000000000000000000000..0f3f7d69411fb3d040e2717613d8e1bb830bcffd --- /dev/null +++ b/roles/opengauss/tasks/deploy/cluster_deploy.yml @@ -0,0 +1,46 @@ +- name: Import runtime tasks + ansible.builtin.import_tasks: + file: deploy/runtime_groups.yml + +- name: Set first time deploy facts + ansible.builtin.set_fact: + cm_nodes: "{{ (og_all_nodes | count > 2) | ternary(og_all_nodes, '') | sort }}" + dn_nodes: "{{ groups['opengauss_dn'] | sort }}" + ep_nodes: "{{ (groups['opengauss_ep'] is defined) | ternary(groups['opengauss_ep'], '') | sort }}" + og_cm_enabled: "{{ (og_all_nodes | count) > 2 }}" + og_preinstall: true + +- name: Switch to root user + become_user: root + block: + - name: Import upload tasks + ansible.builtin.import_tasks: + file: deploy/upload.yml + + - name: Import pre tasks + ansible.builtin.import_tasks: + file: deploy/cluster_pre_tasks.yml + + - name: Import config generate tasks + ansible.builtin.import_tasks: + file: deploy/cluster_config_generate.yml + +- name: Tips + ansible.builtin.debug: + msg: | + Starting deploy openGauss on + + {{ og_all_nodes }} + + Please wait... 
+ +- name: "Run {{ og_upload_path + '/script/gs_install' }}" + ansible.builtin.command: >- + gs_install \ + -X {{ og_upload_path }}/cluster_config.xml \ + --gsinit-parameter="--pwpasswd={{ og_db_pass }}" \ + --gsinit-parameter="--locale={{ og_locale }}" \ + --time-out {{ (og_all_nodes | count) * 600 }} + changed_when: false + become_flags: "-i" + when: "not og_expansion" diff --git a/roles/opengauss/tasks/deploy/cluster_expand.yml b/roles/opengauss/tasks/deploy/cluster_expand.yml index a41f0ae916302ee5f0047bcd11acef26aa3048ac..7e344537a26cc4f3df8495faf9a1095e1da5e1be 100644 --- a/roles/opengauss/tasks/deploy/cluster_expand.yml +++ b/roles/opengauss/tasks/deploy/cluster_expand.yml @@ -1,35 +1,95 @@ -- name: Import pre tasks +- name: Set expansion facts + ansible.builtin.set_fact: + cm_nodes: "{{ (groups['opengauss_cm'] is defined) | ternary(groups['opengauss_cm'], '') | sort }}" + dn_nodes: "{{ (groups['opengauss_dn'] is defined) | ternary(groups['opengauss_dn'], '') | sort }}" + ep_nodes: "{{ groups['opengauss_ep'] | sort }}" + og_cm_enabled: "{{ (groups['opengauss_cm'] is defined) | ternary(true, false) }}" + og_preinstall: true + run_once: true + +# 从扩容列表里拆分出备机和级联机 +- name: "Set 'standby_nodes' and 'cascade_nodes'" + ansible.builtin.set_fact: + standby_nodes: >- + {%- for node in ep_nodes if node in groups['opengauss_standby'] -%} + {{ node }}{{ (loop.nextitem is defined) | ternary(',', '') }} + {%- endfor -%} + cascade_nodes: >- + {%- for node in ep_nodes if node in groups['opengauss_cascade'] -%} + {{ node }}{{ (loop.nextitem is defined) | ternary(',', '') }} + {%- endfor -%} + run_once: true + +- name: Import OS config tasks ansible.builtin.include_tasks: - file: pre_tasks.yml + file: os/main.yml when: "inventory_hostname in ep_nodes" -- name: Import deploy tasks - ansible.builtin.import_tasks: - file: deploy/main.yml - delegate_to: "{{ og_master }}" +- name: Start expansion tasks run_once: true + delegate_to: "{{ og_master }}" + block: + - name: Import pre 
tasks + ansible.builtin.import_tasks: + file: deploy/cluster_pre_tasks.yml + + - name: Import config generate tasks + ansible.builtin.import_tasks: + file: deploy/cluster_config_generate.yml + + # 备机扩容 + - name: Tips + ansible.builtin.debug: + msg: | + Adding standby nodes -# # 3 节点或以上,且未部署 CM 的情况。 -# - name: Tasks for CM cluster -# when: -# - "(cm_nodes | count) < 2" -# - "og_cm_enabled" -# run_once: true -# block: -# - name: Import cluster manager tasks -# ansible.builtin.import_tasks: -# file: cluster_manager.yml -# delegate_to: "{{ og_master }}" -# become_user: "{{ og_user }}" - -# - name: Refresh cluster status -# ansible.builtin.command: -# gs_om -t status --detail -# changed_when: false -# register: gs_detail1 -# become_user: "{{ og_user }}" -# delegate_to: "{{ og_master }}" - -# - name: Import runtime groups tasks -# ansible.builtin.import_tasks: -# file: deploy/runtime_groups.yml + {{ standby_nodes }} + + Please wait... + + - name: Expanding standby nodes + when: + - "standby_nodes is truthy" + ansible.builtin.shell: >- + . /home/{{ og_user }}/.bashrc && \ + {{ og_upload_path }}/script/gs_expansion \ + -U {{ og_user }} \ + -G {{ og_group }} \ + -X {{ og_upload_path }}/cluster_config.xml \ + --time-out {{ (ep_nodes | count) * 600 }} \ + -h {{ standby_nodes }} + changed_when: false + + - name: Import 'wait_for_started' tasks + ansible.builtin.import_tasks: + file: wait_for_started.yml + + # 级联机扩容,前提是集群里有 Normal 状态的备机节点。 + - name: Tips + ansible.builtin.debug: + msg: | + Add cascade nodes into cluster. + Please wait... + + - name: Expanding cascade nodes + when: + - "gs_status.stdout_lines | regex_search('Standby.*Normal')" + - "cascade_nodes is truthy" + ansible.builtin.shell: >- + . 
/home/{{ og_user }}/.bashrc && \ + {{ og_upload_path }}/script/gs_expansion \ + -U {{ og_user }} \ + -G {{ og_group }} \ + -X {{ og_upload_path }}/cluster_config.xml \ + --time-out {{ (ep_nodes | count) * 600 }} \ + -h {{ cascade_nodes }} + changed_when: false + + always: + - name: Import wait for started tasks + ansible.builtin.import_tasks: + file: wait_for_started.yml + +- name: Import cluster check tasks + ansible.builtin.import_tasks: + file: deploy/cluster_check.yml diff --git a/roles/opengauss/tasks/deploy/cluster_manager.yml b/roles/opengauss/tasks/deploy/cluster_manager.yml index b70664035ee7b880c3ac54987ad53ba6c49f7738..8b1e9472d59e97375176f943b0fdb8765494dcbf 100644 --- a/roles/opengauss/tasks/deploy/cluster_manager.yml +++ b/roles/opengauss/tasks/deploy/cluster_manager.yml @@ -1,29 +1,27 @@ -- name: Deploy CM into existing data cluster +- name: Set CM facts + ansible.builtin.set_fact: + cm_nodes: "{{ (groups['opengauss_dn'] is defined) | ternary(groups['opengauss_dn'], '') | sort }}" + dn_nodes: "{{ (groups['opengauss_dn'] is defined) | ternary(groups['opengauss_dn'], '') | sort }}" + ep_nodes: "" + og_cm_enabled: true + og_expansion: false + og_preinstall: false + +- name: Root tasks + become_user: root block: - name: Import upload tasks ansible.builtin.import_tasks: file: upload.yml - become_user: root - # run_once: true - name: Get cm package name ansible.builtin.set_fact: cm_pkg: "{{ file }}" - loop: "{{ og_upload.files | default([]) }}" + loop: "{{ og_upload.files }}" loop_control: loop_var: file when: - "'cm.tar.gz' in file" - # run_once: true - - - name: "Change file permission of {{ cm_pkg }}" - ansible.builtin.file: - path: "{{ og_upload_path }}/{{ cm_pkg }}" - owner: "{{ og_user }}" - group: "{{ og_group }}" - mode: "0640" - become_user: root - # run_once: true - name: Create some paths ansible.builtin.file: @@ -33,7 +31,6 @@ group: "{{ og_group }}" mode: "0750" delegate_to: "{{ item.1 }}" - become_user: root with_nested: - ['{{ og_log_path 
}}/omm/cm/cm_server', '{{ og_log_path }}/omm/cm/cm_agent'] - "{{ og_all_nodes }}" @@ -44,155 +41,94 @@ ansible.builtin.file: path: "/opt/openGauss/install/om/{{ cm_pkg }}" state: absent - become_user: root loop: "{{ og_all_nodes }}" delegate_to: "{{ item }}" - # 这里需要建一个假的定时任务,包含 'om_monitor'。后续 cm_install 时会检查,否则失败。 - - name: Fake an om_monitor cron job before install - ansible.builtin.cron: - name: openGauss om monitor - special_time: yearly - job: "{{ og_home }}/install/app/bin/om_monitor" - loop: "{{ dn_nodes }}" - loop_control: - loop_var: node - label: "{{ hostvars[node]['node_ip'] }}" - delegate_to: "{{ hostvars[node]['node_ip'] }}" + - name: Import pre tasks + ansible.builtin.import_tasks: + file: deploy/cluster_pre_tasks.yml - - name: "Create cluster_config.xml in local path '{{ inventory_dir }}'" - ansible.builtin.template: - src: "{{ item }}" - dest: "{{ inventory_dir }}/cluster_config.xml" - owner: "{{ og_user }}" - group: "{{ og_group }}" - mode: "0644" - lstrip_blocks: true - with_first_found: - - "{{ inventory_dir }}/templates/cluster_config.xml.j2" - - "cluster_config.xml.j2" - delegate_to: localhost - become: false - # run_once: true + - name: Import config generate tasks + ansible.builtin.import_tasks: + file: deploy/cluster_config_generate.yml - - name: "Upload cluster_config.xml to {{ og_upload_path }}" - ansible.builtin.copy: - src: "{{ inventory_dir }}/cluster_config.xml" - dest: "{{ og_upload_path }}/" + - name: "Change file permission of {{ cm_pkg }}" + ansible.builtin.file: + path: "{{ og_upload_path }}/{{ cm_pkg }}" owner: "{{ og_user }}" group: "{{ og_group }}" - mode: "0644" - backup: true + mode: "0640" - - name: Step 1 | Switchover once to avoid 'Term of primary is invalid or not maximal' error +# 这里需要建一个假的定时任务,包含 'om_monitor'。后续 cm_install 时会检查替换,否则失败。 +- name: Fake an om_monitor cron job before install + ansible.builtin.cron: + name: openGauss om monitor + special_time: yearly + job: "{{ og_home }}/install/app/bin/om_monitor" + loop: 
"{{ dn_nodes }}" + loop_control: + loop_var: node + label: "{{ hostvars[node]['node_ip'] }}" + delegate_to: "{{ hostvars[node]['node_ip'] }}" + +# 集群增加 CM 工具 +- name: Deploy CM tool + when: + - "groups['opengauss_cm'] is not defined" + - "og_cm_enabled" + block: + - name: Switchover on a standby node ansible.builtin.command: gs_ctl switchover -D {{ og_data_path }}/dn changed_when: false - loop: "{{ dn_nodes }}" + loop: "{{ dn_nodes | select('search', 'Standby') }}" loop_control: loop_var: node + extended: true + when: "ansible_loop.index == 1" delegate_to: "{{ hostvars[node]['node_ip'] }}" - when: "'Standby' in node" - register: switchover_status - until: "switchover_status is succeeded" - retries: 9 - delay: 5 - - # - name: Select a standby node and the primary node - # ansible.builtin.set_fact: - # standby_nodes: >- - # {% for node in dn_nodes if 'Standby' in node %} - # {{ hostvars[node]['node_ip'] }} - # {% endfor %} - # primary_nodes: >- - # {% for node in dn_nodes if 'Primary' in node %} - # {{ hostvars[node]['node_ip'] }} - # {% endfor %} - - name: "Step 2 | Switchover back to {{ og_master }}" + - name: Switchover back to primary node ansible.builtin.command: gs_ctl switchover -D {{ og_data_path }}/dn changed_when: false - loop: "{{ dn_nodes }}" + loop: "{{ dn_nodes | select('search', 'Primary') }}" loop_control: loop_var: node delegate_to: "{{ hostvars[node]['node_ip'] }}" - when: "'Primary' in node" - register: switchover_status - until: "switchover_status is succeeded" - retries: 9 - delay: 5 - # - name: Stop all nodes - # ansible.builtin.command: - # gs_om -t stop - # changed_when: false + - name: Start any stopped nodes + ansible.builtin.command: + gs_om -t start + changed_when: false + + - name: Import 'wait_for_started' tasks + ansible.builtin.import_tasks: + file: wait_for_started.yml - - name: Install CM + - name: Tips + ansible.builtin.debug: + msg: | + Adding CM tool into cluster. + Please wait... 
+ + - name: Deploy CM tool ansible.builtin.expect: command: "./cm_install -X {{ og_upload_path }}/cluster_config.xml --cmpkg {{ og_upload_path }}/{{ cm_pkg }}" responses: (?i)password: "{{ og_ca_pass }}" chdir: "{{ og_home }}/install/app/tool/cm_tool" + timeout: "{{ (dn_nodes | count) * 300 }}" changed_when: false register: cm_install until: - "cm_install.stdout is defined" - "'CM exists' in cm_install.stdout" retries: 3 - timeout: 300 - ignore_errors: true - become_user: "{{ og_user }}" - - rescue: - - name: Install pexpect - ansible.builtin.package: - name: "{{ python_name }}-pexpect" - become_user: root - - - name: Repeat tasks - ansible.builtin.include_tasks: - file: cluster_manager.yml - - always: - # - name: Debug - # ansible.builtin.debug: - # var: cm_install - - # - name: Start the primary data node - # ansible.builtin.command: - # "gs_om -t start -h {{ hostvars[node]['node_name'] }}" - # loop: "{{ dn_nodes }}" - # loop_control: - # loop_var: node - # when: "'Primary' in node" - # changed_when: false - - # - name: Wait for primary node started - # ansible.builtin.wait_for: - # host: "{{ hostvars[node]['node_ip'] }}" - # port: "{{ og_cluster_config.db_port }}" - # timeout: "600" - # loop: "{{ dn_nodes }}" - # loop_control: - # loop_var: node - # when: "'Primary' in node" - # changed_when: false - - # - name: Start the rest of data nodes - # ansible.builtin.command: - # "gs_om -t start -h {{ hostvars[node]['node_name'] }}" - # loop: "{{ dn_nodes }}" - # loop_control: - # loop_var: node - # when: "'Primary' not in node" - # changed_when: false + #ignore_errors: true + failed_when: "cm_install.rc == 999" - - name: Wait for the cluster started + - name: Refresh cluster config ansible.builtin.command: - "gs_om -t status" - register: gs_status - until: "'Normal' in gs_status.stdout" - retries: 30 - delay: 10 + gs_om -t refreshconf changed_when: false diff --git a/roles/opengauss/tasks/deploy/cluster_pre_tasks.yml 
b/roles/opengauss/tasks/deploy/cluster_pre_tasks.yml new file mode 100644 index 0000000000000000000000000000000000000000..cd5190b606210b5703c0f9a6f21954061c47047d --- /dev/null +++ b/roles/opengauss/tasks/deploy/cluster_pre_tasks.yml @@ -0,0 +1,36 @@ +- name: Update /etc/hosts + ansible.builtin.blockinfile: + path: /etc/hosts + marker: "# {mark} OPENGAUSS NODES" + block: | + {% for node in og_all_nodes %} + {{ node }} og{{ node | ipaddr('int') }} + {% endfor %} + delegate_to: "{{ node }}" + loop: "{{ og_all_nodes }}" + loop_control: + loop_var: node + +- name: Scan hosts key + ansible.builtin.command: >- + ssh-keyscan -p {{ host_port }} {{ node }},og{{ node | ipaddr('int') }} + changed_when: false + loop: "{{ og_all_nodes }}" + loop_control: + loop_var: node + vars: + host_port: "{{ ansible_ssh_port | default('22') }}" + register: known_host_keys + +- name: Config known hosts + ansible.builtin.include_tasks: + file: deploy/known_hosts.yml + vars: + host_keys: "{{ known_host_keys.results | map(attribute='stdout_lines') | flatten }}" + +- name: Config authorized keys + ansible.builtin.include_tasks: + file: deploy/add_auth.yml + loop: "{{ og_all_nodes }}" + loop_control: + loop_var: node diff --git a/roles/opengauss/tasks/deploy/install.yml b/roles/opengauss/tasks/deploy/install.yml index 5bcce4eb57ac68e9d8f78902db10983015ef939b..bc474307ab279814cce95e47900af9c4c562345f 100644 --- a/roles/opengauss/tasks/deploy/install.yml +++ b/roles/opengauss/tasks/deploy/install.yml @@ -1,4 +1,4 @@ -- name: "Create cluster_config.xml in local path '{{ inventory_dir }}'" +- name: "Create local file {{ inventory_dir + '/cluster_config.xml' }}" ansible.builtin.template: src: "{{ item }}" dest: "{{ inventory_dir }}/cluster_config.xml" @@ -12,7 +12,7 @@ delegate_to: localhost become: false -- name: "Upload cluster_config.xml to {{ og_upload_path }}" +- name: "Upload cluster_config.xml to {{ og_upload_path + '/cluster_config.xml' }}" ansible.builtin.copy: src: "{{ inventory_dir
}}/cluster_config.xml" dest: "{{ og_upload_path }}/" @@ -21,49 +21,138 @@ mode: "0644" backup: true -- name: Start expansion +# Pre-install script +- name: "Run {{ og_upload_path + '/script/gs_preinstall' }}" + ansible.builtin.command: >- + {{ og_upload_path }}/script/gs_preinstall \ + -U {{ og_user }} \ + -G {{ og_group }} \ + -X {{ og_upload_path }}/cluster_config.xml \ + --non-interactive + changed_when: false become_user: root - when: "og_expansion" + when: "og_expansion or ep_nodes is truthy" + +# Split the expansion list into standby and cascade nodes +- name: "Set 'standby_nodes' and 'cascade_nodes'" + ansible.builtin.set_fact: + standby_nodes: >- + {%- for node in ep_nodes if node in groups['opengauss_standby'] -%} + {{ node }}{{ (loop.nextitem is defined) | ternary(',', '') }} + {%- endfor -%} + cascade_nodes: >- + {%- for node in ep_nodes if node in groups['opengauss_cascade'] -%} + {{ node }}{{ (loop.nextitem is defined) | ternary(',', '') }} + {%- endfor -%} + +# Standby expansion +- name: Start standby expansion + become_user: root + when: + - "og_expansion" + - "standby_nodes is truthy" block: - - name: Starting pre install - ansible.builtin.command: >- - {{ og_upload_path }}/script/gs_preinstall \ - -U {{ og_user }} \ - -G {{ og_group }} \ - -X {{ og_upload_path }}/cluster_config.xml \ - --non-interactive + - name: Expanding standby nodes + ansible.builtin.shell: >- + . /home/{{ og_user }}/.bashrc && \ + {{ og_upload_path }}/script/gs_expansion \ + -U {{ og_user }} \ + -G {{ og_group }} \ + -X {{ og_upload_path }}/cluster_config.xml \ + --time-out {{ (ep_nodes | count) * 600 }} \ + -h {{ standby_nodes }} changed_when: false - - name: Cluster expanding +# Cascade expansion; requires a standby node in Normal state in the cluster. +- name: Start cascade expansion + become_user: root + when: + - "og_expansion" + - "cascade_nodes is truthy" + block: + - name: Import 'wait_for_started' tasks + ansible.builtin.import_tasks: + file: wait_for_started.yml + + - name: Expanding cascade nodes ansible.builtin.shell: >- .
/home/{{ og_user }}/.bashrc && \ {{ og_upload_path }}/script/gs_expansion \ -U {{ og_user }} \ -G {{ og_group }} \ -X {{ og_upload_path }}/cluster_config.xml \ - -h {{ groups['opengauss_ep'] | join(',') }} \ - --time-out {{ (og_all_nodes | count) * 600 }} + --time-out {{ (ep_nodes | count) * 600 }} \ + -h {{ cascade_nodes }} changed_when: false -- name: Starting deploy - when: "not og_expansion" + - name: Import 'wait_for_started' tasks + ansible.builtin.import_tasks: + file: wait_for_started.yml + +# Add the CM tool to the cluster +- name: Deploy CM tool + when: + - "groups['opengauss_cm'] is not defined" + - "og_cm_enabled" block: - - name: Starting pre install - ansible.builtin.command: >- - {{ og_upload_path }}/script/gs_preinstall \ - -U {{ og_user }} \ - -G {{ og_group }} \ - -X {{ og_upload_path }}/cluster_config.xml \ - --non-interactive + - name: Switchover to fix 'Term of primary is invalid or not maximal' error + ansible.builtin.command: + gs_ctl switchover -D {{ og_data_path }}/dn + changed_when: false + when: + - "'Primary' not in node" + - "ansible_loop.index < 3" + loop: "{{ dn_nodes | reject('search', 'Primary') | sort(reverse=True) }}" + loop_control: + loop_var: node + pause: 15 + extended: true + delegate_to: "{{ hostvars[node]['node_ip'] }}" + + - name: Import 'wait_for_started' tasks + ansible.builtin.import_tasks: + file: wait_for_started.yml + + - name: Deploy CM tool + ansible.builtin.expect: + command: "./cm_install -X {{ og_upload_path }}/cluster_config.xml --cmpkg {{ og_upload_path }}/{{ cm_pkg }}" + responses: + (?i)password: "{{ og_ca_pass }}" + chdir: "{{ og_home }}/install/app/tool/cm_tool" + timeout: "{{ (dn_nodes | count) * 300 }}" changed_when: false + register: cm_install + until: + - "cm_install.stdout is defined" + - "'CM exists' in cm_install.stdout" + retries: 2 + ignore_errors: true - - name: Deploy openGauss - ansible.builtin.command: >- - gs_install \ - -X {{ og_upload_path }}/cluster_config.xml \ - --gsinit-parameter="--pwpasswd={{ og_db_pass
}}" \ - --gsinit-parameter="--locale={{ og_locale }}" \ - --time-out {{ (og_all_nodes | count) * 600 }} + - name: Refresh cluster config + ansible.builtin.command: + gs_om -t refreshconf changed_when: false - become_user: "{{ og_user }}" - become_flags: "-i" + +# Initial deployment +# # Pre-install script +# - name: "Run {{ og_upload_path + '/script/gs_preinstall' }}" +# ansible.builtin.command: >- +# {{ og_upload_path }}/script/gs_preinstall \ +# -U {{ og_user }} \ +# -G {{ og_group }} \ +# -X {{ og_upload_path }}/cluster_config.xml \ +# --non-interactive +# changed_when: false +# become_user: root + +- name: "Run {{ og_upload_path + '/script/gs_install' }}" + ansible.builtin.command: >- + gs_install \ + -X {{ og_upload_path }}/cluster_config.xml \ + --gsinit-parameter="--pwpasswd={{ og_db_pass }}" \ + --gsinit-parameter="--locale={{ og_locale }}" \ + --time-out {{ (og_all_nodes | count) * 600 }} + changed_when: false + become_user: "{{ og_user }}" + become_flags: "-i" + when: "not og_expansion" diff --git a/roles/opengauss/tasks/deploy/known_hosts.yml b/roles/opengauss/tasks/deploy/known_hosts.yml index 21e04b2888c2f1e4483465414289089a7bb141c5..77bd9a23ad2702c469679112bd782b5e4d4877ed 100644 --- a/roles/opengauss/tasks/deploy/known_hosts.yml +++ b/roles/opengauss/tasks/deploy/known_hosts.yml @@ -14,7 +14,7 @@ loop_var: node delegate_to: "{{ node }}" -- name: "Update ~/.ssh/known_hosts for user '{{ og_user }}'" +- name: "Update '{{ '/home/' + og_user + '/.ssh/known_hosts' }}'" ansible.builtin.blockinfile: path: "/home/{{ og_user }}/.ssh/known_hosts" owner: "{{ og_user }}" diff --git a/roles/opengauss/tasks/deploy/main.yml b/roles/opengauss/tasks/deploy/main.yml.bak similarity index 97% rename from roles/opengauss/tasks/deploy/main.yml rename to roles/opengauss/tasks/deploy/main.yml.bak index 8346313506e51db70e75551dacec6974d5377c58..beec9f00cbf7d846288f151c0c982283b56c10a1 100644 --- a/roles/opengauss/tasks/deploy/main.yml +++ b/roles/opengauss/tasks/deploy/main.yml.bak @@ -48,8 +48,10 @@ -
name: Import upload tasks ansible.builtin.import_tasks: file: deploy/upload.yml + run_once: true always: - name: Import installation tasks ansible.builtin.import_tasks: file: deploy/install.yml + run_once: true diff --git a/roles/opengauss/tasks/deploy/runtime_groups.yml b/roles/opengauss/tasks/deploy/runtime_groups.yml index 514a3914cbe9074313bd4b732b191bb171a19ad5..1ecd06470707443a3d462884e995a0c26e716756 100644 --- a/roles/opengauss/tasks/deploy/runtime_groups.yml +++ b/roles/opengauss/tasks/deploy/runtime_groups.yml @@ -31,7 +31,7 @@ when: "node not in gs_detail.stdout" # Save the current CM server list in numbered order. - - name: Create current cmserver list + - name: Save current cmserver list ansible.builtin.add_host: hostname: "{{ node_info[3] }}_{{ node_info[1] }}" groups: @@ -42,7 +42,7 @@ loop_var: line vars: node_info: "{{ line | regex_replace(' {1,}', '|') | split('|') }}" - when: "'cmserver' in line" + when: "'cm_server' in line" - name: Print out current cluster manager servers ansible.builtin.debug: @@ -51,7 +51,7 @@ # Save the current DN server list in numbered order. # The number of output columns must be checked here, because clusters with CM hide the database Port column. - - name: Create current data nodes list + - name: Save current data nodes list ansible.builtin.add_host: hostname: >- {%- if og_cluster_config.db_port in line -%} @@ -88,38 +88,3 @@ - name: Current data nodes ansible.builtin.debug: msg: "{{ groups['opengauss_dn'] }}" - when: "groups['opengauss_dn'] is defined" - -# # If CM is not deployed and the cluster is a single node -# - name: Create groups for CM deploy | Single node mode -# when: -# - "groups['opengauss_cm'] is not defined" -# - "(groups['opengauss_dn'] | count) == 1" -# - "og_cm_enabled" -# block: -# - name: Create init cm nodes list -# ansible.builtin.add_host: -# hostname: "{{ node }}" -# groups: -# - opengauss_cm -# node_ip: "{{ node }}" -# node_name: "og{{ node | ipaddr('int') }}" -# loop: "{{ groups['opengauss_ep'] | first }}" -# loop_control: -# loop_var: node - -# - name: Create init dn nodes list -# ansible.builtin.add_host: -# hostname: "{{ node }}" -# groups: -# -
opengauss_cm -# node_ip: "{{ node }}" -# node_name: "og{{ node | ipaddr('int') }}" -# loop: "{{ groups['opengauss_dn'] | first }}" -# loop_control: -# loop_var: node - -# - name: Current cm nodes -# ansible.builtin.debug: -# msg: "{{ groups['opengauss_cm'] }}" -# when: "groups['opengauss_cm'] is defined" diff --git a/roles/opengauss/tasks/deploy/wait_for_started.yml b/roles/opengauss/tasks/deploy/wait_for_started.yml new file mode 100644 index 0000000000000000000000000000000000000000..295514e58975ca958545b6663a32c6db940bc28d --- /dev/null +++ b/roles/opengauss/tasks/deploy/wait_for_started.yml @@ -0,0 +1,23 @@ +- name: Wait for the cluster started + become_user: "{{ og_user }}" + block: + - name: Checking status + ansible.builtin.command: + "gs_om -t status --detail" + register: gs_status + until: + - "gs_status.stdout | regex_search('cluster_state.*Normal')" + - "gs_status.stdout | regex_search('Standby.*Normal')" + retries: 30 + delay: 10 + changed_when: false + + rescue: + - name: Print out cluster status + ansible.builtin.debug: + msg: | + {{ gs_status.stdout_lines }} + + - name: Play aborted + ansible.builtin.meta: + end_play diff --git a/roles/opengauss/tasks/main.yml b/roles/opengauss/tasks/main.yml index 2861b36302ca3269cf3fc45321dac1f83cc2f1f5..d7bb584b3ee36e8f64be9c97bee3f9e09e9b3ef3 100644 --- a/roles/opengauss/tasks/main.yml +++ b/roles/opengauss/tasks/main.yml @@ -19,60 +19,18 @@ - name: Import cluster expand tasks ansible.builtin.import_tasks: file: deploy/cluster_expand.yml - vars: - cm_nodes: "{{ (groups['opengauss_cm'] is defined) | ternary(groups['opengauss_cm'], '') | sort }}" - dn_nodes: "{{ (groups['opengauss_dn'] is defined) | ternary(groups['opengauss_dn'], '') | sort }}" - ep_nodes: "{{ (groups['opengauss_ep'] is defined) | ternary(groups['opengauss_ep'], '') | sort }}" - # When the existing architecture has fewer than 3 nodes, first expand the data nodes (CM), then deploy the management nodes (CM) - og_expansion: true - og_cm_enabled: >- - {{ - ( - (groups['opengauss_cm'] is not defined) - and - (dn_nodes | count )
< 3 - ) - | - ternary( - false, - true - ) - }} when: "groups['opengauss_ep'] is defined" + become_user: root - - name: Import cluster manager deploy + - name: Import cluster manager deploy tasks + ansible.builtin.import_tasks: + file: deploy/cluster_manager.yml + when: + - "groups['opengauss_cm'] is not defined" + - "(groups['opengauss_dn'] | count) > 2" become_user: "{{ og_user }}" delegate_to: "{{ og_master }}" run_once: true - when: - - "groups['opengauss_cm'] is not defined" - - "og_cm_enabled" - block: - - name: Import cluster checking tasks - ansible.builtin.import_tasks: - file: deploy/cluster_check.yml - - - name: Import cluster manager deploy tasks - ansible.builtin.import_tasks: - file: deploy/cluster_manager.yml - vars: - cm_nodes: "{{ (groups['opengauss_dn'] is defined) | ternary(groups['opengauss_dn'], '') | sort }}" - dn_nodes: "{{ (groups['opengauss_dn'] is defined) | ternary(groups['opengauss_dn'], '') | sort }}" - ep_nodes: "" - og_expansion: false - og_cm_enabled: >- - {{ - ( - (groups['opengauss_cm'] is not defined) - and - (dn_nodes | count ) < 3 - ) - | - ternary( - false, - true - ) - }} always: - name: Import post tasks diff --git a/roles/opengauss/tasks/os/common_set.yml b/roles/opengauss/tasks/os/common_set.yml index b0403302c8c9eb0b1962879a11cd80bcb945b39f..dce9567ea6d89ce60e1833ae4fbdbe6bdf8922ba 100644 --- a/roles/opengauss/tasks/os/common_set.yml +++ b/roles/opengauss/tasks/os/common_set.yml @@ -29,9 +29,6 @@ mode: "0644" notify: Restart systemd-logind.service -- name: Flush handlers - ansible.builtin.meta: flush_handlers - - name: Disable history command logs ansible.builtin.lineinfile: path: /etc/profile @@ -59,7 +56,6 @@ - nano - htop - "{{ python_name }}-pexpect" - # update_cache: true use: "{{ custom_pkg_mgr | default(ansible_pkg_mgr) }}" register: pkg_inst until: pkg_inst is succeeded diff --git a/roles/opengauss/tasks/pre_tasks.yml b/roles/opengauss/tasks/os/main.yml similarity index 77% rename from 
roles/opengauss/tasks/pre_tasks.yml rename to roles/opengauss/tasks/os/main.yml index 1d0a3fc80a285dd22e140f9cc08fd9fa4197b90a..117bf60718a3864c2112efb67a323216564448c4 100644 --- a/roles/opengauss/tasks/pre_tasks.yml +++ b/roles/opengauss/tasks/os/main.yml @@ -2,8 +2,8 @@ ansible.builtin.include_tasks: file: "{{ item }}" with_first_found: - - "os/{{ ansible_distribution | replace(' ', '_') }}.yml" - - "os/not_supported.yml" + - "{{ ansible_distribution | replace(' ', '_') }}.yml" + - "not_supported.yml" - name: Import Common tasks for all distribution ansible.builtin.import_tasks: @@ -13,12 +13,12 @@ ansible.builtin.include_tasks: file: "{{ item }}" with_first_found: - - "os/{{ ansible_os_family }}.yml" - - "os/not_supported.yml" + - "{{ ansible_os_family }}.yml" + - "not_supported.yml" - name: Import user config tasks ansible.builtin.include_tasks: - file: os/user.yml + file: user.yml with_items: "{{ groups['opengauss_ep'] | default(og_all_nodes) }}" loop_control: loop_var: node @@ -26,7 +26,7 @@ - name: Import ssh config tasks ansible.builtin.include_tasks: - file: os/ssh.yml + file: ssh.yml with_items: - user: root group: root @@ -36,3 +36,6 @@ home: "/home/{{ og_user }}" loop_control: loop_var: og_ssh + +- name: Flush handlers + ansible.builtin.meta: flush_handlers diff --git a/roles/opengauss/tasks/os/openEuler.yml b/roles/opengauss/tasks/os/openEuler.yml index 4637580ff25351b926bb024095d38cf1fad3c43c..4714993470ba72792ed706a7dd40907a6529643e 100644 --- a/roles/opengauss/tasks/os/openEuler.yml +++ b/roles/opengauss/tasks/os/openEuler.yml @@ -8,6 +8,8 @@ until: pkg_inst is succeeded retries: 3 +# openGauss 5.0.0 depends on readline 7 +# openEuler 20.03 only ships readline 8 - name: Create soft link ansible.builtin.file: src: /lib64/libreadline.so.8 diff --git a/roles/opengauss/tasks/os/ssh.yml b/roles/opengauss/tasks/os/ssh.yml index b2e56b33d3f649f470df600d4209727ea40d5c74..d41bf2b596fbd488dbc27873a353c8f2cc60c557 100644 --- a/roles/opengauss/tasks/os/ssh.yml +++
b/roles/opengauss/tasks/os/ssh.yml @@ -1,6 +1,6 @@ - name: Config ssh on host block: - - name: "Create '.ssh' under {{ og_ssh.home }}" + - name: "Create path {{ og_ssh.home + '/.ssh' }}" ansible.builtin.file: path: "{{ og_ssh.home }}/.ssh" state: directory diff --git a/roles/opengauss/tasks/os/user.yml b/roles/opengauss/tasks/os/user.yml index 12e5f6a972b250c11772c22ae0d259893970991c..5cd189a99b140d4d7e1025c4b526eb6001ddab86 100644 --- a/roles/opengauss/tasks/os/user.yml +++ b/roles/opengauss/tasks/os/user.yml @@ -22,12 +22,12 @@ recurse: true - name: "Config command alias for user '{{ og_user }}'" - ansible.builtin.lineinfile: + ansible.builtin.blockinfile: path: "/home/{{ og_user }}/.bashrc" create: true - line: "{{ item }}" + block: | + alias gs_detail='gs_om -t status --detail' + marker: "# {mark} OPENGAUSS COMMAND ALIAS" owner: "{{ og_user }}" group: "{{ og_group }}" mode: "0644" - with_items: - - "alias gs_detail='gs_om -t status --detail'" diff --git a/roles/opengauss/templates/cluster_config.xml.j2 b/roles/opengauss/templates/cluster_config.xml.j2 index a7e8607003af870534a9b732a67479a1f14a0dbe..f6e19987e6eef2ee0a6033ba28710a50c3bd74d3 100644 --- a/roles/opengauss/templates/cluster_config.xml.j2 +++ b/roles/opengauss/templates/cluster_config.xml.j2 @@ -28,8 +28,6 @@ {{ lookup('ansible.builtin.template', 'cluster_master.xml.j2') }} - {{ lookup('ansible.builtin.template', 'cluster_replicas.xml.j2') }} - diff --git a/roles/opengauss/templates/cluster_master.xml.j2 b/roles/opengauss/templates/cluster_master.xml.j2 index c6651d0f47cb533b51cb60ab2ff2eedee6d0e4c8..d233eebd46044b41b92811e5cda433ed82826be4 100644 --- a/roles/opengauss/templates/cluster_master.xml.j2 +++ b/roles/opengauss/templates/cluster_master.xml.j2 @@ -22,12 +22,12 @@ "/> - {%- if og_cm_enabled -%} +{% if og_cm_enabled %} - + - {%- endif -%} +{% endif %} \ No newline at end of file diff --git a/roles/opengauss/templates/cluster_replicas.xml.j2 b/roles/opengauss/templates/cluster_replicas.xml.j2 
index cbffafa7615a04bf24e05e7b197416bc428938b1..eb3503f8b394bda7b970bd1f0212441d3f39ac73 100644 --- a/roles/opengauss/templates/cluster_replicas.xml.j2 +++ b/roles/opengauss/templates/cluster_replicas.xml.j2 @@ -1,5 +1,4 @@ {% for node in dn_nodes if hostvars[node]['node_ip'] != og_master %} - @@ -10,7 +9,7 @@ {% if og_cm_enabled %} - + {% endif %} {% if hostvars[node]['node_ip'] in groups['opengauss_cascade'] %} @@ -18,26 +17,24 @@ {% endif %} -{% endfor %} +{% endfor %} {% for node in ep_nodes %} - - {% if og_cm_enabled %} - + {% endif %} - {% if node in groups['opengauss_cascade'] %} {% endif %} + {% endfor %} \ No newline at end of file diff --git a/roles/opengauss/tests/inventory b/roles/opengauss/tests/inventory deleted file mode 100644 index 878877b0776c44f55fc4e458f70840f31da5bb01..0000000000000000000000000000000000000000 --- a/roles/opengauss/tests/inventory +++ /dev/null @@ -1,2 +0,0 @@ -localhost - diff --git a/roles/opengauss/tests/test.yml b/roles/opengauss/tests/test.yml deleted file mode 100644 index 2943711ab483c60987f90d8e70612d9f5d8435d2..0000000000000000000000000000000000000000 --- a/roles/opengauss/tests/test.yml +++ /dev/null @@ -1,5 +0,0 @@ ---- -- hosts: localhost - remote_user: root - roles: - - openGauss diff --git a/roles/opengauss/vars/main.yml b/roles/opengauss/vars/main.yml deleted file mode 100644 index d0bd0ac82c2c45d34e8c10b29d37fdb7df954ac7..0000000000000000000000000000000000000000 --- a/roles/opengauss/vars/main.yml +++ /dev/null @@ -1,2 +0,0 @@ ---- -# vars file for openGauss diff --git a/tests/oneshot.yml b/tests/oneshot.yml new file mode 100644 index 0000000000000000000000000000000000000000..066b4aab30c515b25739dd75074eabddd0a63583 --- /dev/null +++ b/tests/oneshot.yml @@ -0,0 +1,12 @@ +- name: Tests for openGauss deploy + hosts: localhost + become: false + vars_files: + - vars/cases.yml + tasks: + - name: Run the oneshot role for each case + ansible.builtin.include_role: + name: oneshot + loop: "{{ cases }}" + loop_control: + loop_var: case diff --git a/tests/roles/oneshot/tasks/main.yml
b/tests/roles/oneshot/tasks/main.yml new file mode 100644 index 0000000000000000000000000000000000000000..e1f4929668d17ecb3ec27ee0a03be67ad54caded --- /dev/null +++ b/tests/roles/oneshot/tasks/main.yml @@ -0,0 +1,3 @@ +- name: Check virtual machines status + ansible.builtin.import_role: + name: vagrant diff --git a/tests/roles/vagrant/tasks/main.yml b/tests/roles/vagrant/tasks/main.yml new file mode 100644 index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 diff --git a/tests/templates/Vagrantfile.j2 b/tests/templates/Vagrantfile.j2 new file mode 100644 index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 diff --git a/tests/templates/hosts.ini.j2 b/tests/templates/hosts.ini.j2 new file mode 100644 index 0000000000000000000000000000000000000000..162d1d6aa7088d0da91698d4d50233544255338c --- /dev/null +++ b/tests/templates/hosts.ini.j2 @@ -0,0 +1,28 @@ +; Primary; set exactly 1 target machine. +[opengauss_primary] + +; Standby; set any number of machines or leave empty. +[opengauss_standby] + +; Cascade; set any number of machines or leave empty. +[opengauss_cascade] + +; Combined group of the 3 groups above; do not modify. +[opengauss:children] +opengauss_primary +opengauss_standby +opengauss_cascade + +; Replica group; the total must not exceed 8. +[opengauss_replicas:children] +opengauss_standby +opengauss_cascade + +; SSH details for the machines; adjust them to your environment. +[opengauss:vars] +; ssh username; if it is not root, make sure it has sudo privileges. +ansible_ssh_user=vagrant +; ssh password +ansible_ssh_pass=vagrant +; ssh port +ansible_ssh_port=22 diff --git a/tests/vars/cases.yml b/tests/vars/cases.yml new file mode 100644 index 0000000000000000000000000000000000000000..42d728f423d704e71e07bf27926978f8a5757d44 --- /dev/null +++ b/tests/vars/cases.yml @@ -0,0 +1,22 @@ +# primary does not need to be set; there must be exactly 1. +cases: + - name: "Single node" + sn: "1p" + standby: 0 + cascade: 0 + - name: "1 primary, 1 standby" + sn: "1p1s" + standby: 1 + cascade: 0 + - name: "1 primary, 1 standby, 1 cascade" + sn: "1p1s1c" + standby: 1 + cascade: 1 + - name: "1 primary, 2 standby, 1 cascade" + sn: "1p2s1c" + standby: 2 + cascade: 1 + - name: "1 primary, 2 standby, 2 cascade" + sn: "1p2s2c" + standby: 2 + cascade: 2 diff
--git a/vagrant/Vagrantfile b/vagrant/Vagrantfile index baa571677a748b32b0deaea6800785351b945eb0..f1905799f6c1a278065fbf923e1146b9e79a46d7 100644 --- a/vagrant/Vagrantfile +++ b/vagrant/Vagrantfile @@ -12,6 +12,9 @@ Vagrant.configure("2") do |config| # This setting is for domestic systems that vagrant cannot recognize, such as openEuler. It forces vagrant to configure the guest as the named generic system. config.vm.guest = "centos" + # How many vm do you want? + N = 5 + config.vm.provider "virtualbox" do |vb| vb.memory = 1024 * 16 vb.cpus = 8 @@ -23,6 +26,10 @@ Vagrant.configure("2") do |config| config.vm.provision "shell" do |s| s.inline = <<-SHELL sed -i "s|PasswordAuthentication no|PasswordAuthentication yes|g" /etc/ssh/sshd_config + dnf makecache + dnf upgrade -y + dnf install -y bzip2 expect net-tools ntp tar gzip readline-devel \ + patch ncurses-devel libaio-devel glibc-devel flex bison nano htop libnsl /bin/systemctl restart sshd.service SHELL end @@ -30,9 +37,6 @@ Vagrant.configure("2") do |config| #Disabling the default /vagrant share config.vm.synced_folder ".", "/vagrant", disabled: true - # How many vm do you want? - N = 3 - (1..N).each do |i| config.vm.define "opengauss#{i}" do |node| node.vm.box = "openeuler2003_x64" diff --git a/vagrant/vm_create.sh b/vagrant/vm_create.sh new file mode 100755 index 0000000000000000000000000000000000000000..59bd0d8bc4e07d036fd73ded951fcf5b1a7a50a4 --- /dev/null +++ b/vagrant/vm_create.sh @@ -0,0 +1,7 @@ +#!/bin/bash +set -e + +vagrant destroy -f +vagrant up --provision +vagrant reload +vagrant snapshot save init