前言:html
生产上新入网的服务器都须要安装prometheus的监控客户端软件,主要步骤有:新建监控用户、拷贝客户端软件、拉起客户端进程、开机自启动。本文记录了使用ansible的role方式批量快速的安装该客户端软件。node
本文使用到的主要模块:user、stat、copy、shell、script、lineinfile等。python
环境说明:git
主机名 | 操做系统版本 | ip | ansible version | 备注 |
---|---|---|---|---|
ansible | Centos 7.6.1810 | 172.27.34.51 | 2.9.9 | ansible管理服务器 |
ansible-awx | Centos 7.6.1810 | 172.27.34.50 | / | 被管服务器 |
[root@ansible ~]# cd /etc/ansible/roles [root@ansible roles]# ansible-galaxy init prometheus - Role prometheus was created successfully [root@ansible roles]# tree prometheus prometheus ├── defaults │ └── main.yml ├── files ├── handlers │ └── main.yml ├── meta │ └── main.yml ├── README.md ├── tasks │ └── main.yml ├── templates ├── tests │ ├── inventory │ └── test.yml └── vars └── main.yml 8 directories, 8 files
使用ansible-galaxy命令初始化role的目录github
[root@ansible ~]# yum -y install python3-pip
[root@ansible ~]# cd /tmp [root@ansible tmp]# pip3 download passlib==1.7.2 -d /tmp/pkg [root@ansible tmp]# more requirements.txt passlib==1.7.2 [root@ansible tmp]# pip3 install --no-index --find-links=./pkg -r requirements.txt WARNING: Running pip install with root privileges is generally not a good idea. Try `pip3 install --user` instead. Collecting passlib==1.7.2 (from -r requirements.txt (line 1)) Installing collected packages: passlib Successfully installed passlib-1.7.2
生产密码会使用到Python的passlib模块web
[root@ansible ~]# python3 -c "from passlib.hash import sha512_crypt; import getpass; print(sha512_crypt.using(rounds=5000).hash(getpass.getpass()))" Password: $6$irgqm/Fea6/O07B7$LJpYtZoKqUkF.pN4D71LX2Cac3TNrF2.1GKGLfaSWxvKupknNLbWNcYym3LuojT3BqUeUCgsrmD/M6FqTx4lK/
输入明文密码会生成密码密文,复制该密码,后面建立用户时会用到。shell
[root@ansible ansible]# pwd /etc/ansible [root@ansible ansible]# more prometheus.yaml --- - hosts: "{{ hostlist }}" gather_facts: no roles: - role: prometheus
[root@ansible ~]# cd /etc/ansible/roles [root@ansible roles]# more prometheus/tasks/main.yml --- # tasks file for prometheus # author: loong576 - name: user search shell: id {{ user_name }} register: user_search ignore_errors: true - name: user add user: name: "{{ user_name }}" shell: "{{ user_bash }}" password: "{{ user_password }}" when: user_search.failed == true - name: file search stat: path: "{{ file_dest }}/{{ file_src }}" register: file_search - name: copy files copy: src: "{{ file_src }}" dest: "{{ file_dest }}" mode: 0755 when: file_search.stat.exists == false - name: process search shell: "ps -ef|grep node_exporter |grep -v grep" register: process ignore_errors: true - name: install node_exporter environment: dest: "{{ file_dest }}" src: "{{ file_src }}" port: "{{ node_port }}" script: startup.sh register: start tags: start when: process.failed == true - name: exec when startup lineinfile: dest: /etc/rc.local line: nohup {{ file_dest }}/{{ file_src }} --web.listen-address=:{{ node_port }} >/dev/null &
执行逻辑为:判断被执行主机上有无监控用户,若无则新增;判断被执行主机有无客户端文件,若无则拷贝;判断被执行主机有无客户端进程,若无则拉起;最后设置客户端进程开机自启动。bash
[root@ansible roles]# more prometheus/defaults/main.yml --- # defaults file for prometheus user_name: sysmonitor user_bash: /bin/bash user_password: $6$bB7R8JF3U7L7s/3E$fKOQwpoZ7RESfMmX6uqts1gw4yeXniRNctI2JRBRS2/120EgrHCWS3DboiRhO5sN0CjoVxvtAKgeDVQRaPlc0/ file_src: node_exporter file_dest: /home/sysmonitor node_port: 9100
定义监控用户的用户名、shell、密码,客户端执行文件的文件名、文件路径和端口。服务器
[root@ansible roles]# ll prometheus/files/ 总用量 16512 -rw-r--r-- 1 root root 16900416 7月 30 16:04 node_exporter -rwxr--r-- 1 root root 102 7月 31 11:32 startup.sh [root@ansible roles]# more prometheus/files/startup.sh #/bin/bash echo $dest echo $src echo $port nohup $dest/$src --web.listen-address=:$port >/dev/null &
file文件有两个,node_exporter为客户端执行文件,startup.sh为客户端进程拉起脚本。ide
[root@ansible ansible]# pwd /etc/ansible [root@ansible ansible]# ansible-playbook prometheus.yaml -e hostlist=test50
‘ -e hostlist=test50’指定被执行的主机为test50,即172.27.34.50
登录被管主机test50,发现监控用户和监控进程都在且加入到了开机自启动文件中,符合预期。
本文全部脚本和配置文件已上传github:ansible-production-practice-2
更多请点击:ansible系列文章