Published: 2014-09-05 17:07:14 Author: 知识屋
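Goal: while monitoring the service on port 3306, start MySQL remotely when the first soft CRITICAL state occurs, or when the first hard CRITICAL state occurs after the previous attempt did not succeed, and record the reported event in a database before every remote restart.
Technologies involved:
(1) How Nagios event handlers work
(2) Running commands over SSH with passwordless login
(3) Accessing MySQL from Perl
If you are comfortable with all three, this article should be easy to follow.
##Getting started##
Preparation
I. Set up passwordless SSH login
Objective: let the nagios user log in to the server without a password
Most people have set up passwordless login for root. Here it is set up for an ordinary user, nagios (thanks to my colleague for the technical help).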
Role      Host_ip        Notes
Client    192.168.x.x    Nagios monitoring host; acts as the client so it can run scripts remotely
Server    192.168.x.y    Holds the service start scripts, e.g. the mysql script
Setup on the client (192.168.x.x)
---------------------------------------------------------------------------------------------------
(1) Create the nagios user (steps omitted; the server needs this user too)
(2) As the nagios user (su - nagios), run:
ssh-keygen -t rsa
Press Enter at every prompt; no passphrase is needed.
(3) Copy the public key to the nagios home directory on the server
[nagios@nagios ~]$ scp .ssh/id_rsa.pub nagios@192.168.x.y:/home/nagios/
The authenticity of host '192.168.x.y (192.168.x.y)' can't be established.
RSA key fingerprint is 66:9a:b5:86:3d:81:22:9b:f8:67:9e:af:aa:4c:4a:97.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.x.y' (RSA) to the list of known hosts.
nagios@192.168.x.y's password:
id_rsa.pub 100% 411 0.4KB/s 00:00
---------------------------------------------------------------------------------------------------
Setup on the server (192.168.x.y)
--------------------------------------------------------------------------------------------------
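(1) Log in to the server as the nagios user
(2) Create the directory: mkdir /home/nagios/.ssh
(3) Append the public key to the authorized_keys file:
cat /home/nagios/id_rsa.pub >> /home/nagios/.ssh/authorized_keys
(4) Fix the permissions (as root, or as nagios if granted through visudo):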
chmod 700 /home/nagios/.ssh
chmod 600 /home/nagios/.ssh/authorized_keys
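Because step (3) appends with >>, repeating it leaves duplicate key lines in authorized_keys. The following is a minimal sketch of an idempotent variant; the helper name add_key is hypothetical and not part of the original setup. It creates the directory, applies the modes from step (4), and appends the key only when that exact line is missing.

```shell
#!/bin/sh
# add_key PUBKEY_FILE AUTH_FILE  (hypothetical helper, for illustration only)
# Appends the public key to AUTH_FILE unless the same line is already there.
add_key() {
    pubkey_file=$1
    auth_file=$2
    auth_dir=$(dirname "$auth_file")

    # Create the directory and file with the modes sshd expects
    mkdir -p "$auth_dir" && chmod 700 "$auth_dir" || return 1
    touch "$auth_file" && chmod 600 "$auth_file" || return 1

    # -F fixed string, -x whole-line match, -f read the pattern from the key file
    if ! grep -qxFf "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}

# Example, run as nagios on the server:
#   add_key /home/nagios/id_rsa.pub /home/nagios/.ssh/authorized_keys
```

Calling it twice with the same key leaves a single line in the file.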
Check
Permission check on the server
[root@centos-server nagios]# ls -la /home/nagios|grep .ssh
drwx------ 2 nagios nagios 4096 Aug 3 09:04 .ssh
[root@centos-server nagios]# ls -la /home/nagios/.ssh/
total 12
drwx------ 2 nagios nagios 4096 Aug 3 09:04 .
drwx------ 4 nagios nagios 4096 Aug 3 09:03 ..
-rw------- 1 nagios nagios 411 Aug 3 09:04 authorized_keys
Make sure the .ssh directory has mode 700 and authorized_keys has mode 600,
and that both are owned by the nagios user.
Login test from the client
[nagios@nagios ~]$ ssh nagios@192.168.x.y
Last login: Wed Aug 3 09:15:59 2011 from 192.168.x.x
[nagios@centos-server ~]$
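As you can see, logging in from 192.168.x.x to 192.168.x.y no longer requires a password.
If it does not work for you, check the permissions described above first. I once cost my colleague half a day over exactly this kind of permission problem.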
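Since wrong permissions are the usual reason key-based login fails, the check can be scripted. This is a rough sketch, assuming GNU stat (as found on CentOS); the function name check_ssh_perms is made up for illustration. It reports every mismatch in mode or owner and returns non-zero if it finds any.

```shell
#!/bin/sh
# check_ssh_perms HOME_DIR OWNER  (hypothetical helper; assumes GNU stat)
# Verifies mode 700 on .ssh, mode 600 on authorized_keys, and the owner of both.
check_ssh_perms() {
    ssh_dir="$1/.ssh"
    auth_file="$ssh_dir/authorized_keys"
    owner=$2
    status=0

    [ "$(stat -c %a "$ssh_dir" 2>/dev/null)" = "700" ] ||
        { echo "$ssh_dir: mode should be 700"; status=1; }
    [ "$(stat -c %a "$auth_file" 2>/dev/null)" = "600" ] ||
        { echo "$auth_file: mode should be 600"; status=1; }
    [ "$(stat -c %U "$ssh_dir" 2>/dev/null)" = "$owner" ] ||
        { echo "$ssh_dir: owner should be $owner"; status=1; }
    [ "$(stat -c %U "$auth_file" 2>/dev/null)" = "$owner" ] ||
        { echo "$auth_file: owner should be $owner"; status=1; }

    return $status
}

# Example, run on the server:
#   check_ssh_perms /home/nagios nagios
```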
II. Run commands remotely over passwordless login
Objective: let the nagios user start the MySQL service on the server remotely
-----------------------------------------------------------------------------------------------
Setup on the server (192.168.x.y)
------------------------------------------------------------------------------------------------
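(1) Set up the MySQL control script
Run the following SQL statements to create a user (admin) with root-level privileges and the password controlmysql:
GRANT ALL PRIVILEGES ON *.* TO 'admin'@'localhost' IDENTIFIED BY 'controlmysql';
GRANT ALL PRIVILEGES ON *.* TO 'admin'@'127.0.0.1' IDENTIFIED BY 'controlmysql';
Purpose: this account is used to start and shut down the MySQL service.
MySQL control script (start, stop and so on):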
#!/bin/sh
# MySQL control script: start, stop, restart or kill the instance on port 3306

mysql_port=3306
mysql_username="admin"
mysql_password="controlmysql"
mysql_scripts_path="/data0/mysql/3306"
mysqld_path="/usr/local/webserver/mysql"

start_mysql()
{
    printf "Starting MySQL...\n"
    # Discard stdout and stderr, and run mysqld_safe in the background
    /bin/sh ${mysqld_path}/bin/mysqld_safe --defaults-file=/data0/mysql/${mysql_port}/my.cnf > /dev/null 2>&1 &
}

stop_mysql()
{
    printf "Stopping MySQL...\n"
    ${mysqld_path}/bin/mysqladmin -u ${mysql_username} -p${mysql_password} -S /tmp/mysql.sock shutdown
}

restart_mysql()
{
    printf "Restarting MySQL...\n"
    stop_mysql
    sleep 5
    start_mysql
}

kill_mysql()
{
    # print (not printf) emits one PID per line, so kill receives them as separate arguments
    kill -9 $(ps -ef | grep 'bin/mysqld_safe' | grep -v 'grep' | awk '{print $2}')
    kill -9 $(ps -ef | grep 'libexec/mysqld' | grep -v 'grep' | awk '{print $2}')
}

if [ "$1" = "start" ]; then
    start_mysql
elif [ "$1" = "stop" ]; then
    stop_mysql
elif [ "$1" = "restart" ]; then
    restart_mysql
elif [ "$1" = "kill" ]; then
    kill_mysql
else
    printf "Usage: ${mysql_scripts_path}/mysql {start|stop|restart|kill}\n"
fi
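The start function launches mysqld_safe in the background, so the script returns before MySQL is actually accepting connections, and the restart relies on a fixed sleep 5. The following is a small sketch, not part of the original script, of polling until a check command succeeds; the helper name wait_until is hypothetical.

```shell
#!/bin/sh
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]  (hypothetical helper)
# Runs COMMAND once per second until it succeeds or the timeout is reached.
# Returns 0 on success, 1 if COMMAND is still failing after TIMEOUT_SECONDS.
wait_until() {
    timeout=$1
    shift
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example, using the paths and account from the control script:
#   wait_until 30 /usr/local/webserver/mysql/bin/mysqladmin \
#       -u admin -pcontrolmysql -S /tmp/mysql.sock ping
```

mysqladmin ping exits with 0 once the server responds, which makes it a convenient check command here.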
(2) Configure sudo so the nagios user can run the script
**If sudo is not installed: yum -y install sudo**
# visudo
Add:
nagios ALL=(root) NOPASSWD:/data0/mysql/3306/mysql start
Check
Script test on the server
[root@centos-server ~]# netstat -an|grep 3306
[root@centos-server ~]#
This shows that MySQL is not running.
[root@centos-server ~]# /data0/mysql/3306/mysql start
Starting MySQL...
[root@centos-server ~]# netstat -an|grep 3306
tcp 0 0 :::3306 :::* LISTEN
[root@centos-server ~]#
The script works as expected.
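Note that netstat -an|grep 3306 also matches other ports such as 33060 and outgoing connections to a remote port 3306. Below is a stricter sketch that reads netstat -an style output on standard input and succeeds only when a socket is listening on exactly the given local port; the function name port_listening is made up for illustration.

```shell
#!/bin/sh
# port_listening PORT  (hypothetical helper)
# Reads `netstat -an` style lines on stdin. Field 4 is the local address and
# the last field is the socket state. Exits 0 only for a LISTEN socket whose
# local address ends in ":PORT".
port_listening() {
    awk -v port="$1" '
        $NF == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Example:
#   netstat -an | port_listening 3306 && echo "MySQL port is listening"
```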
Test from the client (logged in as the nagios user)
[nagios@nagios ~]$ ssh nagios@192.168.x.y "sudo /data0/mysql/3306/mysql start"
sudo: sorry, you must have a tty to run sudo
Fix:
On the server, run visudo and comment out the following line:
Defaults requiretty
Try again:
[nagios@nagios ~]$ ssh nagios@192.168.x.y "sudo /data0/mysql/3306/mysql start"
Starting MySQL...
MySQL starts normally.
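Commenting out Defaults requiretty removes the tty requirement for every user. A narrower alternative is to lift it only for nagios. The sketch below writes such a fragment to a file; it assumes your sudo version reads drop-in files from /etc/sudoers.d (older releases may not), and the function name and file name are hypothetical.

```shell
#!/bin/sh
# write_nagios_sudoers FILE  (hypothetical helper)
# Writes a sudoers fragment that exempts only the nagios user from requiretty
# and repeats the NOPASSWD rule used in this article.
write_nagios_sudoers() {
    cat > "$1" <<'EOF'
# Allow the nagios user to run sudo without a tty
Defaults:nagios !requiretty
# nagios may run exactly this command as root, without a password
nagios ALL=(root) NOPASSWD: /data0/mysql/3306/mysql start
EOF
    chmod 0440 "$1"
}

# Example, as root (assumes /etc/sudoers.d is included by /etc/sudoers):
#   write_nagios_sudoers /etc/sudoers.d/nagios-mysql
#   visudo -cf /etc/sudoers.d/nagios-mysql   # syntax check before relying on it
```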
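Check on the server that port 3306 is listening.
Congratulations, the groundwork is done. Now we can move on to the Nagios configuration on the monitoring host.
III. Nagios configuration on the monitoring host
(1) The basic Nagios configuration files are as follows: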
mfs_hosts.cfg
define host{
use mfs-server
host_name mfs-192.168.x.y
alias mfs-192.168.x.y
address 192.168.x.y
}
mfs_hostgroups.cfg
define hostgroup{
hostgroup_name mfs-servers
alias Mfs Linux Servers
members mfs-192.168.x.y
}
mfs_services.cfg
define service {
name mfs-services
service_description checkport
check_command check_tcp!3306
check_period 24x7
max_check_attempts 2
normal_check_interval 3
retry_check_interval 1
notification_interval 5
notification_period 24x7
notification_options w,u,c,r
register 0
}
define service{
use mfs-services
host_name mfs-192.168.x.y
event_handler_enabled 1
event_handler restart_mysql
}
define service{
use mfs-services
host_name mfs-192.168.x.y
service_description PING
check_command check_ping!100.0,20%!500.0,60%
}
commands.cfg
define command{
command_name restart_mysql
command_line /usr/local/nagios/libexec/restart_mysql $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTADDRESS$
}
(2) Write /usr/local/nagios/libexec/restart_mysql
restart_mysql
#!/bin/sh
HostAddress=$4
debug=1
if [ $debug -eq 1 ];then
echo "MysqlServer:${HostAddress}" >>/tmp/ReMysql.log
fi
case "$1" in
OK)
;;
WARNING)
;;
UNKNOWN)
;;
CRITICAL)
case "$2" in
SOFT)
case "$3" in
1)
if [ $debug -eq 1 ];then
echo "Restarting Mysql service (1rd soft critical state)..." >>/tmp/ReMysql.log
fi
/usr/bin/ssh nagios@${HostAddress} "sudo /data0/mysql/3306/mysql start"
;;
esac
;;
HARD)
if [ $debug -eq 1 ];then
echo "Restarting Mysql service..." >>/tmp/ReMysql.log
fi
/usr/bin/ssh nagios@${HostAddress} "sudo /data0/mysql/3306/mysql start"
;;
esac
;;
esac
exit 0
Note: set debug to 1 while testing.
Note: for now this script only starts MySQL remotely; the part that writes to the database is added later.
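The nested case statements in the handler encode one rule: act on the first soft CRITICAL and on any hard CRITICAL, and ignore everything else. Below is a compact sketch of that same decision as a single function; the name should_restart is hypothetical, and its arguments correspond to $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$.

```shell
#!/bin/sh
# should_restart STATE STATETYPE ATTEMPT  (hypothetical helper)
# Returns 0 when the event handler should try to start MySQL, non-zero otherwise.
should_restart() {
    [ "$1" = "CRITICAL" ] || return 1
    case "$2" in
        SOFT) [ "$3" = "1" ] ;;   # only the first soft CRITICAL
        HARD) return 0 ;;         # any hard CRITICAL
        *)    return 1 ;;
    esac
}

# Example inside an event handler:
#   if should_restart "$1" "$2" "$3"; then
#       /usr/bin/ssh nagios@"$4" "sudo /data0/mysql/3306/mysql start"
#   fi
```

With max_check_attempts set to 2 as in the service template, the first failed check is SOFT attempt 1 and the second is HARD, so the handler may act on both.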
Check
Verify the Nagios configuration files:
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
If there are no errors, restart Nagios:
service nagios restart
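Start MySQL and the other related services on the monitored host and make sure everything is being monitored normally, as shown in the figure.
Now shut down the MySQL service cleanly.
Objective: when the first soft CRITICAL state occurs, try to restart MySQL.
The following four checks are enough to show that it works as intended.
(1) Check the Nagios status page on the monitoring host
(2) Check the script log on the monitoring host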
[root@nagios tmp]# tail -f ReMysql.log
MysqlServer:192.168.x.y
Restarting Mysql service (1rd soft critical state)...
(3) On the monitored host, check that the port is listening
[root@centos-server ~]# netstat -an|grep 3306
tcp 0 0 :::3306 :::* LISTEN
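(4) Check the Nagios status page on the monitoring host again
Note: at this point the first goal, restarting the service remotely, is done. Next, we record the events in MySQL.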
=======================================================
IV. Write the notification information to MySQL
Objective: write the errors reported by Nagios to a MySQL database
Role         Host_ip        Notes
Client       192.168.x.x    Nagios monitoring host; acts as the client and runs the script that writes errors to the database
DB Server    192.168.x.z    Database that stores the error records
On the DB server:
-----------------------------------------------------------------------------------------------
(1) Create the database
create database nagios;
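(2) Grant privileges
Run the following SQL statement to create a user (nagioslog) with the password 12345678 that can insert, update, delete and select data, and that may connect remotely from the Nagios monitoring host: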
GRANT ALL PRIVILEGES ON nagios.* TO 'nagioslog'@'192.168.x.x' IDENTIFIED BY '12345678';
Purpose: this account is used to insert, update, delete and browse data.
(3) Log in as the nagioslog user and create the log table
create table log(host_ip varchar(50),services_desc varchar(200),plugin_out varchar(500)) ENGINE=MyISAM DEFAULT CHARSET=utf8;
On the client:
-----------------------------------------------------------------------------------------
(1) Install the Perl modules for MySQL access
perl -MCPAN -e "install DBI"
perl -MCPAN -e "install DBD::mysql"
(2) Script that writes to MySQL
Perl script for accessing the remote MySQL server
#!/usr/bin/perl
# Last modified by Hahazhu 2011/08/03
use strict;
use warnings;
use DBI;

########## Initial definitions ##########
my $remote_mysql      = "192.168.x.z";
my $remote_db         = "nagios";
my $remote_mysql_user = "nagioslog";
my $remote_mysql_pwd  = "12345678";
my $debug             = 1;

########## Received values ##########
my ($host_ip, $service_desc, $plugin_out) = @ARGV;

my $dbh = DBI->connect("DBI:mysql:database=$remote_db;host=$remote_mysql",
    $remote_mysql_user, $remote_mysql_pwd, { RaiseError => 1 });

# Placeholders let DBI quote the values, so quotes in the plugin output cannot break the statement
my $rows = $dbh->do(
    "INSERT INTO log (host_ip, services_desc, plugin_out) VALUES (?, ?, ?)",
    undef, $host_ip, $service_desc, $plugin_out);

if ($debug) {
    print "$rows row(s) affected\n";
    # Dump the whole table to confirm the insert
    my $sth = $dbh->prepare("SELECT host_ip, services_desc, plugin_out FROM log");
    $sth->execute();
    while (my @data = $sth->fetchrow_array()) {
        print "$data[0] $data[1] $data[2]\n";
    }
}
$dbh->disconnect();
Note: set $debug to 1 before testing.
Check
On the Nagios host, run the insert script as the nagios user
[nagios@nagios libexec]$ perl insert_log_to_mysql.pl 1.1.1.1 check_3306 "connection refused"
1 row(s) affected
1.1.1.1 check_3306 connection refused
Check on the DB server
mysql> select * from log;
+---------+---------------+--------------------+
| host_ip | services_desc | plugin_out |
+---------+---------------+--------------------+
| 1.1.1.1 | check_3306 | connection refused |
+---------+---------------+--------------------+
1 row in set (0.00 sec)
OK, the script tests cleanly. What remains is to hook it into the Nagios configuration.
V. Adjust the Nagios service configuration
commands.cfg
define command{
command_name restart_mysql
command_line /usr/local/nagios/libexec/restart_mysql $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTADDRESS$ $SERVICEDESC$ "$SERVICEOUTPUT$"
}
The MySQL start script restart_mysql needs to be adjusted accordingly:
#!/bin/sh
HostAddress=$4
Services_desc=$5
Plugin_out=$6
debug=1
if [ $debug -eq 1 ];then
echo "MysqlServer:${HostAddress}" >>/tmp/ReMysql.log
fi
case "$1" in
OK)
;;
WARNING)
;;
UNKNOWN)
;;
CRITICAL)
case "$2" in
SOFT)
case "$3" in
1)
if [ $debug -eq 1 ];then
echo "Restarting Mysql service (1rd soft critical state)..." >>/tmp/ReMysql.log
fi
/usr/bin/perl /usr/local/nagios/libexec/insert_log_to_mysql.pl "${HostAddress}" "${Services_desc}" "${Plugin_out}"
/usr/bin/ssh nagios@${HostAddress} "sudo /data0/mysql/3306/mysql start"
;;
esac
;;
HARD)
if [ $debug -eq 1 ];then
echo "Restarting Mysql service..." >>/tmp/ReMysql.log
fi
/usr/bin/perl /usr/local/nagios/libexec/insert_log_to_mysql.pl "${HostAddress}" "${Services_desc}" "${Plugin_out}"
/usr/bin/ssh nagios@${HostAddress} "sudo /data0/mysql/3306/mysql start"
;;
esac
;;
esac
exit 0
Note: set debug to 1 while debugging.
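Plugin output such as "Connection refused" contains a space, so whether the shell variable is quoted decides how many arguments the Perl script receives. A short demonstration follows; count_args is a throwaway helper introduced only for this illustration.

```shell
#!/bin/sh
# count_args prints how many arguments it received (illustrative helper)
count_args() {
    echo "$#"
}

Plugin_out="Connection refused"

# Unquoted: the shell splits the value on whitespace into two arguments
count_args ${Plugin_out}      # prints 2

# Quoted with plain ASCII double quotes: the value stays one argument
count_args "${Plugin_out}"    # prints 1
```

Typographic quotes (“ ”) are not shell quoting characters; they are passed through as literal text and the value is still split, so only ASCII double quotes keep the plugin output together.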
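Check
This is the last part of the article, which is a little exciting...
Let's verify that the following goal is met.
Objective:
Whenever the MySQL service is restarted, the related log entry must be recorded in a MySQL database on another machine.
Test: stop the MySQL service
Status page on the Nagios host:
Log on the Nagios host: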
[root@nagios ~]# tail -f /tmp/ReMysql.log
MysqlServer:192.168.x.y
Restarting Mysql service (1rd soft critical state)...
Now check the MySQL server:
[root@centos-server ~]# netstat -an|grep 3306
tcp 0 0 :::3306 :::* LISTEN
Then check the recorded log entry:
mysql> select * from nagios.log;
+--------------+---------------+--------------------+
| host_ip | services_desc | plugin_out |
+--------------+---------------+--------------------+
| 192.168.x.y | checkport | Connection refused |
+--------------+---------------+--------------------+
1 row in set (0.00 sec)
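OK, the goal is met: the service was started remotely and the error was recorded as well.
That concludes this article. I am sure you will come up with more ways to extend it...
In the next article, I will walk through distributed monitoring with Nagios.
This article comes from the “坏男孩” blog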