Monitoring Process Counts with Zabbix LLD

Purpose

  • Monitor the number of running processes for specific services and raise alerts on it

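Approach

  1. Use Zabbix LLD to discover automatically which services each machine runs and how many processes each service should have
  2. Every 30s, the Zabbix Agent reports to the Zabbix Server which services the machine is running and the current process count of each
  3. The development team provides the config file proccessInfo.txt (IP, service name, process count), which serves as the monitoring baseline
  4. proccessInfo.txt must be regenerated automatically on every configuration change so that it always reflects the latest state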
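
The article does not show proccessInfo.txt itself. Judging from the fields listed in step 3, each line presumably holds an IP, a service name, and the expected process count, separated by spaces. A minimal Python 3 sketch under that assumption (the sample lines and the `parse_process_info` helper are illustrative, not taken from the original):

```python
def parse_process_info(text):
    """Parse 'IP service_name process_count' lines into a dict keyed by IP."""
    expected = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        ip, service, count = fields
        expected.setdefault(ip, {})[service] = int(count)
    return expected


# Hypothetical file content; the real file is generated by the deployment process.
sample = """\
192.168.1.2 p_q1_server 3
192.168.1.2 p_gate_server 2
192.168.1.3 p_world_d1_server 1
"""

expected = parse_process_info(sample)
print(expected["192.168.1.2"])
```

Keying the result by IP makes it easy for each host to pick out only its own services, which is what the monitoring script needs.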

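Configuration workflow

  1. LLD discovery script
  2. Data collection script
  3. Add the keys on the Agent
  4. Add a template group on the Zabbix Server
  5. Create the discovery rule (item prototype, alert trigger)
  6. Add the current-process-count item (Zabbix Trapper type, sent from the Agent side)
  7. Define the alert content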

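Detailed steps

LLD discovery script

LLD discovery reports each process name and its process total (the count read from proccessInfo.txt) to the Zabbix Server:
/usr/bin/python services.py services_list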
 
{
    "data": [
        {
            "{#SERVICENAME}": "192.168.1.2-p_q1_server", 
            "{#TRIGGER_VALUE}": 3
        }, 
        {
            "{#SERVICENAME}": "192.168.1.2-p_world_d2_server", 
            "{#TRIGGER_VALUE}": 1
        }, 
        {
            "{#SERVICENAME}": "192.168.1.2-p_gate_server", 
            "{#TRIGGER_VALUE}": 2
        }, 
        {
            "{#SERVICENAME}": "192.168.1.2-p_world_d1_server", 
            "{#TRIGGER_VALUE}": 1
        }
    ]
}
 
Data collection and reporting: /usr/bin/python services.py {HOST.HOST}
 
# -*- coding: utf-8 -*-
# services.py: LLD discovery and process-count reporting for Zabbix.
# Compatible with Python 2.7 and Python 3.

import json
import re
import subprocess
import sys


class services_monitor:

    def __init__(self):
        self.zabbix_server_ip = '192.168.1.1'
        self.info_path = '/home/proccessInfo.txt'
        self.data_path = '/tmp/.process_number_monitor.log'

    def ip(self):
        # Return this host's non-loopback IPv4 addresses joined with '|'.
        # 'inet addr:' matches the older net-tools ifconfig output format.
        ipstr = r'([0-9]{1,3}\.){3}[0-9]{1,3}'
        ipconfig_process = subprocess.Popen("ifconfig", stdout=subprocess.PIPE)
        output = ipconfig_process.communicate()[0].decode('utf-8', 'replace')
        ip_pattern = re.compile('(inet addr:%s)' % ipstr)
        pattern = re.compile(ipstr)
        iplist = []
        for ipaddr in re.finditer(ip_pattern, output):
            ip = pattern.search(ipaddr.group())
            if ip.group() != "127.0.0.1":
                iplist.append(ip.group())
        return '|'.join(iplist)

    def check_proc(self, proc_name):
        # Count running processes whose ps line contains proc_name.
        cmd = 'ps -ef | grep %s | grep -v grep | wc -l' % proc_name
        proccess_info = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
        procss_num = proccess_info.communicate()[0]
        return int(procss_num.strip())

    def get_info(self, ip):
        # Build the LLD JSON from the lines of proccessInfo.txt whose first
        # field is one of this host's IPs. Line format: "IP service_name count".
        ips = set(ip.split('|'))
        service = []
        with open(self.info_path) as f:
            for line in f:
                i = line.split()
                # Skip blank or malformed lines and lines for other hosts.
                if len(i) < 3 or i[0] not in ips:
                    continue
                service.append({"{#SERVICENAME}": i[0] + "-" + i[1], "{#TRIGGER_VALUE}": int(i[2])})
        data = json.dumps({'data': service}, sort_keys=True, indent=4)
        return data

    def collect_data(self, data):
        # Write one "<host>\t<key>\t<value>" line per service: the input
        # format read by zabbix_sender -i. The file is truncated on each run.
        data = json.loads(data)["data"]
        with open(self.data_path, 'w') as f:
            for i in data:
                ip, proc_name = i['{#SERVICENAME}'].split('-', 1)
                f.write('%s\tproc_num[%s]\t%s\n' % (ip, i['{#SERVICENAME}'], self.check_proc(proc_name)))

    def send_data(self, data_path):
        # Push the collected values to the Zabbix Server and print the exit status.
        cmd = '/bin/bash -c "zabbix_sender -z %s -i %s &>/dev/null"' % (self.zabbix_server_ip, data_path)
        status = subprocess.call(cmd, shell=True)
        print(status)


if __name__ == '__main__':

    services = services_monitor()
    ip = services.ip()
    data = services.get_info(ip)
    # "services_list" (or no argument) prints the LLD JSON; any other argument,
    # such as {HOST.HOST}, collects the counts and sends them to the server.
    if len(sys.argv) < 2 or sys.argv[1] == "services_list":
        print(data)
    else:
        services.collect_data(data)
        services.send_data(services.data_path)
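
The script counts processes with a `ps | grep | wc -l` pipeline and writes one tab-separated line per service for `zabbix_sender -i`. As an illustration of those two steps in isolation, here is a Python 3 sketch that counts matches in captured `ps` output and formats the sender line. The helper names, the sample command lines, and their paths are assumptions made for the example; only the `proc_num[...]` key comes from the script.

```python
def count_processes(ps_output, proc_name):
    """Count command lines (one per line) that contain proc_name."""
    return sum(1 for line in ps_output.splitlines() if proc_name in line)


def sender_line(host, service_name, count):
    """Format one '<host> <key> <value>' line for zabbix_sender -i."""
    return "%s\tproc_num[%s]\t%d" % (host, service_name, count)


# Hypothetical output of `ps -eo args`, one command line per process.
ps_output = """\
/home/opt/bin/p_q1_server --id 1
/home/opt/bin/p_q1_server --id 2
/home/opt/bin/p_gate_server
"""

count = count_processes(ps_output, "p_q1_server")
print(sender_line("192.168.1.2", "192.168.1.2-p_q1_server", count))
```

Like the grep pipeline, this is a substring match, so a service name that is a prefix of another (for example `p_world_d1` and `p_world_d10`) would be over-counted; stricter matching is left to the reader's environment.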
 
 

Add the keys on the Agent

vim /usr/local/etc/zabbix_agentd.conf
UserParameter=dzpt.service.process.discovery,/usr/bin/python /home/opt/scripts/services.py services_list
UserParameter=dzpt.service.process.exec[*],/usr/bin/python /home/opt/scripts/services.py  $1
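
Before a discovery rule uses the discovery key, it can help to confirm that the script output has the shape low-level discovery expects: a JSON object whose `data` array maps LLD macros such as `{#SERVICENAME}` to values. The Python 3 sketch below performs that check. `check_lld` is a hypothetical helper, and the macro-name pattern (uppercase letters, digits, underscore, dot) is an assumption about Zabbix's naming rule that should be checked against the documentation for the version in use.

```python
import json
import re

# Assumed LLD macro naming rule: {#NAME} with A-Z, 0-9, '_' and '.'.
MACRO = re.compile(r"^\{#[A-Z0-9_.]+\}$")


def check_lld(payload):
    """Return the discovered service names, or raise ValueError on bad macro names."""
    rows = json.loads(payload)["data"]
    for row in rows:
        bad = [name for name in row if not MACRO.match(name)]
        if bad:
            raise ValueError("invalid LLD macro names: %s" % bad)
    return [row["{#SERVICENAME}"] for row in rows]


payload = '{"data": [{"{#SERVICENAME}": "192.168.1.2-p_q1_server", "{#TRIGGER_VALUE}": 3}]}'
print(check_lld(payload))
```

In practice the payload would be the output of `/usr/bin/python /home/opt/scripts/services.py services_list` on the monitored host.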
 

Create the discovery rule (Trapper-type item prototype, alert trigger)

Add the current-process-count item
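
The article gives no details for the prototypes, so the following Python 3 sketch only illustrates the idea: for every discovered entry, Zabbix substitutes the LLD macros into the item and trigger prototypes. The `proc_num[...]` key is the one the script sends. The template name "Template Process Count", the `<` comparison, and the `expand` helper are assumptions for the example, and the trigger string uses the expression syntax of Zabbix versions before 5.4.

```python
def expand(prototype, entry):
    """Replace each LLD macro in prototype with its value from one discovery entry."""
    for macro, value in entry.items():
        prototype = prototype.replace(macro, str(value))
    return prototype


# One entry from the discovery output.
entry = {"{#SERVICENAME}": "192.168.1.2-p_q1_server", "{#TRIGGER_VALUE}": 3}

# Trapper item prototype key, matching what the script writes for zabbix_sender.
item_key = expand("proc_num[{#SERVICENAME}]", entry)

# Hypothetical trigger prototype: fire when fewer processes run than expected.
trigger = expand(
    "{Template Process Count:proc_num[{#SERVICENAME}].last()}<{#TRIGGER_VALUE}",
    entry,
)

print(item_key)
print(trigger)
```

Because the threshold comes from `{#TRIGGER_VALUE}`, a change in proccessInfo.txt reaches the trigger on the next discovery run without anyone editing it by hand.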

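Define the alert content

Defined in an Action (omitted here)

Then link the finished template to a host or to another template.

Finally

With Zabbix LLD you can set how often the items and monitoring thresholds are refreshed; when the config file changes, neither the thresholds nor the items need manual adjustment.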

Source: Geekwolf's Blog

This site's content is published under CC BY-NC-SA 4.0.