
Useful Analysis Scripts for Nginx Logs

The following scripts were collected from around the web; I have only tidied them up.

  Assume your nginx log file is main.log.
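The field numbers used in the scripts ($4, $7, $10, $NF) depend on your log_format. As a rough sketch, assuming the default combined format with $request_time appended as the last field (both the format and the sample line are assumptions made for illustration), this shows how awk splits one log line:

```shell
# Hypothetical sample line: combined format plus $request_time at the end.
sample='192.0.2.1 - - [10/Oct/2023:13:55:36 +0800] "GET /pages/index.html?id=1 HTTP/1.1" 200 2326 "-" "Mozilla/5.0" 0.123'

# Print the whitespace-separated fields that the scripts rely on.
show_fields() {
    awk '{
        print "time   ($4, trimmed): " substr($4, 2, 20)
        print "url    ($7):          " $7
        print "status ($9):          " $9
        print "bytes  ($10):         " $10
        print "rtime  ($NF):         " $NF
    }'
}

printf '%s\n' "$sample" | show_fields
```

If your log_format orders things differently, run one of your own lines through show_fields first and adjust the field numbers accordingly.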

1. Get the total number of requests
    wc -l < main.log

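2. Average requests per second

The average is taken over the seconds that actually appear in the log, so idle seconds are not counted.

awk '{sec=substr($4,2,20);reqs++;reqsBySec[sec]++;} END{print reqs/length(reqsBySec)}' main.log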

3. Peak requests per second

awk '{sec=substr($4,2,20);requests[sec]++;} END{for(s in requests){printf("%s %s\n", requests[s],s)}}' main.log | sort -nr | head -n 3
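Per-second peaks can be noisy on low-traffic sites. A minimal variation, assuming the same timestamp layout in $4, groups by minute instead by shortening the substr length from 20 to 17 characters. The function name peak_per_minute is made up for this sketch:

```shell
# Top N busiest minutes in an access log (N defaults to 3).
# Usage: peak_per_minute main.log [N]
peak_per_minute() {
    awk '{
        minute = substr($4, 2, 17)   # e.g. 10/Oct/2023:13:55
        requests[minute]++
    }
    END {
        for (m in requests) printf("%s %s\n", requests[m], m)
    }' "$1" | sort -nr | head -n "${2:-3}"
}
```

Each output line is the request count followed by the minute, with the busiest minute first.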

4. Traffic rate analysis

The output columns are total MB, average KB per request, request count, and URL.

awk '{url=$7; requests[url]++; bytes[url]+=$10}
END{for(url in requests){printf("%sMB %sKB/req %s %s\n", bytes[url] / 1024 / 1024, bytes[url] / requests[url] / 1024, requests[url], url)}}' main.log | sort -nr | head -n 15

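5. Use response time to get a rough idea of the CPU time consumed by each URL

This assumes $request_time is the last field of your log_format ($NF); adjust the field if yours differs. The query string, the file extension, and numeric path segments are stripped so that similar URLs are grouped together. The output columns are total minutes, average seconds per request, request count, and URL.

awk '{url=$7; sub(/\?.*/,"",url); sub(/\.[^.]*$/,"",url); gsub(/\/[0-9]+/,"/*",url); requests[url]++; time[url]+=$NF}
END{for(url in requests){printf("%smin %ss/req %s %s\n", time[url] / 60, time[url] / requests[url], requests[url], url)}}' main.log | sort -nr | head -n 50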

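6. Print the request frequency of each crawler ($http_user_agent), or check whether a specific page has been crawled recently

The field numbers $15 and $17 depend on your log_format and on where the crawler name falls in the user-agent string; adjust them for your logs.

grep -E 'spider|bot' main.log | awk '{name=$17; if(index($15,"spider")>0){name=$15}; spiders[name]++}
END{for(name in spiders){printf("%s %s\n", spiders[name], name)}}' | sort -nr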
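For the second half of this item, checking whether a particular page has been crawled recently, a sketch along the following lines may help. It assumes the log is in chronological order and that the timestamp and URL sit in $4 and $7. The function name last_crawled is hypothetical, and the case-insensitive match (-i) is a choice made here rather than part of the one-liner above:

```shell
# Show the most recent crawler hits on one URL path (last N, default 5).
# Usage: last_crawled main.log /pages/index.html [N]
last_crawled() {
    # grep keeps lines whose text mentions a spider or bot;
    # awk keeps those whose URL ($7) starts with the given path,
    # using index() so the path is matched literally, not as a regex.
    grep -Ei 'spider|bot' "$1" |
        awk -v path="$2" 'index($7, path) == 1 { print substr($4, 2, 20), $7 }' |
        tail -n "${3:-5}"
}
```

An empty result means no line matching spider or bot requested that path within the period covered by the log file.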

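7. Most visited and slowest pages (slow queries)

#!/bin/bash

function usage()
{
   echo "Usage: $0 logfile option (1 = sort by hit count, 2 = sort by average time)";
   exit 1;
}

function slowlog()
{
#set -x;
field=$2;
files=$1;
end=2;
msg="";

[[ $2 == '1' ]] && field=1&&end=2&&msg="Total hit count statistics";
[[ $2 == '2' ]] && field=3&&end=4&&msg="Average response time statistics";

echo -e "\r\n\r\n";
echo -n "$msg";
seq -s '#' 30 | sed -e 's/[0-9]*//g';

# Output columns: URL, hits, total time, average time.
# $NF is assumed to be $request_time (last field of the log_format).
# The grep keeps only URLs containing "pages"; change or remove it for your site.
awk '{split($7,bbb,"?");arr[bbb[1]]=arr[bbb[1]]+$NF; arr2[bbb[1]]=arr2[bbb[1]]+1; } END{for ( i in arr ) { print i":"arr2[i]":"arr[i]":"arr[i]/arr2[i]}}' "$files" | sort -t: -k"$((field + 1)),$end" -rn | grep "pages" | head -30 | sed 's/:/\t/g'
}

[[ $# -lt 2 ]] && usage;

slowlog "$1" "$2";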
