How can a Python 2 script extract from an nginx log the IPs that accessed a specific location today?

1. Currently the script can extract the IPs that visited today, but it has a bug: the first record it finds no longer contains the IP. It also cannot restrict the result to the IPs that accessed a specific location.


2. Code `

#!/usr/bin/env python
# coding:utf-8

import urllib
import json
import time
import re

ld = time.strftime("%d/%b/%Y", time.localtime())
url = "http://ip.taobao.com/service/."

def ip_find(ip):
    # Query the lookup service; on success return "country region city<TAB>isp".
    data = urllib.urlopen(url + ip).read()
    datadict = json.loads(data)
    for oneinfo in datadict:
        if "code" == oneinfo:
            if datadict[oneinfo] == 0:
                return datadict["data"]["country"] + datadict["data"]["region"] + datadict["data"]["city"] + "\t" + datadict["data"]["isp"]

def sort_value(s):
    # Sort the (ip, count) pairs by count, largest first.
    d = sorted(s.iteritems(), key=lambda t: t[1], reverse=True)
    return d

if __name__ == "__main__":
    with open("access.log") as f:
        content = f.read()
        # Capture everything that follows the first occurrence of today's date.
        patt = re.compile(ld + r"(.*)", re.S)
        result = re.search(patt, content).group(1)
    file = r"test.txt"
    with open(file, "w+") as f:
        f.write(result)

    with open("test.txt") as f:  # /opt/nginx/logs/
        d = {}
        for line in f:
            field = line.split()
            print field
            if field[0] not in d:
                d.setdefault(field[0], [])
            d[field[0]].append(field[0])

        s = {}
        for k in d:
            s[k] = len(d[k])
        s = sort_value(s)[0:10]
        print "IP\t\t\t\t\tIP"
        print "----------------------------------------------------------------------------"
        for ip, con in s:
            print str(ip) + "\t\t" + str(con) + "\t\t" + ip_find(ip)`

3. Log file
106.38.121.196 - [08/Jun/2018:18:15:43 +0800] "POST /supervision/api/user/login.do HTTP/1.1" 200 503 "http://supervision.bangcle.com/" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.117 Safari/537.36"
180.168.174.128 - [11/Jun/2018:09:12:04 +0800] "POST /supervision/api/user/login.do HTTP/1.1" 200 491 "http://supervision.bangcle.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.79 Safari/537.36"
106.38.121.195 - [11/Jun/2018:18:11:04 +0800] "GET /static/js/vendor.180eb0f8247b996979d3.js HTTP/1.1" 304 0 "http://supervision.bangcle.com/" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.117 Safari/537.36"
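To filter by location, each record first has to be split into its parts. The sketch below is one possible way to do that with a single regular expression. It assumes the layout of the sample records above: an IPv4 address first, the time in square brackets, then the quoted request line. The names `LINE_RE` and `parse_line` are made up for this example, the user agent in the sample string is shortened, and the code avoids Python 2-only syntax so it runs on both 2.7 and 3.

```python
import re

# Assumed layout: IPv4 address, then [time], then "METHOD /path PROTOCOL".
LINE_RE = re.compile(
    r'^(?P<ip>\d{1,3}(?:\.\d{1,3}){3})'            # client IP at the start of the line
    r'.*?\[(?P<day>[^:\]]+):[^\]]*\]'              # [08/Jun/2018:18:15:43 +0800] -> 08/Jun/2018
    r'\s*"\s*(?P<method>[A-Z]+)\s+(?P<path>\S+)'   # "POST /path HTTP/1.1" -> POST, /path
)

def parse_line(line):
    """Return a dict with ip, day, method and path, or None if the line does not match."""
    m = LINE_RE.match(line)
    return m.groupdict() if m else None

# First sample record, with the user agent shortened for readability.
sample = ('106.38.121.196 - [08/Jun/2018:18:15:43 +0800] '
          '"POST /supervision/api/user/login.do HTTP/1.1" 200 503 '
          '"http://supervision.bangcle.com/" "Mozilla/5.0"')

record = parse_line(sample)
print(record["ip"] + " " + record["day"] + " " + record["path"])
```

With the parts separated, "today" becomes a comparison of `record["day"]` with `time.strftime("%d/%b/%Y")`, and "a specific location" becomes a check such as `record["path"].startswith(...)`.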

Mar.18,2021

ip: 106.38.121.196, views: 1
ip: 180.168.174.128, views: 2
total: 2
max_ip_view: {'106.38.121.195': 2}
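A per-IP summary in this shape can be produced by reading the log line by line instead of running one regex over the whole file. The following is only a sketch under assumptions: `count_views` is a hypothetical helper, the location prefix `/supervision/api/` is an example value, and the records are the three sample lines from the question with referrer and user agent left out. It runs on Python 2.7 and 3.

```python
from collections import Counter

def count_views(lines, day, location):
    """Count requests per IP for records dated `day` whose path starts with `location`."""
    views = Counter()
    for line in lines:
        try:
            stamp = line.split("[", 1)[1].split("]", 1)[0]  # e.g. 11/Jun/2018:09:12:04 +0800
            path = line.split('"')[1].split()[1]            # second word of the request line
        except IndexError:
            continue                                        # line does not fit the assumed layout
        if stamp.startswith(day + ":") and path.startswith(location):
            views[line.split()[0]] += 1                     # first token is the client IP
    return views

log_lines = [
    '106.38.121.196 - [08/Jun/2018:18:15:43 +0800] "POST /supervision/api/user/login.do HTTP/1.1" 200 503',
    '180.168.174.128 - [11/Jun/2018:09:12:04 +0800] "POST /supervision/api/user/login.do HTTP/1.1" 200 491',
    '106.38.121.195 - [11/Jun/2018:18:11:04 +0800] "GET /static/js/vendor.180eb0f8247b996979d3.js HTTP/1.1" 304 0',
]

views = count_views(log_lines, "11/Jun/2018", "/supervision/api/")
for ip, n in views.most_common():
    print("ip: %s, views: %d" % (ip, n))
print("total: %d" % sum(views.values()))
```

For these three records and the day `11/Jun/2018`, only the request from 180.168.174.128 matches, so the total is 1. On a live log, `day` would be `time.strftime("%d/%b/%Y")` as in the question, and `lines` can be the open file object itself, which avoids the intermediate `test.txt`.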


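The regex starts capturing after the date, so the part of the first matching line that precedes the date (the IP) is dropped. You can put it back by taking the text before the first occurrence of the date, keeping only its last line, and prepending that and the date: result = re.split(ld, content)[0].split('\n')[-1] + ld + re.search(patt, content).group(1)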
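The same repair can also be written without rebuilding the string, by cutting the log at the start of the line that holds the first match. This is a sketch of that alternative, not the answer's own code: `today_slice` is a hypothetical helper, and the sample records are the ones from the question with referrer and user agent left out. It runs on Python 2.7 and 3.

```python
import re

def today_slice(content, day):
    """Return the log text from the start of the first line dated `day` to the end."""
    m = re.search(re.escape(day), content)
    if m is None:
        return ""                                        # no record for that day
    line_start = content.rfind("\n", 0, m.start()) + 1   # 0 if the match is on the first line
    return content[line_start:]

content = (
    '106.38.121.196 - [08/Jun/2018:18:15:43 +0800] "POST /supervision/api/user/login.do HTTP/1.1" 200 503\n'
    '180.168.174.128 - [11/Jun/2018:09:12:04 +0800] "POST /supervision/api/user/login.do HTTP/1.1" 200 491\n'
    '106.38.121.195 - [11/Jun/2018:18:11:04 +0800] "GET /static/js/vendor.180eb0f8247b996979d3.js HTTP/1.1" 304 0\n'
)
day = "11/Jun/2018"

# Pattern from the question: the capture starts after the date, so the first IP is lost.
dropped = re.search(re.escape(day) + r"(.*)", content, re.S).group(1)
print(dropped.split()[0])   # first token is a fragment of the timestamp, not an IP

# Cutting at the start of the matching line keeps the first record whole.
kept = today_slice(content, day)
print(kept.split()[0])      # first token is the IP of the first record of the day
```

Both approaches assume the log is in chronological order, so everything after the first record of the day also belongs to that day.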