Details of using Python regular expressions.

00:00:00    2982199073774412    [360]    8 3    download.it.com.cn/softweb/software/firewall/antivirus/20067/17938.html
00:00:00    07594220010824798    []    1 1    news.21cn.com/social/daqian/2008/05/29/4777194_1.shtml
00:00:00    5228056822071097    [75810]    14 5    www.greatoo.com/greatoo_cn/list.asp?link_id=276&title=%BE%DE%C2%D6%D0%C2%CE%C5
00:00:00    6140463203615646    []    62 36    www.jd-cd.com/jd_opus/xx/200607/706.html
00:00:00    8561366108033201    []    3 2    www.big38.net/
00:00:00    23908140386148713    []    1 2    www.chinabaike.com/article/81/82/110/2007/2007020724490.html
00:00:00    1797943298449139    []    8 5    www.6wei.net/dianshiju/????\xa1\xe9|????do=index
00:00:00    00717725924582846    []    1 2    www.shanziba.com/
00:00:00    41416219018952116    []    2 6    bbs.gouzai.cn/thread-698736.html
00:00:00    9975666857142764    []    2 2    ks.cn.yahoo.com/question/1307120203719.html
00:00:00    21603374619077448    [111aa]    1 6    www.fotolog.com.cn/tags/aa111
00:00:00    7423866288265172    []    3 13    ks.cn.yahoo.com/question/1406051201894.html
00:00:00    0616877776407358    [tudou.com+]    2 9    topic.bindou.com/1487/
00:00:00    3933365481995287    []    6 3    ks.cn.yahoo.com/question/1407051001276.html
00:00:00    8242389147671512    [PP]    2 3    shwamlys.blog.sohu.com/76558184.html

I read this data row by row,
but I only need the data in the 2nd, 3rd and 4th columns.
The columns are separated by a TAB character, and the two numbers inside the fourth column are separated by a space.
How should I write the regular expression?
Any advice would be appreciated!

Feb.27,2021

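import re

# Columns are TAB-separated, so split on tabs only. Splitting on '\s+'
# would also break the 4th column ("8 3") at its space and drop the second number.
with open('1.txt', 'r') as f:
    result = [re.split(r'\t', line.rstrip('\n'))[1:4] for line in f]

print(result)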
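
If you want a single regular expression that also pulls the two numbers in the fourth column apart, here is a minimal sketch. It assumes every row has the layout described in the question (five TAB-separated columns, with the fourth holding two integers separated by one space). The names ROW, parse_rows and SAMPLE are made up for illustration, and the sample rows are copied from the question with the TABs written explicitly.

```python
import re

# Two rows from the question, with explicit TAB separators (assumed layout).
SAMPLE = (
    "00:00:00\t2982199073774412\t[360]\t8 3\t"
    "download.it.com.cn/softweb/software/firewall/antivirus/20067/17938.html\n"
    "00:00:00\t07594220010824798\t[]\t1 1\t"
    "news.21cn.com/social/daqian/2008/05/29/4777194_1.shtml\n"
)

# Skip column 1, capture columns 2 and 3 whole, and capture the two
# space-separated integers of column 4 separately. Column 5 is ignored.
ROW = re.compile(r'^[^\t]*\t([^\t]*)\t([^\t]*)\t(\d+) (\d+)\t')

def parse_rows(text):
    """Return (col2, col3, first_number, second_number) for each matching row."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # skip rows that do not fit the assumed layout
        col2, col3, first, second = m.groups()
        rows.append((col2, col3, int(first), int(second)))
    return rows

print(parse_rows(SAMPLE))
```

Column 2 is kept as a string on purpose, because some values start with 0 and converting them to int would lose the leading zero. To run this on the real file, pass the file contents to parse_rows instead of SAMPLE.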