|
3391617
Junior User
Points 116
Posts 56
Registered 2007-3-7
Status Offline
|
『Original Poster』:
[Help] How do I extract the IPs I want from a txt file? [Thanks, everyone]
</center><il><li><a title="AU" onMouseOver="s('AU')" onMouseOut="d()" class="D">144.140.22.190:80</a></li><li><a title="KR" onMouseOver="s('KR')" onMouseOut="d()" class="D">125.248.244.131:8080</a></li><li><a title="IN" onMouseOver="s('IN')" onMouseOut="d()" class="D">202.53.13.10:8080</a></li><li><a title="PH" onMouseOver="s('PH')" onMouseOut="d()" class="B">125.212.37.150:8080</a></li><li><a title="CN" onMouseOver="s('CN')" onMouseOut="d()" class="B">220.181.31.44:3128</a></li><li><a title="CO" onMouseOver="s('CO')" onMouseOut="d()" class="D">200-91-243-90-host.ifx.net.co:3128</a></li><li><a title="US" onMouseOver="s('US')" onMouseOut="d()" class="D">216.133.248.229:80</a></li><li><a title="US" onMouseOver="s('US')" onMouseOut="d()" class="D">216.133.248.228:80</a></li><li><a title="US" onMouseOver="s('US')" onMouseOut="d()" class="D">216.133.248.226:80</a></li><li><a title="JP" onMouseOver="s('JP')" onMouseOut="d()" class="D">neptun.ium.ne.jp:8094</a></li><li><a title="TR" onMouseOver="s('TR')" onMouseOut="d()" class="D">195.175.37.71:8080</a></li><li><a title="NL" onMouseOver="s('NL')" onMouseOut="d()" class="D">213.227.149.165:3128</a></li><li><a title="US" onMouseOver="s('US')" onMouseOut="d()" class="D">216.133.248.227:80</a></li></il><center>
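The content above is the file 1.txt (it is all on a single line, not the multiple lines it appears as here). How can I extract the following from 1.txt
144.140.22.190:80
xxx.xxx.xxx.xxx:xx
xxx.xxx.xxx.xxx:xx
...
...
...
xxx.xxx.xxx.xxx:xx
into a newly created file, 2.txt?
Ideally the type given in the class attribute would be extracted too, for example like this:
144.140.22.190:80----D
xxx.xxx.xxx.xxx:xx
xxx.xxx.xxx.xxx:xx
220.181.31.44:3128----B
...
...
xxx.xxx.xxx.xxx:xx
Thanks in advance!
[ Last edited by 3391617 on 2007-3-9 at 10:18 AM ]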
To extract each IP:port and its class attribute from 1.txt and save them to 2.txt, you can use a programming language such as Python. Here is a simple example using regular expressions:
```python
import re

# Read the content from 1.txt
with open('1.txt', 'r', encoding='utf-8') as file:
    content = file.read()

# Capture the class attribute and the numeric IP:port inside each <a> tag
pattern = r'<a\b[^>]*\bclass="(\w+)">(\d+\.\d+\.\d+\.\d+:\d+)</a>'
matches = re.findall(pattern, content)

# Write the results to 2.txt, one "IP:port----class" entry per line
with open('2.txt', 'w', encoding='utf-8') as file:
    for cls, ip_port in matches:
        file.write(f"{ip_port}----{cls}\n")
```
This code finds the numeric IP:port pairs and their class attributes, then writes them to 2.txt in the form IP:port----class. Entries that use a host name instead of an IP address are skipped. Run the script from the directory that contains 1.txt.
|
|
2007-3-9 02:41 |
|
|
ccwan
Gold Member
Points 2725
Posts 1160
Registered 2006-9-23, from Langfang, Hebei
Status Offline
|
『Floor 2』:
sed is a better fit for this kind of processing; let's leave it to someone familiar with the sed command.
|

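Where three walk together, one of them is sure to be my teacher. Only by learning do we find where we fall short, and only by teaching do we find where we struggle; then we can improve ourselves. |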
|
2007-3-9 02:50 |
|
|
vkill
Gold Member
Points 4103
Posts 1744
Registered 2006-1-20, from Linze, Gansu
Status Offline
|
『Floor 3』:
sed -r "s/>[^<>]*<\/a>/&\n/g" 1.txt|sed -r "s/.*class=\x22([A-Z]{1})\x22>([^<>]*)<\/a>$/\2--\1/;/--([A-Z]{1})$/!d"|more>2.txt
This is tailored to the sample only; I did not try to cover more general matching.
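For readers who do not have sed at hand, the same two-step idea (isolate each anchor, then swap the class and the anchor text) can be sketched in Python. This is only an illustrative sketch, not the poster's method: the function name `extract_entries` and the `sample` string are made up for the example, and it assumes the class is always a single capital letter, as in the sample data.

```python
import re

def extract_entries(html):
    """Return 'text--class' for every anchor that ends in class="X">text</a>."""
    # One pattern does the work of both sed passes: it captures the
    # single-letter class and the anchor text, and the result swaps them.
    anchor = re.compile(r'class="([A-Z])">([^<>]*)</a>')
    return [f"{text}--{cls}" for cls, text in anchor.findall(html)]

# Hypothetical three-entry excerpt in the same shape as 1.txt
sample = (
    '<li><a title="AU" onMouseOver="s(\'AU\')" onMouseOut="d()" class="D">144.140.22.190:80</a></li>'
    '<li><a title="CN" onMouseOver="s(\'CN\')" onMouseOut="d()" class="B">220.181.31.44:3128</a></li>'
    '<li><a title="JP" onMouseOver="s(\'JP\')" onMouseOut="d()" class="D">neptun.ium.ne.jp:8094</a></li>'
)

print("\n".join(extract_entries(sample)))
```

Like the sed command, this keeps host-name entries as well as numeric IPs, because the anchor text is matched with `[^<>]*` rather than a digits-only pattern.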
|
|
2007-3-9 03:22 |
|
|
slore
Platinum Member
Points 5212
Posts 2478
Registered 2007-2-8
Status Offline
|
『Floor 4』:
I don't know where the thread went^
『Original Poster』: How do I extract the content I want from a txt file?
</center><il><li><a title="AU" onMouseOver="s('AU')" onMouseOut="d()" class="D">144.140.22.190:80</a></li><li><a title="KR" onMouseOver="s('KR')" onMouseOut="d()" class="D">125.248.244.131:8080</a></li><li><a title="IN" onMouseOver="s('IN')" onMouseOut="d()" class="D">202.53.13.10:8080</a></li><li><a title="PH" onMouseOver="s('PH')" onMouseOut="d()" class="B">125.212.37.150:8080</a></li><li><a title="CN" onMouseOver="s('CN')" onMouseOut="d()" class="B">220.181.31.44:3128</a></li><li><a title="CO" onMouseOver="s('CO')" onMouseOut="d()" class="D">200-91-243-90-host.ifx.net.co:3128</a></li><li><a title="US" onMouseOver="s('US')" onMouseOut="d()" class="D">216.133.248.229:80</a></li><li><a title="US" onMouseOver="s('US')" onMouseOut="d()" class="D">216.133.248.228:80</a></li><li><a title="US" onMouseOver="s('US')" onMouseOut="d()" class="D">216.133.248.226:80</a></li><li><a title="JP" onMouseOver="s('JP')" onMouseOut="d()" class="D">neptun.ium.ne.jp:8094</a></li><li><a title="TR" onMouseOver="s('TR')" onMouseOut="d()" class="D">195.175.37.71:8080</a></li><li><a title="NL" onMouseOver="s('NL')" onMouseOut="d()" class="D">213.227.149.165:3128</a></li><li><a title="US" onMouseOver="s('US')" onMouseOut="d()" class="D">216.133.248.227:80</a></li></il><center>
The content above is the file 1.txt (it is all on a single line, not the multiple lines it appears as here). How can I extract the following from 1.txt
144.140.22.190:80
xxx.xxx.xxx.xxx:xx
xxx.xxx.xxx.xxx:xx
...
...
...
xxx.xxx.xxx.xxx:xx
into a newly created file, 2.txt?
Ideally the type given in the class attribute would be extracted too, for example like this:
144.140.22.190:80----D
xxx.xxx.xxx.xxx:xx
xxx.xxx.xxx.xxx:xx
220.181.31.44:3128----B
...
...
xxx.xxx.xxx.xxx:xx
Thanks in advance!
I only just read this thread, dated March 8! I don't know where it went^
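' Read the whole of 1.txt into a string
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objText = objFSO.OpenTextFile("D:\桌面\1.txt", 1)
inputstr = objText.ReadAll
objText.Close
' Put each entry on its own line by turning every </a> into a line break
outstr = Replace(inputstr, "</a>", vbCrLf)
Dim tep
tep = Split(outstr, vbCrLf)
outstr = Empty
Dim i
' Stop two pieces early to skip the trailing markup (assumes 1.txt ends with a line break)
For i = 0 To UBound(tep) - 2
    ' Keep the text from the class letter onward, e.g. D">144.140.22.190:80
    outstr = outstr & Right(tep(i), Len(tep(i)) - InStrRev(tep(i), ">") + 3) & vbCrLf
Next
' Turn "> into -- so that each line reads class--address
outstr = Replace(outstr, """>", "--")
Set objText = objFSO.OpenTextFile("D:\桌面\2.txt", 2, True)
objText.Write outstr
objText.Close
Result: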
D--144.140.22.190:80
D--125.248.244.131:8080
D--202.53.13.10:8080
B--125.212.37.150:8080
B--220.181.31.44:3128
D--200-91-243-90-host.ifx.net.co:3128
D--216.133.248.229:80
D--216.133.248.228:80
D--216.133.248.226:80
D--neptun.ium.ne.jp:8094
D--195.175.37.71:8080
D--213.227.149.165:3128
D--216.133.248.227:80
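A different route to the same class--address output is to let an HTML parser find the anchors instead of cutting the string by hand. The sketch below swaps in Python's standard `html.parser` module; it is not the VBS approach above, and the names `AnchorCollector` and `collect` are hypothetical helpers introduced only for this example.

```python
from html.parser import HTMLParser

class AnchorCollector(HTMLParser):
    """Collect (class, text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.entries = []            # finished (class, text) pairs
        self._current_class = None   # class of the <a> being read, if any
        self._text = []              # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # A missing class attribute is recorded as an empty string
            self._current_class = dict(attrs).get("class", "")
            self._text = []

    def handle_data(self, data):
        if self._current_class is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_class is not None:
            self.entries.append((self._current_class, "".join(self._text)))
            self._current_class = None

def collect(html):
    """Return 'class--text' lines for all anchors in html."""
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()
    return [f"{cls}--{text}" for cls, text in parser.entries]
```

Calling `collect()` on the text read from 1.txt and joining the result with line breaks should give the same kind of listing as shown above, without depending on how the file ends.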
|
|
2007-3-9 03:23 |
|
|
ccwan
Gold Member
Points 2725
Posts 1160
Registered 2006-9-23, from Langfang, Hebei
Status Offline
|
『Floor 5』:
It is in the recycle bin. I don't know how that happened.
|

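Where three walk together, one of them is sure to be my teacher. Only by learning do we find where we fall short, and only by teaching do we find where we struggle; then we can improve ourselves. |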
|
2007-3-9 03:41 |
|
|
namejm
Honorary Moderator
batch fan
Points 5226
Posts 1737
Registered 2006-3-10, from Chengdu
Status Offline
|
『Floor 6』:
The original poster's first title was very vague, and the thread's general content could not be grasped quickly from it, so I moved the thread to the recycle bin. After the original poster's edit the title meets the forum rules, so I have moved the thread back and merged the related threads.
|

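A foot can be too short and an inch can be long enough; learning CMD well is not up for debate.
Think a problem through in all its complexity; solve it as simply as possible. |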
|
2007-3-9 03:51 |
|
|
3391617
Junior User
Points 116
Posts 56
Registered 2007-3-7
Status Offline
|
『Floor 7』:
Is floor 4 VBS? The sed on floor 3 doesn't seem to work yet~~~~ Can't this be done in bat?
I was busy with work this afternoon and had no time to come to the forum!
Thanks to all of you for your help!
|
|
2007-3-9 06:11 |
|
|
namejm
Honorary Moderator
batch fan
Points 5226
Posts 1737
Registered 2006-3-10, from Chengdu
Status Offline
|
『Floor 8』:
The following code extracts the records in IP format, but it cannot extract the type attribute:
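@echo off
setlocal enabledelayedexpansion
cls
for /f "delims=" %%i in (1.txt) do (
    set "str=%%i"
    rem Strip the double quotes so the line can be passed as one quoted argument
    set "str=!str:"=!"
    call :pickup "!str!"
)
pause
exit

:pickup
rem The bracket expressions of the findstr pattern were lost from the post, so the [0-9] classes below are an assumed reconstruction
for /f "tokens=1* delims=<>" %%i in (%1) do (
    echo %%i|findstr /r "^[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*:[0-9]">nul&&echo %%i
    set "str=%%j"
    if defined str call :pickup "!str!"
)
goto :eof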
|

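A foot can be too short and an inch can be long enough; learning CMD well is not up for debate.
Think a problem through in all its complexity; solve it as simply as possible. |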
|
2007-3-9 06:59 |
|
|
3391617
Junior User
Points 116
Posts 56
Registered 2007-3-7
Status Offline
|
『Floor 9』:
It really does work well for the XXX.XXX.XXX.XXX:XX format.
I'll take it.
Thank you, namejm!
|
|
2007-3-9 08:09 |
|
|
youxi01
Senior User
Points 846
Posts 247
Registered 2006-10-27, from Hunan ==> Guangdong
Status Offline
|
『Floor 10』:
Borrowing namejm's code, this extracts both the IP-format records and the type attribute:
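@echo off
setlocal enabledelayedexpansion
for /f "delims=" %%i in (test.txt) do (
    set "str=%%i"
    set "str=!str:"=!"
    call :pickup "!str!"
)
pause>nul
rem Leave here so the script does not fall through into the subroutine
goto :eof

:pickup
for /f "tokens=1* delims=<" %%i in (%1) do (
    rem Only the pieces that contain a class attribute carry an entry
    echo "%%i"|findstr "class">nul && (
        for /f "tokens=1,2 delims=>" %%a in ("%%i") do (
            rem The class letter is the last character of the first token
            set class=%%a
            set class=!class:~-1!
            set IP=%%b
            echo !class!--!IP!
        )
    )
    set "str=%%j"
    if defined str call :pickup "!str!"
)
goto :eof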
Result:
D--144.140.22.190:80
D--125.248.244.131:8080
D--202.53.13.10:8080
B--125.212.37.150:8080
B--220.181.31.44:3128
D--200-91-243-90-host.ifx.net.co:3128
D--216.133.248.229:80
D--216.133.248.228:80
D--216.133.248.226:80
D--neptun.ium.ne.jp:8094
D--195.175.37.71:8080
D--213.227.149.165:3128
D--216.133.248.227:80
|
|
2007-3-9 12:00 |
|
|
clonecd
Junior User
Points 94
Posts 46
Registered 2006-5-14
Status Offline
|
『Floor 11』:
Someone asked this question before, but in that case the data was spread over multiple lines.
[ Last edited by clonecd on 2007-3-9 at 04:32 PM ]
|
|
2007-3-9 15:14 |
|
|
clonecd
Junior User
Points 94
Posts 46
Registered 2006-5-14
Status Offline
|
『Floor 12』:
@sed "s/class/\n&/g;s/<\/a>/&\n/g" 1.txt|sed "/:/!d;s/.*\x22\(.*\)\x22>\(\+\)<.*/\2----\1/">2.txt
The command above leaves out 200-91-243-90-host.ifx.net.co:3128----D
If you need that line, use the following command:
sed "s/class/\n&/g;s/<\/a>/&\n/g" 1.txt|sed "/:/!d;s/.*\x22\(.*\)\x22>\(\+\)<.*/\2----\1/">2.txt
[ Last edited by clonecd on 2007-3-9 at 04:40 PM ]
|
|
2007-3-9 16:30 |
|
|
3391617
Junior User
Points 116
Posts 56
Registered 2007-3-7
Status Offline
|
『Floor 13』:
The commands above don't seem to work well~
Is there a general method that can extract, from any document, every IP address in formats like these:
XXX.XXX.XXX.XXX:XX
XXX.XXX.XXX.XX:XXXX
XX.XXX.XX.XXX:XXX
XX.XXX.XXX.XX:XXXX
or in this format:
XXX.XX.XXX.XXX--XX
XXX.XXX.XXX.XXX--XXXX
XX.XXX.XX.XXX--XXX
XX.XXX.XXX.XX--XX
whether they are on the same line or not?
[ Last edited by 3391617 on 2007-3-9 at 10:23 AM ]
|
|
2007-3-9 23:18 |
|
|
xycoordinate
Intermediate User
Points 493
Posts 228
Registered 2007-2-16, from Anhui
Status Offline
|
『Floor 14』:
if defined str call :pickup "!str!"
Can anyone explain this line for me?
In the output of if /? I see:
If Command Extensions are enabled IF changes as follows:
IF string1 compare-op string2 command
IF CMDEXTVERSION number command
IF DEFINED variable command
The DEFINED conditional works just like EXIST except it takes an environment variable name and returns true if the environment variable is defined.
So if the environment variable is not defined, the command that follows is not executed???
[ Last edited by xycoordinate on 2007-3-9 at 10:51 AM ]
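As a rough illustration of what that line does in the :pickup routine: `set "str=%%j"` with an empty remainder leaves str undefined, so `if defined str` is effectively a "is there anything left to process?" test that ends the recursion. The Python sketch below mirrors that control flow only; the function `pickup` and its regular expression are stand-ins written for this example and only approximate how `for /f "tokens=1* delims=<>"` splits a line.

```python
import re

def pickup(text, found=None):
    """Peel off the first token between angle brackets, then recurse on the rest."""
    if found is None:
        found = []
    # Roughly like tokens=1* with delims=<> : skip leading delimiters, take
    # the first token, and treat everything after the next delimiter run as the rest.
    match = re.match(r'[<>]*([^<>]+)[<>]*(.*)', text, re.S)
    if match is None:
        return found
    token, rest = match.groups()
    found.append(token)
    # Counterpart of: if defined str call :pickup "!str!"
    # An empty remainder means the batch variable is undefined, so stop here.
    if rest:
        pickup(rest, found)
    return found

print(pickup('<li><a class=D>1.2.3.4:80</a></li>'))
```

So yes: when the variable is not defined, the `call` after `if defined` is simply skipped and the routine returns.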
|
|
2007-3-9 23:43 |
|
|
clonecd
Junior User
Points 94
Posts 46
Registered 2006-5-14
Status Offline
|
『Floor 15』:
Originally posted by 3391617 at 2007-3-9 23:18:
The commands above don't seem to work well~
Is there a general method that can extract, from any document, everything in formats like these:
XXX.XXX.XXX.XXX:XX
XXX.XXX.XXX.XX:XXXX
XX.XXX.XX.XXX:XXX
XX.XXX.XXX.XX:XXXX
or in this format: ...
The code on floor 12 was written for the conditions you gave on floor 1. The extra conditions you added on floor 13 can certainly be handled with sed as well; study sed and work it out yourself.
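For comparison, here is one way the more general request from floor 13 could be approached in Python rather than sed. It is a sketch under stated assumptions, not a definitive answer: the names `IP_PORT` and `find_ip_ports` are invented for this example, and it assumes an address is four dot-separated numbers of at most 255 followed by `:` or `--` and a port of at most 65535.

```python
import re

# Four dot-separated groups of 1-3 digits, then ":" or "--", then a port number.
# The lookbehind and lookahead keep the match from starting or ending mid-number.
IP_PORT = re.compile(r'(?<![\d.])((?:\d{1,3}\.){3}\d{1,3})(:|--)(\d{1,5})(?!\d)')

def find_ip_ports(text):
    """Return every IP/port pair in text, keeping the separator that was found."""
    results = []
    for ip, sep, port in IP_PORT.findall(text):
        # Reject octets above 255 and ports above 65535
        if all(int(octet) <= 255 for octet in ip.split('.')) and int(port) <= 65535:
            results.append(f"{ip}{sep}{port}")
    return results
```

Because the pattern is applied to the whole text at once, it does not matter whether the addresses share one line or are spread over many; reading a file with `open(...).read()` and passing the result to `find_ip_ports` covers both cases. Host-name entries such as neptun.ium.ne.jp:8094 are deliberately not matched.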
|
|
2007-3-9 23:56 |
|