Title: [Help] How do I search a text file and extract data from it?
Author: longobaba
Time: 2007-7-1 19:17
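Thanks in advance, everyone!
The data format is shown below. The file has more than 30,000 lines, and the problem is this:
How can I find the hotels whose Total is not 0 in this huge file, and then extract the hotel name from the matching ARRIVALS line? My wife has to process data like this every month and it is exhausting work, so please help me out with a batch script. I would be very grateful.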
ARRIVALS: 10005 - Five Rams City Hotel Guangzhou
Arr Date Name #Nt #Rm Room Description Rate Plan Rate Resv Num
---------- ---------------- --- --- -------------------- --------- ------- --------
Total: 0
Guangzhou Hotel Eelan Reservations
Great Hotels Organization China Customer Support
Tel+86 27 67845050
ARRIVALS: 10006 - Guangzhou Hotel Eelan
Arr Date Name #Nt #Rm Room Description Rate Plan Rate Resv Num
---------- ---------------- --- --- -------------------- --------- ------- --------
2007-06-03 HAMILTON, MATTHE 1 1 PREMIUM TWRM PRPRTWRM 40.00 180775384
2007-06-22 ORUNSOLU, KOLAWO 7 1 PREMIUM TWRM PRPRTWRM 40.00 180546312
Total: 2
Eastrn Inn Beijing Reservations
Great Hotels Organization China Customer Support
Tel+86 27 67845050
ARRIVALS: 10007 - Eastern Inn Beijing
Arr Date Name #Nt #Rm Room Description Rate Plan Rate Resv Num
---------- ---------------- --- --- -------------------- --------- ------- --------
Total: 0
Goodsun International Business Apartment Reservations
Great Hotels Organization China Customer Support
Tel+86 27 67845050
Author: lxmxn
Time: 2007-7-1 19:51
Assuming your data file is named hotel.txt, run the following at the command line:
gawk "{if($0~/[ \t]*ARRIVALS:/)result=$0;if($0~/^Total:.*/)if($2!=0)print result}" hotel.txt
Gawk is an external tool. It can be downloaded here:
http://www.cn-dos.net/forum/viewthread.php?tid=31098&page=1#pid205571
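For readers without Gawk, roughly the same logic can be sketched in Python: remember the most recent ARRIVALS line, and print it when a Total line with a non-zero count follows. This is only an illustration, not part of the original reply. The function name hotels_with_arrivals and the shortened sample data are made up for the example, and it assumes the count is always a plain integer in the second field.

```python
import io


def hotels_with_arrivals(lines):
    """Yield the ARRIVALS line of every block whose Total is non-zero."""
    current = None  # most recent ARRIVALS line seen so far
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == "ARRIVALS:":
            current = line.strip()
        elif fields[0] == "Total:" and len(fields) > 1:
            # Assumption: the count is a plain integer such as 0 or 2.
            if fields[1].isdigit() and int(fields[1]) != 0 and current is not None:
                yield current


# Shortened sample in the same layout as the data in the question.
sample = """\
ARRIVALS: 10005 - Five Rams City Hotel Guangzhou
Total: 0
ARRIVALS: 10006 - Guangzhou Hotel Eelan
2007-06-03 HAMILTON, MATTHE 1 1 PREMIUM TWRM PRPRTWRM 40.00 180775384
Total: 2
ARRIVALS: 10007 - Eastern Inn Beijing
Total: 0
"""

result = list(hotels_with_arrivals(io.StringIO(sample)))
for line in result:
    print(line)
```

To run it on the real file, pass an open file object instead of the sample, for example open("hotel.txt"), with whatever encoding the export actually uses.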
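Author: lxmxn
Time: 2007-7-1 19:52
I did not consider a batch script because the file has more than 30,000 lines. That is too large, and processing it with batch would be too slow.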
Author: digger
Time: 2007-7-2 10:08
I tested a pure batch script on more than 30,000 lines of data, and the speed turned out to be acceptable:
@echo off
:: To drop the hotel number and the dash, change echo !str!>>list.txt to echo !str:*- =!>>list.txt
set bg=%time%
cd.>list.txt
setlocal enabledelayedexpansion
for /f "tokens=1*" %%i in (test.txt) do (
if /i "%%i"=="ARRIVALS:" set str=%%j
if /i "%%i"=="Total:" if %%j neq 0 echo !str!>>list.txt
)
echo %bg% %time%
pause
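The comment in the script mentions dropping the hotel number and the dash so that only the hotel name remains. As a rough Python sketch of that step alone (the helper name hotel_name is hypothetical, and it assumes the number and name are always separated by " - " as in the sample data):

```python
def hotel_name(arrivals_text):
    """Return the text after the first " - " separator, or the input if none is found."""
    # partition splits on the first occurrence only, so a hyphen later in
    # the hotel name is left untouched.
    _, sep, name = arrivals_text.partition(" - ")
    return name.strip() if sep else arrivals_text.strip()


print(hotel_name("10006 - Guangzhou Hotel Eelan"))
```

The same helper also works on a full line that still begins with ARRIVALS:, because everything before the first " - " is discarded.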