|
yhsean
Junior User
Points 90
Posts 26
Registered 2005-12-5
Status Offline
|
『Original Poster』:
My AWK cannot print a string
gawk "BEGIN{print "The start"}"
The command above does not print the string The start.
Could the moderators help analyze the cause? Is it related to the platform (WIN2000)?
—————————————— willsort Moderation Record ——————————————
Topic split: the last two posts of {18102} "How to split a file by markers using batch processing" were merged here.
Reason: the two topics are directly related in context.
Moderator note: after topics are split or merged, posts are reordered by posting time, which may affect the reading order.
—————————————— willsort Moderation Record ——————————————
[ Last edited by willsort on 2005-12-19 at 22:33 ]
|
|
2005-12-18 14:11 |
|
|
tigerpower
Intermediate User
Senior Brother
Points 377
Posts 99
Registered 2005-8-26
Status Offline
|
『Post #2』:
gawk "BEGIN {print \"The start\"}"
|
|
2005-12-18 15:46 |
|
|
yhsean
Junior User
Points 90
Posts 26
Registered 2005-12-5
Status Offline
|
『Post #3』:
Thank you very much, tigerpower; now I finally understand. Could it be that my reference material describes the UNIX platform, and that is why it does not need \"?
|
|
2005-12-18 18:06 |
|
|
yhsean
Junior User
Points 90
Posts 26
Registered 2005-12-5
Status Offline
|
『Post #4』:
After studying for a while (luckily I have some C background, or I would be lost), I looked into the AWK code posted by the senior members.
In fact, an intermediate variable does not reduce efficiency, because it is an in-memory operation,
whereas something like print $0 >> \"temp\"
is a disk operation that happens once per line, so it is somewhat slower. The moderator's code is more concise, though. It does call the system() function, but the REN operation is very fast, so its time is almost negligible.
One more thing I don't understand: on Windows, double quotes replace the single quotes, and backslash-escaped double quotes replace the original double quotes. Could someone explain the reason for this?
|
|
2005-12-19 08:49 |
|
|
willsort
Veteran Member
Batchinger
Points 4432
Posts 1512
Registered 2002-10-18
Status Offline
|
『Post #5』:
Re yhsean:
This comes down to the different command-line behavior of the two platforms.
In the Windows command-line environment, if an argument contains spaces or other characters with special meaning, it must be enclosed in double quotes so that those characters are treated as part of the string rather than interpreted specially. Single quotes do not have this effect.
Command-line tools such as gawk read the script from one specific argument only. If the script contains spaces, it must be enclosed in double quotes; otherwise the part after the first space is treated as the next argument rather than as part of the script.
Once double quotes are used, however, the double quote itself becomes a special character, and a gawk script may also contain double quotes. This creates an ambiguity in how the quotes pair up: do adjacent double quotes pair, or do the two outermost ones? Windows, to better support arguments, chose the latter; gawk, to support scripts, chose the former.
So a command line such as gawk "BEGIN{print "The start"}" clearly has a problem: The start ends up outside the double quotes and is no longer a string. To solve this, a \ is used to escape the double quote before The start (and likewise the one after it). The escaped quote no longer pairs with the double quote before BEGIN; it is passed into the script literally and pairs with the escaped quote that follows.
That should be about it. If anyone has other views, please bring them up so we can discuss.
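For comparison, here is a minimal sketch of how the same one-liner behaves in a POSIX shell, assuming `awk` is on the PATH (gawk or mawk should behave the same way). The variable names `single` and `escaped` are only illustrative. Inside single quotes the shell passes the double quotes through untouched; inside double quotes the inner quotes need a backslash, much like the cmd form discussed in this thread.

```shell
#!/bin/sh
# Unix form: single quotes protect the whole awk program,
# so the inner double quotes reach awk unchanged.
single=$(awk 'BEGIN{print "The start"}')

# Double-quoted form: the inner quotes must be backslash-escaped,
# otherwise the shell would end the argument early.
escaped=$(awk "BEGIN{print \"The start\"}")

echo "$single"
echo "$escaped"
```

Both commands hand awk the same program text, so both print The start.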
|

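※ Batchinger to Bat Fans: please visit 批处理编程的异类 ("The Maverick of Batch Programming"); you are welcome to exchange and share batch programming experience! |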
|
2005-12-19 18:05 |
|
|
yhsean
Junior User
Points 90
Posts 26
Registered 2005-12-5
Status Offline
|
『Post #6』:
I understand what the moderator means.
But why was the single quote abandoned? If the double quotes were placed inside single quotes, this ambiguity would not exist,
because everything inside the single quotes could be regarded as AWK's own command text, which has nothing to do with Windows. Windows would only apply outside the single quotes; whatever happens inside them, Windows would have no right to interfere.
In other words, when Windows sees AWK, AWK should have some mechanism that makes Windows respect its single-quote convention.
[ Last edited by yhsean on 2005-12-19 at 19:38 ]
|
|
2005-12-19 19:23 |
|
|
tigerpower
Intermediate User
Senior Brother
Points 377
Posts 99
Registered 2005-8-26
Status Offline
|
『Post #7』:
Originally posted by yhsean at 2005-12-19 08:49:
...One thing I don't understand: on Windows, double quotes replace the single quotes, and backslash-escaped double quotes replace the original double quotes...
It is because the environments differ ;)
On Windows XP the command interpreter is cmd. Of the statements below, the first two are wrong and the last one is correct:
cd 'C:\Program Files'
'calc'
"calc"
On Unix it is a shell. The mawk script above would be written on Unix as
mawk -F"*" '/\*/ {printf t>$2 ".txt";t="";next};{t=t $0 "\n"}' aa.txt
or
mawk -F"*" '
/\*/ {printf t>$2 ".txt"
t=""
next}
{t=t $0 "\n"}
' aa.txt
Some DOS ports accept single quotes; for example, this works:
echo.|awk '{print "Hello world!"}'
but it does not seem to work with other Win32 ports.
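To try the Unix form of the splitting script, here is a minimal sketch, assuming a POSIX shell with `awk` and `mktemp` available. The sample contents of aa.txt are made up for illustration, since the thread does not show the real input. It also makes two defensive changes that are not in the posted code: `printf "%s"` so that a `%` in the text is not taken as a format, and a parenthesized redirection target, which is more portable across awk implementations.

```shell
#!/bin/sh
# Work in a scratch directory so the generated files are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input: text blocks, each followed by a *name* marker line.
printf '%s\n' 'line a' 'line b' '*part1*' 'line c' '*part2*' > aa.txt

# With * as the field separator, $2 of a marker line is the name between
# the asterisks. The text collected in t is written to <name>.txt.
awk -F'*' '
/\*/ { printf "%s", t > ($2 ".txt"); close($2 ".txt"); t = ""; next }
     { t = t $0 "\n" }
' aa.txt

part1=$(cat part1.txt)
part2=$(cat part2.txt)
echo "$part1"
echo "$part2"
```

With this input, part1.txt receives the first two lines and part2.txt receives line c. Any text after the final marker is not written, because the script has no END rule.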
|
|
2005-12-19 20:51 |
|
|
无奈何
Honorary Moderator
Points 1338
Posts 356
Registered 2005-7-15
Status Offline
|
『Post #8』:
I finally have a bit of time to get online.
to yhsean
CMD does not treat the single quote as a special character that suppresses interpretation of what it encloses. It recognizes only double quotes, and even then it still partially interprets the enclosed characters. CMD offers no special character equivalent to the UNIX single quote, so there is no way around it; we just have to accommodate this.
Moreover, CMD interprets the command line before handing it to AWK, and AWK cannot recover the original command as it was before CMD processed it.
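The same hand-off can be observed in a POSIX shell, which also processes the command line before awk starts. A minimal sketch, assuming `awk` is on the PATH; the variable names `seen_single` and `seen_double` are only illustrative. awk receives the argument with the outer quotes already removed and cannot tell how it was originally quoted:

```shell
#!/bin/sh
# Two different spellings of the same argument on the command line.
seen_single=$(awk 'BEGIN{print ARGV[1]}' 'say "hi"')
seen_double=$(awk 'BEGIN{print ARGV[1]}' "say \"hi\"")

# awk is handed identical text in both cases.
echo "$seen_single"
echo "$seen_double"
```

Because the program has only a BEGIN rule, awk prints the argument and exits without trying to open it as a file.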
|

☆Start\Run (WIN+R)☆
%ComSpec% /cset,=何奈无── 。何奈可无是原,事奈无做人奈无&for,/l,%i,in,(22,-1,0)do,@call,set/p= %,:~%i,1%<nul&ping/n 1 127.1>nul
|
|
2005-12-19 21:04 |
|
|
yhsean
Junior User
Points 90
Posts 26
Registered 2005-12-5
Status Offline
|
『Post #9』:
I have finally come to appreciate the charm of AWK: easy to learn, powerful, and practical.
Because of the problem above, I searched for a Win32 port of a shell program and am attaching it here to share.
Also, when writing a BAT file, I would like to keep everything in one file (without having to create a separate AWK script by hand). To keep such a program readable, is there a way to generate the AWK script with ECHO or some other method?
The content of MY.BAT is as follows:
set file=input.txt
>my.awk echo {
>>my.awk echo BEGIN { }
>>my.awk echo {
...
>>my.awk echo }
>>my.awk echo END { }
>>my.awk echo }
mawk -f my.awk %file%
del my.awk
During testing, I noticed that when the AWK script is read from a file, the double quotes in it do not need to be escaped (in my.awk, " does not need to be written as \").
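The observation that quotes need no escaping in a script file can be checked on the Unix side as well. A minimal sketch, assuming a POSIX shell with `awk` and `mktemp` available; the file names my.awk and input.txt follow the batch file in this post, while the script body and sample input are made up for illustration. A quoted here-document plays the role of the ECHO lines and writes the script verbatim:

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical two-line input file.
printf '%s\n' 'alpha' 'beta' > input.txt

# The quoted delimiter ('EOF') stops the shell from touching the body,
# so the double quotes land in my.awk exactly as written.
cat > my.awk <<'EOF'
BEGIN { print "The start" }
      { print "line: " $0 }
END   { print "The end" }
EOF

result=$(awk -f my.awk input.txt)
echo "$result"
rm -f my.awk
```

Since awk reads my.awk directly, no command interpreter ever parses the quotes inside it, which is why the backslashes become unnecessary.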
[ Last edited by yhsean on 2005-12-22 at 09:34 ]
Attachment
1: sh.rar (2005-12-21 12:10, 186.54 KiB, points required to download: 1, downloads: 14)
|
|
2005-12-21 12:06 |
|
|
yhsean
Junior User
Points 90
Posts 26
Registered 2005-12-5
Status Offline
|
『Post #10』:
Originally posted by willsort at 2005-12-19 18:05:
Re yhsean:
...do adjacent double quotes pair, or do the two outermost ones? Windows, to better support arguments, chose the latter; gawk, to support scripts, chose the former.
Regarding the quote pairing the moderator mentioned: in some cases CMD's pairing is quite ambiguous.
Suppose %~f0 returns d:\aa bb\cc\my.bat
Then, when tested in a batch file, echo. """%~f0""" (I saw others add a . after ECHO, so I added one too; I still don't know why, which is frustrating...)
outputs "d:\aa bb\cc\my.bat"
rather than ""d:\aa bb\cc\my.bat""
and not d:\aa bb\cc\my.bat either (whereas echo. "%~f0" outputs that string)
[ Last edited by yhsean on 2005-12-22 at 13:14 ]
|
|
2005-12-22 12:38 |
|
|