In CMD, outputting text that contains special characters has always been a headache. The usual way to cope with special characters is to wrap the content in quotes before echoing it, but then every output line gains a quote at its beginning and end. If those quotes matter to you, this approach is unusable, and yet there seemed to be no other approach that solves the problem cleanly. Note: for the complete solution, see bjsh's code on floor 23.
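A minimal sketch of the quoting approach described above, assuming a default CMD session; the sample value is made up for illustration:

```batch
@echo off
rem Hypothetical sample value. Quoting the whole assignment keeps the ampersand literal.
set "str=left & right"
rem Echoing the variable unquoted would let the ampersand split the line into two commands.
rem Quoted, the text survives intact, but the quotes become part of the output.
echo "%str%"
```

This prints "left & right" with the quotes included, which is exactly the side effect the rest of this post tries to avoid.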
I recently had some free time and revisited this problem. After several revisions I arrived at Code 1, which I am posting here for everyone to test:
Code 1:
@echo off
:: Idea: Escape all special symbols and then output
:: Restriction: The file to be processed cannot be enclosed in quotes;
cd.>output.txt
for /f "delims=" %%i in ('findstr /n .* test.txt') do (
set "str=%%i"
call set "str=%%str:*:=%%"
if defined str (call :output) else echo.>>output.txt
)
start output.txt
exit
:output
set "str=%str:^=^^%"
set "str=%str:>=^>%"
set "str=%str:<=^<%"
set "str=%str:|=^|%"
set "str=%str:&=^&%"
set "str=%str:"=^"%"
call echo.%%str%%>>output.txt
goto :eof
The following is modified from bjsh's code on floor 21. I personally consider it a fairly complete solution:
Code 2:
@echo off
cd.>output.txt
for /f "delims=" %%i in ('findstr /n .* test.txt') do (
set "str=%%i"
call set "str=%%str:*:=%%"
if defined str (call :output) else echo.>>output.txt
)
start output.txt
exit
:output
set "str=%str:^=^^%"
set "str=%str:"=%"
set "str=%str:>=^>%"
set "str=%str:<=^<%"
set "str=%str:&=^&%"
set "str=%str:|=^|%"
set "str=%str:="%"
(call echo.%%str%%)>>output.txt
goto :eof
The most complete code is below (taken from bjsh's code on floor 23, with only a small modification by me):
Code 3:
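@echo off
:: Start with an empty output.txt
cd.>output.txt
:: findstr /n prefixes each line with its number and a colon, so for /f does not skip blank lines
for /f "delims=" %%i in ('findstr /n .* test.txt') do (
rem Store the line while delayed expansion is still off, so exclamation marks are kept
set "var=%%i"
setlocal enabledelayedexpansion
rem Strip everything up to and including the first colon, which removes the line number
set var=!var:*:=!
(echo.!var!)>>output.txt
endlocal
)
start output.txt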
Contents of the test file test.txt (note: one of the blank lines consists of spaces):
"aou"eo
;euou%^>
::::aeui
:::E2uo alejou 3<o2io|
^aue||%ou
!aue!
aoue eou 2
!str!auoeu!ueo &&
euo 8
ueyi^^^^aueuo2
~ ! @ # $ % ^ & * ( () " ok " No " <>nul
set ok=^
Analysis of Code 1 and Code 2:
① The usual way to reference variables inside a for statement is to enable delayed variable expansion with setlocal enabledelayedexpansion. However, this feature has a fatal flaw: when the string being processed contains exclamation marks, each pair of exclamation marks and everything between them is replaced with nothing. So Code 1 and Code 2 abandon the setlocal approach and call a subroutine instead;
② If the text being processed contains an odd number of quotes, the statement echo.%str%>>output.txt fails. So the quotes are replaced directly with a special invisible character before output (the black box shown in the code); thanks to lxmxn for testing and bjsh for the analysis;
③ In the :output subroutine, the statement set "str=%str:^=^^%" must come before all the other replacement statements. Otherwise the ^ characters added by the later replacements would be replaced again, giving wrong results; thanks to lxmxn for testing;
④ An ordinary for /f statement skips lines that begin with a semicolon and also skips blank lines. So findstr /n .* test.txt is used to emit every line (including blank ones), each prefixed with its line number and a colon. Removing that prefix with delims=: would also discard any colons at the start of the line's own content, so the statement call set "str=%%str:*:=%%" is used instead;
⑤ In the statement (call echo.%%str%%)>>output.txt, the dot after echo cannot be omitted. Otherwise, when the line consists only of spaces, the output shows the current echo state. The call is there to cope with lines that contain quotes. The parentheses are there to handle a line that ends in a space followed by a single digit 1-9, which would otherwise be parsed as a handle number for the >> redirection.
⑥ The :output label section does not use the statement for %%i in (^^ ^> ^< ^| ^&) do call set "str=%%str:%%i=^%%i%%" because that statement does not perform the replacements correctly. The delayed evaluation that call introduces inside a for statement does seem rather confusing.
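Points ③ and ⑤ can be sketched in isolation as follows. This is only an illustration under a default CMD session, not part of Code 1 or Code 2; the sample strings and the file name demo_out.txt are made up:

```batch
@echo off
rem Point 3: double the carets first, then escape the other characters.
set "str=a>b^c"
set "str=%str:^=^^%"
set "str=%str:>=^>%"
rem Each escaped character is now preceded by one caret, which the parser strips again.
echo.%str%

rem Point 5: the parentheses keep the trailing digit out of the redirection.
set "str=euo 8"
(echo.%str%)>>demo_out.txt
```

The first echo prints a>b^c. If the two replacements were swapped, the caret added in front of > would itself be doubled, leaving the > unescaped so that it acts as a redirection. The last line appends euo 8 to demo_out.txt; without the parentheses, the 8 would be taken as a handle number and dropped from the text.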
As for Code 3, my understanding is limited, so I can only offer a rough analysis: the code uses delayed variable expansion to capture the special characters intact, and ends delayed expansion at the right moment so that the string is not mistaken for a variable reference by being expanded too many times. In the end, this is still the CMD preprocessing mechanism at work. Note that the setlocal statement must not be moved in front of the set statement; otherwise the exclamation marks will still be treated as variable-reference markers and discarded.
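The ordering of set and setlocal can be tried on its own with a small sketch like the one below. The file name demo_in.txt and its content are made up for illustration, and the first setlocal is only there to make sure delayed expansion starts out disabled:

```batch
@echo off
setlocal disabledelayedexpansion
rem Hypothetical one-line input containing a pair of exclamation marks.
>demo_in.txt echo !aue! kept
for /f "delims=" %%i in (demo_in.txt) do (
    rem Delayed expansion is still off here, so the marks are stored intact.
    set "var=%%i"
    setlocal enabledelayedexpansion
    rem The value is read through delayed expansion and is not scanned a second time.
    echo.!var!
    endlocal
)
endlocal
```

This prints !aue! kept unchanged. If setlocal enabledelayedexpansion came before the set statement, the !aue! pair would be treated as a reference to an undefined variable and dropped from the stored value.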
[ Last edited by namejm on 2007-8-14 at 09:27 PM ]