GBK & UTF8 Encoding Conversion Script (CMD+GAWK)
I needed UTF8-to-GBK conversion in practice, so I wrote a converter in GAWK; an earlier post of mine already used it. This time I tidied it up and added support for conversion in both directions. I built a complete GBK-to-UTF8 mapping table myself. Along the way I found quite a few differences between the results of the system's own conversion and those of iconv; the mapping table follows the system's results.
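To make the idea of a mapping table concrete, here is a minimal Python sketch (Python is used only for illustration; the script itself is CMD plus GAWK). It builds a GBK-to-UTF-8 table from Python's built-in `gbk` codec. That is an assumption: the author's table follows the Windows system conversion, so its entries and its file layout may differ. `build_table` is a hypothetical helper name.

```python
def build_table():
    """Map every two-byte GBK sequence this codec can decode to its UTF-8 bytes."""
    table = {}
    for lead in range(0x81, 0xFF):          # GBK lead bytes 0x81..0xFE
        for trail in range(0x40, 0xFF):     # GBK trail bytes 0x40..0xFE
            if trail == 0x7F:               # 0x7F is never a trail byte
                continue
            gbk = bytes([lead, trail])
            try:
                table[gbk] = gbk.decode("gbk").encode("utf-8")
            except UnicodeDecodeError:
                pass                        # unassigned in Python's codec
    return table

table = build_table()
# The reverse direction (UTF-8 to GBK) is the same table with keys and values swapped.
reverse = {utf8: gbk for gbk, utf8 in table.items()}
```

A table built by a different converter (iconv, for example) could be compared against this one entry by entry to list the differences mentioned above.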
The script converts input from a pipe or from files and writes the result to the screen (standard output). It does not support many parameters yet, but it validates them and accepts multiple parameters in any order. It uses a new way of releasing the script: the GAWK code embedded in the batch file is extracted to a temporary file, and it is re-extracted automatically when the source file is modified. It also prints error messages and checks that its dependency files are present. I hope these little tricks help everyone write batch files.
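The automatic update works because the cache file name includes the size of the batch file (`%~n0%~z0.awk`), so a changed file no longer finds its old cache. Below is a rough Python sketch of that idea, not the author's code; `extract_embedded`, `cached_path` and `ensure_cache` are hypothetical names, and the clean-up of stale cache files is left out.

```python
import os
import tempfile

def extract_embedded(lines):
    """Return lines between '#<-1' and '#>-1', dropping lines that start with '#'."""
    out, inside = [], False
    for line in lines:
        if line.startswith("#<-1"):
            inside = True
        if inside and not line.startswith("#"):
            out.append(line)
        if line.startswith("#>-1"):
            inside = False
    return out

def cached_path(source, cache_dir):
    """Cache name = script base name + its size in bytes, like %~n0%~z0.awk."""
    base = os.path.splitext(os.path.basename(source))[0]
    return os.path.join(cache_dir, f"{base}{os.path.getsize(source)}.awk")

def ensure_cache(source, cache_dir):
    """Re-extract the embedded script only if no cache matches the current size."""
    path = cached_path(source, cache_dir)
    if not os.path.exists(path):
        with open(source, encoding="utf-8") as f:
            body = extract_embedded(f.read().splitlines())
        with open(path, "w", encoding="utf-8") as f:
            f.write("\n".join(body) + "\n")
    return path
```

One consequence of keying on size alone: an edit that leaves the file size unchanged would not trigger a refresh.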
GAWK download link: http://www.klabaster.com/progs/gawk32.zip
The script and conversion table are in the attachment.
Posted by 无奈何 2006-11-30 01:02
- :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
- :: gbk2utf8.cmd -V0.1 -- GBK & UTF8 encoding conversion
- :: 无奈何@cn-dos.net - 2006-11-28 - CMD & GAWK
- :: Usage: gbk2utf8 /I file...
- :: Required files: gawk.exe gbk2utf8.dat
- :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
- @echo off
- setlocal
- set self="%~f0"
- set AwkScript="%temp%\%~n0%~z0.awk"
- set path=%path%;%~dp0;%cd%
- set nofile=
- set error=
- set input=
- ::Dependency file integrity check
- for %%i in (gawk.exe gbk2utf8.dat) do (
- @if "%%~$PATH:i" == "" (
- echo.The required dependency file "%%i" is missing.
- set nofile=1
- ) else ( set %%~ni="%%~$PATH:i" )
- )
- if defined nofile goto :EOF
- ::Update script after file changes
- if not exist %AwkScript% (
- del /q "%temp%\%~n0*.awk" 2>nul
- gawk "/^#<-1/,/^#>-1/{if(!/^#/)print}" %self% >%AwkScript%
- )
- :ParseLoop
- if "%~1" == "" goto Start
- if "%~1" == "?" goto SwitchH
- if "%~1" == "/?" goto SwitchH
- rem Process parameters and jump to the corresponding label.
- for %%s in (U u I i h H) do if "%~1"=="/%%s" goto Switch%%s
- if "%F_input%" == "1" (
- if not exist "%~1" set error=Warning: file "%~1" does not exist. & goto error
- set input=%input% "%~1"
- shift
- goto ParseLoop
- )
- if "%F_input%" == "-1" shift & goto ParseLoop
- set error=Error: incorrect parameter format - "%1" !
- goto error
- :SwitchI
- set F_input=1
- if "%~2" == "-" set F_input=-1
- shift
- goto ParseLoop
- :SwitchU
- set F=-1
- shift
- goto ParseLoop
- :error
- echo.%error%
- echo.
- :SwitchH
- echo.gbk2utf8 V0.1 -- GBK ^& UTF8 encoding conversion
- echo.
- echo.Usage: 1. %~n0
- echo. 2. %~n0 /I file...
- echo. 3. %~n0 /I -
- echo.
- echo.Options: /? displays this brief help, equivalent to /H .
- echo. /U converts UTF8 to GBK; the default is GBK to UTF8.
- echo. /I specifies the files to convert; "-" reads from standard input.
- echo. This parameter may be omitted; input is then read from standard input.
- echo. When specifying files to convert, the /I parameter cannot be omitted.
- goto :EOF
- :Start
- if "%input%" == "" set F_input=-1
- if "%F_input%" == "-1" (
- gawk -v F=%F% -f %AwkScript%
- ) else (
- gawk -v F=%F% -f %AwkScript% %input%
- )
- goto :EOF
- :AwkScript
- #<-1
- function gbk2utf8(string,flag, reg, gbkreg, utf8reg, char, result){
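- # Assumption: the bracket expressions in these two patterns were lost from the posted listing;
- # they are reconstructed here from the standard GBK and UTF-8 byte layouts.
- gbkreg="[\x01-\x7f]|[\x81-\xfe][\x40-\xfe]"
- utf8reg="[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|\xe0[\xa0-\xbf][\x80-\xbf]|[\xe1-\xef][\x80-\xbf][\x80-\xbf]|\xf0[\x90-\xbf][\x80-\xbf][\x80-\xbf]|[\xf1-\xf7][\x80-\xbf][\x80-\xbf][\x80-\xbf]"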
- reg=gbkreg
- if (flag==-1)
- reg=utf8reg
- RLENGTH = 1
- while(RLENGTH != -1){
- match(string,reg)
- char=substr(string,RSTART,RLENGTH)
- if (RLENGTH>1)
- char=charset[char]
- result=result char
- string=substr(string,RSTART+RLENGTH)
- }
- return result
- }
- BEGIN {
- FS=","
- if (!F) F=1
- if (F==1) {
- while((getline<"gbk2utf8.dat") > 0)
- charset[$1]=$2
- }
- else{
- while((getline<"gbk2utf8.dat") > 0)
- charset[$2]=$1
- }
- close("gbk2utf8.dat")
- }
- {
- x=gbk2utf8($0,F)
- print x
- }
- #>-1
- goto :EOF
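The conversion function in the script walks through each line, cuts off one character at a time with a regular expression, and looks up every multi-byte character in the mapping table. The same loop can be sketched in Python as follows. The byte ranges in `GBK_CHAR` are an assumption based on the usual GBK layout, `convert` and `demo_table` are hypothetical names, and unlike the awk version this sketch passes unknown sequences through unchanged.

```python
import re

# Assumed layout: one ASCII byte, or a GBK lead byte followed by a trail byte.
GBK_CHAR = re.compile(rb"[\x00-\x7f]|[\x81-\xfe][\x40-\xfe]")

def convert(data, table, pattern=GBK_CHAR):
    """Split data into characters with pattern; translate multi-byte ones via table."""
    out = []
    for m in pattern.finditer(data):        # bytes that match nothing are skipped
        char = m.group()
        if len(char) > 1:
            # Unknown sequences are kept as they are in this sketch.
            char = table.get(char, char)
        out.append(char)
    return b"".join(out)

# Tiny hand-written table for the demo: GBK bytes -> UTF-8 bytes.
demo_table = {
    b"\xd6\xd0": "中".encode("utf-8"),
    b"\xce\xc4": "文".encode("utf-8"),
}
result = convert(b"ab\xd6\xd0\xce\xc4", demo_table)
```

Converting in the other direction only needs a UTF-8 pattern and a table with keys and values swapped, which is what the /U switch selects in the script.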
[ Last edited by 无奈何 on 2006-11-30 at 02:04 PM ]
☆开始\运行 (WIN+R)☆
%ComSpec% /cset,=何奈无── 。何奈可无是原,事奈无做人奈无&for,/l,%i,in,(22,-1,0)do,@call,set/p= %,:~%i,1%<nul&ping/n 1 127.1>nul

