China DOS Union

-- Unite DOS · Advance DOS · Grow DOS --

Union site: www.cn-dos.net Forum site: www.cn-dos.net/forum
DOS stands for freedom, openness and progress. Let us work hard, learn from the openness and GNU spirit of FreeDOS and Linux, and together build and grow a free GNU GPL world!

China DOS Union Forum » DOS Batch & Script Technology (Batch Room) » [Collaborative Participation] [Challenging Ideas] [Batch Processing: Easily Translate Words]
Digest I · Views 22,524 · Replies 55
Original Poster Posted 2006-10-10 21:50 · China, Beijing, Dongcheng District, China Unicom
This post is meant to spark ideas. Everyone is welcome to think broadly and join in learning batch processing interactively~:)

For example: E.Bat computer
Result: Computer 计算机

Rules: There are not many rules.
The main goal is to make DOS batch processing help us work more efficiently.
Let batch processing handle a practical lookup task that fits the command line: translating words we don't understand.
Currently, no word translation system written in pure DOS batch exists anywhere in the world (don't ask why, hehe...).
A pure-batch word translator is an idea born of fun, challenge, interest and innovation, and in the end it is a practical one.

Imagine with me:

When I come across a nice word I don't understand, I type e net in the DOS command window and press Enter,
and the system answers with net and its Chinese translation (network). "Oh! Now I understand!"

In 2008 the whole nation is learning English for the Olympics, so we think a word translation system born from DOS batch scripts matters all the more!
It will be the purest gift to the Olympics, to cn-dos.net and to all DOS enthusiasts, implemented entirely in batch!

... A certain newspaper carries a story:
A few great experts, using the most primitive batch processing, developed an unimaginable word translation system...
Let's take it as a challenge~:)
With a good idea and a good algorithm, the rest is only implementation. The language does not matter; the approach is universal.

Thinking: In the end this is a problem of method and algorithm. Word translation software has to retrieve words extremely efficiently.
With 200,000 words, walking through all 200,000 records one by one with for /f... is simply...

However, if the words are split into 26 regions by initial letter A-Z,
the search is narrowed to a much smaller range, and further algorithms can be applied within that range...
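
A minimal sketch of the partitioning idea, written in Python only to show the algorithm (the thread's goal is a pure batch implementation). The sample entries and the names `build_buckets` and `lookup` are assumptions for illustration, not part of the original post.

```python
from collections import defaultdict

def build_buckets(entries):
    """Group (word, translation) pairs by the word's lowercase initial letter."""
    buckets = defaultdict(list)
    for word, translation in entries:
        buckets[word[0].lower()].append((word, translation))
    return buckets

def lookup(buckets, word):
    """Scan only the bucket that matches the word's initial letter."""
    for candidate, translation in buckets.get(word[0].lower(), []):
        if candidate.lower() == word.lower():
            return translation
    return None

# Hypothetical sample data; a real word bank would be loaded from a file.
entries = [("computer", "计算机"), ("cat", "猫"), ("net", "网络")]
buckets = build_buckets(entries)
result = lookup(buckets, "Computer")  # only the "c" bucket is scanned
```

With 26 buckets of roughly similar size, each lookup scans about one twenty-sixth of the records; real word lists are uneven, so the gain varies by letter.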

For people outside DOS batch, or with only a passing knowledge of it, an English word translator written in pure batch looks like an impossible task!

On the surface we just pass a parameter in %1, say "Internet", and the batch code automatically prints its translation.
It looks simple, but it is not simple at all:

The fastest possible retrieval algorithm
Do we need auxiliary markers to implement it?
How should such markers be encoded? How can the quirks of batch processing be combined with the markers to make retrieval as convenient as possible?
The challenge of searching, locating and extracting among tens of thousands of words

Is this feasible? Yes. I reached that conclusion after a few days of thought and some simple tests with for and for ... skip on 60,000 lines.

Materials: You can use ready-made word banks. These are usually text files in a "word, then its translation" layout.
Because the algorithms differ, so do the word bank formats. Early word translation software used FoxPro for retrieval; today's software uses databases.
(I don't have a word bank at the moment. If I get one, I will upload it later:))
(If you find one, please share the download address with everyone)

Purpose: Broaden our thinking, build interest in batch processing, and get everyone learning interactively and trying things hands-on.
Through continued interaction, everyone can take part in using batch processing and improve together~:)

Direction: This is only one topic. More topics need to be discovered by netizens together, with the moderators' support~:)
Everyone is welcome to tackle the problems above with any idea, however whimsical~:)


Why this post exists: Skills improve faster when everyone interacts and takes part.
Instead of asking only when stuck and otherwise not studying batch actively, let's all improve step by step together~:)
We are here for innovation! We live to innovate this new world!

Reward: Excellent solutions, and solutions that take a different approach, will receive guidance and demonstrations from our godlike moderators,
and points will be awarded as an incentive~:)

Other: Better ideas and practical topics for interactive learning are welcome; everyone please join in~:)


====================================
Previous issue reference:
http://www.cn-dos.net/forum/viewthread.php?tid=23568&fpage=1
May we live every day creatively for our dreams!
What moved me: Through the careful work of several moderators and experts, floating-point arithmetic was solved in batch,
and they also completed a carry mechanism that is, in theory, unlimited in length. Wonderful!
What I learned: Even in the most primitive environment, with no class or function libraries,
algorithms and fresh ideas can solve things that ordinary high-level programming leaves to library functions!
In theory: Building such complex operations out of the simplest instructions is excellent!
The idea applies to development in any language and already goes beyond what ordinary programmers do.
By simulating low-level carries, merging partial results and inserting a floating-point marker,
it has long surpassed the many programmers who cannot live without function or class libraries~:)

====================================
The ideas above are only a fun, innovative suggestion, dedicated to every batch processing enthusiast~:)
====================================

Moderator Wunaihe explains the dictionary conversion tool and part of the index on floor 10~:)

====================================

[ Last edited by redtek on 2006-10-10 at 23:56 ]
Floor 2 Posted 2006-10-10 21:58 · China, Beijing, Dongcheng District, China Unicom
If a text file has 60,000 lines, skip the first 59,997 with skip and list only the last few.
The speed test is as follows:


C:\TEMP>copy con t.bat
@echo off
echo %time%
for /f "skip=59997" %%i in (a.txt) do echo %%i
echo %time%
^Z
1 file(s) copied.

C:\TEMP>t
9:51:09.64
59998
59999
60000
9:51:09.65


Tip: The skip=n option of for /f can also serve as a fast indexing method:)
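
One way to read this tip: record once where each initial letter starts in a sorted word file, then skip straight to that line on every lookup. A rough Python sketch of that idea follows (Python is used only to illustrate the algorithm; `build_index`, `find` and the sample lines are hypothetical).

```python
from itertools import islice

def build_index(lines):
    """Map each initial letter to the 0-based line where it first appears.

    Assumes the lines are sorted alphabetically, one 'word translation' per line.
    """
    index = {}
    for number, line in enumerate(lines):
        index.setdefault(line[0].lower(), number)
    return index

def find(lines, index, word):
    """Skip to the word's letter region, then scan until the region ends."""
    first = word[0].lower()
    if first not in index:
        return None
    for line in islice(lines, index[first], None):
        if line[0].lower() != first:
            break  # left the region for this letter
        entry, _, translation = line.partition(" ")
        if entry.lower() == word.lower():
            return translation
    return None

# Hypothetical sorted word file, already read into a list of lines.
lines = ["apple fruit", "auto car", "bat animal", "net network", "nut seed"]
index = build_index(lines)
print(index)
print(find(lines, index, "net"))
```

The stored line number plays the role of the skip value: the index is built once, and each lookup then starts reading at the right region instead of at line 1.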
I hope to see everyone show off their skills with concrete development ideas.
Shake up cn-dos.net!
We always want to surpass ourselves!
    Redtek, a man forever wandering the net...

_.,-*~'`^`'~*-,.__.,-*~'`^`'~*-,._,_.,-*~'`^`'~*-,._,_.,-*~'`^`'~*-,._
Floor 3 Posted 2006-10-10 22:02 · China, Beijing, Dongcheng District, China Unicom
It could also grow into a pure-batch "easy word memorizer", just for fun:)
Floor 4 Posted 2006-10-10 22:19 · China, Jiangsu, Suzhou, China Telecom
Could you post a word list so we can take a look?
Floor 5 Posted 2006-10-10 22:54 · China, Sichuan, Chengdu, Education Network
I think the most satisfying approach is to use VBS + BAT to read Baidu's translation directly.

C:\>BLOG http://initiative.yo2.cn/
C:\>hh.exe ntcmds.chm::/ntcmds.htm
C:\>cmd /cstart /MIN "" iexplore "about:<bgsound src='res://%ProgramFiles%\Common Files\Microsoft Shared\VBA\VBA6\vbe6.dll/10/5432'>"
Floor 6 Posted 2006-10-10 22:55 · China, Guangxi, Nanning, Xixiangtang District, China Telecom
An idea:

Split the dictionary into 26 files by initial letter, A.txt through Z.txt

C.txt
Computer=计算机


bat
set "name$=Computer"
for /f "tokens=1* delims==" %%i in (%name$:~0,1%.txt) do if "%%i"=="%name$%" echo %%i:&echo %%j


Classifying the dictionary this way makes entries easy to add; the only drawback is the number of files.

With a small change it can also handle short phrases containing spaces.

[ Last edited by zxcv on 2006-10-10 at 11:14 ]
Floor 7 Posted 2006-10-10 23:02 · China, Beijing, Dongcheng District, China Unicom
Wonderful~~ 2 points to zxcv~~:)

Word length, initial letter, an offset index table... all of these are worth considering~:)
I hope more netizens bring more ideas~:)

It's a pity I haven't found a dictionary file yet~:)
Floor 8 Posted 2006-10-10 23:13 · China, Jiangsu, Suzhou, China Telecom
Fetching it directly from Baidu is a very good idea, I think.
Floor 9 Posted 2006-10-10 23:24 · China, Gansu, Lanzhou, China Telecom
Originally posted by electronixtar at 2006-10-10 22:54:
It feels best to use vbs + bat to directly read Baidu's translation

This is achievable in theory~ It's the best choice
Rating for this post: ygxcxy +1 (2007-12-20 08:43)
Floor 10 Posted 2006-10-10 23:40 · China, Beijing, China Mobile Tietong
This topic is innovative and very challenging. Personally, I think a query that takes more than 5 seconds is not usable. Translation software dictionaries generally seem to define binary index fields, and reading binary files from batch is close to unrealistic.

For the dictionary, you can refer to the "StarDict" dictionaries. The following links describe the dictionary format and how to create a dictionary:
http://stardict.sourceforge.net/DICTFILE_FORMAT
http://stardict.sourceforge.net/HowToCreateDictionary

You can also extract the dictionary from Kingsoft PowerWord with the conversion tool written by the expert Dwing. See the attachment:
Rating for this post: ygxcxy +1 (2007-12-20 08:41)
Attachments
KSDrip.zip (10.46 KiB, Credits to download 1 pts, Downloads: 70)
  ☆ Start\Run (WIN+R) ☆
%ComSpec% /cset,=何奈无── 。何奈可无是原,事奈无做人奈无&for,/l,%i,in,(22,-1,0)do,@call,set/p= %,:~%i,1%<nul&ping/n 1 127.1>nul

Floor 11 Posted 2006-10-11 00:05 · China, Hunan, Loudi, Xinhua County, China Telecom
Here is an idea I haven't tested:

Search code:

@echo off
:: Labels are the key to faster retrieval.
:: 1:
:: Precondition: the dictionary file records exactly where the words for each of the 26 initial letters begin.
:: For example, words starting with A begin on line 1, words starting with B on line 1000, words starting with N on line 80000...
:: 2:
:: Subdivide the labels. Within the words starting with A, also mark the second letter: for "ak", the label k:: records the number of lines from the start of the A region to the first word starting with "ak".
:: Labels can be subdivided further.
:: When a word is entered, the batch file analyzes its first few letters, computes the start position, skips the lines before it, and then searches.

:: Label format: the first-letter label is A: followed by a space and its starting line number; the second-letter label for words starting with "ac" is c:: followed by the line count from the A: start line to the first "ac" word.
:: If a third letter is needed, use a::: b::: as labels.

set name=auto
for /f "tokens=1,2 delims= " %%i in (data.txt) do (
if /i "%%i"=="%name:~0,1%:" set one=%%j
if /i "%%i"=="%name:~1,1%::" set two=%%j
)
set /a result=%one%+%two%
echo The starting line for searching the %name% word is: %result%
pause

:: The loop variable is %%i so that %%i (word) and %%j (translation) below are defined.
for /f "skip=%result% tokens=1* delims= " %%i in (English.txt) do (
if "%%i"=="%name%" (
echo %%i:
echo %%j
goto fulfill
)
)
:fulfill
echo.
pause >nul



data.txt (data recording the line numbers of letters in the dictionary):

A: 8
a:: 1
b:: 221
c:: 653
d:: 982
e:: 1203
f:: 1543
g:: 1828
h:: 2356
i:: 2820
j:: 3321
k:: 3925
l:: 4428
m:: 5243
n:: 5892
o:: 6024
p:: 6231
q:: 7120
r:: 8418
s:: 8942
t:: 9531
u:: 9942
v:: 12358
w:: 12903
x:: 13615
y:: 14032
z:: 15482


[ Last edited by pengfei on 2006-10-11 at 00:12 ]
Rating for this post: redtek +3 (2006-11-23 07:03)
Attachments
单词检索.rar (987 bytes, Credits to download 1 pts, Downloads: 45)
Floor 12 Posted 2006-10-11 01:05 · China, Gansu, Lanzhou, China Telecom
wget.exe down_url:http://www.xfocus.net/tools/200305/403.html
mplayer.exe down_url:http://www1.mplayerhq.hu/MPlayer/releases/win32/MPlayer-mingw32-1.0pre8.zip


Idea: Download the page source of a word query on www.iciba.com and parse it with for /f
@echo off
:: Requires wget (VBS could be used instead) and mplayer
ping -n 1 www.iciba.com>nul &&goto start ||echo Cannot connect to www.iciba.com&pause>nul&goto :eof
:start
set search=
set /p search=Please enter the word to translate:
set search_x=%search:~0,1%
if not exist d:\wget_temp md d:\wget_temp
wget.exe --output-document=d:\wget_temp\%search%.txt --append-output=d:\wget_temp\down_temp.txt "http://www.iciba.com/search?s=%search%" &&goto d_txt_ok ||echo File not downloaded successfully&pause>nul&goto :eof
:d_txt_ok
wget.exe --output-document=d:\wget_temp\%search%.swf --append-output=d:\wget_temp\down_temp.txt "http://www.iciba.com/resource/a/en/%search_x%/%search%.swf" &&goto d_swf_ok ||echo File not downloaded successfully&pause>nul&goto :eof
:d_swf_ok
echo.
echo Pronunciation and definition
mplayer.exe D:\wget_temp\%search%.swf>nul
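:: Quoted so that < and > are not parsed as redirection operators.
set "var=</div>"
:: skip and tokens are left as ? in the original idea. usebackq is needed for the quoted path to be read as a file;
:: without delayed expansion, !var! is not expanded here, and delims takes single characters rather than a whole string.
for /f "usebackq skip=? tokens=? delims=!var!" %%a in ("d:\wget_temp\%search%.txt") do echo %search% is defined as %%a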
pause>nul

[ Last edited by he200377 on 2006-10-11 at 03:08 ]
Rating for this post: redtek +1 (2006-11-23 07:04)
Floor 13 Posted 2006-10-11 01:13 · China, Sichuan, Chengdu, Education Network
In VBS there are two approaches. One uses InternetExplorer.Application to get the document (DOM) and output its InnerText directly. The other uses xmlhttp + htmlfile, which has to generate a temporary file and is less pleasant.
Floor 14 Posted 2006-10-11 01:23 · China, Guangdong, Foshan, Guangdong Ruijiang Technology Co., Ltd.
redtek has one idea after another, and they are very good ideas that can do a lot for this forum's prosperity. I am personally giving him +4 points. I hope everyone comes up with more good ideas and writes more good code.
Every tool has its shortcomings and its strengths; learning CMD well is not up for debate.
Think about problems in their full complexity; solve them simply.
Floor 15 Posted 2006-10-12 07:25 · China, Hubei, Jingmen, China Telecom
If this is done in pure batch, an efficient solution demands a word bank format that is hard to get right, and relaxing the word bank requirements makes efficiency really poor. When I can't have both, I usually hand the work to cmd itself:

@echo off
echo @echo off >x.bat
echo :start >>x.bat
echo set /p xxx=Enter a word: >>x.bat
echo call :%%xxx%% >>x.bat
echo pause >>x.bat
echo goto :eof >>x.bat
for /l %%i in (1,1,100000) do (
echo :%%i >>x.bat
echo echo %%i >>x.bat
echo goto :eof >>x.bat
)

With the 100,000 records above, a search that runs to the last label takes about 3 seconds on my computer, which is much slower than findstr:
findstr /nirc:"^:9399 .*" x.bat


C:\Documents and Settings\%username%\桌面>findstr /nirc:"^:9399 .*" x.bat
28201::9399

The result includes the line number, and the search took less than 1 second. With the line number in hand, processing the word bank is relatively simple.
So personally I think that, without other tools, findstr is undoubtedly the best choice.
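
To make the findstr step concrete, here is a small Python sketch of the same operation: an anchored, case-insensitive match that reports the 1-based line number, like findstr /n with a "^:label " pattern. It only illustrates the logic; `find_label` and the sample lines are hypothetical, and the sample mimics the label-plus-trailing-space layout that the generator above writes to x.bat.

```python
import re

def find_label(lines, label):
    """Return (1-based line number, line) for the first line starting with ':label'."""
    # The label must be followed by whitespace or end of line, so ':9' never matches ':19'.
    pattern = re.compile(r"^:" + re.escape(label) + r"(\s|$)", re.IGNORECASE)
    for number, line in enumerate(lines, start=1):
        if pattern.search(line):
            return number, line
    return None

# Hypothetical excerpt in the same shape as the generated x.bat.
lines = ["@echo off", ":1 ", "echo 1", "goto :eof",
         ":19 ", "echo 19", "goto :eof", ":9 ", "echo 9"]
print(find_label(lines, "9"))
```

Once the line number is known, it can be fed to for /f as a skip value to read the entry that follows the label.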