Re namejm & others:
Regarding Brother yuanyong630's encryption scheme, your line of thinking is correct, but a few details are off.
When Notepad saves a newly created document and no encoding is specified, it defaults to ANSI (on the Chinese version of Windows, that means GB encoding).
When opening an existing document, Notepad analyzes the document's encoding. It first checks for a BOM (Byte Order Mark, 2 to 3 bytes) at the very beginning of the file. If one is present, the BOM alone decides the encoding: FF FE (Unicode, i.e. UTF-16 little endian), FE FF (Unicode big endian), EF BB BF (UTF-8).
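The BOM check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Notepad's actual code; the function name `sniff_bom` and the returned labels are made up for this example.

```python
import codecs

# BOM signatures from the post, paired with a human-readable label.
# The labels are illustrative, not names used by Notepad itself.
BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),                   # EF BB BF
    (codecs.BOM_UTF16_LE, "Unicode"),             # FF FE
    (codecs.BOM_UTF16_BE, "Unicode big endian"),  # FE FF
]

def sniff_bom(data: bytes):
    """Return the label of the BOM that data starts with, or None if there is no BOM."""
    for bom, label in BOMS:
        if data.startswith(bom):
            return label
    return None

print(sniff_bom(b"\xff\xfeh\x00i\x00"))   # file saved as "Unicode"
print(sniff_bom(b"plain text, no BOM"))   # falls through to content guessing
```

A file for which `sniff_bom` returns `None` is exactly the case where guessing from the content takes over.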
Since many non-ANSI documents are "plain text" with no BOM at all, a document without a BOM cannot simply be assumed to be ANSI. Instead, a series of statistical algorithms is used to guess the encoding from the content. Notepad uses the IsTextUnicode function to decide whether the text is Unicode/Unicode big endian, and IsTextUTF8 to check for UTF-8.
However, because these are statistical algorithms, misjudgments are inevitable, and the smaller the sample, the more likely they become, so very short documents are hit hardest. The well-known joke that Microsoft has a grudge against China Unicom comes from exactly this: IsTextUTF8 misidentifies an ANSI document containing only the two characters "联通" as UTF-8. IsTextUnicode makes similar mistakes; for instance, a document with the structure of "this app can break" (word lengths 4-3-3-5) is misidentified as Unicode.
Note that such misjudgments are only likely when the text is short and its byte pattern is left undisturbed. Once the text is modified even slightly (adding a single carriage return can be enough), a misjudgment becomes hard to trigger.
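Both misjudgments can be reproduced at the byte level. The sketch below is an approximation under stated assumptions: `looks_like_utf8` is a hypothetical lax check written for this post (it only looks at the lead-byte/continuation-byte bit patterns and does not reject overlong forms), not the real IsTextUTF8, and Python's `gbk` codec stands in for the Chinese ANSI code page.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Lax structural test: every lead byte is followed by the right number
    of 10xxxxxx bytes, and at least one multi-byte sequence is present."""
    i, seen_multibyte = 0, False
    while i < len(data):
        b = data[i]
        if b < 0x80:
            need = 0                  # plain ASCII byte
        elif b & 0xE0 == 0xC0:
            need = 1                  # 110xxxxx: one continuation byte
        elif b & 0xF0 == 0xE0:
            need = 2                  # 1110xxxx: two continuation bytes
        elif b & 0xF8 == 0xF0:
            need = 3                  # 11110xxx: three continuation bytes
        else:
            return False              # stray continuation or invalid lead byte
        tail = data[i + 1:i + 1 + need]
        if len(tail) != need or any(t & 0xC0 != 0x80 for t in tail):
            return False
        seen_multibyte = seen_multibyte or need > 0
        i += 1 + need
    return seen_multibyte

liantong = "联通".encode("gbk")
print(liantong.hex(" "))                          # c1 aa cd a8
print(looks_like_utf8(liantong))                  # the GB bytes pass the lax test
print(looks_like_utf8("中国".encode("gbk")))      # most GB text does not

# 18 ASCII bytes read as nine UTF-16 little-endian code units
misread = b"this app can break".decode("utf-16-le")
print(all(0x4E00 <= ord(ch) <= 0x9FFF for ch in misread))
```

Each byte pair of "联通" happens to match the 110xxxxx 10xxxxxx pattern of a two-byte UTF-8 sequence, and each pair of ASCII letters in "this app can break" happens to form a code unit in the CJK ideograph range, which is why both texts look plausible under the wrong encoding.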
What makes Brother yuanyong630's scheme special is its byte string, which not only has Unicode characteristics but is also long, at 1288 bytes. Its Unicode characteristics are therefore strong, and statistically it can withstand interference from short strings that lack those characteristics. When the interfering string gets somewhat longer, however, the Unicode characteristics are diluted until IsTextUnicode finally judges the file to be non-Unicode. So friends who could not reproduce the effect should look at the length and content of the extra batch code they appended. Everyone can test the code in .
Other editors (such as Word/WordPad/EditPlus/UltraEdit) use newer encoding detection algorithms. Their Unicode detection has improved, though their UTF-8 detection is still unsatisfactory. In theory no completely accurate algorithm can exist, so the only real remedies are to avoid non-ANSI documents without a BOM, or to specify the encoding manually when opening a document.
Additionally, if Notepad saves a file whose encoding it has misjudged, recovery becomes difficult. Saving with the misjudged encoding adds a BOM, so other editors will then also treat the file as that encoding and no longer show the original text. Saving as ANSI converts the content as if it really were Unicode, leaving almost no possibility of restoring it.
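The two save paths can be imitated in Python to see what ends up on disk. This is a rough sketch, not Notepad's implementation: it assumes the file held the ASCII bytes of "this app can break", that it was misread as Unicode (UTF-16 little endian), and, for the ANSI case, that the system code page is the Western `cp1252` (an assumption picked for illustration; with a different ANSI code page the damage looks different).

```python
original = b"this app can break"

# What the editor believes the text is after the misjudgment
misread = original.decode("utf-16-le")

# Path 1: save with the misjudged encoding, which prepends the FF FE BOM
saved_unicode = b"\xff\xfe" + misread.encode("utf-16-le")
print(saved_unicode[:2].hex(" "))      # the BOM now pins the wrong encoding
print(saved_unicode[2:] == original)   # the original bytes are still behind it

# Path 2: save as ANSI; characters the code page lacks are replaced
saved_ansi = misread.encode("cp1252", errors="replace")
print(saved_ansi)
```

In the first path the original bytes survive behind the BOM, but every BOM-aware editor will now show the wrong text. In the second path, under the `cp1252` assumption, every character is replaced and nothing of the original remains.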
Unicode Introduction
http://my.opera.com/neutronstar/blog/index.dml/tag/编码
Why Does Microsoft Have a Grudge Against China Unicom
http://blog.vckbase.com/localvar/archive/2005/07/12/9510.aspx
Notepad bug? Encoding issue?
http://weblogs.asp.net/cumpsd/archive/2004/02/27/81098.aspx
Bush Hid The Facts
http://www.shoutwire.com/comments/16341/Bush_Hid_The_Facts
cry.cmd
for /l %%a in (1,1,10) do ren *.jpg %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a %%a
@echo off
echo bbs.cn-dos.net
echo.