|
namejm
Honorary Moderator
       batch fan
Points 5226
Posts 1737
Registered 2006-3-10, from Chengdu
Status: Offline
|
『Original post』:
Preliminary index of classic posts in the Batch Processing Room [updated 20070211]
The idea of compiling the Batch Processing Room's classic posts was raised long ago, but nobody had much time for it back then, so it was shelved. In early January, thanks to webmaster wengier's continuous updates and the diligent testing by qzwqzw and others, development of the forum's DOS interface went smoothly. That made extracting post title records much easier and created the conditions for building an index of the Batch Processing Room's classic posts.
I started compiling the index in late January, but other matters got in the way and the work was interrupted for a while. I only resumed two days ago and finished the first draft today. I am posting it so that everyone can discuss how to improve it.
My inclusion criteria are:
① Posts that analyze or discuss a particular problem in depth;
② Posts that attracted a lot of attention and are highly practical or general-purpose;
③ Posts that show considerable technique or an original approach or insight;
④ Posts that offer code written with different approaches or different scripting tools.
Because this is only a first draft, I am providing the results in two versions. For ease of compiling, "分类整理初步(详细).txt" (the detailed version) keeps only the key fields of each post: creation time, ID, reply count, view count, and title. It contains no forum links, but any post can be reached at http://www.cn-dos.net/forum/viewthread.php?tid=ID, with ID replaced by the post's ID number. For ease of reading, "分类整理初步(带链接版).txt" (the version with links) provides only links and titles.
As a first draft it certainly has many shortcomings. For example, the classification is not entirely sensible, and entries could still be added, removed, or moved; posts in the same category may also overlap in content and could be trimmed down to the best ones. Please offer plenty of suggestions so that this index becomes one of the Batch Processing Room's finest resources.
My preliminary plans are:
① Make the index categories more sensible;
② Keep the number of entries as small as possible, listing only representative posts of each kind;
③ Model the published index thread on "DOS联盟论坛解答室精华帖索引(2005.08.10)" (the index of featured posts in the DOS Union Forum Q&A Room), but post it by category and set the categories apart with colors, fonts, and font sizes.
Thanks to ccwan for suggestions on classification, and to jmz573515 for suggestions on the content of the two versions.
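As a rough illustration of how the detailed version could be turned into the version with links, here is a minimal Python sketch. Only the URL pattern comes from the post. The tab-separated record layout, the function names, and the sample record are assumptions made for this example, since the post does not describe the file format.

```python
from urllib.parse import urlencode

# Base address quoted in the post; everything else below is illustrative.
BASE_URL = "http://www.cn-dos.net/forum/viewthread.php"


def thread_url(tid: int) -> str:
    """Build a thread link from a post ID using the tid=ID pattern."""
    return f"{BASE_URL}?{urlencode({'tid': tid})}"


def parse_record(line: str) -> dict:
    """Parse one record of the detailed list.

    Assumption: fields are tab-separated in the order
    creation time, ID, reply count, view count, title.
    """
    created, tid, replies, views, title = line.rstrip("\n").split("\t", 4)
    return {
        "created": created,
        "tid": int(tid),
        "replies": int(replies),
        "views": int(views),
        "title": title,
    }


def to_link_line(record: dict) -> str:
    """Render a record as 'link, two spaces, title'."""
    return f"{thread_url(record['tid'])}  {record['title']}"


if __name__ == "__main__":
    # Made-up sample record, not taken from the real attachment.
    sample = "2007-01-01\t12345\t10\t200\tExample title"
    print(to_link_line(parse_record(sample)))
```

If the real file uses a different separator or field order, only `parse_record` would need to change.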
Attachment
1: 批处理室经典帖子初步整理结果.rar (2007-2-11 14:29, 31.44 KiB, points required to download: 1, downloads: 151)
|

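Everything has its shortcomings and its strengths; learning CMD well is non-negotiable.
Think problems through in all their complexity; solve them simply. |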
|
2007-2-7 01:51 |
|
|
redtek
Gold Member

Points 2902
Posts 1147
Registered 2006-9-21
Status: Offline
|
『Post 2』:
Huge bump!!! This is really great~ :)))
|

Redtek, forever a wanderer on the net...
_.,-*~'`^`'~*-,.__.,-*~'`^`'~*-,._,_.,-*~'`^`'~*-,._,_.,-*~'`^`'~*-,._ |
|
2007-2-7 02:07 |
|
|
namejm
Honorary Moderator
       batch fan
Points 5226
Posts 1737
Registered 2006-3-10, from Chengdu
Status: Offline
|
『Post 3』:
Originally posted by redtek at 2007-2-6 13:07:
Huge bump!!! This is really great~ :)))
I look forward to your suggestions.
|

|
2007-2-7 02:12 |
|
|
redtek
Gold Member

Points 2902
Posts 1147
Registered 2006-9-21
Status: Offline
|
『Post 4』:
namejm is always so modest~ :)
I look forward to seeing the index of featured posts for the DOS Union Forum's DOS Batch Processing Room shining in the forum's "Important Topics" section~ :)
|

|
2007-2-7 02:25 |
|
|
qingfushuan
Advanced User

Points 502
Posts 327
Registered 2006-12-30
Status: Offline
|
『Post 5』:
Bump
Bump, bump, with everything I've got!
|
|
2007-2-7 04:37 |
|
|
vkill
Gold Member

Points 4103
Posts 1744
Registered 2006-1-20, from Linze, Gansu
Status: Offline
|
|
2007-2-7 05:21 |
|
|
namejm
Honorary Moderator
       batch fan
Points 5226
Posts 1737
Registered 2006-3-10, from Chengdu
Status: Offline
|
『Post 7』:
So that the index can be published as soon as possible for everyone's benefit, I hope no one will just stand by and watch.
|

|
2007-2-8 01:38 |
|
|
anqing
Advanced User

Points 859
Posts 413
Registered 2006-8-14
Status: Offline
|
|
2007-2-8 01:53 |
|
|
无奈何
Honorary Moderator

Points 1338
Posts 356
Registered 2005-7-15
Status: Offline
|
『Post 9』:
Very nice!
I suggest including each post's creation time when the index is published, as in the thread at tid=13226. namejm, what you have kept so far is the last-reply time, which does not seem very meaningful.
|

☆Start\Run (WIN+R)☆
%ComSpec% /cset,=何奈无── 。何奈可无是原,事奈无做人奈无&for,/l,%i,in,(22,-1,0)do,@call,set/p= %,:~%i,1%<nul&ping/n 1 127.1>nul
|
|
2007-2-8 02:35 |
|
|
namejm
Honorary Moderator
       batch fan
Points 5226
Posts 1737
Registered 2006-3-10, from Chengdu
Status: Offline
|
『Post 10』:
Originally posted by 无奈何 at 2007-2-7 13:35:
I suggest including each post's creation time when the index is published, as in the thread at tid=13226. namejm, what you have kept so far is the last-reply time, which does not seem very meaningful.
As I said in the original post, the published index will follow the format of the sticky index thread in the Q&A Room, so it will include the creation time. The preliminary results I have released also keep the creation time, not the last-reply time; 无奈何, you may have misread.
[ Last edited by namejm on 2007-8-13 at 01:56 PM ]
|

|
2007-2-8 02:53 |
|
|
无奈何
Honorary Moderator

Points 1338
Posts 356
Registered 2005-7-15
Status: Offline
|
『Post 11』:
namejm, my apologies, I only skimmed it.
Classification really is a headache. If the categories are too coarse, everything looks jumbled; if they are too fine, it is hard to decide which category a post belongs in. On balance, finer categories seem better. What we lack now is a systematic, sensible category table, and I hope everyone can discuss it together.
|

|
|
2007-2-8 03:35 |
|
|
qzwqzw
Silver Member
     White shadow of the sky
Points 2343
Posts 636
Registered 2004-3-6
Status: Offline
|
『Post 12』:
There is really no need to be too particular about the classification scheme,
because a lot of content simply cannot be categorized:
it may belong to both A and B while also being related to C.
The point is to let people quickly find what they need,
so building an index matters more than building a classification.
The most effective and most commonly used kind of index is undoubtedly the keyword index,
and choosing and setting up keywords is much simpler:
high-frequency words can be extracted from the content,
or keywords can be summarized from it.
-------------------------------------------------------
I also have a preliminary idea:
the quality of a keyword depends on how accurately it locates content and how much users care about it,
so the index could be optimized dynamically according to what users select,
continually adding new keywords and their index entries
and retiring old ones.
This process should be automatable.
The key question is how to collect and feed back users' selection behavior.
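The keyword-index idea can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not anything the forum actually ran: the tokenizer handles ASCII words only (Chinese titles would need a real word segmenter), and the names `tokenize`, `build_index`, and `ClickRanker`, the stopword list, and the thresholds are all invented for this example.

```python
import re
from collections import Counter, defaultdict

# Assumption: a tiny stopword list; note that "for" is kept because it is a batch command.
STOPWORDS = {"the", "a", "of", "in", "to", "and", "with"}


def tokenize(title: str) -> list:
    """Split a title into lowercase ASCII words, dropping stopwords."""
    return [w for w in re.findall(r"[a-z0-9]+", title.lower()) if w not in STOPWORDS]


def build_index(titles: dict, min_count: int = 2) -> dict:
    """Map each high-frequency keyword to the sorted list of thread IDs using it.

    titles maps thread ID -> title. A word becomes a keyword only if it
    appears in at least min_count titles.
    """
    doc_freq = Counter()
    for title in titles.values():
        doc_freq.update(set(tokenize(title)))
    index = defaultdict(list)
    for tid, title in sorted(titles.items()):
        for word in set(tokenize(title)):
            if doc_freq[word] >= min_count:
                index[word].append(tid)
    return dict(index)


class ClickRanker:
    """Track which keywords users select, then rank or retire keywords."""

    def __init__(self, index: dict):
        self.index = index
        self.clicks = Counter()

    def record_click(self, keyword: str) -> None:
        # Selections of unknown keywords are ignored.
        if keyword in self.index:
            self.clicks[keyword] += 1

    def top(self, n: int) -> list:
        """Return the n most-selected keywords, ties broken alphabetically."""
        return sorted(self.index, key=lambda k: (-self.clicks[k], k))[:n]

    def prune(self, min_clicks: int = 1) -> None:
        """Retire keywords selected fewer than min_clicks times."""
        self.index = {k: v for k, v in self.index.items() if self.clicks[k] >= min_clicks}
```

How selections would actually be collected (the open question in the post) is left out; `record_click` simply stands in for whatever feedback channel is chosen.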
|
|
2007-2-8 05:27 |
|
|
无奈何
Honorary Moderator

Points 1338
Posts 356
Registered 2005-7-15
Status: Offline
|
『Post 13』:
Originally posted by qzwqzw at 2007-2-8 05:27:
so the index could be optimized dynamically according to what users select
That is a good proposal. Since I don't know JavaScript and the like, a dynamic index would be hard for me to implement. Could friends who know JavaScript advise on whether it is feasible?
The approach I have in mind is to produce a CHM file: set up a small Apache + PHP server on the local machine to accept the keywords, category information, and so on that users submit, then gather every user's submissions together, and finally build a CHM file with categories and an index. This can never be truly dynamic, because it cannot update as users reorganize things.
Going a step further, a script could generate the files needed for CHM compilation so that users can compile the CHM file themselves.
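To make the last idea concrete, here is a hedged Python sketch that generates the two text files a CHM project typically needs: a project file (.hhp) and a table-of-contents file (.hhc). The layouts follow the HTML Help Workshop formats as commonly documented, but they should be checked against the official documentation before use. The function names and sample data are hypothetical, and the post itself does not say which scripting tool was intended.

```python
from html import escape


def build_hhp(title: str, chm_name: str, hhc_name: str, files: list) -> str:
    """Return the text of a minimal HTML Help project (.hhp) file.

    Assumption: only a handful of [OPTIONS] keys are needed; real projects
    often set more (language, index file, window definitions).
    """
    lines = [
        "[OPTIONS]",
        f"Compiled file={chm_name}",
        f"Contents file={hhc_name}",
        f"Default topic={files[0]}",
        f"Title={title}",
        "",
        "[FILES]",
        *files,
    ]
    return "\n".join(lines) + "\n"


def sitemap_item(name: str, local: str = "") -> list:
    """Return the lines of one sitemap entry; 'local' is the target HTML file."""
    lines = ['<LI><OBJECT type="text/sitemap">',
             f'<param name="Name" value="{escape(name, quote=True)}">']
    if local:
        lines.append(f'<param name="Local" value="{escape(local, quote=True)}">')
    lines.append("</OBJECT>")
    return lines


def build_hhc(categories: dict) -> str:
    """Return the text of a contents (.hhc) file.

    categories maps a category name to a list of (title, html_file) pairs.
    """
    out = ['<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">', "<HTML><BODY>", "<UL>"]
    for category, topics in categories.items():
        out.extend(sitemap_item(category))  # category heading without a target
        out.append("<UL>")
        for title, local in topics:
            out.extend(sitemap_item(title, local))
        out.append("</UL>")
    out.extend(["</UL>", "</BODY></HTML>"])
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    print(build_hhp("Batch index", "index.chm", "index.hhc", ["a.html", "b.html"]))
    print(build_hhc({"FOR": [("Example topic", "a.html")]}))
```

Users would then save both strings next to the HTML topic files and run the project file through the HTML Help compiler themselves, which matches the "let users compile it" step in the post.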
|

|
|
2007-2-8 07:31 |
|
|
namejm
Honorary Moderator
       batch fan
Points 5226
Posts 1737
Registered 2006-3-10, from Chengdu
Status: Offline
|
『Post 14』:
Originally posted by qzwqzw at 2007-2-7 16:27:
There is really no need to be too particular about the classification scheme,
because a lot of content simply cannot be categorized:
it may belong to both A and B while also being related to C.
The point is to let people quickly find what they need,
so building an index matters more than building a classification.
Some content is indeed hard to categorize precisely, but I think the classification work should still be done. The preliminary results contain more than 400 classic post titles; if they were all lumped together instead of sorted into categories, finding what interests you would be very laborious. To keep the workload and difficulty down, for now I only want to list the post titles so that clicking one jumps to that post. What I am building is a title-list index, not a full-text keyword index.
Of course, if someone can find a way to build a keyword index, that would be a great thing for the forum. Unfortunately it is beyond my ability to take on such a difficult project, so I can only count on others :o
|

|
2007-2-8 13:54 |
|
|
jmz573515
Silver Member

Points 1212
Posts 464
Registered 2006-12-13
Status: Offline
|
『Post 15』:
I wonder whether this will be of any help to everyone.
Attachment
1: 搜索DOS论坛.rar (2007-2-10 00:57, 76.35 KiB, points required to download: 1, downloads: 68)
|
|
2007-2-10 00:57 |
|
|