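Resource Overview
A Java implementation of the vector space model (VSM), covering the pipeline from document representation to similarity computation. It uses two similarity measures, cosine and tf-idf, and fixes bugs found in an earlier version.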
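
The archive's own Similarity.java is not shown on this page, so the following is only a minimal sketch of how tf-idf weighting and cosine similarity are commonly combined in a VSM. The class name `VsmSketch`, its method names, the sample documents, and the choice of idf = ln(N / df) are all assumptions made for illustration, not the author's code.

```java
import java.util.*;

// Hypothetical illustration; not part of the downloadable archive.
final class VsmSketch {

    // Raw term frequency: how many times each term occurs in one document.
    static Map<String, Integer> termFrequency(List<String> terms) {
        Map<String, Integer> tf = new HashMap<>();
        for (String t : terms) {
            tf.merge(t, 1, Integer::sum);
        }
        return tf;
    }

    // tf-idf weight per term, assuming idf = ln(N / df), where N is the
    // corpus size and df is the number of documents containing the term.
    static Map<String, Double> tfIdf(List<String> doc, List<List<String>> corpus) {
        Map<String, Double> weights = new HashMap<>();
        for (Map.Entry<String, Integer> e : termFrequency(doc).entrySet()) {
            int df = 0;
            for (List<String> other : corpus) {
                if (other.contains(e.getKey())) {
                    df++;
                }
            }
            if (df == 0) {
                continue; // term never occurs in the corpus; skip it
            }
            double idf = Math.log((double) corpus.size() / df);
            weights.put(e.getKey(), e.getValue() * idf);
        }
        return weights;
    }

    // Cosine of the angle between two sparse weight vectors.
    // Returns 0.0 when either vector has zero length.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Double w = b.get(e.getKey());
            if (w != null) {
                dot += e.getValue() * w;
            }
        }
        for (double w : b.values()) {
            normB += w * w;
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        List<String> d1 = Arrays.asList("vector", "space", "model");
        List<String> d2 = Arrays.asList("vector", "space", "retrieval");
        List<String> d3 = Arrays.asList("cooking", "recipe");
        List<List<String>> corpus = Arrays.asList(d1, d2, d3);

        // d1 and d2 share two terms, so their similarity is strictly between 0 and 1.
        System.out.println(cosine(tfIdf(d1, corpus), tfIdf(d2, corpus)));
        // d1 and d3 share no terms, so their similarity is 0.0.
        System.out.println(cosine(tfIdf(d1, corpus), tfIdf(d3, corpus)));
    }
}
```

Sparse maps are used here instead of fixed-length arrays so that only terms actually present in a document are stored. A term that appears in every document gets idf = ln(1) = 0 under this formula and therefore contributes nothing to the similarity.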

Code Snippet and File Information
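import java.util.*;

public class Doc {
    int length;              // number of terms in this document
    Vector<String> termVec;  // the terms in this document

    public Doc() {
        length = 0;
        termVec = new Vector<>();
    }

    @Override
    public String toString() {
        // Build the result with a StringBuilder instead of repeated concatenation
        StringBuilder s = new StringBuilder("\nThe length is :" + this.length);
        for (String t : termVec) {
            s.append(t).append("\n");
        }
        return s.toString();
    }
}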
Attribute       Size  Date        Time   Name
---------  ---------  ----------  -----  ----
Directory          0  2014-06-19  11:17  src\
File            3680  2014-06-19  09:35  src\ChineseStopWords.txt
File             344  2014-06-19  09:37  src\Doc.java
File             164  2014-06-19  09:38  src\DocSimilarity.java
File         8221775  2014-06-19  09:50  src\edited1988.txt
File           11950  2014-06-19  13:53  src\edited2014.txt
File           11648  2014-06-19  13:51  src\Similarity.java
File             536  2014-06-19  09:38  src\Term.java