Lucene: can I replace the iterator with a method call?
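I have an idea:
So far I have part 1.
Part 2 is done as well, but it uses an iterator, which means we walk through every term before reaching the one I need. How can I get my term directly and find its position in the text?
My code: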
public void methodFromStack() throws Exception {
    Directory directory = new RAMDirectory();
    IndexWriterConfig indexWriterConfig = new IndexWriterConfig(new StandardAnalyzer());
    IndexWriter writer = new IndexWriter(directory, indexWriterConfig);
    Document doc = new Document();
    // Field.Store.NO,Field.Index.ANALYZED,Field.TermVector.YES
    FieldType type = new FieldType();
    type.setStoreTermVectors(true);
    type.setStoreTermVectorPositions(true);
    type.setStoreTermVectorOffsets(true);
    type.setStored(true);
    type.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
    Field fieldStore = new Field("tags", "Kite good world.", type);
    doc.add(fieldStore);
    writer.addDocument(doc);
    writer.close();
    DirectoryReader reader = DirectoryReader.open(directory);
    IndexSearcher searcher = new IndexSearcher(reader);
    // Phrase search with slop
    QueryParser queryParser = new QueryParser("tags", new StandardAnalyzer());
    Query query = queryParser.parse("\"Kite World\"~1");
    TopDocs results = searcher.search(query, 1);
    for (ScoreDoc scoreDoc : results.scoreDocs) {
        Fields termVs = reader.getTermVectors(scoreDoc.doc);
        Terms f = termVs.terms("tags");
        TermsEnum te = f.iterator();
        PostingsEnum docsAndPosEnum = null;
        BytesRef bytesRef;
        // The iterator here outputs all terms, but I only need my one result term and its position
        while ((bytesRef = te.next()) != null) {
            docsAndPosEnum = te.postings(docsAndPosEnum, PostingsEnum.ALL);
            // for each term (iterator next) in this field (field)
            // iterate over the docs (should only be one)
            int nextDoc = docsAndPosEnum.nextDoc();
            assert nextDoc != DocIdSetIterator.NO_MORE_DOCS;
            final int fr = docsAndPosEnum.freq();
            final int p = docsAndPosEnum.nextPosition();
            final int o = docsAndPosEnum.startOffset();
            System.out.println("Word: " + bytesRef.utf8ToString());
            System.out.println("Position: " + p + ", startOffset: " + o + " length: "
                    + bytesRef.length + " Freq: " + fr);
            if (fr > 1) {
                // print the remaining positions of a term that occurs more than once
                for (int iter = 1; iter <= fr - 1; iter++) {
                    System.out.println("Position: " + docsAndPosEnum.nextPosition());
                }
            }
        }
    }
}
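(I know that older versions of the Lucene library had the classes TermFreqVector and TermPositionVector, but the API changed in the move from version 3 to 4. After those changes, the only approach I have found is the iterator.
Environment: Windows + NetBeans + Maven + Lucene 7.4.0)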
Solution
To solve this, use the seekExact method. You can test it with this code:
// ref is the BytesRef of the term to look up; the original answer does not show its definition.
// It has to hold the analyzed form of the term, e.g. new BytesRef("kite") (assumed example).
TermsEnum te = f.iterator();
PostingsEnum docsAndPosEnum = null;
if (te.seekExact(ref)) {
    docsAndPosEnum = te.postings(docsAndPosEnum, PostingsEnum.ALL);
    int nextDoc = docsAndPosEnum.nextDoc();
    assert nextDoc != DocIdSetIterator.NO_MORE_DOCS;
    final int freq = docsAndPosEnum.freq();
    final int pos = docsAndPosEnum.nextPosition();
    final int o = docsAndPosEnum.startOffset();
    System.out.println("Word: " + ref.utf8ToString());
    System.out.println("Position: " + pos + ", startOffset: " + o + " length: " + ref.length + " Freq: " + freq);
}
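Once the seek succeeds, the offsets are what tie the term back to the original text. Below is a rough sketch in plain Java (no Lucene on the classpath) of how the start and end offsets could be used to locate the term in the stored field value. The class and method names are made up for illustration, and the offsets passed in main (0 to 4 for "kite", 10 to 15 for "world") are hard-coded assumptions standing in for what startOffset() and endOffset() would be expected to return for "Kite good world.". The last helper shows why the length printed from ref.length is better not used as a character span: it counts the UTF-8 bytes of the analyzed term, which can differ from the number of characters in the text.

```java
import java.nio.charset.StandardCharsets;

public class OffsetLocator {

    /** Returns the part of the original text covered by [startOffset, endOffset). */
    public static String slice(String text, int startOffset, int endOffset) {
        return text.substring(startOffset, endOffset);
    }

    /** Returns the text with the span [startOffset, endOffset) wrapped in brackets. */
    public static String mark(String text, int startOffset, int endOffset) {
        return text.substring(0, startOffset)
                + "[" + text.substring(startOffset, endOffset) + "]"
                + text.substring(endOffset);
    }

    /** Number of bytes the term occupies in UTF-8, which is what a byte-based length reports. */
    public static int utf8Length(String term) {
        return term.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String text = "Kite good world.";

        // Assumed offsets for the two query terms (hard-coded stand-ins, not read from an index).
        System.out.println(slice(text, 0, 4));
        System.out.println(mark(text, 10, 15));

        // A term with a non-ASCII character: character count and UTF-8 byte count differ.
        String term = "na\u00efve";
        System.out.println(term.length() + " chars, " + utf8Length(term) + " UTF-8 bytes");
    }
}
```

In the Lucene code, the two offsets would come from docsAndPosEnum.startOffset() and docsAndPosEnum.endOffset() after each nextPosition() call, and the text from the stored "tags" field.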