Created a LowercaseKeywordAnalyzer, but TermQuery returns no results
Updated: 2023-09-27 18:04:28
I am using a PerFieldAnalyzerWrapper so that the ExactTitle field uses a new LowercaseKeywordAnalyzer:
private Analyzer GetDefaultAnalyzer()
{
    var perFieldAnalyzer = new PerFieldAnalyzerWrapper(new StandardAnalyzer(Version.LUCENE_30));
    perFieldAnalyzer.AddAnalyzer(ReportFields.ExactTitle, new LowercaseKeywordAnalyzer());
...
When building the index, I add the field like this:
var exactTitleField = new Field(ReportFields.ExactTitle, report.PortalReportTitle, Field.Store.NO,
    Field.Index.NOT_ANALYZED);
exactTitleField.Boost = 10.0f;
reportDoc.Add(exactTitleField);
When I search for the two-word example "test abc" with a TermQuery, nothing is found:
var term = new Term(exactField, "test abc");
var exactQuery = new TermQuery(term);
query.Add(exactQuery,Occur.SHOULD);
var hits = searcher.Search(query, null, HitsLimit, Sort.RELEVANCE);
If I search for "Test Abc", it works. What do I need to do to make this case-insensitive keyword/term search work?
The analyzer:
public class LowercaseKeywordAnalyzer : Analyzer
{
    public override TokenStream TokenStream(string fieldName, System.IO.TextReader reader)
    {
        TokenStream tokenStream = new KeywordTokenizer(reader);
        tokenStream = new LowerCaseFilter(tokenStream);
        return tokenStream;
    }
}
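When you specify Field.Index.NOT_ANALYZED while creating the field, the analyzer is never run, so the text is not converted to lowercase and is indexed exactly as supplied ("Test Abc"). Switch to Field.Index.ANALYZED.
Edit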
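There may be a problem in your indexing code. Also, don't forget that if you lowercase at index time, you have to lowercase the input at search time as well to get a match. Because you are building the query by hand, you need to handle that yourself. Ideally, you would run the same Analyzer over the search string before building the Term.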
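One way to do that, sketched here under the assumption of Lucene.Net 3.0.3 (where token text is exposed through ITermAttribute), is a small helper that feeds the search text through the analyzer and builds the TermQuery from the resulting token. The ExactTermHelper class and its BuildExactQuery method are hypothetical names used for illustration; they are not part of Lucene.Net or of the original code.

```csharp
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Tokenattributes;
using Lucene.Net.Index;
using Lucene.Net.Search;

// Hypothetical helper (not from the original post): normalizes search text
// with the same analyzer that was used at index time.
public static class ExactTermHelper
{
    // Intended for keyword-style fields whose analyzer emits a single token.
    // Assumes the Lucene.Net 3.0.3 token attribute API (ITermAttribute).
    public static Query BuildExactQuery(Analyzer analyzer, string fieldName, string searchText)
    {
        string analyzed = searchText; // fall back to the raw text if no token is produced

        using (TokenStream stream = analyzer.TokenStream(fieldName, new StringReader(searchText)))
        {
            ITermAttribute termAttr = stream.AddAttribute<ITermAttribute>();
            stream.Reset();

            // Take only the first token; a KeywordTokenizer-based analyzer
            // returns the whole input as one token.
            if (stream.IncrementToken())
            {
                analyzed = termAttr.Term;
            }

            stream.End();
        }

        return new TermQuery(new Term(fieldName, analyzed));
    }
}
```

With a PerFieldAnalyzerWrapper that maps "ExactTitle" to LowercaseKeywordAnalyzer, calling ExactTermHelper.BuildExactQuery(perFieldAnalyzer, "ExactTitle", "Test Abc") should yield a term with the text "test abc", so the match no longer depends on how the user typed the title. The helper is not suitable for fields whose analyzer splits text into several tokens, since it keeps only the first one.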
Here is a quick little sample that does what you seem to want; maybe it will help you track down the problem.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

namespace ConsoleApplication2
{
    class Program
    {
        static void Main(string[] args)
        {
            RAMDirectory dir = new RAMDirectory();
            var perFieldAnalyzer = new PerFieldAnalyzerWrapper(new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30));
            perFieldAnalyzer.AddAnalyzer("ExactTitle", new LowercaseKeywordAnalyzer());
            IndexWriter indexWriter = new IndexWriter(dir, perFieldAnalyzer, IndexWriter.MaxFieldLength.UNLIMITED);

            Document reportDoc = new Document();
            Field exactTitleField = new Field("ExactTitle",
                                              "Test Abc",
                                              Field.Store.NO,
                                              Field.Index.ANALYZED);
            reportDoc.Add(exactTitleField);
            indexWriter.AddDocument(reportDoc);
            indexWriter.Commit();

            IndexSearcher searcher = new IndexSearcher(indexWriter.GetReader());
            var term = new Term("ExactTitle", "test abc"); //note: for this to work this way you need to always lower case the search too
            var exactQuery = new TermQuery(term);
            var hits = searcher.Search(exactQuery, null, 25, Sort.RELEVANCE);
            Console.WriteLine(hits.TotalHits); // prints "1"
            Console.ReadLine();
            indexWriter.Close();
        }

        public class LowercaseKeywordAnalyzer : Analyzer
        {
            public override TokenStream TokenStream(string fieldName, System.IO.TextReader reader)
            {
                TokenStream tokenStream = new KeywordTokenizer(reader);
                tokenStream = new LowerCaseFilter(tokenStream);
                return tokenStream;
            }
        }
    }
}