When I create a query with a category filter such as `searcher.CreateQuery("Content")`, I get no results back even though my items were indexed with the category `Content`. I found out that this is because of the capital C: if I query with `searcher.CreateQuery("content")`, I get the expected results. But shouldn't the first query return the correct results, and not only the second one?
If you change the category of the indexed items in this test to `cOntent`, you will see that the test passes even without changing the category in the `CreateQuery` statement:

    public void NativeQuery_Single_Word()
    {
        var analyzer = new StandardAnalyzer(LuceneInfo.CurrentVersion);
        using (var luceneDir = new RandomIdRAMDirectory())
        using (var indexer = GetTestIndex(
            luceneDir,
            analyzer,
            new FieldDefinitionCollection(new FieldDefinition("parentID", FieldDefinitionTypes.Integer))))
        {
            // Items are indexed with the mixed-case category "cOntent".
            indexer.IndexItems(new[]
            {
                ValueSet.FromObject(1.ToString(), "cOntent", new { nodeName = "location 1", bodyText = "Zanzibar is in Africa" }),
                ValueSet.FromObject(2.ToString(), "cOntent", new { nodeName = "location 2", bodyText = "In Canada there is a town called Sydney in Nova Scotia" }),
                ValueSet.FromObject(3.ToString(), "cOntent", new { nodeName = "location 3", bodyText = "Sydney is the capital of NSW in Australia" })
            });

            var searcher = indexer.Searcher;

            // The query uses the lowercase category "content" and still matches.
            var query = searcher.CreateQuery("content").NativeQuery("sydney");
            Console.WriteLine(query);

            var results = query.Execute();
            Assert.AreEqual(2, results.TotalItemCount);
        }
    }

But as soon as you change the category parameter in `CreateQuery` to match the actual category (to `CreateQuery("cOntent")`), the test will fail.
Expected result
I expected the category to be either case sensitive or case insensitive, not forced to lowercase.
Workaround
To work around the issue, I tried adding the field manually with the `Field()` statement. This seems to make the category identifier case insensitive, meaning that both `content` and `cOntent` return the expected results.
Example

    public void NativeQuery_Single_Word()
    {
        var analyzer = new StandardAnalyzer(LuceneInfo.CurrentVersion);
        using (var luceneDir = new RandomIdRAMDirectory())
        using (var indexer = GetTestIndex(
            luceneDir,
            analyzer,
            new FieldDefinitionCollection(new FieldDefinition("parentID", FieldDefinitionTypes.Integer))))
        {
            indexer.IndexItems(new[]
            {
                ValueSet.FromObject(1.ToString(), "cOntent", new { nodeName = "location 1", bodyText = "Zanzibar is in Africa" }),
                ValueSet.FromObject(2.ToString(), "cOntent", new { nodeName = "location 2", bodyText = "In Canada there is a town called Sydney in Nova Scotia" }),
                ValueSet.FromObject(3.ToString(), "cOntent", new { nodeName = "location 3", bodyText = "Sydney is the capital of NSW in Australia" })
            });

            var searcher = indexer.Searcher;

            // Filter on the category field explicitly instead of passing it to CreateQuery.
            var query = searcher.CreateQuery()
                .Field(ExamineFieldNames.CategoryFieldName, "cOntent")
                .And()
                .NativeQuery("sydney");
            Console.WriteLine(query);

            var results = query.Execute();
            Assert.AreEqual(2, results.TotalItemCount);
        }
    }

Hi, this all has to do with Lucene analysis. The `StandardAnalyzer` uses a lowercase filter, which means that anything that goes into the index for the category field (so long as you haven't specified a custom analyzer for that field) will be lowercased when it is analyzed. Analyzers work in the other direction as well: they not only change text on the way into the index, they also change the text in your query when it is parsed.
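To make the two directions concrete, here is a minimal sketch in plain C#. The `Analyze` method is a hypothetical, heavily simplified stand-in for an analyzer with a lowercase filter (it only splits on non-alphanumeric characters and lowercases), not Lucene's actual implementation. It only illustrates that the same transformation is applied to indexed text and to parsed query text.

```csharp
using System;
using System.Linq;

public static class AnalysisSketch
{
    // Hypothetical stand-in for an analyzer with a lowercase filter:
    // replace non-alphanumeric characters with spaces, split, lowercase.
    public static string[] Analyze(string text)
    {
        var cleaned = new string(
            text.Select(c => char.IsLetterOrDigit(c) ? c : ' ').ToArray());

        return cleaned
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(token => token.ToLowerInvariant())
            .ToArray();
    }

    public static void Main()
    {
        // Index time: the category value is analyzed before it is stored.
        string[] indexedTerms = Analyze("cOntent");

        // Query time: a parsed query runs the same analysis on its input.
        string[] queryTerms = Analyze("Content");

        Console.WriteLine(string.Join(",", indexedTerms));
        Console.WriteLine(string.Join(",", queryTerms));
        Console.WriteLine(indexedTerms.SequenceEqual(queryTerms));
    }
}
```

Because both sides go through the same function, the casing of the original input no longer matters once both have been analyzed.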
So the reason `var query = searcher.CreateQuery("content").NativeQuery("sydney");` always works is that even if you index "cOntent", it is analyzed as "content", so this query matches.
The reason `CreateQuery("cOntent")` fails is that the query parser is probably not being used for that query under the hood, whereas the underlying mechanism for `.Field(ExamineFieldNames.CategoryFieldName, "cOntent")` does use the query parser, so the value ends up as "content".
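The difference between the two paths can be sketched with a toy index in plain C#. This assumes, as suspected above, that the category passed to `CreateQuery` is looked up as a raw term while `Field()` input is analyzed first. The `ToyCategoryIndex` class and its methods are hypothetical and do not reflect Examine's internals.

```csharp
using System;
using System.Collections.Generic;

public class ToyCategoryIndex
{
    // Analyzed category term -> ids of the items in that category.
    private readonly Dictionary<string, List<int>> _terms =
        new Dictionary<string, List<int>>(StringComparer.Ordinal);

    // Hypothetical stand-in for the analyzer's lowercase filter.
    private static string Analyze(string value) => value.ToLowerInvariant();

    // Index time: the category is analyzed, so only lowercase terms are stored.
    public void Add(int id, string category)
    {
        var term = Analyze(category);
        if (!_terms.TryGetValue(term, out var ids))
        {
            ids = new List<int>();
            _terms[term] = ids;
        }
        ids.Add(id);
    }

    // Raw term lookup: the query text is compared as-is (assumed CreateQuery path).
    public int CountRaw(string category) =>
        _terms.TryGetValue(category, out var ids) ? ids.Count : 0;

    // Analyzed lookup: the query text is lowercased first (assumed Field() path).
    public int CountAnalyzed(string category) =>
        _terms.TryGetValue(Analyze(category), out var ids) ? ids.Count : 0;

    public static void Main()
    {
        var index = new ToyCategoryIndex();
        index.Add(1, "cOntent");
        index.Add(2, "cOntent");
        index.Add(3, "cOntent");

        Console.WriteLine(index.CountRaw("cOntent"));      // raw term misses the stored "content"
        Console.WriteLine(index.CountRaw("content"));      // raw term happens to equal the stored term
        Console.WriteLine(index.CountAnalyzed("cOntent")); // analyzed query matches regardless of case
    }
}
```

Under these assumptions, a raw lookup only succeeds when the caller happens to pass the already-lowercased form, which matches the behavior reported in the issue.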
Essentially, though, you've found a bug. The mechanism for searching on category should probably also use the query parser.
nikcio added a commit to nikcio/Examine that referenced this issue on Oct 24, 2022
Test referenced in the issue: Examine/src/Examine.Test/Examine.Lucene/Search/FluentApiTests.cs, lines 65 to 93 in 07d0889