Taxonomy Index for Facets #323

Merged 81 commits on Jul 27, 2023
81 commits
f176361
Support for deep paging using search after
nzdev Dec 12, 2022
df4da35
Doc
nzdev Dec 12, 2022
70edcc1
Documentation
nzdev Dec 12, 2022
fd802e6
Add note.
nzdev Dec 12, 2022
3a6860b
Fix collectors. Tidy.
nzdev Dec 12, 2022
254c827
SearchAfter support for facet.
nzdev Dec 12, 2022
600b6eb
fix merge facets and searchafter.
nzdev Dec 12, 2022
ed63120
Wip test facet searchafter
nzdev Dec 12, 2022
1b3e8bf
Support non sorted query.
nzdev Dec 12, 2022
7e383d6
merge searchafter facets.
nzdev Dec 12, 2022
fb2932f
merge
nzdev Dec 12, 2022
56ddb0f
reorder
nzdev Dec 12, 2022
4bb5ae6
Collect facets
nzdev Dec 12, 2022
67a08bf
Merge Facet updates.
nzdev Dec 14, 2022
7c4bce4
Merge branch 'feat/facets' of https://github.com/nikcio/Examine into …
nzdev Dec 20, 2022
f91fda2
remove comment
nzdev Dec 22, 2022
56ab379
remove unused code
nzdev Dec 22, 2022
126e265
Add xdoc
nzdev Dec 22, 2022
8ac1543
Add ExecuteWithLucene
nzdev Dec 22, 2022
1bb53e4
merge
nzdev Dec 22, 2022
bb931a6
Merge branch 'feature/v3/searchafter' into v3/feature/facets-searchafter
nzdev Dec 22, 2022
018aaf9
Support any IQueryExecutor that supports ILuceneSearchResults.
nzdev Dec 22, 2022
9183b9e
wip
nzdev Dec 22, 2022
cf7d3ea
revert
nzdev Dec 22, 2022
ba25ed6
wip
nzdev Dec 22, 2022
ae69d82
wip
nzdev Dec 22, 2022
700e0db
wip
nzdev Dec 22, 2022
54eb977
wip
nzdev Dec 22, 2022
d54b6ad
test taxonomy
nzdev Dec 22, 2022
411ebe6
doc
nzdev Dec 22, 2022
f23ce57
Taxonomy Searcher methods.
nzdev Dec 22, 2022
6bf68bc
FacetLabel
nzdev Dec 22, 2022
c98a124
fix dispose order.
nzdev Dec 22, 2022
0d003bc
Add Taxonomyfacet type
nzdev Dec 22, 2022
b7b4b79
wip taxonomy facet support
nzdev Dec 22, 2022
674ba0f
Taxonomy field
nzdev Dec 22, 2022
ca3ca3a
fix tests
nzdev Dec 22, 2022
293cd68
facet field
nzdev Dec 22, 2022
13b6391
clearer test
nzdev Dec 22, 2022
f698a78
Merge
nzdev Dec 29, 2022
b1f8610
Merge branch 'feat/facets' of https://github.com/nikcio/Examine into …
nzdev Dec 29, 2022
54c9d56
Merge branch 'v3/feature/facets-searchafter' into v3/feature/facet-ta…
nzdev Dec 29, 2022
7bd63e9
Restore tests
nzdev Dec 29, 2022
1436604
Taxonomy Index field name support.
nzdev Dec 29, 2022
cb40117
Fix missing facets config.
nzdev Dec 29, 2022
f2a0075
Seperate directory for Taxonomy Index
nzdev Dec 29, 2022
ce373d8
Merge branch 'v3/feature/facet-taxonomy' into v3/feature/facet-ui-exa…
nzdev Dec 29, 2022
6f11b0e
store taxonomy in subdirectoy {index}/taxonomy
nzdev Dec 29, 2022
d5334a8
Merge branch 'v3/feature/facet-taxonomy' into v3/feature/facet-ui-exa…
nzdev Dec 29, 2022
1af4367
fix override
nzdev Dec 29, 2022
c2c9c7f
Merge branch 'v3/feature/facet-taxonomy' into v3/feature/facet-ui-exa…
nzdev Dec 29, 2022
31a312b
Retrieve facets in demo
nzdev Dec 29, 2022
b1f3912
support tax / non tax index
nzdev Dec 29, 2022
ed93d4a
Filter
nzdev Dec 29, 2022
8df9683
native facets
nzdev Dec 29, 2022
c215676
Show facet count
nzdev Dec 29, 2022
92745a0
Support for indexing heirarchical facets.
nzdev Dec 30, 2022
fab94c8
Comments
nzdev Dec 30, 2022
e043313
Support Taxonomy Index Replication
nzdev Dec 30, 2022
5bcbd49
Support random sampling
nzdev Dec 30, 2022
e825cbf
object[] { value, string[] facetpath} support for setting heirarchica…
nzdev Dec 30, 2022
9b2b25c
Support for facet path in full text query.
nzdev Dec 30, 2022
52be487
Merge TaxonomyIndex into LuceneIndex
nzdev Dec 30, 2022
e0d5e6b
Tidy
nzdev Dec 30, 2022
46c8b77
merge from impl
nzdev Mar 20, 2023
5cf860c
Merge #310
nzdev Mar 20, 2023
97d7d7d
Tidy interface
nzdev Mar 20, 2023
ea3d050
merge facet-extensions
nzdev Mar 20, 2023
1cdd791
merge fix
nzdev Mar 20, 2023
66f7251
path
nzdev Mar 20, 2023
db9a89d
fix merge
nzdev Mar 20, 2023
a525c25
Multivalued tags facets
nzdev Mar 23, 2023
9159fdc
Merge Examine V4
nzdev Jul 26, 2023
9e78672
Merge v4 changes
nzdev Jul 26, 2023
40e1370
Facets docfx
nzdev Jul 26, 2023
b8bb351
Simplify Taxonomy Facet config. Add configuration docfx docs for taxo…
nzdev Jul 26, 2023
a94d12f
Fix spelling
nzdev Jul 26, 2023
59d589e
Value type docs
nzdev Jul 26, 2023
4c0b40f
Improve facet search docs
nzdev Jul 26, 2023
3fae26d
docs
nzdev Jul 26, 2023
ef354c3
xdocs
nzdev Jul 26, 2023
10 changes: 10 additions & 0 deletions docs/sorting.md
@@ -96,6 +96,16 @@ With the combination of `ISearchResult.Skip` and `maxResults`, we can tell Lucen
* Skip over a certain number of results without allocating them
* Only allocate a certain number of results after skipping

### Deep Paging
When Lucene.NET is the Examine provider, deep paging can be performed more efficiently by using search-after.
Steps:
1. Build and execute your query as normal.
2. Cast the `ISearchResults` returned by `IQueryExecutor.Execute` to `ILuceneSearchResults`.
3. Store `ILuceneSearchResults.SearchAfter` (a `SearchAfterOptions`) for the next page.
4. Create the same query as in the previous request.
5. When calling `IQueryExecutor.Execute`, pass in `new LuceneQueryOptions(skip, take, SearchAfterOptions)`. `skip` is ignored; the next `take` documents after the `SearchAfterOptions` document are retrieved.
6. Repeat steps 2-5 for each page.
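
The steps above can be sketched roughly as follows. This is a minimal illustration rather than the library's reference usage: the helper `DeepPagingSketch.GetPage` is hypothetical, the namespaces in the `using` lines are assumptions, and the match-all query built with `CreateQuery().All()` stands in for whatever query you actually run. Only `ILuceneSearchResults.SearchAfter`, `SearchAfterOptions` and the `LuceneQueryOptions(skip, take, SearchAfterOptions)` constructor come from the description above.

```csharp
using Examine;
using Examine.Lucene.Search; // assumed namespace for the Lucene-specific types
using Examine.Search;

// Hypothetical helper, for illustration only.
public static class DeepPagingSketch
{
    // Returns one page of results plus the marker needed to request the next page.
    // Pass null for 'previous' when requesting the first page.
    public static (ISearchResults Page, SearchAfterOptions Next) GetPage(
        ISearcher searcher, int pageSize, SearchAfterOptions previous)
    {
        // Steps 1 and 4: build the same query for every page.
        IQueryExecutor query = searcher.CreateQuery().All();

        // Step 5: on later pages pass the stored marker; skip (0 here) is ignored.
        ISearchResults results = previous == null
            ? query.Execute(QueryOptions.SkipTake(0, pageSize))
            : query.Execute(new LuceneQueryOptions(0, pageSize, previous));

        // Steps 2 and 3: cast to the Lucene-specific results and keep the marker.
        var luceneResults = results as ILuceneSearchResults;
        return (results, luceneResults?.SearchAfter);
    }
}
```

A caller would keep the returned `Next` value between requests (step 6) and pass it back in as `previous`. If the cast does not succeed, `Next` is null and the caller would need to fall back to ordinary skip/take paging.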

### Example

```cs
126 changes: 108 additions & 18 deletions docs/v2/articles/configuration.md
@@ -98,24 +98,47 @@ Value types are responsible for:

These are the default field value types provided with Examine. Each value type can be resolved from the static class [`Examine.FieldDefinitionTypes`](xref:Examine.FieldDefinitionTypes) (e.g. [`Examine.FieldDefinitionTypes.FullText`](xref:Examine.FieldDefinitionTypes#Examine_FieldDefinitionTypes_FullText)).

| Value Type | Description | Sortable |
|----------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------|
| FullText | __Default__.<br />The field will be indexed with the index's <br />default Analyzer without any sortability. <br />Generally this is fine for normal text searching. | ❌ |
| FullTextSortable | Will be indexed with FullText but also <br />enable sorting on this field for search results. <br />_FullText sortability adds additional overhead <br />since it requires an additional index field._ | ✅ |
| Integer | Stored as a numerical structure. | ✅ |
| Float | Stored as a numerical structure. | ✅ |
| Double | Stored as a numerical structure. | ✅ |
| Long | Stored as a numerical structure. | ✅ |
| DateTime | Stored as a DateTime, <br />represented by a numerical structure. | ✅ |
| DateYear | Just like DateTime but with <br />precision only to the year. | ✅ |
| DateMonth | Just like DateTime but with <br />precision only to the month. | ✅ |
| DateDay | Just like DateTime but with <br />precision only to the day. | ✅ |
| DateHour | Just like DateTime but with <br />precision only to the hour. | ✅ |
| DateMinute | Just like DateTime but with <br />precision only to the minute. | ✅ |
| EmailAddress | Uses custom analyzers for dealing <br />with email address searching. | ❌ |
| InvariantCultureIgnoreCase | Uses custom analyzers for dealing with text so it<br /> can be searched on regardless of the culture/casing. | ❌ |
| Raw | Will be indexed without analysis, searching will<br /> only match with an exact value. | ❌ |

| Value Type | Description | Sortable | Facetable | Retrievable | Searchable | Filterable | Analyzer |
| ------------------------------ | ------------ | -------- | --------- | ----------- | ---------- | ---------- | -------- |
| FullText | **Default**. The field will be indexed with the index's default Analyzer without any sortability. Generally this is fine for normal text searching. | ❌ | ❌ | ✅ | ✅ | ✅ | CultureInvariantStandardAnalyzer or Index default |
| FullTextSortable | Will be indexed with FullText but also enable sorting on this field for search results. *FullText sortability adds additional overhead since it requires an additional index field.* | ✅ | ❌ | ✅ | ✅ | ✅ | CultureInvariantStandardAnalyzer or Index default |
| Integer | Stored as a numerical structure.| ✅ | ❌ | ✅ | ❌ | ✅ | - |
| Float | Stored as a numerical structure. | ✅ | ❌ | ✅ | ❌ | ✅ | - |
| Double | Stored as a numerical structure. | ✅ | ❌ | ✅ | ❌ | ✅ | - |
| Long | Stored as a numerical structure. | ✅ | ❌ | ✅ | ❌ | ✅ | - |
| DateTime | Stored as a DateTime, represented by a numerical structure. | ✅ | ❌ | ✅ | ❌ | ✅ | - |
| DateYear | Just like DateTime but with precision only to the year. | ✅ | ❌ | ✅ | ❌ | ✅ | - |
| DateMonth | Just like DateTime but with precision only to the month. | ✅ | ❌ | ✅ | ❌ | ✅ | - |
| DateDay | Just like DateTime but with precision only to the day. | ✅ | ❌ | ✅ | ❌ | ✅ | - |
| DateHour | Just like DateTime but with precision only to the hour. | ✅ | ❌ | ✅ | ❌ | ✅ | - |
| DateMinute | Just like DateTime but with precision only to the minute. | ✅ | ❌ | ✅ | ❌ | ✅ | - |
| EmailAddress | Uses custom analyzers for dealing with email address searching. | ❌ | ❌ | ✅ | ✅ | ✅ | EmailAddressAnalyzer |
| InvariantCultureIgnoreCase | Uses custom analyzers for dealing with text so it can be searched on regardless of the culture/casing. | ❌ | ❌ | ✅ | ✅ | ✅ | CultureInvariantStandardAnalyzer |
| Raw | Will be indexed without analysis, searching will only match with an exact value. | ❌ | ❌ | ✅ | ✅ | ✅ | KeywordAnalyzer |
| FacetFullText | The field will be indexed with the index's default Analyzer without any sortability. Generally this is fine for normal text searching. | ❌ | ✅ | ✅ | ✅ | ✅ | CultureInvariantStandardAnalyzer or Index default |
| FacetFullTextSortable | Will be indexed with FullText but also enable sorting on this field for search results. *FullText sortability adds additional overhead since it requires an additional index field.* | ✅ | ✅ | ✅ | ✅ | ✅ | CultureInvariantStandardAnalyzer or Index default |
| FacetInteger | Stored as a numerical structure. | ✅ |✅ | ✅ | ❌ | ✅ | - |
| FacetFloat | Stored as a numerical structure. | ✅ |✅ | ✅ | ❌ | ✅ | - |
| FacetDouble | Stored as a numerical structure. | ✅ |✅ | ✅ | ❌ | ✅ | - |
| FacetLong | Stored as a numerical structure. | ✅ |✅ | ✅ | ❌ | ✅ | - |
| FacetDateTime | Stored as a DateTime, represented by a numerical structure. | ✅ |✅ | ✅ | ❌ | ✅ | - |
| FacetDateYear | Just like DateTime but with precision only to the year. | ✅ |✅ | ✅ | ❌ | ✅ | - |
| FacetDateMonth | Just like DateTime but with precision only to the month. | ✅ |✅ | ✅ | ❌ | ✅ | - |
| FacetDateDay | Just like DateTime but with precision only to the day. | ✅ |✅ | ✅ | ❌ | ✅ | - |
| FacetDateHour | Just like DateTime but with precision only to the hour. | ✅ |✅ | ✅ | ❌ | ✅ | - |
| FacetDateMinute | Just like DateTime but with precision only to the minute. | ✅ |✅ | ✅ | ❌ | ✅ | - |
| FacetTaxonomyFullText | The field will be indexed with the index's default Analyzer without any sortability. Generally this is fine for normal text searching. Stored in the Taxonomy Facet sidecar index. | ❌ | ✅ | ✅ | ✅ | ✅ | CultureInvariantStandardAnalyzer or Index default |
| FacetTaxonomyFullTextSortable | Will be indexed with FullText but also enable sorting on this field for search results. *FullText sortability adds additional overhead since it requires an additional index field.* Stored in the Taxonomy Facet sidecar index. | ✅ | ✅ | ✅ | ✅ | ✅ | CultureInvariantStandardAnalyzer or Index default |
| FacetTaxonomyInteger | Stored as a numerical structure. Stored in the Taxonomy Facet sidecar index. | ✅ |✅ | ✅ | ❌ | ✅ | - |
| FacetTaxonomyFloat | Stored as a numerical structure. Stored in the Taxonomy Facet sidecar index. | ✅ |✅ | ✅ | ❌ | ✅ | - |
| FacetTaxonomyDouble | Stored as a numerical structure. Stored in the Taxonomy Facet sidecar index. | ✅ |✅ | ✅ | ❌ | ✅ | - |
| FacetTaxonomyLong | Stored as a numerical structure. Stored in the Taxonomy Facet sidecar index. | ✅ |✅ | ✅ | ❌ | ✅ | - |
| FacetTaxonomyDateTime | Stored as a DateTime, represented by a numerical structure. Stored in the Taxonomy Facet sidecar index. | ✅ |✅ | ✅ | ❌ | ✅ | - |
| FacetTaxonomyDateYear | Just like DateTime but with precision only to the year. Stored in the Taxonomy Facet sidecar index. | ✅ |✅ | ✅ | ❌ | ✅ | - |
| FacetTaxonomyDateMonth | Just like DateTime but with precision only to the month. Stored in the Taxonomy Facet sidecar index. | ✅ |✅ | ✅ | ❌ | ✅ | - |
| FacetTaxonomyDateDay | Just like DateTime but with precision only to the day. Stored in the Taxonomy Facet sidecar index. | ✅ |✅ | ✅ | ❌ | ✅ | - |
| FacetTaxonomyDateHour | Just like DateTime but with precision only to the hour. Stored in the Taxonomy Facet sidecar index. | ✅ |✅ | ✅ | ❌ | ✅ | - |
| FacetTaxonomyDateMinute | Just like DateTime but with precision only to the minute. Stored in the Taxonomy Facet sidecar index. | ✅ |✅ | ✅ | ❌ | ✅ | - |
### Custom field value types

A field value type is defined by [`IIndexFieldValueType`](xref:Examine.Lucene.Indexing.IIndexFieldValueType)
@@ -192,3 +215,70 @@ That returns an result [`ValueSetValidationResult`](xref:Examine.ValueSetValidat
* `Filtered` - The ValueSet has been filtered/modified by the validator and will be indexed

Examine only has one implementation: [`ValueSetValidatorDelegate`](xref:Examine.Lucene.Providers.ValueSetValidatorDelegate), which gives developers a simple way to create a validator based on a callback. Alternatively, developers can implement this interface themselves if required. By default, Examine performs no ValueSet validation.

## Facets configuration

When using the facets feature, you can supply a facets configuration to change how fields are indexed.

For example, you can allow multiple values in an indexed field with the configuration below.
```csharp
// Namespaces for the field definition types and FacetsConfig
using Examine;
using Lucene.Net.Facet;

// Create a config
var facetsConfig = new FacetsConfig();

// Allow the field to contain multiple values. Examine fields are multi-valued by default,
// but this setting is only needed when a single field actually holds more than one value.
facetsConfig.SetMultiValued("MultiIdField", true);

services.AddExamineLuceneIndex("MyIndex",
    // Set the indexing of your fields to use the facet type
    fieldDefinitions: new FieldDefinitionCollection(
        new FieldDefinition("Timestamp", FieldDefinitionTypes.FacetDateTime),
        new FieldDefinition("MultiIdField", FieldDefinitionTypes.FacetFullText)
    ),
    // Pass your config
    facetsConfig: facetsConfig
);
```

Without this configuration for multiple values, you'll notice that your faceted search breaks or behaves differently than expected.

### Hierarchical and Taxonomy Facets configuration

Enabling the Taxonomy Facet sidecar index adds support for hierarchical facets and makes faceting faster.

1. Set `LuceneIndexOptions.UseTaxonomyIndex = true` for the index. This enables the Taxonomy sidecar index.
2. Change the field definitions to use the "FacetTaxonomy" field definition types instead of the "Facet" types, e.g. `FieldDefinitionTypes.FacetFullText` becomes `FieldDefinitionTypes.FacetTaxonomyFullText`.
3. To enable hierarchical facets on a field, call `FacetsConfig.SetHierarchical("facetfieldname", true)`.

Example:

```csharp
// Namespaces for the field definition types and FacetsConfig
using Examine;
using Lucene.Net.Facet;

// Create a config
var facetsConfig = new FacetsConfig();

// Allow the field to support hierarchical facets
facetsConfig.SetHierarchical("hierarchyFacetfield", true);

// Allow the field to contain multiple values. Examine fields are multi-valued by default,
// but this setting is only needed when a single field actually holds more than one value.
facetsConfig.SetMultiValued("MultiIdField", true);

services.AddExamineLuceneIndex("MyIndex",
    // Set the indexing of your fields to use the facet Taxonomy type
    fieldDefinitions: new FieldDefinitionCollection(
        new FieldDefinition("Timestamp", FieldDefinitionTypes.FacetTaxonomyDateTime),
        new FieldDefinition("hierarchyFacetfield", FieldDefinitionTypes.FacetTaxonomyFullText),
        new FieldDefinition("MultiIdField", FieldDefinitionTypes.FacetTaxonomyFullText)
    ),
    // Pass your config
    facetsConfig: facetsConfig,
    // Enable the Taxonomy sidecar index
    useTaxonomyIndex: true
);
```

**Note: See more examples of how facets configuration can be used under [Searching](xref:searching)**

To explore other configuration settings see the links below:
- [FacetsConfig API docs](https://lucenenet.apache.org/docs/4.8.0-beta00016/api/facet/Lucene.Net.Facet.FacetsConfig.html#methods)
- [Facets with Lucene](https://norconex.com/facets-with-lucene/). See how the config is used in the code examples.