Taxonomy Index for Facets #323

Merged
merged 81 commits into from
Jul 27, 2023

nzdev
Contributor

@nzdev nzdev commented Dec 22, 2022

Builds on #310 and #311 and #321.

Adds Taxonomy Index support to LuceneIndex, enabling faster facet counting and hierarchical facets.

What is the Taxonomy Index?

The Taxonomy Index is a sidecar index used for efficient faceting, and it is required for hierarchical facets. It has its own directory and its own writer and reader for taxonomy-specific operations. The "main" index is still used alongside it.
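
To make the sidecar idea concrete, here is a minimal conceptual sketch in Python. It is not Examine or Lucene.NET code: the `Taxonomy` class, `main_index`, `index_document` and `count_facets` are hypothetical names, and the model assumes only what the Lucene facet guide describes, namely that the taxonomy maps each category path to an integer ordinal while the main index stores those ordinals per document.

```python
from collections import Counter


class Taxonomy:
    """Toy sidecar: maps each category path to a stable integer ordinal."""

    def __init__(self):
        self.ordinals = {(): 0}  # the empty (root) path gets ordinal 0
        self.parents = [-1]      # parents[ordinal] -> ordinal of the parent path

    def add(self, path):
        """Return the ordinal for `path`, registering it and its ancestors."""
        path = tuple(path)
        if path in self.ordinals:
            return self.ordinals[path]
        parent = self.add(path[:-1])   # make sure the parent exists first
        ordinal = len(self.parents)    # next free ordinal
        self.ordinals[path] = ordinal
        self.parents.append(parent)
        return ordinal


taxonomy = Taxonomy()
main_index = {}  # doc id -> category ordinals (stands in for the main index)


def index_document(doc_id, category_paths):
    """Store only ordinals with the document; the paths live in the taxonomy."""
    main_index[doc_id] = [taxonomy.add(p) for p in category_paths]


def count_facets(matching_doc_ids):
    """Counting is integer aggregation; no string comparison is needed."""
    counts = Counter()
    for doc_id in matching_doc_ids:
        counts.update(main_index[doc_id])
    return counts


index_document(1, [("Author", "Lisa")])
index_document(2, [("Author", "Bob")])
index_document(3, [("Author", "Lisa")])
counts = count_facets([1, 2, 3])
lisa = taxonomy.ordinals[("Author", "Lisa")]
```

In this toy model, counting works on small integers, and the parent array is what lets a hierarchy be walked without re-reading the documents.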

What's in this PR?

  • Adds support for the Taxonomy Index to the existing LuceneIndex.
  • Adds option, LuceneIndexOptions.UseTaxonomyIndex to enable/disable the use of the Taxonomy Index.
  • Adds FieldDefinitionTypes to configure the facet fields to be stored in the Taxonomy Index, e.g. FieldDefinitionTypes.FacetTaxonomyFullText
  • Adds ILuceneTaxonomySearcher for Lucene Taxonomy specific search functionality on LuceneIndex.
  • Adds CreateTaxonomyDirectory to IDirectoryFactory to change how / where the Taxonomy Directory is created.
  • Adds SyncedTaxonomyFileSystemDirectoryFactory and ExamineTaxonomyReplicator to replicate the Index and Taxonomy.
  • Adds support for random sampling facet query performance optimization, configured via LuceneFacetSamplingQueryOptions on LuceneQueryOptions
  • Adds support for hierarchical paths on IFacetQueryField
  • Adds support for setting a hierarchical facet path when indexing fields.
  • Adds example Facet and Taxonomy Facet indexes to the web demo
  • Adds Faceted Search page to the web demo
  • Adds faceting documentation in docfx
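
As a rough illustration of the random sampling optimization mentioned in the list above, the following Python sketch estimates facet counts from a random subset of the hits and scales the result back up. It is a conceptual model under stated assumptions, not the implementation behind LuceneFacetSamplingQueryOptions: the function name `sampled_facet_counts` and its parameters are hypothetical.

```python
import random


def sampled_facet_counts(hits, facet_of, sample_size, seed=0):
    """Estimate facet counts from a random sample of the hits.

    hits        -- list of matching document ids
    facet_of    -- mapping of document id to its facet value
    sample_size -- maximum number of hits to inspect
    """
    if len(hits) <= sample_size:
        # Small result sets are counted exactly.
        sample, scale = hits, 1.0
    else:
        sample = random.Random(seed).sample(hits, sample_size)
        scale = len(hits) / sample_size
    counts = {}
    for doc_id in sample:
        value = facet_of[doc_id]
        counts[value] = counts.get(value, 0) + 1
    # Scale the sampled counts back up to the size of the full result set.
    return {value: round(n * scale) for value, n in counts.items()}


facet_of = {i: "red" if i % 4 == 0 else "blue" for i in range(1000)}
estimate = sampled_facet_counts(list(range(1000)), facet_of, sample_size=200)
exact = sampled_facet_counts([0, 1, 2, 3], facet_of, sample_size=10)
```

The trade-off is that sampled counts are estimates, so this approach suits large result sets where approximate counts are acceptable.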

Why?

Support faster facet counting and hierarchical facet paths.

How to use the Taxonomy Index?

To switch an index from faceted, non-taxonomy fields to the Taxonomy Index:

  1. Set LuceneIndexOptions.UseTaxonomyIndex = true; for the index.
  2. Set LuceneIndexOptions.FacetsConfig. To enable hierarchical facets on a field, call FacetsConfig.SetHierarchical("facetfieldname", true);
  3. Change the Field Definitions to use the "FacetTaxonomy" Field Definition Types instead of the "Facet" types. E.g. FieldDefinitionTypes.FacetFullText => FieldDefinitionTypes.FacetTaxonomyFullText
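
To show what a hierarchical facet field provides once the steps above are done, here is a small conceptual sketch in Python. It does not use Examine or Lucene.NET: `count_hierarchical` and `children` are hypothetical helpers, and the sketch assumes each document carries a single path for the field.

```python
from collections import Counter


def count_hierarchical(doc_paths):
    """Count every prefix of each document's facet path.

    Assumes one path per document, so no prefix is counted twice
    for the same document.
    """
    counts = Counter()
    for path in doc_paths:
        for depth in range(1, len(path) + 1):
            counts[tuple(path[:depth])] += 1
    return counts


def children(counts, parent):
    """Return the counts of the direct children of `parent`."""
    parent = tuple(parent)
    return {
        path[-1]: n
        for path, n in counts.items()
        if len(path) == len(parent) + 1 and path[:len(parent)] == parent
    }


# Hypothetical year/month/day paths for three documents.
docs = [("2022", "12", "22"), ("2023", "07", "26"), ("2023", "07", "27")]
counts = count_hierarchical(docs)
by_year = children(counts, ())
by_day = children(counts, ("2023", "07"))
```

Here `by_year` holds the top-level counts and `by_day` holds the counts under one month, which is the kind of per-level breakdown a flat facet field cannot give.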

@nzdev nzdev marked this pull request as ready for review December 30, 2022 20:08
@nikcio
Contributor

nikcio commented Jul 26, 2023

@nzdev same as #321 (comment)

@nzdev nzdev changed the base branch from release/3.0 to release/4.0 July 26, 2023 05:07
@nzdev
Contributor Author

nzdev commented Jul 26, 2023

Done

@Shazwazza
Owner

whoa! This is some huge work!

I love the effort put in here. My only concern is how I am going to support this moving forward :P But alas, it's a community effort so if folks need help, I guess we all need to chip in.

One question though - Can you explain in more detail why the extra index is required and why the info can't be stored in the same index?

@nzdev
Contributor Author

nzdev commented Jul 26, 2023

Happy to help.
https://lucene.apache.org/core/4_0_0/facet/org/apache/lucene/facet/doc-files/userguide.html#taxonomy_index

TLDR: Some of the faceting features require the use of the taxonomy index, plus it makes faceting faster.

@Shazwazza Shazwazza merged commit 925f33c into Shazwazza:release/4.0 Jul 27, 2023
@Shazwazza
Owner

Merging this huge amount of work! Will need to give all this a run through myself since I haven't had time yet. Regarding the #321, looks like that is auto merged with this one being merged.

So what do @nzdev and @nikcio think are the next steps? Are there remaining PRs we want to get in for a v4 release? And then I think we'll need to see what compatibility is like for Umbraco - as in, will they need to rev a major change or do we think we can ship this while supporting Umbraco v12 without changes?

@nikcio nikcio mentioned this pull request Jul 28, 2023
5 tasks
@nikcio
Contributor

nikcio commented Jul 28, 2023

@Shazwazza See #310 (comment) 😄

@TomSteer

TomSteer commented Nov 30, 2023

@nzdev Does this pull request also support filtering using hierarchical facets or does it just support the index aspect at the moment? I had a play around with the demo project but couldn't see how to filter using hierarchical facets.

Awesome work by the way 🎉

@nzdev
Contributor Author

nzdev commented Nov 30, 2023

Mostly the index aspect, building on top of the work from @nikcio . You may need to use lucene.net directly. Drill down / sideways abstractions are still being worked on.


4 participants