Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possibility of using different multisearchable fields for different tenants? #493

Open
pauljacksonrodgers opened this issue Oct 17, 2022 · 1 comment

Comments

@pauljacksonrodgers
Copy link

I have an application that supports an admin portal and a customer portal. We already use multisearch for the admin portal's search function, and would now like to expose a similar search to customers.

For security reasons, I want the multisearchable fields to be different. For instance, admins should be able to search by phone number, but customers should not.

I thought the easiest way of accomplishing this would be to simply have two different content and tsv_content fields, like content_admin and content_customer, and then specify which one to search. Is this something that pg_search supports? Or is there a better way of doing what I'm trying to do?

@copiousfreetime
Copy link

I have this exact problem at the moment too.

After a dive on the source code, it looks like the following things would need to change in able to support it:

  1. The ability to add in additional pg_search_scope items to the PgSearch::Document - this is so you can set against: [:content, :content_admin] for the new pg_search_scope
  2. Change PgSearch.multisearch_options to allow configuring additional things for the additional pg_search_scope - for instance adding in ts_vector columns and things
  3. Nice to have - The ability to rename the default Multisearch :content column in the generated migration, and in pg_search_document_attrs so we wouldn't be required to have a column namedcontent
  4. The ability to have some method similar to the searchable_text approach, but for the internal / private fields

This is my current setup that is working.

I'm not sure if this would be there recommended way of doing it, and would definitely appreciate feedback from @nertzy or anyone on this approach.

lib/extensions/pg_search/document.rb

This adds an additional internal_search search scope to the existing PgSearch::Document. pg_search is already setup to handle joining multiple ts_vector colums together so just the additional scope and column are needed, and the additional column only needs the private data, no need to index the public data a 2nd time.

require 'pg_search'
PgSearch::Document.pg_search_scope(:internal_search, lambda { |*args|
  {
    query: args.first,
    against: [:content, :internal_content],
    using: {
      tsearch: {
        prefix: true,
        negation: true,
        tsvector_column: ["content_tsvector", "internal_content_tsvector"]
      }
    }
  }
})

Migration to alter the existing pg_search_documents table

class AlterPgSearchDocumentsForPrivateContent < ActiveRecord::Migration[6.1]

  def change
    # Add the new columns
    add_column(:pg_search_documents, :internal_content, :text)
    add_column(:pg_search_documents, :internal_content_tsvector, :tsvector)

    # add the new index for the private index
    add_index :pg_search_documents, :internal_content_tsvector, using: 'gin'

    trigger_name = "pg_search_documents_tsearch_tr"

    reversible do |dir|
      dir.up do
        create_trigger(trigger_name, compatibility: 1)
          .on(:pg_search_documents)
          .before(:insert, :update)
          .for_each(:row) do
            <<~PLSQL
            new.content_tsvector :=  to_tsvector('simple', coalesce(new.content::text, ''));
            new.internal_content_tsvector :=  to_tsvector('simple', coalesce(new.internal_content::text, ''));
            PLSQL
          end

        execute <<~SQL
          CREATE INDEX pg_search_documents_on_internal_content ON pg_search_documents USING gin(coalesce(internal_content, ''::text) gin_trgm_ops);
          CREATE INDEX pg_search_documents_on_all_content ON pg_search_documents USING gin(coalesce(content || internal_content, ''::text) gin_trgm_ops);
          CREATE INDEX pg_search_documents_on_all_tsvectors ON pg_search_documents USING gin((content_tsvector || internal_content_tsvector));
        SQL
      end

      dir.down do
        create_trigger(trigger_name, compatibility: 1)
          .on(:pg_search_documents)
          .before(:insert, :update)
          .for_each(:row) do
            "new.content_tsvector :=  to_tsvector('simple', coalesce(new.content::text, ''))"
          end

        execute <<~SQL
          DROP INDEX pg_search_documents_on_internal_content;
          DROP INDEX pg_search_documents_on_all_content;
          DROP INDEX pg_search_documents_on_all_tsvectors;
        SQL
      end
    end
  end
end

that manual create index for (content_tsvector || internal_content_tsvector) is what gets postgresql to use the vector index on the merged columns data, if it wasn't there the internal search would be doing table scans. This might be a useful not to add to the tsvector index documentation if folsk are using multiple columns.

searchable concern

I use a model concern of Searchable for models to say what attributes / methods of theirs return data that is indexable.

A usage example

class ObservationReport < ApplicationRecord

#...

  include Searchable
  self.searchable_attributes =  [
      :route,
      :area,
      :description,
   ]

  self.internal_searchable_attributes = [
     :comments_admin
     :firstname
     :lastname
  ]
end

And then can search doing:

# public search
PgSearch::Document.search("searching")
ObservationReport.search("searching") # this is a helper in the concern

# internal search
PgSearch::Document.internal_search("internal")
ObservationReport.internal_search("internal")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants