NC 24 / fulltextsearch 24: #702

Closed
cbachmann987 opened this issue May 25, 2022 · 7 comments

Comments

@cbachmann987

  • NC24 in the official docker container (Apache), using podman
  • Elasticsearch (in separate official docker container, 7.17.3)

Got the following error during indexing:

┌─ Indexing  ────
│ Action: indexDocument
│ Provider: Files                Account: (REMOVED)
│ Document: 90734
│ Info: application/vnd.openxmlformats-officedocument.presentationml.presentation
│ Title: (REMOVED)
│ Content size: 7602364
│ Chunk:     37/236
│ Progress:    773/5474
└──
┌─ Results ────
│ Result:  16392/16392
│ Index: files:90732
│ Status: ok
│ Message: {"_index":"my_index","_type":"_doc","_id":"files:90732","_version":1,"result":"created"
│ ,"_shards":{"total":2,"successful":1,"failed":0},"_seq_no":16390,"_primary_term":1}
│ 
└──
┌─ Errors ────
│ Error:   1660/1660
│ Index: files:90684
│ Exception: Elasticsearch\Common\Exceptions\BadRequest400Exception
│ Message: The element type "hr" must be terminated by the matching end-tag "</hr>".
│ 
│ 
└──
## x:first result ## c/v:prec/next result ## b:last result
## f:first error ## h/j:prec/next error ## d:delete error ## l:last error
## q:quit ## p:pause 
An unhandled exception has been thrown:
TypeError: OCA\FullTextSearch_Elasticsearch\Service\IndexMappingService::indexDocumentNew(): Return value must be of type array, string returned in /var/www/html/custom_apps/fulltextsearch_elasticsearch/lib/Service/IndexMappingService.php:88
Stack trace:
#0 /var/www/html/custom_apps/fulltextsearch_elasticsearch/lib/Service/IndexService.php(195): OCA\FullTextSearch_Elasticsearch\Service\IndexMappingService->indexDocumentNew(Object(Elasticsearch\Client), Object(OCA\Files_FullTextSearch\Model\FilesDocument))
#1 /var/www/html/custom_apps/fulltextsearch_elasticsearch/lib/Platform/ElasticSearchPlatform.php(226): OCA\FullTextSearch_Elasticsearch\Service\IndexService->indexDocument(Object(Elasticsearch\Client), Object(OCA\Files_FullTextSearch\Model\FilesDocument))
#2 /var/www/html/custom_apps/fulltextsearch/lib/Service/IndexService.php(373): OCA\FullTextSearch_Elasticsearch\Platform\ElasticSearchPlatform->indexDocument(Object(OCA\Files_FullTextSearch\Model\FilesDocument))
#3 /var/www/html/custom_apps/fulltextsearch/lib/Service/IndexService.php(324): OCA\FullTextSearch\Service\IndexService->indexDocument(Object(OCA\FullTextSearch_Elasticsearch\Platform\ElasticSearchPlatform), Object(OCA\Files_FullTextSearch\Model\FilesDocument))
#4 /var/www/html/custom_apps/fulltextsearch/lib/Service/IndexService.php(195): OCA\FullTextSearch\Service\IndexService->indexDocuments(Object(OCA\FullTextSearch_Elasticsearch\Platform\ElasticSearchPlatform), Object(OCA\Files_FullTextSearch\Provider\FilesProvider), Array, Object(OCA\FullTextSearch\Model\IndexOptions))
#5 /var/www/html/custom_apps/fulltextsearch/lib/Command/Index.php(416): OCA\FullTextSearch\Service\IndexService->indexProviderContentFromUser(Object(OCA\FullTextSearch_Elasticsearch\Platform\ElasticSearchPlatform), Object(OCA\Files_FullTextSearch\Provider\FilesProvider), 'christian', Object(OCA\FullTextSearch\Model\IndexOptions))
#6 /var/www/html/custom_apps/fulltextsearch/lib/Command/Index.php(279): OCA\FullTextSearch\Command\Index->indexProvider(Object(OCA\Files_FullTextSearch\Provider\FilesProvider), Object(OCA\FullTextSearch\Model\IndexOptions))
#7 /var/www/html/3rdparty/symfony/console/Command/Command.php(255): OCA\FullTextSearch\Command\Index->execute(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#8 /var/www/html/core/Command/Base.php(168): Symfony\Component\Console\Command\Command->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#9 /var/www/html/3rdparty/symfony/console/Application.php(1009): OC\Core\Command\Base->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#10 /var/www/html/3rdparty/symfony/console/Application.php(273): Symfony\Component\Console\Application->doRunCommand(Object(OCA\FullTextSearch\Command\Index), Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#11 /var/www/html/3rdparty/symfony/console/Application.php(149): Symfony\Component\Console\Application->doRun(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#12 /var/www/html/lib/private/Console/Application.php(211): Symfony\Component\Console\Application->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
#13 /var/www/html/console.php(99): OC\Console\Application->run()
#14 /var/www/html/occ(11): require_once('/var/www/html/c...')
#15 {main}

Indexing stops at this point, and I have to restart the Nextcloud instance before any further occ fulltextsearch commands will run.
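
The error panel above shows a plain-text message (an XML parser complaint) rather than a JSON document, while the TypeError says that a function declared to return an array produced a string. One possible reading, not confirmed by the app's source in this report, is that the error text is passed along undecoded. The Python sketch below only illustrates that general pattern and one way to keep the return type stable; `normalize_error` and the structured sample payload are hypothetical and are not part of fulltextsearch_elasticsearch or the Elasticsearch client.

```python
import json
from typing import Any, Dict


def normalize_error(message: str) -> Dict[str, Any]:
    """Return a dict for any error message, whether or not it is JSON."""
    try:
        decoded = json.loads(message)
    except json.JSONDecodeError:
        # Plain-text message, such as an XML parser complaint.
        return {"error": message, "structured": False}
    if isinstance(decoded, dict):
        # A JSON object: keep its fields and mark it as structured.
        return {**decoded, "structured": True}
    # Valid JSON that is not an object (bare string, number, list).
    return {"error": decoded, "structured": False}


# Message copied from the error panel in the log above.
plain = 'The element type "hr" must be terminated by the matching end-tag "</hr>".'
# Hypothetical structured payload, made up for comparison.
structured = '{"error": {"type": "mapper_parsing_exception"}, "status": 400}'

print(normalize_error(plain))
print(normalize_error(structured))
```

With a helper like this, a caller always receives the same type and can check the `structured` flag instead of failing on the return type.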

❯ podman exec -u www-data -it nextcloud-main php occ fulltextsearch:check
Full text search 24.0.0
 
- Search Platform:
Elasticsearch 24.0.0 (Selected)
{
    "elastic_host": [
        "http://127.0.0.1:9200"
    ],
    "elastic_index": "my_index",
    "fields_limit": "10000",
    "es_ver_below66": "0",
    "analyzer_tokenizer": "standard"
} 

- Content Providers:
Deck 1.7.0
[]
Files 24.0.0
{
    "files_local": "1",
    "files_external": "2",
    "files_group_folders": "0",
    "files_encrypted": "0",
    "files_federated": "0",
    "files_size": "20",
    "files_pdf": "0",
    "files_office": "1",
    "files_image": "0",
    "files_audio": "0"
}
❯ podman exec -u www-data -it nextcloud-main php occ fulltextsearch:test
 
.Testing your current setup:  
Creating mocked content provider. ok  
Testing mocked provider: get indexable documents. (2 items) ok  
Loading search platform. (Elasticsearch) ok  
Testing search platform. ok  
Locking process ok  
Removing test. ok  
Pausing 3 seconds 1 2 3 ok  
Initializing index mapping. ok  
Indexing generated documents. ok  
Pausing 3 seconds 1 2 3 ok  
Retreiving content from a big index (license). (size: 32386) ok  
Comparing document with source. ok  
Searching basic keywords:  
 - 'test' (result: 1, expected: ["simple"]) ok  
 - 'document is a simple test' (result: 2, expected: ["simple","license"]) ok  
 - '"document is a test"' (result: 0, expected: []) ok  
 - '"document is a simple test"' (result: 1, expected: ["simple"]) ok  
 - 'document is a simple -test' (result: 1, expected: ["license"]) ok  
 - 'document is a simple +test' (result: 1, expected: ["simple"]) ok  
 - '-document is a simple test' (result: 0, expected: []) ok  
 - 'document is a simple +test +testing' (result: 1, expected: ["simple"]) ok  
 - 'document is a simple +test -testing' (result: 0, expected: []) ok  
 - 'document is a +simple -test -testing' (result: 0, expected: []) ok  
 - '+document is a simple -test -testing' (result: 1, expected: ["license"]) ok  
 - 'document is a +simple -license +testing' (result: 1, expected: ["simple"]) ok  
Updating documents access. ok  
Pausing 3 seconds 1 2 3 ok  
Searching with group access rights:  
 - 'license' - [] -  (result: 0, expected: []) ok  
 - 'license' - ["group_1"] -  (result: 1, expected: ["license"]) ok  
 - 'license' - ["group_1","group_2"] -  (result: 1, expected: ["license"]) ok  
 - 'license' - ["group_3","group_2"] -  (result: 1, expected: ["license"]) ok  
 - 'license' - ["group_3"] -  (result: 0, expected: []) ok  
Searching with share rights:  
 - 'license' - notuser -  (result: 0, expected: []) ok  
 - 'license' - user2 -  (result: 1, expected: ["license"]) ok  
 - 'license' - user3 -  (result: 1, expected: ["license"]) ok  
Removing test. ok  
Unlocking process ok  

Fulltextsearch is unusable for me in this state; any help is appreciated. I can run further tests if that helps.

Perhaps I could also help with writing some docs.

@apg1980

apg1980 commented May 31, 2022

Fulltextsearch indexing on NC24 breaks as well if:

  • User has no quota
  • User is disabled
  • User has no files

You can reproduce it.

@cbachmann987
Author

> Fulltextsearch indexing on NC24 breaks as well if:
>
> * User has no Quota
> * User is disabled
> * User has no Files
>
> You can reproduce it.

I had no quota set for any user, so I just tried recreating the index with quotas enabled for all users. Indexing now breaks earlier than before, so there is no real difference. I have no users without files and no disabled users.

@apg1980

apg1980 commented Jun 1, 2022

For me the issue still exists. After an index run it breaks with:
An unhandled exception has been thrown:
Error: Call to a member function getUID() on null in /var/www/nextcloud/apps/files_fulltextsearch/lib/Service/FilesService.php:449
Stack trace:
#0 /var/www/nextcloud/apps/files_fulltextsearch/lib/Service/FilesService.php(421): OCA\Files_FullTextSearch\Service\FilesService->generateFilesDocumentFromFile()
#1 /var/www/nextcloud/apps/files_fulltextsearch/lib/Service/FilesService.php(318): OCA\Files_FullTextSearch\Service\FilesService->generateFilesDocumentFromParent()
#2 /var/www/nextcloud/apps/files_fulltextsearch/lib/Provider/FilesProvider.php(269): OCA\Files_FullTextSearch\Service\FilesService->getFilesFromUser()
#3 /var/www/nextcloud/apps/fulltextsearch/lib/Service/IndexService.php(183): OCA\Files_FullTextSearch\Provider\FilesProvider->generateIndexableDocuments()
#4 /var/www/nextcloud/apps/fulltextsearch/lib/Command/Index.php(416): OCA\FullTextSearch\Service\IndexService->indexProviderContentFromUser()
#5 /var/www/nextcloud/apps/fulltextsearch/lib/Command/Index.php(279): OCA\FullTextSearch\Command\Index->indexProvider()
#6 /var/www/nextcloud/3rdparty/symfony/console/Command/Command.php(255): OCA\FullTextSearch\Command\Index->execute()
#7 /var/www/nextcloud/core/Command/Base.php(168): Symfony\Component\Console\Command\Command->run()
#8 /var/www/nextcloud/3rdparty/symfony/console/Application.php(1009): OC\Core\Command\Base->run()
#9 /var/www/nextcloud/3rdparty/symfony/console/Application.php(273): Symfony\Component\Console\Application->doRunCommand()
#10 /var/www/nextcloud/3rdparty/symfony/console/Application.php(149): Symfony\Component\Console\Application->doRun()
#11 /var/www/nextcloud/lib/private/Console/Application.php(211): Symfony\Component\Console\Application->run()
#12 /var/www/nextcloud/console.php(99): OC\Console\Application->run()
#13 /var/www/nextcloud/occ(11): require_once('...')

@apg1980

apg1980 commented Jun 1, 2022

It breaks if groupfolders is enabled.
It is an old issue: nextcloud/server#15074

@XueSheng-GIT

@apg1980 please try to keep the different issues separate. The exception regarding getUID() is already under discussion in #697, where specific cases that trigger this error have already been shown.

This issue is about TypeError: OCA\FullTextSearch_Elasticsearch\Service\IndexMappingService::indexDocumentNew(): Return value must be of type array, string returned in /var/www/html/custom_apps/fulltextsearch_elasticsearch/lib/Service/IndexMappingService.php:88.

Thanks!

@ArtificialOwl
Member

@cbachmann987 I assume it only happens with some files. ES is returning an error in an unexpected format; we will ignore those files.

Would you be able to test this PR?
nextcloud/fulltextsearch_elasticsearch#200
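
The approach described here, ignoring files whose errors arrive in an unexpected format, can be sketched as a skip-and-continue loop. This is an illustrative Python sketch under stated assumptions, not the code from the PR: `IndexReport`, `index_all`, and `fake_index_one` are hypothetical names, and the document ids are reused from the log earlier in the thread purely as sample values (files:90733 is made up).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class IndexReport:
    """Collects which documents were indexed and which were skipped."""
    indexed: List[str] = field(default_factory=list)
    skipped: Dict[str, str] = field(default_factory=dict)


def index_all(doc_ids: Iterable[str], index_one: Callable[[str], None]) -> IndexReport:
    """Index every document; record failures instead of aborting the run."""
    report = IndexReport()
    for doc_id in doc_ids:
        try:
            index_one(doc_id)
        except Exception as exc:
            # Broad on purpose: the error format is unknown, so any failure
            # is recorded against the document and the loop moves on.
            report.skipped[doc_id] = str(exc)
            continue
        report.indexed.append(doc_id)
    return report


def fake_index_one(doc_id: str) -> None:
    """Stand-in for a real indexing call; fails for one sample document."""
    if doc_id == "files:90684":
        raise ValueError(
            'The element type "hr" must be terminated by the matching end-tag "</hr>".'
        )


report = index_all(["files:90732", "files:90684", "files:90733"], fake_index_one)
print(report.indexed)
print(report.skipped)
```

The trade-off is the one discussed in this thread: a few documents end up missing from the index, but a single bad file no longer stops the whole run, and the skipped ids remain available for later inspection.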

@cbachmann987
Author

Thanks for your work - I applied your patch (manually) and indexing went through all files.

It definitely makes sense to skip some files with unknown errors rather than having no index at all.
