Can't install florence and ppt extractors #798

Open
sadath-12 opened this issue Jul 30, 2024 · 0 comments

Error with florence:

Failed to pull image "tensorlake/florence": failed to pull and unpack image "docker.io/tensorlake/florence:latest": failed to resolve reference "docker.io/tensorlake/florence:latest": pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed

Error with ppt:

root@ip-172-31-23-167:/home/ubuntu/indexify/operations/k8s/helm# k logs ppt-57fdd5fd4c-m2t72   -n indexify
indexify-extractor-sdk version 0.0.87
workers  1
config path provided  None
joining coordinator:8950 and sending extracted content to api:8900
Exception in initializer:
Traceback (most recent call last):
  File "/usr/lib/python3.11/concurrent/futures/process.py", line 235, in _process_worker
    initializer(*initargs)
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/indexify_extractor_sdk/extractor_worker.py", line 56, in create_extractor_wrapper_map
    description = extractor_wrapper.describe()
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/indexify_extractor_sdk/base_extractor.py", line 312, in describe
    outputs: Dict[str, List[Union[Feature, Content]]] = self.extract_batch(
                                                        ^^^^^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/indexify_extractor_sdk/base_extractor.py", line 298, in extract_batch
    out[task_id] = self._instance.extract(content, param_instance)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/indexify_extractors/ppt_extractor.py", line 25, in extract
    prs = Presentation(inputtmpfile.name)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/pptx/api.py", line 28, in Presentation
    presentation_part = Package.open(pptx).main_document_part
                        ^^^^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/pptx/opc/package.py", line 73, in open
    return cls(pkg_file)._load()
           ^^^^^^^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/pptx/opc/package.py", line 157, in _load
    pkg_xml_rels, parts = _PackageLoader.load(self._pkg_file, self)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/pptx/opc/package.py", line 186, in load
    return cls(pkg_file, package)._load()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/pptx/opc/package.py", line 190, in _load
    parts, xml_rels = self._parts, self._xml_rels
                      ^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/pptx/util.py", line 215, in __get__
    value = self._fget(obj)
            ^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/pptx/opc/package.py", line 219, in _parts
    content_types = self._content_types
                    ^^^^^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/pptx/util.py", line 215, in __get__
    value = self._fget(obj)
            ^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/pptx/opc/package.py", line 203, in _content_types
    return _ContentTypeMap.from_xml(self._package_reader[CONTENT_TYPES_URI])
                                    ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/pptx/opc/serialized.py", line 35, in __getitem__
    return self._blob_reader[pack_uri]
           ^^^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/pptx/util.py", line 215, in __get__
    value = self._fget(obj)
            ^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/pptx/opc/serialized.py", line 49, in _blob_reader
    return _PhysPkgReader.factory(self._pkg_file)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/pptx/opc/serialized.py", line 135, in factory
    raise PackageNotFoundError("Package not found at '%s'" % pkg_file)
pptx.exc.PackageNotFoundError: Package not found at '/tmp/tmpef6pkkj5.pptx'
Traceback (most recent call last):
  File "/root/.indexify-extractors/ve/bin/indexify-extractor", line 8, in <module>
    sys.exit(typer_app())
             ^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/typer/main.py", line 328, in __call__
    raise e
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/typer/main.py", line 311, in __call__
    return get_command(self)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/typer/core.py", line 783, in main
    return _main(
           ^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/typer/core.py", line 225, in _main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/typer/main.py", line 683, in wrapper
    return callback(**use_params)  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/indexify_extractor_sdk/main.py", line 113, in join_server
    indexify_extractor.join(
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/indexify_extractor_sdk/indexify_extractor.py", line 57, in join
    ] = asyncio.get_event_loop().run_until_complete(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/base_events.py", line 650, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/root/.indexify-extractors/ve/lib/python3.11/site-packages/indexify_extractor_sdk/extractor_worker.py", line 206, in describe
    return await loop.run_in_executor(executor, _describe)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
root@ip-172-31-23-167:/home/ubuntu/indexify/operations/k8s/helm# 
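
The ppt traceback above ends in python-pptx raising PackageNotFoundError for a temp file written during the extractor's describe step. One possible reading, which is an assumption and not something confirmed in this issue, is that the temp file exists but does not contain a valid .pptx package (a .pptx is a zip archive), for example because the sample content written to it is empty. The sketch below uses only the standard library to show how an empty temp file differs from a zip archive; `looks_like_pptx_package` is a hypothetical helper, not part of python-pptx or indexify.

```python
import os
import tempfile
import zipfile


def looks_like_pptx_package(path: str) -> bool:
    """Return True if `path` is a directory or a zip archive.

    Hypothetical helper for illustration only; it checks the container
    format, not whether the archive is a complete presentation.
    """
    return os.path.isdir(path) or zipfile.is_zipfile(path)


# An empty temp file with a .pptx suffix exists on disk but is not a zip archive.
with tempfile.NamedTemporaryFile(suffix=".pptx", delete=False) as tmp:
    empty_path = tmp.name

# A second temp file that we overwrite with a minimal zip archive.
with tempfile.NamedTemporaryFile(suffix=".pptx", delete=False) as tmp:
    zip_path = tmp.name
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("[Content_Types].xml", "<Types/>")

empty_ok = looks_like_pptx_package(empty_path)
zip_ok = looks_like_pptx_package(zip_path)
print(empty_ok, zip_ok)

# Clean up the temp files created above.
os.remove(empty_path)
os.remove(zip_path)
```

If the same kind of check fails on the file the extractor writes, the problem is more likely in the content passed to the extractor than in the Helm chart itself.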

I am using the Helm chart with the following extractor values:

extractors:
  - image: tensorlake/chunk-extractor:latest
    name: chunker
  - image: tensorlake/minilm-l6:latest
    name: minilm-l6
  - image: tensorlake/pdf-extractor:latest
    name: pdfextractor
  - image: tensorlake/florence
    name: florence
  - image: tensorlake/whisper-asr
    name: whisper-asr
  - image: tensorlake/summarization
    name: summarization
  - image: tensorlake/openai
    name: openai
  - image: tensorlake/marker
    name: marker
  - image: tensorlake/audio-extractor
    name: audio-extractor
  - image: tensorlake/mistral
    name: mistral
  - image: tensorlake/ppt
    name: ppt

The other extractors seem to install fine.
