docs(TaskProcessing): Update docs to reflect latest changes #12063

Merged
merged 8 commits into from
Jul 26, 2024
8 changes: 5 additions & 3 deletions admin_manual/ai/app_context_chat.rst
Original file line number Diff line number Diff line change
@@ -57,18 +57,20 @@ Installation
Initial loading of data
-----------------------

Context chat will automatically load user data into the Vector DB using background jobs. To speed this up, you can set up multiple background job worker machines and run the following occ commands in parallel on each:
Context chat will automatically load user data into the Vector DB using background jobs. To speed this up, you can set up multiple background job workers (possibly on dedicated machines) and run the following occ commands as daemons in parallel on each:

.. code-block::

occ background-job:worker OCA\ContextChat\BackgroundJobs\StorageCrawlJob
occ background-job:worker 'OCA\ContextChat\BackgroundJobs\StorageCrawlJob'

.. code-block::

occ background-job:worker OCA\ContextChat\BackgroundJobs\IndexerJob
occ background-job:worker 'OCA\ContextChat\BackgroundJobs\IndexerJob'

This will ensure that the necessary background jobs are run as often as possible: ``StorageCrawlJob`` will crawl Nextcloud storages and put files that it finds into a queue and ``IndexerJob`` will iterate over the queue and load the file content into the Vector DB.

Make sure to restart these daemons regularly, for example once a day.

Scaling
-------

@@ -126,6 +126,10 @@ Added APIs
- ``OCP\App\IAppManager::BACKEND_CALDAV`` was added to represent the caldav backend dependency for ``isBackendRequired()``.
- ``OCP\App\IAppManager::isBackendRequired()`` was added to check if at least one app requires a specific backend (currently only ``caldav``).
- ``OCP\Accounts\IAccountManager::PROPERTY_BIRTHDATE`` was added to allow users to configure their date of birth in their profiles.
- ``OCP\TaskProcessing`` was added to unify task processing of AI tasks and other types of tasks. See :ref:`Task Processing<task_processing>`.
- ``OCP\AppFramework\Bootstrap\IRegistrationContext::registerTaskProcessingProvider()`` was added to allow registering task processing providers.
- ``OCP\AppFramework\Bootstrap\IRegistrationContext::registerTaskProcessingTaskType()`` was added to allow registering task processing task types.
- ``OCP\Files\IRootFolder::getAppDataDirectoryName()`` was added to allow getting the name of the app data directory.

Changed APIs
^^^^^^^^^^^^
@@ -175,6 +179,10 @@ Deprecated APIs
- Calling ``OCP\DB\QueryBuilder\IQueryBuilder::update()`` with ``$alias`` is deprecated and will throw an exception in a future version as the underlying library is removing the functionality.
- Calling ``OCP\IDBConnection::getDatabasePlatform()`` is deprecated and will throw an exception in a future version as the underlying library is renaming and removing platforms which breaks the backwards-compatibility. Use ``getDatabaseProvider()`` instead.
- Calling ``OCP\Files\Lock\ILockManager::registerLockProvider()`` is deprecated and will be removed in the future. Use ``registerLazyLockProvider()`` instead.
- Using ``OCP\Translation`` is deprecated and will be removed in the future. Use ``OCP\TaskProcessing`` instead.
- Using ``OCP\SpeechToText`` is deprecated and will be removed in the future. Use ``OCP\TaskProcessing`` instead. Existing ``SpeechToText`` providers will continue to work with the TaskProcessing API until then.
- Using ``OCP\TextToImage`` is deprecated and will be removed in the future. Use ``OCP\TaskProcessing`` instead. Existing ``TextToImage`` providers will continue to work with the TaskProcessing API until then.
- Using ``OCP\TextProcessing`` is deprecated and will be removed in the future. Use ``OCP\TaskProcessing`` instead. Existing ``TextProcessing`` providers will continue to work with the TaskProcessing API until then.

Removed APIs
^^^^^^^^^^^^
5 changes: 4 additions & 1 deletion developer_manual/digging_deeper/speech-to-text.rst
@@ -6,6 +6,9 @@ Speech-To-Text

.. versionadded:: 27

.. deprecated:: 30
Use the TaskProcessing API instead

Nextcloud offers a **Speech-To-Text** API. The overall idea is that there is a central OCP API that apps can use to request transcriptions of audio or video files. To be technology agnostic any app can provide this Speech-To-Text functionality by registering a Speech-To-Text provider.

Consuming the Speech-To-Text API
@@ -182,4 +185,4 @@ The provider class is registered via the :ref:`bootstrap mechanism<Bootstrapping

public function boot(IBootContext $context): void {}

}
}
137 changes: 134 additions & 3 deletions developer_manual/digging_deeper/task_processing.rst
@@ -68,6 +68,13 @@ The following built-in task types are available:
* ``input``: ``Text``
* Output shape:
* ``output``: ``Text``
* ``'core:text2text:translate'``: This task will translate text from one language to another. It is implemented by ``\OCP\TaskProcessing\TaskTypes\TextToTextTranslate``
* Input shape:
* ``input``: ``Text``
* ``origin_language``: ``Enum``
* ``target_language``: ``Enum``
* Output shape:
* ``output``: ``Text``
* ``'core:audio2text'``: This task type is for transcribing audio to text. It is implemented by ``\OCP\TaskProcessing\TaskTypes\AudioToText``
* Input shape:
* ``input``: ``Audio``
@@ -126,6 +133,7 @@ Input and output shape keys can have one of a pre-defined set of types, which ar
case Audio = 3;
case Video = 4;
case File = 5;
case Enum = 6;
case ListOfNumbers = 10;
case ListOfTexts = 11;
case ListOfImages = 12;
@@ -158,6 +166,17 @@ The task class objects have the following methods available:
* ``getAppId()`` This returns the originating application ID of the task.
* ``getCustomId()`` This returns the original scheduler-defined identifier for the task
* ``getUserId()`` This returns the originating user ID of the task.
* ``getCompletionExpectedAt()`` This is available after scheduling the task and returns the DateTime at which the task is expected to be completed.
* ``getLastUpdated()`` This returns the time the task was last updated as a Unix timestamp.
* ``getScheduledAt()`` This returns the time the task was scheduled as a Unix timestamp.
* ``getStartedAt()`` This returns the time the task execution started as a Unix timestamp.
* ``getEndedAt()`` This returns the time the task execution ended as a Unix timestamp.
* ``getErrorMessage()`` This returns the error message if the task execution failed.
* ``getProgress()`` This returns the current task progress as a value between 0 and 1 while the task is running. It is 1 once the task is completed.
* ``setWebhookUri()`` This sets the URI of a webhook that will be notified when the task execution has ended.
* ``setWebhookMethod()`` This sets the HTTP method that will be used for the webhook when the task execution has ended.
* ``getWebhookUri()`` This returns the webhook URI that will be notified when the task execution has ended.
* ``getWebhookMethod()`` This returns the HTTP method that will be used for the webhook when the task execution has ended.

You could now schedule the task as follows:

@@ -261,7 +280,7 @@ A **Task processing provider** will usually be a class that implements the inter
) {
}

public function getId() {
public function getId(): string {
return 'myapp:summary';
}

@@ -277,17 +296,129 @@ A **Task processing provider** will usually be a class that implements the inter
// Return the output here
}

public function getExpectedRuntime() {
public function getExpectedRuntime(): int {
// usually takes 1min on average
return 60;
}

public function getInputShapeDefaults(): array {
return [];
}

public function getOptionalInputShape(): array {
return [];
}

public function getOptionalInputShapeDefaults(): array {
return [];
}

public function getOptionalOutputShape(): array {
return [];
}

public function getInputShapeEnumValues(): array {
return [];
}

public function getOptionalInputShapeEnumValues(): array {
return [];
}

public function getOutputShapeEnumValues(): array {
return [];
}

public function getOptionalOutputShapeEnumValues(): array {
return [];
}
}

The method ``getName`` returns a string to identify the registered provider in the user interface.

The method ``process`` implements the task processing step. If execution fails for some reason, you should throw a ``\OCP\TaskProcessing\Exception\ProcessingException`` with an explanatory error message. Note that ``Image``, ``Audio``, ``Video`` and ``File`` slots in the input array will be filled with ``\OCP\Files\File`` objects for your convenience. When outputting one of these types, simply return the data as a string; the API will turn it into a proper file. The ``$reportProgress`` parameter is a callback that you may call at any time to report the task progress as a single float value between 0 and 1. Its return value indicates whether the task is still running (``true``) or was cancelled (``false``), in which case processing should be terminated.
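
As a rough illustration of the progress and cancellation contract, the following stand-alone sketch mimics the body of a ``process`` method outside of Nextcloud. The function ``summarizeWithProgress`` and the local ``ProcessingException`` stand-in are hypothetical and only exist so that the snippet runs on its own; a real provider would throw ``\OCP\TaskProcessing\Exception\ProcessingException``. Throwing an exception on cancellation is an assumption, it is just one possible way to terminate processing.

.. code-block:: php

    <?php

    declare(strict_types=1);

    // Stand-in so the sketch runs outside Nextcloud. A real provider would throw
    // \OCP\TaskProcessing\Exception\ProcessingException instead.
    class ProcessingException extends \RuntimeException {
    }

    /**
     * Hypothetical helper: keeps the first sentence of every non-empty input line.
     *
     * @param array{input: string} $input
     * @param callable(float): bool $reportProgress returns false once the task was cancelled
     * @return array{output: string}
     */
    function summarizeWithProgress(array $input, callable $reportProgress): array {
        $lines = array_values(array_filter(
            array_map('trim', explode("\n", $input['input'])),
            static fn (string $line): bool => $line !== ''
        ));
        if ($lines === []) {
            throw new ProcessingException('The input text is empty');
        }

        $sentences = [];
        foreach ($lines as $i => $line) {
            // Placeholder for the real model call.
            $sentences[] = explode('.', $line, 2)[0] . '.';

            // Report progress and stop as soon as the callback signals cancellation.
            $stillRunning = $reportProgress(($i + 1) / (float) count($lines));
            if ($stillRunning === false) {
                throw new ProcessingException('The task was cancelled');
            }
        }

        return ['output' => implode(' ', $sentences)];
    }

Checking the return value after every unit of work keeps a cancelled task from consuming resources longer than necessary.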

This class would typically be saved into a file in ``lib/TextProcessing`` of your app but you are free to put it elsewhere as long as it's loadable by Nextcloud's :ref:`dependency injection container<dependency-injection>`.
This class would typically be saved into a file in ``lib/TaskProcessing`` of your app but you are free to put it elsewhere as long as it's loadable by Nextcloud's :ref:`dependency injection container<dependency-injection>`.

Providing supplemental inputs and outputs
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Built-in task types often only specify the most basic input and output slots. If you would like to offer more input or output options
with your provider, you can specify optional inputs and outputs using the ``getOptionalInputShape`` and ``getOptionalOutputShape`` methods.
Each method must return an associative array of ``\OCP\TaskProcessing\ShapeDescriptor`` objects.

.. code-block:: php

public function getOptionalInputShape(): array {
return [
'tone' => new ShapeDescriptor($this->l->t('Tone of voice'), $this->l->t('Set the tone of voice to be used for the output'), EShapeType::Text)
];
}

In the same vein you can also provide optional output shape slots in addition to the pre-defined output slots.

.. code-block:: php

public function getOptionalOutputShape(): array {
return [
'co2_emissions' => new ShapeDescriptor($this->l->t('CO2 Emissions'), $this->l->t('The CO2 emissions produced by running this task in metric tons'), EShapeType::Number)
];
}

Providing input defaults
^^^^^^^^^^^^^^^^^^^^^^^^

With the method ``getInputShapeDefaults`` you can specify default values for input slots (which are defined by the task type). For example:

.. code-block:: php

public function getInputShapeDefaults(): array {
return [
'input' => 'There was once a man with many cows who wanted to have even more cows.'
];
}

Note that you can only specify default values for 'Text' and 'Number' slots.
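
How the API combines these defaults with the values submitted for a task is not spelled out here. The sketch below only illustrates the intent, under the assumption that submitted values take precedence and defaults fill the gaps. The function ``applyDefaults`` is hypothetical and not part of the OCP API.

.. code-block:: php

    <?php

    declare(strict_types=1);

    /**
     * Hypothetical helper: fills missing slots with the provider defaults.
     *
     * @param array<string, string|int|float> $submitted values sent with the task
     * @param array<string, string|int|float> $defaults values as returned by getInputShapeDefaults()
     * @return array<string, string|int|float>
     */
    function applyDefaults(array $submitted, array $defaults): array {
        // The array union operator keeps the left-hand value whenever a key exists in both arrays.
        return $submitted + $defaults;
    }

    $defaults = ['input' => 'There was once a man with many cows who wanted to have even more cows.'];

    // No value submitted: the default is used.
    $withDefault = applyDefaults([], $defaults);

    // A submitted value wins over the default.
    $withCustom = applyDefaults(['input' => 'Custom text'], $defaults);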

The same works for your optional input shapes that you defined in ``getOptionalInputShape``:

.. code-block:: php

public function getOptionalInputShapeDefaults(): array {
return [
'tone' => 'Formal'
];
}

Working with Enum shape types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Input and output shapes, as well as the optional input and output shapes, can declare slots of type ``Enum``. An Enum
is a type that only allows values from a pre-defined set. In the TaskProcessing API this set is not defined by the task type, but
by the provider implementing the task type, using ``getInputShapeEnumValues``, ``getOutputShapeEnumValues``, ``getOptionalInputShapeEnumValues`` and ``getOptionalOutputShapeEnumValues``.

You could, for example, implement the above tone of voice slot using an Enum:

.. code-block:: php

public function getOptionalInputShape(): array {
return [
'tone' => new ShapeDescriptor($this->l->t('Tone of voice'), $this->l->t('Set the tone of voice to be used for the output'), EShapeType::Enum)
];
}

.. code-block:: php

public function getOptionalInputShapeEnumValues(): array {
return [
'tone' => [
new ShapeEnumValue($this->l->t('Simple'), 'So that a kid could understand'),
new ShapeEnumValue($this->l->t('Funny'), 'Funny'),
new ShapeEnumValue($this->l->t('Formal'), 'Formal'),
]
];
}
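
To illustrate how the declared values relate to what a provider receives, here is a stand-alone sketch. It assumes that the second constructor argument of ``ShapeEnumValue`` is the value that ends up in the ``tone`` slot, and that a fallback applies when the slot is empty. The ``ShapeEnumValue`` class below is a minimal stand-in with assumed accessor names, and ``resolveEnumSlot`` is a hypothetical helper; neither is the OCP implementation.

.. code-block:: php

    <?php

    declare(strict_types=1);

    // Minimal stand-in for \OCP\TaskProcessing\ShapeEnumValue so the sketch runs outside Nextcloud.
    final class ShapeEnumValue {
        public function __construct(
            private string $name,
            private string $value,
        ) {
        }

        public function getName(): string {
            return $this->name;
        }

        public function getValue(): string {
            return $this->value;
        }
    }

    /**
     * Hypothetical helper: returns the submitted value if it is one of the declared enum values.
     *
     * @param ShapeEnumValue[] $allowed
     */
    function resolveEnumSlot(?string $submitted, array $allowed, string $fallback): string {
        $values = array_map(static fn (ShapeEnumValue $v): string => $v->getValue(), $allowed);
        $candidate = $submitted ?? $fallback;
        if (!in_array($candidate, $values, true)) {
            throw new \InvalidArgumentException('Unsupported enum value: ' . $candidate);
        }
        return $candidate;
    }

Keeping the human-readable name separate from the value lets the label be translated while the provider matches on a stable string.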


Providing more task types
^^^^^^^^^^^^^^^^^^^^^^^^^
3 changes: 3 additions & 0 deletions developer_manual/digging_deeper/text2image.rst
@@ -6,6 +6,9 @@ Text-To-Image

.. versionadded:: 28

.. deprecated:: 30
Use the TaskProcessing API instead

Nextcloud offers a **Text-To-Image** API. The overall idea is that there is a central OCP API that apps can use to prompt tasks to latent diffusion AI models and similar image generation tools. To be technology agnostic any app can provide this functionality by registering a Text-To-Image provider.

Consuming the Text-To-Image API
4 changes: 4 additions & 0 deletions developer_manual/digging_deeper/text_processing.rst
@@ -6,6 +6,10 @@ Text Processing

.. versionadded:: 27.1.0

.. deprecated:: 30
Use the TaskProcessing API instead


Nextcloud offers a **Text Processing** API. The overall idea is that there is a central OCP API that apps can use to prompt tasks to Large Language Models and similar text processing tools. To be technology agnostic any app can provide this functionality by registering Text Processing providers.

Consuming the Text Processing API
5 changes: 4 additions & 1 deletion developer_manual/digging_deeper/translation.rst
@@ -6,6 +6,9 @@ Machine Translation

.. versionadded:: 26

.. deprecated:: 30
Use the TaskProcessing API instead

Nextcloud offers a **Translation** API. The overall idea is that there is a central OCP API that apps can use to request machine translations of text. To be technology agnostic any app can provide this Translation functionality by registering a Translation provider.

Consuming the Translation API
@@ -187,4 +190,4 @@ The provider class is registered via the :ref:`bootstrap mechanism<Bootstrapping

public function boot(IBootContext $context): void {}

}
}