Steamship Tagger Plugin

This project implements a basic Steamship Tagger that you can customize and deploy for your own use.

In Steamship, Taggers add annotations to text that can be queried and composed later. Note that a file must have first been blockified in order to be tagged.

This sample project adds paragraph and sentence tags for a sample text document, but the implementation you create might:

Use entity recognition to tag named entities in text block
Use sentiment analysis to tag positive and negative sections of a transcribed conversation

Once a Tagger has generated data in Steamship, that data is ready for use by the rest of the ecosystem. For example, you could perform a query over the sentences or embed each sentence.

First Time Setup

We recommend using Python virtual environments for development. To set one up, run the following command from this directory:

python3 -m venv .venv

Activate your virtual environment by running:

source .venv/bin/activate

Your first time, install the required dependencies with:

python -m pip install -r requirements.dev.txt
python -m pip install -r requirements.txt

Developing

All the code for this plugin is located in the src/api.py file:

The TaggerPlugin class
The run method that is invoked via the File.tag call

Testing

Tests are located in the test/test_api.py file. You can run them with:

pytest

We have provided sample data in the test_data/ folder.

Deploying

Deploy your tagger to Steamship by running:

ship deploy --register-plugin

That will deploy your plugin to Steamship and register it as a plugin for use.

Using

Once deployed, your Tagger Plugin can be referenced by the handle in your steamship.json file.

from steamship import Steamship, Block, File, MimeTypes, Tag

MY_PLUGIN_HANDLE = ".. fill this out .."

client = Steamship()
tagger = client.use_plugin(plugin_handle=MY_PLUGIN_HANDLE, plugin_instance="unique-instance-id")

with open("./test_data/king_speech.txt", "r") as text:
    # here, we add an initial block, as tagging requires files have been blockified.
    content = text.read()
    file = File.create(client, content=content, mime_type=MimeTypes.TXT, blocks=[Block.CreateRequest(text=content)])
    
file.tag(tagger.handle).wait()

# now that our file has been tagged, we can access the new tags by refreshing the file.
file = file.refresh()
for block in file.blocks:
    print(block.text)
    print(block.tags)

# we can also query for tags in the file (here, finding sentences)
print("\n".join([content[t.start_idx:t.end_idx] for t in Tag.query(client, f'file_id "{file.id}" and kind "sentence"').tags]))

Sharing

Please share what you've built with hello@steamship.com!

We would love take a look, hear your suggestions, help where we can, and share what you've made with the community.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Steamship Tagger Plugin

First Time Setup

Developing

Testing

Deploying

Using

Sharing

About

Releases

Packages

Contributors 2

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
src		src
test		test
.gitignore		.gitignore
README.md		README.md
requirements.dev.txt		requirements.dev.txt
requirements.txt		requirements.txt
steamship.json		steamship.json

CerebriumAI/steamship

Folders and files

Latest commit

History

Repository files navigation

Steamship Tagger Plugin

First Time Setup

Developing

Testing

Deploying

Using

Sharing

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages