Build cross-modal and multi-modal applications on the cloud
Jina is a framework that empowers anyone to build cross-modal and multi-modal[*] applications on the cloud. It uplifts a PoC into a production-ready service. Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer.
[*] Example cross-modal application: DALL·E Flow; example multi-modal services: CLIP-as-service, Jina Now.
Applications built with Jina enjoy the following features out-of-the-box:
🌌 Universal
- Build applications that deliver fresh insights from multiple data types such as text, image, audio, video, 3D mesh, and PDF, with Jina AI's DocArray.
- Support all mainstream deep learning frameworks.
- Polyglot gateway that supports gRPC, Websockets, HTTP, GraphQL protocols with TLS.
⚡ Performance
- Intuitive design pattern for high-performance microservices.
- Scale with ease: set replicas and sharding in one line.
- Duplex streaming between client and server.
- Async and non-blocking data processing over dynamic flows.
☁️ Cloud-native
- Seamless Docker container integration: sharing, exploring, sandboxing, versioning and dependency control via Jina Hub.
- Fast deployment to Kubernetes, Docker Compose and Jina Cloud.
- Full observability via Prometheus and Grafana.
🍱 Ecosystem
- Improved engineering efficiency thanks to the Jina AI ecosystem, so you can focus on innovating with the data applications you build.
pip install jina
More install options can be found in the docs.
Document, Executor and Flow are three fundamental concepts in Jina.
- Document is the fundamental data structure.
- Executor is a Python class with functions that use Documents as IO.
- Flow ties Executors together into a pipeline and exposes it with an API gateway.
The full glossary is explained here.
Leveraging these three concepts, let's look at a simple example below:
from jina import DocumentArray, Executor, Flow, requests


class MyExec(Executor):
    @requests
    async def add_text(self, docs: DocumentArray, **kwargs):
        for d in docs:
            d.text += 'hello, world!'


f = Flow().add(uses=MyExec).add(uses=MyExec)

with f:
    r = f.post('/', DocumentArray.empty(2))
    print(r.texts)
- The first line imports the three concepts we just introduced;
- `MyExec` defines an async function `add_text` that receives a `DocumentArray` from network requests and appends `'hello, world!'` to each `.text`;
- `f` defines a Flow that chains two Executors;
- The `with` block opens the Flow, sends an empty DocumentArray to the Flow, and prints the result.
Running it prints the Flow's startup logs, and the last line shows the output: `['hello, world!hello, world!', 'hello, world!hello, world!']`.
While one could use standard Python with the same number of lines and get the same output, Jina accelerates time to market of your application by making it more scalable and cloud-native. Jina also handles the infrastructure complexity in production and other Day-2 operations so that you can focus on the data application itself.
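For comparison, a plain-Python version of the same logic might look like the sketch below. `Doc` is a hypothetical stand-in for Jina's Document (it is not part of Jina), and calling `add_text` twice mimics the two chained Executors. It produces the same output, but has none of the networking, scaling, or deployment features described above.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a Jina Document; it only carries text."""
    text: str = ''


def add_text(docs):
    """Append the greeting to every document, like MyExec.add_text."""
    for d in docs:
        d.text += 'hello, world!'
    return docs


# Two empty documents passed through the same step twice,
# mirroring Flow().add(uses=MyExec).add(uses=MyExec).
docs = add_text(add_text([Doc(), Doc()]))
texts = [d.text for d in docs]
print(texts)
```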
The example above can be refactored into a Python Executor file and a Flow YAML file:
toy.yml:

jtype: Flow
with:
  port: 51000
  protocol: grpc
executors:
- uses: MyExec
  name: foo
  py_modules:
    - executor.py
- uses: MyExec
  name: bar
  py_modules:
    - executor.py

executor.py:

from jina import DocumentArray, Executor, requests


class MyExec(Executor):
    @requests
    async def add_text(self, docs: DocumentArray, **kwargs):
        for d in docs:
            d.text += 'hello, world!'
Run the following command in the terminal:
jina flow --uses toy.yml
Once the server has started, you can use a client to query it:
from jina import Client, Document
c = Client(host='grpc://0.0.0.0:51000')
c.post('/', Document())
This simple refactoring allows developers to write an application in the client-server style. Separating the Flow YAML from the Executor Python file not only makes the project more maintainable but also brings scalability and concurrency to the next level:
- The data flow on the server is non-blocking and async. A new request is handled as soon as an Executor is free, regardless of whether the previous request is still being processed.
- Scalability can be easily achieved with the keywords `replicas` and `needs` in YAML/Python. Load-balancing is added automatically when necessary to ensure maximum throughput.
toy.yml:

jtype: Flow
with:
  port: 51000
  protocol: grpc
executors:
- uses: MyExec
  name: foo
  py_modules:
    - executor.py
  replicas: 2
- uses: MyExec
  name: bar
  py_modules:
    - executor.py
  replicas: 3
  needs: gateway
- needs: [foo, bar]
  name: baz

Flowchart: the gateway feeds foo (2 replicas) and bar (3 replicas) in parallel, and baz waits for both.
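To make the topology in this toy.yml concrete, here is a rough sketch in plain asyncio. It is not Jina's implementation: `make_executor` and `run_flow` are hypothetical helpers, and round-robin is assumed as the load-balancing policy purely for illustration, since the text does not state which policy Jina uses. `foo` and `bar` both read from the gateway and therefore run concurrently, while `baz` waits for both branches.

```python
import asyncio
import itertools


def make_executor(name, replicas=1):
    """Return an async callable that round-robins requests over its replicas."""
    next_replica = itertools.cycle(range(replicas))

    async def handle(text):
        replica = next(next_replica)  # pick a replica for this request
        await asyncio.sleep(0.01)     # simulate non-blocking work
        return f'{text}@{name}[{replica}]'

    return handle


async def run_flow(texts):
    foo = make_executor('foo', replicas=2)
    bar = make_executor('bar', replicas=3)

    async def one_request(text):
        # foo and bar both need the gateway, so they run concurrently.
        foo_out, bar_out = await asyncio.gather(foo(text), bar(text))
        # baz needs [foo, bar]: it merges the two branches.
        return f'baz({foo_out}, {bar_out})'

    # Requests do not wait for each other either.
    return await asyncio.gather(*(one_request(t) for t in texts))


results = asyncio.run(run_flow(['a', 'b', 'c']))
for line in results:
    print(line)
```

Each printed line shows which replica of `foo` and `bar` handled a given request before `baz` merged the two results.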
- You now have an API gateway that supports gRPC (default), Websockets, and HTTP protocols with TLS.
- The communication between clients and the API gateway is duplex.
- The API gateway allows you to route a request to a specific Executor while other parts of the Flow are still busy, via `.post(..., target_executor=...)`.
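The idea behind `target_executor` can be pictured with a toy model in plain Python. This is only a sketch under simplifying assumptions: `post` and the `executors` dict below are hypothetical and are not Jina's API, the Executors are modelled as a simple chain, matching is by exact name, and concurrency is left out entirely.

```python
def post(text, executors, target_executor=None):
    """Send `text` through the chain of executors.

    If `target_executor` is given, only the Executor with that name
    processes the request; the others are skipped.
    """
    for name, fn in executors.items():
        if target_executor is None or name == target_executor:
            text = fn(text)
    return text


# Hypothetical Executors that just tag the request with their name.
executors = {
    'foo': lambda t: t + ' foo',
    'bar': lambda t: t + ' bar',
}

full = post('req', executors)                             # visits foo, then bar
only_bar = post('req', executors, target_executor='bar')  # visits bar only
print(full)
print(only_bar)
```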
Without having to worry about dependencies, you can easily share your Executors with others; or use public/private Executors in your project thanks to Jina Hub.
To create an Executor:
jina hub new
To push it to Jina Hub:
jina hub push .
To use a Hub Executor in your Flow:
| | Docker container | Sandbox | Source |
|---|---|---|---|
| YAML | `uses: jinahub+docker://MyExecutor` | `uses: jinahub+sandbox://MyExecutor` | `uses: jinahub://MyExecutor` |
| Python | `.add(uses='jinahub+docker://MyExecutor')` | `.add(uses='jinahub+sandbox://MyExecutor')` | `.add(uses='jinahub://MyExecutor')` |
Behind this smooth experience is advanced management of Executors:
- Automated builds on the cloud;
- Cost-efficient storage, deployment, and delivery of Executors;
- Automatic resolution of version conflicts and dependencies;
- Instant delivery of any Executor via Sandbox, without pulling anything to your local machine.
Using Kubernetes becomes easy:
jina export kubernetes flow.yml ./my-k8s
kubectl apply -R -f my-k8s
Using Docker Compose becomes easy:
jina export docker-compose flow.yml docker-compose.yml
docker-compose up
Using Prometheus becomes easy:
from jina import Executor, requests, DocumentArray


class MyExec(Executor):
    @requests
    def encode(self, docs: DocumentArray, **kwargs):
        # preprocessing() and model_inference() are placeholders for your own functions.
        with self.monitor('preprocessing_seconds', 'Time preprocessing the requests'):
            docs.tensors = preprocessing(docs)
        with self.monitor(
            'model_inference_seconds', 'Time doing inference on the requests'
        ):
            # DocumentArray exposes the plural `embeddings` attribute.
            docs.embeddings = model_inference(docs.tensors)
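To try the same timing pattern without a Prometheus setup, a standard-library sketch could look like this. The `monitor` helper and the `timings` dict here are hypothetical and are not Jina's `self.monitor`: they only record durations in memory rather than exporting metrics, and the "model" is a trivial stand-in.

```python
import time
from contextlib import contextmanager

timings = {}  # metric name -> list of observed durations in seconds


@contextmanager
def monitor(name):
    """Record how long the wrapped block takes under `name`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings.setdefault(name, []).append(time.perf_counter() - start)


with monitor('preprocessing_seconds'):
    cleaned = [s.strip().lower() for s in ['  Hello ', 'World  ']]

with monitor('model_inference_seconds'):
    lengths = [len(s) for s in cleaned]  # stand-in for a real model

for name, durations in timings.items():
    print(name, len(durations))
```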
Using Grafana becomes easy: just download this JSON and import it into Grafana.
What cloud-native technology is still challenging for you? Tell us, and we will handle the complexity and make it easy for you.
- Join our Slack community and chat with other community members about ideas.
- Join our Engineering All Hands meet-up to discuss your use case and learn about Jina's new features.
  - When? The second Tuesday of every month
  - Where? Zoom (see our public events calendar/.ical) and live stream on YouTube
- Subscribe to the latest video tutorials on our YouTube channel
Jina is backed by Jina AI and licensed under Apache-2.0. We are actively hiring AI engineers and solution engineers to build the next neural search ecosystem in open source.