Professionalized server implementation #768

blythed · 2023-08-25T07:27:51Z

blythed
Aug 25, 2023
Maintainer

I'm starting this discussion by posting a discussion which happened in slack for completeness. The aim is to gather ideas around how to proceed with plans for a server (-client) implementation using FastAPI and potentially asyncio.

@nenb wrote:

Consider the following 2 users:

User 1 - Python user

This user can already install superduperdb on their client.
This means that the user can already achieve all the operations that our API offers if they are given access to the database.
It would make more sense to me to work on a solution that provides the client with restricted access to the database (this involves only a single de/serialisation to the database) rather than requiring everything to be de/serialised twice via the server acting as a 'person-in-the-middle' (slower, especially if we are talking about bigger binary blobs, and also means we are implementing the same functionality in two different locations).
Note: I can certainly imagine a user case for a server where we are executing compute-intensive operations that the client can't handle. But this would require us to think a little bit more about what these endpoints would look like, and/or whether we should just be communicating with the dask server directly.

User 2 - Non-Python user

This user has none of the abstractions that are available in superduperdb.
The current API implementation would mean very little to them as it depends explicitly on our Python abstractions eg there is no meaning to a Component in Javascript world (someone would need to define this), and it is not clear how exactly Javascript objects would be added to the database.
Note: The server absolutely has a use-case as an interface to other languages. But I think we need to think a little bit more about what these endpoints would look like, and make sure they are not dependent on our (Python-specific) abstractions.

Rather than jumping into the technical details straight away, I would like to understand a little more about the vision of what we are trying to offer. What are we trying to enable for i) Python users with a server and for ii) non-Python users in general ie if I am a Javascript user, why should I consider superduperdb?

I don't know the best way to do this, but I am very open to suggestions. I really do think a little time reflecting on these endpoints will save us a lot of time when it comes to implementation etc.

blythed · 2023-08-25T07:32:10Z

blythed
Aug 25, 2023
Maintainer Author

I see your point about not necessarily simply exposing all functionality in the server.

The suggestion was actually my starting point -- I originally only had 2 endpoints: prediction and vector-search. I'm open to pivoting back to that.

The point here is that the user might want to do everything with the system which they can do if the system were deployed locally.
In practice we might not want to expose the database, dask cluster, etc.. to the world. So having a single point of entry might simplify things.

Putting such concerns aside, it is true that the only endpoint likely to be called from outside Python are exactly prediction and vector-search. I would also like to add support for adding models in a .yaml format. Restricting scope to only these 3 things will indeed greatly simplify matters.

If, theoretically, the model of mapping everything to pydantic classes, were to be possible. Then it is at least open to us to decide exactly where to draw the line. However, if this is not possible, then exposing only a limited set of functionality is appealing.

0 replies

nenb · 2023-08-25T07:58:58Z

nenb
Aug 25, 2023

The point here is that the user might want to do everything with the system which they can do if the system were deployed locally.

(With the current API)
The user can already do everything with the system which they can do if the system were deployed locally if they are given access to the database. At the moment we are using the server as a simple proxy for the database, but we are also sending all data via the server as well (slow, duplication of code, memory problems always a concern). If we are only concerned about authentication, we should restructure the server to focus only on this, and not have it acting as a proxy for the data as well.

But rather than focusing on the details, I think we should instead focus on use-cases first. As I mentioned in my first post, I am interested in two questions:

I am a Python user - why do I need a server? (Answers should outline use-cases rather than technical details.)
I am a non-Python user - what functionality can be exposed by our API that is so useful that I will consider using superduperdb? (Answers should outline use-cases rather than technical details.)

You touched on this a little with 'prediction' and 'vector-search'. Can you say more about these and other use-cases that you think will be absolutely crucial to superduperdb? I think a laser-focus on the domain (AI) and what it is we want to enable for our users is crucial. Hopefully the implementation details will follow somewhat naturally after we have this.

0 replies

rec · 2023-08-25T12:08:35Z

rec
Aug 25, 2023

It would make more sense to me to work on a solution that provides the client with restricted access to the database (this involves only a single de/serialisation to the database) rather than requiring everything to be de/serialised twice via the server acting as a 'person-in-the-middle' (slower, especially if we are talking about bigger binary blobs, and also means we are implementing the same functionality in two different locations).

The plan is not to actually send back every binary blob.

Large items, Artifacts, get replaced by a token into a store representing large items in memory or disk.

You could then use that token to request further operations on the results of that blob without ever downloading it.

If the final result were, say, some sort of summary or report, then there wouldn't be any significant traffic at all.

Let's jump back a level though.

In order to write a proper server, we need to make every single query and response JSONizable, through a typed "schema" too. On the other hand, if we had that perfect JSONization for all responses and queries, then writing a server would be more of a medium chore than a huge task.

So when looking at the benefits, we need to include the benefits of a completely typed specification of a computation.

The first one, which we have not yet considered, is auditability, repeatability and reuse. For a professional result, you can't just display your work, you need the provenance of the data, and you need other people to be able to replicate your work if necessary.

If all of our interactions go through an entirely JSONizable interface, we can store an "executable" journal of all the operations we perform with the results.

If files were immutable, and you (the user) were careful to preserve the exact installation of all the Python packages you had, and there was no dependency on unseeded randomness, or seeded randomness with race conditions between threads, or a dependency on the current time of course, or a bunch of other things didn't go wrong, you could even perhaps guarantee bit-for-bit reproducibility of certain artifacts.

Auditability is much easier target to hit - immutable files and the journal of request/responses hits it perfectly.

And the journal could be used in some TBD way to say, "Now do that operation on those things!"

A secondary benefit is that it would be far easier to covert our program to run on multiple processes if all our commands were JSONizable. We wouldn't even have to have a REST server, we could just use python's multiprocessing and process-safe queues. Large artifacts could very easily be stored in shared memory/memory mapped files.

This has two advantages: it allows someone to use all the cores on their machine, and also, our central program can be independent of any of the subprocesses and recover even if one of them SEGVs.

0 replies

nenb · 2023-08-25T18:23:27Z

nenb
Aug 25, 2023

Let's jump back a level though.
... In order to write a proper server, we need to make every single query and response JSONizable, through a typed "schema" too. On the other hand, if we had that perfect JSONization for all responses and queries, then writing a server would be more of a medium chore than a huge task.

I think it's too early to jump into the implementation details. As I said in my last posts, first I think we need to address two questions:

I am a Python user - why do I need a server? (Answers should outline use-cases rather than technical details.)
I am a non-Python user - what functionality can be exposed by our API that is so useful that I will consider using superduperdb? (Answers should outline use-cases rather than technical details.)

You hinted at this several times in your post:

You could then use that token to request further operations on the results of that blob without ever downloading it.

and

we can store an "executable" journal of all the operations we perform with the results

But I think we need to really spell out these issues by answering the two questions above. Once we have understood our use-cases then we can jump into the technical details.

A secondary benefit is that it would be far easier to covert our program to run on multiple processes if all our commands were JSONizable. We wouldn't even have to have a REST server, we could just use python's multiprocessing and process-safe queues. Large artifacts could very easily be stored in shared memory/memory mapped files.

This is exactly what dask and/or ray offer (and they have many years more experience and edge-case testing, including many of the isseus we have discussed around serialization).

It's so important that we clarify what our use-cases/needs for the server are before we jump into the implementation details so that we avoid re-implementing tooling.

I would really like us to address the two questions above as clearly and thoroughly as possible, and then we can use the subsequent answers to guide us for the technical details.

0 replies

blythed · 2023-08-28T08:47:46Z

blythed
Aug 28, 2023
Maintainer Author

I am Python user - why do I need a server?

If the dask cluster and artifact store etc. are behind a private network, and there are concerns about exposing:

database
dask cluster
artifact store
metadata store

then having a single hardened endpoint will provide us with more control. We could then perform the operations which we perform locally (without a server) from the remote.

I am a non-Python or Python user

It will be easy to send requests to this server to "query" the data, without having everything installed.
We will be able to provide very thin client libraries in python or other languages, esp. Javascript, to do things such as:

perform vector-search
fetch model outputs
get information about workflows, models and model-versions
manage/ update the models in the artifact stores and meta data stores
update models directly as blobs in conjunction with metadata, without ever having to invoke the python library

I am an infrastructure person

I want a clear record of traffic to the system
I want fine-grained control over users/ access control etc.
I want to host the heavy lifting components in a private VPC, only exposing 1 endpoint to the outside world

0 replies

blythed · 2023-08-30T14:21:14Z

blythed
Aug 30, 2023
Maintainer Author

Proceeds of a slack huddle between @nenb and @blythed.

A key application in e-Commerce:

The browser sends a search, e.g. "red hat with dots and stripes" to a server, which returns a list of results (product ids/ urls)
As a minimum we would like to support this functionality but with search replaced by any db.execute query or db.predict call. Examples:
- Send image bytes as search
- Ask question instead of search ("summarize the types of products for weddings please")
- etc.

Exactly how this is done, TBD.

0 replies

blythed · 2023-08-31T09:08:00Z

blythed
Aug 31, 2023
Maintainer Author

Proceeds of an internal meeting including @nenb @rec @thejumpman2323 and @thgnw.

We are interested in reducing scope. 1 possible approach to doing this, is to suggest that all endpoints, if to be implemented at all, should be pure JSON.

For endpoints such as /add, we could restrict arguments to e.g. container.model.Model to be encoded as JSON.
E.g. instead of sending sklearn.svm.SVC(C=0.1) we would send a JSON body containing {"cls": "sklearn.svm.SVC", "params": {"C": 0.1}}. This would have the added benefit, that models would be more easily specifiable from e.g. javascript.

0 replies

nenb · 2023-08-31T09:48:11Z

nenb
Aug 31, 2023

We are interested in reducing scope. 1 possible approach to doing this, is to suggest that all endpoints, if to be implemented at all, should be pure JSON.

@blythed I know I am getting really annoying here, but I really think that the first thing to do is to decide on the use-cases and the endpoints. We should first decide on what are the most important things that these endpoints need to do (eg a songs endpoint that provides users with songs that they might like, a repos endpoint that provides users with all the relevant metadata for their repo).

What are the key endpoints that we would like to provide, and what do we unlock by creating these endpoints?

Look at the OpenAI API as an example. They have an Embeddings endpoint whose use-case is to create vector representations (I know you know this, just an illustration). They have an Audio endpoint whose use-case is to transcribe audio into the input language. Etc.

What are the endpoints that we should be offering? Perhaps it's also helpful to think in terms of nouns, rather than verbs.

For endpoints such as /add, we could restrict arguments to e.g. container.model.Model to be encoded as JSON.
E.g. instead of sending sklearn.svm.SVC(C=0.1) we would send a JSON body containing {"cls": "sklearn.svm.SVC", "params": {"C": 0.1}}. This would have the added benefit, that models would be more easily specifiable from e.g. javascript.

This! More of this please! We need a few more details from a user/AI domain knowledge perspective about why we need an /add (or perhaps /model) endpoint, and also whether this is the most important endpoint that we should offer, or are there even more important endpoints that we should begin with? But this is absolutely the direction I want us to start with.

0 replies

blythed · 2023-08-31T15:59:14Z

blythed
Aug 31, 2023
Maintainer Author

Use cases:

Execute a SuperDuperDB query from any programming language, or directly from http://; e.g. e-Commerce, semantic search, RAG apps.
From a user-interface:
- View all SuperDuperDB models and vector-indexes (think team-wide/ company wide oversight, usability)
- Add a SuperDuperDB model (e.g. choose from dropdown menu listing all transformers) (use-case: easier insight into what's possible, consider tagging available models with description of model)
- Trigger a SuperDuperDB training (hobby developers, looking to fine-tune models for their use-cases)
- Remove a model (admin, usability)

Proposed endpoints

From this the endpoints I think we should have are:

/execute
/add
/show
/predict
/fit
/remove

0 replies

rec · 2023-09-01T10:14:16Z

rec
Sep 1, 2023

I feel obliged to remind:

REST is "Representational State Transfer".

URIs are "Universal Resource Identifiers".

An endpoint is a URI, which is intended to be a resource - a noun.

Now I've discharged my duties, I have no problem with this at all. :-D

Except that now I have used this system for a while, I'm actually strongly against the execute endpoint - big turnaround for people who are fans of this show! - because it adds nothing, because of its weak typing it has extra modes of failure, and a horribly complex signature which has to know about all the other endpoints to be correctly defined.

I think each endpoint should be completely independent of each other endpoint. This means for example, it's easy to deploy experimental servers with new or changed endpoints, or even to turn an endpoint on and off dynamically if we cared to.

But execute is this awful chimera of all other endpoints, glued together with type casts, and it doesn't allow you to do anything new.

It should be abolished.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Professionalized server implementation #768

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 10 comments

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

Select a reply

Professionalized server implementation #768

blythed Aug 25, 2023 Maintainer

Replies: 10 comments

blythed Aug 25, 2023 Maintainer Author

nenb Aug 25, 2023

rec Aug 25, 2023

nenb Aug 25, 2023

blythed Aug 28, 2023 Maintainer Author

blythed Aug 30, 2023 Maintainer Author

blythed Aug 31, 2023 Maintainer Author

nenb Aug 31, 2023

blythed Aug 31, 2023 Maintainer Author

rec Sep 1, 2023

blythed
Aug 25, 2023
Maintainer

blythed
Aug 25, 2023
Maintainer Author

nenb
Aug 25, 2023

rec
Aug 25, 2023

nenb
Aug 25, 2023

blythed
Aug 28, 2023
Maintainer Author

blythed
Aug 30, 2023
Maintainer Author

blythed
Aug 31, 2023
Maintainer Author

nenb
Aug 31, 2023

blythed
Aug 31, 2023
Maintainer Author

rec
Sep 1, 2023