feat: Support for Microsoft azure endpoints #67
Conversation
@64bit Is testing the only thing blocking this change?
Hi @deipkumar, that's correct: testing is the blocker for the changes in this PR to be published to crates.io.
I'm getting the following when I run:

```
(base) ➜ examples git:(microsoft-azure-endpoints) ✗ ./target/debug/azure-openai-service
Error: JSONDeserialize(Error("missing field `type`", line: 1, column: 302))
```
That was because I ran with a gpt-3.5-turbo deployment, which failed on:

```
(base) ➜ examples git:(microsoft-azure-endpoints) ✗ ./target/debug/azure-openai-service
. Is it Attenborough? Is that what he says? So, okay. Many years ago, in a universe that was shaped by a great war between the Decepticons and the Autobots, an age of strife that raged across the galaxy. In such a place, many tales and legends would be born. This is one such tale. On a distant, barren moon, Optimus Prime and his trusted lieutenant, Bumblebee, were patrolin' a Autobot science facility. Situated in the remote regions of space, (laughs) situated in the remote regions of space, the facility was tasked with researching new means of combating the Decepticons. To help tip the scales of victory in the upcoming battle for dominance. Under their care, the scientists and engineers of the facility would tinker and work long hours in preparation for when the moment was right to reveal their secrets. But one moonlit night, their hard work was radically interrupted. (clears throat) By a massive crashed spaceship that fell suddenly from the sky. (crashing) Oh no, Optimus, cried Bumblebee. We have to help those survivors. The two of them raced to the crash site of the craft amongst the debris and shattered circuits were a group of unfamiliar robots, never before seen by the Autobots. "Are you Decepticons?" Optimus growled. The robots stirred and the first to speak was obviously a leader in his own right, something that Optimus Prime could easily recognize. "No, we are not Decepticons. "We are the Primals, guardians of the Thirteenth Cybertronian Colony Fleet. "We have escaped the destruction caused by our eternal enemy, the Unicron. "We were returning to our home when one of our ships malfunctioned "and the unfortunate events that you have seen before you happened." (laughs) Optimus considered his response thoughtfully. He knew well the chance that they were Decepticons, but something in their eyes made the Prime believe that they were genuine. "All right, very well. "You may stay here and recover." "We'll help you repair your ship and get you on your way "in the nearest stellar body." And so within days,
stream failed: Stream ended
after the crash, the Primals rejoined their fleet and the Autobots could only believe what they had been told and continued their work until the next night raid on the Decepticon stronghold. And that my good friends is a brief but amazing bedtime story about space robots (.
We're gonna do this for you, Champ. Mr. Attenborough's over there. We're gonna get in here for you. Let's rock.
And the Transformers slept soundly that night. His own bot cried out to one of the moons, and Bumblebee shifted a little in his sleep. Optimus Prime, despite everything, slept with his eyes a little open, always vigilant, since the threat was always present, no matter how peaceful the night seemed to be. But even that weighed light in the face of the quiet beauty of the moon's glow softly lighting the Cybertronian landscape from above. And that's the end of that.
That was excellent.
I love it.
Love it.
I love Sir David Attenborough's voice. It's like the same voice you can hear for years and years and never get tired of it.
So amazing.
Good job, yo. That was great.
So as a newcomer to the Transformers franchise, can you tell us a little bit about how you came to be involved in Transformers: War for Cybertron trilogy?
Yeah, absolutely. First and foremost, I'm really lucky that I have an extensive background in voiceover and doing animation VO work and things like that. And I think it's one of the things that opened the door for you guys took a chance on me when you were casting for the role of Alita within the trilogy. And I think, to be honest, when I found out it was the Transformers universe you guys were working on, that was even more exciting for me to be involved. And so, you know, it gave me the opportunity to do a lot of research and to really deep dive into the different series, which was kind of daunting because there's so much out there, but also really exciting. And I think one of the things I really loved was researching where exactly in the timeline we were within the trilogy because it gives you a lot of information as to what the characters are thinking, how things are moving around. And then obviously, you know, pairing that with, you know, working with the scripts and the director and everyone else involved, it was a really cool experience to see it all come together.
That's amazing. So since you're new to the franchise, can you tell us a little bit about how much preparation and research went into your role as Alita?
stream failed: Stream ended
```
... and:

```
(base) ➜ examples git:(microsoft-azure-endpoints) ✗ ./target/debug/azure-openai-service
Response:
0: Role: assistant Content: A large language model is an artificial intelligence model trained on a large dataset of text to predict the likelihood of a particular word or phrase given its context. This is known as language modeling. To achieve this, the model learns the relationships between the words and phrases in the training data and uses this knowledge to generate coherent text.
The basic architecture of a large language model typically involves a neural network with many layers of nodes. The input to the network is a sequence of words, and the output is a probability distribution over the next word in the sequence.
During training, the network is presented with large amounts of text and learns to predict the probability of the next word in the sequence based on the previous words. This involves adjusting the weights of the network to minimize the difference between the predicted probability distribution and the actual distribution of the next word in the training data.
Once the model has been trained, it can be used to generate text by taking a sequence of words as input and using the network to predict the next word in the sequence. This process can be repeated to generate longer sequences of text.
Large language models have been used for a variety of natural language processing tasks including language translation, question answering, and text summarization, among others.
```

When I run with an embeddings deployment I get:

```
Error: ApiError(ApiError { message: "Too many inputs. The max number of inputs is 1. We hope to increase the number of inputs per request soon. Please contact us through an Azure support request at: https://go.microsoft.com/fwlink/?linkid=2213926 for further questions.", type: "invalid_request_error", param: None, code: None })
```

which is due to the fact that Azure only supports embedding one text chunk at a time. In general, the code is a bit confusing with Azure.
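A workaround for that limit is to embed one chunk per request. A minimal sketch, assuming this library's embeddings builder API (`CreateEmbeddingRequestArgs`, `client.embeddings().create`) and the `AzureConfig` client from this PR; the method names are assumptions, not verified API:

```rust
// Sketch: loop around Azure's one-input-per-request embedding limit
// by sending each chunk in its own request.
use async_openai::{
    config::AzureConfig, error::OpenAIError, types::CreateEmbeddingRequestArgs, Client,
};

async fn embed_one_at_a_time(
    client: &Client<AzureConfig>,
    chunks: &[String],
) -> Result<Vec<Vec<f32>>, OpenAIError> {
    let mut embeddings = Vec::with_capacity(chunks.len());
    for chunk in chunks {
        let request = CreateEmbeddingRequestArgs::default()
            // On Azure the deployment determines the model; the name here is illustrative.
            .model("text-embedding-ada-002")
            .input(chunk.as_str())
            .build()?;
        let response = client.embeddings().create(request).await?;
        embeddings.push(response.data[0].embedding.clone());
    }
    Ok(embeddings)
}
```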
Thank you for your help on this and for posting the output! Here are my observations:
Question 1: Azure does respond with

```
data: {"id":"cmpl-7I0LcSGtHaT5RwObOEA7zBxKoNPU1","object":"text_completion","created":1684525596,"model":"text-davinci-003","choices":[{"text":" new","index":0,"finish_reason":"length","logprobs":null}],"usage":null}

data: [DONE]
```

Using your Rust SDK (I added a debug print of the raw event) I get

```
data: "{\"id\":\"cmpl-7I0NowuFlhjJVD2V10KajaPMhPCkz\",\"object\":\"text_completion\",\"created\":1684525732,\"model\":\"text-davinci-003\",\"choices\":[{\"text\":\" heroes\",\"index\":0,\"finish_reason\":\"length\",\"logprobs\":null}],\"usage\":null}"
stream failed: Stream ended
```

so you should be able to look for the `data: [DONE]` event to terminate the stream cleanly.

Question 2: Deployment ID refers to a deployment of a specific model. Each Azure OpenAI resource/service is fixed, and you pick the deployment per request, as in the Python library:

```python
import openai

openai.api_type = "azure"
openai.api_base = "https://blahblah.openai.azure.com/"
openai.api_version = "2023-03-15-preview"
openai.api_key = "*********"

prompt = "You are a mathematician. "

# Streaming completion against the "gpt3" deployment of this resource
response = openai.Completion.create(
    engine="gpt3",
    prompt=prompt,
    temperature=0,
    max_tokens=20,
    stream=True,
)

# Streaming chat completion against the "gpt35" deployment of the same resource
response = openai.ChatCompletion.create(
    engine="gpt35",
    messages=[
        {"role": "system", "content": prompt},
        {"role": "user", "content": "Write a poem."},
    ],
    temperature=0,
    max_tokens=20,
    stream=True,
)
```
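For comparison, a rough sketch of the same per-deployment pattern with the `AzureConfig` type this PR introduces, creating one client per deployment (the builder method names here are assumptions):

```rust
// Hypothetical sketch: one async-openai client per Azure deployment.
// with_api_base / with_api_version / with_deployment_id / with_api_key
// are assumed builder names.
use async_openai::{config::AzureConfig, Client};

fn azure_clients() -> (Client<AzureConfig>, Client<AzureConfig>) {
    let make = |deployment: &str| {
        Client::with_config(
            AzureConfig::new()
                .with_api_base("https://blahblah.openai.azure.com")
                .with_api_version("2023-03-15-preview")
                .with_deployment_id(deployment)
                .with_api_key("*********"),
        )
    };
    // Mirrors the Python example: one deployment for completions, one for chat.
    (make("gpt3"), make("gpt35"))
}
```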
Question 1: It seems to be a bug in this library then; the stream should not end with an error (it should stop at `data: [DONE]`).

Question 2: `engine` seems to be deprecated, at least in OpenAI. Even in their updated Python example https://github.com/openai/openai-python#microsoft-azure-endpoints I see deployment and model explicitly provided. But semantically I see what you mean: one deployment is just one model. In that case a library user may want to call the API with different models/deployments. As of this PR, that means the user creates one client instance for each deployment. Or this library needs to expose method signatures that accept a deployment id, like the Python library. So this PR is not ready to be shipped without some non-trivial work; I was hoping it would work out of the box.
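To illustrate the fix being suggested for Question 1, a minimal self-contained sketch (not this library's actual internals): treat the `[DONE]` payload as a clean end of stream instead of an error.

```rust
// Sketch only: stop on the SSE `data: [DONE]` sentinel rather than
// treating the connection close as "stream failed: Stream ended".
// `payloads` stands in for whatever yields raw `data:` payload strings.
fn collect_stream_chunks<I>(payloads: I) -> Vec<String>
where
    I: IntoIterator<Item = String>,
{
    let mut chunks = Vec::new();
    for payload in payloads {
        if payload.trim() == "[DONE]" {
            break; // sentinel: the stream finished normally
        }
        chunks.push(payload);
    }
    chunks
}
```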
Do you think we can merge support for Azure embeddings and non-streaming completions now, and defer streaming support?
Hypothetically, say we do ship this: the major issue is that the library immediately suffers from reliability problems, where some of the exposed APIs don't work on every platform. The second major issue is that it already introduces a breaking change, and when Azure support is implemented 100% there could be further breaking changes. Neither is a desired effect for users of the library. It's better to pick reliability and quality over shipping a partially working library. That said, the feature implemented in this PR is still useful if needed, in which case you can pin the dependency to a Git commit in this branch:
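For example, a hypothetical `Cargo.toml` pin (the git URL, branch name, and commit hash below are placeholders to substitute with the real ones):

```toml
# Placeholder values: point these at the actual PR branch and commit.
[dependencies]
async-openai = { git = "https://github.com/64bit/async-openai", branch = "microsoft-azure-endpoints", rev = "<commit-sha>" }
```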
Hey @64bit, are you planning to hold off until there's full parity between both platforms? If there's anything missing I can help with, I'm happy to jump in. Otherwise I'll use the pinned dependency. (Thank you for this great library!)
Hi @Czechh, thank you - I'm happy to know it's useful. After the news today on function calls, it seems keeping parity may not be practical - the Azure API seems to lag the OpenAI API. So I'm wondering if I should ship this and make it clear in the documentation that the OpenAI API is going to be 100% working, while Azure may not be, simply because of Rust type differences and because Azure support relies fully on contributions from community members who have access to the Azure OpenAI service. That said, if you have cycles to contribute to this PR: it would help if the completions stream example can be made to work without getting the `stream failed: Stream ended` error. Lastly, knowing that #72 is being worked on (see #73), I'd hold off shipping this to avoid interrupting their work on supporting function calls. Do you think this is a reasonable plan for shipping it?
I think it's good to go. Azure endpoints aren't freely available, and I don't think they ever will be, so those users will need to be more savvy inherently. This is a change for the 1-5% of users. Right now, there isn't any other option besides writing your own client or forking this. The biggest issue I see is the fact that it had to be a breaking change, though I understand why the old interface couldn't be kept.
Thank you for your thoughtful feedback. It's better to have an option than to have none. Also agree that Azure users will inherently need to be more savvy. Based on the percentage of users affected, I'm convinced to update this PR and ship it.
Released this in v0.11.0. Thank you all for your contributions! Also recorded the bug identified in this PR in #74.
Tested this, and it works great! Thank you. I'll add it to https://github.com/getmetal/motorhead now. :)
New:

- `OpenAIConfig` and `AzureConfig`, to be used in `Client::with_config(config)`
- `Client::new()` defaults to a client with the default `OpenAIConfig`

Breaking change:

- `Client` no longer holds configuration itself; instead, platform-specific configuration is done through `OpenAIConfig` or `AzureConfig`

Examples:

- `azure-openai-service`

closes #32
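For illustration, a sketch of the migration this implies (the pre-0.11 shape in the comment is an assumption about the old API, and the builder names on `OpenAIConfig` are also assumed):

```rust
use async_openai::{config::OpenAIConfig, Client};

// Before v0.11.0 (assumed old shape): configuration hung off Client itself,
// e.g. Client::new().with_api_key("sk-...").

// From v0.11.0: configuration lives in a config type passed to the client.
fn make_client(api_key: &str) -> Client<OpenAIConfig> {
    Client::with_config(OpenAIConfig::new().with_api_key(api_key))
}
```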