-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added text-generation-webui infrence support #221
base: main
Are you sure you want to change the base?
Conversation
@microsoft-github-policy-service agree |
Thanks! Two questions:
|
I can add unit testing ASAP and token healing does work but not acceleration however I can implement that and testing by tommorow |
Update unit tests and improved functionality were added |
@slundberg is this going to be merged? |
I'd like to get feedback regarding any concerns with the pull request but haven't heard much, maybe should open new one? |
Thank you for submitting this. I was thinking of writing it myself but wanted to check if someone already sent a PR before I did. I have 1 question and 1 feedback about this PR. TGWUI's GET /api/v1/model response is different than expected?Steps to recreate:
Results in this error
Not sure if it was me or if i'm using some different version of the WebUI's API but I wasn't able to use this PR. (I checked that i was using the most current pull from the text-generation-webui main branch). Tracking this down I'm finding this in this PR: source code def getModelInfo(self):
response = requests.get(self.base_url+'/api/v1/model')
# Expect response to be { "results": <any> }
resp=response.json()["results"] # Error: KeyError: 'results'
return resp
# ....
# Called in `__init__`
self.model_info= self.getModelInfo()
# Expect response to be { "results": { "model_name": "name/model" }}
self.model_name = self.model_info["model_name"] }} # This will also fail. See below. Compare to the API's route: source code # See: https://github.com/oobabooga/text-generation-webui/blob/333075e726e8216ed35c6dd17a479c44fb4d8691/extensions/api/blocking_api.py#L28
def do_GET(self):
if self.path == '/api/v1/model':
self.send_response(200)
self.end_headers()
response = json.dumps({
'result': shared.model_name # <= Returning { "result": "name/model" }
})
# ... Am I somehow using a different web API than what others are using? Feedback: Consider adding some asserts and exception handling around HTTP requests.You can't always trust an HTTP request will happen. Lost connection, timeouts, or in this case, a malformed response from the API. Consider adding some asserts or try/catches in case things go wrong. This would help with getting helpful error messages when debugging when the API itself fails. Example: def getModelInfo(self):
data = self._make_request(self.base_url + '/api/v1/model')
assert 'result' in data, f"Expected /api/v1/model to respond with a result. Got: {data}"
return data['result']
def _make_request(self, uri: str):
"""Make a GET request to TGWUI.
Args:
uri: URI to fetch.
Returns:
Data as response.json()
"""
try:
response = requests.get(uri)
return response.json()
except requests.exceptions.RequestException as e:
logging.critical(f'TWGUI API request failed: {e}')
raise |
Thanks for the feedback, I should have put guards on API routes and I should pull the latest changes for TGWUI. |
Totally up to you. It's your PR :) Personally? I'd keep it. A little extra time to button things up and you have a solid PR. Just a passing thought on PRs and nothing to do with this PR in particular: |
Sounds good, will add further units tests, documention, error handling, etc. I might want to open a new pr just bc I've changed so much code between commits expect to see an update very soon. |
Will this be merged or is there existing functionality that now exists within Guidance to support text-generation-web-ui or using external LLMs via API requests? |
I plan on pushing these changes to litellm |
Adds support to run "Local LLM" using text-generation-webui which has built-in Multi-GPU support, consistently supports new models shortly after release, provide mechanisms to use larger models such as memory and disk offloading and could enable to gudiance to run on systems with less resources in an easy to setup approach.
Note this is still currently WIP however is under active development, this is a seperate class and will not break any other functionalities of guidance
corresponds with this pr
oobabooga/text-generation-webui#2637