
added text-generation-webui inference support #221

Open
wants to merge 3 commits into base: main

Conversation

danikhan632

Adds support for running a "Local LLM" using text-generation-webui, which has built-in multi-GPU support, consistently supports new models shortly after release, provides mechanisms for using larger models (such as memory and disk offloading), and could enable guidance to run on systems with fewer resources in an easy-to-set-up way.

Note: this is still a WIP but is under active development. It is a separate class and will not break any other functionality of guidance.

Corresponds with this PR:
oobabooga/text-generation-webui#2637

@danikhan632
Author

@microsoft-github-policy-service agree

@danikhan632
Author

oobabooga/text-generation-webui#2709

updated

@slundberg
Collaborator

Thanks! Two questions:

  1. Could you add a basic unit test file (it is fine if it only runs locally when a webui server is available).
  2. Do you have an insight into whether acceleration and token healing are possible with the webui API? (I have not looked deeply yet)

@danikhan632
Author

I can add unit tests ASAP. Token healing does work, but acceleration does not; I can implement that, along with the tests, by tomorrow.

@danikhan632
Author

Updated unit tests and improved functionality have been added.

@jediknight813

@slundberg is this going to be merged?

@danikhan632
Author

> @slundberg is this going to be merged?

I'd like to get feedback on any concerns with the pull request, but I haven't heard much. Maybe I should open a new one?

@chrisle

chrisle commented Jul 4, 2023

Thank you for submitting this. I was thinking of writing it myself but wanted to check if someone already sent a PR before I did.

I have one question and one piece of feedback about this PR.


TGWUI's GET /api/v1/model response is different from what's expected?

Steps to recreate:

  1. Launch text-generation-webui like this: python server.py --api --model=name/model --loader=ExLlama
  2. Use this PR like this: guidance.llm = guidance.llms.TGWUI('http://localhost:5000', chat_mode=False)

Results in this error

  File "/Users/chrisle/code/kbot/.venv/lib/python3.9/site-packages/guidance/llms/_tgwui.py", line 34, in getModelInfo
    resp=response.json()["results"]
KeyError: 'results'

Not sure if it was me or if I'm using some different version of the WebUI's API, but I wasn't able to use this PR. (I checked that I was using the most current pull from the text-generation-webui main branch.)

Tracking this down, I found this in this PR: source code

def getModelInfo(self):        
    response = requests.get(self.base_url+'/api/v1/model')
    # Expect response to be { "results": <any> }
    resp=response.json()["results"] # Error: KeyError: 'results'
    return resp

# ....

# Called in `__init__`
self.model_info= self.getModelInfo()
# Expect response to be { "results": { "model_name": "name/model" }}
self.model_name = self.model_info["model_name"]  # This will also fail. See below.

Compare to the API's route: source code

# See: https://github.com/oobabooga/text-generation-webui/blob/333075e726e8216ed35c6dd17a479c44fb4d8691/extensions/api/blocking_api.py#L28

def do_GET(self):
    if self.path == '/api/v1/model':
        self.send_response(200)
        self.end_headers()
        response = json.dumps({
            'result': shared.model_name # <= Returning { "result": "name/model" }
        })
    # ...

Am I somehow using a different web API than what others are using?
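To make the mismatch concrete, here is a minimal, runnable comparison of the two response shapes (shapes are taken from the snippets above; the values are illustrative):

```python
# Shape the PR's getModelInfo expects vs. what the blocking API actually
# returns for GET /api/v1/model (illustrative values).
expected_by_pr = {"results": {"model_name": "name/model"}}
actual_from_api = {"result": "name/model"}

# The PR reads ["results"]["model_name"]; the API only provides ["result"].
assert "results" not in actual_from_api
assert actual_from_api["result"] == "name/model"
```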


Feedback: Consider adding some asserts and exception handling around HTTP requests.

You can't always trust that an HTTP request will succeed: lost connections, timeouts, or, as in this case, a malformed response from the API.

Consider adding some asserts or try/excepts in case things go wrong. This would produce helpful error messages when debugging failures in the API itself.

Example:

# Module-level imports assumed by this snippet:
import logging
import requests

def getModelInfo(self):
    data = self._make_request(self.base_url + '/api/v1/model')
    assert 'result' in data, f"Expected /api/v1/model to respond with a result. Got: {data}"
    return data['result']

def _make_request(self, uri: str):
    """Make a GET request to TGWUI.

    Args:
        uri: URI to fetch.

    Returns:
        Data as response.json()
    """
    try:
        response = requests.get(uri)
        return response.json()
    except requests.exceptions.RequestException as e:
        logging.critical(f'TGWUI API request failed: {e}')
        raise
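On the unit-testing question from earlier in the thread: one way to make this lookup testable without a live webui server is to inject the HTTP call rather than calling it directly. This is only a sketch; `fetch_json`, `get_model_info`, and `test_get_model_info` are hypothetical names, not part of this PR:

```python
def get_model_info(fetch_json, base_url):
    """Return the loaded model name reported by TGWUI's /api/v1/model.

    fetch_json is any callable that takes a URI and returns parsed JSON,
    so tests can pass a stub instead of a real HTTP client.
    """
    data = fetch_json(base_url + '/api/v1/model')
    # The blocking API responds with {"result": "<model name>"}
    assert 'result' in data, f"Expected a 'result' key from /api/v1/model. Got: {data}"
    return data['result']


def test_get_model_info():
    # No server needed: stub the fetcher with a canned response.
    fake_fetch = lambda uri: {'result': 'name/model'}
    assert get_model_info(fake_fetch, 'http://localhost:5000') == 'name/model'
```

The same pattern lets a test assert that a missing `'result'` key raises a clear error instead of a bare KeyError.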

@danikhan632
Author

Thanks for the feedback. I should have put guards on the API routes, and I should pull the latest changes from TGWUI.
I am a little confused about getting initial feedback from the maintainers and then nothing after I updated.
I'm honestly thinking about closing this PR, revising the code, and then opening a new PR. Does this sound like a good plan?

@chrisle

chrisle commented Jul 5, 2023

Totally up to you. It's your PR :)

Personally? I'd keep it. A little extra time to button things up and you'll have a solid PR.

Just a passing thought on PRs and nothing to do with this PR in particular:

A quick Google search suggests the owner is a Senior Researcher at MS, which probably means he's stuck in lots of boring "impact assessment committee meetings" and unfortunately doesn't have enough time for the full-on code reviews he'd like to do. People can rest easy when they glance at the code and see that the contributor has handled exceptions and edge cases, the code is well documented, the style looks clean, and, bonus points, there is test coverage. It looks rock solid and ready to ship.

@danikhan632
Author

Sounds good. I will add further unit tests, documentation, error handling, etc. I might open a new PR just because I've changed so much code between commits. Expect an update very soon.

@jordan-barrett-jm

Will this be merged, or does functionality now exist within Guidance to support text-generation-webui or external LLMs via API requests?

@danikhan632
Author

> Will this be merged, or does functionality now exist within Guidance to support text-generation-webui or external LLMs via API requests?

I plan on pushing these changes to litellm
