Use PIL Image internally for the Multimodal Agent #1124

BeibinLi · 2024-01-03T00:45:52Z

Why are these changes needed?

As many people have observed, storing a base64 string in _oai_messages makes debugging difficult: the string is so long that it floods the terminal or output log (e.g., #1087). This PR therefore stores a PIL image inside _oai_messages and converts it to base64 only right before calling the OpenAI client.
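
To illustrate the idea, here is a minimal sketch of the "store the image object, encode late" pattern. It is not the code from this PR: the helper names `image_to_data_uri` and `messages_for_api` are hypothetical, and the message layout (a `content` list with `image_url` items) is an assumption modeled on OpenAI-style vision messages. The encoder only relies on the object having a PIL-style `save(fp, format=...)` method.

```python
import base64
from io import BytesIO


def image_to_data_uri(image, fmt="PNG"):
    """Encode a PIL-style image as a base64 data URI.

    `image` is assumed to be a PIL.Image.Image; only its
    save(fp, format=...) method is used here.
    """
    buffer = BytesIO()
    image.save(buffer, format=fmt)
    encoded = base64.b64encode(buffer.getvalue()).decode("utf-8")
    return f"data:image/{fmt.lower()};base64,{encoded}"


def messages_for_api(messages):
    """Return a copy of `messages` with image objects replaced by data URIs.

    The stored history is not modified, so printing or logging it
    stays short. (Hypothetical helper, assumed message layout.)
    """
    converted = []
    for message in messages:
        content = message.get("content")
        if not isinstance(content, list):
            # Plain-text message: nothing to convert.
            converted.append(dict(message))
            continue
        new_content = []
        for item in content:
            if isinstance(item, dict) and item.get("type") == "image_url":
                url = item["image_url"]["url"]
                if not isinstance(url, str):
                    # Anything that is not already a string is treated as an image.
                    url = image_to_data_uri(url)
                new_content.append({"type": "image_url", "image_url": {"url": url}})
            else:
                new_content.append(item)
        converted.append({**message, "content": new_content})
    return converted
```

With Pillow installed, an object such as `Image.new("RGB", (2, 2))` could be stored in the `url` field, and `messages_for_api(history)` would be called just before the request is sent. The stored history keeps the compact image object, so its printed form stays readable.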

Related issue number

Checks

I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
I've added tests (if relevant) corresponding to the changes introduced in this PR.
I've made sure all auto checks have passed.

codecov-commenter · 2024-01-03T00:49:43Z

Codecov Report

Attention: 13 lines in your changes are missing coverage. Please review.

Comparison is base (9708058) 39.46% compared to head (d0e49b9) 50.99%.

Files	Patch %	Lines
.../agentchat/contrib/multimodal_conversable_agent.py	59.09%	8 Missing and 1 partial ⚠️
autogen/agentchat/contrib/img_utils.py	93.18%	1 Missing and 2 partials ⚠️
autogen/agentchat/contrib/llava_agent.py	0.00%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #1124       +/-   ##
===========================================
+ Coverage   39.46%   50.99%   +11.52%     
===========================================
  Files          57       57               
  Lines        6020     6073       +53     
  Branches     1346     1478      +132     
===========================================
+ Hits         2376     3097      +721     
+ Misses       3449     2726      -723     
- Partials      195      250       +55

Flag	Coverage Δ
unittests	`50.91% <80.59%> (+11.44%)`	⬆️

Flags with carried forward coverage won't be shown.

* Change default model for `lmm`
* Try to use PIL image for LMM's _oai_messages
* Update test cases and llava
* Remove redundant files
* Update the imports for lmm tests
* Test case fix
* Docstring update
* LMM notebook lint
* Typo correction for img_utils and its test
* Update test_llava.py

debug, reformat

---------

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Shaokun Zhang <shaokunzhang529@gmail.com>
Co-authored-by: Shaokun Zhang <shaokun.zhang@psu.edu>

BeibinLi added 5 commits December 19, 2023 11:09

Change default model for lmm

6e685a3

Merge branch 'main' of https://github.com/microsoft/autogen into lmm

0442baa

Merge branch 'main' of https://github.com/microsoft/autogen into lmm

3cff368

Try to use PIL image for LMM's _oai_messages

70e982b

Update test cases and llava

791c4e0

BeibinLi had a problem deploying to openai1 January 3, 2024 00:45 — with GitHub Actions Failure

BeibinLi had a problem deploying to openai1 January 3, 2024 00:46 — with GitHub Actions Failure

Remove redundant files

eddcb33

BeibinLi had a problem deploying to openai1 January 3, 2024 00:51 — with GitHub Actions Failure

Update the imports for lmm tests

ff7da0b

BeibinLi had a problem deploying to openai1 January 3, 2024 00:53 — with GitHub Actions Failure

skzhang1 had a problem deploying to openai1 February 18, 2024 00:03 — with GitHub Actions Failure

Update test_llava.py

d0e49b9

debug, reformat

skzhang1 had a problem deploying to openai1 February 18, 2024 14:52 — with GitHub Actions Failure

sonichi enabled auto-merge February 18, 2024 15:07

sonichi requested a review from ekzhu February 18, 2024 15:08

sonichi approved these changes Feb 18, 2024

sonichi added this pull request to the merge queue Feb 18, 2024

Merged via the queue into microsoft:main with commit 9de374a Feb 18, 2024
46 of 57 checks passed

sonichi deleted the lmm branch February 18, 2024 16:06

whiskyboy pushed a commit to whiskyboy/autogen that referenced this pull request Apr 17, 2024

fix mathproxyagent bug (microsoft#1124)

ad25b60
