This is a Fork from https://github.com/lachhabw/Image-Captioning-Extension-for-LM-Studio
This fork was created as an under-development and testing version, for the last releases check the original Repo
- Improved GUI
- Server Status: LMStudio server status Connected, Not connected
- Images count: count of the images in the selected folder
- If the user didn't choose the captioning folder the captions will be saved to the same images folder
- Play sound when finished captioning
- Change the server's check interval time
- Download the captioning.zip file from Releases
- Unzip the file.
- Navigate to the captioning folder and run the EXE file.
- Open a terminal and navigate to your desired location.
- Clone the repository:
git clone https://github.com/MMoneer/Image-Captioning-Extension-for-LM-Studio.git
- Run the installation script for your platform:
- Windows:
install_win.bat
- Linux and Mac:
install_Linux_Mac.sh
- Windows:
- Clone the repository:
git clone https://github.com/MMoneer/Image-Captioning-Extension-for-LM-Studio.git
- Create a new Python environment and activate it:
python -m venv myenv
myenv\Scripts\activate
(on Windows) orsource myenv/bin/activate
(on Linux and Mac)
- Navigate to the
Image-Captioning-Extension-for-LM-Studio
folder usingcd
. - Install the required packages:
pip install -r requirements.txt
- Run the script using:
- Windows:
run_win.bat
- Linux and Mac:
run_Linux_Mac.sh
- Windows:
Description Below from the Original Repo:
This repository contains an unofficial extension for LM Studio designed to automate the process of generating text captions for images. It allows users to utilize LM Studio's image text models like llava to caption images automatically.
The extension operates by reading a folder of images provided by the user. It sends requests to the LM Studio server for caption generation for each image in the folder. Upon receiving the captions from the server, it saves the generated captions into text files in a specified destination folder. The application incorporates a progress bar that becomes visible while the captioning process is ongoing. Additionally, it includes an embedded terminal that displays the status of processed files, indicating whether they were successful or encountered failures.
- Create a Python virtual environment:
python -m venv local
- Install dependencies:
.\local\Scripts\pip install openai==1.12.0
- Run the tool:
.\local\Scripts\python main.py
- Download the pre-built executable from the releases section.
- Double-click the executable to run the tool.
- Before using the extension, ensure that LM Studio is operating in server mode and that the image text model is loaded with the appropriate prompt template.
- Be sure to adjust the
config.ini
file to match your server settings. The default configuration is as follows:[OpenAI] base_url = http://localhost:1234/v1 api_key = not-needed
- It's important to enable "Apply Prompt Formatting" in LM Studio because this formatting is not applied on the client side. Failure to do so may result in unexpected behavior and poor quality results.
- Supported image formats include PNG, JPEG, and JPG. Formats such as AVIF and WebP are not supported.