This repository contains the projects I completed as part of the "Building Generative AI-Powered Applications with Python" course by IBM on Coursera. The course consists of 7 modules, each focusing on building generative AI applications with Python. Below is a brief overview of each module, including the key objectives, technologies, and models used.
- Module 1: Image Captioning
- Module 2: Simple Chatbot
- Module 3: Voice-Enabled AI Assistant
- Module 4: Audio Capturing and Summarization
- Module 5: PDF Querying Chatbot
- Module 6: Voice Translator Assistant
- Module 7: AI Career Coach
In Module 1, I learned the basics of generative AI models and used the Hugging Face platform to explore models and datasets. The main project was an automated image captioning tool built with the BLIP model and a Gradio user interface.
Key Learning Objectives:
- Basics of generative AI models
- Using Hugging Face to explore models
- Implementing an image captioning tool with Python and the BLIP model
- Creating a user-friendly interface using Gradio
Technologies and Models Used:
- Platform: Hugging Face
- Model: BLIP (Bootstrapping Language-Image Pre-training)
- Libraries: Gradio, Hugging Face Transformers
- Language: Python
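
For reference, a minimal sketch of this captioning flow might look like the following, assuming the `Salesforce/blip-image-captioning-base` checkpoint from Hugging Face (the exact checkpoint and generation settings are illustrative):

```python
# Minimal image-captioning sketch: BLIP for caption generation, Gradio for the UI.
import gradio as gr
from transformers import BlipProcessor, BlipForConditionalGeneration

CHECKPOINT = "Salesforce/blip-image-captioning-base"  # assumed checkpoint
processor = BlipProcessor.from_pretrained(CHECKPOINT)
model = BlipForConditionalGeneration.from_pretrained(CHECKPOINT)

def caption_image(image):
    # Convert the uploaded image into model inputs and generate a caption.
    inputs = processor(images=image, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=30)
    return processor.decode(output_ids[0], skip_special_tokens=True)

demo = gr.Interface(fn=caption_image, inputs=gr.Image(type="pil"), outputs="text")
demo.launch()
```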
Module 2 focused on creating a simple chatbot with open-source large language models (LLMs). I integrated the chatbot into a web interface and explored how to choose an appropriate LLM for chatbot applications. The project used Facebook's Blenderbot model with Hugging Face's Transformers library.
Key Learning Objectives:
- Understanding chatbot components
- Selecting an appropriate LLM for chatbots
- Working with Transformer models
- Creating a chatbot with Python
Technologies and Models Used:
- Model: Blenderbot (Facebook)
- Libraries: Hugging Face Transformers
- Language: Python
- Web Interface: Flask
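
A simplified version of the chatbot backend could look like this, assuming the `facebook/blenderbot-400M-distill` checkpoint and a single `/chat` endpoint (the route name and generation settings are illustrative):

```python
# Minimal Flask backend for a Blenderbot chatbot.
from flask import Flask, request, jsonify
from transformers import BlenderbotTokenizer, BlenderbotForConditionalGeneration

MODEL_NAME = "facebook/blenderbot-400M-distill"  # assumed checkpoint
tokenizer = BlenderbotTokenizer.from_pretrained(MODEL_NAME)
model = BlenderbotForConditionalGeneration.from_pretrained(MODEL_NAME)

app = Flask(__name__)

@app.route("/chat", methods=["POST"])
def chat():
    # Encode the user's message, generate a reply, and return it as JSON.
    user_message = request.json.get("message", "")
    inputs = tokenizer(user_message, return_tensors="pt")
    reply_ids = model.generate(**inputs, max_new_tokens=60)
    reply = tokenizer.decode(reply_ids[0], skip_special_tokens=True)
    return jsonify({"reply": reply})

if __name__ == "__main__":
    app.run(debug=True)
```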
In Module 3, I built a voice-enabled AI assistant that integrates IBM Watson's speech-to-text and text-to-speech services. The assistant uses OpenAI's GPT-3 to generate responses and is deployed to a public server; the stack includes Python, Flask, HTML, CSS, and JavaScript.
Key Learning Objectives:
- Building chatbots with voice input/output capabilities
- Using Watson’s speech services for voice communication
- Setting up a development environment and deploying to a public server
Technologies and Models Used:
- Speech Services: IBM Watson (Speech-to-Text and Text-to-Speech)
- Model: GPT-3 (OpenAI)
- Web Stack: Flask, HTML, CSS, JavaScript
- Deployment: Public server
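
A rough sketch of the voice pipeline is shown below, assuming Watson Speech-to-Text/Text-to-Speech credentials and the legacy (pre-1.0) `openai` SDK that exposed the GPT-3 Completion API; all keys, URLs, voices, and model choices are placeholders:

```python
# Voice pipeline sketch: Watson STT -> GPT-3 -> Watson TTS.
import openai
from ibm_watson import SpeechToTextV1, TextToSpeechV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

openai.api_key = "YOUR_OPENAI_API_KEY"  # placeholder

stt = SpeechToTextV1(authenticator=IAMAuthenticator("YOUR_STT_API_KEY"))
stt.set_service_url("YOUR_STT_SERVICE_URL")
tts = TextToSpeechV1(authenticator=IAMAuthenticator("YOUR_TTS_API_KEY"))
tts.set_service_url("YOUR_TTS_SERVICE_URL")

def speech_to_text(audio_bytes):
    # Transcribe the recorded audio with Watson Speech-to-Text.
    response = stt.recognize(audio=audio_bytes, content_type="audio/webm").get_result()
    return response["results"][0]["alternatives"][0]["transcript"]

def ask_gpt(prompt):
    # Send the transcript to GPT-3 and return its answer (legacy Completion API).
    completion = openai.Completion.create(model="text-davinci-003", prompt=prompt, max_tokens=150)
    return completion.choices[0].text.strip()

def text_to_speech(text):
    # Convert the answer back into audio with Watson Text-to-Speech.
    return tts.synthesize(text, accept="audio/mp3", voice="en-US_AllisonV3Voice").get_result().content
```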
Module 4 introduced LLMs for text summarization. I built an application that captures audio, transcribes it with OpenAI's Whisper, summarizes the transcript with Llama 2, and is deployed with IBM Cloud Code Engine.
Key Learning Objectives:
- Using LLMs for text generation and summarization
- Implementing speech-to-text with Whisper
- Deploying applications to a cloud platform
Technologies and Models Used:
- Model: Llama 2 (Meta)
- Speech-to-Text: OpenAI Whisper
- Libraries: Flask
- Deployment: IBM Cloud Code Engine
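
The core transcribe-then-summarize flow could be sketched as follows, assuming the open-source `openai-whisper` package and the gated `meta-llama/Llama-2-7b-chat-hf` checkpoint (access approval and substantial memory are required; the prompt and parameters are illustrative):

```python
# Transcribe audio with Whisper, then summarize the transcript with Llama 2.
import whisper
from transformers import pipeline

stt_model = whisper.load_model("base")  # Whisper handles speech-to-text
summarizer = pipeline("text-generation", model="meta-llama/Llama-2-7b-chat-hf")  # gated checkpoint

def summarize_audio(audio_path):
    # 1. Transcribe the recorded audio file into plain text.
    transcript = stt_model.transcribe(audio_path)["text"]
    # 2. Prompt Llama 2 to condense the transcript (prompt wording is illustrative).
    prompt = f"Summarize the following transcript in a few sentences:\n{transcript}\nSummary:"
    output = summarizer(prompt, max_new_tokens=200, do_sample=False)
    return output[0]["generated_text"][len(prompt):].strip()

if __name__ == "__main__":
    print(summarize_audio("meeting.mp3"))
```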
In Module 5, I created a chatbot that lets users upload PDFs and ask questions about their content. The chatbot uses Llama 2 with Retrieval-Augmented Generation (RAG) and frameworks such as LangChain to interpret user questions and generate responses grounded in the document.
Key Learning Objectives:
- Using LLMs and RAG for information extraction from large texts
- Developing web applications using Python and Flask
- Implementing LangChain for chatbot intelligence
Technologies and Models Used:
- Model: Llama 2 (Meta)
- Technique: Retrieval-Augmented Generation (RAG)
- Libraries: LangChain, Flask
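
A minimal RAG pipeline in the spirit of this module might look like the sketch below, assuming the classic LangChain interfaces (PDF loader, Chroma vector store, RetrievalQA) and a locally loadable Llama 2 pipeline; the file name, chunking parameters, and embedding model are illustrative:

```python
# RAG sketch: load a PDF, index its chunks, and answer questions with Llama 2.
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.llms import HuggingFacePipeline
from transformers import pipeline

# 1. Load the uploaded PDF and split it into overlapping chunks.
documents = PyPDFLoader("uploaded.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(documents)

# 2. Embed the chunks and index them in a vector store.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectorstore = Chroma.from_documents(chunks, embeddings)

# 3. Wrap Llama 2 as the generator and build a retrieval QA chain.
llm = HuggingFacePipeline(pipeline=pipeline(
    "text-generation", model="meta-llama/Llama-2-7b-chat-hf", max_new_tokens=256
))
qa = RetrievalQA.from_chain_type(llm=llm, retriever=vectorstore.as_retriever())

print(qa.run("What is the main topic of this document?"))
```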
Module 6 involved building a voice translator assistant using the flan-ul2 generative model together with IBM's Watson Speech Libraries for Embed. The assistant converts speech to text, translates it into a specified language, and reads the translation back as voice output.
Key Learning Objectives:
- Implementing multilingual translation with AI models
- Using speech-to-text and text-to-speech functionalities
- Creating a web-based voice assistant with Python, Flask, HTML, CSS, and JavaScript
Technologies and Models Used:
- Model: Flan-UL2
- Speech Services: IBM Watson (Speech Libraries for Embed)
- Web Stack: Flask, HTML, CSS, JavaScript
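
The translation step could be sketched as below, assuming flan-ul2 is called through the `ibm_watson_machine_learning` SDK on watsonx.ai (how the model was hosted is not specified here, so this is an assumption); credentials and project IDs are placeholders, and the Watson speech input/output is omitted:

```python
# Translation-step sketch using flan-ul2 via watsonx.ai (assumed hosting).
from ibm_watson_machine_learning.foundation_models import Model

model = Model(
    model_id="google/flan-ul2",
    credentials={"url": "https://us-south.ml.cloud.ibm.com", "apikey": "YOUR_API_KEY"},  # placeholders
    project_id="YOUR_PROJECT_ID",
    params={"max_new_tokens": 100},
)

def translate(text, target_language="Spanish"):
    # flan-ul2 follows plain-language instructions, so a simple prompt suffices.
    prompt = f"Translate the following text to {target_language}:\n{text}"
    return model.generate_text(prompt=prompt)

print(translate("Where is the nearest train station?"))
```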
In Module 7, the final module, I developed an AI Career Coach consisting of three applications: a resume enhancement tool, a personalized cover letter generator, and a career advisor. The applications use the Llama-2-70b-chat model on IBM watsonx.ai and share a Gradio-based web interface.
Key Learning Objectives:
- Building AI-powered career applications
- Using open-source LLMs with platforms like IBM watsonx
- Creating web interfaces using Gradio
Technologies and Models Used:
- Model: Llama-2-70b-chat (Meta)
- Platform: IBM watsonx.ai
- Libraries: Gradio, Flask
- Language: Python
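
One of the three apps, the cover letter generator, could be sketched roughly as follows, assuming the `ibm_watson_machine_learning` SDK for watsonx.ai; the credentials, project ID, prompt wording, and generation parameters are all placeholders:

```python
# Cover-letter generator sketch: Llama-2-70b-chat on watsonx.ai with a Gradio UI.
import gradio as gr
from ibm_watson_machine_learning.foundation_models import Model

model = Model(
    model_id="meta-llama/llama-2-70b-chat",
    credentials={"url": "https://us-south.ml.cloud.ibm.com", "apikey": "YOUR_API_KEY"},  # placeholders
    project_id="YOUR_PROJECT_ID",
    params={"max_new_tokens": 400, "temperature": 0.7},
)

def generate_cover_letter(job_description, resume_text):
    # Combine the job posting and resume into a single instruction prompt.
    prompt = (
        "Write a personalized cover letter for the following job description, "
        f"based on this resume.\n\nJob description:\n{job_description}\n\n"
        f"Resume:\n{resume_text}\n\nCover letter:"
    )
    return model.generate_text(prompt=prompt)

demo = gr.Interface(
    fn=generate_cover_letter,
    inputs=[gr.Textbox(label="Job description"), gr.Textbox(label="Resume")],
    outputs=gr.Textbox(label="Cover letter"),
)
demo.launch()
```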
To explore the projects:

- Clone the repository: `git clone https://github.com/your-username/your-repo.git`
- Install dependencies for each module by following the instructions in the respective project folders.
- Feel free to fork this repository, explore the code, and build upon the projects!