This document covers optional features that can be enabled in the deployed Azure resources.
You should typically enable these features before running azd up. Once you've set them, return to the deployment steps.
- Using GPT-4
- Using text-embedding-3 models
- Enabling GPT-4 Turbo with Vision
- Enabling speech input/output
- Enabling Integrated Vectorization
- Enabling authentication
- Enabling login and document level access control
- Enabling user document upload
- Enabling CORS for an alternate frontend
- Adding an OpenAI load balancer
- Deploying with private endpoints
- Using local parsers
(Instructions for GPT-4, GPT-4o, and GPT-4o mini models are also included here.)
We generally find that most developers are able to get high-quality answers using GPT-3.5. However, if you want to try GPT-4, GPT-4o, or GPT-4o mini, you can do so by following these steps:
Execute the following commands inside your terminal:
- To set the name of the deployment, run this command. You can use any deployment name, as long as it's unique in your Azure OpenAI account.
azd env set AZURE_OPENAI_CHATGPT_DEPLOYMENT <your-deployment-name>
For example:
azd env set AZURE_OPENAI_CHATGPT_DEPLOYMENT chat4
- To set the GPT model name to a gpt-4, gpt-4o, or gpt-4o mini version from the available models, run this command with the appropriate GPT model name.
For GPT-4:
azd env set AZURE_OPENAI_CHATGPT_MODEL gpt-4
For GPT-4o:
azd env set AZURE_OPENAI_CHATGPT_MODEL gpt-4o
For GPT-4o mini:
azd env set AZURE_OPENAI_CHATGPT_MODEL gpt-4o-mini
- To set the Azure OpenAI deployment capacity, run this command with the desired capacity.
azd env set AZURE_OPENAI_CHATGPT_DEPLOYMENT_CAPACITY 10
- To set the Azure OpenAI deployment version from the available versions, run this command with the appropriate version.
For GPT-4:
azd env set AZURE_OPENAI_CHATGPT_DEPLOYMENT_VERSION turbo-2024-04-09
For GPT-4o:
azd env set AZURE_OPENAI_CHATGPT_DEPLOYMENT_VERSION 2024-05-13
For GPT-4o mini:
azd env set AZURE_OPENAI_CHATGPT_DEPLOYMENT_VERSION 2024-07-18
- To update the deployment with the new parameters, run this command.
azd up
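The steps above can be collected into a single script. The following is a minimal sketch for GPT-4o that uses only the variable names and example values shown in this section. The `run` helper, the `switch_to_gpt4o` function, and the `DRY_RUN` switch are illustrative additions for this example, not part of azd. By default the script only prints the commands so you can review them first.

```shell
#!/usr/bin/env bash
# Sketch: switch the chat deployment to GPT-4o.
# DRY_RUN, run(), and switch_to_gpt4o() are hypothetical helpers, not azd features.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"   # print the command instead of executing it
  else
    "$@"        # execute the command as given
  fi
}

switch_to_gpt4o() {
  run azd env set AZURE_OPENAI_CHATGPT_DEPLOYMENT chat4
  run azd env set AZURE_OPENAI_CHATGPT_MODEL gpt-4o
  run azd env set AZURE_OPENAI_CHATGPT_DEPLOYMENT_CAPACITY 10
  run azd env set AZURE_OPENAI_CHATGPT_DEPLOYMENT_VERSION 2024-05-13
  run azd up
}

switch_to_gpt4o
```

Setting `DRY_RUN=0` before running the script executes the commands instead of printing them, which requires azd to be installed and an environment to be selected.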
Note
To revert to GPT 3.5, run the following commands:
- To set the name of your old GPT 3.5 deployment:
azd env set AZURE_OPENAI_CHATGPT_DEPLOYMENT chat
- To set the name of your old GPT 3.5 model:
azd env set AZURE_OPENAI_CHATGPT_MODEL gpt-35-turbo
- To set the capacity of your old GPT 3.5 deployment:
azd env set AZURE_OPENAI_CHATGPT_DEPLOYMENT_CAPACITY 30
- To set the version number of your old GPT 3.5:
azd env set AZURE_OPENAI_CHATGPT_DEPLOYMENT_VERSION 0613
- To update the provisioned resources:
azd up
Note that this does not delete your GPT-4 deployment; it just makes your application create a new GPT 3.5 deployment or reuse an old one. If you want to delete the GPT-4 deployment, you can do so in Azure OpenAI Studio.
By default, the deployed Azure web app uses the text-embedding-ada-002 embedding model. If you want to use one of the text-embedding-3 models, you can do so by following these steps:
- Run one of the following commands to set the desired model:
azd env set AZURE_OPENAI_EMB_MODEL_NAME text-embedding-3-small
azd env set AZURE_OPENAI_EMB_MODEL_NAME text-embedding-3-large
- Specify the desired dimensions of the model (from 256 to 3072, depending on the model):
azd env set AZURE_OPENAI_EMB_DIMENSIONS 256
- Set the model version to "1" (the only version as of March 2024):
azd env set AZURE_OPENAI_EMB_DEPLOYMENT_VERSION 1
- When prompted during azd up, make sure to select a region for the OpenAI resource group location that supports the text-embedding-3 models. There are limited regions available.
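Since the valid dimension range depends on the model, it can be worth checking the value before setting it. The sketch below is one way to do that. The helper names are hypothetical, the lower bound of 256 is taken from the range quoted above, and the upper bounds (1536 for text-embedding-3-small, 3072 for text-embedding-3-large) are the published maximums for those models as far as is known here, so verify them against the current model documentation. The script only prints the azd commands; it does not run them.

```shell
#!/usr/bin/env bash
# Sketch: validate an embedding dimension before setting it.
# max_dimensions() and check_dimensions() are hypothetical helpers for this example.

max_dimensions() {
  case "$1" in
    text-embedding-3-small) echo 1536 ;;  # assumed maximum, verify in the model docs
    text-embedding-3-large) echo 3072 ;;  # assumed maximum, verify in the model docs
    *) return 1 ;;                        # unknown model
  esac
}

check_dimensions() {
  model="$1"
  dims="$2"
  max="$(max_dimensions "$model")" || { echo "unknown model: $model" >&2; return 1; }
  if [ "$dims" -ge 256 ] && [ "$dims" -le "$max" ]; then
    return 0
  fi
  echo "$dims is outside 256-$max for $model" >&2
  return 1
}

# Print the commands only when the combination passes the check.
if check_dimensions text-embedding-3-small 256; then
  echo "azd env set AZURE_OPENAI_EMB_MODEL_NAME text-embedding-3-small"
  echo "azd env set AZURE_OPENAI_EMB_DIMENSIONS 256"
  echo "azd env set AZURE_OPENAI_EMB_DEPLOYMENT_VERSION 1"
fi
```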
If you have already deployed:
- You'll need to change the deployment name by running
azd env set AZURE_OPENAI_EMB_DEPLOYMENT <new-deployment-name>
- You'll need to create a new index, and re-index all of the data using the new model. You can either delete the current index in the Azure Portal, or create an index with a different name by running
azd env set AZURE_SEARCH_INDEX new-index-name
When you next run azd up, the new index will be created and the data will be re-indexed.
- If your OpenAI resource is not in one of the supported regions, you should delete openAiResourceGroupLocation from .azure/YOUR-ENV-NAME/config.json. When running azd up, you will be prompted to select a new region.
Note
The text-embedding-3 models are not currently supported by the integrated vectorization feature.
This section covers the integration of GPT-4 Vision with Azure AI Search. Learn how to enhance your search capabilities with the power of image and text indexing, enabling advanced search functionalities over diverse document types. For a detailed guide on setup and usage, visit our Enabling GPT-4 Turbo with Vision page.
📺 Watch a short video of speech input/output
You can optionally enable speech input/output by setting the azd environment variables.
The speech input feature uses the browser's built-in Speech Recognition API. It may not work in all browser/OS combinations. To enable speech input, run:
azd env set USE_SPEECH_INPUT_BROWSER true
The speech output feature uses Azure Speech Service for text-to-speech. Additional costs will be incurred for using the Azure Speech Service. See pricing. To enable speech output, run:
azd env set USE_SPEECH_OUTPUT_AZURE true
To set the voice for the speech output, run:
azd env set AZURE_SPEECH_SERVICE_VOICE en-US-AndrewMultilingualNeural
Alternatively, you can use the browser's built-in Speech Synthesis API for speech output. It may not work in all browser/OS combinations. To enable it, run:
azd env set USE_SPEECH_OUTPUT_BROWSER true
Azure AI Search recently introduced an integrated vectorization feature in preview mode. This feature is a cloud-based approach to data ingestion, which takes care of document format cracking, data extraction, chunking, vectorization, and indexing, all with Azure technologies.
To enable integrated vectorization with this sample:
- If you've previously deployed, delete the existing search index.
- Run
azd env set USE_FEATURE_INT_VECTORIZATION true
- Run azd up to update system and user roles.
- You can view the resources such as the indexer and skillset in Azure Portal and monitor the status of the vectorization process.
This feature is not currently compatible with GPT-4 Turbo with Vision or the newer text-embedding-3 models.
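Because of the embedding model restriction, it can help to check the configured embedding model before turning the feature on. The following is a minimal sketch, not part of the sample: the function name is hypothetical, and it assumes that `azd env get-values` prints one `KEY="value"` pair per line, which you should confirm for your azd version. The function reads those lines on standard input so it can be tried without azd.

```shell
#!/usr/bin/env bash
# Sketch: refuse to enable integrated vectorization with a text-embedding-3 model.
# can_use_int_vectorization() is a hypothetical helper; it reads KEY="value" lines on stdin.

can_use_int_vectorization() {
  # Extract the configured embedding model and strip the surrounding quotes.
  model="$(sed -n 's/^AZURE_OPENAI_EMB_MODEL_NAME=//p' | tr -d '"')"
  case "$model" in
    text-embedding-3-*)
      echo "integrated vectorization does not support $model" >&2
      return 1
      ;;
    *)
      return 0   # variable unset, or a model such as text-embedding-ada-002
      ;;
  esac
}

# Typical use (requires azd and an initialized environment):
#   azd env get-values | can_use_int_vectorization \
#     && azd env set USE_FEATURE_INT_VECTORIZATION true
```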
By default, the deployed Azure web app will have no authentication or access restrictions enabled, meaning anyone with routable network access to the web app can chat with your indexed data. If you'd like to automatically set up authentication and user login as part of the azd up process, see this guide.
Alternatively, you can manually require authentication to your Azure Active Directory by following the Add app authentication tutorial and setting it up against the deployed web app.
To then limit access to a specific set of users or groups, you can follow the steps from Restrict your Microsoft Entra app to a set of users by changing the "Assignment Required?" option under the Enterprise Application, and then assigning users/groups access. Users not granted explicit access will receive the error message: AADSTS50105: Your administrator has configured the application <app_name> to block users unless they are specifically granted ('assigned') access to the application.
By default, the deployed Azure web app allows users to chat with all your indexed data. You can enable an optional login system using Azure Active Directory to restrict access to indexed data based on the logged-in user. Enable the optional login and document level access control system by following this guide.
You can enable an optional user document upload system to allow users to upload their own documents and chat with them. This feature requires you to first enable login and document level access control. Then you can enable the optional user document upload system by setting an azd environment variable:
azd env set USE_USER_UPLOAD true
Then you'll need to run azd up to provision an Azure Data Lake Storage Gen2 account for storing the user-uploaded documents.
When the user uploads a document, it will be stored in a directory in that account with the same name as the user's Entra object id, and will have ACLs associated with that directory. When the ingester runs, it will also set the oids of the indexed chunks to the user's Entra object id.
If you are enabling this feature on an existing index, you should also update your index to have the new storageUrl field:
./scripts/manageacl.ps1 -v --acl-action enable_acls
And then update existing search documents with the storage URL of the main Blob container:
./scripts/manageacl.ps1 -v --acl-action update_storage_urls --url <https://YOUR-MAIN-STORAGE-ACCOUNT.blob.core.windows.net/content/>
Going forward, all uploaded documents will have their storageUrl set in the search index.
This is necessary to disambiguate user-uploaded documents from admin-uploaded documents.
By default, the deployed Azure web app will only allow requests from the same origin. To enable CORS for a frontend hosted on a different origin:
- Run
azd env set ALLOWED_ORIGIN https://<your-domain.com>
- Run
azd up
For the frontend code, change BACKEND_URI in api.ts to point at the deployed backend URL, so that all fetch requests will be sent to the deployed backend.
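After redeploying, one way to confirm the CORS setting is to send a request with an Origin header and look for a matching Access-Control-Allow-Origin response header. The sketch below assumes the backend returns that header on an ordinary GET request, which depends on its CORS configuration. The `allows_origin` function, `BACKEND_URL`, and the example origin are hypothetical names introduced for illustration. The function reads response headers on standard input, so it can be tried against saved headers as well as live output from curl.

```shell
#!/usr/bin/env bash
# Sketch: check whether HTTP response headers allow a given origin.
# allows_origin() is a hypothetical helper; it reads response headers on stdin.

allows_origin() {
  origin="$1"
  # Strip carriage returns, keep only the CORS header (names are case-insensitive),
  # drop the header name, then require an exact match on the origin or the wildcard.
  tr -d '\r' \
    | grep -i '^access-control-allow-origin:' \
    | sed 's/^[^:]*:[[:space:]]*//' \
    | grep -qxF -e "$origin" -e '*'
}

# Typical use (BACKEND_URL and the origin are placeholders):
#   curl -s -o /dev/null -D - -H "Origin: https://example.com" "$BACKEND_URL" \
#     | allows_origin "https://example.com" && echo "CORS allowed"
```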
For an alternate frontend that's written in Web Components and deployed to Static Web Apps, check out azure-search-openai-javascript and its guide on using a different backend. Both these repositories adhere to the same HTTP protocol for AI chat apps.
As discussed in more detail in our productionizing guide, you may want to consider implementing a load balancer between OpenAI instances if you are consistently going over the TPM limit. This repository is designed for easy integration with other repositories that create load balancers for OpenAI instances. For integration instructions with this sample, see:
- Scale Azure OpenAI for Python with Azure API Management
- Scale Azure OpenAI for Python chat using RAG with Azure Container Apps
It is possible to deploy this app with public access disabled, using Azure private endpoints and private DNS Zones. For more details, read the private deployment guide. That requires multi-stage provisioning, so you will need to do more than just azd up after setting the environment variables.
If you want to decrease charges by using local parsers instead of Azure Document Intelligence, you can set environment variables before running the data ingestion script. Note that the local parsers are generally less sophisticated.
- To use the local PDF parser, run:
azd env set USE_LOCAL_PDF_PARSER true
- To use the local HTML parser, run:
azd env set USE_LOCAL_HTML_PARSER true
The local parsers will be used the next time you run the data ingestion script. To use these parsers for the user document upload system, you'll need to run azd provision to update the web app to use the local parsers.