Commit 00669d7

feat: Update scripts and documentation to support new parameters for Azure services and Content Understanding API
1 parent 81995a5 commit 00669d7

12 files changed: +114 -111 lines changed

documents/CustomizeData.md

Lines changed: 3 additions & 3 deletions
@@ -26,11 +26,11 @@ If you would like to update the solution to leverage your own data please follow
 bash ./infra/scripts/process_custom_data.sh \
     <Resource-Group-Name> <Azure-Subscription-ID> \
     <Storage-Account-Name> <Storage-Container-Name> \
-    <SQL-Server-Name> <SQL-Database-Name> <SQL-Managed-Identity-Client-ID> <SQL-Managed-Identity-Display-Name> \
+    <SQL-Server-Name> <SQL-Database-Name> <Backend-User-MID-Client-ID> <Backend-User-MID-Display-Name> \
     <AI-Search-Name> <Search-Endpoint> \
     <AI-Foundry-Resource-ID> <CU-Foundry-Resource-ID> \
-    <OpenAI-Endpoint> <Embedding-Model> <Deployment-Model> <OpenAI-Preview-API-Version> \
-    <CU-Endpoint> <AI-Agent-Endpoint>
+    <OpenAI-Endpoint> <Embedding-Model> <Deployment-Model> \
+    <CU-Endpoint> <AI-Agent-Endpoint> <CU-API-Version>
 ```


 ## How to Login to VM Using Azure Bastion

documents/DeploymentGuide.md

Lines changed: 4 additions & 4 deletions
@@ -138,7 +138,7 @@ If you're not using one of the above options for opening the project, then you'l
 1. Make sure the following tools are installed:
     - [PowerShell](https://learn.microsoft.com/en-us/powershell/scripting/install/installing-powershell?view=powershell-7.5) <small>(v7.0+)</small> - available for Windows, macOS, and Linux.
     - [Azure Developer CLI (azd)](https://aka.ms/install-azd) <small>(v1.18.0+)</small> - version
-    - [Python 3.9+](https://www.python.org/downloads/)
+    - [Python 3.9 to 3.11](https://www.python.org/downloads/)
     - [Docker Desktop](https://www.docker.com/products/docker-desktop/)
     - [Git](https://git-scm.com/downloads)

@@ -250,11 +250,11 @@ Once you've opened the project in [Codespaces](#github-codespaces), [Dev Contain
 bash ./infra/scripts/process_sample_data.sh \
     <Resource-Group-Name> <Azure-Subscription-ID> \
     <Storage-Account-Name> <Storage-Container-Name> \
-    <SQL-Server-Name> <SQL-Database-Name> <SQL-Managed-Identity-Client-ID> <SQL-Managed-Identity-Display-Name> \
+    <SQL-Server-Name> <SQL-Database-Name> <Backend-User-MID-Client-ID> <Backend-User-MID-Display-Name> \
     <AI-Search-Name> <Search-Endpoint> \
     <AI-Foundry-Resource-ID> <CU-Foundry-Resource-ID> \
-    <OpenAI-Endpoint> <Embedding-Model> <Deployment-Model> <OpenAI-Preview-API-Version> \
-    <CU-Endpoint> <AI-Agent-Endpoint>
+    <OpenAI-Endpoint> <Embedding-Model> <Deployment-Model> \
+    <CU-Endpoint> <AI-Agent-Endpoint> <CU-API-Version>
 ```

 6. Once the deployment has completed successfully, open the [Azure Portal](https://portal.azure.com/), go to the deployed resource group, find the App Service, and get the app URL from `Default domain`.
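
Review note: both the old and the new usage text list 18 positional arguments, so an invocation written for the old order would not fail on argument count alone; the values from position 16 onward would simply land in different slots. One way to guard against that is to build the argument list from named values. The following is a minimal Python sketch, not part of this commit: `NEW_ARG_ORDER` and `build_argv` are hypothetical names, and the placeholder names are copied from the updated usage text.

```python
# Placeholder names copied from the updated usage text, in positional order.
NEW_ARG_ORDER = [
    "Resource-Group-Name", "Azure-Subscription-ID",
    "Storage-Account-Name", "Storage-Container-Name",
    "SQL-Server-Name", "SQL-Database-Name",
    "Backend-User-MID-Client-ID", "Backend-User-MID-Display-Name",
    "AI-Search-Name", "Search-Endpoint",
    "AI-Foundry-Resource-ID", "CU-Foundry-Resource-ID",
    "OpenAI-Endpoint", "Embedding-Model", "Deployment-Model",
    "CU-Endpoint", "AI-Agent-Endpoint", "CU-API-Version",
]


def build_argv(values, script="./infra/scripts/process_sample_data.sh"):
    """Return an argv list for the script from a {placeholder: value} mapping.

    Hypothetical helper: raises ValueError if a placeholder is missing or if
    an unknown (for example, removed) placeholder is supplied.
    """
    missing = [name for name in NEW_ARG_ORDER if name not in values]
    unknown = [name for name in values if name not in NEW_ARG_ORDER]
    if missing or unknown:
        raise ValueError(f"missing={missing} unknown={unknown}")
    return ["bash", script] + [str(values[name]) for name in NEW_ARG_ORDER]
```

The resulting list can be passed to `subprocess.run` as-is; supplying the removed `OpenAI-Preview-API-Version` key raises instead of silently shifting later arguments.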

documents/LocalDebuggingSetup.md

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ Install these tools before you start:
 - [Azure Tools](https://marketplace.visualstudio.com/items?itemName=ms-vscode.vscode-node-azure-pack)
 - [Bicep](https://marketplace.visualstudio.com/items?itemName=ms-azuretools.vscode-bicep)
 - [Python](https://marketplace.visualstudio.com/items?itemName=ms-python.python)
-- [Python 3.11](https://www.python.org/downloads/). **Important:** Check "Add Python to PATH" during installation.
+- [Python 3.9 to 3.11](https://www.python.org/downloads/). **Important:** Check "Add Python to PATH" during installation.
 - [PowerShell 7.0+](https://github.com/PowerShell/PowerShell#get-powershell).
 - [Node.js (LTS)](https://nodejs.org/en).
 - [Git](https://git-scm.com/downloads).

infra/main.bicep

Lines changed: 10 additions & 4 deletions
@@ -77,6 +77,9 @@ param azureOpenAIApiVersion string = '2025-01-01-preview'
 @description('Optional. Version of AI Agent API.')
 param azureAiAgentApiVersion string = '2025-05-01'

+@description('Optional. Version of Content Understanding API.')
+param azureContentUnderstandingApiVersion string = '2024-12-01-preview'
+
 // You can increase this, but capacity is limited per model/region, so you will get errors if you go over
 // https://learn.microsoft.com/en-us/azure/ai-services/openai/quotas-limits
 @minValue(10)
@@ -1428,6 +1431,9 @@ output AZURE_OPENAI_API_VERSION string = azureOpenAIApiVersion
 @description('Contains Azure OpenAI Preview API version.')
 output AZURE_OPENAI_PREVIEW_API_VERSION string = azureAiAgentApiVersion

+@description('Contains Content Understanding API version.')
+output AZURE_CONTENT_UNDERSTANDING_API_VERSION string = azureContentUnderstandingApiVersion
+
 @description('Contains Azure OpenAI resource name.')
 output AZURE_OPENAI_RESOURCE string = aiFoundryAiServices.outputs.name

@@ -1440,11 +1446,11 @@ output SQLDB_DATABASE string = 'sqldb-${solutionSuffix}'
 @description('Contains SQL server name.')
 output SQLDB_SERVER string = '${sqlDBModule.outputs.name }${environment().suffixes.sqlServerHostname}'

-@description('Display name of the SQL managed identity.')
-output SQLDB_USER_MID_NAME string = backendUserAssignedIdentity.outputs.name
+@description('Display name of the backend API user-assigned managed identity (also used for SQL database access).')
+output BACKEND_USER_MID_NAME string = backendUserAssignedIdentity.outputs.name

-@description('Contains SQL database user managed identity client ID.')
-output SQLDB_USER_MID string = backendUserAssignedIdentity.outputs.clientId
+@description('Client ID of the backend API user-assigned managed identity (also used for SQL database access).')
+output BACKEND_USER_MID string = backendUserAssignedIdentity.outputs.clientId

 @description('Contains AI project client usage setting.')
 output USE_AI_PROJECT_CLIENT string = 'False'
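
Review note: downstream tooling can pick up the new `AZURE_CONTENT_UNDERSTANDING_API_VERSION` output instead of hardcoding the version. The following is a minimal Python sketch under one stated assumption: that the output is exposed as an environment variable of the same name (this diff does not show how the scripts read it). `resolve_cu_api_version` is a hypothetical helper, and the fallback value mirrors the default of the `azureContentUnderstandingApiVersion` parameter.

```python
import os

# Fallback copied from the default of the azureContentUnderstandingApiVersion
# Bicep parameter in this commit.
DEFAULT_CU_API_VERSION = "2024-12-01-preview"


def resolve_cu_api_version(env=None):
    """Return the Content Understanding API version to use.

    Hypothetical helper. Assumes the deployment output is available as the
    environment variable AZURE_CONTENT_UNDERSTANDING_API_VERSION; an unset or
    blank value falls back to the Bicep default.
    """
    env = os.environ if env is None else env
    value = env.get("AZURE_CONTENT_UNDERSTANDING_API_VERSION", "").strip()
    return value or DEFAULT_CU_API_VERSION
```

Passing a plain dictionary as `env` keeps the helper easy to exercise without touching the real process environment.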

infra/scripts/copy_kb_files.sh

Lines changed: 9 additions & 9 deletions
@@ -1,7 +1,7 @@
 #!/bin/bash

 # Variables
-storageAccount="$1"
+storageAccountName="$1"
 containerName="$2"
 resourceGroupName="$3"

@@ -12,14 +12,14 @@ zipFileName2="infra/data/audio_data.zip"
 extractedFolder2="audio_data"

 echo "Script Started"
-echo "Storage Account: $storageAccount"
+echo "Storage Account: $storageAccountName"
 echo "Container Name: $containerName"
 echo "Resource Group: $resourceGroupName"

 # Validate required parameters
-if [ -z "$storageAccount" ] || [ -z "$containerName" ] || [ -z "$resourceGroupName" ]; then
+if [ -z "$storageAccountName" ] || [ -z "$containerName" ] || [ -z "$resourceGroupName" ]; then
     echo "Error: Missing required parameters."
-    echo "Usage: $0 <storageAccount> <containerName> <resourceGroupName>"
+    echo "Usage: $0 <storageAccountName> <containerName> <resourceGroupName>"
     exit 1
 fi

@@ -49,7 +49,7 @@ fi
 # Check and assign Storage Blob Data Contributor role to current user
 echo "Checking Storage Blob Data Contributor role assignment..."
 signed_user_id=$(az ad signed-in-user show --query id --output tsv)
-storage_resource_id=$(az storage account show --name "$storageAccount" --resource-group "$resourceGroupName" --query id --output tsv)
+storage_resource_id=$(az storage account show --name "$storageAccountName" --resource-group "$resourceGroupName" --query id --output tsv)

 role_assignment=$(MSYS_NO_PATHCONV=1 az role assignment list --assignee $signed_user_id --role "Storage Blob Data Contributor" --scope $storage_resource_id --query "[].roleDefinitionId" -o tsv)
 if [ -z "$role_assignment" ]; then
@@ -73,7 +73,7 @@ fi
 echo "Uploading call transcripts to storage account..."
 if [ -d "$extractedFolder1" ]; then
     az storage blob upload-batch \
-        --account-name "$storageAccount" \
+        --account-name "$storageAccountName" \
         --destination "$containerName/$extractedFolder1" \
         --source "$extractedFolder1" \
         --auth-mode login \
@@ -92,7 +92,7 @@ fi
 echo "Uploading audio data to storage account..."
 if [ -d "$extractedFolder2" ]; then
     az storage blob upload-batch \
-        --account-name "$storageAccount" \
+        --account-name "$storageAccountName" \
         --destination "$containerName/$extractedFolder2" \
         --source "$extractedFolder2" \
         --auth-mode login \
@@ -111,13 +111,13 @@ fi
 # Create custom data directories for user uploads
 echo "Creating custom data directories..."
 az storage fs directory create \
-    --account-name "$storageAccount" \
+    --account-name "$storageAccountName" \
     --file-system "$containerName" \
     --name custom_audiodata \
     --auth-mode login 2>/dev/null || echo "custom_audiodata directory may already exist"

 az storage fs directory create \
-    --account-name "$storageAccount" \
+    --account-name "$storageAccountName" \
     --file-system "$containerName" \
     --name custom_transcripts \
     --auth-mode login 2>/dev/null || echo "custom_transcripts directory may already exist"

infra/scripts/index_scripts/02_create_cu_template_audio.py

Lines changed: 4 additions & 3 deletions
@@ -9,24 +9,25 @@
 # Get parameters from command line
 p = argparse.ArgumentParser()
 p.add_argument("--cu_endpoint", required=True)
+p.add_argument("--cu_api_version", required=True)
 args = p.parse_args()

 CU_ENDPOINT = args.cu_endpoint
+CU_API_VERSION = args.cu_api_version

-AZURE_AI_API_VERSION = "2024-12-01-preview"
 ANALYZER_ID = "ckm-audio"

 ANALYZER_TEMPLATE_FILE = 'infra/data/ckm-analyzer_config_audio.json'

-# Add parent directory to path for module access
+# Add parent directory to path for imports
 sys.path.append(str(Path.cwd().parent))

 credential = AzureCliCredential()
 # Initialize Content Understanding Client
 token_provider = get_bearer_token_provider(credential, "https://cognitiveservices.azure.com/.default")
 client = AzureContentUnderstandingClient(
     endpoint=CU_ENDPOINT,
-    api_version=AZURE_AI_API_VERSION,
+    api_version=CU_API_VERSION,
     token_provider=token_provider
 )

infra/scripts/index_scripts/02_create_cu_template_text.py

Lines changed: 3 additions & 2 deletions
@@ -7,11 +7,12 @@
 # Get parameters from command line
 p = argparse.ArgumentParser()
 p.add_argument("--cu_endpoint", required=True)
+p.add_argument("--cu_api_version", required=True)
 args = p.parse_args()

 CU_ENDPOINT = args.cu_endpoint
+CU_API_VERSION = args.cu_api_version

-AZURE_AI_API_VERSION = "2024-12-01-preview"
 ANALYZER_ID = "ckm-json"

 ANALYZER_TEMPLATE_FILE = 'infra/data/ckm-analyzer_config_text.json'
@@ -21,7 +22,7 @@
 token_provider = get_bearer_token_provider(credential, "https://cognitiveservices.azure.com/.default")
 client = AzureContentUnderstandingClient(
     endpoint=CU_ENDPOINT,
-    api_version=AZURE_AI_API_VERSION,
+    api_version=CU_API_VERSION,
     token_provider=token_provider
 )

infra/scripts/index_scripts/03_cu_process_data_text.py

Lines changed: 6 additions & 10 deletions
@@ -22,24 +22,24 @@
 p = argparse.ArgumentParser()
 p.add_argument("--search_endpoint", required=True)
 p.add_argument("--ai_project_endpoint", required=True)
-p.add_argument("--openai_api_version", required=True)
 p.add_argument("--deployment_model", required=True)
 p.add_argument("--embedding_model", required=True)
-p.add_argument("--storage_account", required=True)
+p.add_argument("--storage_account_name", required=True)
 p.add_argument("--sql_server", required=True)
 p.add_argument("--sql_database", required=True)
 p.add_argument("--cu_endpoint", required=True)
+p.add_argument("--cu_api_version", required=True)
 args = p.parse_args()

 SEARCH_ENDPOINT = args.search_endpoint
 AI_PROJECT_ENDPOINT = args.ai_project_endpoint
-OPENAI_API_VERSION = args.openai_api_version
 DEPLOYMENT_MODEL = args.deployment_model
 EMBEDDING_MODEL = args.embedding_model
-STORAGE_ACCOUNT = args.storage_account
+STORAGE_ACCOUNT_NAME = args.storage_account_name
 SQL_SERVER = args.sql_server
 SQL_DATABASE = args.sql_database
 CU_ENDPOINT = args.cu_endpoint
+CU_API_VERSION = args.cu_api_version

 FILE_SYSTEM_CLIENT_NAME = "data"
 DIRECTORY = 'call_transcripts'
@@ -52,7 +52,7 @@
 print("Parameters received.")

 # Azure DataLake setup
-account_url = f"https://{STORAGE_ACCOUNT}.dfs.core.windows.net"
+account_url = f"https://{STORAGE_ACCOUNT_NAME}.dfs.core.windows.net"
 credential = AzureCliCredential()
 service_client = DataLakeServiceClient(account_url, credential=credential, api_version='2023-01-03')
 file_system_client = service_client.get_file_system_client(FILE_SYSTEM_CLIENT_NAME)
@@ -76,10 +76,6 @@
 cursor = conn.cursor()
 print("SQL Server connection established.")

-# CU API setup
-AZURE_AI_API_VERSION = "2024-12-01-preview"
-print("Setup complete.")
-
 # SQL data type mapping for pandas to SQL conversion
 sql_data_types = {
     'int64': 'INT',
@@ -170,7 +166,7 @@ def generate_sql_insert_script(df, table_name, columns, sql_file_name):
 cu_token_provider = get_bearer_token_provider(cu_credential, "https://cognitiveservices.azure.com/.default")
 cu_client = AzureContentUnderstandingClient(
     endpoint=CU_ENDPOINT,
-    api_version=AZURE_AI_API_VERSION,
+    api_version=CU_API_VERSION,
     token_provider=cu_token_provider
 )
 ANALYZER_ID = "ckm-json"
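
Review note: the rename from `--storage_account` to `--storage_account_name` interacts with a default `argparse` behavior. With `allow_abbrev` left at its default of `True`, an unambiguous prefix of a long option is accepted, so a caller still passing the old `--storage_account` flag would keep working as long as no other option shares that prefix. The following is a minimal sketch that reproduces only the renamed flag; `build_parser` and `accepts_old_flag` are hypothetical names and `mystorage` is a placeholder value.

```python
import argparse


def build_parser(allow_abbrev=True):
    """Parser with only the renamed flag, to isolate the prefix behavior."""
    p = argparse.ArgumentParser(allow_abbrev=allow_abbrev)
    p.add_argument("--storage_account_name", required=True)
    return p


def accepts_old_flag(allow_abbrev):
    """Return True if the old --storage_account spelling is still accepted."""
    try:
        args = build_parser(allow_abbrev).parse_args(
            ["--storage_account", "mystorage"]
        )
    except SystemExit:
        # argparse reports the error on stderr and exits on invalid input.
        return False
    return args.storage_account_name == "mystorage"
```

If strict flag names are preferred, constructing the parser with `allow_abbrev=False` makes the old spelling fail fast. The removed `--openai_api_version` flag has no such fallback: it is not a prefix of any remaining option shown in this diff, so passing it would be rejected as an unrecognized argument.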

infra/scripts/index_scripts/04_cu_process_custom_data.py

Lines changed: 5 additions & 6 deletions
@@ -43,27 +43,27 @@
 parser.add_argument('--search_endpoint', required=True, help='Azure AI Search endpoint')
 parser.add_argument('--openai_endpoint', required=True, help='Azure OpenAI endpoint')
 parser.add_argument('--ai_project_endpoint', required=True, help='Azure AI Project endpoint')
-parser.add_argument('--openai_api_version', required=True, help='Azure OpenAI API version')
 parser.add_argument('--deployment_model', required=True, help='Azure OpenAI deployment model name')
 parser.add_argument('--embedding_model', required=True, help='Azure OpenAI embedding model name')
-parser.add_argument('--storage_account', required=True, help='Azure Storage Account name')
+parser.add_argument('--storage_account_name', required=True, help='Azure Storage Account name')
 parser.add_argument('--sql_server', required=True, help='Azure SQL Server name')
 parser.add_argument('--sql_database', required=True, help='Azure SQL Database name')
 parser.add_argument('--cu_endpoint', required=True, help='Azure Content Understanding endpoint')
+parser.add_argument('--cu_api_version', required=True, help='Azure Content Understanding API version')

 args = parser.parse_args()

 # Assign arguments to variables
 SEARCH_ENDPOINT = args.search_endpoint
 OPENAI_ENDPOINT = args.openai_endpoint
 AI_PROJECT_ENDPOINT = args.ai_project_endpoint
-AZURE_AI_API_VERSION = args.openai_api_version
 DEPLOYMENT_MODEL = args.deployment_model
 EMBEDDING_MODEL = args.embedding_model
-STORAGE_ACCOUNT_NAME = args.storage_account
+STORAGE_ACCOUNT_NAME = args.storage_account_name
 SQL_SERVER = args.sql_server
 SQL_DATABASE = args.sql_database
 CU_ENDPOINT = args.cu_endpoint
+CU_API_VERSION = args.cu_api_version

 print("Command-line arguments parsed.")

@@ -185,12 +185,11 @@ def create_search_index():
 print("SQL Server connection established.")

 # Content Understanding client
-AZURE_AI_API_VERSION = "2024-12-01-preview"
 cu_credential = AzureCliCredential()
 cu_token_provider = get_bearer_token_provider(cu_credential, "https://cognitiveservices.azure.com/.default")
 cu_client = AzureContentUnderstandingClient(
     endpoint=CU_ENDPOINT,
-    api_version=AZURE_AI_API_VERSION,
+    api_version=CU_API_VERSION,
     token_provider=cu_token_provider
 )
 print("Content Understanding client initialized.")
