Cannot add data to a ComplexField using Python SDK for Azure AI Search
I want to upload a payload with a nested dictionary to an Azure AI Search index. I am using a ComplexField in the index for the nested dictionary in my payload. The nested dictionary is not recognized by the index, and I get a null error. Here is my code:
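from azure.search.documents import SearchClient
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    ComplexField,
    CorsOptions,
    HnswAlgorithmConfiguration,
    HnswParameters,
    SearchIndex,
    ScoringProfile,
    SearchFieldDataType,
    SimpleField,
    SearchableField,
    SemanticConfiguration,
    SemanticField,
    SemanticPrioritizedFields,
    SemanticSearch,
    VectorSearch,
    VectorSearchAlgorithmKind,
    VectorSearchAlgorithmMetric,
    VectorSearchProfile,
)
# service_endpoint and credential are assumed to be defined elsewhere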
request_example = {
    "id": "1",
    "text": "bla",
    "payload": {
        "processId": "00665",
        "color": "green"
    }
}
# Create a search index
index_client = SearchIndexClient(endpoint=service_endpoint, credential=credential)
fields = [
    SimpleField(name="id", type=SearchFieldDataType.String, key=True, sortable=True, filterable=True, facetable=True),
    SearchableField(name="text", type=SearchFieldDataType.String),
    ComplexField(
        name="payload",
        fields=[
            SimpleField(name="processId", type=SearchFieldDataType.String),
            SimpleField(name="color", type=SearchFieldDataType.String),
        ],
        collection=True,
    )
]
# Configure the vector search configuration
vector_search = VectorSearch(
    algorithms=[
        HnswAlgorithmConfiguration(
            name="myHnsw",
            kind=VectorSearchAlgorithmKind.HNSW,
            parameters=HnswParameters(
                m=4,
                ef_construction=400,
                ef_search=500,
                metric=VectorSearchAlgorithmMetric.COSINE,
            )
        ),
    ],
    profiles=[
        VectorSearchProfile(
            name="myHnswProfile",
            algorithm_configuration_name="myHnsw",
        ),
    ],
)
semantic_config = SemanticConfiguration(
    name="my-semantic-config",
    prioritized_fields=SemanticPrioritizedFields(
        # title_field=SemanticField(field_name="title"),
        # keywords_fields=[SemanticField(field_name="color")],
        content_fields=[SemanticField(field_name="text")]
    )
)
# Create the semantic settings with the configuration
semantic_search = SemanticSearch(configurations=[semantic_config])
cors_options = CorsOptions(allowed_origins=["*"], max_age_in_seconds=60)
from typing import List
scoring_profiles: List[ScoringProfile] = []
# Create the search index with the semantic settings
index = SearchIndex(
    name="complex_field_test",
    fields=fields,
    vector_search=vector_search,
    semantic_search=semantic_search,
    scoring_profiles=scoring_profiles,
    cors_options=cors_options,
)
result = index_client.create_or_update_index(index)
print(f"{result.name} created")
# Upload some documents to the index
search_client = SearchClient(endpoint=service_endpoint, index_name="complex_field_test", credential=credential)
result = search_client.upload_documents([request_example])
print(f"Uploaded {len(result)} documents")
And the error I get:
The request is invalid. Details: A null value was found for the property named 'payload', which has the expected type 'Collection(search.complex.payload)[Nullable=False]'. The expected type 'Collection(search.complex.payload)[Nullable=False]' does not allow null values.
As I understand it, the nested dictionary "payload" is not being recognized and is treated as empty, even though the sub-fields are registered in the index and contain no null values. Can you help me with this problem? How do I add data to a ComplexField?
Answers
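The index defines payload with collection=True, so the service expects a list of objects rather than a single object. Try with square brackets:
"payload": [
    {
        "processId": "00665",
        "color": "green"
    }
]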
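If documents arrive with a bare dictionary, the wrapping can be done just before upload. The following is only a sketch: as_collection is a hypothetical helper, not part of the Azure SDK, and mapping None to an empty list assumes that an empty collection is acceptable for your index.

```python
from typing import Any, Dict, List, Union


def as_collection(
    value: Union[Dict[str, Any], List[Dict[str, Any]], None]
) -> List[Dict[str, Any]]:
    """Hypothetical helper: wrap a single dict in a list, pass lists through."""
    if value is None:
        # Assumption: an empty list is preferable to null for a collection field
        return []
    if isinstance(value, dict):
        return [value]
    return list(value)


request_example = {
    "id": "1",
    "text": "bla",
    "payload": {"processId": "00665", "color": "green"},
}

# Build a new document whose payload matches a collection=True ComplexField
fixed = {**request_example, "payload": as_collection(request_example["payload"])}
print(fixed["payload"])
```

The fixed document can then be passed to search_client.upload_documents([fixed]) in place of the original.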
Although this might be late, it could still be helpful for others.
To save a payload like {"processId": "00665", "color": "green"} in a search index:
Step 1: Decide whether payload should be a single dictionary or a list of dictionaries. Set collection=True in the ComplexField definition only if you want a list of dictionaries.
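To make the choice in Step 1 concrete, here is a small sketch of a pre-upload check that compares a value against the shape implied by the collection flag. shape_error is a hypothetical helper written for illustration; it only inspects Python types and does not call the search service.

```python
from typing import Any, Optional


def shape_error(value: Any, collection: bool) -> Optional[str]:
    """Return a message if value does not fit the declared shape, else None."""
    if collection:
        # collection=True: the field holds a list of dictionaries
        if not isinstance(value, list):
            return "expected a list of dictionaries"
        if not all(isinstance(item, dict) for item in value):
            return "every list item must be a dictionary"
        return None
    # collection=False: the field holds exactly one dictionary
    if not isinstance(value, dict):
        return "expected a single dictionary"
    return None


payload = {"processId": "00665", "color": "green"}
print(shape_error(payload, collection=True))    # bare dict, collection field
print(shape_error([payload], collection=True))  # list, collection field
print(shape_error(payload, collection=False))   # bare dict, single field
```

Running such a check locally gives a clearer message than the null-value error returned by the service.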
Step 2: Assuming collection=False (meaning payload is a single dictionary), you can build and upload documents as follows:
payload = {"processId": "00665", "color": "green"}
request_example = {
    "id": "1",
    "text": "bla",
    # flatdict is not a built-in; define your own helper. For ways to flatten a dictionary, see
    # https://www.freecodecamp.org/news/how-to-flatten-a-dictionary-in-python-in-4-different-ways/
    "payload": flatdict(payload),
}
# Create a search index
index_client = SearchIndexClient(endpoint=service_endpoint, credential=credential)
fields = [
    SimpleField(name="id", type=SearchFieldDataType.String, key=True, sortable=True, filterable=True, facetable=True),
    SearchableField(name="text", type=SearchFieldDataType.String),
    ComplexField(
        name="payload",
        fields=[
            SimpleField(name="processId", type=SearchFieldDataType.String),
            SimpleField(name="color", type=SearchFieldDataType.String),
        ],
        collection=False,  # Important: set to True if payload should be a list of dictionaries
    )
]
search_client = SearchClient(endpoint=service_endpoint, index_name="complex_field_test", credential=credential)
result = search_client.upload_documents([request_example])
print(f"Uploaded {len(result)} documents")
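The code above calls flatdict without defining it. One possible version is sketched below; the name, the recursive approach, and the "." separator are assumptions for illustration, not something the SDK provides. For the flat payload in this example the helper returns the dictionary unchanged, and for deeper nesting the joined keys would still need to match the sub-field names defined in the index.

```python
from typing import Any, Dict


def flatdict(data: Dict[str, Any], parent_key: str = "", sep: str = ".") -> Dict[str, Any]:
    """Hypothetical helper: flatten nested dicts into a single-level dict."""
    flat: Dict[str, Any] = {}
    for key, value in data.items():
        # Prefix the key with its parent path when inside a nested dict
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatdict(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


payload = {"processId": "00665", "color": "green"}
print(flatdict(payload))
print(flatdict({"a": {"b": 1}, "c": 2}))
```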