Unable to Stop Running Sync Job in AWS Bedrock Knowledge Base
I have an issue with an AWS Bedrock Knowledge Base that uses the Web Crawler as a data source. I accidentally added two Wikipedia URLs (e.g., "https://en.wikipedia.org/wiki/article1" and "https://en.wikipedia.org/wiki/article2") with the crawl scope set to HOSTS_ONLY. I assume the crawler is now trying to crawl all of Wikipedia, but since this is a Bedrock data source rather than Kendra or Lambda, I cannot stop the ingestion job. The status is stuck at STARTING, and I deleted the vector index (OpenSearch) successfully in the hope of triggering a failure. What else can I do? The job has been running for about an hour and a half. Any help would be appreciated, thank you.

I tried deleting the OpenSearch vector index and searched for any force-stop API calls, but there are none. I opened a ticket with AWS but have had no answer. I just want to know how to stop the sync job, or at least how it works: what the STARTING status means, whether I will be charged, and roughly how much.
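As an aside, the status of a sync (ingestion) job can at least be polled programmatically. A minimal sketch, assuming boto3's bedrock-agent client and its list_ingestion_jobs call; KB_ID and DS_ID are placeholders for your own identifiers:

```python
def latest_job_status(client, kb_id, ds_id):
    """Return the status of the most recent ingestion job for a data source
    (e.g. STARTING, IN_PROGRESS, COMPLETE, FAILED), or None if there are none."""
    resp = client.list_ingestion_jobs(
        knowledgeBaseId=kb_id,
        dataSourceId=ds_id,
        # Sort newest-first so the first summary is the current/latest job.
        sortBy={"attribute": "STARTED_AT", "order": "DESCENDING"},
        maxResults=1,
    )
    jobs = resp.get("ingestionJobSummaries", [])
    return jobs[0]["status"] if jobs else None

# Usage (assumes configured AWS credentials):
# import boto3
# client = boto3.client("bedrock-agent")
# print(latest_job_status(client, "KB_ID", "DS_ID"))
```

This only observes the job; as the question notes, there was no force-stop call available at the time.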
Answer
For whoever stumbles into this situation: I have fixed the issue, so I am posting how I did it in case someone needs it in the future.

In Bedrock, the Web Crawler data source for a Knowledge Base does not work correctly: once you press the sync button, the job cannot be stopped. What I did was delete the vector index (in my case OpenSearch Serverless, or OSS for short) to try to trigger a failure. After a couple of hours the job did fail, but then I hit a different problem: the data source could not be deleted. It returned the following error:

"Unable to delete data from vector store for data source with ID XXXXXXXXXX. Check your vector store configurations and permissions and retry your request. If the issue persists, consider updating the dataDeletionPolicy of the data source to RETAIN and retry your request."

Changing the deletion policy via the UI does not work (the console reports that the change succeeded, but the policy is not actually changed). The solution is to do it via the CLI, following these docs:

To get the current data source information: https://docs.aws.amazon.com/cli/latest/reference/bedrock-agent/get-data-source.html
To update the deletion policy: https://docs.aws.amazon.com/cli/latest/reference/bedrock-agent/update-data-source.html
To delete the data source: https://docs.aws.amazon.com/cli/latest/reference/bedrock-agent/delete-data-source.html

You need the data source info first because it contains the data source configuration that must be passed to the update command; and since the vector ingestion configuration cannot be changed, you have to pass that back unchanged as well, so you don't get an error for trying to change it.

The get command to run is:

aws bedrock-agent get-data-source --data-source-id DATASOURCE_ID --knowledge-base-id KB_ID

Replace DATASOURCE_ID and KB_ID with your own values. In the response you will see two important objects you need to reuse: the data source configuration and the vector ingestion configuration. Copy the JSON under each of those objects into a local file, one object per file (e.g., the data source configuration into tmp.json and the vector ingestion configuration into tmp2.json; make sure they are formatted correctly and contain no JSON syntax errors). Then upload these two files using Actions -> Upload File in the top right of the CLI window.

After that, run the update command:

aws bedrock-agent update-data-source --data-source-id DATASOURCE_ID --knowledge-base-id KB_ID --data-source-configuration file://tmp.json --vector-ingestion-configuration file://tmp2.json --name NAME --data-deletion-policy RETAIN

The response will be the data source with the new configuration. To make sure it actually changed, run the get-data-source command again and look for "dataDeletionPolicy": "RETAIN" instead of DELETE. Then you can run the delete command:

aws bedrock-agent delete-data-source --data-source-id DATASOURCE_ID --knowledge-base-id KB_ID

And you are good to go, and can delete the knowledge base as well if you need to. Hope that helped. Happy AI-ing!
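The same get -> update -> delete sequence can be scripted with boto3 instead of the raw CLI, which avoids hand-copying JSON into temp files. A sketch under the same assumptions as the answer (placeholder IDs, credentials configured); build_retain_update is a hypothetical helper name:

```python
def build_retain_update(ds):
    """Given the dataSource object from a get_data_source response, build the
    kwargs for update_data_source. The immutable pieces (dataSourceConfiguration
    and vectorIngestionConfiguration) are passed back unchanged, so the update
    only flips dataDeletionPolicy from DELETE to RETAIN."""
    return {
        "knowledgeBaseId": ds["knowledgeBaseId"],
        "dataSourceId": ds["dataSourceId"],
        "name": ds["name"],
        "dataSourceConfiguration": ds["dataSourceConfiguration"],
        "vectorIngestionConfiguration": ds["vectorIngestionConfiguration"],
        "dataDeletionPolicy": "RETAIN",
    }

# Usage (assumes AWS credentials and real IDs):
# import boto3
# client = boto3.client("bedrock-agent")
# ds = client.get_data_source(knowledgeBaseId="KB_ID", dataSourceId="DS_ID")["dataSource"]
# client.update_data_source(**build_retain_update(ds))
# client.delete_data_source(knowledgeBaseId="KB_ID", dataSourceId="DS_ID")
```

Echoing the configurations back verbatim is the whole trick: the API rejects any attempt to modify the vector ingestion configuration, so the update must change only the deletion policy.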



