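I am new to AWS and am about to scale my website using AWS. I have some data (about 2 million rows) in a CSV file. Every month I receive a new CSV. The new CSV contains updates to previous rows as well as newly added rows. The rows in the CSV file have no unique identifier, because any column can be changed in the updated CSV. I would like to store the data in a DynamoDB table, but I really don't know how to update the data.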
The options I have considered:
- delete the previous DynamoDB table and create a new one from the new CSV file (not sure about the logistics for this one)
- don't use DynamoDB and just read from the CSV stored in S3 (not sure about the performance implications of reading from a CSV each time I need to access the data)
- compare the new CSV with the old data in DynamoDB, remove entries in DynamoDB that do not occur in the new CSV, and add all entries in the new CSV that do not occur in DynamoDB (seems like way too many comparisons given the size of my dataset)
- just add new rows from the CSV to DynamoDB and ignore changes to previous rows (last resort)
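To make the third option concrete, here is a minimal sketch of what the comparison might look like, assuming both the previous and the new CSV are available as text and that a hash of the whole row can stand in for the missing identifier. The function names `row_key`, `load_keys`, and `diff_csv` are hypothetical, and the sketch only computes the difference; it does not talk to DynamoDB. A changed row simply shows up as one delete plus one put.

```python
import csv
import hashlib
import io


def row_key(row):
    """Hash every column value of a row into a stable hex digest."""
    # Assumption: the unit separator character never appears in the data.
    joined = "\x1f".join(row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def load_keys(csv_text):
    """Map content hash -> row for each data row of a CSV string."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header line, if any
    return {row_key(row): row for row in reader}


def diff_csv(old_text, new_text):
    """Return (hashes to delete, rows to put) to turn the old data into the new."""
    old = load_keys(old_text)
    new = load_keys(new_text)
    to_delete = sorted(old.keys() - new.keys())  # rows gone or changed
    to_put = [new[k] for k in sorted(new.keys() - old.keys())]  # rows new or changed
    return to_delete, to_put
```

With dictionary lookups, the work grows roughly linearly with the number of rows rather than with the number of row pairs. One caveat of this assumption: rows that are exact duplicates collapse into a single hash.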
I am open to any suggestions.
I have not yet tried any of the options I have considered.