Question

I have an application that has to store a lot of sparse data.
All documents are separated into Projects.
Each Project has its own database, with its own collections and documents, but all on the same server.
Now I want to make it easier to Query and Reference across Projects.

So I'm considering moving all data into one database and giving each document a "project" field that I can query against.
The database schema would go from something like:

Project1 (Database)
    Task (Collection)
        {name: my_task, status: Completed, ...}

Project2 (Database)
    Task (Collection)
        {name: other_task, status: Started, ...}

to something like:

SingleDatabase
    Task (Collection)
        {name: my_task, status: Completed, project: Project1, ...}
        {name: other_task, status: Started, project: Project2, ...}
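To make the target layout concrete, here is a minimal sketch of the merge using plain Python dictionaries as stand-ins for the databases. It is not MongoDB code: `merge_projects` is a hypothetical helper, and the nested-dict shape is only an assumption that mirrors the schema above.

```python
def merge_projects(databases):
    """Flatten {project: {collection: [docs]}} into {collection: [docs]}.

    Every document is copied and tagged with the project it came from.
    """
    merged = {}
    for project, collections in databases.items():
        for coll_name, docs in collections.items():
            target = merged.setdefault(coll_name, [])
            for doc in docs:
                tagged = dict(doc)           # copy so the source is not mutated
                tagged["project"] = project  # the new field to query against
                target.append(tagged)
    return merged


# Stand-in for the current one-database-per-project layout.
databases = {
    "Project1": {"Task": [{"name": "my_task", "status": "Completed"}]},
    "Project2": {"Task": [{"name": "other_task", "status": "Started"}]},
}

single = merge_projects(databases)

# A cross-project query becomes a single filter over one collection.
started = [d for d in single["Task"] if d["status"] == "Started"]
```

With this shape, per-project queries simply add a condition on the "project" field.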

My guess is that it would involve some tradeoffs in memory, disk usage, and write performance.
The problem is that I have no idea how large the impact would be, or whether it's worth doing at all.

The question is:
Is it possible to calculate what impact this decision could have on the server?
Something like: given X collections, X documents, X indexes... the server would on average have X/s slower writes, require X more memory, and so on.

Answer 1

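This is a highly theoretical question, and "theoretical" is a bad adjective when it comes to performance. Even if there were a consistent, established theory, you would have to account for a lot of factors (operations have history, there is no time invariance, you need a very detailed usage pattern, and so on), many non-linear effects (most algorithms are designed to achieve some log(n) or n log(n) behavior), discontinuities in the performance function (for example, when your RAM can no longer hold the indexes and lookups start going to disk), and hardware specifics (SSDs).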

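The fastest and most reliable way to find out how it behaves is to implement it. The implementation can be sloppy, hacky, and unpolished, but you can get a pretty good signal within a few hours.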
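A rough measurement harness along these lines might look like the sketch below. Everything in it is an assumption for illustration: `time_operation` is a hypothetical helper, and the two write functions are in-memory stand-ins that you would replace with real inserts against a test server for each layout.

```python
import statistics
import time


def time_operation(operation, repeats=1000):
    """Return the median wall-clock time (seconds) of calling operation()."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# In-memory stand-ins for the two layouts (replace with real database writes).
store_per_project = {"Project1": [], "Project2": []}
store_single = []


def write_per_project():
    store_per_project["Project1"].append(
        {"name": "my_task", "status": "Completed"}
    )


def write_single():
    store_single.append(
        {"name": "my_task", "status": "Completed", "project": "Project1"}
    )


per_project_s = time_operation(write_per_project)
single_s = time_operation(write_single)
print(f"per-project: {per_project_s:.2e} s, single: {single_s:.2e} s")
```

The median is used rather than the mean so that a few slow outliers do not dominate the result. For a meaningful signal, run it with realistic data volumes and indexes.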

Some theoretical input:

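In essence, using multiple databases is a form of bucketing: you have some code that can quickly determine which bucket is relevant. Within those buckets the indexes are smaller and therefore somewhat faster. On the other hand, search time should only grow with the log of the index size. Especially for large collections, this means there is practically no difference.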
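As a back-of-the-envelope illustration of that logarithmic growth, the sketch below estimates how many B-tree levels an index needs. The fanout of 100 keys per node and the workload of 100 projects with 10,000 documents each are assumptions picked for illustration, not measured values.

```python
def btree_depth(num_keys, fanout=100):
    """Approximate number of B-tree levels needed to index num_keys keys.

    Uses integer arithmetic to avoid floating-point rounding at exact powers.
    """
    depth, capacity = 1, fanout
    while capacity < num_keys:
        capacity *= fanout
        depth += 1
    return depth


docs_per_project = 10_000   # assumed size of one project's collection
num_projects = 100          # assumed number of projects

per_project_depth = btree_depth(docs_per_project)
merged_depth = btree_depth(docs_per_project * num_projects)

print(per_project_depth, merged_depth)
```

Under these assumptions the merged index holds 100 times as many keys but is only one level deeper.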

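As for Mongo, it will make more efficient use of disk space (unless you have severely fragmented your databases). MongoDB allocates a .ns file of 16 MB and at least a 64 MB data file for each database, even if you store only a few documents. So if you have a large number of small databases, your disk footprint should be better after the migration, despite the additional field.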
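The minimum allocation follows directly from those two figures. In the sketch below, the count of 500 databases is a made-up example, and the result covers only the fixed per-database overhead, not the documents themselves.

```python
NS_FILE_MB = 16        # namespace file per database, as stated above
MIN_DATA_FILE_MB = 64  # smallest data file per database, as stated above


def min_preallocated_mb(num_databases):
    """Lower bound on disk space allocated before any real data growth."""
    return num_databases * (NS_FILE_MB + MIN_DATA_FILE_MB)


before_mb = min_preallocated_mb(500)  # hypothetical: 500 project databases
after_mb = min_preallocated_mb(1)     # everything merged into one database

print(f"before: {before_mb} MB, after: {after_mb} MB")
```

These figures describe the file layout discussed in this answer; check how your MongoDB version and storage engine allocate files before relying on them.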

Changes to the RAM footprint should be negligible, but memory is such an intricate topic that I would not bet a dime.
