English 中文(简体)
Multitenant DB: Why put a TenantID column in every table?
原标题:

Every tutorial I ve seen about Multitenant database models tells you to put the TenantID in every single table:

zoos
-------
id
zoo_name
tenant_id

animals
-------
id
zoo_id
animal_name
tenant_id

However, this seems redundant to me. Why not add the tenant_id column to just the zoos table and exploit the foreign key relationship between zoos and animals?

Do you add tenant_id to every table just to keep the joins from getting too crazy? Is it a safeguard against bugs? A performance consideration?

最佳回答

If I had tenantID at the top of the hierarchy (i.e. at the zoo level) you have several issues to consider.

  1. The top of the hierarchy can never change, for example if you need to add a node on the tree above the zoo level (say regions -> zoos -> animals) then it will force a re-org every time.
  2. For certain queries, you will be forced to start at the top of the hierarchy, i.e. give me a list of all animals available will force you to start at the top of the tree
  3. Why not use schemas ? Each tenant is isolated within their own schema. This will also separate the data-sets nicely.
问题回答

If one of your key design considerations is security--specifically, one client can no way no how no when access another client s data--then, depending on how you implement this security, sticking that qualifying column in every table may be necessary. One such tactic described here requires building a view on every table; assuming each table contains a tenantId column, then if properly configured each view could contain a "WHERE tenantId = SUSER_SID()" clause (and of course you configure the database so that clients can only access the views).

Another factor (as in my current job) is loading warehouse data (ETL). Tables are partitioned on tenantId (we use table partitioning, but partitioned views would also work), and data can be easily loaded or unloaded for a client without seriously impacting any other client.

But as ever, there s a lot of "it depends" involved. If there is no clear and present need, and a very low likelihood of future need, then normalize that column out. Just realize that it s more a devise of physical implementation than of conceptual or logical database design.

Its there for convenience and performance - in terms of normalisation you re absolutely right, it only needs to go in at the top. The problem then becomes that to get to some data (say zoo -> animal -> food -> supplier) you have to have horribly complex joins on what are notionally very simple queries.

So in the real world one has to compromise - question then becomes where and to what extent.

See this article Maybe Normalizing Isn t Normal - and its conclusion:

As the old adage goes, normalize until it hurts, denormalize until it works

as a place to start exploring the subject

The first thing that springs to mind is that it s slower to look up animals > zoos > tenants than simply animals > tenants. And most likely this is a lookup you will do often (for example, "get all animals for a certain tenant, regardless of zoo").

For small to mid-sized applications you can get away with a more normalized structure, but for the sake of efficiency, you should go with extraneous data (and generally speaking, multitenancy applications are not small). Just make sure it doesn t go "out of sync", which is a risk that comes with having redundant data.

To answer your last paragraph, the reason is performance, pure and simple. Joins are no bad thing; they help you keep a piece of data in one place rather than three. It s definitely not to prevent bugs. Adding a tenant_id field to more tables will increase the risk of bugs (although for an id that never changes, it wouldn t be as much of an issue).

Well, Bob may own a giraffe in zoo No1, while Joe may own a lion in the same zoo. They are not supposed to look at each others data.

The reason N1 is for security.

Security need to be a strong concept in multi-tenant application.

Suppose that you give a user the ability to modify a Animal. You create a form whith a select that show the zoo fo the current tenant. What happen if the user hack the form and pass a zoo id of another tenant?

The animal will be moved to another zoo of another tenant!!

This is real pain in a multi tenant app!





相关问题
How to model a many-to-many relationship in App Engine?

I have a question regarding how to model a many-to-many relationship in App Engine: A Blogentry can have many tags, a tag can apply to many blog entries. I see a couple of scenarios: Use a Set of ...

How to emulate tagged union in a database?

What is the best way to emulate Tagged union in databases? I m talking about something like this: create table t1 { vehicle_id INTEGER NOT NULL REFERENCES car(id) OR motor(id) -- not valid ... } ...

Users asking for denormalized database

I am in the early stages of developing a database-driven system and the largest part of the system revolves around an inheritance type of relationship. There is a parent entity with about 10 columns ...

How to best implement a 1:1 relationship in a RDBMS?

Yesterday while working on a project I came up on a peculiar 1:1 relationship which left me wondering - how to best implement this (clearly, we had done it wrong :D) The idea is that there are two ...

Automatic filling entity properties

I have some type an architect(?) question I develop an application, based on Spring and Hibernate (annotated configuration) For each table in my database I added 4 fields: createdBy and modifiedBy(...

热门标签