English 中文(简体)
How to emulate tagged union in a database?
原标题:

What is the best way to emulate Tagged union in databases? I m talking about something like this:

create table t1 {
  vehicle_id INTEGER NOT NULL REFERENCES car(id) OR motor(id) -- not valid
  ...
}

where vehicle_id would be id in car table OR motor table, and it would know which.

(assume that motor and car tables have nothing in common0

最佳回答

Some people use a design called Polymorphic Associations to do this, allowing vehicle_id to contain a value that exists either in car or motor tables. Then add a vehicle_type that names the table which the given row in t1 references.

The trouble is that you can t declare a real SQL foreign key constraint if you do this. There s no support in SQL for a foreign key that has multiple reference targets. There are other problems, too, but the lack of referential integrity is already a deal-breaker.

A better design is to borrow a concept from OO design of a common supertype of both car and motor:

CREATE TABLE Identifiable (
 id SERIAL PRIMARY KEY
);

Then make t1 reference this super-type table:

CREATE TABLE t1 (
  vehicle_id INTEGER NOT NULL,
  FOREIGN KEY (vehicle_id) REFERENCES identifiable(id)
  ...
);

And also make the sub-types reference their parent supertype. Note that the primary key of the sub-types is not auto-incrementing. The parent supertype takes care of allocating a new id value, and the children only reference that value.

CREATE TABLE car (
  id INTEGER NOT NULL,
  FOREIGN KEY (id) REFERENCES identifiable(id)
  ...
);

CREATE TABLE motor (
  id INTEGER NOT NULL,
  FOREIGN KEY (id) REFERENCES identifiable(id)
  ...
);

Now you can have true referential integrity, but also support multiple subtype tables with their own attributes.


The answer by @Quassnoi also shows a method to enforce disjoint subtypes. That is, you want to prevent both car and motor from referencing the same row in their parent supertype table. When I do this, I use a single-column primary key for Identifiable.id but also declare a UNIQUE key over Identifiable.(id, type). The foreign keys in car and motor can reference the two-column unique key instead of the primary key.

问题回答
CREATE TABLE vehicle (type INT NOT NULL, id INT NOT NULL,
             PRIMARY KEY (type, id)
)

CREATE TABLE car (type INT NOT NULL DEFAULT 1, id INT NOT NULL PRIMARY KEY,
             CHECK(type = 1),
             FOREIGN KEY (type, id) REFERENCES vehicle
)

CREATE TABLE motorcycle (type INT NOT NULL DEFAULT 2, id INT NOT NULL PRIMARY KEY,
             CHECK(type = 2),
             FOREIGN KEY (type, id) REFERENCES vehicle
)

CREATE TABLE t1 (
  ...
  vehicle_type INT NOT NULL,
  vehicle_id INT NOT NULL,
  FOREIGN KEY (vehicle_type, vehicle_id) REFERENCES vehicle
  ...
)

I think the least-boilerplate solution is to use constraint and check.

For example, consider this ADT in Haskell:

data Shape = Circle {radius::Float} | Rectangle {width::Float, height::Float}

The equivalent in MySQL/MariaDB would be (tested on 10.5.11-MariaDB):

CREATE TABLE shape (        
  type ENUM( circle ,  rectangle ) NOT NULL,
  radius FLOAT,             
  width FLOAT,              
  height FLOAT,             
  CONSTRAINT constraint_circle CHECK 
    (type <>  circle  OR radius IS NOT NULL),
  CONSTRAINT constraint_rectangle CHECK 
    (type <>  rectangle  OR (width IS NOT NULL AND height IS NOT NULL))
);                                      
                                        
INSERT INTO shape(type, radius, width, height)
  VALUES ( circle , 1, NULL, NULL); -- ok
                                                                                                                    
INSERT INTO shape(type, radius, width, height)
  VALUES ( circle , NULL, 1, NULL); -- error, constraint_circle violated

Note that the above uses type <> x OR y instead of type = x AND y. This is because the latter essentially means that all rows must have type of x, which defeats the purpose of tagged union.

Also, note that the solution above only check for required columns, but does not check for extraneous columns.

For example, you can insert a rectangle which has defined radius.

This can be easily mitigated by adding another condition for constraint_rectangle, namely radius is null.

However, I will not recommend doing so as it makes adding new type tedious.

For example, to add a new type triangle with one new column base, not only we will need to add a new constraint, but we also need to modify the existing constraints to ensure that their base is null.

I think you could model such a reference by using table inheritance in PostgreSQL.

If you really need to know where a row comes from in a Query, you could use a simple UNION ALL statment like (this possibility has nothing to do with table inheritance):

SELECT car.*,  car  table_name
UNION ALL
SELECT motor.*,  motor  table_name




相关问题
SQL SubQuery getting particular column

I noticed that there were some threads with similar questions, and I did look through them but did not really get a convincing answer. Here s my question: The subquery below returns a Table with 3 ...

难以执行 REGEXP_SUBSTR

I m 查询Oracle 10g。 我有两张表格(样本数据见下文)。 i m 试图提取一些领域

SQL Query Shortcuts

What are some cool SQL shorthands that you know of? For example, something I learned today is you can specify to group by an index: SELECT col1, col2 FROM table GROUP BY 2 This will group by col2

PHP array callback functions for cleaning output

I have an array of output from a database. I am wondering what the cleanest way to filter the values is example array Array ( [0] => Array ( [title] => title 1 ...

OracleParameter and DBNull.Value

we have a table in an Oracle Database which contains a column with the type Char(3 Byte). Now we use a parameterized sql to select some rows with a DBNull.Value and it doesn t work: OracleCommand ...

Running numbers in SQL

I have a SQL-statement like this: SELECT name FROM users WHERE deleted = 0; How can i create a result set with a running number in the first row? So the result would look like this: 1 Name_1 2 ...

How to get SQL queries for each user where env is production

I’m developing an application dedicated to generate statistical reports, I would like that user after saving their stat report they save sql queries too. To do that I wrote the following module: ...

热门标签