Inserting rows from one table to another: which SQL is more efficient (outer join vs. sequential scan)?

I need to copy over rows from Table B to Table A. The requirement is to only insert rows that are not already in A.

My question is, which of the following two is more efficient:

A)

   INSERT INTO A (x, y, z)
   SELECT x, y, z
   FROM B b
   WHERE b.id NOT IN (SELECT id FROM A);

B)

   INSERT INTO A (x, y, z)
   SELECT b.x, b.y, b.z
   FROM B b LEFT OUTER JOIN A a
     ON b.id = a.id
   WHERE a.id is NULL;

I am assuming the answer depends upon the size of the tables. But I wanted to know if there is something glaringly obvious about using one approach over the other.

To reduce the vagueness, let's say Table B will have fewer than 50K rows, and Table A will always be 1 to 5 times the size of Table B.

If anyone has any other more efficient ways to do this, do tell.

Best answer

I think option B is better, especially if Table A is bigger than Table B by a factor > 1.

If you have indexes on a.id and b.id, then the join will be faster, IMHO, than evaluating the WHERE subquery for each row.

Answers

To add another option:

INSERT INTO A (x, y, z)
SELECT B.x, B.y, B.z
FROM B
WHERE NOT EXISTS(SELECT * FROM A WHERE A.id = B.id)

I usually go with the LEFT JOIN approach. But, if you want to know truly what is most efficient, run some tests on your environment. See what the execution plans for each approach are (you may find that multiple approaches actually result in the same execution plan).
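As a rough sketch of the "look at the execution plans" advice, the snippet below prints the plan for each of the three SELECT variants. It uses SQLite through Python's standard `sqlite3` module purely because it is easy to run anywhere; the schema is an assumption (only the column names come from the question), and plans on your own engine and data will differ, so treat it as a template rather than a result.

```python
import sqlite3

# Assumed schema: only the column names (id, x, y, z) come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, x TEXT, y TEXT, z TEXT);
    CREATE TABLE B (id INTEGER PRIMARY KEY, x TEXT, y TEXT, z TEXT);
""")

# The SELECT part of each approach discussed in this thread.
candidates = {
    "NOT IN": "SELECT x, y, z FROM B b WHERE b.id NOT IN (SELECT id FROM A)",
    "LEFT JOIN": "SELECT b.x, b.y, b.z FROM B b LEFT OUTER JOIN A a "
                 "ON b.id = a.id WHERE a.id IS NULL",
    "NOT EXISTS": "SELECT b.x, b.y, b.z FROM B b "
                  "WHERE NOT EXISTS (SELECT * FROM A WHERE A.id = b.id)",
}

def query_plan(connection, select_sql):
    """Return the human-readable steps of SQLite's plan for a SELECT."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + select_sql).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description

plans = {name: query_plan(conn, sql) for name, sql in candidates.items()}
for name, steps in plans.items():
    print(name)
    for step in steps:
        print("   ", step)
```

On other engines the equivalent is the engine's own plan viewer (for example `EXPLAIN` or a graphical execution plan), run against tables with realistic row counts and indexes.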

It shouldn't matter - a good optimizer will treat these identically. In practice, I have seen quirky execution plans in exactly this case, but I have been known to use both styles interchangeably, depending on mood, readability, and complexity of the query.
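One caveat to "identically": the forms are only interchangeable when A.id cannot be NULL. If the subquery in the NOT IN version returns even one NULL, the comparison never evaluates to true and no rows are selected, while the LEFT JOIN and NOT EXISTS versions still find the missing rows. The sketch below shows this in SQLite via Python's `sqlite3`; the nullable `id` column and the sample rows are assumptions made for the demonstration, since the question does not say whether `id` is a key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumption for the demo: id is deliberately nullable in A.
    CREATE TABLE A (id INTEGER, x TEXT);
    CREATE TABLE B (id INTEGER, x TEXT);
    INSERT INTO A (id, x) VALUES (1, 'a1'), (NULL, 'a-null');
    INSERT INTO B (id, x) VALUES (1, 'b1'), (2, 'b2');
""")

def missing_ids(select_sql):
    """Run a SELECT that returns B.id values not found in A."""
    return [row[0] for row in conn.execute(select_sql)]

not_in = missing_ids(
    "SELECT b.id FROM B b WHERE b.id NOT IN (SELECT id FROM A) ORDER BY b.id")
left_join = missing_ids(
    "SELECT b.id FROM B b LEFT OUTER JOIN A a ON b.id = a.id "
    "WHERE a.id IS NULL ORDER BY b.id")
not_exists = missing_ids(
    "SELECT b.id FROM B b WHERE NOT EXISTS "
    "(SELECT * FROM A WHERE A.id = b.id) ORDER BY b.id")

print(not_in)      # [] because the NULL in A makes "2 NOT IN (...)" unknown
print(left_join)   # [2]
print(not_exists)  # [2]
```

If `id` is a primary key or otherwise declared NOT NULL, this difference does not arise and the choice comes back to plans and readability.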

In SQL Server, option A is not available when you need to match on a tuple of more than a single column without using some kind of concatenation workaround (which I do not recommend). That brings us to cat-skinning option C (which I also use, especially when the joins are really squirrelly), which extends to tuples directly:

INSERT INTO A (x, y, z) 
SELECT x, y, z 
FROM B b 
WHERE NOT EXISTS (SELECT * FROM A WHERE id = b.id); 

INSERT INTO A (x, y, z) 
SELECT x, y, z 
FROM B b 
WHERE NOT EXISTS (SELECT * FROM A WHERE id1 = b.id1 AND id2 = b.id2); 
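A minimal runnable sketch of the two-column NOT EXISTS insert, again using SQLite through Python's `sqlite3` for convenience. The table layout, the single payload column `x`, and the sample rows are assumptions; unlike the snippet above, the key columns are copied too so that the composite key in A is populated. Running the statement twice shows that the second run inserts nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed layout: a two-column key (id1, id2) plus one payload column.
    CREATE TABLE A (id1 INTEGER, id2 INTEGER, x TEXT, PRIMARY KEY (id1, id2));
    CREATE TABLE B (id1 INTEGER, id2 INTEGER, x TEXT, PRIMARY KEY (id1, id2));
    INSERT INTO A VALUES (1, 1, 'a'), (1, 2, 'a');
    INSERT INTO B VALUES (1, 2, 'b'), (2, 1, 'b'), (2, 2, 'b');
""")

# Copy only the rows of B whose (id1, id2) pair is not yet present in A.
COPY_MISSING = """
    INSERT INTO A (id1, id2, x)
    SELECT b.id1, b.id2, b.x
    FROM B b
    WHERE NOT EXISTS (
        SELECT 1 FROM A a WHERE a.id1 = b.id1 AND a.id2 = b.id2
    )
"""

def count_a():
    """Number of rows currently in A."""
    return conn.execute("SELECT COUNT(*) FROM A").fetchone()[0]

before = count_a()
conn.execute(COPY_MISSING)   # inserts (2, 1) and (2, 2)
after_first = count_a()
conn.execute(COPY_MISSING)   # nothing left to insert
after_second = count_a()

print(before, after_first, after_second)  # 2 4 4
```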

Depending on the number of rows and the activity on the database, it would help a lot to drop all indexes on the table before the insert and recreate them afterwards.




