PostgreSQL: select from 2 tables, but only the latest element from table 2

Hey, I have 2 tables in PostgreSQL:

1 - documents: id, title
2 - updates: id, document_id, date

and some data:

documents:

| 1 | Test Title |

updates:

| 1 | 1 | 2006-01-01 |
| 2 | 1 | 2007-01-01 |
| 3 | 1 | 2008-01-01 |

So all updates point to the same document, but each has a different date.

What I am trying to do is select from the documents table, but also include the latest update based on the date.

What should a query like this look like? This is the one I currently have, but it lists all updates instead of only the latest one, which is what I need:

SELECT * FROM documents,updates WHERE documents.id=1 AND documents.id=updates.document_id ORDER BY date

To add: the reason I need this in the query is that I want to order by the date from the updates table!

Edit: This script is heavily simplified, so I should be able to create a query that returns any number of results, each including the latest update date. I was thinking of using an inner join or left join or something like that!?

Best answer

You may create a derived table which contains only the most recent "updates" records per document_id, and then join "documents" against that:

SELECT d.id, d.title, u.update_id, u."date"
FROM documents d
-- JOIN "documents" against the most recent update per document_id
LEFT JOIN (
    SELECT recent.document_id, updates.id AS update_id, recent."date"
    FROM updates
    INNER JOIN (
        SELECT document_id, MAX("date") AS "date"
        FROM updates
        GROUP BY document_id
    ) recent
      ON updates.document_id = recent.document_id
    WHERE updates."date" = recent."date"
) u
  ON d.id = u.document_id;

This will handle "un-updated" documents, like so:

pg=> select * from documents;
 id | title 
----+-------
  1 | foo
  2 | bar
  3 | baz
(3 rows)

pg=> select * from updates;
 id | document_id |    date    
----+-------------+------------
  1 |           1 | 2009-10-30
  2 |           1 | 2009-11-04
  3 |           1 | 2009-11-07
  4 |           2 | 2009-11-09
(4 rows)

pg=> SELECT d.id ...
 id | title | update_id |    date    
----+-------+-----------+------------
  1 | foo   |         3 | 2009-11-07
  2 | bar   |         4 | 2009-11-09
  3 | baz   |           | 
(3 rows)
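
For anyone who wants to try this without a PostgreSQL server, here is a minimal sketch that runs the same derived-table join against the sample rows above using Python's built-in sqlite3 module. SQLite standing in for PostgreSQL is an assumption of this sketch, as are the helper name latest_update_per_document, the ORDER BY d.id added for a stable row order, and storing the dates as ISO text.

```python
import sqlite3

def latest_update_per_document(conn):
    """Return one row per document with its most recent update, if any."""
    sql = """
        SELECT d.id, d.title, u.update_id, u."date"
        FROM documents d
        LEFT JOIN (
            -- keep only the update rows whose date is the per-document maximum
            SELECT updates.document_id, updates.id AS update_id, updates."date"
            FROM updates
            INNER JOIN (
                SELECT document_id, MAX("date") AS "date"
                FROM updates
                GROUP BY document_id
            ) recent
              ON updates.document_id = recent.document_id
             AND updates."date" = recent."date"
        ) u ON d.id = u.document_id
        ORDER BY d.id  -- added here only to make the output order stable
    """
    return conn.execute(sql).fetchall()

# In-memory database holding the sample rows from the answer (dates as text).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE updates (id INTEGER PRIMARY KEY, document_id INTEGER, "date" TEXT);
    INSERT INTO documents VALUES (1, 'foo'), (2, 'bar'), (3, 'baz');
    INSERT INTO updates VALUES
        (1, 1, '2009-10-30'), (2, 1, '2009-11-04'),
        (3, 1, '2009-11-07'), (4, 2, '2009-11-09');
""")
rows = latest_update_per_document(conn)
for row in rows:
    print(row)
```

One thing to keep in mind with this approach: if two updates for the same document share the same latest date, the join returns both of them, so the document appears twice.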

Other answers

Use the PostgreSQL extension DISTINCT ON:

SELECT  DISTINCT ON (documents.id) *
FROM    documents
JOIN    updates
ON      updates.document_id = documents.id
ORDER BY
        documents.id, updates.date DESC

This takes the first row from each group of rows sharing the same documents.id, in ORDER BY order. With updates.date DESC, that first row is the latest update.
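
To make the "first row per group" behaviour concrete, here is a rough model of it in plain Python rather than SQL. It is only a sketch of the semantics, not of how PostgreSQL executes the query; the function name latest_per_document and the tuple layout (document id, title, update id, update date) are assumptions made for the illustration, and the rows are illustrative sample values.

```python
from itertools import groupby
from operator import itemgetter

def latest_per_document(rows):
    """Model DISTINCT ON (documents.id) ... ORDER BY documents.id, updates.date DESC."""
    # Emulate the mixed-direction ORDER BY with two stable sorts:
    # first by date descending, then by document id ascending.
    by_date_desc = sorted(rows, key=itemgetter(3), reverse=True)
    ordered = sorted(by_date_desc, key=itemgetter(0))
    # DISTINCT ON keeps only the first row of each document id group.
    return [next(group) for _, group in groupby(ordered, key=itemgetter(0))]

# Joined rows: (document id, title, update id, update date as ISO text).
joined = [
    (1, "Test Title", 1, "2006-01-01"),
    (1, "Test Title", 2, "2007-01-01"),
    (1, "Test Title", 3, "2008-01-01"),
    (2, "Test Title 2", 4, "2009-01-01"),
    (2, "Test Title 2", 5, "2010-01-01"),
]
latest = latest_per_document(joined)
for row in latest:
    print(row)
```

ISO-formatted date strings sort in chronological order, which is why the sketch can compare them as plain text.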

Test script to check:

SELECT  DISTINCT ON (documents.id) *
FROM    (
        VALUES
        (1, 'Test Title'),
        (2, 'Test Title 2')
        ) documents (id, title)
JOIN    (
        VALUES
        (1, 1, '2006-01-01'::DATE),
        (2, 1, '2007-01-01'::DATE),
        (3, 1, '2008-01-01'::DATE),
        (4, 2, '2009-01-01'::DATE),
        (5, 2, '2010-01-01'::DATE)
        ) updates (id, document_id, date)
ON      updates.document_id = documents.id
ORDER BY
        documents.id, updates.date DESC

select *
from documents
left join updates
  on updates.document_id=documents.id
  and updates.date=(select max(date) from updates where document_id=documents.id)
where documents.id=?;

This has some advantages over the previous answers:

  • you write the document id in only one place, which is convenient;
  • you can omit the where clause and you'll get a table of all documents and their latest updates;
  • you can use broader selection criteria, for example where documents.id in (1,2,3).
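
The last two points can be checked with a small sketch. It runs the same left join with a correlated subquery through Python's built-in sqlite3 module, reusing the foo/bar/baz sample rows shown earlier on this page. SQLite in place of PostgreSQL, the u2 alias, the BASE_QUERY name, the explicit column list, and the ORDER BY clauses are all assumptions of this sketch.

```python
import sqlite3

# The join from the answer, without any WHERE clause. The inner table is
# aliased as u2 (an illustrative name) to keep it apart from the outer one.
BASE_QUERY = """
    SELECT documents.id, documents.title, updates.id, updates."date"
    FROM documents
    LEFT JOIN updates
      ON updates.document_id = documents.id
     AND updates."date" = (SELECT MAX(u2."date") FROM updates AS u2
                           WHERE u2.document_id = documents.id)
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE updates (id INTEGER PRIMARY KEY, document_id INTEGER, "date" TEXT);
    INSERT INTO documents VALUES (1, 'foo'), (2, 'bar'), (3, 'baz');
    INSERT INTO updates VALUES
        (1, 1, '2009-10-30'), (2, 1, '2009-11-04'),
        (3, 1, '2009-11-07'), (4, 2, '2009-11-09');
""")

# No WHERE clause: every document, each with its latest update (or NULLs).
all_docs = conn.execute(BASE_QUERY + " ORDER BY documents.id").fetchall()

# Broader criteria: only the filter changes, the join stays the same.
some_docs = conn.execute(
    BASE_QUERY + " WHERE documents.id IN (1, 3) ORDER BY documents.id"
).fetchall()

print(all_docs)
print(some_docs)
```

Document 3 has no updates, so the subquery yields NULL for it and the left join fills the update columns with NULL (None in Python).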

You can also avoid a subselect by using group by, but you'll have to list all fields of documents in the group by clause:

select documents.*, max(date) as max_date
  from documents
  left join updates on documents.id=document_id
  where documents.id=1
  group by documents.id, title;

Off the top of my head:

ORDER BY date DESC LIMIT 1

If you really want only id 1, you can use this query:

SELECT * FROM documents,updates 
    WHERE documents.id=1 AND updates.document_id=1 
    ORDER BY date DESC LIMIT 1

http://www.postgresql.org/docs/8.4/interactive/queries-limit.html

This should also work, as long as the subquery is limited to the same document:

SELECT * FROM documents, updates 
    WHERE documents.id=1 AND updates.document_id=1
    AND updates.date = (SELECT MAX(date) FROM updates WHERE document_id=1)



