How can I get a random cartesian product in PostgreSQL?

I have two tables, custassets and tags. To generate some test data I'd like to do an INSERT INTO a many-to-many table with a SELECT that gets random rows from each (so that a random primary key from one table is paired with a random primary key from the second). To my surprise this isn't as easy as I first thought, so I'm persisting with it to teach myself.

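Here is my first attempt. I select 10 custassets and 3 tags, but the same tags come back for every asset. I'd be fine with the first table being fixed, but I'd like to randomise the tags assigned.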

SELECT
    custassets_rand.id custassets_id,
    tags_rand.id tags_rand_id
FROM
    (
        SELECT id FROM custassets WHERE defunct = false ORDER BY RANDOM() LIMIT 10
    ) AS custassets_rand
,
    (
        SELECT id FROM tags WHERE defunct = false ORDER BY RANDOM() LIMIT 3
    ) AS tags_rand

This produces:

custassets_id | tags_rand_id 
---------------+--------------
          9849 |         3322  }
          9849 |         4871  } this pattern of tag PKs is repeated
          9849 |         5188  }
         12145 |         3322
         12145 |         4871
         12145 |         5188
         17837 |         3322
         17837 |         4871
         17837 |         5188
....

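I then tried the following approach: doing the second RANDOM() call in the SELECT column list. However, this was worse, as it chooses a single tag PK and sticks with it.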

SELECT
    custassets_rand.id custassets_id,
    (SELECT id FROM tags WHERE defunct = false ORDER BY RANDOM() LIMIT 1) tags_rand_id
FROM
    (
        SELECT id FROM custassets WHERE defunct = false ORDER BY RANDOM() LIMIT 30
    ) AS custassets_rand

Result:

 custassets_id | tags_rand_id 
---------------+--------------
         16694 |         1537
         14204 |         1537
         23823 |         1537
         34799 |         1537
         36388 |         1537
....

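This would be easy in a scripting language, and I'm sure it can be done with a stored procedure or a temporary table. But can I do it with just an INSERT INTO SELECT?

I did think of choosing integer primary keys using a random function, but unfortunately the primary keys of both tables have gaps in their increment sequences (so a non-existent row might be chosen in each table). Otherwise that would have been fine!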

Best answer

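Note that what you are looking for is not a Cartesian product, which would produce n*m rows; rather, it is a random 1:1 association, which produces at most as many rows as the larger table has.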

To produce truly random combinations, it is enough to randomize rn for the larger set:

SELECT c_id, t_id
FROM  (
   SELECT id AS c_id, row_number() OVER (ORDER BY random()) AS rn
   FROM   custassets
   ) x
JOIN   (SELECT id AS t_id, row_number() OVER () AS rn FROM tags) y USING (rn);

If arbitrary combinations are good enough, this is faster (especially for big tables):

SELECT c_id, t_id
FROM   (SELECT id AS c_id, row_number() OVER () AS rn FROM custassets) x
JOIN   (SELECT id AS t_id, row_number() OVER () AS rn FROM tags) y USING (rn);
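To tie this back to the INSERT INTO ... SELECT asked about in the question, the row_number() pairing can feed the insert directly. The following is only a sketch, not part of the original answer: the join table name custassets_tags and its columns custassets_id and tags_id are assumptions, and the defunct = false filter is carried over from the question's queries. Both sides are shuffled here, so which tags get used is random as well.

```sql
-- Hypothetical many-to-many table; its name and column names are assumptions.
INSERT INTO custassets_tags (custassets_id, tags_id)
SELECT c_id, t_id
FROM  (
   -- number the live assets in random order
   SELECT id AS c_id, row_number() OVER (ORDER BY random()) AS rn
   FROM   custassets
   WHERE  defunct = false
   ) x
JOIN  (
   -- number the live tags in random order
   SELECT id AS t_id, row_number() OVER (ORDER BY random()) AS rn
   FROM   tags
   WHERE  defunct = false
   ) y USING (rn);   -- pair rows that drew the same number
```

Because this is an inner join on rn, it inserts as many rows as the smaller of the two filtered sets contains, and each asset and each tag appears at most once.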

If the number of rows in the two tables does not match and you do not want to lose rows from the larger table, use the modulo operator % to join rows from the smaller table multiple times:

SELECT c_id, t_id
FROM  (
   SELECT id AS c_id, row_number() OVER () AS rn
   FROM   custassets -- table with fewer rows
   ) x
JOIN  (
   SELECT id AS t_id, (row_number() OVER () % small.ct) + 1 AS rn
   FROM   tags
       , (SELECT count(*) AS ct FROM custassets) AS small
   ) y USING (rn);

Window functions were added in PostgreSQL 8.4.
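As for the second attempt in the question: the scalar subquery there does not reference the outer row, so PostgreSQL can evaluate it once and reuse the result, which matches the single repeated tag in the output. A common workaround is to make the subquery depend on the outer row. The sketch below assumes that a dummy reference to the outer column is enough to force per-row evaluation; it is worth checking with EXPLAIN on your version that the subquery appears as a SubPlan rather than an InitPlan.

```sql
SELECT c.id AS custassets_id,
       (SELECT t.id
        FROM   tags t
        WHERE  t.defunct = false
        AND    c.id IS NOT NULL   -- dummy outer reference (assumption: makes the subquery correlated)
        ORDER  BY random()
        LIMIT  1) AS tags_rand_id
FROM  (
   SELECT id FROM custassets WHERE defunct = false ORDER BY random() LIMIT 30
   ) c;
```

Unlike the row_number() pairing, this can assign the same tag to several assets, and it sorts tags once per outer row, so it is likely to suit only small tag tables.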

Other answers

Here is a different approach to pick a single combination from two tables at random, assuming two tables a and b, both with primary key id. The tables needn't be of the same size, and the second row is chosen independently of the first, which might not be that important for test data.

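SELECT * FROM a, b 
 WHERE a.id = (
    SELECT id 
    FROM a 
    -- floor() keeps the offset in 0 .. count-1; plain rounding could reach count and return no row
    OFFSET (
        SELECT floor(random() * (SELECT count(*) FROM a))::bigint
    ) 
    LIMIT 1) 
 AND b.id = (
    SELECT id 
    FROM b 
    OFFSET (
        SELECT floor(random() * (SELECT count(*) FROM b))::bigint
    ) 
    LIMIT 1);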

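Tested with two tables, one with 7,000 rows and one with 100k rows, the result came back immediately. To get more than one result, you have to call the query repeatedly; increasing the LIMIT and changing x.id = to x.id IN would produce (aA, aB, bA, bB) result patterns.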

If you just want to get a random set of rows from each side, use a pseudo-random number generator. I would use something like:

select *
from (select a.*, row_number() over (order by NULL) as rownum -- NULL may not work, "(SELECT NULL)" works in MSSQL
      from a
     ) a cross join
     (select b.*,  row_number() over (order by NULL) as rownum
      from b
     ) b
where a.rownum <= 30 and b.rownum <= 30

This is a Cartesian product, which returns 900 rows, assuming each table has at least 30 rows.

However, I interpreted your question as getting random combinations. Once again, I'd go for the pseudo-random approach.

select *
from (select a.*, row_number() over (order by NULL) as rownum -- NULL may not work, "(SELECT NULL)" works in MSSQL
      from a
     ) a cross join
     (select b.*,  row_number() over (order by NULL) as rownum
      from b
     ) b
where mod(a.rownum*107+b.rownum*257+17, 101) < <some value>  -- mod() is the PostgreSQL modulo function

This lets you get combinations among arbitrary rows.

Just a plain Cartesian product joined ON random() appears to work reasonably well. Simple comme bonjour...

-- Cartesian product
-- EXPLAIN ANALYZE
INSERT INTO dirgraph(point_from,point_to,costs)
SELECT p1.the_point , p2.the_point, (1000*random() ) +1
FROM allpoints p1
JOIN allpoints p2 ON random() < 0.002
        ;



