Searching for items in a many-to-many relationship

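I'm currently writing an application that allows people to store images and then tag those images. I'm using Python and the Peewee ORM (http://charlesleifer.com/docs/peewee/), which is very similar to Django's ORM.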

My data model looks like this (simplified):

class Image(BaseModel):
    key = CharField()

class Tag(BaseModel):
    tag = CharField()

class TagRelationship(BaseModel):
    relImage = ForeignKeyField(Image)
    relTag   = ForeignKeyField(Tag)

Now, I understand conceptually how to query for all images that have a given set of tags:

SELECT Image.key
  FROM Image
INNER JOIN TagRelationship
    ON Image.ID = TagRelationship.ImageID
INNER JOIN Tag
    ON TagRelationship.TagID = Tag.ID
 WHERE Tag.tag
       IN ('A', 'B')       -- list of multiple tags
GROUP BY Image.key
HAVING COUNT(*) = 2         -- where 2 == the number of tags specified, above

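However, I also want to be able to do more complex searches. Specifically, I'd like to be able to specify an "all tags" list, meaning an image must have every one of the specified tags, along with an "any" list and a "none" list.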

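Edit: I'd like to clarify this. Specifically, the query above is an "all tags" style query. It returns the images that have all of the given tags. I want to be able to specify something like: "Give me all images that have the tags (green, mountain), any one of the tags (background, landscape), but not the tags (digital, drawing)".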

Now, ideally, I'd like this to be one SQL query, because pagination then becomes very easy with LIMIT and OFFSET. I've actually got an implementation working whereby I just load everything into Python sets and then use the various intersection operators. What I'm wondering is whether there's a way to do this all at once.
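
For reference, the set-based approach described above might look roughly like the following. This is a minimal sketch, not the actual implementation from the question: the `tags_by_image` mapping and the `search` helper are hypothetical names, and it assumes every image's tags have already been loaded into memory.

```python
def search(tags_by_image, all_tags=(), any_tags=(), none_tags=()):
    """Return the keys of images matching the all/any/none tag lists.

    tags_by_image is a hypothetical dict mapping image key -> set of tags.
    """
    all_tags, any_tags, none_tags = set(all_tags), set(any_tags), set(none_tags)
    matches = []
    for key, tags in tags_by_image.items():
        if not all_tags <= tags:                # a mandatory tag is missing
            continue
        if any_tags and not (any_tags & tags):  # none of the optional tags
            continue
        if none_tags & tags:                    # a prohibited tag is present
            continue
        matches.append(key)
    return sorted(matches)


tags_by_image = {
    "img1": {"green", "mountain", "landscape"},
    "img2": {"green", "mountain", "landscape", "digital"},
    "img3": {"green", "background"},
}
result = search(
    tags_by_image,
    all_tags=["green", "mountain"],
    any_tags=["background", "landscape"],
    none_tags=["digital", "drawing"],
)
print(result)
```

The drawback is the one already noted: every row has to be pulled into Python before filtering, so pagination cannot be pushed down to the database.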

Also, for those interested, I've emailed the author of Peewee about how to represent the above query using Peewee, and he responded with the following solution:

Image.select(['key']).group_by('key').join(TagRelationship).join(Tag).where(tag__in=['tag1', 'tag2']).having('count(*) = 2')

Or, a shorter version:

Image.filter(tagrelationship_set__relTag__tag__in=['tag1', 'tag2']).group_by(Image).having('count(*) = 2')

Thanks in advance.

Best answer
SELECT Image.key
  FROM Image
  JOIN TagRelationship
    ON Image.ID = TagRelationship.ImageID
  JOIN Tag
    ON TagRelationship.TagID = Tag.ID
 GROUP BY Image.key
HAVING SUM(Tag.tag IN (mandatory tags )) = N  /*the number of mandatory tags*/
   AND SUM(Tag.tag IN (optional tags  )) > 0
   AND SUM(Tag.tag IN (prohibited tags)) = 0

UPDATE

A more universally accepted version of the above query (converts the boolean results of the IN predicates into integers using CASE expressions):

SELECT Image.key
  FROM Image
  JOIN TagRelationship
    ON Image.ID = TagRelationship.ImageID
  JOIN Tag
    ON TagRelationship.TagID = Tag.ID
 GROUP BY Image.key
HAVING SUM(CASE WHEN Tag.tag IN (mandatory tags ) THEN 1 ELSE 0 END) = N  /*the number of mandatory tags*/
   AND SUM(CASE WHEN Tag.tag IN (optional tags  ) THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN Tag.tag IN (prohibited tags) THEN 1 ELSE 0 END) = 0

Or with COUNTs instead of SUMs:

SELECT Image.key
  FROM Image
  JOIN TagRelationship
    ON Image.ID = TagRelationship.ImageID
  JOIN Tag
    ON TagRelationship.TagID = Tag.ID
 GROUP BY Image.key
HAVING COUNT(CASE WHEN Tag.tag IN (mandatory tags ) THEN 1 END) = N  /*the number of mandatory tags*/
   AND COUNT(CASE WHEN Tag.tag IN (optional tags  ) THEN 1 END) > 0
   AND COUNT(CASE WHEN Tag.tag IN (prohibited tags) THEN 1 END) = 0
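
To try the COUNT variant end to end, here is a minimal sketch using Python's built-in sqlite3 module. Several things are assumptions rather than part of the answer: the table and column names simply mirror the SQL above, `build_query` is a hypothetical helper, each (image, tag) pair is stored only once, and all three tag lists are non-empty. The tags are bound as parameters instead of being pasted into the SQL text.

```python
import sqlite3


def build_query(all_tags, any_tags, none_tags, limit=20, offset=0):
    """Build the conditional-aggregation query with bound parameters."""
    def tag_count(tags):
        # COUNT ignores NULLs, so only rows whose tag is in the list count.
        marks = ", ".join("?" for _ in tags)
        return f"COUNT(CASE WHEN Tag.tag IN ({marks}) THEN 1 END)"

    sql = f"""
        SELECT Image.key
          FROM Image
          JOIN TagRelationship ON Image.ID = TagRelationship.ImageID
          JOIN Tag ON TagRelationship.TagID = Tag.ID
         GROUP BY Image.key
        HAVING {tag_count(all_tags)} = ?
           AND {tag_count(any_tags)} > 0
           AND {tag_count(none_tags)} = 0
         ORDER BY Image.key
         LIMIT ? OFFSET ?
    """
    # Parameters follow the order in which the placeholders appear.
    params = [*all_tags, len(all_tags), *any_tags, *none_tags, limit, offset]
    return sql, params


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Image (ID INTEGER PRIMARY KEY, key TEXT);
    CREATE TABLE Tag (ID INTEGER PRIMARY KEY, tag TEXT UNIQUE);
    CREATE TABLE TagRelationship (ImageID INTEGER, TagID INTEGER,
                                  UNIQUE (ImageID, TagID));
""")
images = {
    "img1": ["green", "mountain", "landscape"],
    "img2": ["green", "mountain", "landscape", "digital"],
    "img3": ["green", "background"],
    "img4": ["green", "mountain", "background", "landscape"],
}
for key, tags in images.items():
    image_id = conn.execute(
        "INSERT INTO Image (key) VALUES (?)", (key,)).lastrowid
    for tag in tags:
        conn.execute("INSERT OR IGNORE INTO Tag (tag) VALUES (?)", (tag,))
        tag_id = conn.execute(
            "SELECT ID FROM Tag WHERE tag = ?", (tag,)).fetchone()[0]
        conn.execute(
            "INSERT INTO TagRelationship VALUES (?, ?)", (image_id, tag_id))

sql, params = build_query(
    ["green", "mountain"], ["background", "landscape"], ["digital", "drawing"])
keys = [row[0] for row in conn.execute(sql, params)]
print(keys)
```

Because the whole filter lives in one statement, the LIMIT and OFFSET mentioned in the question can be applied directly; the ORDER BY is added here only so that pages come back in a stable order.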
Other answers

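The top half finds the images that match the mandatory tags. The bottom half finds the images that have at least one of the optional tags. The bottom query has no GROUP BY because I want to know whether an image appears twice; if it does, it has both background and landscape. Ordering by that count therefore ranks images tagged green, mountain, background and landscape as most relevant, followed by images tagged green, mountain, and either background or landscape.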

SELECT Image.key, count(*) AS relevance
FROM
     (SELECT Image.key
      FROM
        --good image candidates
        (SELECT Image.key
         FROM Image
         WHERE Image.key NOT IN 
            --Bad Images
            (SELECT DISTINCT(Image.key)   --Will reduce size of set, remove duplicates
             FROM Image
             INNER JOIN TagRelationship
                ON Image.ID = TagRelationship.ImageID
             INNER JOIN Tag
                ON TagRelationship.TagID = Tag.ID
              WHERE Tag.tag
                   IN ('digital', 'drawing')))
    INNER JOIN TagRelationship
        ON Image.ID = TagRelationship.ImageID
    INNER JOIN Tag
        ON TagRelationship.TagID = Tag.ID
    WHERE Tag.tag
           IN ('green', 'mountain')
    GROUP BY Image.key
    HAVING COUNT(*) = 2   --the number of mandatory tags listed above
    --we need green AND mountain

    UNION ALL

    --Get all images with one of the following 2 tags
    SELECT * 
    FROM
        (SELECT Image.key
         FROM Image
         INNER JOIN TagRelationship
             ON Image.ID = TagRelationship.ImageID
         INNER JOIN Tag
             ON TagRelationship.TagID = Tag.ID
          WHERE Tag.tag
             IN ('background', 'landscape'))
)
GROUP BY Image.key
ORDER BY relevance DESC
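
A similar relevance ordering can probably be obtained without UNION ALL by counting the matched optional tags per image. The sketch below is not the query from this answer: it swaps in the conditional aggregation from the best answer, runs on Python's built-in sqlite3 module, and uses made-up sample rows and image keys purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Image (ID INTEGER PRIMARY KEY, key TEXT);
    CREATE TABLE Tag (ID INTEGER PRIMARY KEY, tag TEXT);
    CREATE TABLE TagRelationship (ImageID INTEGER, TagID INTEGER);
    INSERT INTO Image VALUES
        (1, 'one_optional'), (2, 'both_optional'), (3, 'prohibited');
    INSERT INTO Tag VALUES
        (1, 'green'), (2, 'mountain'), (3, 'background'),
        (4, 'landscape'), (5, 'digital');
    INSERT INTO TagRelationship VALUES
        (1, 1), (1, 2), (1, 3),
        (2, 1), (2, 2), (2, 3), (2, 4),
        (3, 1), (3, 2), (3, 4), (3, 5);
""")

# relevance = how many of the optional tags the image carries
RANKED_SQL = """
    SELECT Image.key,
           COUNT(CASE WHEN Tag.tag IN ('background', 'landscape')
                      THEN 1 END) AS relevance
      FROM Image
      JOIN TagRelationship ON Image.ID = TagRelationship.ImageID
      JOIN Tag ON TagRelationship.TagID = Tag.ID
     GROUP BY Image.key
    HAVING COUNT(CASE WHEN Tag.tag IN ('green', 'mountain') THEN 1 END) = 2
       AND COUNT(CASE WHEN Tag.tag IN ('background', 'landscape') THEN 1 END) > 0
       AND COUNT(CASE WHEN Tag.tag IN ('digital', 'drawing') THEN 1 END) = 0
     ORDER BY relevance DESC, Image.key
"""
ranked = conn.execute(RANKED_SQL).fetchall()
print(ranked)
```

Here the image carrying both optional tags sorts first, and the image with a prohibited tag is dropped entirely rather than only being excluded from one half of a union.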

The following query should return all images that are related to (A and B) and (C OR D) but not to E or F.

SELECT Image.key
FROM Image
INNER JOIN TagRelationship
    ON Image.ID = TagRelationship.ImageID
INNER JOIN Tag tag1
    ON TagRelationship.TagID = tag1.ID
INNER JOIN Tag tag2
    ON TagRelationship.TagID = tag2.ID
WHERE tag1.tag
    IN ('A', 'B')
AND tag2.tag NOT IN ('E', 'F')

GROUP BY Image.key
HAVING COUNT(*) = 2 

UNION

SELECT Image.key
FROM Image
INNER JOIN TagRelationship
    ON Image.ID = TagRelationship.ImageID
INNER JOIN Tag tag1
    ON TagRelationship.TagID = tag1.ID
INNER JOIN Tag tag2
    ON TagRelationship.TagID = tag2.ID
WHERE tag1.tag
   IN ('C', 'D')
AND tag2.tag NOT IN ('E', 'F')



