Join operation with NOSQL

I have gone through some articles regarding Bigtable and NOSQL. It is very interesting that they avoid JOIN operations.

As a basic example, let's take Employee and Department tables and assume the data is spread across multiple tables / servers.

Just want to know, if data is spread across multiple servers, how do we do JOIN or UNION operations?

Answers

When you have extremely large data, you probably want to avoid joins. This is because the overhead of an individual key lookup is relatively large (the service needs to figure out which node(s) to query, and query them in parallel and wait for responses). By overhead, I mean latency, not throughput limitation.

This makes joins suck really badly, as you'd need to do a lot of foreign key lookups, which would (in many cases) end up going to many different nodes. So you'd want to avoid this as a pattern.
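To make the fan-out concrete, here is a minimal TypeScript sketch. Everything in it is an assumption for illustration: the node count, the toy hash in `nodeFor`, and the `dept:N` key format are made up, and real stores use their own partitioning schemes.

```typescript
// Hypothetical setup: keys are hash-partitioned over a fixed number of nodes.
const NODE_COUNT = 8;

// Toy string hash (illustration only, not what any particular store uses).
function nodeFor(key: string): number {
  let hash = 0;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 31 + key.charCodeAt(i)) >>> 0;
  }
  return hash % NODE_COUNT;
}

// Given the foreign keys a join has to resolve, list the distinct nodes
// that would need to be contacted.
function nodesTouched(foreignKeys: string[]): number[] {
  const seen: Record<number, boolean> = {};
  for (const key of foreignKeys) {
    seen[nodeFor(key)] = true;
  }
  return Object.keys(seen).map(Number).sort((a, b) => a - b);
}

// Looking up one department touches one node; resolving the department of
// every employee in a large result set can touch most of the cluster.
const departmentKeys: string[] = [];
for (let id = 1; id <= 20; id++) {
  departmentKeys.push("dept:" + id);
}
console.log("nodes for one lookup:", nodesTouched(["dept:1"]));
console.log("nodes for the whole join:", nodesTouched(departmentKeys));
```

Each extra node is another network round trip whose latency the query has to wait for, which is the cost the answer is describing.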

If it doesn't happen very often, you could probably take the hit, but if you're going to want to do a lot of them, it may be worth "denormalising" the data.

The kind of stuff which gets stored in NoSQL stores is typically pretty "abnormal" in the first place. It is not uncommon to duplicate the same data in all sorts of different places to make lookups easier.
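As a rough sketch of what denormalising could look like for the Employee/Department example from the question: the department name is copied into every employee record, so a read needs no second lookup. The record shapes and the helper names (`denormalise`, `renameDepartment`) are hypothetical, not taken from any particular store.

```typescript
// Normalised shape: the employee only stores a reference to the department.
interface NormalisedEmployee {
  id: number;
  name: string;
  departmentId: number;
}

interface Department {
  id: number;
  name: string;
}

// Denormalised shape: the department name is duplicated into each employee.
interface DenormalisedEmployee extends NormalisedEmployee {
  departmentName: string;
}

// Build the denormalised records once, at write time, so reads need no join.
function denormalise(
  employees: NormalisedEmployee[],
  departments: Record<number, Department>
): DenormalisedEmployee[] {
  return employees.map((e) => {
    const dept = departments[e.departmentId];
    return { ...e, departmentName: dept ? dept.name : "(unknown)" };
  });
}

// The price of duplication: renaming a department means rewriting every copy.
// Returns how many records had to be rewritten.
function renameDepartment(
  records: DenormalisedEmployee[],
  departmentId: number,
  newName: string
): number {
  let rewritten = 0;
  for (const record of records) {
    if (record.departmentId === departmentId) {
      record.departmentName = newName;
      rewritten++;
    }
  }
  return rewritten;
}
```

Reads get cheaper and writes get more expensive, which is usually an acceptable trade when the duplicated field changes rarely.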

Additionally, most NoSQL stores don't (really) support secondary indexes either, which means you have to duplicate data if you want to query by any other criterion.
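A hand-maintained index might look roughly like the sketch below: one map keyed by the primary key, plus a second map that duplicates the employee ids under the department id. The `EmployeeStore` class is purely illustrative and lives in memory; in a real distributed store the two writes would be separate operations, and keeping them consistent is your problem.

```typescript
interface Emp {
  id: string;
  name: string;
  departmentId: string;
}

// Hypothetical in-memory stand-in for a key-value store plus a manual index.
class EmployeeStore {
  // Primary data, keyed by employee id.
  private byId: Record<string, Emp> = {};
  // Manual "secondary index": department id -> employee ids.
  private idsByDepartment: Record<string, string[]> = {};

  put(employee: Emp): void {
    const previous = this.byId[employee.id];
    if (previous) {
      // Drop the stale index entry before re-indexing the new value.
      const oldIds = this.idsByDepartment[previous.departmentId] || [];
      this.idsByDepartment[previous.departmentId] = oldIds.filter(
        (id) => id !== employee.id
      );
    }
    this.byId[employee.id] = employee;
    const ids = this.idsByDepartment[employee.departmentId] || [];
    ids.push(employee.id);
    this.idsByDepartment[employee.departmentId] = ids;
  }

  get(id: string): Emp | undefined {
    return this.byId[id];
  }

  // Query by the non-primary criterion using the duplicated ids.
  findByDepartment(departmentId: string): Emp[] {
    const ids = this.idsByDepartment[departmentId] || [];
    return ids.map((id) => this.byId[id]);
  }
}
```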

If you're storing data such as employees and departments, you're really better off with a conventional database.

You would have to do multiple selects, and join the data manually in your application. See this SO post for more information. From that post:

Bigtable datasets can be queried from services like AppEngine using a language called GQL ("gee-kwal") which is based on a subset of SQL. Conspicuously missing from GQL is any sort of JOIN command. Because of the distributed nature of a Bigtable database, performing a join between two tables would be terribly inefficient. Instead, the programmer has to implement such logic in his application, or design his application so as to not need it.
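A minimal sketch of "multiple selects, then join manually" in TypeScript. The two `select...` functions are stand-ins for whatever client calls your store provides, and the rows they return are made-up sample data. The join itself is a plain in-memory hash join that behaves like a left outer join.

```typescript
interface EmployeeRow {
  id: number;
  name: string;
  departmentId: number;
}

interface DepartmentRow {
  id: number;
  name: string;
}

interface JoinedRow {
  employee: string;
  department: string | null;
}

// Stand-ins for two separate queries (hypothetical data, no real client).
function selectEmployees(): EmployeeRow[] {
  return [
    { id: 1, name: "Ann", departmentId: 10 },
    { id: 2, name: "Bob", departmentId: 20 },
    { id: 3, name: "Cy", departmentId: 99 }, // no matching department
  ];
}

function selectDepartments(): DepartmentRow[] {
  return [
    { id: 10, name: "Engineering" },
    { id: 20, name: "Sales" },
  ];
}

// Hash join: index the smaller side by key, then probe once per employee.
function joinInApplication(
  employees: EmployeeRow[],
  departments: DepartmentRow[]
): JoinedRow[] {
  const departmentById: Record<number, DepartmentRow> = {};
  for (const d of departments) {
    departmentById[d.id] = d;
  }
  return employees.map((e) => {
    const match = departmentById[e.departmentId];
    // Employees without a matching department are kept, with a null name.
    return { employee: e.name, department: match ? match.name : null };
  });
}

const joined = joinInApplication(selectEmployees(), selectDepartments());
console.log(joined);
```

This works as long as one side fits in application memory; for anything bigger you are back to batching lookups or denormalising.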

Unfortunately, it is not possible to perform a join natively in a NoSQL database. This is actually one of the biggest differences between SQL and NoSQL DBs.

As @kaleb said, you would have to do multiple selections and then join the needed information "manually".

Luckily, there are ORM frameworks such as Prisma that will allow you to "fake" the native SQL join feature.

Note: you're still performing multiple DB calls under the hood, which increases the read ops and everything that comes with them.

"A key feature of Prisma Client is the ability to query relations between two or more models." -> https://www.prisma.io/

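Example:

import { PrismaClient } from '@prisma/client'

const prisma = new PrismaClient()

// Fetch one user by primary key and pull in the titles of the related posts.
const getUser = await prisma.user.findUnique({
  where: {
    id: 19,
  },
  select: {
    name: true,
    // `posts` is a relation field, so Prisma resolves it from the posts table.
    posts: {
      select: {
        title: true,
      },
    },
  },
})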

In this case, the posts are stored in a different table, but Prisma is able to fetch them and join them into the User object.
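To illustrate the read-op point in the note above, here is a small sketch that counts reads for two ways of attaching posts to users. It says nothing about how Prisma actually issues its queries; `CountingDb` and both loader functions are hypothetical and run against in-memory arrays.

```typescript
interface User {
  id: number;
  name: string;
}

interface Post {
  id: number;
  authorId: number;
  title: string;
}

interface UserWithTitles {
  name: string;
  titles: string[];
}

// Hypothetical in-memory "database" that counts every read call.
class CountingDb {
  reads = 0;
  private users: User[];
  private posts: Post[];

  constructor(users: User[], posts: Post[]) {
    this.users = users;
    this.posts = posts;
  }

  findUsers(): User[] {
    this.reads++;
    return this.users.slice();
  }

  findPostsByAuthor(authorId: number): Post[] {
    this.reads++;
    return this.posts.filter((p) => p.authorId === authorId);
  }

  findPostsByAuthors(authorIds: number[]): Post[] {
    this.reads++;
    return this.posts.filter((p) => authorIds.indexOf(p.authorId) !== -1);
  }
}

// One read for the users, then one more read per user: 1 + N in total.
function loadPerUser(db: CountingDb): UserWithTitles[] {
  return db.findUsers().map((u) => ({
    name: u.name,
    titles: db.findPostsByAuthor(u.id).map((p) => p.title),
  }));
}

// One read for the users, one batched read for all posts: 2 in total.
function loadBatched(db: CountingDb): UserWithTitles[] {
  const users = db.findUsers();
  const posts = db.findPostsByAuthors(users.map((u) => u.id));
  return users.map((u) => ({
    name: u.name,
    titles: posts.filter((p) => p.authorId === u.id).map((p) => p.title),
  }));
}
```

Either way the "join" costs more than one read, so it is worth checking what your ORM does before relying on it in a hot path.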

Kaleb's right. You write custom code with a NoSQL solution if your data doesn't fit well into a key-value store. Map-reduce/async processing and custom view caches are common. Brian Aker gave a very funny (and satirical and biased) presentation at the Nov 2009 OpenSQLCamp: http://www.youtube.com/watch?v=LhnGarRsKnA. Skip to about 40 seconds in to hear about joins.
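For a rough idea of what a map-reduce pass feeding a cached view might look like, here is a single-process TypeScript sketch that precomputes headcount per department. The `HeadcountView` class and the record shape are assumptions for illustration; a real job would run the map and reduce phases across many nodes and refresh the view asynchronously.

```typescript
interface Staff {
  name: string;
  department: string;
}

type Pair = [string, number];

// Map phase: emit one (department, 1) pair per record.
function mapPhase(records: Staff[]): Pair[] {
  return records.map((r): Pair => [r.department, 1]);
}

// Reduce phase: sum the emitted values for each key.
function reducePhase(pairs: Pair[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const [key, value] of pairs) {
    totals[key] = (totals[key] || 0) + value;
  }
  return totals;
}

// Hypothetical view cache: reads hit the precomputed totals, not the raw data.
class HeadcountView {
  private cached: Record<string, number> | null = null;

  // Recompute the view; in practice this would run in the background.
  refresh(records: Staff[]): void {
    this.cached = reducePhase(mapPhase(records));
  }

  // Returns 0 for unknown departments or if the view was never refreshed.
  get(department: string): number {
    return this.cached ? this.cached[department] || 0 : 0;
  }
}
```

The view can be stale between refreshes, which is the usual trade-off with this approach.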




