How can I avoid duplicating data in a document database like RavenDB?

Given that document databases, such as RavenDB, are non-relational, how do you avoid duplicating data that multiple documents have in common? How do you maintain that data if it's okay to duplicate it?

Best answer

With a document database you have to duplicate your data to some degree. What that degree is will depend on your system and use cases.

For example, if we have simple Blog and User aggregates, we could set them up as:

  public class User 
  {
    public string Id { get; set; }
    public string Name  { get; set; }
    public string Username  { get; set; }
    public string Password  { get; set; }
  }

  public class Blog
  {
     public string Id  { get; set; }
     public string Title  { get; set; }

     public class BlogUser
     {
       public string Id  { get; set; }
       public string Name  { get; set; }
     }
  }

In this example I have nested a BlogUser class inside the Blog class, carrying the Id and Name properties of the User aggregate associated with the Blog. I have included only these fields because they are the only ones the Blog class is interested in; it doesn't need to know the user's username or password when the blog is being displayed.

These nested classes are going to depend on your system's use cases, so you have to design them carefully. The general idea is to design aggregates that can be loaded from the database with a single read and that contain all the data required to display or manipulate them.

This then leads to the question of what happens when the User.Name gets updated.

With most document databases you would have to load every instance of Blog that belongs to the updated User, update its Blog.BlogUser.Name field, and save them all back to the database.

Raven is slightly different, as it supports set-based updates: you can run a single update against RavenDB that changes the BlogUser.Name property of the user's blogs without having to load and update them all individually.
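To make the idea of a set-based update concrete, here is a minimal in-memory sketch. It is a toy stand-in, not the RavenDB client API: `InMemoryBlogStore`, `UpdateWhere`, `BlogDoc`, `AuthorId` and `AuthorName` are all hypothetical names invented for illustration, and the embedded user copy is flattened into two properties to keep it short. The point is only the shape of the operation: the caller describes which documents to change and how, and the store applies the change itself.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical document type: a blog with an embedded copy of its author's data.
public class BlogDoc
{
    public string Id { get; set; }
    public string Title { get; set; }
    public string AuthorId { get; set; }    // copy of User.Id
    public string AuthorName { get; set; }  // copy of User.Name
}

// Toy in-memory store (an assumption for illustration, NOT the RavenDB API).
public class InMemoryBlogStore
{
    private readonly List<BlogDoc> _blogs = new List<BlogDoc>();

    public void Add(BlogDoc blog) => _blogs.Add(blog);

    public BlogDoc Find(string id) => _blogs.Find(b => b.Id == id);

    // Set-based update: apply `patch` to every document that satisfies
    // `matches`, and return how many documents were changed.
    public int UpdateWhere(Func<BlogDoc, bool> matches, Action<BlogDoc> patch)
    {
        int updated = 0;
        foreach (var blog in _blogs)
        {
            if (!matches(blog)) continue;
            patch(blog);
            updated++;
        }
        return updated;
    }
}
```

With this sketch, renaming a user becomes one call such as `store.UpdateWhere(b => b.AuthorId == user.Id, b => b.AuthorName = user.Name)`, and the calling code never holds the matching blogs itself.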

The code for doing the update within RavenDB the manual way, for all of the user's blogs, would be:

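  public void UpdateBlogUser(User user)
  {
    // Assumes each Blog exposes its embedded user copy as blog.BlogUser.
    // Load every blog whose embedded user copy points at this user.
    var blogs = session.Query<Blog>("blogsByUserId")
                  .Where(b => b.BlogUser.Id == user.Id)
                  .ToList();

    // Copy the new name into each blog's embedded user.
    foreach (var blog in blogs)
       blog.BlogUser.Name = user.Name;

    session.SaveChanges();
  }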

I've added the SaveChanges call just as an example. The RavenDB client uses the Unit of Work pattern, so this should really happen somewhere outside of this method.

Other answers

There's no one "right" answer to your question IMHO. It truly depends on how mutable the data you're duplicating is.

Take a look at the RavenDB documentation for lots of answers about document DB design vs. relational, but specifically check out the "Associations Management" section of the Document Structure Design Considerations document. In short, document DBs use references by ID when they don't want to embed shared data in a document. These IDs are not like FKs: it is entirely up to the application to resolve them and to ensure their integrity.
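As a rough sketch of what "reference by ID" means for the application, the example below stores only a user ID on the blog and resolves it at read time. Every name here (`UserDoc`, `BlogRef`, `BlogView`, `ReferenceResolver`, and the "(unknown user)" placeholder) is hypothetical, and a plain dictionary stands in for the database. Nothing in it is RavenDB-specific.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical documents: the blog holds only the user's ID, not a copy of the name.
public class UserDoc
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public class BlogRef
{
    public string Id { get; set; }
    public string Title { get; set; }
    public string UserId { get; set; }  // reference by ID, not enforced by the store
}

// What the display layer needs once the reference has been resolved.
public class BlogView
{
    public string Title { get; set; }
    public string AuthorName { get; set; }
}

public static class ReferenceResolver
{
    // Resolve the blog's user reference against a user lookup.
    // Unlike a foreign key, the reference may dangle, so the application
    // decides what to do when the user is missing (here: a placeholder name).
    public static BlogView Resolve(BlogRef blog, IReadOnlyDictionary<string, UserDoc> users)
    {
        string authorName = users.TryGetValue(blog.UserId, out var user)
            ? user.Name
            : "(unknown user)";
        return new BlogView { Title = blog.Title, AuthorName = authorName };
    }
}
```

The trade-off against embedding is visible here: a rename touches only the one user document, but every read of a blog needs a second lookup, and handling a dangling reference is the application's job.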




