What are the best practices for importing large datasets into MongoDB?

We are just giving MongoDB a test run and have set up a Rails 3 app with Mongoid. What are the best practices for inserting large datasets into MongoDB? To flesh out a scenario: Say, I have a book model and want to import several million records from a CSV file.

I suppose this needs to be done in the console, so this may possibly not be a Ruby-specific question.

Edited to add: I assume it makes a huge difference whether the imported data includes associations or is supposed to go into one model only. Any comments on either scenario welcome.

Best answer

MongoDB comes with import/export tools (mongoimport and mongoexport) that parse JSON-formatted data.

Assuming you have an existing database in SQL, the easiest way to migrate that data is to output your SQL data as JSON strings, then use the import tool for each collection.

The export step is also the place to denormalize and nest/embed your data. Rather than migrating a relational model to MongoDB as-is, consider refactoring your data model to take advantage of MongoDB's features.

For example, a common task is to merge articles and tags into a single articles collection, with the tags embedded as an array in each article. Do that in your export script, so all MongoDB sees is nice clean JSON coming in through the import :-)
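A minimal sketch of such an export step, assuming the SQL rows have already been loaded into plain Ruby hashes. The sample data, field names, and the helpers `embed_tags` and `to_json_lines` are hypothetical and only illustrate the idea of embedding tags before the import.

```ruby
require "json"

# Hypothetical rows, as they might come out of a SQL export.
articles = [
  { id: 1, title: "Intro to MongoDB" },
  { id: 2, title: "Schema design" }
]
tags = [
  { article_id: 1, name: "mongodb" },
  { article_id: 1, name: "nosql" },
  { article_id: 2, name: "modeling" }
]

# Builds one document per article, with its tag names embedded as an array.
def embed_tags(articles, tags)
  tags_by_article = tags.group_by { |t| t[:article_id] }
  articles.map do |a|
    names = tags_by_article.fetch(a[:id], []).map { |t| t[:name] }
    { "_id" => a[:id], "title" => a[:title], "tags" => names }
  end
end

# Serializes the documents as one JSON object per line.
def to_json_lines(docs)
  docs.map { |d| JSON.generate(d) }.join("\n")
end

puts to_json_lines(embed_tags(articles, tags))
```

Written to a file, output in this one-document-per-line shape can be loaded with something like `mongoimport --db mydb --collection articles --file articles.json`, where the database and file names are placeholders.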

You can still import all your tables as collections, but you're missing out on some of the true strengths of MongoDB by doing that.

Answers

If you only need to load this dataset once, you can use the db/seeds.rb file: read your CSV there and create a document for each row.

If you need to do it repeatedly, you can write a rake task or a runner script.

For a task, define a file such as lib/tasks/file.rake with a task that reads your file, parses it, and creates the documents.

A runner script works the same way.

It's the same approach you would take with ActiveRecord.
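Whichever entry point you pick (seeds file, rake task, or runner), the core of it is reading the CSV and building documents. A rough sketch using only Ruby's standard `csv` library is shown here; the helper `each_csv_batch`, the batch size, and the sample columns are assumptions for illustration, and the actual insert is left as a placeholder because the exact call depends on your Mongoid version.

```ruby
require "csv"

# Reads CSV data with a header row (from a String or an IO) and yields
# arrays of plain hashes, at most batch_size rows at a time, so that a
# multi-million-row file never has to be held in memory all at once.
def each_csv_batch(io, batch_size: 1000)
  CSV.new(io, headers: true).each_slice(batch_size) do |rows|
    yield rows.map(&:to_h)
  end
end

# Hypothetical sample data standing in for a large books file.
sample = "title,author\nDune,Frank Herbert\nEmma,Jane Austen\nUlysses,James Joyce\n"

each_csv_batch(sample, batch_size: 2) do |batch|
  # Placeholder: this is where each batch would be handed to the database,
  # for example through a bulk insert on the model's collection.
  puts "would insert #{batch.size} documents"
end
```

For a real file, the same helper can be called inside `File.open("books.csv") { |f| each_csv_batch(f) { |batch| ... } }`, where the file name is a placeholder. Inserting in batches rather than saving one document at a time is a design choice that usually cuts round trips to the database.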




