Origin is the gem that provides the DSL for Mongoid queries. At first glance, the need for such a DSL may seem questionable: if the query is ultimately converted into a MongoDB-compliant hash anyway, why do we need a DSL at all?
Origin was extracted from the Mongoid gem into a gem of its own, so that there is a standard for querying. It has no dependency on any other gem and is a standalone, pure query builder. The idea was that it could serve as a generic DSL, usable even without Mongoid!
So, now we have a very generic and standard querying pattern. For example, Mongoid 2.x had the criteria any_in and any_of, but no direct support for the and, or, and nor operations. The only way to fire a $or or a $and query was to drop down to the raw operator (note that $or takes an array of conditions):

Author.where("$or" => [{'language' => 'English'}, {'address.city' => 'London'}])
In Mongoid 3, we have a cleaner approach:

Author.or({language: 'English'}, {'address.city' => 'London'})
Origin also provides good selectors directly in our models. So, this is now much more readable:
Book.gte(published_at: Date.parse('2012/11/11'))
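These selectors chain like any other criteria, which keeps compound queries readable. A minimal sketch, assuming a Book model where price and category are fields we add purely for illustration:

# Chain Origin selectors to build a compound query.
Book.gte(published_at: Date.parse('2012/11/11'))
    .lt(price: 20)
    .any_in(category: ['Fiction', 'Drama'])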
As we have seen earlier, MongoDB stores data in memory-mapped files of at most 2 GB each. Once the data has been loaded into these memory-mapped files, access runs at almost memory speed rather than the much slower disk I/O. The files are preallocated to ensure there is no file-generation delay while saving data.
However, to ensure that data is not lost, it needs to be persisted to disk. This is achieved by journaling, which is turned on by default in the MongoDB configuration. With journaling, every database operation (the operation itself, not the actual data) is written to the journal, which is flushed to disk every 100 ms. This enables better recovery in case of a crash and also ensures the consistency of writes. The data written to the various collections is flushed to disk every 60 seconds, so data is persisted periodically while access remains almost as fast as memory. MongoDB relies on the operating system for the memory management of its memory-mapped files. This has the advantage of inheriting improvements made to the OS, and the disadvantage that MongoDB has little control over how its memory is managed.
However, what happens if something goes wrong (the server crashes, the database stops, or the disk gets corrupted)? To ensure durability, whenever data is saved to the files, the action is logged in chronological order. This is the journal entry, which is also a memory-mapped file, but one that is synced to disk every 100 ms. Using the journal, the database can easily be recovered in case of a crash. So, in the worst-case scenario, we could lose at most 100 ms of information, which is a fair price to pay for the benefits of using MongoDB.
Journaling makes MongoDB a very robust and durable database, and it also helps us decide when not to use it: 100 ms is a long time for some services, such as core banking or stock price updates, so MongoDB is not recommended for such applications. For most use cases that do not involve heavy multi-table transactions of the kind found in financial applications, MongoDB is suitable.
All of this is handled seamlessly, and we don't usually need to change anything. We can control this behavior via the MongoDB configuration, but doing so is usually not recommended.
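For reference, here is a minimal sketch of the relevant 2.x-style mongod.conf settings; the values shown are the defaults, and these are the knobs the paragraph above refers to:

# mongod.conf (2.x-style options, shown for illustration)
journal = true               # journaling, on by default on 64-bit builds
journalCommitInterval = 100  # milliseconds between journal flushes (range 2-300)
syncdelay = 60               # seconds between data file flushes to disk

Let's now see how we save data using Mongoid.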
As per the ActiveModel specification, save updates the changed attributes and returns true, or returns false on failure, while save! raises an exception on error. In both cases, passing validate: false to save bypasses the validations.
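A quick sketch of the three variants, assuming an Author model that validates the presence of name (the validation itself is our assumption for illustration):

author = Author.new(name: nil)

author.save                   # => false, validation failed
author.save!                  # raises Mongoid::Errors::Validations
author.save(validate: false)  # => true, validations bypassed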
A lesser-known persistence option is the upsert action. An upsert creates a new document if none exists and overwrites the existing one if it does. A related atomic action, find_and_modify, is where this style of write really pays off.
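A minimal sketch of upsert semantics, assuming a Book model with a title field:

book = Book.new(title: "Oliver Twist")
book.upsert                # no document with this _id exists, so it is inserted

book.title = "Oliver Twist (2nd Edition)"
book.upsert                # the document now exists, so it is overwritten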
For example, suppose we want to reserve a book in our Sodibee system, and we want to ensure that at any point there can be only one reservation for a book. In a traditional scenario, we would:

1. Find the book.
2. Check whether it is already reserved.
3. If it is not, save the reservation.

So far so good! However, in a concurrent model, especially for web applications, this creates problems.
Now we have a situation where two requests both think that their reservation of the book succeeded, which is against our expectations. This is a typical problem that plagues most web applications. The various ways in which we can solve it are discussed in the subsequent sections.
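To give a taste of the atomic approach, here is a minimal sketch using find_and_modify; the reserved_by field and member_id are assumptions for illustration:

# The query and the update execute as a single atomic operation, so
# only one of two concurrent requests can claim the book.
book = Book.where(_id: book_id, reserved_by: nil)
           .find_and_modify({"$set" => {reserved_by: member_id}}, new: true)

if book
  # the reservation succeeded
else
  # another request reserved the book first
end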
MongoDB helps us ensure write consistency: when we write something to MongoDB, it guarantees the success of the write operation. Interestingly, this is a configurable option and is set to acknowledged by default, meaning the driver waits for an acknowledgement from the server before returning success. In earlier versions of Mongoid, safe mode (safe: true) was turned off by default, so the success of write operations was not guaranteed. The write concern is configured in mongoid.yml as follows:
development:
  sessions:
    default:
      hosts:
        - localhost:27017
      options:
        write:
          w: 1
The default write concern in Mongoid is configured with w: 1, which means that the success of a write operation is guaranteed. Let's see an example:
class Author
  include Mongoid::Document

  field :name, type: String

  index({name: 1}, {unique: true, background: true})
end
Building an index blocks read and write operations. Hence, it's recommended to build indexes in the background in a Rails application.
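Note that declaring an index in the model does not create it by itself; with Mongoid 3, the declared indexes are typically built using the bundled rake task:

rake db:mongoid:create_indexes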
We shall now start a Rails console and see how this index reacts to a duplicate key, by creating two Author objects with the same name.
irb> Author.create(name: "Gautam")
=> #<Author _id: 5143678345db7ca255000001, name: "Gautam">
irb> Author.create(name: "Gautam")
Moped::Errors::OperationFailure: The operation: #<Moped::Protocol::Command
  @length=83 @request_id=3 @response_to=0 @op_code=2004 @flags=[]
  @full_collection_name="sodibee_development.$cmd"
  @skip=0 @limit=-1 @selector={:getlasterror=>1, :w=>1} @fields=nil>
failed with error 11000: "E11000 duplicate key error index:
sodibee_development.authors.$name_1 dup key: { : "Gautam" }"
As we can see, it has raised a duplicate key error and the document is not saved. Now, let's have some fun. Let's change the write concern to unacknowledged:
development:
  sessions:
    default:
      hosts:
        - localhost:27017
      options:
        write:
          w: 0
The write concern is now set to unacknowledged writes: we no longer wait for the MongoDB write to succeed; we simply assume that it will. Now let's see what happens with the same command that failed earlier.
irb> Author.where(name: "Gautam").count
=> 1
irb> Author.create(name: "Gautam")
=> #<Author _id: 5287cba54761755624000000, name: "Gautam">
irb> Author.where(name: "Gautam").count
=> 1
There seems to be a discrepancy here: though Mongoid's create returned successfully, the data was not saved to the database. Since we specified background: true for the name index, document creation appeared to succeed, as MongoDB had not yet indexed it and we did not wait for the write to be acknowledged. When MongoDB later indexes the data in the background, it finds that the unique index criterion is violated and removes the document from the database. Since all of this happens in the background, there is no way to detect it on the console or in our Rails application, which leads to an inconsistent result.
So, how can we solve this problem? There are a few ways:

- Use an acknowledged write concern (w: 1), so that index violations are reported back to the application.
- Build unique indexes in the foreground, so that duplicate writes fail immediately.
- Add an application-level uniqueness validation as an extra guard, as shown in the sketch after this list.
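Here is a minimal sketch of the last two options combined (using validates_uniqueness_of here is our suggestion, not part of the original example):

class Author
  include Mongoid::Document

  field :name, type: String

  # Foreground build: writes are checked against the index immediately.
  index({name: 1}, {unique: true})

  # Application-level guard; note that this alone cannot prevent the
  # race, since it performs its own find-then-write.
  validates_uniqueness_of :name
end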
Other options to the index command create different types of indexes as shown in the following table:
Index Type | Example | Description
sparse | index({twitter_name: 1}, {sparse: true}) | Creates a sparse index, that is, only the documents containing the indexed fields are indexed. Use this with care, as queries can return incomplete results.
2d, 2dsphere | index({:location => "2dsphere"}) | Creates a two-dimensional spherical index.
MongoDB 2.4 introduced text indexes, which are as close to free text search indexes as MongoDB gets. However, it does only basic text indexing: it supports stop words and stemming, and it assigns a relevance score to each match.
Text indexes are still an experimental feature in MongoDB, and they are not recommended for extensive use. Use ElasticSearch, Solr (Sunspot), or ThinkingSphinx instead.
The following code snippet shows how we can specify a text index with weightage:
index({ "name" => 'text',
"last_name" => 'text'
},
{
weights: {
'name' => 10,
'last_name' => 5,
},
name: 'author_text_index'
}
)
There is no direct search support in Mongoid (as yet). So, if you want to invoke a text search, you need to hack around a little.
irb> srch = Mongoid::Contextual::TextSearch.new(Author.collection, Author.all, 'john')
=> #<Mongoid::Contextual::TextSearch
  selector: {}
  class: Author
  search: john
  filter: {}
  project: N/A
  limit: N/A
  language: default>
irb> srch.execute
=> {"queryDebugString"=>"john||||||", "language"=>"english",
 "results"=>[
   {"score"=>7.5, "obj"=>{"_id"=>BSON::ObjectId('51fc058345db7c843f00030b'), "name"=>"Bettye Johns"}},
   {"score"=>7.5, "obj"=>{"_id"=>BSON::ObjectId('51fc058345db7c843f00046d'), "name"=>"John Pagac"}},
   {"score"=>7.5, "obj"=>{"_id"=>BSON::ObjectId('51fc058345db7c843f000578'), "name"=>"Jeanie Johns"}},
   {"score"=>7.5, "obj"=>{"_id"=>BSON::ObjectId('51fc058445db7c843f0007e7'),
   ...
   {"score"=>7.5, "obj"=>{"_id"=>BSON::ObjectId('51fc058a45db7c843f0025f1'), "name"=>"Alford Johns"}}],
 "stats"=>{"nscanned"=>25, "nscannedObjects"=>0, "n"=>25, "nfound"=>25, "timeMicros"=>31103},
 "ok"=>1.0}
By default, text search is disabled in the MongoDB configuration. We need to turn it on by adding setParameter = textSearchEnabled=true to the MongoDB configuration file, typically /usr/local/mongo.conf.
This returns a result with statistical data as well as the matching documents and their relevance scores. Interestingly, it also specifies the language. There are a few more things we can do with the search result. For example, we can see the statistical information as follows:

irb> srch.stats
=> {"nscanned"=>25, "nscannedObjects"=>0, "n"=>25, "nfound"=>25, "timeMicros"=>31103}
We can also convert the data into our Mongoid model objects by using project, as shown in the following command:

irb> srch.project(:name).to_a
=> [#<Author _id: 51fc058345db7c843f00030b, name: "Bettye Johns", last_name: nil, password: nil>,
 #<Author _id: 51fc058345db7c843f00046d, name: "John Pagac", last_name: nil, password: nil>,
 #<Author _id: 51fc058345db7c843f000578, name: "Jeanie Johns", last_name: nil, password: nil>
 ...
Some of the important things to remember: text search must be explicitly enabled, as shown above, and a collection can have at most one text index. A lot more detail can be found at http://docs.mongodb.org/manual/tutorial/search-for-text/
This article has served as a reference for using Mongoid, with code samples and explanations that help in understanding its various features.