Tuesday, June 23, 2020

MongoDB - Document Store - Consistency Matters

Introduction
Because MongoDB does not enforce a fixed schema, adding a new field is much easier than in SQL databases. The reason is that MongoDB is a document store.

RDBMS Concepts
An RDBMS has three important concepts: Table, Row, and Column. Their MongoDB counterparts are as follows:
A Table corresponds to a Collection in Mongo
A Row corresponds to a Document in Mongo
A Column corresponds to a Field in Mongo
Document Store
The explanation is as follows. If we think of it as a key-value store, the value side holds a document structure. As long as I know how to parse the document structure, I can add and remove as many fields as I like.
Document stores like MongoDB and Elasticsearch offer the greatest level of flexibility...and complexity. A document store roughly resembles a key-value store where the key becomes the document ID and the value is the document containing the actual stored data.
A document can even be an array of values. The explanation is as follows.
The document can be almost anything, including an array of values; thus, Elasticsearch might be a good choice when the stored data is hierarchical within a single document and should not be flattened. Elasticsearch indexes data into indices, which hold types, which contain documents, which hold fields.
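To make the key-value analogy concrete, here is a minimal sketch in plain JavaScript. It is only an in-memory illustration and does not use the MongoDB API; the `store` map, the document IDs, and the field names are all made up for the example.

```javascript
// Toy in-memory "document store": key = document ID, value = the document.
// This is plain JavaScript, not the MongoDB API.
const store = new Map();

// Two documents in the same collection with different shapes.
store.set("u1", { name: "Ada", email: "ada@example.com" });
store.set("u2", { name: "Linus", tags: ["kernel", "git"] });

// Add a field to one document; no schema migration is involved.
store.get("u1").age = 36;

// Remove a field from the same document.
delete store.get("u1").email;

console.log(store.get("u1")); // { name: 'Ada', age: 36 }
console.log(store.get("u2")); // { name: 'Linus', tags: [ 'kernel', 'git' ] }
```

The second document also shows a value that is itself an array, kept inside the document rather than flattened.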
Document vs Wide Column Databases
Both offer a flexible schema, but there is a difference between them.
A wide column database has these characteristics:
- Frequent writes, which implies infrequent updates and reads.
- Can Handle Large Data
- Not Suitable for Complex Queries

A document database has these characteristics:
- Faster reads, because the data is denormalized.
- Not Suitable for Complex Queries

Database Design Tips and Tricks
In classic databases, you first need to decide whether each table should be normalized or denormalized. The same decision may also be needed for MongoDB collections.
See the "Normalization Nedir" (What Is Normalization) post.

Example
We do it like this:
Relationship  MongoDB
1:1           Embedded Object (implicit)
              Document Key Reference
1:N           Embedded Array of Objects
              Document Key Reference
              Query with $lookup operator
N:M           Embedded Array of Objects
              Arrays of objects with references
              Difficult to query with $lookup operator
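As a rough illustration of the 1:N row in the table, the sketch below models the same data first as an embedded array and then as separate documents linked by a key. The blog-style data, the `postId` field, and the `joinComments` helper are all hypothetical; the helper is a hand-written join that only resembles what a `$lookup` stage does and is not MongoDB code.

```javascript
// 1:N modeled two ways, using plain objects (hypothetical blog data).

// (a) Embedded array of objects: comments live inside the post document.
const embeddedPost = {
  _id: "p1",
  title: "Hello",
  comments: [
    { author: "ann", text: "Nice" },
    { author: "bob", text: "Thanks" },
  ],
};

// (b) Document key reference: comments are separate documents that
// point back to the post through postId.
const posts = [{ _id: "p1", title: "Hello" }];
const comments = [
  { _id: "c1", postId: "p1", author: "ann", text: "Nice" },
  { _id: "c2", postId: "p1", author: "bob", text: "Thanks" },
  { _id: "c3", postId: "p2", author: "eve", text: "Other post" },
];

// Hand-written join, similar in spirit to a $lookup stage: attach to each
// post the comments whose postId equals the post's _id.
function joinComments(postDocs, commentDocs) {
  return postDocs.map((post) => ({
    ...post,
    comments: commentDocs.filter((c) => c.postId === post._id),
  }));
}

const joined = joinComments(posts, comments);
console.log(embeddedPost.comments.length); // 2
console.log(joined[0].comments.map((c) => c._id)); // [ 'c1', 'c2' ]
```

With embedding, one read returns the post and its comments together; with references, the comments can grow independently but reading them back needs the extra join step.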
MongoDB and Sharding
There are two sharding strategies:
1. Hash Strategy
2. Range Strategy
The explanation is as follows. In other words, sharding has to be configured explicitly.
In MongoDB, you can scale by sharding the collection into multiple nodes. You can shard by hash or range. Without explicit shard, each collection remains in a single shard.
The explanation is as follows
MongoDB Collections are created on a single node by default (non-sharded). You can explicitly shard a collection using either hash or range-based strategy. 

Hash strategy distributes the data evenly, but only provides shard elimination (aka partition pruning) for queries on equality predicates (also, a rewritten IN predicate… but I don’t know MongoDB optimizer well enough to confirm this).

Range strategy stores data in sorted order of the shard keys. This enables the optimizer to choose the shards the possibility of holding the documents matching the sharding keys for all of the range predicates: equality, in, greater than, less than, and between.

Indexes: MongoDB indexes are local to their data shards and use the same distribution strategy as the collection.
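The difference in shard elimination between the two strategies can be sketched with a toy router. Everything here is an assumption made for illustration: the shard count, the key ranges, and the `toyHash` function are invented and are not how MongoDB hashes or splits keys internally.

```javascript
// Toy shard router. Shard count, key ranges, and the hash function are
// illustrative assumptions, not MongoDB internals.
const SHARD_COUNT = 3;

// Simple deterministic string hash (not the hash MongoDB uses).
function toyHash(key) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  }
  return h;
}

// Hash strategy: one key maps to exactly one shard.
function hashShardFor(key) {
  return toyHash(key) % SHARD_COUNT;
}

// Range strategy: each shard owns a contiguous, half-open key range [min, max).
const RANGES = [
  { shard: 0, min: 0, max: 100 },
  { shard: 1, min: 100, max: 200 },
  { shard: 2, min: 200, max: 300 },
];

// Shards that can hold keys in [low, high]; every other shard is skipped.
function rangeShardsFor(low, high) {
  return RANGES.filter((r) => r.min <= high && low < r.max).map((r) => r.shard);
}

// Under hashing, a range predicate gives no hint, so all shards are queried.
function hashShardsForRange() {
  return RANGES.map((r) => r.shard);
}

console.log(rangeShardsFor(150, 150)); // [ 1 ]
console.log(rangeShardsFor(90, 120)); // [ 0, 1 ]
console.log(hashShardsForRange()); // [ 0, 1, 2 ]
```

An equality lookup hits a single shard under either strategy, while a "between" query is narrowed to a subset of shards only under the range strategy.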
MongoDB Installation
We do it like this:
docker run -d --name mongodb-instance -p 27017:27017 mongo
MongoDB and Transactions
The explanation is as follows
... MongoDB supports ACID transactions in a single document, 
...
MongoDB, with its 4.0 release, added support for Multi-Document Transaction which works across replica sets. This support has also been extended to Sharded Cluster with the 4.2 release.
Without a Replica Set
An exception is thrown. The explanation is as follows
In MongoDB, an operation on a single document is atomic. This covers many use-cases because you can use embedded documents and arrays to capture relationships between data. Starting from version 4.x, ACID transactions have arrived in the Document store, enforcing all-or-nothing execution and maintaining data integrity.

Spring Data MongoDB provides a number of ways to work with MongoDB transactions, including Reactive support. In our case, we will go with the MongoTransactionManager, since it is the gateway to the well-known Spring transaction support, and it lets applications use the managed transaction features of Spring.

Now, if you attempt to run the application against a standalone MongoDB server and you try to execute a @Transactional service method, you will get the following exception:

com.mongodb.MongoClientException: Sessions are not supported by the MongoDB cluster to which this client is connected

This is because MongoDB multi-document transactions require the existence of at least a single replica set. Of course, a production MongoDB should be deployed in a replica set of no less than three (3) replicas. One (1) Mongo node is the primary (accepting writes from the application) and the writes are replicated to the other two (2) Mongo nodes, which serve as secondaries. If the primary fails, one of the secondaries is elected to take its place as the primary, while continuing to replicate data to the other secondary for redundancy. 
MongoDB 4.0
MongoDB 4.0 supports multi-document transactions only when a replica set is present. Roughly, the steps are:
1. A session is opened.
2. A transaction is started on the session, and operations are performed on more than one document (row). Anyone else querying the database cannot see these documents yet.
3. The transaction is committed with session.commitTransaction().
4. Now anyone else querying the database can see these documents.
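The visibility rule in these four steps can be mimicked with a small in-memory model. This is a toy sketch, not driver code: `ToyDatabase` and `ToySession` are hypothetical classes, and while the method names mirror the Node.js driver's `startSession`, `startTransaction`, `commitTransaction`, and `abortTransaction`, nothing here talks to MongoDB.

```javascript
// Toy model of the four steps: writes made in a session stay private
// until commit. All class names are hypothetical, not the MongoDB
// driver API.
class ToyDatabase {
  constructor() {
    this.committed = new Map(); // what every other reader sees
  }
  startSession() {
    return new ToySession(this);
  }
  find(id) {
    return this.committed.get(id);
  }
}

class ToySession {
  constructor(db) {
    this.db = db;
    this.pending = null;
  }
  startTransaction() {
    this.pending = new Map(); // uncommitted writes, private to the session
  }
  insert(id, doc) {
    if (this.pending === null) throw new Error("no transaction in progress");
    this.pending.set(id, doc);
  }
  commitTransaction() {
    if (this.pending === null) throw new Error("no transaction in progress");
    // Publish all buffered writes together (all-or-nothing in this toy).
    for (const [id, doc] of this.pending) this.db.committed.set(id, doc);
    this.pending = null;
  }
  abortTransaction() {
    this.pending = null; // discard everything
  }
}

const db = new ToyDatabase();
const session = db.startSession(); // step 1
session.startTransaction(); // step 2
session.insert("a", { balance: 50 });
session.insert("b", { balance: 150 });
const beforeCommit = db.find("a"); // undefined: not visible yet
session.commitTransaction(); // step 3
const afterCommit = db.find("a"); // step 4: { balance: 50 }
console.log(beforeCommit, afterCommit);
```

Calling `abortTransaction()` instead of `commitTransaction()` would leave `db.find("a")` undefined, which is the all-or-nothing behavior the quoted text describes.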

MEAN Stack
Because MongoDB uses JSON, it fits comfortably into the MEAN stack. The explanation is as follows
The four technologies comprising the MEAN stack are Mongo Db as the database, Express as the server system, Angular for front-end, and Node Js as the JavaScript server-side event-driven I/O (in/output) environment.

The key characteristic of MEAN stack is that all four technologies are based on javascript and JSON (JavaScript Object Notation) data from those frameworks save potential JSON encoding time consumption
JSON Data
MongoDB is ideal for storing JSON data. The explanation is as follows
Behind the scenes, MongoDB represents JSON documents in a binary-encoded format called BSON. BSON is also the wire format for MongoDB.
Mongo Actually Uses BSON Under the Hood :)

This is because BSON is more efficient and adds extra information, such as additional data types and ordered fields. The explanation is as follows
Many programming languages have JavaScript Object Notation (JSON) support or similar data structures. MongoDB uses JSON documents to store records. However, behind the scenes, MongoDB represents these documents in a binary-encoded format called BSON. BSON provides additional data types and ordered fields to allow for efficient support across a variety of languages. One of these additional data types is ObjectId.
The _id Field
Every document must have this field. It serves as the primary key.
We can assign a value ourselves, or MongoDB can assign one automatically. If MongoDB assigns the value, the field is of type ObjectId. The explanation is as follows
Every document must have a unique _id field.
You can either generate one yourself or let MongoDB generate a value for you which has the type ObjectId. Most of the time you’ll probably want to let MongoDB generate it for you. By default, the _id field is indexed. You can verify this through the getIndexes command:
db.unicorns.getIndexes()
The explanation is as follows
In the database world, it is frequently important to have unique identifiers associated with a record. In a legacy, tabular database, these unique identifiers are often used as primary keys. In a modern database, such as MongoDB, we need a unique identifier in an `_id` field as a primary key as well. MongoDB provides an automatic unique identifier for the `_id` field in the form of an `ObjectId` data type.

...the [ObjectId] datatype is automatically generated as a unique document identifier if no other identifier is provided.
What Is the ObjectId Type
The explanation is as follows
ObjectId value increases over time but we can’t be sure that sorting a collection by _id reflects creation order of the documents. In fact, ObjectId only contains a 4 byte timestamp value and it is generated client side by the driver.
The explanation is as follows
MongoDB Object Ids are 12-byte (96-bit) hexadecimal integers that begin with a random value and consist of a 4-byte epoch timestamp in seconds, a 3-byte machine identification, a 2-byte process id, and a 3-byte counter.

This is a smaller UUID than the previous 128-bit version. However, the size is more than we would typically find in a single MySQL auto-increment column (a 64-bit digit value).
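Since the first 4 bytes of an ObjectId are an epoch timestamp in seconds, the creation time can be read back from the ID's hex string. The helper below is a minimal sketch in plain JavaScript; `objectIdTimestamp` is a hypothetical name and the sample ID is just an illustrative value. It does not depend on any MongoDB driver.

```javascript
// Read the creation time embedded in an ObjectId's hex string.
// The first 4 bytes (8 hex characters) are seconds since the Unix epoch.
function objectIdTimestamp(hexId) {
  if (!/^[0-9a-fA-F]{24}$/.test(hexId)) {
    throw new Error("expected a 24-character hex string");
  }
  const seconds = parseInt(hexId.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Sample ID chosen for illustration.
const created = objectIdTimestamp("507f1f77bcf86cd799439011");
console.log(created.toISOString()); // 2012-10-17T21:13:27.000Z
```

Because the timestamp has one-second resolution and is generated on the client, two IDs created within the same second on different machines cannot be ordered reliably by this value alone.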
Development History
MongoDB was first released in 2009, after development began at 10gen in 2007. It is written in C++.

Replication Model
The explanation is as follows
In the MongoDB replication model, a group of database nodes host the same data set and are defined as a replica set. One of the nodes in the set will act as primary and the others will be secondary nodes. The primary node is used for all write operations, and by default all read operations as well. This means that replica sets provide strict consistency. 
Because there is a single primary node, MongoDB provides consistency. The explanation is as follows.
In MongoDB, there is only one master node. This master node only accepts the input. Apart from this, all the nodes are used as an output. Therefore, if the data has to be written in the slave nodes, it has to pass through the master node.
However, if the primary node becomes a performance bottleneck, the following sentence is worth noting
... make sure all the reads hit the secondary DBs. This has a massive effect, ...
Routing reads to a secondary node can have the following downside
The biggest one is that with this solution we have no guarantee when saved data will be replicated to slave instance - ... This means that in case you have two transactions to serve one request (one saving data and another one reading it) you might not be able to read what you’ve written. That in some cases might be a serious issue ... 
Asynchronous Replication
The explanation is as follows
Secondaries replicate the primary’s oplog and apply the operations to their data sets asynchronously. By having the secondaries’ data sets reflect the primary’s data set, the replica set can continue to function despite the failure of one or more members.
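The stale-read risk of asynchronous replication can be shown with a toy primary/secondary pair. This is a sketch under simplifying assumptions: the class names are hypothetical, the oplog is a plain array, and replication is triggered by hand rather than running in the background.

```javascript
// Toy primary/secondary pair with an oplog. Names are hypothetical and
// replication is triggered by hand to make the lag visible.
class ToyNode {
  constructor() {
    this.data = new Map();
  }
  read(key) {
    return this.data.get(key);
  }
}

class ToyPrimary extends ToyNode {
  constructor() {
    super();
    this.oplog = []; // ordered log of applied writes
  }
  write(key, value) {
    this.data.set(key, value);
    this.oplog.push({ key, value });
  }
}

class ToySecondary extends ToyNode {
  constructor(primary) {
    super();
    this.primary = primary;
    this.applied = 0; // how many oplog entries have been applied so far
  }
  // Apply any oplog entries not yet seen; in a real system this happens
  // in the background some time after the primary's write.
  replicate() {
    for (; this.applied < this.primary.oplog.length; this.applied++) {
      const { key, value } = this.primary.oplog[this.applied];
      this.data.set(key, value);
    }
  }
}

const primary = new ToyPrimary();
const secondary = new ToySecondary(primary);

primary.write("order:1", "paid");
const staleRead = secondary.read("order:1"); // undefined: not replicated yet
secondary.replicate();
const freshRead = secondary.read("order:1"); // "paid"
console.log(staleRead, freshRead);
```

A request that writes to the primary and immediately reads from the secondary lands in the `staleRead` window, which is the "might not be able to read what you've written" case.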
Index
Moved to the MongoDB Index post.

Change Stream
The explanation is as follows.
Change streams allow applications to access real-time data changes without the complexity and risk of tailing the oplog. Applications can use change streams to subscribe to all data changes on a single collection, a database, or an entire deployment, and immediately react to them. Because change streams use the aggregation framework, applications can also filter for specific changes or transform the notifications at will.
Shell
Moved to the "MongoDB Komutları" (MongoDB Commands) post.

Authentication
The explanation is as follows
Authentication-wise, MongoDB supports 4 mechanisms:

- SCRAM (default)
- x.509 certificate authentication
- LDAP proxy authentication
- Kerberos authentication

If you are using MongoDB Enterprise Server, then you can benefit from LDAP and Kerberos support.
API
Moved to the MongoDB API post.

Mongoose
The explanation is as follows
To use MongoDB directly from Javascript rather than using the Mongo shell, we could either use the official MongoDB Node.js driver or we could use an Object Document Mapper (ODM). Mongoose is the officially supported ODM for Node.js, so it is what I have used for this work.

Mongoose requires you to define a schema for your data. This is actually a departure from vanilla MongoDB, which doesn’t require data in a collection to have a common schema. 
Example - an array inside a schema
We do it like this. Here roles is declared as [String]
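const mongoose = require("mongoose");

// User schema; roles is an array of strings that defaults to ["regular"]
const userSchema = new mongoose.Schema({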
    name: {
        type: String,
        required: true
    },
    email: {
        type: String,
        required: true,
        unique: true
    },
    password: {
        type: String,
        required: true
    },
    isAdmin: {
        type: Boolean,
        required: true,
        default: false
    },
    roles: {
        type:[String],
        required: true,
        default:["regular"]
    }
})
