Monday, February 27, 2023

Apache Kafka Connect Neo4JConnector

Example
We do it like this
Properties connectorProperties = new Properties();
connectorProperties.setProperty("name", "neo4j");
connectorProperties.setProperty("connector.class", "streams.kafka.connect.source.Neo4jSourceConnector");
connectorProperties.setProperty("tasks.max", "1");
connectorProperties.setProperty("topic", "some-topic");
connectorProperties.setProperty("neo4j.server.uri", "...");
connectorProperties.setProperty("neo4j.authentication.basic.username", "neo4j");
connectorProperties.setProperty("neo4j.authentication.basic.password", "password");
connectorProperties.setProperty("neo4j.streaming.poll.interval.msecs", "5000");
connectorProperties.setProperty("neo4j.streaming.property", "timestamp");
connectorProperties.setProperty("neo4j.streaming.from", "ALL");
connectorProperties.setProperty("neo4j.source.query",
"MATCH (ts:TestSource) RETURN ts.name AS name, ts.value AS value, ts.timestamp AS timestamp");

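These properties can then be registered with a Kafka Connect worker. Below is a minimal sketch (my own addition, not from the original post) that POSTs the equivalent JSON to the Connect REST API with java.net.http; the worker address localhost:8083 is an assumption, and neo4j.server.uri is left elided as above.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterNeo4jConnector {
  public static void main(String[] args) throws Exception {
    // Same settings as above, expressed as the JSON body expected by the Connect REST API
    String body = """
        {
          "name": "neo4j",
          "config": {
            "connector.class": "streams.kafka.connect.source.Neo4jSourceConnector",
            "tasks.max": "1",
            "topic": "some-topic",
            "neo4j.server.uri": "...",
            "neo4j.authentication.basic.username": "neo4j",
            "neo4j.authentication.basic.password": "password",
            "neo4j.streaming.poll.interval.msecs": "5000",
            "neo4j.streaming.property": "timestamp",
            "neo4j.streaming.from": "ALL",
            "neo4j.source.query": "MATCH (ts:TestSource) RETURN ts.name AS name, ts.value AS value, ts.timestamp AS timestamp"
          }
        }""";

    HttpRequest request = HttpRequest.newBuilder()
        .uri(URI.create("http://localhost:8083/connectors"))   // assumed Connect worker address
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build();

    HttpResponse<String> response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString());
    System.out.println(response.statusCode() + " " + response.body());
  }
}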
Solution Architect

Introduction
A Solution Architect is someone with expertise in a specific domain. Depending on the area of expertise, there can also be titles such as:
Data solution architect
Security solution architect
Cloud solution architect
Integration solution architect

A Solution Architect is the person who finds a solution to a business problem. In this process they get involved in many things, for example:
- Analyzing business requirements
- Evaluating technologies
- Designing the solution
- Creating tech documentation
- Managing the development team
- Solving technical problems
and so on.

But most importantly, they do the following:

1. Technical Vision
The explanation is as follows
..., solution architecture creates the overall technical vision for a specific solution to a business problem defined in the enterprise architecture under the domain
2. Overseeing the development
The explanation is as follows
Simply put, a solution architect is responsible for designing and overseeing the development of a company’s technical infrastructure. This includes everything from the systems and software that run a business to the networks and servers that support them. In short, solution architects are the masterminds behind a company’s technical operations, ensuring that everything runs smoothly and efficiently.
In other words, a solution is put forward and effort is made to reach it. The rest is carried out by the other roles on the team as supporting activities.

3. Continued
Here is an example. A Solution Architect who joins an ongoing project later can still be held responsible, even for decisions that were not their own.
Let’s say you are a Solution Architect who joined a new project and team, and you find that they use Sharding Pattern to improve performance for some critical queries of your application. You also get papers showing how it works: Architecture Views and Diagrams, some notes explaining what kind of data must be stored in each shard, and even more, you got some examples on Java. The team implemented everything according to the guidelines and wrote automation tests to ensure performance was good enough for the declared values. Also, the solution works very well, and end-users get what they need.

Time goes by, and you work on something else. Still, eventually, you get a call from Project Sponsors saying that one Architecture Consultancy recommended decreasing the number of database instances for cost optimizations. Your gut feeling tells you that if you do so, you will break something for the end-users, but you cannot explain why you believe that Sharding Pattern is the best choice here. You also realize there were many other ways to improve Performance: Cache, CQRS, Event Sourcing, and different patterns. Why is the solution based on Sharding? Why not something else? 

Chief Enterprise architect vs Solution architect
The figure is as follows
The explanation is as follows. In other words, the Solution Architect is more specialized.
Enterprise architect: Plan for a wide view on required components for the enterprise at strategic level with big picture for long term development

Solution architect: Plan for a particular solution / building block within the big picture (e.g. the enterprise data warehouse solution)
Interview Questions for a Solution Architect
A Solution Architect has to have at least some knowledge about everything, end to end. Some interview questions are as follows:
1. Describe the architecture of a high-performance and scalable data pipeline 
2. Explain how you would design a real-time streaming solution to process and analyze large datasets.
3. Describe your experience with containerization and how you would design a container-based solution.
4. Explain how you would design a distributed system that is fault-tolerant and highly available.
5. Describe your experience with cloud computing and how you would design a cloud-native solution.
6. Explain how you would design a microservices architecture and the challenges you have faced in implementing it.
7. Describe your experience with service-oriented architecture and how you would design a SOA solution.
8. Explain how you would design a solution to handle high-volume, low-latency data processing.
9. Describe your experience with data warehousing and how you would design a data warehousing solution.
10. Explain how you would design a solution for data governance and compliance.
11. Describe your experience with data modeling and how you would design a data model for a specific use case.
12. Explain how you would design a solution for data integration and data quality.
13. Describe your experience with data security and how you would design a solution for data security and privacy.
14. Explain how you would design a solution for data analytics and business intelligence.
15. Describe your experience with machine learning and how you would design a machine learning solution.
16. Explain how you would design a solution for real-time analytics and reporting.
17. Describe your experience with search and how you would design a search solution.
18. Explain how you would design a solution for data archiving and data retention.
19. Describe your experience with data replication and how you would design a data replication solution.
20. Explain how you would design a solution for data backup and disaster recovery.
21. Describe your experience with data governance and how you would design a data governance solution.
22. Explain how you would design a solution for data lineage and data provenance.
23. Describe your experience with data quality and how you would design a data quality solution.
24. Explain how you would design a solution for data governance and data stewardship.
25. Describe your experience with data governance and data management and how you would design a data governance and data management solution.
Solution Engineer
The explanation is as follows
... a Solutions Engineer tends to be trying to make a company's proposed solution the technology option for a potential customer.

Transition from Software Architect to Solution Architect
The explanation is as follows. In other words, this is a natural thing.
I transitioned from Software Architect to Solution Architect long ago. It’s a reasonably common career move. 
However, as with every transition, there is also a desire to apply the old skills in the new job, and this sometimes brings trouble. The explanation is as follows:
The problem in this situation is two-fold:

1. You know perfectly well software libraries
2. You don’t know well infrastructure components

It seems logical that people in this situation try to solve problems with the solutions they are most familiar with. However, it doesn’t mean it’s the best approach. It’s a bad one in most cases.


DDL with Apache Cassandra Query Language - CQL

ROLE
Example
We do it like this
CREATE ROLE iot_root_user WITH SUPERUSER = true AND 
                                LOGIN = true AND 
                                PASSWORD = 'password';

LIST roles;

DROP role iot_root_user;

KEYSPACE
Example
We do it like this
CREATE KEYSPACE iot
    WITH REPLICATION = {
        'class' : 'SimpleStrategy',
        'replication_factor' : 1
        };


DESCRIBE KEYSPACES;

DESCRIBE KEYSPACE iot;

DROP KEYSPACE iot;

USE iot;
CREATE
Example
We do it like this
CREATE COLUMNFAMILY sensor_events (
                    account_name text,
                    device_id UUID,
                    event_id UUID,
                    event_date date,
                    closest_devices_ip set<inet>,
                    temperatures list<int>,
                    tags map<text,text>,
                    latitude float,
                    longitude float,
                    humidity  int,
                    event_time time,
                    PRIMARY KEY((account_name,device_id), event_id, event_date)
);

DESCRIBE COLUMNFAMILIES;

DESCRIBE COLUMNFAMILY sensor_events;

ALTER COLUMNFAMILY sensor_events ADD pressure int;
TRUNCATE
Example
We do it like this
TRUNCATE COLUMNFAMILY sensor_events;

Docker Compose and Apache Cassandra

Example
We do it like this
version: '3.9'

services:
  cassandra_db:
    container_name: cassandra_local_db
    image: cassandra:4.1.0
    ports:
      - "9042:9042"
    volumes:
      - ./config/cassandra-config.yaml:/etc/cassandra/cassandra.yaml
      - ./data/data-for-input.csv:/tmp/data/data-for-input.csv
The explanation is as follows
The Cassandra database has a configuration file where all configurations are stored. This file is located in etc/cassandra directory and is called cassandra.yaml.
To connect, we do it like this
$docker container exec -it cassandra_local_db bash

$nodetool  status

$cqlsh -ucassandra -pcassandra


Apache Cassandra Primary Key

Introduction
The explanation is as follows
Every row in Cassandra is identified by a primary key consisting of two parts:
1. partition key — defining location in the cluster. Partition key hash indicates on which node on the Cassandra cluster the partition is located
2. clustering key — defining row location inside of the partition

Queries by partition key or by partition key and clustering key are fast & efficient. 
The figure is as follows


The explanation is as follows
Type of Partitioning: Cassandra use a compromise between range based and hash based. You can have a compound primary key consisting of several columns. Only the first part is hashed to determine the partition, other columns are used as concatenated index for sorting the data in SSTables. If a fixed value for first column is specified, range scan can be done efficiently.
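To make this concrete, here is a minimal sketch (my own addition) that queries the sensor_events table defined earlier by its full partition key, assuming the DataStax Java driver 4.x and the local docker-compose setup above; the account name and device id values are hypothetical.

import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.ResultSet;
import com.datastax.oss.driver.api.core.cql.Row;
import java.util.UUID;

public class PartitionKeyQueryDemo {
  public static void main(String[] args) {
    // Connects to 127.0.0.1:9042 by default (matches the docker-compose setup above)
    try (CqlSession session = CqlSession.builder().withKeyspace("iot").build()) {
      UUID deviceId = UUID.fromString("11111111-1111-1111-1111-111111111111"); // hypothetical
      // Restricting the query to the full partition key (account_name, device_id) lets
      // Cassandra hash it, find the owning node and read a single partition; the
      // clustering columns (event_id, event_date) order the rows inside that partition.
      ResultSet rs = session.execute(
          "SELECT event_id, event_date, humidity FROM sensor_events "
              + "WHERE account_name = ? AND device_id = ?",
          "acme", deviceId);
      for (Row row : rs) {
        System.out.println(row.getUuid("event_id") + " -> " + row.getInt("humidity"));
      }
    }
  }
}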
In other words, the Clustering Key is used for sorting. The figure is as follows




Column-Oriented Database

Introduction
The explanation is as follows. Some examples of column-oriented databases are Apache Cassandra and Apache HBase.
There are two types of data storage databases one is a row-oriented database and another one is a column-oriented database.

Row-Oriented (ACID Transactions): They are databases that organize data by the record, keeping all of the data associated with a record next to each other in memory. Row-oriented databases are the traditional way of organizing data and still provide some key benefits for storing data quickly. They are optimized for reading and writing rows efficiently.
Common row-oriented databases:
- PostgreSQL
- MySQL

Column-Oriented (Analytics): They are databases that organize data by field, keeping all of the data associated with a field next to each other in memory. Columnar databases have grown in popularity and provide performance advantages to querying data. They are optimized for reading and computing on columns efficiently.
Common column-oriented databases:
- AWS RedShift
- Google BigQuery
- HBase
A relational database looks like the following figure

But a column-oriented database looks like the following figure. In short, it can be thought of as collecting all the values of each column in a single row.


The figure is as follows

Friday, February 24, 2023

JSON Web Token (JWT) Claim Types

What Is a Claim
Claim information is carried in the payload, i.e. the second part of the token. The explanation is as follows.
Claims are statements about the entity, which is typically a user, and any additional data. There are three types of claims:

- Registered claims: a set of recommended claims defined in the RFC 7519 spec. Some examples are iss, exp, and aud.

- Public claims: user-defined claims that can be defined by the token users, but should conform to naming conventions to avoid collision (should be defined in the IANA JSON Web Token Registry or be defined as a URI that contains a collision resistant namespace) because they are in the public namespace.

- Private claims: arbitrary custom claims that are used to share information between parties that agree on them (and don’t have to worry about name collision because they’re private).
A claim can carry a timestamp. The explanation is as follows.
have a payload that contains the claim(s) (equipped with a timestamp)
1. Registered or Reserved Claim Types
The explanation is as follows.
These are the claims which are registered in the IANA "JSON Web Token Claims" registry. These claims are not mandatory to use or to be implemented in all cases; rather, they are registered to provide a starting point for a set of useful, interoperable claims.
Example
The explanation is as follows.
iss is who issued the token. This is a registered claim.
exp is when the token expired. Also a registered claim.
sub is the subject. This is usually a user identifier. Also a registered claim.
scope is a custom, private claim that is commonly used with OAuth 2.0.
The explanation is as follows (a small code sketch follows this list).
1. iss (issuer): The "iss" (issuer) claim identifies the principal that issued the JWT. The processing of this claim is generally application specific. The "iss" value is a case-sensitive string containing a String or URI value. Use of this claim is OPTIONAL.

2. sub (subject): This claim represents the subject of JWT (the user). The subject value MUST either be scoped to be locally unique in the context of the issuer or be globally unique. The processing of this claim is generally application specific. The "sub" value is a case-sensitive string containing a String or URI value. Use of this claim is OPTIONAL.

3. aud (audience): This claim represents the intended recipient of the JWT. If the party processing the claim does not identify itself with a value in the "aud" claim when this claim is present, then the JWT MUST be rejected. In the general case, the "aud" value is an array of case- sensitive strings, each containing a String or URI value. Use of this claim is OPTIONAL.

4. exp (expiration): The "exp" (expiration time) claim identifies the expiration time on or after which the JWT MUST NOT be accepted for processing. The processing of the "exp" claim requires that the current date/time MUST be before the expiration date/time listed in the "exp" claim. Usually the value is kept short preferably in seconds. Its value MUST be a number containing a NumericDate value. Use of this claim is OPTIONAL.

5. nbf (not before): The "nbf" (not before) claim identifies the time before which the JWT MUST NOT be accepted for processing. The processing of the "nbf" claim requires that the current date/time MUST be after or equal to the not-before date/time listed in the "nbf" claim. Implementers MAY provide for some small leeway, usually no more than a few minutes, to account for clock skew. Its value MUST be a number containing a NumericDate value. Use of this claim is OPTIONAL.

6. iat (issued at): The "iat" (issued at) claim identifies the time at which the JWT was issued. This claim can be used to determine the age of the JWT. Its value MUST be a number containing a NumericDate value. Use of this claim is OPTIONAL.

7. jti (JWT ID): The "jti" (JWT ID) claim provides a unique identifier for the JWT. The identifier value MUST be assigned in a manner that ensures that there is a negligible probability that the same value will be accidentally assigned to a different data object; if the application uses multiple issuers, collisions MUST be prevented among values produced by different issuers as well. The "jti" claim can be used to prevent the JWT from being replayed. The "jti" value is a case-sensitive string. Use of this claim is OPTIONAL.
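As a small illustration of these registered claims, here is a minimal sketch of my own, assuming the jjwt library; the issuer, audience, subject and signing key are hypothetical.

import io.jsonwebtoken.Jwts;
import io.jsonwebtoken.SignatureAlgorithm;
import io.jsonwebtoken.security.Keys;
import java.time.Instant;
import java.util.Date;
import java.util.UUID;
import javax.crypto.SecretKey;

public class JwtClaimsDemo {
  public static void main(String[] args) {
    SecretKey key = Keys.secretKeyFor(SignatureAlgorithm.HS256); // hypothetical signing key
    Instant now = Instant.now();

    String jwt = Jwts.builder()
        .setIssuer("https://issuer.example.com")        // iss
        .setSubject("user-42")                          // sub
        .setAudience("my-api")                          // aud
        .setIssuedAt(Date.from(now))                    // iat
        .setNotBefore(Date.from(now))                   // nbf
        .setExpiration(Date.from(now.plusSeconds(300))) // exp, kept short
        .setId(UUID.randomUUID().toString())            // jti
        .claim("scope", "read:contacts")                // custom claim
        .signWith(key)
        .compact();

    System.out.println(jwt);
  }
}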
2. Public Claim Types
These are defined as we wish. The explanation is as follows.
These claim names can be defined at will by those using JWTs. However, in order to prevent collisions, any new Claim Name should either be registered in the IANA "JSON Web Token Claims" registry or be a Public Name: a value that contains a Collision-Resistant Name.

In each case, the definer of the name or value needs to take reasonable precautions to make sure they are in control of the part of the namespace they use to define the Claim Name.
3. Private Claim Types
These are created to be exchanged between systems that agree on them. The explanation is as follows.
This could be thought of as analogous to creating private custom claims to share information specific to your application. These could be any names that are not Registered Claims Names or Public Claims Names. Unlike Public Claim Names, Private Claim Names are subject to collision and should be used with caution.
What Is the scope Claim
It specifies which resource access is granted to, and also what kind of permission is granted on that resource, for example "read:contacts". The explanation is as follows
The scope claim is commonly used to provide authorization information. For example, letting the application know what part of the application the user is authorized to access. This, of course, does not relieve the server of its duty to perform its own authorization checks. A general principle of web application security is redundancy. The client app provides one checkpoint, the server another.
Thanks to scopes, access is granted only to the specified part of the resource, not to all of its data. The explanation is as follows.
Scopes define the permissions that determine what data of a user an application can access. For instance, if a 3rd party application wants to recommend movies to a user, it requires access to the movies the user has watched (e.g., “watched_movies”). This is where scopes come into play. This 3rd party application can access user information only to the extent the user has permitted.

This process ensures the safety of user information. Instead of accessing all of a user’s data, the 3rd party can access user data within the permissions granted.
What Are Authorities
The explanation is as follows.
Authorities represent the actions (that one has permission for) a user can perform within an application. Compared to scopes, they are usually more detailed and specify which actions can be carried out within a specific application.

For example, a user can add a movie to their favorites (e.g., “user”). The permissions to perform this action are called authorities.
While parsing a JWT token, Spring Security reads the "scope" or "scp" claims and converts them into a list of strings. For each element in the list it adds the 'SCOPE_' prefix and creates a SimpleGrantedAuthority object.
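A minimal sketch of making this behaviour explicit (my own addition), assuming Spring Security's OAuth2 resource server support is on the classpath:

import org.springframework.security.oauth2.server.resource.authentication.JwtAuthenticationConverter;
import org.springframework.security.oauth2.server.resource.authentication.JwtGrantedAuthoritiesConverter;

public class ScopeAuthoritiesConfig {

  // Builds the converter Spring Security uses to turn scope claims into granted authorities
  public JwtAuthenticationConverter jwtAuthenticationConverter() {
    JwtGrantedAuthoritiesConverter authoritiesConverter = new JwtGrantedAuthoritiesConverter();
    authoritiesConverter.setAuthorityPrefix("SCOPE_");     // the default prefix described above
    authoritiesConverter.setAuthoritiesClaimName("scope"); // restrict to "scope" (the default also falls back to "scp")

    JwtAuthenticationConverter converter = new JwtAuthenticationConverter();
    converter.setJwtGrantedAuthoritiesConverter(authoritiesConverter);
    return converter;
  }
}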


Result Object

Introduction
It is used when a method can return more than one kind of result. Java's Optional does not fully cover the Result Object usage, because in the error case it throws an exception instead of returning an error message. The explanation is as follows
using exceptions for expected cases is considered an anti-pattern
So the Optional usage is like this
public Account findAccountByUserIdX(int userId) {
  var user = repository.findById(userId)
      .orElseThrow(UserNotFoundException::new);
  return repository.findAnAccountOfUser(user)
      .orElseThrow(AccountNotFoundException::new);
}
Instead, some Java libraries offer a structure called Either
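A minimal sketch of that style (my own addition), assuming the Vavr library; the Account record and the lookup rule are hypothetical.

import io.vavr.control.Either;

public class EitherDemo {

  record Account(int userId) {}

  // Failure is returned as a value (Left) instead of being thrown
  static Either<String, Account> findAccountByUserId(int userId) {
    if (userId <= 0) {
      return Either.left("User not found: " + userId); // error message, no exception
    }
    return Either.right(new Account(userId));
  }

  public static void main(String[] args) {
    String message = findAccountByUserId(42)
        .fold(error -> "Failed: " + error,
              account -> "Found account of user " + account.userId());
    System.out.println(message); // Found account of user 42
  }
}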

Note: For the input side, see the Parameter Object post.

Example
Suppose we want to find the main method object inside a jar. We can return one of two things:
1. The Method object
2. An error message

We do it like this
class MainMethodFinder {

  Result result = new Result();

  static class Result {

    Method mainMethod;

    private String errorMessage;

    public String getErrorMessage() {
      return errorMessage;
    }

    public Method getMainMethod() {
      return mainMethod;
    }

    boolean hasError() {
      return !StringUtil.isNullOrEmpty(errorMessage);
    }
  }
  ...
}
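Hypothetical usage of this Result object (the jar-scanning step is omitted here):

MainMethodFinder finder = new MainMethodFinder();
// ... scan the jar; the finder fills in either mainMethod or errorMessage ...
if (finder.result.hasError()) {
  System.err.println("Failed: " + finder.result.getErrorMessage());
} else {
  System.out.println("Found: " + finder.result.getMainMethod());
}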


Thursday, February 23, 2023

Apache Kafka Connect JdbcSinkConnector

Introduction
For example, io.debezium.connector.mysql.MySqlConnector is used to read from MySQL, but io.confluent.connect.jdbc.JdbcSinkConnector is used to write to PostgreSQL.

1. The connector is given a name
2. connector.class is always specified as io.confluent.connect.jdbc.JdbcSinkConnector.

3. The database connection information is defined. These fields are:
connection.url
connection.user
connection.password

4. The topics to consume are specified with topics. The explanation is as follows
Topics to subscribe for updates.

5. If the table does not exist in the target database, auto.create can be used to have it created. The explanation is as follows
Auto-create the schema objects if they do not exist e.g. the table will be auto-created during the initial sync.

Example
We do it like this
$ curl -X POST -H "Accept:application/json" -H "Content-Type:application/json"
--data @postgres-sink-btc.json http://localhost:8083/connectors
The contents of the file are as follows
{
  "name": "postgres-sink-btc",
  "config": {
    "connector.class":"io.confluent.connect.jdbc.JdbcSinkConnector",
    "tasks.max":"1",
    "topics": "topic_BTC",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "connection.url": "jdbc:postgresql://questdb:8812/qdb?useSSL=false",
    "connection.user": "admin",
    "connection.password": "quest",
    "key.converter.schemas.enable": "false",
    "value.converter.schemas.enable": "true",
    "auto.create": "true",
    "insert.mode": "insert",
    "pk.mode": "none"
  }
}
The explanation is as follows
topics: Kafka topic to consume and convert into Postgres format.

connection: Using default credentials for QuestDB (admin/quest) on port 8812.

value.converter: This example uses JSON with schema, but you can also use Avro or raw JSON. 
Example
We do it like this
name=test-jdbc-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
connection.url=jdbc:postgresql://postgres-postgresql:5432/test?user=postgres&password=<postgresql-password>
topics=students
dialect.name=PostgreSqlDatabaseDialect
auto.create=true
insert.mode=upsert
pk.fields=id
pk.mode=record_value
transforms=unwrap
transforms.unwrap.type=io.debezium.transforms.UnwrapFromEnvelope
The explanation for the transforms* fields is as follows
This is another SMT provided by Debezium that we are going to use. By default, the structure of debezium is complex and consists of multiple levels of information including event key schema, event key payload, event value schema, event value payload (For details refer to Connector Documentation). Even in the event value payload section, we have multiple structures for values before and after. Of these, we are only interested in the final payload and that is what this SMT provides us with. It unwraps the original message and provides a relevant section. Note that we have applied this SMT after the data is saved in Kafka and before it is inserted in PostgreSQL so that Kafka remains a source of truth and has all the information (if and when required).

Apache Cassandra Debezium Connector

Example
We do it like this
{
    "name": "kafka-cosmosdb-sink",
    "config": {
        "connector.class": "com.datastax.oss.kafka.sink.CassandraSinkConnector",
        "tasks.max": "1",
        "topics": "myserver.retail.orders_info",
        "contactPoints": "<Azure Cosmos DB account name>.cassandra.cosmos.azure.com",
        "loadBalancing.localDc": "<Azure Cosmos DB region e.g. Southeast Asia>",
        "datastax-java-driver.advanced.connection.init-query-timeout": 5000,
        "ssl.hostnameValidation": true,
        "ssl.provider": "JDK",
        "ssl.keystore.path": "<path to JDK keystore path e.g. <JAVA_HOME>/jre/lib/security/cacerts>",
        "ssl.keystore.password": "<keystore password: it is 'changeit' by default>",
        "port": 10350,
        "maxConcurrentRequests": 500,
        "maxNumberOfRecordsInBatch": 32,
        "queryExecutionTimeout": 30,
        "connectionPoolLocalSize": 4,
        "auth.username": "<Azure Cosmos DB user name (same as account name)>",
        "auth.password": "<Azure Cosmos DB password>",
        "topic.myserver.retail.orders_info.retail.orders_by_customer.mapping": "order_id=value.orderid, customer_id=value.custid, purchase_amount=value.amount, city=value.city, purchase_time=value.purchase_time",
        "topic.myserver.retail.orders_info.retail.orders_by_city.mapping": "order_id=value.orderid, customer_id=value.custid, purchase_amount=value.amount, city=value.city, purchase_time=value.purchase_time",
        "key.converter": "org.apache.kafka.connect.storage.StringConverter",
        "transforms": "unwrap",
        "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
        "offset.flush.interval.ms": 10000
    }
}


Wednesday, February 22, 2023

Accord Consensus Algorithm

Introduction
The explanation is as follows
... called Accord (pdf) from a team at Apple and the University of Michigan. 
The explanation is as follows
Accord addresses two problems that aren’t solved in previous consensus protocols: How can we have a globally available consensus and achieve it in one round trip? The first novel mechanism is the reorder buffer.

Assuming commodity hardware is in use, differences in clocks between nodes are inevitable. The reorder buffer measures the difference between nodes in addition to the latency between them. Each replica can use this information to correctly order data from each node and account for the differences, guaranteeing one round-trip consensus with a timestamp protocol.

The other mechanism is fast-path electorates. Failure modes can create latency when electing a new leader before resuming. Fast-path electorates use pre-existing features in Cassandra with some novel implementations to maintain a leaderless fast path to quorum under the same level of failure tolerated by Cassandra.

Tuesday, February 21, 2023

Calculating the Apache Kafka Partition Count

What Is a Partition?
The explanation is as follows. A topic is divided into smaller pieces, and each piece is called a partition.
Topics are divided into smaller parts called partitions. A partition can be described as a commit log. Messages can be appended to the log and can be read-only in the order from the beginning to the end. Partitions are designed to provide redundancy and scalability. The most important fact is that partitions can be hosted on different servers (brokers), giving a very powerful way to scale topics horizontally.
Partitioning Methods
You can look at the Consistent Hashing post. The explanation is as follows
A common and efficient approach is Consistent Hashing. Although, others such as Hashing and range-based partitioning can still be used.
Global Event Order
I moved this to the Partition and Message Order post.

How Is the Partition Count Determined?
When creating a topic with the kafka-topics command, the desired number of partitions can also be specified. How is this number calculated?

Broker Count
If the broker count is < 6, the explanation is as follows
Partitions per topic is the million-dollar question and there’s no one answer. If you have a small cluster of fewer than six brokers, create two times the number of brokers you have ( N X 2 if N < 6). The reasoning behind it is that if you have more brokers over time, for example 12 brokers. Well, at least you will have enough partitions to cover that.
Consumer Count
The explanation about the partition value is as follows. In other words, the consumer count is also a factor when determining the partition count.
Remember! Kafka topic partition must be the same or less than the number of concurrent consumer threads
Conclusion
The smaller of these two numbers is taken.
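The chosen partition count can also be applied programmatically; here is a minimal sketch of my own with the Kafka AdminClient, assuming a local 3-broker cluster (< 6 brokers, so 2 x 3 = 6 partitions) reachable at localhost:9092.

import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicDemo {
  public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed cluster

    try (AdminClient admin = AdminClient.create(props)) {
      // 3 brokers (< 6) -> 2 x 3 = 6 partitions, which should also cover the planned consumer threads
      NewTopic topic = new NewTopic("some-topic", 6, (short) 3);
      admin.createTopics(List.of(topic)).all().get();
    }
  }
}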




Monday, February 20, 2023

Backend for Frontend - BFF

Introduction
The explanation is as follows
Instead of a single point of entry, it introduces multiple gateways. Because of that, you can have a tailored API that targets the needs of each client (mobile, web, desktop, voice assistant, etc.), and remove a lot of the bloat caused by keeping it all in one place.
The figure is as follows. It provides different entry points for Web and Mobile.


Actually BFF is not a first. Some other attempts that describe more or less the same idea are MVC, MVP, MVVM, etc.



Friday, February 17, 2023

Apache Kafka: Kafka Raft Instead of Zookeeper

Why Is It Needed?
The explanation is as follows. In other words, with Zookeeper everything stays slow. Also, Kafka and Zookeeper are two different tools, and since they are very different from each other you have to learn both, which is a waste of time and energy.
In a Kafka cluster, the Kafka Controller manages broker, topic, and partition meta-data (the “control plane”) 
...
But how does Kafka know which controller should be active, and where the meta-data is stored in case of broker failures? Traditionally a ZooKeeper Ensemble (cluster) has managed the controller elections and meta-data storage ...

But this introduces a bottleneck between the active Kafka controller and the ZooKeeper leader, and meta-data changes (updates) and failovers are slow (as all communication is between one Kafka broker and one ZooKeeper server, and ZooKeeper is not horizontally write scalable). The new Kafka Raft mode (KRaft for short) replaces ZooKeeper with Kafka topics and the Raft consensus algorithm to make Kafka self-managed,...

The Kafka cluster meta-data is now only stored in the Kafka cluster itself, making meta-data update operations faster and more scalable. The meta-data is also replicated to all the brokers, making failover from failure faster too. Finally, the active Kafka controller is now the Quorum Leader, using Raft for leader election.

The motivation for giving Kafka a “brain transplant” (replacing ZooKeeper with KRaft) was to fix scalability and performance issues, enable more topics and partitions, and eliminate the need to run an Apache ZooKeeper cluster alongside every Kafka cluster.
Removal of Zookeeper
Starting with Kafka 2.8.0, work has been under way with KIP-500 (Kafka Improvement Proposal 500) to remove Zookeeper. When Kafka version 2.8.0 was released this had not happened yet. The explanation is as follows
Kafka Version 2.8.0 introduces an early access to Zookeeper-less Kafka as part of KIP-500 using the KRaft mode. The implementation is partially complete and thus is not to be used in production environments.
With Kafka 3.3, Zookeeper can optionally be removed. The explanation is as follows
The changes have been happening incrementally, starting from Kafka version 2.8 (released April 2021) which had KRaft in early access. In version 3.0 (September 2021) it was in preview. The maturity of KRaft has improved significantly since version 2.8 (3.0 had 20 KRaft improvements and bug fixes, 3.0.1 had 1, 3.1.0 had 5, and 3.1.1 had only 1 minor fix), and on October 3 2022 version 3.3 was marked as production ready (for new clusters).
KRaft Instead of Zookeeper
The explanation is as follows
“KRaft” stands for “Kafka Raft”, and is a combination of Kafka topics and the Raft consensus algorithm.
@metadata Topic
The explanation is as follows. The controllers in the internal quorum now communicate through the @metadata topic.
When Kafka Raft Metadata mode is enabled, Kafka will store its metadata and configurations into a topic called @metadata. This internal topic is managed by the internal quorum and replicated across the cluster. The nodes of the cluster can now serve as brokers, controllers or both (called combined nodes).

When KRaft mode is enabled, only a few selected servers can serve as controllers and these will compose the internal quorum. The controllers can either be in active or standby mode that will eventually take over if the current active controller server fails or is taken down.
Normally the "broker liveliness" information is stored in Zookeeper, and the Kafka Controller constantly checks this liveliness information. This will no longer be the case.

AWS Simple Email Service - SES

Introduction
The explanation is as follows
Amazon Simple Email Service (SES) is an API for sending emails and managing email lists. SES’s main API endpoints are focused on sending emails and managing email contacts. The service also includes more advanced endpoints related to deliverability, like managing the dedicated IP addresses from which SES sends your emails.
The explanation is as follows
The first thing you have to do is to add email identities to Amazon SES. It will only allow the registered email identities to send the emails. By default, your Amazon SES will be working on a sandbox environment. So you can only send the emails to the registered email identities. You can request AWS support to remove the sandbox environment.
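A minimal sketch of sending a mail with the AWS SDK for Java v2 (my own addition), assuming both addresses are already verified identities; the region and addresses are hypothetical.

import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.ses.SesClient;
import software.amazon.awssdk.services.ses.model.Body;
import software.amazon.awssdk.services.ses.model.Content;
import software.amazon.awssdk.services.ses.model.Destination;
import software.amazon.awssdk.services.ses.model.Message;
import software.amazon.awssdk.services.ses.model.SendEmailRequest;

public class SesSendDemo {
  public static void main(String[] args) {
    try (SesClient ses = SesClient.builder().region(Region.EU_WEST_1).build()) {
      SendEmailRequest request = SendEmailRequest.builder()
          .source("sender@example.com")                 // must be a verified identity
          .destination(Destination.builder()
              .toAddresses("recipient@example.com")     // in sandbox mode this must also be verified
              .build())
          .message(Message.builder()
              .subject(Content.builder().data("Hello from SES").build())
              .body(Body.builder()
                  .text(Content.builder().data("This is a test mail.").build())
                  .build())
              .build())
          .build();

      ses.sendEmail(request);
    }
  }
}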

Wednesday, February 15, 2023

Docker Compose and NGINX

Example
We do it like this
version: '3'
services:
  lb:
    build:
      context: nginx
      dockerfile: Dockerfile
    ports:
      - "9090:9090"
    networks:
      - my-network
    depends_on:
      - service1
      - service2

  service1:
    build:
      context: service1
      dockerfile: Dockerfile
    ports:
      - "8181:8080"
    networks:
      - my-network

  service2:
    build:
      context: service2
      dockerfile: Dockerfile
    ports:
      - "8282:8080"
    networks:
      - my-network

networks:
  my-network:
    driver: bridge


NGINX - Load Balancer in the nginx.conf File

1. Servers Are Defined with upstream
We do it like this.
upstream app1 {
  server host.docker.internal:5000;
}

upstream app2 {
  server host.docker.internal:5001;
}

server {
  ...
}
HTTP Load Balancer
With proxy_pass, the request is sent to the servers defined in the upstream block

Example
We do it like this. 10% of the requests coming to http://localhost:9090 are sent to service1, and 90% to service2.
# here we must point to the internal port of application ;)
upstream servers {
  server service1:8080 weight=1 fail_timeout=15s;
  server service2:8080 weight=9 fail_timeout=15s;
}

server {
  listen 9090;
  location / {
    proxy_redirect off;
    proxy_pass http://servers;
  }
}
gRPC Load Balancer
With grpc_pass, the request is sent to the servers defined in the upstream block

Example
We do it like this
upstream grpcnodes {
  server ip_address:8001;
  server ip_address:8002;
  server ip_address:8003;
}
server {

  listen 1443 http2;
    ssl_certificate /home/ubuntu/http2/certificates/localhost-certificate.pem;
    ssl_certificate_key /home/ubuntu/http2/certificates/localhost-privatekey.pem;

    location / {
      grpc_pass grpcnodes;
      ## try_files $uri $uri/ =404;  # comment this out, otherwise you will get 404 as a response
    }
}

Sunday, February 12, 2023

Kibana

Introduction
It listens on port 5601. We connect here to look at the logs.

Example
We go to http://localhost:5601/app/dev_tools#/console.
To create an index, we do it like this

To query, we do it like this




Tuesday, February 7, 2023

aws command auto complete

Example - on-partial
We do it like this
[my-profile]
#...
cli_auto_prompt = on-partial

Example - on
We do it like this
[my-profile]
#...
cli_auto_prompt = on

Monday, February 6, 2023

AWS Aurora - Slightly More Advanced Than AWS RDS

Introduction
The explanation is as follows. It is slightly more advanced than AWS RDS. It can be used as both MySQL and PostgreSQL.
- MySQL & PostgresSQL compatible relational database.
- Provides 5x better performance than MySQL
- Provides 3x better performance than Postgres SQL
- Distributed, fault-tolerant, self-healing storage system
- 2 copies of your data is contained in each Availability Zone (AZ) — minimum of 3 AZ’s and 6 copies.
- Can handle the loss of up to 2 copies without affecting write ability.
- Can handle the loss of up to 3 copies of data without affecting read ability.
- Automated backups always enabled — doesn’t impact performance.
The explanation is as follows
Amazon Aurora is an elevated version of Amazon RDS. Large enterprises use this since their data volume and complexity of operations are much higher. It doesn't support all the same database engines as Amazon RDS, and instead only supports MySQL and PostgreSQL. Aurora scales up and down as the load on your database increases and decreases. Newer providers like PlanetScale also offer this capability with additional schema migration features and lower costs.

Amazon Aurora, like RDS, can perform replication. It actually offers about 15 different types of replications, and one replication can be done within milliseconds. On the other hand, RDS can perform only five types of replications, taking more time.

Some of the use cases that can depict the strength of Amazon Aurora are enterprise applications, SaaS applications, and web/mobile gaming.
Benefits
  1. Auto-scaling allows scaling operations to be carried out automatically without affecting the DB performance. It allows up to 128 TB per database instance.
  2. Aurora backup operations are automatic, continuous, incremental, and have about 99.99999999% durability.
  3. Aurora can detect database failure and recover in less than a minute. Furthermore, in case of a permanent failure, it can automatically move to a replica without data loss.
Replication
The explanation is as follows
- It expands the Multi-AZ replicas from 2 to up to 15.
- It automatically copies data across six storage nodes in different AZs of the same region, even if you do not have any read replica.

Having more replicas gives you increased read capacity but also more options when promoting a read replica to become a new primary. The lag between the primary and the replica is usually less than 100 ms, which means there is still a chance of an outdated read after a write.

Even if you decide not to have a replica, the fact that you have copies of data stored automatically in different AZs helps to prevent data loss, albeit at the expense of a longer recovery time, as a new primary will be recreated in the event of failure with the existing one.

Capacity Type Settings in Amazon Aurora
The following options are available
- Provisioned Capacity
- Serverless Capacity
Upgrade
There is an article about this. The options are as follows
1. In-place upgrade
2. AWS Database Migration Service
3. Manual migration with PostgreSQL tools