Tuesday, September 21, 2021

What Is Data Mesh?

Introduction
An explanation:
Data mesh is a highly decentralized data architecture in which independent teams are responsible for managing data within their domains. Each domain or department, such as finance or supply chain, becomes a building block of an ecosystem of data products called mesh.
History
An explanation:
The concept of data mesh was created by Zhamak Dehghani, director of next tech incubation at the software consultancy ThoughtWorks. Dehghani attempted to solve problems caused by centralized data infrastructure. She observed how many of her clients that centralized all of their data in one platform found it hard to manage a huge variety of data sources. A centralized setup also forced teams to change the whole data pipeline when responding to new needs. Teams struggled with solving the influx of data requests from other departments, which suffocated innovation, agility, and learning.
Challenges of Data Mesh
An explanation:
Decentralized data architecture also leads to several challenges, such as:

- Duplication of Data: Repurposing and replicating data from the source domain to serve other domain’s needs may lead to the duplication of data and higher data management costs.

- Neglected Quality: The existence of multiple data products and pipelines may lead to the neglect of quality principles and huge technical debt.

- Change Management Efforts: Deploying data mesh architecture and decentralized data operations will involve a lot of change management efforts.

- Choosing Future-Proof Technologies: Teams will have to carefully decide on which technologies to use to standardize them across the company and ensure they can tackle future challenges.
Data Mesh vs. Data Fabric
An explanation:
The data mesh concept may appear similar to data fabric as both architectures provide access to data across various platforms. But there are several differences between the two.

For one, data fabric brings data to a unified location, while with data mesh, data sets are stored across multiple domains. Also, data fabric is tech-centric because it primarily focuses on technologies, such as purpose-built APIs, and how they can be efficiently used to collect and distribute data. Data mesh, however, goes a step further. It not only requires teams to build data products by copying data into relevant data sets but also introduces organizational changes, including the decentralization of data ownership.

There are various interpretations of how data mesh compares to data fabric. And two companies may introduce different tech solutions for data mesh or data fabric depending on their data size and type, security protocols, employee skillsets, and financial resources.

Sunday, September 19, 2021

Time Zone Handling

Introduction
There are some recommendations here.

Database Column Types
Never store dates and times as text in the database; there are dedicated column types for this. It is also better to store the values in UTC. For PostgreSQL, TIMESTAMP WITH TIME ZONE can be used.
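For example, with JDBC 4.2 a java.time value can be bound directly to such a column. A minimal sketch, assuming a hypothetical events table with a created_at TIMESTAMP WITH TIME ZONE column:
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

public class EventDao {

  // Binds an OffsetDateTime in UTC; JDBC 4.2 drivers map it to TIMESTAMP WITH TIME ZONE.
  // The events table and created_at column are hypothetical.
  public void insertEvent(Connection connection, long eventId) throws SQLException {
    String sql = "INSERT INTO events (id, created_at) VALUES (?, ?)";
    try (PreparedStatement statement = connection.prepareStatement(sql)) {
      statement.setLong(1, eventId);
      statement.setObject(2, OffsetDateTime.now(ZoneOffset.UTC));
      statement.executeUpdate();
    }
  }
}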

Database Server
The database server itself should also run in the UTC time zone, as shown in the figure. This also makes SELECT queries against the database easier.
Note: Docker containers already start in UTC by default.

Java
If we are using Java, the legacy java.util.Date and java.util.Calendar classes should not be used. The new java.time classes are:
Class           String representation                                Time zone applicability
Instant         2021-05-27T16:05:48.147558500Z                       Always UTC
LocalTime       21:35:48.284556200                                   NA
LocalDate       2021-05-27                                           NA
LocalDateTime   2021-05-27T21:35:48.286557                           NA
OffsetTime      21:35:48+05:30                                       Supports
OffsetDateTime  2021-05-27T21:35:48.288557+05:30                     Supports
ZonedDateTime   2021-05-27T21:35:48.291556100+05:30[Asia/Colombo]    Supports (DST aware)
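A minimal sketch showing how these types relate to each other; the zone ID is just an example:
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

public class TimeZoneDemo {
  public static void main(String[] args) {
    Instant now = Instant.now();                                    // always UTC
    OffsetDateTime utc = now.atOffset(ZoneOffset.UTC);              // fixed offset, no DST rules
    ZonedDateTime colombo = now.atZone(ZoneId.of("Asia/Colombo"));  // full zone rules, DST aware

    System.out.println(now);      // e.g. 2021-05-27T16:05:48.147558500Z
    System.out.println(utc);      // the same instant with a +00:00 offset
    System.out.println(colombo);  // the same instant rendered in Asia/Colombo
  }
}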

Hibernate
If we are using Hibernate, we do the following to persist the data as UTC:
#for plain hibernate
hibernate.jdbc.time_zone=UTC

#for Spring boot jpa
spring.jpa.properties.hibernate.jdbc.time_zone=UTC


ISO 8601

Introduction
As shown in the figure, a value consists of Date + Time + UTC offset.

Example - Date and Time
2021-06-13T17:55:02+0000
Example - Date and Time
2023-08-07
2023-08-07T13:25:38Z
Example - Mixed
2023-W32-1T15:38+02:00 (= Monday of the 32nd week in my local time zone)
Because of such convoluted forms, it is said that RFC 3339 may be the better choice.
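A minimal sketch of parsing the simple date and date-time forms with java.time (the week-date form above is not handled here, and the offset is written with a colon so the default formatter accepts it):
import java.time.Instant;
import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

public class Iso8601Demo {
  public static void main(String[] args) {
    // Date only
    LocalDate date = LocalDate.parse("2023-08-07");

    // Date + time + 'Z' (UTC) suffix
    Instant instant = Instant.parse("2023-08-07T13:25:38Z");

    // Date + time + numeric offset with a colon
    OffsetDateTime offset = OffsetDateTime.parse("2021-06-13T17:55:02+00:00",
        DateTimeFormatter.ISO_OFFSET_DATE_TIME);

    System.out.println(date + " / " + instant + " / " + offset);
  }
}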

Oracle Coherence

Introduction
I used to see it as a Memcached equivalent, but it can actually do much more.

1. Using It with Spring Boot
With Maven, we include the following dependencies:
<dependencies>
  <dependency>
    <groupId>com.oracle.coherence.spring</groupId>
    <artifactId>coherence-spring-boot-starter</artifactId>
    <version>3.0.0-M1</version>
  </dependency>
  <dependency>
    <groupId>com.oracle.coherence.ce</groupId>
    <artifactId>coherence</artifactId>
    <version>21.06-M2</version>
  </dependency>
</dependencies>
An explanation:
CoherenceAutoConfiguration will kick in, bootstrap Coherence, and you can immediately start injecting Coherence dependencies into your Spring-managed classes using a rich set of available annotations including:

- @CoherenceCache
- @CoherenceMap
- Filter Binding Annotations
- Extractor Binding Annotations
- and many more
2. Using It as a Cache
An explanation is below. Nothing special is needed here; using Spring's own annotations is enough.
If you need to use Coherence for caching using Spring’s Cache abstraction, just add the @EnableCaching annotation and CoherenceAutoConfiguration will add a CoherenceCacheManager to the Spring ApplicationContext.

Now you can take advantage of Spring’s Cache abstraction that is backed by Coherence and use the relevant Spring annotations such as @Cacheable, @CacheEvict, @CachePut.
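A minimal sketch of what this looks like in a service class, assuming @EnableCaching is on a configuration class; the ProductService name, the "products" cache name, and the in-memory backing store are made up for illustration:
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.springframework.cache.annotation.CacheEvict;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
public class ProductService {

  // Stand-in for a real repository, just to keep the sketch self-contained.
  private final Map<String, String> backingStore = new ConcurrentHashMap<>();

  // First call goes to the backing store; later calls are served from the
  // Coherence-backed "products" cache until the entry is evicted.
  @Cacheable(cacheNames = "products")
  public String findProduct(String productId) {
    return backingStore.get(productId);
  }

  // Updates the store and drops the stale cache entry.
  @CacheEvict(cacheNames = "products", key = "#productId")
  public void updateProduct(String productId, String name) {
    backingStore.put(productId, name);
  }
}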
3. Annotations Specific to Oracle Coherence

The @CoherenceEventListener Annotation
Example
We do it as follows. Here, a CohQL expression is used via @WhereFilter.
@CoherenceEventListener
@MapName("people")
@WhereFilter("age >= 18")
public void onAdult(MapEvent<String, Person> people) {
  // TODO: process the event...
}

The @CoherenceMap Annotation
Example
We do it as follows:
@CoherenceMap
private NamedMap<String, Person> people;
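The injected NamedMap can then be used like an ordinary map whose entries live in the Coherence cluster; a minimal sketch, assuming a Person class exists:
public void addPerson(String id, Person person) {
  people.put(id, person);   // write goes to the distributed cache
}

public Person findPerson(String id) {
  return people.get(id);    // read comes from the distributed cache
}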


Thursday, September 16, 2021

GitLab CI/CD

Introduction
An explanation:
The Gitlab software was created by Valery Sizov and Dmytro Zaporozhets in 2013. The software was written in Ruby, and Go was used to rewrite some parts later on. The current tool involves Ruby on Rails, Go, and Vue.js programming languages. GitLab Inc. was launched in 2014 with San Francisco as its headquarters. Initially, it was free and open-source. In 2013, it was split into two versions; Enterprise and Community Editions. In 2017, the company announced that Gitlab would be offered as an open-source tool under an MIT license. Today, the company operates in 67 countries with 1280 employees.
GitLab Flow
An explanation:
GitLab Flow has one main principle: upstream first. There is only one main branch, master, which is the upstream of all other branches, so the merge order matters: the development environment is (master), the pre-release environment is (pre-production), and the production environment is (production).

If an error occurs in production, you first pull a hotfix branch, merge it into master after the fix, merge it into pre-production after acceptance, and then, after another acceptance round, merge it into production. These steps cannot be skipped unless it is an emergency.
GitLab CI/CD
GitLab CI/CD can be thought of as redeploying the project to our own server whenever a commit is pushed to GitLab. An explanation:
Here is how Gitlab works:
1. Create a structured order of CI/CD jobs in the GitLab-ci.yml configuration file and store this file in the project root directory. The CI/CD pipeline contains four important stages: build, test, staging, and production.
2. Install a runner for your project and register it.
3. When a config file is pushed to the repository, the runner executes the CI/CD based on the predefined conditions.
4. When the script passes all these stages, it is automatically deployed to production.
Concepts
Pipeline
An explanation:
In GitLab CI/CD, a pipeline is a collection of stages and jobs that specify the actions to be taken for a certain branch or tag in your repository. You may automate the testing and deployment processes by having each pipeline start when a new commit or merge request occurs.
Stages
An explanation is below. Each stage contains jobs, and each job is executed by a runner.
Stages reflect the pipeline steps, including build, test, and deploy. You specify one or more jobs—individual units of work — within each level. Jobs can operate on separate runners (such as virtual machines, containers, or Kubernetes pods), either concurrently or sequentially.
Runners
An explanation:
GitLab CI/CD employs runners to conduct the tasks listed in your pipeline. GitLab offers both shared runners and customized runners that you may build up on your own infrastructure. Your builds and tests will always run because GitLab Runners watch for new jobs and run them in secure, isolated environments.
The image Field
Specifies the Docker image the job runs in, for example:
maven:3-jdk-11

Example - multi-module
1. A parent .gitlab-ci.yml is defined. Here, extends pulls in the child module's parameters and the job is run. .build-module runs in the build stage. build-common-module, which inherits from it, also inherits from .common-module, and the common module is compiled.
.build-module:
  stage: build
  script:
    - echo "Building $MODULE"
    - mvn -pl $MODULE clean compile --also-make
  artifacts:
    expire_in: 10 min
    paths:
      - "*/target"

# BUILD JOBS
build-common-module:
  extends:
    - .common-module
    - .build-module
 
Example - parallel
An example is here.

Example
An AWS EKS example is here.

Example
Let .gitlab-ci.yml be as follows:
stages:
 - build
 - deploy

maven-build:
  image: maven:3-jdk-11
  stage: build
  script: "mvn package -B"
  artifacts:
    paths:
      - target/gitlab-ci-demo.jar

deploy-master:
  rules:
    - if: '$CI_COMMIT_BRANCH =~ /^master$/'
  before_script:
    - apt-get update -qq && apt-get install -y -qq sshpass
  stage: deploy
  script:
    - sshpass -V
    - export SSHPASS=$CI_USER_PASS
    - sshpass -e scp -o StrictHostKeyChecking=no target/gitlab-ci-demo.jar gitlab-ci@167.172.188.139:/home/gitlab-ci
    - sshpass -e ssh -tt -o StrictHostKeyChecking=no gitlab-ci@167.172.188.139 sudo mv /home/gitlab-ci/gitlab-ci-demo.jar /opt/java/webapps
    - sshpass -e ssh -tt -o StrictHostKeyChecking=no gitlab-ci@167.172.188.139 sudo systemctl restart gitlab-ci-demo.service


Wednesday, September 15, 2021

What Is a Data Warehouse - Where Filtered Data Is Stored

Introduction
It means data that has been filtered for a specific purpose.

Snowflake Schema
An explanation:
Snowflake Schema in Data Warehouse Model

- The snowflake schema is a variant of the star schema.

- Here, the centralized fact table is connected to multiple dimensions.
In the snowflake schema, dimensions are present in a normalized form in multiple related tables.

- The snowflake structure materialized when the dimensions of a star schema are detailed and highly structured, having several levels of relationship, and the child tables have multiple parent tables.

-  The snowflake effect affects only the dimension tables and does not affect the fact tables.


- For Example: (please refer the image)
- The Employee dimension table now contains the attributes: EmployeeID, EmployeeName, DepartmentID, Region, Territory.

- The DepartmentID attribute links with the Employee table with the Department dimension table. The Department dimension is used to provide detail about each department, such as the Name and Location of the department.

-  The Customer dimension table now contains the attributes: CustomerID, CustomerName, Address, CityID.

- The CityID attributes link the Customer dimension table with the City dimension table. The City dimension table has details about each city such as CityName, Zipcode, State, and Country.

- The main difference between star schema and snowflake schema is that the dimension table of the snowflake schema is maintained in the normalized form to reduce redundancy.

- The advantage here is that such tables (normalized) are easy to maintain and save storage space.

- However, it also means that more joins will be needed to execute the query. This will adversely impact system performance.

- Advantages:
- There are two main advantages of snowflake schema given below:

- It provides structured data which reduces the problem of data integrity.
- It uses small disk space because data are highly structured.

- Disadvantages:
- Snowflaking reduces space consumed by dimension tables but compared with the entire data warehouse the saving is usually insignificant.

- Avoid snowflaking or normalization of a dimension table, unless required and appropriate.

- Do not snowflake hierarchies of one dimension table into separate tables.

- Hierarchies should belong to the dimension table only and should never be snowflakes.
Multiple hierarchies that can belong to the same dimension have been designed at the lowest possible detail.
As shown in the figure:


Data Warehouse Vendors
An explanation:
- Traditional data warehouses are Teradata, Oracle Exadata, IBM DB2 Warehouse etc.

- Cloud DWH are Amazon Redshift, Google Big Query and Snowflake Cloud Data warehouse.
Apache Hudi
An explanation:
Traditional data warehouses often deploy Hadoop to store data and provide batch analysis. Kafka is used separately to distribute Hadoop data to other data processing frameworks, resulting in duplicated data. Hudi helps effectively solve this problem; we always use Spark pipelines to insert new updates into the Hudi tables, then incrementally read the update of Hudi tables. In other words, Hudi tables are used as the unified storage format to access data.
Business Intelligence
As shown in the figure:


Data Lake vs Data Warehouse
As shown in the figure: a data lake holds raw data, while a data warehouse holds processed data.

An explanation:
Data lakes and data warehouses are both widely used for storing big data, but they are not interchangeable terms. A data lake is a vast pool of raw data, the purpose for which is not yet defined. A data warehouse is a repository for structured, filtered data that has already been processed for a specific purpose.
According to the explanation here, the data warehouse is the first-generation solution and the data lake is the second generation:
The first generation: proprietary enterprise data warehouse and business intelligence platforms; solutions with large price tags that have left companies with equally large amounts of technical debt; Technical debt in thousands of unmaintainable ETL jobs, tables and reports that only a small group of specialized people understand, resulting in an under-realized positive impact on the business.

The second generation: big data ecosystem with a data lake as a silver bullet; complex big data ecosystem and long running batch jobs operated by a central team of hyper-specialized data engineers have created data lake monsters that at best has enabled pockets of R&D analytics; over promised and under realized.

Monday, September 13, 2021

Actor Model

Introduction
The actor model is actually an old idea; it existed long before microservices and cloud computing. An explanation:
What’s most interesting about the actor model is that, long before there was cloud computing, it was specifically designed for and has been proven to meet the fundamental requirements of multi-cloud applications — concurrency, failover, and scaling.

As shown in the figure:

Some of its properties are listed below; a minimal sketch follows the list.
- Actor instances are reactive and execute rules, logic, and data transformations only when reacting to a message.
- Actor instances are absolutely reentrant and stateless. They react to one message at a time and have no memory of previous messages processed. All data needed to react to a message must be in the message itself or in a persistent datastore.
- Actor instances pass messages to other actor instances when they need them to do something.
- Actor instances publish events when they need to tell interested parties about something.
- An actor instance bounded by one context can pass messages to, or publish events for, actor instances bounded by another context — enabling it to use microservices developed, deployed, and maintained by other teams.
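A toy sketch of these properties using only the JDK (a real system would use an actor framework); the GreeterActor name is made up:
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// A toy actor: it reacts to one message at a time, keeps no memory of previous
// messages, and interacts with the outside world only through its mailbox.
public class GreeterActor implements Runnable {

  private final BlockingQueue<String> mailbox = new LinkedBlockingQueue<>();

  public void tell(String message) {       // other actors pass messages here
    mailbox.offer(message);
  }

  @Override
  public void run() {
    try {
      while (!Thread.currentThread().isInterrupted()) {
        String message = mailbox.take();   // react to one message at a time
        System.out.println("Hello, " + message);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    GreeterActor actor = new GreeterActor();
    Thread thread = new Thread(actor);
    thread.start();
    actor.tell("world");                   // message passing instead of shared mutable state
    Thread.sleep(100);                     // give the actor time to react, then stop it
    thread.interrupt();
  }
}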




Thursday, September 9, 2021

Global System for Mobile Communications Railway - GSM-R

Introduction
First, a few abbreviations need to be known.

ERTMS vs ETSI
An explanation is below. In short, ERTMS is the new system that replaces the existing legacy systems.
ERTMS, the European Rail Traffic Management System, is the new rail management system which combines the European Train Control System (ETCS) with GSM-R. As a unique European train control system, ERTMS is designed to gradually replace the existing incompatible systems throughout Europe.

GSM-R
GSM-R is the GSM-based communication infrastructure used on railways. An explanation:
GSM-R is the data communication bearer for the European Train Control System (ETCS), in particular for ETCS Level 2 and Level 3. GSM-R is fully defined in ETSI standards.
It is used on European railways. An explanation:
From the year 2000 onwards, GSM-R has been introduced all over Europe as a common standard for railway operations essential to interoperability, as well as in many other parts of the world. Expansion of the GSM-R implementation is still ongoing.
A note on its history is below; the standard was adopted in 1994.
In 1994, ETSI GSM standard was selected by UIC as the bearer for first Digital Railway Radio Communication System. Needs of railways were captured in dedicated specifications named EIRENE, including both functional and system aspects. These specifications were reinforced as GSM-R within ETSI/3GPP international standards.

The first operational implementation of GSM-R targeting the setup of this new technology was launched in 1999, and the first countrywide GSM-R operation started in 2004. In parallel, the EU Directives officially adopted the GSM-R as the basis for mobile communication between train and track for voice (train radio) and control-command and signaling data (ETCS), with the aim to form a worldwide standard, the European Rail Traffic Management System, the now well-known ERTMS.
GSM-R is part of the European Rail Traffic Management System (ERTMS) standard, as shown in the figure.

In short, with GSM-R a user can be reached without knowing their MSISDN, i.e. phone number; for example, the "lead driver" of train number X can be called directly.

A piece of software translates the functional number into an MSISDN.

The EIRENE and MORANE Projects

As shown in the figure:

Value Added Services
Features beyond those defined in the EIRENE project can also be requested and defined. An explanation:
An Intelligent Network (IN) can be added to provide functions defined in EIRENE and enhance non-standardised functions, such as:

- FFN (Follow Me Functional Node)
- Location-dependent addressing
- Additional administration and authorisation functions such as the Access Matrix
- Functional addressing for SMS
- Group communication support
- Functional numbers with variable length
- Support of dedicated location processing systems

Why GSM-Based?
The reason for using GSM is the goal of reusing off-the-shelf commercial products with as few changes as possible.

Isn't GSM Old?
Yes, it is. The explanation is below: even if it still meets the need, GSM technology is being phased out, so something else will have to replace GSM-R.
A successor to GSM-R is required primarily due to the expected obsolescence of GSM-R. GSM-R builds on existing GSM mobile standards, using the frequency bands 876-880 MHz (uplink) and 921-925 MHz (downlink) that are harmonised within CEPT for the operational communication of railway companies (GSM-R) in accordance with current mobile technology. GSM is a second generation mobile technology, but the industry is already moving to LTE (arguably a fourth generation [4G] technology) and is expected to evolve to fifth generation (5G) technology after 2020. The ability of the rail industry to continue to support GSM-R beyond roughly 2030 is doubtful. Given the long procurement cycles in the rail sector, planning for a successor needs to begin now.
Another explanation:
GSM-R is facing a number of challenges:

- The system life-cycle is coming to an end, with vendor support uncertain beyond 2030
- Extra capacity is required in some areas to support railway operations
- The rollout of European Rail Traffic Management System (ERTMS) has increased the strain on the GSM-R network
FRMCS - Future Railway Mobile Communication System
FRMCS is planned as the replacement for GSM-R; the work started in 2012. An explanation:
Nevertheless, on one side the needs of the railways are constantly evolving, and on the other side the telecom standards evolution remains dependent of the telecom industry evolution cycles, with an end of support for GSM-R planned by 2030 onwards.

These considerations led UIC, as soon as 2012, to launch the first studies for a successor to GSM-R, pertinently named Future Rail Mobile Communications System (FRMCS),
UUS1
An explanation is below. I believe it is needed for the speech service.
UUS1 is a supplementary service that transmits the User-to-User Information. In the GSM-R system, UUS1 can be used in most of applications, including AC acknowledgement, functional number, OTDI, called subaddress, and call matrix.

Customized Applications for Mobile networks Enhanced Logic (CAMEL) Architecture
CAMEL is a complicated standard.
The figure from the book Next Generation Intelligent Networks is shown below,

and so is this one:

Service Control Layer - SCL
The layer that provides services to the applications above it. An explanation:
In a fixed/mobile converged infrastructure, the call control layer is separated from a service control layer. Call control is well standardized in 3GPP, but there is a considerable amount of freedom in the architecture of the service delivery platforms, which is also reflected in the amount of standards and industry activities to harmonize these approaches.
Mobile Station International Subscriber Directory Number - MSISDN
An explanation:
It is a unique number assigned to the subscriber by the operator to identify and authenticate the subscriber in GSM mobile networks.
IEC - International Escape Code, i.e. 00
CC - Country Code, e.g. 90 for Turkey
NDC - National Destination Code, e.g. 530 for Turkcell
SN - Subscriber Number, the phone number itself

International Mobile Subscriber Identity - IMSI
The IMSI is the SIM card's number.
 
International Mobile Equipment Identity - IMEI
The IMEI is the number assigned to the device, i.e. the handset.

Unstructured Supplementary Service Data - USSD
An explanation is below. Like SMS, it is text based; codes start with the "*" character and end with "#".
Unstructured Supplementary Service Data (USSD), sometimes referred to as "quick codes" or "feature codes", is a communications protocol used by GSM cellular telephones to communicate with the mobile network operator's computers. USSD can be used for WAP browsing, prepaid callback service, mobile-money services, location-based content services, menu-based information services, and as part of configuring the phone on the network.

USSD messages are up to 182 alphanumeric characters long. Unlike short message service (SMS) messages, USSD messages create a real-time connection during a USSD session. The connection remains open, allowing a two-way exchange of a sequence of data. This makes USSD more responsive than services that use SMS.
It is used for tasks such as registering or deregistering an MSISDN for a specific role.

USSD vs SMS
Both use the SS7 protocol, but USSD is session-based and therefore two-way, while SMS is one-way.

Signalling System No. 7 - SS7
Moved to the Signalling System No. 7 post.


Features
Some of the features are shown in the figure:




Frequency
The worldwide situation is shown in the figure:






Rate Limiting Algorithms

Introduction
The algorithms are:
1. Fixed Window Counter
2. Sliding Window Counter
3. Token Bucket
4. Leaky Bucket
5. Sliding Logs


1. Counter Algorithm aka Fixed Window Counter - Does Not Support Bursts
The downside of this algorithm is explained below:
For example, if the limit is 100 requests per minute, a user could send 100 requests in the last second of the current minute and 100 more in the first second of the next minute—effectively sending 200 requests within two seconds.
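A minimal sketch of a fixed window counter with a one-minute window; the class name and the in-memory map are assumptions for illustration. It also makes the boundary-burst problem above easy to see, since the counter simply starts from zero in every new window:
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Fixed window counter: counts requests per user within the current minute.
public class FixedWindowRateLimiter {

  private final int limitPerWindow;
  private final ConcurrentMap<String, AtomicInteger> counters = new ConcurrentHashMap<>();

  public FixedWindowRateLimiter(int limitPerWindow) {
    this.limitPerWindow = limitPerWindow;
  }

  public boolean allow(String userId) {
    long windowId = System.currentTimeMillis() / 60_000;  // current one-minute window
    String key = userId + ":" + windowId;                 // one counter per user per window
    AtomicInteger counter = counters.computeIfAbsent(key, k -> new AtomicInteger());
    // Note: counters for old windows are never cleaned up in this sketch.
    return counter.incrementAndGet() <= limitPerWindow;   // reject once the window is full
  }
}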
2. Sliding window
An explanation is below. Say we allow 100 requests per minute. We go back one minute from the newest request and count how many requests are in the sliding window log. If that count is 100 or more, the new request is rejected. If it is lower, i.e. the new request will be accepted, the entries older than one minute are also removed from the log.
Sliding window log algorithm keeps a log of request timestamps for each user (Generally redis sorted sets are used for this). When a request comes, we first remove all outdated timestamps before inserting the new request time to the log. Then we decide whether this request should be processed depending on whether the log size has exceeded the limit.
Example
It can be done as follows. There is nothing special about using a HashMap as the window here; a List would have worked too.
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.stream.Collectors;

public class SlidingWindowDemo {

  public static void main(String[] args) {
    int[] data = {2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 21}; // input data
    int windowSize = 4; // size of the sliding window
    int slideInterval = 4; // slide interval

    Map<Integer, Integer> window = new HashMap<>(); // the sliding window, keyed by element index

    for (int i = 0; i < data.length; i++) {
      window.put(i, data[i]); // add the current element to the window

      if (window.size() == windowSize) { // check if the window is full

        String joined = window.values().stream()
                          .map(Object::toString)
                          .collect(Collectors.joining(", "));
        System.out.println("Window : " + joined);

        Iterator<Map.Entry<Integer, Integer>> iterator = window.entrySet().iterator();
        for (int remove = 0; remove < slideInterval && iterator.hasNext(); remove++) {
          iterator.next();
          iterator.remove();
        }
        i = (i - windowSize) + slideInterval + 1; // slide the window (the extra +1 skips one element per slide)
      }
    }
  }
}
The output is:
Window : 2, 4, 6, 8
Window : 12, 14, 16, 18
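The example above slides over a static array. A sliding window log rate limiter as described earlier could look like the following sketch, using an in-memory deque per user instead of a Redis sorted set; the class name is made up:
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sliding window log: keep a log of request timestamps per user, drop entries
// older than the window, and accept the request only if the log is still under the limit.
public class SlidingWindowLogRateLimiter {

  private final int limit;          // max requests per window
  private final long windowMillis;  // window length, e.g. 60_000 for one minute
  private final Map<String, Deque<Long>> logs = new ConcurrentHashMap<>();

  public SlidingWindowLogRateLimiter(int limit, long windowMillis) {
    this.limit = limit;
    this.windowMillis = windowMillis;
  }

  public synchronized boolean allow(String userId) {
    long now = System.currentTimeMillis();
    Deque<Long> log = logs.computeIfAbsent(userId, k -> new ArrayDeque<>());

    // Remove timestamps that have fallen out of the window.
    while (!log.isEmpty() && log.peekFirst() <= now - windowMillis) {
      log.pollFirst();
    }

    if (log.size() >= limit) {
      return false;   // log is full, reject
    }
    log.addLast(now); // record the accepted request
    return true;
  }
}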
3. Token Bucket - Supports Bursts
Moved to the Token Bucket post.

4. Leaky Bucket aka Spike Control Policy - Runs at a Constant Rate
Moved to the Leaky Bucket post.





Tuesday, September 7, 2021

Where to Store a JSON Web Token (JWT)

Introduction
An explanation:
We have three options available for storing the data on the client side and each of those has its own advantages and disadvantages. And the options are:
1. Cookie
2. Local Storage
3. Session Storage
Storing It in a Cookie
There is a precondition for this, explained below: the JWT must be smaller than 4 KB.
The purpose of JWTs is to be stateless, right? Cookies are capped out at 4k, which means the JWT needs to be < 4k for this to work.
Store the token in a cookie together with flags such as SameSite=strict and HttpOnly.
- SameSite=strict protects against CSRF attacks.
- HttpOnly protects against XSS attacks; it prevents JavaScript injected into the browser from reading the token and sending it elsewhere.
An explanation:
... using cookies alone is not the solution but extra steps to prevent XSS attack must be taken by enabling “HTTP-only” parameter in cookies which basically do not allow any third party JavaScript code to read your cookies and enabling the secure flag which transports your cookies only through HTTPS.
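A minimal sketch of setting such a cookie with Spring's ResponseCookie; the cookie name and max age are arbitrary choices for illustration:
import java.time.Duration;

import org.springframework.http.HttpHeaders;
import org.springframework.http.ResponseCookie;
import org.springframework.http.ResponseEntity;

public class JwtCookieExample {

  // Wraps the JWT in an HttpOnly, Secure, SameSite=Strict cookie.
  public ResponseEntity<Void> loginResponse(String jwt) {
    ResponseCookie cookie = ResponseCookie.from("ACCESS_TOKEN", jwt) // cookie name is arbitrary
        .httpOnly(true)        // not readable from JavaScript (limits XSS token theft)
        .secure(true)          // only sent over HTTPS
        .sameSite("Strict")    // not sent on cross-site requests (limits CSRF)
        .path("/")
        .maxAge(Duration.ofMinutes(15))
        .build();

    return ResponseEntity.noContent()
        .header(HttpHeaders.SET_COOKIE, cookie.toString())
        .build();
  }
}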
Storing It in Local Storage
This is not recommended. An explanation:
Local storage wasn’t designed to be used as a secure storage mechanism in a browser. It was designed to be a simple string only key/value store that developers could use to build slightly more complex single-page apps.
— Randall Degges

When you store sensitive information in local storage, you’re essentially using the most dangerous thing in the world(javascript) to store your most sensitive information in the worst vault ever created.
— Randall Degges
Storing It in Session Storage
An explanation:
The downside is that you need to manage a cache on the API side, but this is easily doable.

If you’re using JWTs anyway, you STILL NEED to have centralized sessions that handles revocation, right?.

gRPC Error Handling

Introduction
An explanation:
By default, gRPC relies heavily on status code for error handling. 
An exception thrown by the server is converted by gRPC into a StatusRuntimeException.

Example
We do it as follows. Here, ServiceException is our own class.
import io.grpc.StatusRuntimeException;

//Client call
public Product getProduct(String productId) {
  Product product = null;
  try {
    var request = GetProductRequest.newBuilder().setProductId(productId).build();
    var productApiServiceBlockingStub = ProductServiceGrpc.newBlockingStub(managedChannel);
    var response = productApiServiceBlockingStub.getProduct(request);
    // Map to domain object
    product = ProductMapper.MAPPER.map(response);
  } catch (StatusRuntimeException error) {
    log.error("Error while calling product service, cause {}", error.getMessage());
    throw new ServiceException(error.getCause());
  }
  return product;
}
However, there is a problem: the error message is lost. The output we get is:
io.grpc.StatusRuntimeException: UNKNOWN
An explanation:
gRPC wraps our custom exception in StatusRuntimeException and swallows the error message and assigns a default status code UNKNOWN.
To fix this, we do the following on the server side; this time we call the onError() method.
//Server Product Service API
public void getProduct(
    GetProductRequest request, StreamObserver<GetProductResponse> responseObserver) {
  try {
    ...
    responseObserver.onNext(response);
    responseObserver.onCompleted();
  } catch (ResourceNotFoundException error) {
    var status = Status.NOT_FOUND.withDescription(error.getMessage()).withCause(error);
    responseObserver.onError(status.asException());
  }
}
On the client side, the output we get is:
Error while calling product service, cause NOT_FOUND: Product ID not found
The error message is now correct, but getCause() on the exception still returns null. The reason is explained below: the io.grpc.Status.withCause() call does not transmit the cause from server to client.
Create a derived instance of Status with the given cause. However, the cause is not transmitted from server to client.
Instead, we do the following; this time io.grpc.Metadata is used.
public void getProduct(
    GetProductRequest request, StreamObserver<GetProductResponse> responseObserver) {
  try {
    ...
    responseObserver.onNext(response);
    responseObserver.onCompleted();
  } catch (ResourceNotFoundException error) {
    Map<String, String> errorMetaData = error.getErrorMetaData();
    var metadata = new Metadata();    
    errorMetaData.entrySet().stream() 
        .forEach(
            entry ->
                metadata.put(
                    Metadata.Key.of(entry.getKey(), Metadata.ASCII_STRING_MARSHALLER),
                    entry.getValue()));
    var statusRuntimeException =
        Status.NOT_FOUND.withDescription(error.getMessage()).asRuntimeException(metadata); 
    responseObserver.onError(statusRuntimeException);
  }
}
On the client side, we do the following:
} catch (StatusRuntimeException error) {

  Metadata trailers = error.getTrailers();
  Set<String> keys = trailers.keys();

  for (String key : keys) {
    Metadata.Key<String> k = Metadata.Key.of(key, Metadata.ASCII_STRING_MARSHALLER);
    log.info("Received key {}, with value {}", k, trailers.get(k));
  }
}
On the client side, the output we get is:
Received key Key{name='resource_id'}, with value 32c29935-da42-4801-825a-ac410584c281 
Received key Key{name='content-type'}, with value application/grpc 
Received key Key{name='message'}, with value Product ID not found
- Instead of dealing with all of this, the Google richer error model can be used.
- Also, instead of catching many exceptions on the server side and returning an io.grpc.Status each time, common handling code can be written by implementing the io.grpc.ServerInterceptor interface.
- If we are using Spring, the same thing can be done as follows:
@GrpcAdvice
public class ExceptionHandler {

  @GrpcExceptionHandler(ResourceNotFoundException.class)
  public StatusRuntimeException handleResourceNotFoundException(ResourceNotFoundException cause) {
    var errorMetaData = cause.getErrorMetaData();
    var errorInfo =
        ErrorInfo.newBuilder()
            .setReason("Resource not found")
            .setDomain("Product")
            .putAllMetadata(errorMetaData)
            .build();
    var status =
        com.google.rpc.Status.newBuilder()
            .setCode(Code.NOT_FOUND.getNumber())
            .setMessage("Resource not found")
            .addDetails(Any.pack(errorInfo))
            .build();
    return StatusProto.toStatusRuntimeException(status);
  }
}