Tuesday, 25 April 2023

OpenTelemetry Collector - Sidecar

Example
sidecar.yaml is as follows
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: sidecar
spec:
  mode: sidecar
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:
    exporters:
      logging:
      otlp:
        endpoint: "<path_to_central_collector>:4317"
    service:
      telemetry:
        logs:
          level: "debug"
      pipelines:
        traces:
          receivers: [otlp]
          processors: []
          exporters: [logging, otlp]
instrumentation.yaml is as follows
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: java-instrumentation
spec:
  propagators:
    - tracecontext
    - baggage
    - b3
  sampler:
    type: always_on
  java:
deployment.yaml is as follows
apiVersion: apps/v1
kind: Deployment
metadata:
  name: petclinic
  labels:
    app: petclinic
spec:
  replicas: 1
  selector:
    matchLabels:
      app: petclinic
  template:
    metadata:
      annotations:
        instrumentation.opentelemetry.io/inject-java: 'true'
        sidecar.opentelemetry.io/inject: 'sidecar'
      labels:
        app: petclinic
    spec:
      containers:
      - name: petclinic
        image: <path_to_petclinic_image>
        ports:
        - containerPort: 8080
The explanation is as follows
To enable the instrumentation, we need to update the deployment file and add annotations to it. This way we tell the OpenTelemetry Operator to inject the sidecar and the java-instrumentation to our application.

petclinic-svc.yaml is as follows
apiVersion: v1
kind: Service
metadata:
  name: petclinic-service
spec:
  selector:
    app: petclinic
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080

OpenTelemetry Backend

Introduction
The explanation is as follows
Even though OpenTelemetry does not provide its own backend, using it means we are not tied to any tool or vendor, since it is vendor agnostic. Not only can we use any programming language we want, but we can also pick and choose the storage backend and easily switch to another backend or vendor just by configuring another exporter.
The backend can be any of many options, such as Honeycomb, Lightstep, New Relic, or Tempo (Grafana Cloud).

Jaeger and Zipkin
The explanation is as follows
Jaeger and Zipkin predate OpenTelemetry, so each has its own trace transport format. They do provide integration with the OpenTelemetry format, though.
Jaeger
The explanation is as follows
Jaeger, inspired by Dapper and OpenZipkin, is a distributed tracing platform created by Uber Technologies. It can be used for monitoring microservices-based distributed systems.
Example
To run Jaeger, we do the following
docker run -d --name jaeger \
  -e COLLECTOR_ZIPKIN_HOST_PORT=:9411 \
  -p 5775:5775/udp \
  -p 6831:6831/udp \
  -p 6832:6832/udp \
  -p 5778:5778 \
  -p 16686:16686 \
  -p 14250:14250 \
  -p 14268:14268 \
  -p 14269:14269 \
  -p 9411:9411 \
  jaegertracing/all-in-one:1.32
Example - Docker Compose and Jaeger
We do it as follows
version: "3"

services:
  jaeger:
    image: jaegertracing/all-in-one:1.37           #1
    environment:
      - COLLECTOR_OTLP_ENABLED=true                #2
    ports:
      - "16686:16686"                              #3
The explanation is as follows
1. Use the all-in-one image
2. Very important: enable the collector in OpenTelemetry format
3. Expose the UI port


OpenTelemetry Collector

Deployment
There are two methods
1. Sidecar
2. A Central (Gateway) OpenTelemetry Collector

1. Sidecar
The explanation is as follows
In this scenario, the OpenTelemetry instrumented application sends the data to a (collector) agent that resides together with the application. This agent will then offload responsibility and handle all the trace data from the instrumented application.

The collector can be deployed as an agent via a sidecar which can be configured to send data directly to the storage backend.
[Figure: collector deployed as a sidecar agent next to the application]
2. A Central (Gateway) OpenTelemetry Collector
The explanation is as follows
You can also decide to send the data to another OpenTelemetry collector and from the (central) collector send the data further to the storage backend. In this configuration, we have a central OpenTelemetry collector that is deployed using the deployment mode, which comes with many advantages like auto scaling.
Collector Components
[Figure: Collector components]

The OpenTelemetry Collector pipeline has three components:
1. Receivers
2. Processors
3. Exporters

OpenTelemetry Protocol
The explanation is as follows
The OpenTelemetry Protocol (OTLP) specification describes the encoding, transport, and delivery mechanism of telemetry data between telemetry sources, intermediate nodes such as collectors, and telemetry backends.

Each language SDK provides an OTLP exporter you can configure to export data over OTLP. The OpenTelemetry SDK then transforms events into OTLP data.
1. Receivers
The explanation is as follows
A receiver, which can be push or pull based, is how data gets into the collector. The OpenTelemetry collector can receive telemetry data in multiple formats.
Example - gRPC
We do it as follows
otlp:
  protocols:
    grpc:
      endpoint: "0.0.0.0:4317"
Example - gRPC + HTTP
We do it as follows
otlp:
  protocols:
    grpc:
    http:
2. Processors
The explanation is as follows
Processors are run on data between being received and being exported. Processors are optional, though some are recommended.
batch Processor
The explanation is as follows
The batch processor accepts spans, metrics, or logs and places them into batches. Batching helps better compress the data and reduce the number of outgoing connections required to transmit the data. This processor supports both size and time based batching.
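The size- and time-based batching described above can be illustrated with a small Python sketch. This is not the collector's implementation; the class name `Batcher`, its parameters, and the injectable `clock` are all hypothetical and exist only to show the two flush triggers.

```python
import time
from typing import Any, Callable, List, Optional


class Batcher:
    """Toy size- and time-based batcher (illustration only)."""

    def __init__(
        self,
        export: Callable[[List[Any]], None],
        max_batch_size: int = 3,        # hypothetical default
        timeout_seconds: float = 5.0,   # hypothetical default
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._export = export
        self._max_batch_size = max_batch_size
        self._timeout_seconds = timeout_seconds
        self._clock = clock
        self._items: List[Any] = []
        self._first_item_at: Optional[float] = None

    def add(self, item: Any) -> None:
        """Buffer one item and flush as soon as the batch is full."""
        if not self._items:
            self._first_item_at = self._clock()
        self._items.append(item)
        if len(self._items) >= self._max_batch_size:
            self.flush()

    def tick(self) -> None:
        """Flush a partial batch once its oldest item has waited long enough."""
        if self._items and self._clock() - self._first_item_at >= self._timeout_seconds:
            self.flush()

    def flush(self) -> None:
        """Hand the buffered items to the exporter as one batch."""
        if self._items:
            batch, self._items = self._items, []
            self._first_item_at = None
            self._export(batch)
```

A caller would invoke `add` for every span and `tick` periodically, so that one outgoing call carries many items instead of one call per item.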
Example
The explanation is as follows
Configuring a processor does not enable it. Processors are enabled via pipelines within the service section.
We do it as follows
service:
  pipelines:
    traces:
      receivers: [opencensus, jaeger]
      processors: [batch]
      exporters: [opencensus, zipkin]
  ...
processors:
  batch:
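To make the "configured but not enabled" distinction concrete, here is a minimal Python sketch that checks a collector configuration for processors no pipeline references. It assumes the YAML has already been parsed into a plain dictionary; the function name `unused_processors` and the sample configuration are hypothetical.

```python
from typing import Any, Dict, Set


def unused_processors(config: Dict[str, Any]) -> Set[str]:
    """Return processors that are configured but not referenced by any pipeline."""
    configured = set((config.get("processors") or {}).keys())
    pipelines = (config.get("service") or {}).get("pipelines") or {}
    enabled: Set[str] = set()
    for pipeline in pipelines.values():
        enabled.update((pipeline or {}).get("processors") or [])
    return configured - enabled


# Hypothetical configuration, shaped like a parsed collector YAML file
config = {
    "processors": {"batch": None, "memory_limiter": None},
    "service": {
        "pipelines": {
            "traces": {
                "receivers": ["otlp"],
                "processors": ["batch"],
                "exporters": ["logging"],
            }
        }
    },
}

print(unused_processors(config))  # {'memory_limiter'}
```

Here `memory_limiter` is defined in the `processors` section but never listed in a pipeline, so it would not process any data.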
3. Exporters
The explanation is as follows
In order to visualise and analyse the telemetry you will need to use an exporter. An exporter is a component of OpenTelemetry and is how data gets sent to different systems/back-ends.
The explanation is as follows. In other words, an exporter converts data in OpenTelemetry protocol (OTLP) format into another format.
Generally, an exporter translates the internal format into another defined format, so you can send different types of data to different backends. For example, you can send metrics to one backend (e.g. Prometheus) and traces to another.
The explanation is as follows
- The Jaeger exporter is used to send data to Jaeger.
- The Logging exporter is very useful when troubleshooting, as it exports data to the console.

Console Exporter
The explanation is as follows
A common exporter to start with and that is very useful for development and debugging tasks is the console exporter.
Example
The explanation is as follows
In the exporters section, you can add more destinations. For example, if you would also like to send trace data to Grafana Tempo, just add these lines to the central_collector.yaml file.
We do it as follows
pipelines:
  traces:
    receivers: [otlp]
    processors: []
    exporters: [logging, otlp]

exporters:
  logging:
  otlp:
    endpoint: "<tempo_endpoint>"
    headers:
      authorization: Basic <api_token>
4. Extensions
The explanation is as follows
Extensions are available primarily for tasks that do not involve processing telemetry data. Examples of extensions include health monitoring, service discovery, and data forwarding. Extensions are optional.
Example
We do it as follows
extensions:
  health_check:
  pprof:
  zpages:
  memory_ballast:
    size_mib: 512



Monday, 24 April 2023

RAID Disk - Redundant Array of Independent Disks

Introduction
RAID can serve many purposes. Depending on the configuration, it can be used for redundancy. The explanation is as follows.
In general, a RAID, depending on the chosen RAID level, provides a different balance among the key goals: data redundancy, availability, performance, and capacity.
The explanation is as follows.
RAID is not a backup mechanism; it's a redundancy mechanism.
...
The main advantage of a redundant system is that it will not go down completely when a complete disk failure happens – the mirror allows you to continue using the NAS without interruption while the array is rebuilding.
RAID levels include RAID 5 and RAID 6.

Hardware RAID and Software RAID
Vendors such as LSI, DELL, and HP provide hardware RAID. The explanation is as follows
Q: Is RAID possible with different drive types?
A: Hardware RAID controllers from LSI, DELL, HP, etc. do not allow mixing disks with different interfaces (e.g. SATA and SAS) in a single array. What you can do is create two different arrays, each for a specific interface protocol: a SATA array and a SAS one, for example.

Software RAID does not share this limitation: basically any block device (even a loopback device) can be part of any array. However, mixing different disk technologies is generally discouraged to avoid an unbalanced array (performance-wise). For cache drives, such as ZFS L2ARC or LVM dm-cache, things are different: here you actually want a faster drive. So, for example, using an NVMe cache in front of a SATA array is perfectly fine.
RAID 6
If the red/blue light is on, a disk has fallen out of sync and may be in the middle of a "Rebuild" operation.


Saturday, 22 April 2023

Software Architecture - Strangler (Vine) Pattern

Introduction
Note: OpenAPI, formerly known as Swagger, can be used for the Strangler pattern.

The explanation is as follows
The name refers to strangler vines that grow around trees, gradually building up a solid structure that eventually is able to completely replace the tree that they started growing around. The strangler pattern for microservices means to gradually and strategically build a "mesh" of microservices around an existing monolith, replacing certain functions as needed, and over time potentially replacing the monolithic application entirely.
The Strangler pattern can also keep tests from breaking while the API is being changed. The explanation is as follows
build an anti-corruption layer, or a facade, or a proxy between your tests and the SUT, so you can change the API of the SUT without having to change too many parts of your tests. That will allow you to keep the tests as they are for now. Later, when you have some time for cleaning up, you may decide to migrate the tests to the new API one-by-one.

This approach is also known as the strangler pattern and can often be used to gradually replace legacy components with components of a new design, not only for tests.
[Figure: in the first week, the Strangler pattern still routes requests to the old system; in later weeks, it routes them to the microservices]
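The gradual rerouting can be sketched in Python as a facade that sends each request to the legacy system unless its path prefix has already been migrated. Everything here is hypothetical (the handler functions, the `StranglerFacade` class, the paths); it is a minimal illustration of the routing idea, not a production proxy.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


def legacy_monolith(path: str) -> str:
    """Stand-in for the existing monolith."""
    return f"legacy handled {path}"


def orders_service(path: str) -> str:
    """Stand-in for a newly extracted microservice."""
    return f"orders-service handled {path}"


class StranglerFacade:
    """Routes a request to a new service if its prefix has been migrated."""

    def __init__(self, legacy: Handler) -> None:
        self._legacy = legacy
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, prefix: str, handler: Handler) -> None:
        """Register a new service as the owner of a path prefix."""
        self._migrated[prefix] = handler

    def handle(self, path: str) -> str:
        # The longest matching prefix wins; anything else falls through to legacy.
        for prefix in sorted(self._migrated, key=len, reverse=True):
            if path.startswith(prefix):
                return self._migrated[prefix](path)
        return self._legacy(path)


facade = StranglerFacade(legacy_monolith)
week_one = facade.handle("/orders/42")        # nothing migrated yet
facade.migrate("/orders", orders_service)     # later: orders are moved out
later = facade.handle("/orders/42")
untouched = facade.handle("/customers/7")     # still served by the monolith
```

Callers keep talking to the facade the whole time, which is what lets functionality move one piece at a time.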






Thursday, 6 April 2023

Google Cloud True Time

Introduction
Keeping time synchronized across a distributed network is hard. Google achieves it with True Time, which is used in databases such as Google Spanner. The explanation is as follows
Google created a distributed SQL database called Spanner, which relies on something called True Time for very strong consistency of transactions across nodes. Google knows that time is uncertain, so True Time defines a small, bounded window of uncertainty within which transactions cannot be ordered definitively. True Time works as a global time across Google datacenters.

True Time is expressed as a time interval [earliest, latest]. It exposes an API called now() whose value lies in this interval. The uncertainty interval varies between 1 ms and 7 ms; note that the maximum uncertainty has a tight upper bound.

The APIs TT.before(t) or TT.earliest() and TT.after(t) or TT.latest() take a timestamp as input and answer whether the given timestamp is before or after the current uncertainty interval.

The relation between TT.earliest(), TT.latest() and absolute time of an event is:

TT.earliest ≤ Absolute Time of current event ≤ TT.latest
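The interval semantics can be sketched in Python as follows. This is a toy model only: the `clock` callable and the fixed `epsilon` are assumptions standing in for Google's GPS and atomic clock infrastructure, and the class name `TrueTimeSketch` is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TTInterval:
    earliest: float  # seconds
    latest: float    # seconds


class TrueTimeSketch:
    """Toy model of an uncertainty-interval clock (illustration only)."""

    def __init__(self, clock: Callable[[], float], epsilon: float) -> None:
        self._clock = clock      # assumed local clock reading, in seconds
        self._epsilon = epsilon  # assumed uncertainty bound, in seconds

    def now(self) -> TTInterval:
        """Return an interval assumed to contain the absolute time."""
        t = self._clock()
        return TTInterval(t - self._epsilon, t + self._epsilon)

    def after(self, t: float) -> bool:
        """True only if t has definitely passed."""
        return t < self.now().earliest

    def before(self, t: float) -> bool:
        """True only if t has definitely not arrived yet."""
        return t > self.now().latest


# Local clock frozen at 100 s with a 7 ms uncertainty bound
tt = TrueTimeSketch(clock=lambda: 100.0, epsilon=0.007)
```

A timestamp that falls inside the interval is neither definitely before nor definitely after the present, which is the window in which events cannot be ordered.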
How does Google achieve this?
The explanation is as follows
Google does this magic with a couple of tricks:

Optimized infrastructure: Google's infrastructure runs on a specially designed private network. They have optimized the network over time; it has a lot of redundant connections across datacenters and built-in failure handling mechanisms. This does not mean network partitions don't happen or things don't go wrong, but the likelihood of such incidents and the communication latency are greatly reduced.

Using own clocks: True Time does not rely on external NTP pools or servers. Rather, Google datacenters are equipped with GPS receivers and Atomic clocks. See the below picture of such an installation:
AWS Time Sync Service
The explanation is as follows
Inspired by Google True Time, AWS also manages its own fleet of atomic clocks and GPS clock receivers. Any EC2 server can connect to these time references via NTP using the Chrony daemon for more accurate time, rather than connecting to external NTP pools or time servers. Leap second smearing is also handled by the Amazon Time Sync Service.


Tuesday, 4 April 2023

Amazon Web Services (AWS) DynamoDB - Works as Both a Key-Value and a Document-Oriented Store

Introduction
It is a NoSQL database that supports multiple models: it can be used as a key-value system and as a document store.
1. As a key-value store, it competes with Cassandra, HBase, and Redis.
2. As a document DB, it competes with MongoDB.


DynamoDB was first released in 2012.

[Figure: DynamoDB features]

The explanation is as follows
DynamoDB is Amazon's answer to MongoDB, a NoSQL database that works on JSON documents. These databases rely heavily on nested data and do not enforce any strict schema unless the developer turns that option on. That means that DynamoDB is great for high-volume sites like a CMS or mobile apps with a lot of traffic. For example, both Major League Baseball and Duolingo make use of DynamoDB.
Pricing Model
1. Throughput model: usage is estimated in advance, and this provisioned amount cannot be exceeded.
2. On-demand pricing model: billing is based on actual usage.

Incorrect Usage
1. Normalizing Data
DynamoDB is not a SQL database, so data should be stored in denormalized form.

2. Single Table Design
The explanation of Single Table Design (STD) is as follows
In DynamoDB, you are charged for the capacity throughput and indexes of each table.

The more tables you have the more you will end up paying (especially if each table has several secondary indexes).

The STD instead encourages grouping all of your (related) data entities in one table.
Replication
The explanation is as follows
Departing from the traditional SQL-based offerings, DynamoDB offers a persistence model where the information is spread into partitions with a dual consistency approach.

A write operation first saves the updated data to a persistence node. It is then synchronously copied to another persistence node. Only at this point, the operation is confirmed to the caller.

There is an asynchronous process that copies it from the second persistence node to a third one.

This means you have the redundancy of the data being persisted into 3 nodes, each located in a separate AZ. At the same time, you do not need to wait for all 3 nodes to save before returning the operation, which helps to maintain the latency at a lower level.

When retrieving data you have two choices: eventually consistent and strongly consistent.

If you opt for eventually consistent, your operation will be directed to any of the 3 nodes. If it happens to be the asynchronously copied one, there is a chance the information you will retrieve will be outdated when compared to the main node.

In contrast, the strongly consistent mode will only be directed to the main node.
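The write and read paths described above can be modeled with a small Python simulation. This is a toy sketch of the described behavior, not DynamoDB's internals: the class `ReplicatedItemStore`, the explicit `run_async_replication` step, and the `node` parameter (used to pick a replica deterministically instead of at random) are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class ReplicatedItemStore:
    """Toy model: two synchronous copies, one asynchronous copy."""

    def __init__(self) -> None:
        self.main: Dict[str, str] = {}
        self.sync_replica: Dict[str, str] = {}
        self.async_replica: Dict[str, str] = {}
        self._pending: List[Tuple[str, str]] = []

    def write(self, key: str, value: str) -> None:
        """Acknowledge only after the main node and one replica hold the value."""
        self.main[key] = value
        self.sync_replica[key] = value
        self._pending.append((key, value))  # the third copy is made later

    def run_async_replication(self) -> None:
        """Simulate the background process that updates the third node."""
        for key, value in self._pending:
            self.async_replica[key] = value
        self._pending.clear()

    def read(self, key: str, consistent: bool = False, node: str = "async") -> Optional[str]:
        """Strongly consistent reads use the main node; others may hit any node."""
        if consistent:
            return self.main.get(key)
        replicas = {"main": self.main, "sync": self.sync_replica, "async": self.async_replica}
        return replicas[node].get(key)


store = ReplicatedItemStore()
store.write("song", "v1")
store.run_async_replication()
store.write("song", "v2")                   # third node has not caught up yet

strong = store.read("song", consistent=True)
stale = store.read("song", node="async")    # eventually consistent read
store.run_async_replication()
caught_up = store.read("song", node="async")
```

The eventually consistent read returns the older value only while the asynchronous copy is lagging; once replication runs, all three nodes agree.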

MongoDB vs DynamoDB 
- MongoDB supports a maximum document size of 16 MB; in DynamoDB the limit is 400 KB.
- MongoDB is written in C++; DynamoDB is written in Java.

Primary Key 
The explanation is as follows
... it only allows three data types for primary keys: string, number, and binary. (It does support many different data types for other attributes within a table.)
The explanation is as follows
DynamoDB has a weird take on the concept of a primary key. You will have two keys to identify specific data:
Primary Key = Partition Key + Sort Key
[Figure: primary key made up of a partition key and a sort key]

Example
The table can look like this
      PARTITION_KEY    SORT_KEY       OTHER_INFO
1.    ORDER#1234       PRODUCT#1      ProductName, Price, etc.
2.    ORDER#1234       INVOICE#1      InvoiceDate, PaymentInfo
3.    ORDER#1234       CUSTOMER#1     CustomerName, ShippingAddress
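How a partition key plus sort key groups related items can be sketched with an in-memory Python table. This is an illustration of the access pattern only, not the DynamoDB API: the `CompositeKeyTable` class, its methods, and the sample attribute values are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


class CompositeKeyTable:
    """Toy table addressed by (partition key, sort key)."""

    def __init__(self) -> None:
        # partition key -> rows of (sort key, item), kept ordered by sort key
        self._partitions: Dict[str, List[Tuple[str, Dict[str, Any]]]] = defaultdict(list)

    def put(self, partition_key: str, sort_key: str, item: Dict[str, Any]) -> None:
        """Insert an item, replacing any item with the same full key."""
        rows = self._partitions[partition_key]
        rows[:] = [row for row in rows if row[0] != sort_key]
        rows.append((sort_key, item))
        rows.sort(key=lambda row: row[0])

    def query(self, partition_key: str, sort_key_prefix: str = "") -> List[Dict[str, Any]]:
        """Return items in one partition whose sort key starts with the prefix."""
        rows = self._partitions.get(partition_key, [])
        return [item for sort_key, item in rows if sort_key.startswith(sort_key_prefix)]


table = CompositeKeyTable()
table.put("ORDER#1234", "PRODUCT#1", {"ProductName": "Widget"})     # sample data
table.put("ORDER#1234", "INVOICE#1", {"InvoiceDate": "2023-04-04"})
table.put("ORDER#1234", "CUSTOMER#1", {"CustomerName": "Alice"})

everything = table.query("ORDER#1234")            # all entities of the order
invoices = table.query("ORDER#1234", "INVOICE#")  # only the invoice rows
```

One lookup by partition key returns every entity that belongs to the order, and a sort key prefix narrows it to one entity type.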
Value Type
The explanation is as follows
DynamoDB stores the value in a JSON serialized format.

Column Types
The explanation is as follows
DynamoDB supports many different data types for attributes within a table. They can be categorized as follows:

1. Scalar Types – A scalar type can represent exactly one value. The scalar types are number, string, binary, Boolean, and null.

2. Document Types – A document type can represent a complex structure with nested attributes, such as you would find in a JSON document. The document types are list and map.

3. Set Types – A set type can represent multiple scalar values. The set types are string set, number set, and binary set.
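The three categories map onto the typed JSON form that DynamoDB uses for attribute values, with descriptors such as `S`, `N`, `BOOL`, `L`, `M`, and `SS`. The Python sketch below converts plain values into that form. The function name `to_attribute_value` is hypothetical, and sorting set members is only a choice made here for stable output.

```python
import base64
from typing import Any, Dict


def to_attribute_value(value: Any) -> Dict[str, Any]:
    """Convert a Python value to DynamoDB-style typed JSON (sketch)."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):              # check before int: bool is an int subclass
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}             # numbers are sent as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, bytes):
        return {"B": base64.b64encode(value).decode("ascii")}
    if isinstance(value, list):              # document type: list
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):              # document type: map
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    if isinstance(value, (set, frozenset)) and value:
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in value):
            return {"NS": [str(v) for v in sorted(value)]}
        if all(isinstance(v, bytes) for v in value):
            return {"BS": [base64.b64encode(v).decode("ascii") for v in sorted(value)]}
    raise TypeError(f"unsupported value: {value!r}")


item = to_attribute_value({"Artist": "No One You Know", "Awards": 1})
```

The resulting `item["M"]` has the same shape as the `--item` argument passed to `put-item` on the command line.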

Global Secondary Index
This index is used when the data we want to query by is not part of the primary key. The one drawback is that each index is effectively a separate table.

DynamoDB Disadvantages
The explanation is as follows
Size limit — item can only reach 400KB in size
Limited querying options (limited number of indices)
Throttling on burst throughput (and hot keys in certain situations)
create-table
Example
We do it as follows
aws dynamodb --endpoint-url=http://localhost:4566 create-table \
    --table-name Music \
    --attribute-definitions \
        AttributeName=Artist,AttributeType=S \
        AttributeName=SongTitle,AttributeType=S \
    --key-schema \
        AttributeName=Artist,KeyType=HASH \
        AttributeName=SongTitle,KeyType=RANGE \
    --provisioned-throughput \
        ReadCapacityUnits=10,WriteCapacityUnits=5
describe-table
We do it as follows
aws --endpoint-url=http://localhost:4566 dynamodb describe-table \
  --table-name Music | grep TableStatus
put-item
We do it as follows
aws --endpoint-url=http://localhost:4566 dynamodb put-item \
  --table-name Music  \
  --item \
  '{"Artist": {"S": "No One You Know"}, "SongTitle": {"S": "Call Me Today"},
"AlbumTitle": {"S": "Somewhat Famous"}, "Awards": {"N": "1"}}'
scan
We do it as follows
aws dynamodb scan --endpoint-url=http://localhost:4566 --table-name Music
Example
The explanation is as follows
Because DynamoDB is not relational and does not enforce ACID by default, it must use a modified version of standard SQL. Amazon has developed a query language called PartiQL which uses many SQL concepts but is built for highly nested data. The query below takes advantage of the key-value underpinnings of DynamoDB in a relatively SQL standard way.
We do it as follows
UPDATE
    Music
SET
    AwardsWon = 1
SET
    AwardDetail = { 'Grammys': [ 2020, 2018 ] }
WHERE
    Artist = 'Acme Band'
    AND SongTitle = 'PartiQL Rocks'

Monday, 3 April 2023

Cache Strategies - Cache Access Patterns Refresh-ahead

Introduction
[Figure: refresh-ahead cache access pattern]


The explanation is as follows
... it refreshes the cache data before its expiration time. This is done for hot data, i.e. the data we expect to be requested in the near future.

Approach
1. Suppose the cached data's expiration time is 60 seconds and the refresh-ahead factor is 0.5.
2. If the cached object is accessed after 60 seconds, Coherence will perform a synchronous read from the cache store to refresh its value.
3. If the cached data is accessed after 30 seconds, say at the 35th second, the cache returns the current data and asynchronously refreshes it.
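The three steps above can be sketched in Python. This is a minimal illustration, not Coherence's implementation: the `RefreshAheadCache` class, its `loader` callback, the injectable `clock`, and the `wait_for_refresh` helper are hypothetical names introduced for the example.

```python
import threading
import time
from typing import Any, Callable, Dict, Tuple


class RefreshAheadCache:
    """Toy refresh-ahead cache (illustration only)."""

    def __init__(
        self,
        loader: Callable[[Any], Any],
        ttl_seconds: float = 60.0,
        refresh_factor: float = 0.5,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._refresh_after = ttl_seconds * refresh_factor  # 30 s in the example
        self._clock = clock
        self._entries: Dict[Any, Tuple[Any, float]] = {}    # key -> (value, loaded_at)
        self._refreshing: Dict[Any, threading.Thread] = {}
        self._lock = threading.Lock()

    def _load_and_store(self, key: Any) -> Any:
        value = self._loader(key)
        with self._lock:
            self._entries[key] = (value, self._clock())
            self._refreshing.pop(key, None)
        return value

    def get(self, key: Any) -> Any:
        with self._lock:
            entry = self._entries.get(key)
        if entry is None:
            return self._load_and_store(key)       # first access
        value, loaded_at = entry
        age = self._clock() - loaded_at
        if age >= self._ttl:
            return self._load_and_store(key)       # expired: synchronous read
        if age >= self._refresh_after:
            self._refresh_in_background(key)       # still valid: refresh ahead
        return value

    def _refresh_in_background(self, key: Any) -> None:
        with self._lock:
            if key in self._refreshing:            # a refresh is already running
                return
            thread = threading.Thread(target=self._load_and_store, args=(key,), daemon=True)
            self._refreshing[key] = thread
        thread.start()

    def wait_for_refresh(self, key: Any) -> None:
        """Block until a running background refresh for key has finished."""
        with self._lock:
            thread = self._refreshing.get(key)
        if thread is not None:
            thread.join()
```

With a 60 second expiration and a factor of 0.5, an access at the 35th second returns the cached value immediately and starts a reload in the background, so a later access sees fresh data without paying for the load.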