Monday, 13 April 2026

Troubleshooting Production Issues

Some example problems:
Here are 15 real production scenario-based questions:

1. Your Spring Boot service CPU suddenly spikes to 90% in production. How will you investigate and fix it?

2. After deployment, your service starts throwing intermittent 500 errors. How will you debug this issue?

3. One microservice goes down and causes a chain failure in other services. How will you prevent this in the future?

4. Your API response time increased from 200ms to 3 seconds after a new release. How will you identify the root cause?

5. Database connections are getting exhausted under load. What steps will you take to fix this?

6. A third-party service you depend on is timing out frequently. How will you handle this in your system?

7. You observe duplicate transactions happening in your system. How will you prevent this?

8. Logs are too large and distributed, making debugging difficult. How will you improve observability?

9. Memory usage keeps increasing and your service crashes after some time. How will you detect and fix memory leaks?

10. Your microservice works fine locally but fails in production. How will you approach debugging?

11. A new deployment breaks one feature but works for others. How will you safely roll back?

12. Traffic suddenly spikes 5x during peak hours and your service becomes slow. How will you scale?

13. Inter-service communication is failing due to network latency. How will you optimize it?

14. You need to trace a single request across multiple services during a failure. How will you implement tracing?

15. A bug in one service causes inconsistent data across multiple services. How will you handle data consistency?
Another example problem:
"Your Spring Boot service runs flawlessly in development, but crashes every night at 2am in production. Walk me through your debugging approach."

Most candidates respond:
‣ I would check the logs.
‣ I would restart the service.
‣ I would increase memory?
‣ Interview over.

Here is what interviewers are actually evaluating:

Step 1: Identify the pattern
2am is consistent. Not random. Not traffic-driven. This indicates a scheduled trigger or resource exhaustion. First question: what executes at 2am? Batch jobs? Scheduled tasks? Cron jobs?

Step 2: Analyze memory behavior before failure
Inspect JVM metrics and heap usage trends. If memory steadily increases from 10pm to 2am before crashing, it signals a memory leak, not a functional bug or infrastructure issue.

Step 3: Diagnose the leak
Enable GC logs. Capture heap dumps. Identify objects with abnormal growth: unclosed connections, static collections, or uncleared ThreadLocal variables. Even a single unclosed DB connection inside a loop can bring down the service.

Step 4: Validate connection pool utilization
HikariCP default pool size is 10. If a batch process consumes all connections without releasing them, subsequent requests block. By 2am, the pool is exhausted and the service becomes unresponsive.

Solution: enforce connection timeouts and use proper try-with-resources patterns.
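The try-with-resources part of that fix can be sketched in plain Java. The FakeConnection class below is a hypothetical stand-in for a pooled JDBC connection, so the demo is runnable without a database; the point is that close() runs even when the work throws:

```java
// Sketch: try-with-resources guarantees close() runs even on failure.
// FakeConnection is a hypothetical stand-in for a pooled JDBC Connection.
public class TryWithResourcesDemo {
    static int openCount = 0;

    static class FakeConnection implements AutoCloseable {
        FakeConnection() { openCount++; }
        void update() { throw new RuntimeException("query failed"); }
        @Override public void close() { openCount--; } // always returned to the pool
    }

    public static void main(String[] args) {
        try (FakeConnection conn = new FakeConnection()) {
            conn.update(); // throws, but close() still runs
        } catch (RuntimeException e) {
            // swallowed for the demo
        }
        System.out.println("open connections after failure: " + openCount); // 0
    }
}
```

Without the try-with-resources block, the exception would skip the close() call and the "connection" would stay checked out of the pool forever.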

Step 5: Monitor with APM tools
Use Prometheus & Grafana, New Relic, or Datadog. Configure proactive alerts instead of reactive fixes. If heap usage exceeds 80% at 1am, alerts should trigger before failure occurs. That is production-grade engineering.

The gap between 12 LPA and 35 LPA is not defined by frameworks. It is defined by understanding what breaks at 3am and why.
CPU Spike
Another example is here:

Database connections are getting exhausted under load
An example:
@Service
public class UserService {
  @Autowired
  private JdbcTemplate jdbcTemplate; // OK

  @Transactional
  public void updateUsers(List<User> users) {
    users.forEach(user ->
      jdbcTemplate.update(
        "UPDATE users SET last_login = ? WHERE id = ?",
        LocalDateTime.now(), user.getId()
      )
    );
  }

  @Async
  @Transactional
  public void asyncUpdateUser(User user) {
    jdbcTemplate.update(
      "UPDATE users SET last_login = ? WHERE id = ?",
      LocalDateTime.now(), user.getId()
    );
  }
}
The explanation:
Async threads can scale independently, but database connections cannot. This quickly overwhelms the connection pool.
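One common mitigation, sketched below under the assumption of a pool of 10 connections, is to bound concurrent async DB work with a Semaphore so threads queue for a permit instead of piling up inside the connection pool. The sleep stands in for the actual DB call:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: cap concurrent DB work at the pool size with a Semaphore.
// The "DB call" is a short sleep; maxInFlight must never exceed POOL_SIZE.
public class BoundedAsyncDemo {
    static final int POOL_SIZE = 10;
    static final Semaphore dbPermits = new Semaphore(POOL_SIZE);
    static final AtomicInteger inFlight = new AtomicInteger();
    static volatile int maxInFlight = 0;

    static void updateUser() throws InterruptedException {
        dbPermits.acquire();                   // wait for a "connection"
        try {
            int now = inFlight.incrementAndGet();
            synchronized (BoundedAsyncDemo.class) { maxInFlight = Math.max(maxInFlight, now); }
            Thread.sleep(5);                   // simulated DB call
            inFlight.decrementAndGet();
        } finally {
            dbPermits.release();
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(50); // many async threads
        for (int i = 0; i < 200; i++) {
            pool.submit(() -> { try { updateUser(); } catch (InterruptedException ignored) {} });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        System.out.println("max concurrent DB calls: " + maxInFlight);
    }
}
```

Fifty async threads submit 200 tasks, but at most 10 ever touch the "database" at once, which is the invariant the connection pool needs.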

Thursday, 9 April 2026

Can a Distributed Lock Be the Source of Truth?

Introduction
The question:
You have a distributed lock to prevent two users from booking the same hotel room.

Lock expires in 5 seconds. Your DB write takes 6 seconds under load.

Two users got confirmed bookings for the same room. How? And what is the process to fix this issue?
The key thing to note here:
Lock ≠ correctness.
If your DB allows duplicates, your system will eventually produce them.
The real fix lives in atomic writes + constraints, not just distributed locks.
In other words, the lock exists so the operation is not started twice in the first place. If two operations do start, one of them must fail.

The explanation:
This is a correctness question. And at the Senior to Principal level, this is exactly what interviewers are testing for: do you understand the difference between coordination and actual data integrity?

If you are preparing for system design interviews right now, this is the kind of failure-mode thinking that matters a lot in strong loops.

Now, let us break this one down properly.

[1] How did both users get confirmed bookings?

The timeline usually looks like this:

- User A acquires the distributed lock for Room 101
- Lock lease is valid for 5 seconds
- User A starts the DB write to mark the room as booked
- Under load, that DB write takes 6 seconds
- At second 5, the lock expires before User A finishes
- User B now acquires the same lock because the lock service thinks it is free
- User B also starts a booking write
- Both flows eventually return success, and both users get confirmations

So what actually failed here? The system assumed the distributed lock was the source of truth. A lease-based lock only gives you temporary coordination.

If the critical section takes longer than the lease, another actor can enter while the first one is still working.


[2] The deeper bug is usually not the lock itself

A lot of candidates stop at “increase the lock timeout.” That is not the real fix. The deeper issue is that your final correctness guarantee is missing at the database layer.

Because even if the lock expires, the database should still protect the invariant: “Only one valid booking can exist for this room for this date range.”

If both writes succeeded, it usually means one of these is true:
- no proper uniqueness or exclusion constraint existed
- booking availability was checked outside the final transaction
- writes were not serialized with row-level locking
- confirmation was sent before durable conflict detection finished

The lock helped reduce contention.
But the DB failed to enforce correctness.

[3] What is the right process to fix it

I would fix this in 4 steps.

1. Reconstruct the exact race
Check lock acquire time, lock expiry time, DB commit time, and confirmation event time for both users.

2. Move the invariant to the database

For hotel booking, correctness should be enforced with transactional logic such as:
- row-level locking on the inventory row
- atomic reserve-if-available update
- or exclusion/uniqueness constraints depending on data model

3. Treat the distributed lock as an optimization.
It can reduce hot contention, but it should never be the only thing preventing double booking.

4. Fix the confirmation path
Only send “booking confirmed” after the transaction commits successfully and conflict checks have passed.
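The atomic reserve-if-available idea from step 2 can be sketched in plain Java. ConcurrentHashMap.replace is a stand-in here for a conditional SQL UPDATE such as `UPDATE rooms SET status='BOOKED' WHERE id=? AND status='AVAILABLE'`; exactly one of two concurrent bookings can win, with or without a distributed lock:

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch: reserve-if-available as an atomic conditional update.
// replace(key, expected, newValue) plays the role of
// UPDATE rooms SET status = 'BOOKED' WHERE id = ? AND status = 'AVAILABLE'.
public class ReserveIfAvailableDemo {
    static final ConcurrentHashMap<String, String> rooms = new ConcurrentHashMap<>();

    static boolean book(String roomId) {
        // Succeeds only if the row is still AVAILABLE; atomic, lock-free.
        return rooms.replace(roomId, "AVAILABLE", "BOOKED");
    }

    public static void main(String[] args) {
        rooms.put("room-101", "AVAILABLE");
        boolean userA = book("room-101");
        boolean userB = book("room-101"); // second attempt must fail
        System.out.println("A=" + userA + " B=" + userB); // prints A=true B=false
    }
}
```

This is the "DB enforces the invariant" property: even if both users slipped past an expired lock, only one conditional write can succeed.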

[4] If you still want to use distributed locks, do it safely

If a distributed lock stays in the design, I would add:
- lease renewal or heartbeats for long critical sections
- fencing tokens so stale lock holders cannot keep writing
- alerts when p99 DB latency gets too close to lock TTL
- idempotency keys so retries do not create duplicate booking flows

A good rule of thumb is simple: If your lock TTL is 5 seconds and your write path can take 6 seconds under load, your design is already telling you it is unsafe.

Wednesday, 8 April 2026

Correlation Id vs Trace Id

Introduction
The explanation:
I often noticed that some developers do not really understand the difference between traceId and correlationId. I saw this so often that I decided to write this post.

At first they look similar.
Both are IDs.
Both appear in logs.
Both help during incidents.

But they answer different questions.

traceId answers:
"How did this specific execution path go through the system?"

correlationId answers:
"Which logs and events belong to the same business story?"

That difference becomes obvious once async enters the picture.

Example:

A user places an order.

The system does this:

1. Order Service creates the order
2. Payment Service charges the card
3. Kafka event is published
4. Billing Worker creates invoice
5. Email Service sends confirmation

Now imagine the logs:

Order created
correlationId=ORDER-8472
traceId=T1

Payment charged
correlationId=ORDER-8472
traceId=T1

Billing started from Kafka consumer
correlationId=ORDER-8472
traceId=T2

Email sending failed
correlationId=ORDER-8472
traceId=T3

This is the key point:

One correlationId
Multiple traceIds

Why?

Because the business flow is one.
But the technical executions are split.

The HTTP request is one execution.
Kafka consumer is another.
Retry later can be another.
Email worker can be another too.

So:

correlationId helps you reconstruct the whole story.
traceId helps you inspect one exact path in detail.

That is why using correlationId instead of tracing is a mistake.
You may connect logs, but you still do not get spans, timing hierarchy, or where exactly latency exploded.

And using only traceId is also not enough.
In distributed async systems, tracing often shows fragments. Correlation is what lets you stitch them back together 🧩

How I usually use them during incidents:

1. Start with correlationId
Find everything related to the same order, job, or user flow.

2. Then drill into traceId
Open the exact failing execution and inspect where it slowed down or broke.

Simple version:

traceId = the path
correlationId = the story

Have you seen teams mix these two and then realize the difference only during a production incident? 

Fencing Tokens

Introduction
The explanation:
Distributed systems concept: Fencing Tokens
You designed a fancy distributed locking algorithm just to find that an old primary is able to overwrite data!

The problem:
- Node A holds the lock and is doing some work.
- Node A gets disconnected/unresponsive/crashes, and resumes execution after its lease has expired (in "true" time).
- Node B, in the meantime, acquires the lock and writes some data.
- Node A resumes execution, thinking its lock is still valid.
- Node A overwrites the data written by Node B, even though it no longer holds the lock.

That's where fencing tokens come in: when a node acquires the lock, it gets a token with a monotonically increasing number. When the node tries to write data, it must include the token. If the token is outdated (i.e., lower than the current token), the write is rejected, preventing stale nodes from overwriting newer data.

Fencing tokens are used in a variety of systems, like etcd

The big takeaway is that you can't rely on just the client to know whether they are in their right. The target resource must have a gating mechanism to verify that the request makes sense.
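The gating mechanism can be sketched in plain Java: the protected resource tracks the highest token it has seen and rejects anything older. Token values here are illustrative:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: the protected resource remembers the highest fencing token it
// has accepted and rejects writes carrying an older (smaller) token.
public class FencedStore {
    private final AtomicLong highestToken = new AtomicLong(0);
    private volatile String data = "";

    // Returns true if the write was accepted.
    public boolean write(long token, String value) {
        long seen = highestToken.get();
        while (token > seen) {
            if (highestToken.compareAndSet(seen, token)) {
                data = value;
                return true;
            }
            seen = highestToken.get();         // lost the race; re-check
        }
        return false; // stale token: a newer lock holder already wrote
    }

    public String read() { return data; }

    public static void main(String[] args) {
        FencedStore store = new FencedStore();
        store.write(34, "B's data");                 // Node B, newer token
        boolean stale = store.write(33, "A's data"); // Node A, expired lock
        System.out.println("stale write accepted: " + stale + ", data: " + store.read());
    }
}
```

The decisive property: the check lives on the resource side, not the client side, exactly as the takeaway above says.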


JSON Web Token (JWT) and Immediate Logout

Introduction
If we operate in a fully stateless way, immediate logout is not possible. But if we add a little state on the server side, we get several options.

1. Short-lived access tokens
- Keep access tokens valid for 5 to 15 minutes
- This limits the damage window
- Very common and simple

2. Refresh token revocation
- Store refresh tokens in DB or Redis
- On logout, delete or mark them revoked
- This is the most common real-world pattern

3. Token blacklist / denylist
- Store revoked JWT IDs or token hashes until they expire
- Check this list on every request
- Useful for high-risk logout or compromised accounts
- But now auth is no longer fully stateless

4. Token versioning
- Store a tokenVersion or sessionVersion on the user record
- Include that version in the JWT
- On logout-all-devices or password reset, increment the version
- Old tokens stop working once the version mismatches
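Token versioning can be sketched in a few lines, with a HashMap standing in for the user record store; names like tokenVersion are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: token versioning. The user record stores a tokenVersion; each
// JWT carries the version it was issued with. Bumping the version on
// logout-all-devices invalidates every previously issued token at once.
public class TokenVersionDemo {
    static final Map<String, Integer> userTokenVersion = new HashMap<>();

    record Jwt(String userId, int tokenVersion) {}

    static boolean isValid(Jwt jwt) {
        return jwt.tokenVersion() == userTokenVersion.getOrDefault(jwt.userId(), 0);
    }

    static void logoutAllDevices(String userId) {
        userTokenVersion.merge(userId, 1, Integer::sum); // increment version
    }

    public static void main(String[] args) {
        userTokenVersion.put("alice", 7);
        Jwt token = new Jwt("alice", 7);
        System.out.println("before logout: " + isValid(token)); // true
        logoutAllDevices("alice");
        System.out.println("after logout: " + isValid(token));  // false
    }
}
```

The cost is one extra lookup of the user record per request, which is why this is often paired with short-lived access tokens rather than used alone.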

Thursday, 26 March 2026

Software Architecture - Idempotency and Phantom Writes

Introduction
The explanation:
You typically implement idempotency like this:
  1. Check if request already processed (via key / timestamp / PK)
  2. If not → write data
  3. If yes → skip
If the check is not atomic, problems arise.

Failure Mode 1: The TTL Expiry Trap
The explanation:
The most common idempotency implementation stores a request key with a time-to-live (TTL) — typically 24 or 48 hours. The assumption is that any duplicate will arrive within that window. In practice, this assumption frequently breaks.
The explanation:
The fix: Never use TTL-only idempotency for operations with unbounded retry windows. Instead, use a database-backed idempotency store with a three-state model (IN_PROGRESS, COMPLETED, FAILED) where the expires_at column drives a cleanup job for storage management — not correctness. The cleanup window should be set significantly longer than your worst-case replay window (7 days minimum for Kafka-based systems).
Failure Mode 2: The Partial Execution Ghost
The explanation:
A request arrives, the system writes the idempotency key with status IN_PROGRESS, begins processing, writes half the data, and crashes — JVM OOM, container eviction, network partition. The idempotency key is now in IN_PROGRESS state. When the retry arrives, the system faces an impossible decision: did the original operation complete or not?
The explanation:
The fix: Wrap both the business logic and the idempotency state transition in a single database transaction. If the transaction rolls back, both the business data and the idempotency status roll back together. For stale IN_PROGRESS keys (where the original processor is likely dead), use a configurable timeout threshold to reclaim and re-execute safely.
Failure Mode 3: The Concurrent Check Race
Here the check condition is not atomic. The explanation:
The fix: Use INSERT ... ON CONFLICT DO NOTHING (PostgreSQL 9.5+) to make the check-and-claim atomic. If the RETURNING clause yields no rows, the key already existed — fetch its status with SELECT ... FOR UPDATE. For non-blocking behavior, SELECT ... FOR UPDATE SKIP LOCKED lets the second instance return 409 Conflict immediately rather than waiting.
Failure Mode 4: The Layer Mismatch
The explanation:
The fix: Propagate a correlation ID from the original request as a Kafka header, and have every downstream consumer enforce its own idempotency barrier using that ID as the deduplication key.
Spring Boot + SQL Server
The code is below. In it:
- Partial Execution is solved with a single transaction.
- The Concurrent Check Race is solved via DuplicateKeyException. If we were using Postgres, instead of catching an exception we would check how many rows the SQL statement changed.
- The Layer Mismatch problem is solved with the outbox pattern.
@Service
@RequiredArgsConstructor
public class IdempotentService {
  private final JdbcTemplate jdbc;
  public record Response(String result) {}

  @Transactional
  public Response handleRequest(String idempotencyKey, String payload) {
    try {
      // Attempt barrier insert (atomic)
      // SQL Server:
      // INSERT INTO idempotency_table (idempotency_key, status)
      // VALUES (?, 'IN_PROGRESS')
      jdbc.update(
        "INSERT INTO idempotency_table (idempotency_key, status) VALUES (?, 'IN_PROGRESS')",
        idempotencyKey
      );

      // First request owns the key → perform business logic
      String result = doBusinessLogic(payload);

      // Insert into outbox for async processing
      // SQL Server:
      // INSERT INTO outbox_table (idempotency_key, payload) VALUES (?, ?)
      jdbc.update(
        "INSERT INTO outbox_table (idempotency_key, payload) VALUES (?, ?)",
        idempotencyKey, result
      );

      // Mark barrier as completed and store result
      // SQL Server:
      // UPDATE idempotency_table SET status='COMPLETED', response=? WHERE idempotency_key=?
      jdbc.update(
        "UPDATE idempotency_table SET status='COMPLETED', response=? WHERE idempotency_key=?",
        result, idempotencyKey
      );
      return new Response(result);
    } catch (DuplicateKeyException ex) {
      // Barrier row already exists → handle duplicate
      // SQL Server:
      // SELECT * FROM idempotency_table WITH (UPDLOCK, ROWLOCK) WHERE idempotency_key=?
      IdempotencyRecord record = jdbc.queryForObject(
        "SELECT status, response FROM idempotency_table WITH (UPDLOCK, ROWLOCK) WHERE idempotency_key=?",
        (rs, rowNum) -> new IdempotencyRecord(rs.getString("status"), rs.getString("response")),
        idempotencyKey
      );

      switch (record.status) {
        case "COMPLETED":
          // Return cached result
          return new Response(record.response);
        case "IN_PROGRESS":
          // Someone else is working → can wait or throw 409
          throw new IllegalStateException("Request is already in progress");
        case "FAILED":
          // Previous attempt failed → allow retry
          throw new IllegalStateException("Previous attempt failed, safe to retry");
        default:
          throw new IllegalStateException("Unknown barrier state: " + record.status);
      }
    }
  }

  private String doBusinessLogic(String payload) {
    // your domain logic here
    return "processed:" + payload;
  }

  private static class IdempotencyRecord {
      final String status;
      final String response;
      IdempotencyRecord(String status, String response) {
        this.status = status;
        this.response = response;
      }
  }
}
If we want the code to work for both SQL Server and Postgres, we do the following:
@Service
@RequiredArgsConstructor
public class IdempotentService {

    private final JdbcTemplate jdbc;

    public record Response(String result) {}

    @Transactional
    public Response handleRequest(String idempotencyKey, String payload) {
        boolean isWinner = false;

        try {
            // --------------------------
            // Attempt atomic barrier insert
            // --------------------------
            // Postgres: ON CONFLICT DO NOTHING turns a duplicate insert into
            // a no-op, so the losing caller sees rows == 0 instead of an exception.
            // SQL Server: a duplicate key raises DuplicateKeyException instead.
            String insertSql = isPostgres()
                    ? "INSERT INTO idempotency_table (idempotency_key, status) VALUES (?, 'IN_PROGRESS') ON CONFLICT DO NOTHING"
                    : "INSERT INTO idempotency_table (idempotency_key, status) VALUES (?, 'IN_PROGRESS')";
            int rows = jdbc.update(insertSql, idempotencyKey);

            // Postgres: rows == 1 → winner
            // SQL Server: INSERT succeeded → winner
            isWinner = rows == 1;

        } catch (DuplicateKeyException ex) {
            // SQL Server only: duplicate → loser
            isWinner = false;
        }

        if (isWinner) {
            // --------------------------
            // Winner executes business logic
            // --------------------------
            String result = doBusinessLogic(payload);

            // Insert into outbox (side effect)
            // INSERT INTO outbox_table (idempotency_key, payload) VALUES (?, ?)
            jdbc.update(
                    "INSERT INTO outbox_table (idempotency_key, payload) VALUES (?, ?)",
                    idempotencyKey, result
            );

            // Mark barrier as completed + store response
            // UPDATE idempotency_table SET status='COMPLETED', response=? WHERE idempotency_key=?
            jdbc.update(
                    "UPDATE idempotency_table SET status='COMPLETED', response=? WHERE idempotency_key=?",
                    result, idempotencyKey
            );

            return new Response(result);
        } else {
            // --------------------------
            // Loser reads existing row safely
            // --------------------------
            // SQL Server: SELECT ... WITH (UPDLOCK, ROWLOCK) WHERE idempotency_key=?
            // Postgres: SELECT * FROM idempotency_table WHERE idempotency_key=?
            IdempotencyRecord record = jdbc.queryForObject(
                    "SELECT status, response FROM idempotency_table " +
                            (isPostgres() ? "" : "WITH (UPDLOCK, ROWLOCK) ") +
                            "WHERE idempotency_key=?",
                    (rs, rowNum) -> new IdempotencyRecord(rs.getString("status"), rs.getString("response")),
                    idempotencyKey
            );

            switch (record.status) {
                case "COMPLETED":
                    return new Response(record.response);
                case "IN_PROGRESS":
                    throw new IllegalStateException("Request already in progress");
                case "FAILED":
                    throw new IllegalStateException("Previous attempt failed, safe to retry");
                default:
                    throw new IllegalStateException("Unknown barrier state: " + record.status);
            }
        }
    }

    private boolean isPostgres() {
        // Detect DB type from DataSource or JdbcTemplate if needed
        return true; // placeholder, implement detection
    }

    private String doBusinessLogic(String payload) {
        return "processed:" + payload;
    }

    private static class IdempotencyRecord {
        final String status;
        final String response;

        IdempotencyRecord(String status, String response) {
            this.status = status;
            this.response = response;
        }
    }
}


Wednesday, 25 March 2026

Claude

Giriş
An example is here.



1. The CLAUDE.md File
The main control file. For example:
- Never use the main branch

2. The CLAUDE.local.md File
The explanation:
CLAUDE.local.md is useful for notes you do not want to commit but still want to apply in the current project.

3. Subdirectories
The explanation:
- CLAUDE.md files inside subdirectories are not all loaded up front, but only when Claude Code actually reads content from those directories
- When multiple CLAUDE.md files are active at the same time, a nearest-scope rule usually applies, meaning instructions closer to the current task and narrower in scope take priority
- Within the same layer, rules that are more explicit and more specific are also more likely to be followed consistently than vague general statements
4. The .claude Directory

4.1 .claude/commands
Automating repetitive tasks.

4.2 .claude/rules
Project rules (tests, naming, etc.)

Commands
/init
Creates the initial CLAUDE.md file.

/reflection for Regular Retrospectives
The explanation:
At the end of each session, you can ask Claude Code to summarize what from that round of collaboration is worth adding to CLAUDE.md, and then turn those points into more stable project rules.
/skill-creator
The explanation:
A skill isn't a prompt. You don't type it. You build it once, describe what it does and when to use it, and Claude recognises when to fire it on its own. The right context appears, the skill runs. You do nothing.
We use this command to build a custom skill. The explanation:
You describe what you need, it helps you draft the skill, then runs a test (one session with the skill, one without) and opens a browser window so you can compare the results. Then it optimises automatically based on your feedback so the skill triggers when it should.

Monday, 23 March 2026

Cache Strategies Presentation

Summary

  • In real systems:
    • 80% → Cache-Aside + Eviction
    • High-scale → Add these:
      • Stampede protection
      • Two-level cache
      • Event invalidation
  • Spring mainly supports:
    • Cache-Aside (natively)
    • Partial Write-Through
    • Eviction patterns
  • @Cacheable, @CachePut, @CacheEvict are mainly Cache-Aside tools
  • Advanced patterns require custom logic or cache provider features
  • High-scale systems often combine:
    • Cache-Aside + Eviction
    • Two-Level Cache
    • Stampede Protection
    • Event-Driven Invalidation
  • Spring annotations alone are not enough for advanced caching—you end up:
    • Using Caffeine / Redis features directly
    • Or writing custom cache layers

Read-Heavy Strategies

  • Cache-Aside - Implemented by App
  • Read-Through - Implemented by Cache Provider
  • Refresh-Ahead - Implemented by Cache Provider

Write-Heavy Strategies

  • Write-Through - Implemented by Cache Provider
  • Write-Behind (aka Write-Back) - Implemented by Cache Provider
  • Write-Around - Implemented by App

1. Cache-Aside (Lazy Loading)

App reads from cache → if miss → load from DB → put in cache. Cache is not responsible for loading; application does it.

@Service
public class UserService {
    @Cacheable(value = "users", key = "#id")
    public User getUser(Long id) {
        return userRepository.findById(id)
                .orElseThrow();
    }
}

2. Write-Through

Write goes to cache and DB synchronously. Cache always up-to-date.

@CachePut(value = "users", key = "#user.id")
public User saveUser(User user) {
    return userRepository.save(user);
}

3. Read-Through

Cache itself loads data (app doesn’t call DB directly). App only talks to cache provider. Cache abstracts loading logic. Provider like Hazelcast / Redis with loader.

4. Write-Behind

Write goes to cache → DB updated asynchronously later. Very fast writes.

public void saveUser(User user) {
    cache.put(user.getId(), user);

    asyncExecutor.submit(() -> {
        userRepository.save(user);
    });
}

5. Refresh-Ahead

Cache refreshes entries before expiration to avoid cache miss spikes. Not supported via Spring annotations.

Caffeine.newBuilder()
    .refreshAfterWrite(Duration.ofMinutes(5))
    .build(key -> loadFromDb(key));

6. Cache Eviction / Invalidation

Explicitly remove/update cache when data changes.

@CacheEvict(value = "users", key = "#id")
public void deleteUser(Long id) {
    userRepository.deleteById(id);
}

7. Write-Around

Writes go directly to DB, cache updated only on read. Prevents cache from being updated on writes. Cache becomes stale by design. Relies on future reads to populate.

@Service
public class OrderService {

    @Autowired
    private OrderRepository orderRepository;

    @Autowired
    private CacheManager cacheManager;

    public void createOrder(Order order) {
        orderRepository.save(order); // cache not updated
    }

    @Cacheable(value = "userOrders", key = "#userId")
    public List<Order> getOrdersForUser(Long userId) {
        return orderRepository.findByUserId(userId);
    }
}

8. Negative Caching Control

Cache “not found” results. Example: user not found → cache null. Prevents repeated DB hits for keys that do not exist. Key insight: unless = "#result == null" does the opposite, skipping null results, so omit it when you actually want negative caching.

@Cacheable(value = "users", key = "#id", unless = "#result == null")
public User getUser(Long id) {
    return userRepository.findById(id).orElse(null);
}

9. Two-Level Cache

L1 (in-memory) + L2 (distributed like Redis). L1: Caffeine, L2: Redis. Must combine manually.
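The manual combination can be sketched with two maps standing in for Caffeine (L1) and Redis (L2) and a loader function standing in for the database; a real implementation would add TTLs and L1 invalidation:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch: a two-level read path. ConcurrentHashMaps stand in for
// Caffeine (L1, per-instance) and Redis (L2, shared); the loader stands
// in for the database. L1 miss -> L2 -> DB, back-filling on the way out.
public class TwoLevelCache<K, V> {
    private final Map<K, V> l1 = new ConcurrentHashMap<>();
    private final Map<K, V> l2 = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    int dbLoads = 0;

    TwoLevelCache(Function<K, V> loader) { this.loader = loader; }

    V get(K key) {
        V v = l1.get(key);
        if (v != null) return v;          // L1 hit
        v = l2.get(key);
        if (v == null) {                  // L2 miss -> load from "DB"
            dbLoads++;
            v = loader.apply(key);
            l2.put(key, v);
        }
        l1.put(key, v);                   // back-fill L1
        return v;
    }

    public static void main(String[] args) {
        TwoLevelCache<Long, String> cache = new TwoLevelCache<>(id -> "user-" + id);
        cache.get(1L);                    // DB load
        cache.get(1L);                    // L1 hit, no extra load
        System.out.println("db loads: " + cache.dbLoads);
    }
}
```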

10. Cache Stampede Protection

Prevent many threads from hitting DB on same miss. Only one thread fetches DB; others wait or use cache.
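A minimal sketch of stampede protection: concurrent callers share a single in-flight CompletableFuture per key via computeIfAbsent, so only the first caller touches the DB. A real cache would also evict the future after expiry or on failure:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: stampede protection. On a miss, all concurrent callers share
// one in-flight CompletableFuture per key, so only one thread hits the DB.
public class StampedeDemo {
    static final ConcurrentHashMap<String, CompletableFuture<String>> cache = new ConcurrentHashMap<>();
    static final AtomicInteger dbHits = new AtomicInteger();

    static String get(String key) {
        return cache.computeIfAbsent(key, k -> CompletableFuture.supplyAsync(() -> {
            dbHits.incrementAndGet();          // only the first caller runs this
            try { Thread.sleep(50); } catch (InterruptedException ignored) {}
            return "value-for-" + k;
        })).join();                            // everyone else waits on the same future
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(20);
        for (int i = 0; i < 20; i++) pool.submit(() -> get("hot-key"));
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("db hits for 20 concurrent misses: " + dbHits.get()); // 1
    }
}
```

computeIfAbsent is atomic per key, which is what guarantees a single loader even under heavy concurrency.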

11. Read-Repair

If stale data detected → fix cache during read. Not supported via @Cacheable.

public User getUser(Long id) {
    User cached = cache.get(id);

    if (cached != null && isStale(cached)) {
        User fresh = userRepository.findById(id).orElse(null);
        cache.put(id, fresh); // repair
        return fresh;
    }

    if (cached != null) {
        return cached;
    }

    User fresh = userRepository.findById(id).orElse(null);
    cache.put(id, fresh);
    return fresh;
}

12. Event-Driven Cache Invalidation

Use events (Kafka, etc.) to invalidate/update cache entries.

Thursday, 19 March 2026

Amazon Web Services (AWS) EventBridge - “Kafka-lite, fully managed, rule-based event routing”

Introduction
The flow:
Webhooks → simple HTTP push (external trigger mechanism)
Amazon EventBridge → event router (central nervous system)
AWS Lambda → code runner (brain doing the work)
There is an example architecture here. In that architecture, instead of webhook calls triggering the microservices directly, they first go to AWS EventBridge and are routed onward from there. The important points of the architecture are:
The Patterns Nobody Documents
Here’s what I learned building this that you won’t find in AWS documentation.

Pattern 1: Event Normalization at the Edge
Don’t let raw external events onto your bus. Ever. Your webhook handler should transform vendor-specific payloads into domain events. When we integrated PayPal, our services didn’t care. They still received payment.completed events with the same schema.

Pattern 2: Event Versioning from Day One
We screwed this up initially. Six months in, we needed to change the event schema. Half our services were still consuming v1 events. Now every event includes a version field, and EventBridge rules route based on version. Services can migrate on their own schedule.
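The version field idea can be sketched as a consumer-side dispatch; the event shape here is illustrative, not the actual EventBridge payload:

```java
import java.util.Map;

// Sketch: version-aware event handling. Every event carries a "version"
// field; consumers dispatch on it so v1 and v2 schemas can coexist while
// services migrate on their own schedule.
public class VersionedEventDemo {
    static String handle(Map<String, Object> event) {
        int version = (int) event.getOrDefault("version", 1);
        switch (version) {
            case 1: return "handled v1: " + event.get("amount");
            case 2: return "handled v2: " + event.get("amountCents");
            default: return "unknown version " + version; // route to a DLQ in practice
        }
    }

    public static void main(String[] args) {
        System.out.println(handle(Map.of("version", 1, "amount", "10.00")));
        System.out.println(handle(Map.of("version", 2, "amountCents", 1000)));
    }
}
```

In EventBridge itself the same split is done declaratively, with one rule per version matching on the version field of the event payload.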

Pattern 3: Dead Letter Queues for Everything
This saved us during Black Friday. A bug in the inventory service caused it to reject 15% of order.created events. Because we had DLQs configured, those events sat safely in a queue while we fixed the bug, then we replayed them. Zero lost orders.

Pattern 4: Archive Anything That Touches Money
EventBridge archiving is criminally underused. We archive every payment-related event for 90 days. When customers dispute charges, we have perfect audit trails. When the finance team needs transaction reports, we replay archived events. Cost? $47/month for 2.1M archived events.

Tuesday, 17 March 2026

The MIT License

MIT vs LGPL
The explanation:
LGPL says, “you can use this code, but if you change it, you must share your changes under the same terms.” MIT says, “Do whatever you want.” One protects the community. The other lets corporations take without giving back.


LGPL - GNU Lesser General Public License

LGPL
Because the GPL was considered too restrictive, the LGPL (GNU Lesser General Public License) was created.
The explanation:
LGPL says, “you can use this code, but if you change it, you must share your changes under the same terms.”
If We Use LGPL Code and Distribute Our Application
In my view, this is the most important point where GPL and LGPL diverge. If we use LGPL software and sell our own product, we are not obliged to open our source code. The explanation is below. If we do want to open our source code, our own code must also be LGPL licensed.
Yes, you can distribute your software without making the source code public and without giving recipients the right to make changes to your software.

The LGPL license explicitly allows such usages of libraries/packages released under that license.

Tuesday, 10 March 2026

Medallion Architecture

Introduction
The explanation:
Medallion architecture is a data design pattern that organizes data into three layers:

Bronze Layer (Raw):
  • Data ingested in its original format
  • Minimal transformation
  • Append-only historical record
  • No data quality enforcement
Silver Layer (Refined):
  • Cleaned and conformed data
  • Schema enforced
  • Deduplicated
  • Validated
  • Still fairly granular
Gold Layer (Curated):
  • Business-level aggregations
  • Denormalized for consumption
  • Optimized for specific use cases
  • Analytics-ready
Origin: Popularized by Databricks around 2019-2020 as part of the lakehouse pattern.

Monday, February 23, 2026

Source Code Comments and Decision Context

Introduction
This post grew out of the Code Review process, whose checklist included the following item.

1. Source code comments are sufficient:
The checklist sentences usually read as follows. This is where subjectivity comes to the fore.
  • If there is a comment, does it explain why the code does what it does?
  • Is each line of the code - in its context - either self-explanatory enough that it does not need a comment, or if not, is it accompanied by a comment which closes that gap?
  • Can the code be changed so it does not need a comment any more?
In some safety-critical projects, a comment is required on every line. That makes the job somewhat easier: it is enough to check each line. The code looks like this.
/* Display an error message */
function display_error_message( $error_message )
{
  /* Display the error message */
  echo $error_message;

  /* Exit the application */
  exit();
}

/* -------------------------------------------------------------------- */

/* Check if the configuration file does not exist, then display an error */
/* message */
if ( !file_exists( 'C:/xampp/htdocs/essentials/configuration.ini' ) ) {
  /* Display an error message */
  display_error_message( 'Error: ...');
}
Some examples of what needs to be done:
- Source code conforms to coding standard and is checked by automated tool
- Source code is checked manually by reviewer if automation is not possible
- Source code is checked for memory leaks by a dedicated tool
- Source code is compatible and traceable to SRS

2. The Pattern I Notice in Every High-Quality Codebase
High-quality codebases record the decision, i.e. the "why", behind the code. The explanation is as follows:
I've started noticing four types of decision context that great codebases maintain:
...
Without this context, all code looks equally arbitrary.
1. Business context — Why this business rule exists
An example:
// Stripe charges 2.9% + $0.30 per transaction
// We pass this through to users on transactions <$10
// For larger transactions, we absorb it (reduces churn by 8%)
const FEE_THRESHOLD = 1000; // in cents
2. Historical context — Why we chose this approach
An example:
// We tried async/await here but hit deadlocks under load
// See incident post-mortem: docs/incidents/2024-01-15-deadlock.md
// Synchronous approach is slower but reliable
fn process_batch_sync(items: Vec<Item>) -> Result<()> {
3. Constraint context — What limits our options
An example:
// API rate limited to 100 req/min per docs/api-limits.md
// We batch requests to stay under limit with 20% safety margin
const maxRequestsPerMinute = 80
4. Future context — What we plan to change
An example:
// TODO: Move to event-driven architecture
// Blocked on: Kafka cluster provisioning (INFRA-445)
// Timeline: Q2 2024
// This polling approach is temporary
pollForUpdates();




Wednesday, February 18, 2026

Data Models

Introduction
I first saw the article (10 Data Models Every Data Engineer Must Know (Before They Break Production)) here.

10. Star Schema: The Legacy Workhorse (That Fails at Scale)
The explanation is as follows.
Star schemas are intuitive and analyst-friendly, but at scale they become a performance bottleneck, especially with massive fact tables, high-cardinality dimensions, and near-real-time workloads.
9. Snowflake Schema: Over-Engineered & Slow
The explanation is as follows.
Snowflake schemas optimize storage, not query performance. In modern analytics (cloud OLAP, dashboards, ad-hoc queries), compute is the bottleneck, not disk. Excessive normalization explodes join depth and kills latency.
8. Data Vault: The Enterprise Monster (When You Need Auditability)
The explanation is as follows.
Data Vault excels at auditability, lineage, and full historization, critical for regulated industries (banking, healthcare). But its multi-layer architecture makes it fundamentally unsuited for low-latency analytics.
7. Wide-Column Stores (Cassandra, Bigtable) for Time-Series Chaos
The explanation is as follows.
Wide-column databases dominate high-velocity ingest (IoT, metrics, logs) where writes never stop. But they sacrifice query flexibility, no joins, limited filtering, and rigid access patterns. You win on writes, lose on exploration.
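The wide-column trade-off can be sketched in plain Python (names and data are hypothetical): rows are grouped under a partition key and kept ordered by a clustering key, so writes are cheap keyed inserts and reads are key-range scans within one partition, while ad-hoc filtering across partitions simply has no efficient query shape.

```python
from bisect import insort

# Toy wide-column model: partition key -> sorted (clustering key, value) cells.
# Mirrors the Cassandra/Bigtable idea: fast keyed writes and range reads,
# no joins, no cross-partition filtering.
table = {}

def write(partition_key, ts, value):
    """Keyed write: insert into one partition's sorted row."""
    insort(table.setdefault(partition_key, []), (ts, value))

def read_range(partition_key, start_ts, end_ts):
    """Read a time slice from a single partition; this is the only
    efficient query shape, so you must know the partition key."""
    return [(ts, v) for ts, v in table.get(partition_key, [])
            if start_ts <= ts <= end_ts]

write("sensor-42", 100, 21.5)
write("sensor-42", 160, 21.9)
write("sensor-42", 130, 21.7)
print(read_range("sensor-42", 100, 140))  # [(100, 21.5), (130, 21.7)]
```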
6. Graph Models (Neo4j, TigerGraph) for Hidden Relationships
The explanation is as follows.
When insight lives in relationships (fraud rings, social influence, network hops), relational joins collapse under recursive depth. Graph databases treat relationships as first-class citizens, making multi-hop traversals fast and natural.
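The multi-hop idea can be illustrated with a small breadth-first traversal in Python (a hypothetical account graph for a fraud-ring case). Each "hop" here is what a relational model would have to express as yet another self-join:

```python
from collections import deque

# Hypothetical fraud-ring graph: accounts linked by shared devices/cards.
edges = {
    "A": ["B"],
    "B": ["C"],
    "C": ["D"],
    "D": [],
}

def within_hops(graph, start, max_hops):
    """Return all nodes reachable from start in at most max_hops hops."""
    seen = {start: 0}                 # node -> hop distance
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:    # do not expand past the hop limit
            continue
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return {n for n, d in seen.items() if d > 0}

print(within_hops(edges, "A", 2))  # {'B', 'C'}
```

A graph database runs this traversal natively over adjacency, which is why increasing `max_hops` stays cheap there, whereas the equivalent SQL grows one recursive join per hop.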
5. Streaming Event Sourcing (Kafka + CDC)
The explanation is as follows.
Batch ETL is fundamentally incompatible with real-time systems. CDC turns database mutations into immutable events, enabling near-zero-latency pipelines, replayable state, and system-wide consistency across microservices.
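The "replayable state" property can be sketched in Python: treat each row mutation as an immutable event and rebuild current state by folding the log from the beginning. The event shapes below are hypothetical; in a real pipeline they would arrive on Kafka topics fed by a CDC tool such as Debezium.

```python
# Each CDC event captures one database mutation; replaying the log
# deterministically reconstructs the table's current state.
events = [
    {"op": "insert", "id": 1, "row": {"name": "Alice", "balance": 100}},
    {"op": "insert", "id": 2, "row": {"name": "Bob", "balance": 50}},
    {"op": "update", "id": 1, "row": {"name": "Alice", "balance": 80}},
    {"op": "delete", "id": 2},
]

def replay(log):
    """Fold an immutable event log into the current table state."""
    state = {}
    for e in log:
        if e["op"] in ("insert", "update"):
            state[e["id"]] = e["row"]
        elif e["op"] == "delete":
            state.pop(e["id"], None)
    return state

print(replay(events))  # {1: {'name': 'Alice', 'balance': 80}}
```

Because the log is append-only, any consumer (a cache, a search index, another microservice) can replay it from offset zero and converge on the same state.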
4. Columnar Storage (Parquet, Delta Lake) for Cheap, Fast Analytics
Parquet is one example.
The explanation is as follows.
Row-based databases are optimized for point lookups, not scans. Analytics workloads read a few columns across billions of rows, exactly what columnar storage is built for. The result: orders-of-magnitude faster queries at a fraction of the cost.
An example:
We can do it as follows.
CREATE TABLE sales_parquet (
    order_id   BIGINT,
    region     STRING,
    amount     DECIMAL(10,2),
    order_ts   TIMESTAMP,
    order_date DATE
)
USING PARQUET
PARTITIONED BY (region, order_date);

SELECT
    region,
    SUM(amount) AS total_sales
FROM sales_parquet
WHERE order_date = '2025-12-25'
  AND region = 'US'
GROUP BY region;
The explanation is as follows.
Why this is fast
- Only amount and region columns are read
- Only the order_date=2025-12-25 and US partitions are scanned
- All other files are skipped entirely
3. Multi-Model Hybrids (When SQL + NoSQL Collide)
The explanation is as follows. What matters here is that the database supports JSONB columns.
Real-world data is rarely one shape. Modern apps mix relational facts, semi-structured JSON, and relationships. Multi-model databases let you query everything in one place, without forcing awkward ETL or duplicating data.
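A minimal illustration using SQLite's built-in JSON functions (the quote above likely has PostgreSQL JSONB in mind; the table and column names are hypothetical): relational columns and a semi-structured JSON document live in one table and are filtered together in a single query, with no ETL step.

```python
import sqlite3

# One table mixes relational columns (id, name) with a JSON document.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, profile TEXT)")
conn.execute(
    "INSERT INTO users VALUES (1, 'Alice', json('{\"city\": \"Ankara\", \"tags\": [\"admin\"]}'))"
)
conn.execute(
    "INSERT INTO users VALUES (2, 'Bob', json('{\"city\": \"Izmir\", \"tags\": []}'))"
)

# Filter on a JSON field alongside relational columns in one query
rows = conn.execute(
    "SELECT name, json_extract(profile, '$.city') FROM users "
    "WHERE json_extract(profile, '$.city') = 'Ankara'"
).fetchall()
print(rows)  # [('Alice', 'Ankara')]
```

PostgreSQL's JSONB goes further than this sketch (indexable operators such as `@>`), but the shape of the query is the same: one engine, both models.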
2. Reverse ETL (Operational Analytics) to Put Data Back in Apps

1. The Unified Serving Layer (The Future of Production Data)
One dataset. Many engines. Zero rewrites. The explanation is as follows:
Modern data stacks fracture data across OLTP, OLAP, search, and streaming systems, creating sync lag and duplicated logic. A Unified Serving Layer uses one logical data layer (Iceberg/Hudi/Delta) with multiple access modes: SQL analytics, near-real-time reads, ML, and even graph/search workloads.