Monday, 23 March 2026

Cache Strategies Presentation

Summary

  • In real systems:
    • About 80% of use cases → Cache-Aside + Eviction
    • High-scale → Add these:
      • Stampede protection
      • Two-level cache
      • Event invalidation
  • Spring mainly supports:
    • Cache-Aside (natively)
    • Partial Write-Through
    • Eviction patterns
  • @Cacheable, @CachePut, @CacheEvict are mainly Cache-Aside tools
  • Advanced patterns require custom logic or cache provider features
  • High-scale systems often combine:
    • Cache-Aside + Eviction
    • Two-Level Cache
    • Stampede Protection
    • Event-Driven Invalidation
  • Spring annotations alone are not enough for advanced caching; you end up:
    • Using Caffeine / Redis features directly
    • Or writing custom cache layers

Read-Heavy Strategies

  • Cache-Aside - Implemented by App
  • Read-Through - Implemented by Cache Provider
  • Refresh-Ahead - Implemented by Cache Provider

Write-Heavy Strategies

  • Write-Through - Implemented by Cache Provider
  • Write-Behind (aka Write-Back) - Implemented by Cache Provider
  • Write-Around - Implemented by App

1. Cache-Aside (Lazy Loading)

App reads from cache → if miss → load from DB → put in cache. Cache is not responsible for loading; application does it.

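import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
public class UserService {
    private final UserRepository userRepository;

    public UserService(UserRepository userRepository) { this.userRepository = userRepository; }

    // On a miss Spring runs the method and caches the result; on a hit the body is skipped
    @Cacheable(value = "users", key = "#id")
    public User getUser(Long id) {
        return userRepository.findById(id)
                .orElseThrow();
    }
}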

2. Write-Through

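The write goes to the cache and the DB synchronously, so the cache stays up to date for every write that goes through this method. With @CachePut the method always runs and its return value replaces the cached entry. The application, not the cache provider, writes to the DB, which is why this counts only as partial write-through.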

@CachePut(value = "users", key = "#user.id")
public User saveUser(User user) {
    return userRepository.save(user);
}

3. Read-Through

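The cache itself loads data on a miss, so the app does not call the DB directly. The app only talks to the cache provider, and the cache abstracts the loading logic. This needs a provider that supports a loader, like Hazelcast, or Redis with a loader in front of it.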
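
To make the difference from Cache-Aside concrete, here is a minimal in-process sketch of the read-through idea using only the JDK. The class name ReadThroughCache and the loader function are assumptions for illustration; a real provider would supply its own loader hook.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Minimal read-through cache: callers only call get(); the cache owns the loader.
final class ReadThroughCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // stand-in for a DB lookup (assumption)

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // On a miss the cache itself invokes the loader and stores the result.
    V get(K key) {
        return store.computeIfAbsent(key, loader);
    }

    public static void main(String[] args) {
        ReadThroughCache<Long, String> users = new ReadThroughCache<>(id -> "user-" + id);
        System.out.println(users.get(42L)); // loaded through the cache
        System.out.println(users.get(42L)); // served from the cache
    }
}
```

The caller never sees the loader, which is the whole point: swapping the data source does not change any call site.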

4. Write-Behind

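The write goes to the cache first and the DB is updated asynchronously later. Writes are very fast, but a write that has not been flushed yet is lost if the process dies before the DB update runs.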

public void saveUser(User user) {
    cache.put(user.getId(), user);

    asyncExecutor.submit(() -> {
        userRepository.save(user);
    });
}

5. Refresh-Ahead

Cache refreshes entries before expiration to avoid cache miss spikes. Not supported via Spring annotations.

Caffeine.newBuilder()
    .refreshAfterWrite(Duration.ofMinutes(5))
    .build(key -> loadFromDb(key));

6. Cache Eviction / Invalidation

Explicitly remove/update cache when data changes.

@CacheEvict(value = "users", key = "#id")
public void deleteUser(Long id) {
    userRepository.deleteById(id);
}

7. Write-Around

Writes go directly to DB, cache updated only on read. Prevents cache from being updated on writes. Cache becomes stale by design. Relies on future reads to populate.

@Service
public class OrderService {

    @Autowired
    private OrderRepository orderRepository;

    @Autowired
    private CacheManager cacheManager;

    public void createOrder(Order order) {
        orderRepository.save(order); // cache not updated
    }

    @Cacheable(value = "userOrders", key = "#userId")
    public List<Order> getOrdersForUser(Long userId) {
        return orderRepository.findByUserId(userId);
    }
}

8. Negative Caching Control

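Cache “not found” results. Example: user not found → cache null. This prevents repeated DB hits for keys that do not exist. Key insight: unless = "#result == null" does the opposite, it tells Spring not to cache null results. The method below therefore has negative caching turned off; drop the unless condition (and use a cache that allows null values) to turn it on.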

@Cacheable(value = "users", key = "#id", unless = "#result == null")
public User getUser(Long id) {
    return userRepository.findById(id).orElse(null);
}
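
For comparison, here is a small JDK-only sketch with negative caching turned on. A missing row is stored as Optional.empty(), so the second lookup for the same missing id does not reach the DB. NegativeCache and the lookup function are hypothetical names, not Spring APIs.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Negative caching sketch: "not found" is cached as Optional.empty().
final class NegativeCache<K, V> {
    private final Map<K, Optional<V>> store = new ConcurrentHashMap<>();
    private final Function<K, Optional<V>> lookup; // hypothetical DB lookup, must not return null

    NegativeCache(Function<K, Optional<V>> lookup) {
        this.lookup = lookup;
    }

    // Both hits and misses of the lookup are remembered.
    Optional<V> get(K key) {
        return store.computeIfAbsent(key, lookup);
    }

    public static void main(String[] args) {
        NegativeCache<Long, String> users =
                new NegativeCache<>(id -> id == 1L ? Optional.of("alice") : Optional.empty());
        System.out.println(users.get(2L).isPresent()); // prints false
    }
}
```

In practice negative entries usually get a short expiry, otherwise a row created later stays invisible until the entry is evicted.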

9. Two-Level Cache

L1 (in-memory, e.g. Caffeine) + L2 (distributed, e.g. Redis). Spring does not chain the two levels for you; they must be combined manually.
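
The manual combination can be sketched as follows. Both levels are plain maps here: l1 stands in for Caffeine and the shared l2 map stands in for Redis, and dbLoader is a hypothetical DB lookup. All names are assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Two-level lookup: L1 (local) → L2 (shared) → DB, promoting values on the way back.
final class TwoLevelCache<K, V> {
    private final Map<K, V> l1 = new ConcurrentHashMap<>(); // per-instance, stand-in for Caffeine
    private final Map<K, V> l2;                              // shared, stand-in for Redis
    private final Function<K, V> dbLoader;                   // hypothetical DB lookup

    TwoLevelCache(Map<K, V> l2, Function<K, V> dbLoader) {
        this.l2 = l2;
        this.dbLoader = dbLoader;
    }

    V get(K key) {
        V value = l1.get(key);
        if (value != null) {
            return value;                 // L1 hit
        }
        value = l2.get(key);
        if (value == null) {              // L2 miss: fall through to the DB
            value = dbLoader.apply(key);
            if (value == null) {
                return null;              // nothing to cache
            }
            l2.put(key, value);
        }
        l1.put(key, value);               // promote into L1
        return value;
    }

    // Evict from the shared level first, then the local one.
    void evict(K key) {
        l2.remove(key);
        l1.remove(key);
    }

    public static void main(String[] args) {
        Map<Long, String> shared = new ConcurrentHashMap<>();
        TwoLevelCache<Long, String> node = new TwoLevelCache<>(shared, id -> "user-" + id);
        System.out.println(node.get(7L));
    }
}
```

Note that evict only clears the L1 of the instance it is called on; other instances keep their local copy until they are told about the change, which is where event-driven invalidation comes in.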

10. Cache Stampede Protection (cache pile-up)

Prevent many threads from hitting DB on same miss. Only one thread fetches DB; others wait or use cache.
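
One common way to get the "one thread loads, others wait" behaviour is to cache a future per key. The sketch below uses only the JDK; SingleFlightCache and the loader are hypothetical names, and there is no expiry.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Stampede protection: the first caller for a key loads, concurrent callers wait on its future.
final class SingleFlightCache<K, V> {
    private final ConcurrentMap<K, CompletableFuture<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // hypothetical (slow) DB lookup

    SingleFlightCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        CompletableFuture<V> created = new CompletableFuture<>();
        CompletableFuture<V> existing = entries.putIfAbsent(key, created);
        if (existing != null) {
            return existing.join();           // someone else is loading or has loaded
        }
        try {
            V value = loader.apply(key);      // only the winning thread gets here
            created.complete(value);
            return value;
        } catch (RuntimeException e) {
            entries.remove(key, created);     // let a later call retry
            created.completeExceptionally(e); // wake up the waiters with the failure
            throw e;
        }
    }

    public static void main(String[] args) {
        SingleFlightCache<String, String> cache = new SingleFlightCache<>(key -> "value-for-" + key);
        System.out.println(cache.get("a"));
    }
}
```

Within Spring, @Cacheable also has a sync attribute that asks the provider to lock per key while the value is computed; how strong that guarantee is depends on the cache provider.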

11. Read-Repair

If stale data detected → fix cache during read. Not supported via @Cacheable.

public User getUser(Long id) {
    User cached = cache.get(id);

    // Fresh hit: serve straight from the cache
    if (cached != null && !isStale(cached)) {
        return cached;
    }

    // Miss or stale entry: reload from the DB and repair the cache
    User fresh = userRepository.findById(id).orElse(null);
    if (fresh != null) {
        cache.put(id, fresh); // repair
    } else {
        cache.remove(id); // row is gone; drop the stale entry (assumes a Map-like cache)
    }
    return fresh;
}

12. Event-Driven Cache Invalidation

Use events (Kafka, etc.) to invalidate/update cache entries.
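
A minimal in-process sketch of the idea follows. The UserChanged event and the onEvent method are hypothetical; in a real setup onEvent would be called by a Kafka (or similar) consumer. It uses a Java record, so it needs Java 16 or later.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Event-driven invalidation: a change event evicts the affected cache entry.
final class InvalidatingUserCache {
    // Hypothetical event shape; a real one would be deserialized from a topic.
    record UserChanged(long userId) {}

    private final Map<Long, String> cache = new ConcurrentHashMap<>();

    void put(long id, String user) {
        cache.put(id, user);
    }

    String get(long id) {
        return cache.get(id);
    }

    // Called by the event consumer for every change event.
    void onEvent(UserChanged event) {
        cache.remove(event.userId());
    }

    public static void main(String[] args) {
        InvalidatingUserCache cache = new InvalidatingUserCache();
        cache.put(1L, "alice");
        cache.onEvent(new UserChanged(1L));
        System.out.println(cache.get(1L)); // prints null
    }
}
```

Evicting rather than updating keeps the handler idempotent: receiving the same event twice is harmless.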

Thursday, 19 March 2026

Amazon Web Services (AWS) EventBridge - “Kafka-lite, fully managed, rule-based event routing”

Introduction
The flow is as follows
Webhooks → simple HTTP push (external trigger mechanism)
Amazon EventBridge → event router (central nervous system)
AWS Lambda → code runner (brain doing the work)
Here is a sample architecture. In this architecture, webhook calls do not trigger the microservices directly; they first go into AWS EventBridge and are routed from there. The important points of this architecture are as follows
The Patterns Nobody Documents
Here’s what I learned building this that you won’t find in AWS documentation.

Pattern 1: Event Normalization at the Edge
Don’t let raw external events onto your bus. Ever. Your webhook handler should transform vendor-specific payloads into domain events. When we integrated PayPal, our services didn’t care. They still received payment.completed events with the same schema.
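A rough sketch of such a normalizing handler is below. The payload field names ("id", "amount", "txn", "value", ...) are made up for illustration and are not the real vendor schemas; PaymentCompleted is a hypothetical domain event. It uses a Java record (Java 16+).

```java
import java.math.BigDecimal;
import java.util.Locale;
import java.util.Map;

// Edge normalization: vendor payloads in, one domain event shape out.
final class PaymentEventNormalizer {
    // Hypothetical domain event that every internal service consumes.
    record PaymentCompleted(String type, String paymentId, long amountCents, String currency) {}

    // Assumed card-processor payload: amount already in cents, lower-case currency.
    static PaymentCompleted fromCardProcessor(Map<String, Object> payload) {
        return new PaymentCompleted(
                "payment.completed",
                (String) payload.get("id"),
                ((Number) payload.get("amount")).longValue(),
                ((String) payload.get("currency")).toUpperCase(Locale.ROOT));
    }

    // Assumed PayPal-like payload: decimal string amount, different field names.
    static PaymentCompleted fromPayPal(Map<String, Object> payload) {
        long cents = new BigDecimal((String) payload.get("value"))
                .movePointRight(2)
                .longValueExact(); // throws if the amount has sub-cent precision
        return new PaymentCompleted(
                "payment.completed",
                (String) payload.get("txn"),
                cents,
                ((String) payload.get("currency_code")).toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Map<String, Object> paypal = Map.of("txn", "p-1", "value", "12.50", "currency_code", "USD");
        System.out.println(fromPayPal(paypal));
    }
}
```

Downstream services only ever see PaymentCompleted, so adding a vendor means adding one more mapper at the edge.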

Pattern 2: Event Versioning from Day One
We screwed this up initially. Six months in, we needed to change the event schema. Half our services were still consuming v1 events. Now every event includes a version field, and EventBridge rules route based on version. Services can migrate on their own schedule.
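The routing idea can be imitated in-process as below. This is only an analogy: in EventBridge the matching would be done by rules on the event pattern, while here VersionedRouter, Event and subscribe are hypothetical names. It uses a Java record (Java 16+).

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Version-based routing: each target subscribes to one (type, version) pair.
final class VersionedRouter {
    // Hypothetical event envelope with an explicit version field.
    record Event(String type, int version, Map<String, Object> detail) {}

    private final Map<String, List<Consumer<Event>>> targets = new ConcurrentHashMap<>();

    private static String ruleKey(String type, int version) {
        return type + "#v" + version;
    }

    void subscribe(String type, int version, Consumer<Event> target) {
        targets.computeIfAbsent(ruleKey(type, version), k -> new CopyOnWriteArrayList<>()).add(target);
    }

    // Delivers to the targets of the matching version only; returns how many were called.
    int publish(Event event) {
        List<Consumer<Event>> matched =
                targets.getOrDefault(ruleKey(event.type(), event.version()), List.of());
        matched.forEach(target -> target.accept(event));
        return matched.size();
    }

    public static void main(String[] args) {
        VersionedRouter router = new VersionedRouter();
        router.subscribe("payment.completed", 1, e -> System.out.println("v1 consumer got " + e.detail()));
        router.subscribe("payment.completed", 2, e -> System.out.println("v2 consumer got " + e.detail()));
        router.publish(new Event("payment.completed", 2, Map.of("amountCents", 1250)));
    }
}
```

A service that has not migrated simply keeps its v1 subscription until it is ready.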

Pattern 3: Dead Letter Queues for Everything
This saved us during Black Friday. A bug in the inventory service caused it to reject 15% of order.created events. Because we had DLQs configured, those events sat safely in a queue while we fixed the bug, then we replayed them. Zero lost orders.
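The park-then-replay mechanics can be sketched in a few lines. This is an in-memory illustration with hypothetical names (DeadLetterQueueDemo, setHandler); a real DLQ would be a durable queue, and this version is not thread-safe.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

// Dead letter queue sketch: failed events are parked, then replayed after the fix.
final class DeadLetterQueueDemo<E> {
    private final Deque<E> deadLetters = new ArrayDeque<>();
    private Consumer<E> handler;

    DeadLetterQueueDemo(Consumer<E> handler) {
        this.handler = handler;
    }

    // Stands in for deploying the bug fix.
    void setHandler(Consumer<E> handler) {
        this.handler = handler;
    }

    // A failing handler does not lose the event; it goes to the DLQ instead.
    void deliver(E event) {
        try {
            handler.accept(event);
        } catch (RuntimeException e) {
            deadLetters.addLast(event);
        }
    }

    int deadLetterCount() {
        return deadLetters.size();
    }

    // Re-delivers everything currently parked; returns how many succeeded this time.
    int replay() {
        int parked = deadLetters.size();
        for (int i = 0; i < parked; i++) {
            deliver(deadLetters.pollFirst()); // failures are re-queued by deliver()
        }
        return parked - deadLetters.size();
    }

    public static void main(String[] args) {
        DeadLetterQueueDemo<String> bus = new DeadLetterQueueDemo<>(e -> {
            throw new IllegalStateException("inventory bug");
        });
        bus.deliver("order.created#1");
        bus.setHandler(e -> System.out.println("processed " + e));
        System.out.println("replayed: " + bus.replay());
    }
}
```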

Pattern 4: Archive Anything That Touches Money
EventBridge archiving is criminally underused. We archive every payment-related event for 90 days. When customers dispute charges, we have perfect audit trails. When the finance team needs transaction reports, we replay archived events. Cost? $47/month for 2.1M archived events.

Tuesday, 17 March 2026

MIT License

MIT vs LGPL
The explanation is as follows
LGPL says, “you can use this code, but if you change it, you must share your changes under the same terms.” MIT says, “Do whatever you want.” One protects the community. The other lets corporations take without giving back.


LGPL - GNU Lesser General Public License

LGPL
The LGPL (GNU Lesser General Public License) was created because the GPL was considered too restrictive.
The explanation is as follows
LGPL says, “you can use this code, but if you change it, you must share your changes under the same terms.”
If We Use LGPL Code and Distribute Our Application (Distribution)
In my opinion this is the most important point where the GPL and the LGPL diverge. If we use LGPL software and sell our own product, we do not have to open our source code. Only if we change the LGPL code itself must those changes be shared under the LGPL. The explanation is as follows.
Yes, you can distribute your software without making the source code public and without giving recipients the right to make changes to your software.

The LGPL license explicitly allows such usages of libraries/packages released under that license.

Tuesday, 10 March 2026

Medallion Architecture

Introduction
The explanation is as follows
Medallion architecture is a data design pattern that organizes data into three layers:

Bronze Layer (Raw):
  • Data ingested in its original format
  • Minimal transformation
  • Append-only historical record
  • No data quality enforcement
Silver Layer (Refined):
  • Cleaned and conformed data
  • Schema enforced
  • Deduplicated
  • Validated
  • Still fairly granular
Gold Layer (Curated):
  • Business-level aggregations
  • Denormalized for consumption
  • Optimized for specific use cases
  • Analytics-ready
Origin: Popularized by Databricks around 2019-2020 as part of the lakehouse pattern.
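
The three layers can be illustrated with a toy in-memory pipeline. This is only a sketch: the CSV layout "orderId,region,amountCents" and the names MedallionPipeline, toSilver and toGold are assumptions, and a real lakehouse would do this over tables rather than lists. It uses a Java record (Java 16+).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Bronze (raw lines) → Silver (clean, deduplicated rows) → Gold (aggregates).
final class MedallionPipeline {
    record Sale(String orderId, String region, long amountCents) {}

    // Bronze → Silver: enforce the schema, validate, conform and deduplicate.
    static List<Sale> toSilver(List<String> bronze) {
        Map<String, Sale> byOrderId = new LinkedHashMap<>();
        for (String line : bronze) {
            String[] fields = line.split(",");
            if (fields.length != 3) {
                continue;                                   // schema enforcement
            }
            try {
                long cents = Long.parseLong(fields[2].trim());
                if (cents < 0) {
                    continue;                               // validation
                }
                String orderId = fields[0].trim();
                String region = fields[1].trim().toUpperCase(Locale.ROOT); // conform
                byOrderId.putIfAbsent(orderId, new Sale(orderId, region, cents)); // dedupe
            } catch (NumberFormatException e) {
                // invalid amount: the row stays in bronze only
            }
        }
        return List.copyOf(byOrderId.values());
    }

    // Silver → Gold: business-level aggregation, here total sales per region.
    static Map<String, Long> toGold(List<Sale> silver) {
        return silver.stream().collect(
                Collectors.groupingBy(Sale::region, TreeMap::new, Collectors.summingLong(Sale::amountCents)));
    }

    public static void main(String[] args) {
        List<String> bronze = List.of("o1,us,100", "o1,us,100", "o2,eu,abc", "o3,US,50", "broken");
        System.out.println(toGold(toSilver(bronze))); // prints {US=150}
    }
}
```

Bronze is never modified here; the bad rows are simply not promoted, so they remain available for reprocessing.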

Monday, 23 February 2026

Source code comments and Decision Context

Introduction
This post grew out of the Code Review process. The Code Review process had an item like the following.

1. Source code comments are sufficient:
The criteria are usually worded like this. This is where subjectivity comes to the fore.
  • If there is a comment, does it explain why the code does what it does?
  • Is each line of the code - in its context - either self-explanatory enough that it does not need a comment, or if not, is it accompanied by a comment which closes that gap?
  • Can the code be changed so it does not need a comment any more?
In some safety-critical projects a comment is required for every line. Then the job is arguably a bit easier: it is enough to check every line. The code looks like this.
/* Display an error message */
function display_error_message( $error_message )
{
  /* Display the error message */
  echo $error_message;

  /* Exit the application */
  exit();
}

/* -------------------------------------------------------------------- */

/* Check if the configuration file does not exist, then display an error */
/* message */
if ( !file_exists( 'C:/xampp/htdocs/essentials/configuration.ini' ) ) {
  /* Display an error message */
  display_error_message( 'Error: ...');
}
Some examples of what should be done
- Source code conforms to coding standard and is checked by automated tool
- Source code is checked manually by reviewer if automation is not possible
- Source code is checked for memory leaks by a dedicated tool
- Source code is compatible and traceable to SRS

2. The Pattern I Notice in Every High-Quality Codebase
High-quality code carries an explanation of the decision, that is, the "why". The explanation is as follows
I've started noticing four types of decision context that great codebases maintain:
...
Without this context, all code looks equally arbitrary.
1. Business context — Why this business rule exists
Here is an example
// Stripe charges 2.9% + $0.30 per transaction
// We pass this through to users on transactions <$10
// For larger transactions, we absorb it (reduces churn by 8%)
const FEE_THRESHOLD = 1000; // in cents
2. Historical context — Why we chose this approach
Here is an example
// We tried async/await here but hit deadlocks under load
// See incident post-mortem: docs/incidents/2024-01-15-deadlock.md
// Synchronous approach is slower but reliable
fn process_batch_sync(items: Vec<Item>) -> Result<()> {
3. Constraint context — What limits our options
Here is an example
// API rate limited to 100 req/min per docs/api-limits.md
// We batch requests to stay under limit with 20% safety margin
const maxRequestsPerMinute = 80
4. Future context — What we plan to change
Here is an example
// TODO: Move to event-driven architecture
// Blocked on: Kafka cluster provisioning (INFRA-445)
// Timeline: Q2 2024
// This polling approach is temporary
pollForUpdates();




Wednesday, 18 February 2026

Data Models

Introduction
I first saw the article (10 Data Models Every Data Engineer Must Know (Before They Break Production)) here.

10. Star Schema: The Legacy Workhorse (That Fails at Scale)
The explanation is as follows.
Star schemas are intuitive and analyst-friendly, but at scale they become a performance bottleneck, especially with massive fact tables, high-cardinality dimensions, and near-real-time workloads.
9. Snowflake Schema: Over-Engineered & Slow
The explanation is as follows.
Snowflake schemas optimize storage, not query performance. In modern analytics (cloud OLAP, dashboards, ad-hoc queries), compute is the bottleneck, not disk. Excessive normalization explodes join depth and kills latency.
8. Data Vault: The Enterprise Monster (When You Need Auditability)
The explanation is as follows.
Data Vault excels at auditability, lineage, and full historization, critical for regulated industries (banking, healthcare). But its multi-layer architecture makes it fundamentally unsuited for low-latency analytics.
7. Wide-Column Stores (Cassandra, Bigtable) for Time-Series Chaos
The explanation is as follows.
Wide-column databases dominate high-velocity ingest (IoT, metrics, logs) where writes never stop. But they sacrifice query flexibility: no joins, limited filtering, and rigid access patterns. You win on writes, lose on exploration.
6. Graph Models (Neo4j, TigerGraph) for Hidden Relationships
The explanation is as follows.
When insight lives in relationships (fraud rings, social influence, network hops), relational joins collapse under recursive depth. Graph databases treat relationships as first-class citizens, making multi-hop traversals fast and natural.
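What a multi-hop traversal means can be shown with a plain breadth-first search over an adjacency map. This is a JDK-only sketch with hypothetical names (GraphHops, withinHops); it is not how Neo4j or TigerGraph are queried.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Multi-hop traversal: who is reachable from a node within N relationship hops?
final class GraphHops {
    private final Map<String, Set<String>> adjacency = new HashMap<>();

    // Relationships are stored directly on the nodes (undirected here).
    void addEdge(String a, String b) {
        adjacency.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        adjacency.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    // Breadth-first search that stops after maxHops levels.
    Set<String> withinHops(String start, int maxHops) {
        Set<String> seen = new LinkedHashSet<>();
        seen.add(start);
        List<String> frontier = List.of(start);
        for (int hop = 0; hop < maxHops && !frontier.isEmpty(); hop++) {
            List<String> next = new ArrayList<>();
            for (String node : frontier) {
                for (String neighbour : adjacency.getOrDefault(node, Set.of())) {
                    if (seen.add(neighbour)) {
                        next.add(neighbour); // first time we reach this node
                    }
                }
            }
            frontier = next;
        }
        seen.remove(start);
        return seen;
    }

    public static void main(String[] args) {
        GraphHops graph = new GraphHops();
        graph.addEdge("alice", "bob");
        graph.addEdge("bob", "carol");
        graph.addEdge("carol", "dave");
        System.out.println(graph.withinHops("alice", 2)); // prints [bob, carol]
    }
}
```

Each hop only touches the neighbours of the current frontier, whereas the relational version needs one more self-join per hop.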
5. Streaming Event Sourcing (Kafka + CDC)
The explanation is as follows.
Batch ETL is fundamentally incompatible with real-time systems. CDC turns database mutations into immutable events, enabling near-zero-latency pipelines, replayable state, and system-wide consistency across microservices.
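"Replayable state" can be illustrated by folding a change log into a map. The sketch below is an assumption-heavy toy: Change, Op and the offset field are hypothetical, and the log is assumed to be ordered by offset, as a single Kafka partition would be. It needs Java 16+ for records and arrow-style switch.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// CDC replay: rebuild table state at any point by folding immutable change events.
final class CdcReplay {
    enum Op { INSERT, UPDATE, DELETE }

    // Hypothetical change event; real CDC tools emit richer envelopes.
    record Change(long offset, Op op, String key, String value) {}

    // Applies every change with offset <= upToOffset, in log order.
    static Map<String, String> replay(List<Change> log, long upToOffset) {
        Map<String, String> state = new TreeMap<>();
        for (Change change : log) {
            if (change.offset() > upToOffset) {
                break; // log is assumed to be ordered by offset
            }
            switch (change.op()) {
                case INSERT, UPDATE -> state.put(change.key(), change.value());
                case DELETE -> state.remove(change.key());
            }
        }
        return state;
    }

    public static void main(String[] args) {
        List<Change> log = List.of(
                new Change(1, Op.INSERT, "k1", "a"),
                new Change(2, Op.UPDATE, "k1", "b"),
                new Change(3, Op.INSERT, "k2", "c"),
                new Change(4, Op.DELETE, "k1", null));
        System.out.println(replay(log, 2)); // prints {k1=b}
        System.out.println(replay(log, 4)); // prints {k2=c}
    }
}
```

Because the events are never mutated, any consumer can rebuild its own view from the same log at its own pace.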
4. Columnar Storage (Parquet, Delta Lake) for Cheap, Fast Analytics
Parquet is one example
The explanation is as follows.
Row-based databases are optimized for point lookups, not scans. Analytics workloads read a few columns across billions of rows, exactly what columnar storage is built for. The result: orders-of-magnitude faster queries at a fraction of the cost.
Example
We do it like this
CREATE TABLE sales_parquet (
    order_id   BIGINT,
    region     STRING,
    amount     DECIMAL(10,2),
    order_ts   TIMESTAMP,
    order_date DATE  -- partition column; must be declared before it can be used below
)
USING PARQUET
PARTITIONED BY (region, order_date);

SELECT
    region,
    SUM(amount) AS total_sales
FROM sales_parquet
WHERE order_date = '2025-12-25'
  AND region = 'US'
GROUP BY region;
The explanation is as follows.
Why this is fast
- Only amount and region columns are read
- Only the order_date=2025-12-25 and US partitions are scanned
- All other files are skipped entirely
3. Multi-Model Hybrids (When SQL + NoSQL Collide)
The explanation is as follows. What matters here is that the database supports JSONB columns
Real-world data is rarely one shape. Modern apps mix relational facts, semi-structured JSON, and relationships. Multi-model databases let you query everything in one place, without forcing awkward ETL or duplicating data.
2. Reverse ETL (Operational Analytics) to Put Data Back in Apps

1. The Unified Serving Layer (The Future of Production Data)
One dataset. Many engines. Zero rewrites. The explanation is as follows
Modern data stacks fracture data across OLTP, OLAP, search, and streaming systems, creating sync lag and duplicated logic. A Unified Serving Layer uses one logical data layer (Iceberg/Hudi/Delta) with multiple access modes: SQL analytics, near-real-time reads, ML, and even graph/search workloads.