Monday, 23 February 2026

Source code comments and Decision Context

Introduction
This post grew out of the Code Review process. The review checklist included the following item.

1. Source code comments are sufficient:
The checklist questions usually read as follows. This is exactly where subjectivity comes to the fore.
  • If there is a comment, does it explain why the code does what it does?
  • Is each line of the code - in its context - either self-explanatory enough that it does not need a comment, or if not, is it accompanied by a comment which closes that gap?
  • Can the code be changed so it does not need a comment any more?
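The last question can be illustrated with a small before/after sketch. The function names, the constants, and the discount rule are hypothetical, invented only for illustration, and the sketch is in Python rather than the PHP used in the snippet further down.

```python
# Before: the comment has to explain what the condition means.
def price_before(total_cents, customer_years):
    # customers with us for 3+ years get 10% off
    if customer_years >= 3:
        return total_cents * 90 // 100
    return total_cents


# After: the names carry the meaning, so the "what" comment is not needed.
LOYALTY_YEARS = 3              # hypothetical business threshold
LOYALTY_DISCOUNT_PERCENT = 10  # hypothetical discount


def is_loyal_customer(customer_years):
    return customer_years >= LOYALTY_YEARS


def price_after(total_cents, customer_years):
    if is_loyal_customer(customer_years):
        return total_cents * (100 - LOYALTY_DISCOUNT_PERCENT) // 100
    return total_cents
```

Both versions behave the same. What is still worth a comment in the second one is the reason the rule exists, not a description of what the line does.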
Some safety-critical projects require a comment on every line. The review then seems a bit easier: it is enough to look at each line. The code ends up looking like this.
/* Display an error message */
function display_error_message( $error_message )
{
  /* Display the error message */
  echo $error_message;

  /* Exit the application */
  exit();
}

/* -------------------------------------------------------------------- */

/* Check if the configuration file does not exist, then display an error */
/* message */
if ( !file_exists( 'C:/xampp/htdocs/essentials/configuration.ini' ) ) {
  /* Display an error message */
  display_error_message( 'Error: ...');
}
Some examples of what should be done:
- Source code conforms to coding standard and is checked by automated tool
- Source code is checked manually by reviewer if automation is not possible
- Source code is checked for memory leaks by a dedicated tool
- Source code is compatible and traceable to SRS

2. The Pattern I Notice in Every High-Quality Codebase
High-quality code carries an explanation of the decision, that is, the "why". The explanation is as follows.
I've started noticing four types of decision context that great codebases maintain:
...
Without this context, all code looks equally arbitrary.
1. Business context — Why this business rule exists
An example:
// Stripe charges 2.9% + $0.30 per transaction
// We pass this through to users on transactions <$10
// For larger transactions, we absorb it (reduces churn by 8%)
const FEE_THRESHOLD = 1000; // in cents
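A minimal Python sketch of how the rule in that comment might be applied. The function names and the round-half-up rounding are assumptions made for illustration; the source only gives the rate, the fixed fee, and the threshold.

```python
FEE_THRESHOLD_CENTS = 1000  # same value as FEE_THRESHOLD above, in cents
PERCENT_FEE_BPS = 290       # 2.9% expressed in basis points
FIXED_FEE_CENTS = 30        # $0.30


def processing_fee_cents(amount_cents: int) -> int:
    """Processor fee, with the percentage part rounded half up (an assumption)."""
    return (amount_cents * PERCENT_FEE_BPS + 5000) // 10000 + FIXED_FEE_CENTS


def fee_charged_to_user(amount_cents: int) -> int:
    """Pass the fee through below the threshold, absorb it otherwise."""
    if amount_cents < FEE_THRESHOLD_CENTS:
        return processing_fee_cents(amount_cents)
    return 0
```

Working in integer cents and basis points avoids floating-point drift in money calculations.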
2. Historical context — Why we chose this approach
An example:
// We tried async/await here but hit deadlocks under load
// See incident post-mortem: docs/incidents/2024-01-15-deadlock.md
// Synchronous approach is slower but reliable
fn process_batch_sync(items: Vec<Item>) -> Result<()> {
3. Constraint context — What limits our options
An example:
// API rate limited to 100 req/min per docs/api-limits.md
// We batch requests to stay under limit with 20% safety margin
const maxRequestsPerMinute = 80
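The value 80 follows from the documented limit and the safety margin, and a sketch can make that derivation explicit. Note that the code below is a sliding-window limiter, not the batching the comment mentions; the class and method names are hypothetical, and the caller passes the clock value in so the behaviour is easy to check.

```python
from collections import deque

API_LIMIT_PER_MINUTE = 100   # from the comment above
SAFETY_MARGIN_PERCENT = 20   # from the comment above
MAX_REQUESTS_PER_MINUTE = API_LIMIT_PER_MINUTE * (100 - SAFETY_MARGIN_PERCENT) // 100


class SlidingWindowLimiter:
    """Allows at most max_requests within any window_seconds interval."""

    def __init__(self, max_requests, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._sent = deque()  # timestamps of requests still inside the window

    def try_acquire(self, now):
        # Drop timestamps that have left the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            self._sent.append(now)
            return True
        return False
```

In real use `now` would typically come from `time.monotonic()`.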
 4. Future context — What we plan to change
An example:
// TODO: Move to event-driven architecture
// Blocked on: Kafka cluster provisioning (INFRA-445)
// Timeline: Q2 2024
// This polling approach is temporary
pollForUpdates();




Wednesday, 18 February 2026

Data Models

Introduction
I first came across the article (10 Data Models Every Data Engineer Must Know (Before They Break Production)) here.

10. Star Schema: The Legacy Workhorse (That Fails at Scale)
The explanation is as follows.
Star schemas are intuitive and analyst-friendly, but at scale they become a performance bottleneck, especially with massive fact tables, high-cardinality dimensions, and near-real-time workloads.
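For readers who have not seen one, here is a minimal star schema sketch using Python's built-in sqlite3 module: one fact table joined to one dimension table. The table names, column names, and data are hypothetical, and a toy in-memory database says nothing about the scaling problems described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_sales (
        sale_id   INTEGER PRIMARY KEY,
        region_id INTEGER REFERENCES dim_region(region_id),
        amount    INTEGER
    );
    INSERT INTO dim_region VALUES (1, 'US'), (2, 'EU');
    INSERT INTO fact_sales VALUES (1, 1, 100), (2, 1, 250), (3, 2, 75);
""")

# Typical star-schema query: aggregate the fact table, label via the dimension.
rows = conn.execute("""
    SELECT d.name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_region AS d ON d.region_id = f.region_id
    GROUP BY d.name
    ORDER BY d.name
""").fetchall()
```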
9. Snowflake Schema: Over-Engineered & Slow
The explanation is as follows.
Snowflake schemas optimize storage, not query performance. In modern analytics (cloud OLAP, dashboards, ad-hoc queries), compute is the bottleneck, not disk. Excessive normalization explodes join depth and kills latency.
8. Data Vault: The Enterprise Monster (When You Need Auditability)
The explanation is as follows.
Data Vault excels at auditability, lineage, and full historization, critical for regulated industries (banking, healthcare). But its multi-layer architecture makes it fundamentally unsuited for low-latency analytics.
7. Wide-Column Stores (Cassandra, Bigtable) for Time-Series Chaos
The explanation is as follows.
Wide-column databases dominate high-velocity ingest (IoT, metrics, logs) where writes never stop. But they sacrifice query flexibility, no joins, limited filtering, and rigid access patterns. You win on writes, lose on exploration.
6. Graph Models (Neo4j, TigerGraph) for Hidden Relationships
The explanation is as follows.
When insight lives in relationships (fraud rings, social influence, network hops), relational joins collapse under recursive depth. Graph databases treat relationships as first-class citizens, making multi-hop traversals fast and natural.
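A multi-hop traversal can be sketched in plain Python as a breadth-first search with a hop limit. This is only an illustration of the idea, not how Neo4j or TigerGraph implement it; the function name and the account graph are hypothetical.

```python
from collections import deque


def within_hops(graph, start, max_hops):
    """Return {node: hop_count} for nodes reachable from start in <= max_hops."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


# Hypothetical account graph: an edge means "shares a device or card with".
accounts = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
}
```

In a relational model each extra hop is usually another self-join, which is why deep traversals become awkward there.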
5. Streaming Event Sourcing (Kafka + CDC)
The explanation is as follows.
Batch ETL is fundamentally incompatible with real-time systems. CDC turns database mutations into immutable events, enabling near-zero-latency pipelines, replayable state, and system-wide consistency across microservices.
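The "replayable state" idea can be sketched as folding an ordered list of change events into a dictionary. The event shape below (`op`, `key`, `after`) is a simplified assumption for illustration and is not the format of any particular CDC tool.

```python
def apply_event(state, event):
    """Apply one change event to a dict keyed by primary key."""
    op, key = event["op"], event["key"]
    if op in ("insert", "update"):
        # Merge the changed fields into whatever is already known for the row.
        state[key] = {**state.get(key, {}), **event["after"]}
    elif op == "delete":
        state.pop(key, None)
    else:
        raise ValueError(f"unknown op: {op}")
    return state


def replay(events):
    """Rebuild state from scratch by applying events in order."""
    state = {}
    for event in events:
        apply_event(state, event)
    return state


change_log = [
    {"op": "insert", "key": 1, "after": {"status": "new", "amount": 50}},
    {"op": "insert", "key": 2, "after": {"status": "new", "amount": 20}},
    {"op": "update", "key": 1, "after": {"status": "paid"}},
    {"op": "delete", "key": 2},
]
```

Replaying a prefix of the log, for example `replay(change_log[:2])`, gives the state as it was at that point in time.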
4. Columnar Storage (Parquet, Delta Lake) for Cheap, Fast Analytics
Parquet is one example.
The explanation is as follows.
Row-based databases are optimized for point lookups, not scans. Analytics workloads read a few columns across billions of rows, exactly what columnar storage is built for. The result: orders-of-magnitude faster queries at a fraction of the cost.
Example
We do it like this:
CREATE TABLE sales_parquet (
    order_id   BIGINT,
    region     STRING,
    amount     DECIMAL(10,2),
    order_ts   TIMESTAMP,
    order_date DATE  -- partition column; it must be declared to be used in PARTITIONED BY
)
USING PARQUET
PARTITIONED BY (region, order_date);

SELECT
    region,
    SUM(amount) AS total_sales
FROM sales_parquet
WHERE order_date = '2025-12-25'
  AND region = 'US'
GROUP BY region;
The explanation is as follows.
Why this is fast
- Only amount and region columns are read
- Only the order_date=2025-12-25 and US partitions are scanned
- All other files are skipped entirely
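The two effects in that list, partition pruning and column projection, can be mimicked with a toy in-memory layout. Everything here is hypothetical: a real engine reads Parquet files and directory metadata, not Python dictionaries.

```python
# Hypothetical layout: one "file" per (region, order_date) partition,
# each storing its rows column by column.
partitions = {
    ("US", "2025-12-25"): {"order_id": [1, 2], "amount": [10.0, 15.5]},
    ("US", "2025-12-26"): {"order_id": [3], "amount": [7.0]},
    ("EU", "2025-12-25"): {"order_id": [4], "amount": [99.0]},
}


def total_sales(partitions, region, order_date):
    """Sum `amount` while touching only the matching partition and column."""
    scanned = []
    total = 0.0
    for (part_region, part_date), columns in partitions.items():
        if part_region != region or part_date != order_date:
            continue  # partition pruning: this file is never opened
        scanned.append((part_region, part_date))
        total += sum(columns["amount"])  # projection: order_id is not read
    return total, scanned
```

The returned `scanned` list shows that only one of the three partitions was touched for the US query on 2025-12-25.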
3. Multi-Model Hybrids (When SQL + NoSQL Collide)
The explanation is as follows. What matters here is that the database supports JSONB columns.
Real-world data is rarely one shape. Modern apps mix relational facts, semi-structured JSON, and relationships. Multi-model databases let you query everything in one place, without forcing awkward ETL or duplicating data.
2. Reverse ETL (Operational Analytics) to Put Data Back in Apps

1. The Unified Serving Layer (The Future of Production Data)
One dataset. Many engines. Zero rewrites. The explanation is as follows.
Modern data stacks fracture data across OLTP, OLAP, search, and streaming systems, creating sync lag and duplicated logic. A Unified Serving Layer uses one logical data layer (Iceberg/Hudi/Delta) with multiple access modes: SQL analytics, near-real-time reads, ML, and even graph/search workloads.