Wednesday, June 30, 2021

Minimum Spanning Tree

Introduction
The explanation is as follows:
Find the subtree of the input graph that minimizes the total weight of its edges. 
Prim-Dijkstra algorithm
The explanation is as follows:
In Dijkstra's original paper, he talks about two problems related to graphs. The second one is the problem of finding the shortest path between two nodes, which is what is most commonly meant by Dijkstra's algorithm. However, he also poses another problem, namely that of constructing the tree of minimum total length between the n nodes of the connected graph.
The explanation is as follows:
The algorithm here suggested by Dijkstra is today known as Prim's algorithm.
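A minimal sketch of Prim's algorithm (illustrative names; the graph is assumed to be an undirected adjacency list): grow the tree from node 0 and always take the cheapest edge that reaches a node not yet in the tree.
import java.util.*;

// A sketch of Prim's algorithm over an adjacency list.
// graph.get(u) holds int[]{v, weight} pairs for each edge u-v.
public class Prim {
  static int minimumSpanningTreeWeight(List<List<int[]>> graph) {
    int n = graph.size();
    boolean[] inTree = new boolean[n];
    // queue entries are int[]{weight, node}, ordered by weight
    PriorityQueue<int[]> queue = new PriorityQueue<>(Comparator.comparingInt((int[] e) -> e[0]));
    queue.add(new int[]{0, 0}); // start from node 0 with zero cost
    int totalWeight = 0;

    while (!queue.isEmpty()) {
      int[] entry = queue.poll();
      int weight = entry[0], u = entry[1];
      if (inTree[u]) continue;   // already connected, skip
      inTree[u] = true;
      totalWeight += weight;
      for (int[] edge : graph.get(u)) {
        if (!inTree[edge[0]]) queue.add(new int[]{edge[1], edge[0]});
      }
    }
    return totalWeight;
  }
}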
Kruskal's algorithm
The explanation is as follows; a small code sketch follows the list:
Kruskal's algorithm works as follows:
- sort the edges by increasing weight
- repeat: pop the cheapest edge, if it does not create cycles, include it in the MST
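A minimal sketch of these two steps (class and method names are illustrative), using a union-find structure to detect whether an edge would create a cycle:
import java.util.*;

// A sketch of Kruskal's algorithm: sort edges by weight, then take an edge
// only if its endpoints are still in different components (no cycle).
public class Kruskal {
  record Edge(int u, int v, int weight) {}

  static int[] parent;

  static int find(int x) {
    return parent[x] == x ? x : (parent[x] = find(parent[x])); // path compression
  }

  static List<Edge> minimumSpanningTree(int nodeCount, List<Edge> edges) {
    parent = new int[nodeCount];
    for (int i = 0; i < nodeCount; i++) parent[i] = i;

    List<Edge> mst = new ArrayList<>();
    edges.sort(Comparator.comparingInt(Edge::weight)); // sort by increasing weight
    for (Edge e : edges) {
      int ru = find(e.u()), rv = find(e.v());
      if (ru != rv) {      // different components, so no cycle
        parent[ru] = rv;   // union
        mst.add(e);
      }
    }
    return mst;
  }
}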

Tuesday, June 29, 2021

The git .gitattributes File

1. How Should End Of Line Formatting Be Configured
The explanation is as follows. If we work in a mixed Windows and Linux environment, option 1 should be preferred.
1. Checkout Windows-style, commit Unix-style
Git will convert LF to CRLF when checking out text files. When committing text files, CRLF will be converted to LF. For cross-platform projects, this is the recommended setting on Windows ("core.autocrlf" is set to "true")

2. Checkout as-is, commit Unix-style
Git will not perform any conversion when checking out text files. When committing text files, CRLF will be converted to LF. For cross-platform projects this is the recommended setting on Unix ("core.autocrlf" is set to "input").

3. Checkout as-is, commit as-is
Git will not perform any conversions when checking out or committing text files. Choosing this option is not recommended for cross-platform projects ("core.autocrlf" is set to "false")
It is not listed here, but in my opinion always checking out and committing Linux-style is best.

2. Methods For Applying Formatting and Whitespace
There are two options for formatting and whitespace:
1. Setting the desired value with git config
2. Adding entries to the .gitattributes file

Of These Two Methods, .gitattributes Should Be Preferred
The explanation is as follows. In other words, adding a .gitattributes file to the repository is easier than having everyone run "git config ..." one by one.
Originally, Git for Windows introduced a different approach for line endings that you may have seen: core.autocrlf. This is a similar approach to the attributes mechanism: the idea is that a Windows user will set a Git configuration option core.autocrlf=true and their line endings will be converted to Unix style line endings when they add files to the repository.

The difference between these two options is subtle, but critical: the .gitattributes is set in the repository, so it's shared with everybody. But core.autocrlf is set in the local Git configuration. That means that everybody has to remember to set it, and set it identically.
In short, we do the following:
# linux line-endings
*        text eol=lf
If there are files already added to git and .gitattributes is being added later, after creating the .gitattributes file we open the file in IntelliJ and set the line ending to LF. At this point git thinks the file has changed. Then we do the following:
git add --renormalize .
After doing this, git no longer thinks there is a change.


1. The git config Method
On Windows, to make End Of Line formatting Linux-like, we do the following:
git config --global core.autocrlf true
On Linux, to make End Of Line formatting Linux-like, we do the following:
git config --global core.autocrlf input

2. The .gitattributes File Method
Note: If we are using IntelliJ, the Line separator field under Settings > Editor > Code Style should be set to Unix and macOS (\n). This way everything will always be Linux-like.

The explanation of the text attribute in the .gitattributes file is as follows:
text
This attribute enables and controls end-of-line normalization. When a text file is normalized, its line endings are converted to LF in the repository. To control what line ending style is used in the working directory, use the eol attribute for a single file and the core.eol configuration variable for all text files.
Example - Preserving LF For *.sh Files
To keep bash scripts from being broken when checking out on a Windows machine, we do the following:
*.sh text eol=lf
I think the same thing can also be done like this:
# Never modify line endings of our bash scripts
*.sh -crlf
Example - Mixed Usage Based On File Extension
We do the following:
# auto
*           text=auto     
*.txt       text

# windows line-endings
*.vcproj    text eol=crlf 

# linux line-endings
*.sh        text eol=lf   

*.jpg       -text
Example - auto normalization
We do the following:
* text=auto
The explanation is as follows. This way, even if we commit from Windows, files are stored as LF in the repository. Windows gets them as CRLF on checkout, while Linux gets them as LF.
With this set, Windows users will have text files converted from Windows style line endings (\r\n) to Unix style line endings (\n) when they’re added to the repository.


Monday, June 28, 2021

Software Configuration Management Process - Semantic Versioning

Introduction
The explanation is as follows:
According to semantic versioning, a version number has 3 parts: MAJOR, MINOR and PATCH.

MAJOR version change: Incremented when you make incompatible API changes. Consumers that use the service will be affected.
MINOR version change: when you add functionality in a backwards-compatible manner.
PATCH version change: when you make backwards-compatible bug fixes.
Example
We do the following:
1.2.3
^ ^ ^
| | |
| | +--- Minor bugs, spelling mistakes, etc.
| +----- Minor features, major bug fixes, etc.
+------- Major version, UX changes, file format changes, etc.
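A minimal sketch (the SemVer class is illustrative, not the official semver.org reference implementation) that parses MAJOR.MINOR.PATCH and compares two versions part by part:
// Parse "MAJOR.MINOR.PATCH" and compare versions numerically, part by part.
public class SemVer implements Comparable<SemVer> {
  final int major, minor, patch;

  SemVer(String version) {
    String[] parts = version.split("\\.");
    this.major = Integer.parseInt(parts[0]);
    this.minor = Integer.parseInt(parts[1]);
    this.patch = Integer.parseInt(parts[2]);
  }

  @Override
  public int compareTo(SemVer other) {
    if (major != other.major) return Integer.compare(major, other.major);
    if (minor != other.minor) return Integer.compare(minor, other.minor);
    return Integer.compare(patch, other.patch);
  }

  public static void main(String[] args) {
    // 1.2.3 is older than 1.10.0: the comparison is numeric, not lexicographic
    System.out.println(new SemVer("1.2.3").compareTo(new SemVer("1.10.0")) < 0); // true
  }
}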

Friday, June 25, 2021

gRPC SpringBoot Service

Introduction
A gRPC service can also be used inside SpringBoot; grpc-spring-boot-starter is used. An example is here.
1. The gRPC service is defined as a Spring component
2. Wherever this service is to be used, it is injected into the code with the @GrpcClient annotation

Example
To make the service a Spring component, we do the following:
import io.grpc.stub.StreamObserver;
import org.lognet.springboot.grpc.GRpcService;

@GRpcService
public class UserDetailsService extends
UserDetailsServiceGrpc.UserDetailsServiceImplBase {
...
}
For its methods, we do the following. Here both Unary and Bi-Directional Stream calls can be seen.
@Override
public void generateRandomUser(UserDetailsRequest request,
    StreamObserver<UserDetailsResponse> responseObserver) {
  UserDetailsResponse output = UserDetailsResponse.newBuilder()
    .setCity(...)
    ...
    .build();
  responseObserver.onNext(output);
  responseObserver.onCompleted();
}

@Override
public StreamObserver<UserDetailsRequest> generateRandomUserStream(
    StreamObserver<UserDetailsResponse> responseObserver) {
  return new StreamObserver<UserDetailsRequest>() {
    @Override
    public void onNext(UserDetailsRequest input) {
      UserDetailsResponse output = UserDetailsResponse.newBuilder()
        .setCity(...)
        ...
        .build();
      responseObserver.onNext(output);
    }
    @Override
    public void onError(Throwable throwable) {
    }
    @Override
    public void onCompleted() {
      responseObserver.onCompleted();
    }
  };
}
To call the generateRandomUser() method one request at a time, we do the following:
import net.devh.boot.grpc.client.inject.GrpcClient;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Flux;
import reactor.core.scheduler.Schedulers;

import java.util.Map;

@Service
public class UserDetailsGrpcBlockingClient {

  @GrpcClient("UserDetailsService")
  UserDetailsServiceGrpc.UserDetailsServiceBlockingStub userDetailsServiceBlockingStub;

  public Flux<Object> getUserDetailsResponse(Integer range) {
    return Flux.range(1, range)
      .map(i -> UserDetailsRequest.newBuilder()
        .setCity(...)
        ...
        .build())
      .map(request -> {
        UserDetailsResponse response = this.userDetailsServiceBlockingStub
          .generateRandomUser(request);
        return (Object) Map.of(response.getId(), new UserDetails(response));
      })
      .subscribeOn(Schedulers.boundedElastic());
  }
}
To call it as a stream, we do the following. Here the gRPC StreamObserver object is converted into a Flux object.
import io.grpc.stub.StreamObserver;
import net.devh.boot.grpc.client.inject.GrpcClient;
import org.springframework.stereotype.Service;
import reactor.core.publisher.DirectProcessor;
import reactor.core.publisher.Flux;
import reactor.core.publisher.FluxSink;
import reactor.core.scheduler.Schedulers;

@Service
public class UserDetailsGrpcStreamClient {

  @GrpcClient("UserDetailsService")
  private UserDetailsServiceGrpc.UserDetailsServiceStub stub;

  public Flux<Object> generateUserStreamResponse(Integer range) {
    DirectProcessor<Object> processor = DirectProcessor.create();
    StreamObserver<UserDetailsResponse> observer =
      new StreamObserverImpl(processor.sink());
    StreamObserver<UserDetailsRequest> inputStreamObserver =
      this.stub.generateRandomUserStream(observer);
    return Flux.range(1, range)
      .map(i -> UserDetailsRequest.newBuilder()
        .setCity(...)
        ...
        .build())
      .doOnNext(inputStreamObserver::onNext)
      .zipWith(processor, (a, b) -> b)
      .doOnComplete(inputStreamObserver::onCompleted)
      .subscribeOn(Schedulers.boundedElastic());
  }
}
For the conversion, we do the following:
class StreamObserverImpl implements StreamObserver<UserDetailsResponse> {

  final FluxSink<Object> sink;

  public StreamObserverImpl(FluxSink<Object> sink) {
    this.sink = sink;
  }

  @Override
  public void onNext(UserDetailsResponse output) {
    this.sink.next(Map.of(output.getId(), new UserDetails(output)));
  }

  @Override
  public void onError(Throwable throwable) {
    this.sink.error(throwable);
  }

  @Override
  public void onCompleted() {
    this.sink.complete();
  }
}

Endurance Testing

Introduction
Endurance testing is one of the tests performed within the scope of performance testing.

What Is Its Purpose?
The explanation is as follows. In other words, the goal is to keep the system near its upper limit for an extended period. If the system were an engine, this would be like running that engine at its highest RPM for a whole day.
Endurance testing in software testing is a kind of non-functional test that is performed to evaluate the software applications’ behavior under high loads for an extended amount of time. It is performed during the last stage of the performance run cycle, and sometimes, can last for as long as a year.
What Can Be Examined In This Test
Some of the things that can be measured are as follows. In fact, the items in this list are checked in almost every performance test.
The following are some of the goals of running an endurance test:

- Check the system for memory leaks
- Discover how the system performs under prolonged usage
- Ensure that the system’s response time improves after running the test
- Determine the number of transactions/users the system can support while meeting all performance goals
Some Tools For Endurance Testing
Some tools are as follows:
JMeter: This open-source software is freely available and platform-independent. Apache JMeter is a great performance testing tool that can run endurance testing with real-time example scenarios. This testing tool easily integrates with Selenium and can also perform unit testing.

LoadRunner: Considered a leader in performance testing, LoadRunner supports scripts from Selenium and JMeter by declaring an interface library. Similar to JMeter, this endurance test tool also excels at running both integration testing and unit testing. LoadRunner may not be a free tool, but it does allow free trials to a certain number of users.

Appvance: Alongside endurance testing, Appvance can be used for security, performance, and functional testing. This AI-driven automation tool provides a virtual user dashboard and real-time analytics.

OpenSTA: Open System Testing Architecture - commonly known as OpenSTA - is written in C++ by CYRANO and is supported by Microsoft Windows OS. This open-source tool can be used to perform scripted HTTP and HTTPS heavy load tests with performance measures. 

WebLoad Professional: This endurance testing tool supports both Perfecto Mobile and Selenium. You can expect various pricing plans for this performance testing tool. Like NeoLoad, WebLoad Professional offers a free plan with limited users.


Thursday, June 24, 2021

The tan (tangent) Method - tan(angle) = Y/X

Introduction
Think of it as opposite/adjacent. That is,
tan(angle) = Y / X
Example
Suppose we have the angle and the opposite (Y) value. To find the adjacent, i.e. X, we do the following:
Y / Math.tan(angle) 
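A minimal sketch with made-up values (30 degrees and an opposite side of 5):
public class TangentExample {
  public static void main(String[] args) {
    double angle = Math.toRadians(30); // Math.tan expects radians
    double y = 5.0;                    // opposite side
    double x = y / Math.tan(angle);    // adjacent side
    System.out.println(x);             // ~8.66
    // Going the other way: recover the angle from the two sides
    System.out.println(Math.toDegrees(Math.atan2(y, x))); // ~30.0
  }
}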

Monday, June 21, 2021

HTTP Strict Transport Security - HSTS - For Mandatory HTTPS Usage

Introduction
It is for mandatory HTTPS usage. It can be used instead of HTTP 3xx Redirection codes. The explanation is as follows:
HTTP Strict Transport Security (also named HSTS) is an opt-in security enhancement that is specified by a web application through the use of a special response header. Once a supported browser receives this header that browser will prevent any communications from being sent over HTTP to the specified domain and will instead send all communications over HTTPS. It also prevents HTTPS click through prompts on browsers.
The explanation is as follows:
It allows web servers to declare that web browsers (or other complying user agents) should automatically interact with it using only HTTPS connections
Period Of Time
The explanation is as follows:
HSTS Policy specifies a period of time during which the user agent should only access the server in a secure fashion.
Example
In the response we see the following:
Request: https://www.google.com/?gws_rd=ssl;
Response: Status Code: 200
          strict-transport-security: max-age=31536000
Example
In the response we see the following:
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload 
The explanation for preload is as follows:
If the site owner would like their domain to be included in the HSTS preload list maintained by Chrome (and used by Firefox and Safari), then use the header preload
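A minimal sketch of how a server can emit this header (HstsFilter and the max-age value are illustrative; frameworks such as Spring Security can also add it for you):
import jakarta.servlet.*;
import jakarta.servlet.http.HttpServletResponse;
import java.io.IOException;

// A servlet filter that adds the HSTS header to every response.
public class HstsFilter implements Filter {
  @Override
  public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
      throws IOException, ServletException {
    HttpServletResponse httpResponse = (HttpServletResponse) response;
    // One year max-age, applied to subdomains as well
    httpResponse.setHeader("Strict-Transport-Security", "max-age=31536000; includeSubDomains");
    chain.doFilter(request, response);
  }
}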

Wednesday, June 16, 2021

Taylor Series

Introduction
The explanation is as follows. I think the Taylor series is used in cases where computing the actual function is expensive but an approximate value is good enough.
One practical reason for choosing a Taylor Series approximation of a function over the function itself is if you are able to compute using only the four arithmetic operations. For example, if you are asked to find the cosine of an angle and the only computing device you have is a four-function calculator, then you can get a good approximation of the cosine of the angle using the first few terms of the Taylor Series of the cosine function.
The explanation is as follows. In other words, if we do not have a math library at hand, trigonometric calculations can be done using a Taylor Series.
A : Taylor series and other numerical methods are maybe less relevant from a formal education standpoint, but they are an incredibly nifty suite of tools to have in your back pocket if you ever find yourself programming in an environment without a proper math library (or need to approximate something weird like the error function).

A : Taylor series are literally how computers compute all trig and log functions, except maybe some that use newton approximation (sqrt comes to mind). Of course there's lookup tables and all kinds of speedups, but if I ask you "calculate sin(x), you may add, subtract, multiply, and divide", how will you ever do it if you don't know about Taylor? 
Example - sin
The Maclaurin series of sin(x) and cos(x) are:
$\sin(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots$
$\cos(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots$

Example - exp
We do the following:
$\exp(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \frac{x^4}{24} + \cdots$
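A minimal sketch (illustrative names) that approximates sin(x) with the first few terms of its Maclaurin series, using only the four arithmetic operations:
public class TaylorSin {
  static double sinTaylor(double x, int terms) {
    double term = x;  // first term of the series: x
    double sum = x;
    for (int n = 1; n < terms; n++) {
      // each term is the previous one times -x^2 / ((2n) * (2n + 1))
      term *= -x * x / ((2 * n) * (2 * n + 1));
      sum += term;
    }
    return sum;
  }

  public static void main(String[] args) {
    double x = 1.0;
    System.out.println(sinTaylor(x, 10)); // ~0.8414709848
    System.out.println(Math.sin(x));      // reference value
  }
}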



Software Architecture - Replica/Replication

Introduction
The Turkish equivalent of the word replication is çoğaltma.

Replication vs. Cache
The explanation is as follows. Cache is for the latency problem. The post "Yazılım Mimarisi - Cache" covers caching, the other method used for scaling.
From the perspective of scalability in distributed system design, cache and replication are used for different goals. Cache is in memory and is used to improve the latency. Replication is still in disk and is used to scale out read throughput and enhance durability.
Replication and Scaling
Replication is one of the methods used for scaling. The explanation is as follows:
Caching is one of the two ways(the other is replication) to scale read heavy applications. 
The explanation is as follows:
There are many techniques to scale a relational database: master-slave replication, master-master replication, federation, sharding, denormalization, and SQL tuning.
- Replication usually refers to a technique that allows us to have multiple copies of the same data stored on different machines.
- Federation (or functional partitioning) splits up databases by function.
- Sharding is a database architecture pattern related to partitioning by putting different parts of the data onto different servers and the different user will access different parts of the dataset
- Denormalization attempts to improve read performance at the expense of some write performance: copies of the data are written to multiple tables to avoid expensive joins.
- SQL tuning.
Data Replication vs. Data Synchronization
The explanation is as follows. That is, Data Replication takes place within the same database, whereas Data Synchronization means bringing different databases into a consistent state.
Data Replication:
Data replication involves creating multiple copies of data and distributing them across different systems or nodes(usually called standbys).

Data Synchronization:
Data synchronization, on the other hand, focuses on maintaining consistency and accuracy between the source of truth and other data sources.
Naive Methods of Data Replication
1. The source database keeps the changes to be sent in memory. If the connection to the target database is lost, replication breaks down because the source system's memory will not be enough.

1. Primary Replica - Master-slave replication
The figure is as follows.
The explanation is as follows:
Only the primary DB host handles DB updates. The update on primary is synced to replicas via bin log replay. Most mainstream databases like MySQL have built in support for this setup. Read request is load balanced(LB) to the replicas.
2. Primary Replica Weaknesses
2.1 Primary Failure
The explanation is as follows:
Github has shared their solution (here and here). The idea is to have a separate system that constantly monitors the status of master and the lag on each replica. The monitor will detect the primary’s failure and adjust the network topology to promote one replica as the new primary. This requires being exposed to many low level network details. I find it intimidating to depend on unfamiliar open source projects doing tricky stuff on the network.

Many NoSQL databases have symmetric hosts thus have good support for node failures. I believe the main benefit today from a NoSQL database like Cassandra is the ease of operation.
2.2 Consistency
The explanation is as follows:
The primary replica set up will result in update delay in replicas and is a classic eventual consistency model. Essentially we trade strong consistency for read scalability. Eventual consistency is enough for most applications, except for ones requiring ‘read your write’ consistency.

‘Read your write’ consistency can be improved by forcing the read request to primary if it’s following a write. Or naively force the read to wait for several seconds so that all replicas have caught up. When there are replicas not in the same datacenter(DC), the read will also need to be restricted to the same DC.
2.3 High Watermark
The explanation is as follows. Here the write and read operations go to the Master, but the Master crashes after processing the request and before it can send it to the Replica. The newly elected Master then knows nothing about this operation.
Let's assume, the leader received a write operation. The leader wrote the transaction on the WAL. Let's also take that a consumer read the operation immediately after it was written, and before the operation could be propagated to all the followers, the leader crashed.

Post the leader crash, the cluster would undergo Leader Election, & one of the followers becomes the new leader for that partition. However, the latest changes from the previous leader were not replicated to the new leader, i.e new leader is behind the old leader.

Now let's assume, another consumer tries to read the latest record. Since the new leader doesn't have the latest write, this consumer doesn't know about that record. This leads to data inconsistency/data loss, which is exactly what we didn't want!

Note: We do have these transactions in the WAL on the old leader, but those log entries cannot be recovered until the old leader becomes alive again.
The explanation is as follows:
To overcome the problem, we use the concept of High Watermark.

The leader keeps track of the indexes of the entries that have been successfully replicated on each follower. The high-water mark index is the highest index, which has been replicated on the quorum of the followers.

The leader can push the high-water mark index to all followers as part of a heartbeat message(in case it's a push based model)/leader can respond to the pull request from the followers with the high watermark index.
The explanation is as follows. That is, the master computes the minimum watermark across a quorum of replicas. When this value changes, it announces it to the replicas.
The leader gets pull requests from the followers, with the latest offset they are in sync with. Hence the leader can easily make a call on when to update the high watermark. Once the high watermark is updated on the leader, with the next fetch, the leader will propagate the updated high watermark to the followers.
...
This guarantees that even if the leader fails and another leader is elected, the client will not see any data inconsistencies, as any client would not have read anything beyond the high watermark. This is how we can prevent inconsistent reads while ensuring high availability and resiliency.
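A minimal sketch of the idea (illustrative names, not tied to Kafka or any specific broker): the leader records the latest offset acknowledged by each follower and advances the high watermark to the highest offset replicated on a quorum.
import java.util.*;

// The leader tracks per-follower replicated offsets and computes the high watermark.
public class HighWatermarkTracker {
  private final Map<String, Long> replicatedOffsets = new HashMap<>();
  private final int quorumSize;
  private long highWatermark = -1;

  public HighWatermarkTracker(int quorumSize) {
    this.quorumSize = quorumSize;
  }

  // Called when a follower reports the latest offset it has replicated.
  public synchronized long onFollowerAck(String followerId, long offset) {
    replicatedOffsets.put(followerId, offset);
    List<Long> offsets = new ArrayList<>(replicatedOffsets.values());
    offsets.sort(Collections.reverseOrder());
    if (offsets.size() >= quorumSize) {
      // the quorum-th highest offset has been replicated by at least quorumSize followers
      highWatermark = Math.max(highWatermark, offsets.get(quorumSize - 1));
    }
    return highWatermark; // piggybacked back to followers on the next fetch/heartbeat
  }
}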

3. Master-master replication
The explanation is as follows:
Each database server can act as the master at the same time as other servers are being treated as masters. At some point in time, all of the masters sync up to make sure that they all have correct and up-to-date data.

Here are some advantages of master-master replication.
- If one master fails, the other database servers can operate normally and pick up the slack. When the database server is back online, it will catch up using replication.
- Masters can be located in several physical sites and can be distributed across the network.
- Limited by the ability of the master to process updates.
3.1 Conflict Resolution
Some methods are as follows:
3.1.1 Conflict avoidance
The explanation is as follows:
It is the simplest strategy to avoid conflicts. We just need to ensure that all writes for a particular record goes to the same leader, or more aptly to the same data center. It might look simple, but edge cases, when the entire data center is down or such may hamper the entire application.
3.1.2 Convergent Conflict Resolution
The explanation is as follows:
In multi-leader replication, there is no defined ordering of writes, thus making it unclear what the final value should be. This inconsistency questions the durability of the data and every replication must ensure that the data is the same at all places. This method of handling conflicts can be done in various ways :
- LWW (Last Write Wins) — Each write is given a unique ID and the write with the highest ID is chosen as the winner (see the sketch after this list).
- Give each replica a unique Id and let writes originated at higher-numbered replicas take precedence over the lower counterparts.
- Merge the values.
- Record the conflict in an explicit data structure that preserves all the information, and write application code that resolves the conflict later by notifying the user.
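A minimal sketch of LWW resolution (the LwwRegister name is illustrative): each write carries a unique ID such as a timestamp, and the replica keeps only the value with the highest ID, silently dropping the losing write.
// Last-write-wins register: the write with the highest ID survives.
public class LwwRegister<T> {
  private T value;
  private long writeId = Long.MIN_VALUE; // e.g. a timestamp or monotonically increasing ID

  public synchronized void write(T newValue, long newWriteId) {
    if (newWriteId > writeId) { // the "last" write wins; the other is dropped
      this.value = newValue;
      this.writeId = newWriteId;
    }
  }

  public synchronized T read() {
    return value;
  }
}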
3.1.3 Custom conflict resolution logic
The explanation is as follows:
Most multi-leader replication tools provide the option to custom define your conflict resolution in the application code. On write, as soon as a conflict is detected, the conflict handler is called, and it runs in the background to resolve it. On read, if a conflict is detected, all conflicting writes are stored and the next time data is read, these multiple versions of the data are returned to the application, which in turn prompts the user or automatically resolve the conflict, and write back to the database.
3.1.4 Automatic conflict resolution
The explanation is as follows:
There has been a lot of research on building automatic conflict resolutions which would be intelligent enough to resolve the conflicts caused by concurrent data modifications.

- Conflict-free replicated data types (CRDTs) are a family of data structures for sets, maps, ordered lists, counters, etc. that can be concurrently edited by multiple users. It uses two-way merges (see the G-Counter sketch after this list).
- Mergeable data structures track history explicitly, similar to Git, and use a three-way merge function
- Operational transformation is the algorithm behind collaborative editing applications such as Google docs. It’s a whole big topic which is very interesting to study.
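A minimal sketch of a CRDT, a grow-only counter (illustrative names): each replica increments only its own slot, and merging two states takes the element-wise maximum, so replicas converge no matter in which order they exchange state.
import java.util.HashMap;
import java.util.Map;

// Grow-only counter (G-Counter) CRDT sketch.
public class GCounter {
  private final String replicaId;
  private final Map<String, Long> counts = new HashMap<>();

  public GCounter(String replicaId) {
    this.replicaId = replicaId;
  }

  public void increment() {
    counts.merge(replicaId, 1L, Long::sum); // only increment our own slot
  }

  public long value() {
    return counts.values().stream().mapToLong(Long::longValue).sum();
  }

  // Merge state received from another replica: element-wise maximum.
  public void merge(GCounter other) {
    other.counts.forEach((id, c) -> counts.merge(id, c, Math::max));
  }
}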

4. Replication and Consistency
Some solutions used to provide both replication and consistency are as follows:
1. Read-Impose Write-Consult-Majority

2. Leader-based Replication 
All write operations are routed to the Leader. The Leader performs the write and distributes the data to the others.

3. Leased-Leader-based Replication