Yazılım Çorbası: Mart 2017

31 Mart 2017 Cuma

Quick Find Algoritması

Giriş
O (N ^2) bir algoritmadır. Çünkü önce N tane union sonra da N tane find işlemi yapılır.

Algoritma
Algoritma şöyledir. parent dizisi bir elemanın en üst atasını saklar.

// Naive implementation of find
int find(int parent[], int i)
{
  if (parent[i] == -1)
    return i;
  return find(parent, parent[i]);
}

// Naive implementation of union()
void Union(int parent[], int x, int y)
{
  int xset = find(parent, x);
  int yset = find(parent, y);
  parent[xset] = yset;
}

Union işlemi için bir uyarının açıklaması şöyle

"And I mentioned that a lot of us would get us wrong. The mistake we might make is to put ID of P here rather than first picking out, that value. And you can think about the implications of that. That's an insidious bug."

Elimizde şöyle bir kod olsun. Union metodunda p'nin en üst atasını alıp bir yerde saklamak gerekir. Yoksa diziyi dolaşırken en üst atanın değerini değiştiği için kaybedebiliriz.

public class QuickFindUF {
  private int[] id;

  public QuickFindUF(int N) {
    id = new int[N];
    for(int i = 0; i < N; i++) {
       id[i] = i;
    }
  }

  public boolean Connected(int p, int q) {
    return id[p] == id[q];
  }

  public void Union (int p, int q) {
    int pid = id[p];
    int qid = id[q];
    for (int i = 0; i < id.length; i++) {
      if (id[i] == pid) { //"id[i] == id[p]" is not correct! Why?
        id[i] = qid;
      }
    }
  }
}

Union
Union çağrısı x'in en üst atasını y yap anlamına gelir.

Union işlemi sonunda elimize şöyle bir yapı geçer.

Let there be 4 elements 0, 1, 2, 3

Initially all elements are single element subsets.
0 1 2 3 

Do Union(0, 1)
   1   2   3  
  /
 0

Do Union(1, 2)
     2   3   
    /
   1
 /
0

Do Union(2, 3)
         3    
        /
      2
     /
   1
 /
0

Union By Rank
Bu algoritma "union by rank" ve "path compression" yöntemkleri ile iyileştirilebilir. Bu yöntemleri anlamadım. "union by rank" şöyle yaparız.

Let us see the above example with union by rank
Initially all elements are single element subsets.
0 1 2 3 

Do Union(0, 1)
   1   2   3  
  /
 0

Do Union(1, 2)
   1    3
 /  \
0    2

Do Union(2, 3)
    1    
 /  |  \
0   2   3

Path Compression
"path compression" ile şöyle yaparız.

Let the subset {0, 1, .. 9} be represented as below and find() is called
for element 3.
              9
         /    |    \  
        4     5      6
     /     \        /  \
    0        3     7    8
            /  \
           1    2  

When find() is called for 3, we traverse up and find 9 as representative
of this subset. With path compression, we also make 3 as child of 9 so 
that when find() is called next time for 1, 2 or 3, the path to root is 
reduced.

               9
         /    /  \    \
        4    5    6     3 
     /           /  \   /  \
    0           7    8  1   2

28 Mart 2017 Salı

pthread_attr yapısı

Giriş
Şu satır dahil edilir.

#include <pthread.h>

Açıklaması şöyle

The attr argument points to a pthread_attr_t structure whose contents
are used at thread creation time to determine attributes for the new
thread; this structure is initialized using pthread_attr_init(3) and
related functions.  If attr is NULL, then the thread is created with
default attributes.

Bu yapı pthread_create() metoduna verilebilen parametrelerden bir tanesi. Thread'in davranışını ayarlamak için kullanılır.

Yapısı Nasıldır
Bu yapı işletim sistemine göre değişiklik gösterebilir. Örnek:

#define __SIZEOF_PTHREAD_ATTR_T 36

typedef union
{
    char __size[__SIZEOF_PTHREAD_ATTR_T];
    long int __align;
} pthread_attr_t;

Bu yapının alanlarını atamak için kullanılan metodlar şöyle. setX() metodlarına karşılık gelen getX() metodları var.

pthread_attr_init()
pthread_attr_setdetachstate()
pthread_attr_setguardsize()
pthread_attr_setinheritsched()
pthread_attr_setschedparam()
pthread_attr_setschedpolicy()
pthread_attr_setscope()
pthread_attr_setstack()
pthread_attr_setstacksize()

Ayrıca bu yapıyı yok etmek için pthread_attr_destroy() kullanılır.

pthread_attr_init metodu
Yapıyı default değerlerle doldurmak için şöyle yaparız.

pthread_attr_t thread_attr;
pthread_attr_init(&thread_attr);

pthread_attr_setscope metodu
Örnek ver

pthread_attr_setdetachstate metodu
Joinable thread için şöyle yaparız.

pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);

pthread_attr_setstack metodu
stack adresini atamak için kullanılır. Örnek ver

pthread_attr_setstacksize metodu
C++11 thread'lerinde yığıt boyutunu ayarlamak mümkün değil. Posix ile yapmak mümkün.

Yığıt boyutu şöyle alınır.

pthread_attr_t thread_attr;
size_t tmp_size=0;

pthread_attr_init(&thread_attr);

pthread_attr_getstacksize(&thread_attr , &tmp_size );

Yığıt boyutu şöyle atanır.

pthread_attr_t thread_attr;

pthread_attr_init(&thread_attr);

pthread_attr_setstacksize(&thread_attr , PTHREAD_STACK_MIN + 0x1000);

Thread belli bir yığıt ile şöyle başlatılır.

void* foo(void* arg){...} //Thread method
.
.
.
.
pthread_attr_t attribute;
pthread_t thread;

pthread_attr_init(&attribute);
pthread_attr_setstacksize(&attribute,1024); //Set stack size
pthread_create(&thread,&attribute,foo,0); //Create thread
pthread_join(thread,0); //Wait for thread to finish

pthread_attr_setinheritsched metodu
Şöyle yaparız.

pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);

pthread_attr_setschedpolicy metodu

Şöyle yaparız.

pthread_attr_setschedpolicy(&attr, SCHED_RR);

pthread_attr_setschedparam metodu
Şöyle yaparız.

struct sched_param sched;
memset( &sched, 0, sizeof( sched ) );
sched.sched_priority = priority;
pthread_attr_setschedparam(&attr, &sched);

pthread_create metodu

Giriş
İmzası şöyle

int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                      void *(*start_routine) (void *), void *arg);

pthread_create metodu ile yeni bir thread yaratılır.
1. parametre thread nesnesini saklamak içindir. Yani "out" parametresidir.
2. parametre thread'e geçilmek istenen pthread_attr nesnesini belirtir. Genelde NULL kullanılır. Açıklaması şöyle

The attr argument points to a pthread_attr_t structure whose contents
are used at thread creation time to determine attributes for the new
thread; this structure is initialized using pthread_attr_init(3) and
related functions.  If attr is NULL, then the thread is created with
default attributes.

3. parametre thread'in ana metodunu belirtir.
4. parametre thread'e ilave veri geçmek için kullanılır.

Klasik kullanımı şöyledir.

pthread_create (&t, NULL, threadMain, NULL);

Şöyle bir ana thread metodumuz olsun

void *testThread (void *pParm){...}

Bu metodu herhangi bir şeye cast etmemize gerek yok. Yani şöyle yapmak gereksiz.

pthread_create(&t, NULL, (void *)threadMain, NULL);

Şöyle çağırırız.

pthread_create (&t, NULL, threadMain, NULL);

lambda kullanmak istersek şöyle yaparız.

pthread_create (&t, NULL, [](void* ptr){
  ...; 
  return (void*)nullptr;
}, NULL);

2. parametre pthread_attr yapısı
pthread_attr yapısı yazısına taşıdım.

3. thread'in ana metodu
thread'in ana metodunu belirtiriz.İmzası şöyledir. void* döner ve parametre olarak void* alır.

void *myThread (void *arg) {
  ...
  return nullptr;
}

metod null dönmek zorunda değildir. Bir nesne dönmek istersek şöyle yaparız.

struct threadresult {
  int sum;
  int product;
};

void* myThread (void* arg) {
  ...
  struct threadresult* result = malloc(sizeof(*result));
  result->sum = ...;
  result->product = ...;

  return result;
}

4. parametre Thread'e İlave Veri geçmek

pthread_create() metodunun 4. alanı, başlatılan thread'e geçilmek istenen parametre. Eğer birden fazla parametre geçilmesi gerekiyorsa bir struct şeklinde geçilebilir. Şöyle bir struct'ımı olsun

struct dimension {
  unsigned int width;
  unsigned int height;
};

Parametrenin thread başlamadan önce yok olmaması gerekir. Dolayısıyla eğer gerekiyorsa önce struct'ı heap üzerinde yaratır, daha sonra thread'e parametre olarak geçerim.

// Pass a struct in pthread_create (NOT on the stack)
struct dimension *dim = new dimension();
dim->width = 1;
dim->height = 2;

pthread_create(&ph, &attr, myThread, dim);

Thread metodu void pointer aldığı için parametreyi struct'a cast ederek kullanabilir.

void *myThread(void* dimension) {

  struct dimension* dim = (struct dimension*) dimension;
  //Do work
    
  return 0;
}

Gerekiyorsa parametre thread'den çıkarken yok edilir. Şöyle yaparız.

void* myThread (void* dimesion) {
  // Get thread args in usable type.
  struct dimension* dim = (struct dimension*) dimension;  ...

  delete dim;  return 0;
}

25 Mart 2017 Cumartesi

TCP Congestion Control

Giriş
Açıklaması şöyle

Congestion control is perhaps the most important aspect of TCP, which makes TCP capable of
achieving high performance and avoid congestion collapse, at least in wired and single hop wireless
networks. Where the flow control mechanism addresses the receiver’s resources, congestion control
addresses the network resources, preventing the sender to push too much traffic into the network.

Senders use the acknowledgments for data sent, and the lack of these, to infer network conditions
between the sender and receiver.

The TCP congestion control algorithm has received much attention after its introduction in 1988,
and a substantial number of proposals for improvement of the congestion control mechanism have
been put forward. Most of these TCP variants, such as Tahoe, Reno, Vegas etc. have focused on congestion control

Congestion Control kaynak ve hedef arasındaki tüm yol içindir. Hedef bilgisayar sorunsuz olsa bile yoldaki herhangi bir router'daki sorun da tıkanıklığa sebep olabilir.

Congestion control için iki tane algoritma var. İlki TCP Tahoe ve en eski yöntem. Diğeri ise TCP Reno ve daha yeni.

Tahoe Algoritması
TCP Tahoe şöyle

The first version of TCP with congestion control became known as TCP Tahoe.

Algoritması şöyle

Assign a congestion window Cw:

1. Initial value of Cw = 1 (packet)

2. If trx successful, congestion window doubled. Continues until Cmax is reached

3. After Cw ≥ Cmax, Cw = Cw + 1
4. If timeout before ACK, TCP assumes congestion

Congestion olursa algoritma şöyle

TCP response to congestion is drastic:
1. A random backoff timer disables all transmissions for duration of timer
2. Cw is set to 1
C3. ax is set to Cmax / 2
Congestion window can become quite small for successive packet losses.
Throughput falls dramatically as a result.

Şeklen şöyle

Algoritma Açıklamarı
1 ve 2. adımların açıklaması şöyle

The TCP Tahoe congestion control strategy consists of multiple mechanisms. For each connection,
TCP maintains a congestion window that limits the total number of unacknowledged packets that may be in transit end-to-end. The congestion window is an extension of the sliding window that TCP uses for flow control. When a connection is initialized, and after a timeout, TCP uses a mechanism called slow start to increase the congestion window. It starts with a window of two times the Maximum Segment Size (MSS). Although the initial rate is low, the rate of increase is very rapid. For every packet acknowledged, the congestion window increases by one MSS so that effectively the congestion window doubles for every RTT. The window is doubled as follows: If the congestion window has two packets outstanding, and one packet is acknowledged, this means that the congestion window is increased to three packets, and only one packet is outstanding. I.e. the sender may now send two new packets. When the final packet (of the original two) is acknowledged, this allows the sender to increase the congestion window with one MSS yet again, bringing the total congestion window to four, and of these two are free. In other words, the congestion window has doubled.

3. adımın açıklaması şöyle

When the congestion window exceeds a threshold ssthresh, the algorithm enters a new state, called
congestion avoidance. In some implementations (e.g., Linux), the initial ssthresh is large, resulting
in the first slow start usually ending in a loss of a packet. The ssthresh is updated at the end of each
slow start, and will often affect subsequent slow starts triggered by timeouts.

In the state of congestion avoidance, the congestion window is additively increased by one MSS
every RTT, instead of the previous one MMS per acknowledged packet, as long as non-duplicate
ACKs are received.

Congestion olursa açıklaması şöyle

When a packet is lost, the likelihood of receiving duplicate ACKs is very high. (It is also possible,
though unlikely, that the stream has undergone extreme packet reordering, which would also prompt
duplicate ACKs.) Triple duplicate ACKs are interpreted in the same way as a timeout. In such a case,
Tahoe performs a ”fast retransmit”, reduces the congestion window to one MSS, and resets to the
slow-start state.

Timeout Süresi
Açıklaması şöyle

In order to estimate a typical RTT, it is therefore natural to take some sort of average of the SampleRTT values. TCP maintains an average, called EstimatedRTT, of the SampleRTT values. Upon obtaining a new SampleRTT, TCP updates EstimatedRTT according to the following formula:

EstimatedRTT = (1 – x) • EstimatedRTT + x • SampleRTT

The formula above is written in the form of a programming-language statement — the new value of EstimatedRTT is a weighted combination of the previous value of EstimatedRTT and the new value for SampleRTT.

SampleRTT
Kurose and Ross'a ait "Computer Networks : A Top-Down Approach" kitabındaki ifade şöyle

Instead of measuring a SampleRTT for every transmitted segment, most TCP implementations take only one SampleRTT measurement at a time.

Bazı TCP gerçekleştirimleri tek bir SampleRTT ölçümü yapıyorlar. Bu bana da tuhaf geldi.Açıklaması ise şöyle

As in most systems engineering topics, there's a balance between performance and accuracy. In the case of TCP, it could take a SampleRTT measurement every other segment, but that would imply a higher processing delay, and thus a higher queueing delay, etc. The single SampleRTT measurement helps TCP get an idea of the RTT while staying lean and fast.

Each node wants to spend the minimum time possible processing each segment so that it can get forwarded onto the network and move on to the next segment.

Additionally, each subsequent measurement is less valuable than the one before. The first measurement lets you know the neighborhood of RTT, the second helps you fine-tune that, etc. So there's a diminishing payoff the more RTTs you measure, especially since after each measurement you have fewer segments remaining to send. What's the point of measuring the RTT if there's only 5 segments left to send? Thus the first RTT measurement is the most important and really the only one needed.

Random Early Detection (RED)
Bu algoritma TCP'ye yardımcı olmak için router'larda bulunur. Açıklaması şöyle

RED is used to prevent queues from filling up by randomly dropping queued packets. Full queues lead to tail-drop, and this can cause multiple TCP flows to become globally synchronized, simultaneously shrinking and expanding their windows, alternately starving and filling queues.

Oyun Dünyasında TCP
Gerçek zamanılı oyun dünyasında TCP kullanılmıyor. Açıklaması şöyle

The model whereby one host acts as a "server" and keeps track of game state (all players, statistics and object state in a world), and all clients communicate with the server via UDP is the industry standard.

Sebebi ise kaybolan tek bir paketin arkadan gelen diğer paketleri de engellemesi.Açıklaması şöyle

The real problem with TCP is head of line blocking. If a single packet is lost on a TCP connection then the loss must be detected,the packet resent and the resent packet must be delivered to the application before any data coming after the lost packet can be delivered to the application.
...
Games where realtime isn't so critical can get away with just using TCP.