Yazılım Çorbası: IEEE 754 - Extended Precision

x87 - 32-bit Compilation
See the x87 FPU post for background.

fdiv
Divides one floating-point value by another on the x87 register stack.

fistp
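Used for float -> int conversion: it stores ST(0) to memory as an integer, rounded according to the current rounding mode, and pops the register stack.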
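
The difference between a rounding-mode-aware conversion and a plain cast can be sketched in C++. This is a minimal illustration, assuming the default round-to-nearest-even mode; the function names are hypothetical, and whether a compiler actually emits fistp for std::lrint depends on the target and flags.

```cpp
#include <cmath>

// Converts using the current rounding mode (round-to-nearest-even by default),
// which is the kind of conversion fistp performs.
long round_with_current_mode(double value) {
    return std::lrint(value);
}

// A plain C++ cast always truncates toward zero instead.
long truncate_toward_zero(double value) {
    return static_cast<long>(value);
}
```

Under the default rounding mode, round_with_current_mode(2.5) yields 2 and round_with_current_mode(3.5) yields 4, while truncate_toward_zero(2.7) yields 2.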

mov
Its operand forms are as follows.

mov mem, reg
mov reg, mem
mov reg, reg
mov reg, imm
mov mem, imm

The explanation is as follows.

Most x86 instructions (other than some specialized instructions such as movsb) can only access one memory location. Therefore a move from memory to memory requires going through a register with two mov instructions.

Suppose we have the following code.

x = y;

The generated assembly goes through a register, like this.

movl    -0x8(%rbp), %esi
movl    %esi, -0xc(%rbp)

Streaming SIMD Extensions 2 - SSE2 - 64-bit Compilation
All modern x86 processors (Intel or AMD) support SSE2. The explanation is as follows.

SSE2 was introduced into Intel chips with the Pentium 4 in 2001 and AMD processors in 2003

The Microsoft x64 compiler generates SSE2 code for floating-point arithmetic, and since Visual Studio 2012 SSE2 is also the default for 32-bit builds. SSE2 code uses the XMM registers. The assembly looks like this.

movsd   QWORD PTR [rsp+8], xmm1
call    cos
movsd   xmm1, QWORD PTR [rsp+8]
movsd   QWORD PTR [rsp], xmm0
movapd  xmm0, xmm1
call    cos
movsd   xmm2, QWORD PTR [rsp]
ucomisd xmm2, xmm0

divss
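Divides the lowest single-precision element of one XMM register by the lowest element of another. With gcc we can use it through an intrinsic as follows.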

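#include <xmmintrin.h> // SSE

__m128 s1, s2;
s1 = _mm_set_ps(1.0f, 1.0f, 1.0f, 1.0f);
s2 = _mm_set_ps(0.0f, 0.0f, 0.0f, 0.0f);
s2 = _mm_div_ss(s1, s2); // lowest lane: 1.0f / 0.0f; upper three lanes are copied from s1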
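
To make the "lowest element only" behaviour visible, here is a small sketch that stores the result back into an array. The helper name divide_low_lane is hypothetical; the intrinsics are the standard SSE ones from xmmintrin.h.

```cpp
#include <xmmintrin.h> // SSE
#include <array>

// Hypothetical helper: divides only lane 0 of a by lane 0 of b.
std::array<float, 4> divide_low_lane(const std::array<float, 4>& a,
                                     const std::array<float, 4>& b) {
    __m128 va = _mm_loadu_ps(a.data());
    __m128 vb = _mm_loadu_ps(b.data());
    __m128 vr = _mm_div_ss(va, vb); // lane 0 = a[0] / b[0]; lanes 1..3 come from a
    std::array<float, 4> out{};
    _mm_storeu_ps(out.data(), vr);
    return out;
}
```

Calling it with {8, 2, 3, 4} and {2, 2, 2, 2} gives {4, 2, 3, 4}: only the first element is divided. To divide all four lanes, _mm_div_ps would be used instead.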

_mm_loadu_si128
Suppose we have an array.

short* tempBufferVert = new short[width * height];

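To divide every element of the array by 4, we shift each element right by 2 bits, eight elements at a time. Note that for negative values an arithmetic shift rounds toward negative infinity, whereas C++ integer division rounds toward zero.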

#include <emmintrin.h> // SSE2

// Assumes width * height is a multiple of 8 (eight 16-bit values per 128-bit register).
for (int i = 0; i < width * height; i += 8)
{
  __m128i v = _mm_loadu_si128((__m128i *)&tempBufferVert[i]);
  v = _mm_srai_epi16(v, 2); // v >>= 2
  _mm_storeu_si128((__m128i *)&tempBufferVert[i], v);
}
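
The loop above only works when the element count is a multiple of 8. A possible variant that also handles a leftover tail is sketched below; the function name shift_right_by_two and the use of std::vector are assumptions for illustration, and the scalar tail relies on >> being an arithmetic shift for negative values, which holds on mainstream compilers.

```cpp
#include <emmintrin.h> // SSE2
#include <cstddef>
#include <vector>

// Hypothetical helper: arithmetic shift right by 2 on every element.
void shift_right_by_two(std::vector<short>& values) {
    const std::size_t n = values.size();
    std::size_t i = 0;
    // Vector part: eight 16-bit elements per iteration.
    for (; i + 8 <= n; i += 8) {
        __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(&values[i]));
        v = _mm_srai_epi16(v, 2); // v >>= 2, sign-preserving
        _mm_storeu_si128(reinterpret_cast<__m128i*>(&values[i]), v);
    }
    // Scalar tail for the remaining (fewer than eight) elements.
    for (; i < n; ++i) {
        values[i] = static_cast<short>(values[i] >> 2);
    }
}
```

For example, 7 becomes 1 and -7 becomes -2 (not -1, which is what -7 / 4 would give).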

setcsr
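Writes the MXCSR control and status register. The value below is the default 0x1F80 with the divide-by-zero mask bit (bit 9) cleared, so an SSE division by zero raises an exception instead of silently producing infinity.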

_mm_setcsr (0x00001D80);
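
Rather than hard-coding the whole register value, the same effect can be sketched by reading MXCSR and clearing a single bit. The helper names are hypothetical; _mm_getcsr, _mm_setcsr and _MM_MASK_DIV_ZERO come from xmmintrin.h.

```cpp
#include <xmmintrin.h> // _mm_getcsr, _mm_setcsr, _MM_MASK_DIV_ZERO

// Returns the given MXCSR value with the divide-by-zero mask bit cleared.
unsigned int unmask_divide_by_zero(unsigned int mxcsr) {
    return mxcsr & ~static_cast<unsigned int>(_MM_MASK_DIV_ZERO);
}

// Applies the change to the calling thread's MXCSR, leaving other bits untouched.
void enable_divide_by_zero_trap() {
    _mm_setcsr(unmask_divide_by_zero(_mm_getcsr()));
}
```

Starting from the default value 0x1F80, unmask_divide_by_zero returns 0x1D80, the constant used above. Keeping the other bits as they are avoids accidentally changing the rounding mode or other exception masks.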

Intel 80-bit floating point
Some C++ compilers may use 80-bit precision for intermediate results. The explanation is as follows.

The C++ standard allows implementations to evaluate floating-point expressions with more precision than the nominal type requires.

The same rule applies to C#. The explanation is as follows.

In particular, in some situations the JIT is permitted to use a more accurate intermediate representation - e.g. 80 bits when your original data is 64 bits - whereas in other situations it won't. That can result in seeing different results when any of the following is true:

Another explanation is as follows.

The IEEE 754 specifies that computations can be processed at a higher precision than what is stored in memory then rounded when written back to memory ..... In short, the standard does not promise that the same computation carried out on all hardware will return the same answer.
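
A C++ program can ask how its implementation evaluates floating-point expressions through the FLT_EVAL_METHOD macro from cfloat. The sketch below only translates the macro value into text; the function name is hypothetical.

```cpp
#include <cfloat>
#include <string>

// Hypothetical helper: describes the implementation's FLT_EVAL_METHOD.
std::string describe_float_evaluation() {
    switch (FLT_EVAL_METHOD) {
        case 0:  return "each operation is evaluated in the precision of its own type";
        case 1:  return "float and double operations are evaluated as double";
        case 2:  return "all operations are evaluated as long double";
        default: return "indeterminable or implementation-defined";
    }
}
```

A value of 2 is what one would typically expect from a build that keeps intermediates in the x87 registers, while SSE2 builds usually report 0.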

Example
In the simple example below we can see the compiler using extra precision when multiplying an int64_t (converted to double) by a double.

#include <cstdint>
#include <cinttypes>
#include <cstdio>

using namespace std;

int main() {
    double xd = 1.18;
    int64_t xi = 1000000000;

    int64_t res1 = (double)(xi * xd);

    double d = xi * xd;
    int64_t res2 = d;

    printf("%" PRId64"\n", res1);
    printf("%" PRId64"\n", res2);
}

When the product is kept in an 80-bit x87 register (for example in a 32-bit build), we get the following output.

1179999999
1180000000
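
The two results can be reproduced explicitly by choosing where the rounding happens. This is a sketch under the assumption that long double is wider than double on the target (as with the 80-bit x87 format); the function names are hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Truncates the product while it is still held at long double precision.
std::int64_t truncate_extended_product(std::int64_t xi, double xd) {
    long double product = static_cast<long double>(xi) * static_cast<long double>(xd);
    return static_cast<std::int64_t>(product);
}

// Rounds the product to a 64-bit double first (volatile forces the store),
// then truncates.
std::int64_t truncate_double_product(std::int64_t xi, double xd) {
    volatile double product = static_cast<double>(xi) * xd;
    return static_cast<std::int64_t>(product);
}

// True when long double has at least the 64-bit significand of the x87 format.
bool long_double_is_extended() {
    return std::numeric_limits<long double>::digits >= 64;
}
```

The double closest to 1.18 is slightly below 1.18, so the exact product is just under 1180000000. With an extended significand that shortfall survives and truncation yields 1179999999; rounded to double it becomes exactly 1180000000.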

Example
Results can also differ between 32-bit and 64-bit applications.

long double dvalue = 2.7182818284589998;
long double dexp = -0.21074699576017999;
long double result = std::powl( dvalue, dexp);

We get the following results.

64bit -> result = 0.80997896907296496 and 32bit -> result = 0.80997896907296507

The registers the FPU uses for floating-point computation can be 80 bits wide.

The long double type in C may correspond to this 80-bit format. In that case the same computation can produce different results on two different machines. By forcing the compiler into a strict mode, it is possible to get the same result across machines.
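
Whether long double really is the 80-bit format on a given platform can be checked from the size of its significand. The sketch below assumes the three most common cases; the function name is hypothetical.

```cpp
#include <limits>
#include <string>

// Hypothetical helper: classifies long double by its significand width in bits.
std::string describe_long_double() {
    constexpr int digits = std::numeric_limits<long double>::digits;
    if (digits == 53)  return "same format as double (64-bit)";
    if (digits == 64)  return "x87 extended precision (80-bit)";
    if (digits == 113) return "IEEE 754 binary128 (quadruple precision)";
    return "another format";
}
```

Code that depends on the extra bits should check this rather than assume it, since some compilers treat long double as a plain 64-bit double.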

C++

1. /fp:precise - used with the Microsoft compiler

This is the default option. The explanation is as follows.

fp:precise weakens some of the rules, however it warranties that the precision of the calculations will not be lost.

It does not allow intermediate values to be kept in FPU registers; after a computation, intermediate values are written back to memory. The following explanation matters especially for x87 code.

If the value to be computed on is placed on a larger register then one computation is done and then the value is moved off of the register back to memory the result is truncated there. It could then be moved back on to the larger register for another computation.

On the other hand if all the computations are done on the larger register before the value is moved back to memory you will get a different result.

This option replaces the /Op (Improve Float Consistency) option used with Visual C++ 6.0. The old option was documented with a similar explanation.

By default, the compiler uses the coprocessor’s 80-bit registers to hold the intermediate results of floating-point calculations. This increases program speed and decreases program size. However, because the calculation involves floating-point data types that are represented in memory by less than 80 bits, carrying the extra bits of precision (80 bits minus the number of bits in a smaller floating-point type) through a lengthy calculation can produce inconsistent results.

2. /fp:strict - used with the Microsoft compiler

To ensure compatibility across platforms, 80-bit floating-point computations like the ones above are not used. The explanation is as follows.

Using fp:strict means that all the rules of IEEE 754 are respected. fp:strict is used to sustain bitwise compatibility between different compilers and platforms.

Java
Moved to the strictfp post.

3. /fp:fast

The explanation is as follows.

fp:fast allows compiler specific optimizations and transformations of expressions containing floating point calculation. It is the fastest methods but the results will differ between different compilers and platforms.
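
One kind of transformation that fast floating-point modes are commonly described as permitting is reassociation of additions. The sketch below shows why that changes results; it performs the reordering by hand rather than relying on any compiler flag, and the function names are hypothetical.

```cpp
// volatile forces each intermediate sum to be rounded to a 64-bit double.

// Evaluates (a + b) + c, the order written in the source.
double sum_left_to_right(double a, double b, double c) {
    volatile double partial = a + b;
    return partial + c;
}

// Evaluates a + (b + c), an order a reassociating optimizer might choose.
double sum_reassociated(double a, double b, double c) {
    volatile double partial = b + c;
    return a + partial;
}
```

With IEEE 754 doubles and the inputs 1e16, -1e16 and 1.0, the first function returns 1.0 and the second returns 0.0, because -1e16 + 1.0 rounds back to -1e16.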

Monday, 29 January 2018
