Cumulative Normal Distribution Function in C/C++

Question 1

I was wondering whether statistics functions are built into the math libraries that are part of the standard C++ libraries, such as cmath. If not, can you recommend a good statistics library that has a cumulative normal distribution function? Thanks in advance.

More specifically, I want to use or create a cumulative distribution function.

Question 2

There is no direct function. However, because the Gaussian error function and its complement are related to the normal cumulative distribution function, we can use the C library function erfc (complementary error function):

#include <cmath>  // erfc; M_SQRT1_2 is not standard C++ (on MSVC, define _USE_MATH_DEFINES first)

double normalCDF(double value)
{
   return 0.5 * erfc(-value * M_SQRT1_2);
}

This follows from the relation erfc(x) = 1 - erf(x), with M_SQRT1_2 = √0.5.

I use it for statistical calculations and it works great. There is no need to use coefficients.
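
The relation used above can be written out step by step. This is a short derivation sketch, assuming the usual definitions of the standard normal CDF Φ and of erf; the only extra fact used is that erf is an odd function.

```latex
\begin{aligned}
\Phi(x) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^{2}/2}\,dt
         = \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{x}{\sqrt{2}}\right)\right] \\
        &= \frac{1}{2}\left[1 - \operatorname{erf}\!\left(-\frac{x}{\sqrt{2}}\right)\right]
         = \frac{1}{2}\,\operatorname{erfc}\!\left(-\frac{x}{\sqrt{2}}\right)
\end{aligned}
```

Multiplying by M_SQRT1_2 in the code is the same as dividing by √2 here.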

Question 3

Here is a stand-alone C++ implementation of the cumulative normal distribution in 14 lines of code.

http://www.johndcook.com/cpp_phi.html

#include <cmath>
#include <iostream>  // std::cout, used by testPhi()

double phi(double x)
{
    // constants
    double a1 =  0.254829592;
    double a2 = -0.284496736;
    double a3 =  1.421413741;
    double a4 = -1.453152027;
    double a5 =  1.061405429;
    double p  =  0.3275911;

    // Save the sign of x
    int sign = 1;
    if (x < 0)
        sign = -1;
    x = fabs(x)/sqrt(2.0);

    // A&S formula 7.1.26
    double t = 1.0/(1.0 + p*x);
    double y = 1.0 - (((((a5*t + a4)*t) + a3)*t + a2)*t + a1)*t*exp(-x*x);

    return 0.5*(1.0 + sign*y);
}

void testPhi()
{
    // Select a few input values
    double x[] = 
    {
        -3, 
        -1, 
        0.0, 
        0.5, 
        2.1 
    };

    // Output computed by Mathematica
    // y = Phi[x]
    double y[] = 
    { 
        0.00134989803163, 
        0.158655253931, 
        0.5, 
        0.691462461274, 
        0.982135579437 
    };

    int numTests = sizeof(x)/sizeof(double);

    double maxError = 0.0;
    for (int i = 0; i < numTests; ++i)
    {
        double error = fabs(y[i] - phi(x[i]));
        if (error > maxError)
            maxError = error;
    }

    std::cout << "Maximum error: " << maxError << "\n";
}

Question 4

Following the suggestions of the people who answered before me, I figured out how to do it with gsl, but then found a solution that does not need a library (hopefully this helps the many people out there who are looking for this, as I was):

#include <cmath>  /* fabs, sqrt, exp, pow */

#ifndef Pi 
#define Pi 3.141592653589793238462643 
#endif 

double cnd_manual(double x)
{
  double L, K, w ;
  /* constants */
  double const a1 = 0.31938153, a2 = -0.356563782, a3 = 1.781477937;
  double const a4 = -1.821255978, a5 = 1.330274429;

  L = fabs(x);
  K = 1.0 / (1.0 + 0.2316419 * L);
  w = 1.0 - 1.0 / sqrt(2 * Pi) * exp(-L *L / 2) * (a1 * K + a2 * K *K + a3 * pow(K,3) + a4 * pow(K,4) + a5 * pow(K,5));

  if (x < 0 ){
    w= 1.0 - w;
  }
  return w;
}

Question 5

Boost is as good as the standard 😀 Here you go: Boost Math/Statistical Distributions.

Question 6

The implementations of the normal CDF given here are single-precision approximations in which float has simply been replaced with double, so they are accurate to only 7 or 8 significant (decimal) digits.
For a VB implementation of Hart's double-precision approximation, see Figure 2 of West's Better approximations to cumulative normal functions.
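
The accuracy claim can be checked against the standard library's erfc. The following is a minimal sketch and not part of the original answers: phiApprox repeats the A&S 7.1.26 coefficients from the phi function quoted earlier in this thread, while phiReference and maxAbsError are hypothetical names, and the grid over [-6, 6] is an arbitrary choice. It assumes that std::erfc itself is accurate to close to double precision.

```cpp
#include <algorithm>
#include <cmath>

// Reference value via the standard library: Phi(x) = erfc(-x / sqrt(2)) / 2.
double phiReference(double x)
{
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Approximation based on Abramowitz & Stegun formula 7.1.26,
// using the same coefficients as the phi() function quoted earlier.
double phiApprox(double x)
{
    const double a1 = 0.254829592, a2 = -0.284496736, a3 = 1.421413741;
    const double a4 = -1.453152027, a5 = 1.061405429, p = 0.3275911;

    const double sign = (x < 0) ? -1.0 : 1.0;
    const double z = std::fabs(x) / std::sqrt(2.0);

    const double t = 1.0 / (1.0 + p * z);
    const double y = 1.0 - (((((a5*t + a4)*t) + a3)*t + a2)*t + a1)*t*std::exp(-z*z);

    return 0.5 * (1.0 + sign * y);
}

// Hypothetical helper: largest absolute deviation between the two
// functions on an evenly spaced grid over [-6, 6] with step 0.01.
double maxAbsError()
{
    double worst = 0.0;
    for (int i = -600; i <= 600; ++i)
    {
        const double x = i / 100.0;
        worst = std::max(worst, std::fabs(phiApprox(x) - phiReference(x)));
    }
    return worst;
}
```

Printing maxAbsError() shows the largest deviation on that grid. For comparison, Abramowitz and Stegun state an error bound of 1.5e-7 for the erf approximation in formula 7.1.26.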

Edit: My translation of West's implementation into C++:

#include <cmath>

double
phi(double x)
{
  static const double RT2PI = sqrt(4.0*acos(0.0));

  static const double SPLIT = 7.07106781186547;

  static const double N0 = 220.206867912376;
  static const double N1 = 221.213596169931;
  static const double N2 = 112.079291497871;
  static const double N3 = 33.912866078383;
  static const double N4 = 6.37396220353165;
  static const double N5 = 0.700383064443688;
  static const double N6 = 3.52624965998911e-02;
  static const double M0 = 440.413735824752;
  static const double M1 = 793.826512519948;
  static const double M2 = 637.333633378831;
  static const double M3 = 296.564248779674;
  static const double M4 = 86.7807322029461;
  static const double M5 = 16.064177579207;
  static const double M6 = 1.75566716318264;
  static const double M7 = 8.83883476483184e-02;

  const double z = fabs(x);
  double c = 0.0;

  if(z<=37.0)
  {
    const double e = exp(-z*z/2.0);
    if(z<SPLIT)
    {
      const double n = (((((N6*z + N5)*z + N4)*z + N3)*z + N2)*z + N1)*z + N0;
      const double d = ((((((M7*z + M6)*z + M5)*z + M4)*z + M3)*z + M2)*z + M1)*z + M0;
      c = e*n/d;
    }
    else
    {
      const double f = z + 1.0/(z + 2.0/(z + 3.0/(z + 4.0/(z + 13.0/20.0))));
      c = e/(RT2PI*f);
    }
  }
  return x<=0.0 ? c : 1-c;
}

Note that I have rearranged expressions into the more familiar forms for series and continued-fraction approximations. The last magic number in West's code is the square root of 2π, which I have left to the compiler on the first line by exploiting the identity acos(0) = ½π.
I have triple-checked the magic numbers, but there is always a chance that I have mistyped something. If you spot a typo, please leave a comment!
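
The identity behind RT2PI can be checked numerically. This is a minimal sketch with the hypothetical names rootTwoPi and invRootTwoPi; it only restates the expression from the first line of the code above and inverts it.

```cpp
#include <cmath>

// acos(0) equals pi/2, so 4*acos(0) equals 2*pi.
double rootTwoPi()
{
    return std::sqrt(4.0 * std::acos(0.0));
}

// 1/sqrt(2*pi), the normalising constant of the standard normal density.
double invRootTwoPi()
{
    return 1.0 / rootTwoPi();
}
```

The reciprocal should agree with a hard-coded constant such as 0.3989422804014327 to within rounding.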

The results for the test data that John Cook used in his answer are

 x               phi                Mathematica
-3     1.3498980316301150e-003    0.00134989803163
-1     1.5865525393145702e-001    0.158655253931
 0     5.0000000000000000e-001    0.5
0.5    6.9146246127401301e-001    0.691462461274
2.1    9.8213557943718344e-001    0.982135579437

I take some small comfort from the fact that they agree in all of the digits given for the Mathematica results.

Question 7

From the NVIDIA CUDA samples:

#include <cmath>  // fabs, exp

static double CND(double d)
{
    const double       A1 = 0.31938153;
    const double       A2 = -0.356563782;
    const double       A3 = 1.781477937;
    const double       A4 = -1.821255978;
    const double       A5 = 1.330274429;
    const double RSQRT2PI = 0.39894228040143267793994605993438;

    double
    K = 1.0 / (1.0 + 0.2316419 * fabs(d));

    double
    cnd = RSQRT2PI * exp(- 0.5 * d * d) *
          (K * (A1 + K * (A2 + K * (A3 + K * (A4 + K * A5)))));

    if (d > 0)
        cnd = 1.0 - cnd;

    return cnd;
}

Question 8

From https://en.cppreference.com/w/cpp/numeric/math/erfc

The normal CDF can be calculated as follows:

#include <iostream>
#include <cmath>
#include <iomanip>
using namespace std;

double normalCDF(double x) // Phi(-∞, x) aka N(x)
{
    return erfc(-x / sqrt(2))/2;
}

Writing 2.0 instead of 2 in the denominator makes the floating-point division explicit, although erfc already returns a double here, so the result is a decimal value either way.

I hope this helps.
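
Since the original question was about using a cumulative distribution function, here is a minimal sketch of how the erfc-based version might be applied to a normal distribution with arbitrary mean and standard deviation. The three-argument overload, the function intervalProbability and the parameter names are hypothetical and do not come from any of the answers; the code assumes stddev > 0.

```cpp
#include <cmath>

// Standard normal CDF via the complementary error function.
double normalCDF(double x)
{
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// CDF of N(mean, stddev^2): standardise x, then reuse normalCDF.
// Assumes stddev > 0; no argument checking is done in this sketch.
double normalCDF(double x, double mean, double stddev)
{
    return normalCDF((x - mean) / stddev);
}

// P(lower < X <= upper) for X ~ N(mean, stddev^2).
double intervalProbability(double lower, double upper,
                           double mean, double stddev)
{
    return normalCDF(upper, mean, stddev) - normalCDF(lower, mean, stddev);
}
```

For example, intervalProbability(-1.96, 1.96, 0.0, 1.0) is roughly 0.95, the familiar two-sided 95% interval of the standard normal distribution.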