Empirical Distribution Function (EDF) in Excel Tutorial

This is the second entry in our ongoing series about empirical or sample distribution. In this tutorial, we will start with the general definition, motivation and applications of EDF, and then use NumXL to carry out our EDF analysis. In an earlier entry, we discussed the histogram as a non-parametric method for the probability distribution inference of a random variable. In this tutorial, we go over the empirical distribution function and estimate its values for the different points in the sample.

Uploaded by

NumXL Pro
Copyright
© Attribution Non-Commercial (BY-NC)
Tutorial:

Empirical Distribution Function (EDF)


This is the second entry in our ongoing series about empirical or sample distribution. In this tutorial, we will start with the general definition, motivation and applications of EDF, and then use NumXL to carry out our EDF analysis.

In an earlier entry, we discussed the histogram as a non-parametric method for the probability distribution inference of a random variable. In this tutorial, we go over the empirical distribution function and estimate its values for the different points in the sample.

For sample data, we generated a data set of 29 randomly generated values from the Gaussian distribution.

Background
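The empirical distribution function (EDF) or empirical cdf is a step function that jumps by 1/N at the occurrence of each observation:

$$\mathrm{EDF}(x) = \frac{1}{N} \sum_{i=1}^{N} I\{x_i \le x\}$$

Where $I\{A\}$ is the indicator function of an event $A$:

$$I\{x_i \le x\} = \begin{cases} 1 & x_i \le x \\ 0 & x_i > x \end{cases}$$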

By definition, the EDF estimates the cumulative distribution function of the underlying random variable.
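The definition translates almost directly into code. The following is a minimal Python sketch, not NumXL's implementation; the function name `edf` and the four sample values are hypothetical and chosen only for illustration.

```python
def edf(sample, x):
    """Fraction of observations in `sample` that are <= x."""
    n = len(sample)
    if n == 0:
        raise ValueError("sample must contain at least one observation")
    # Sum of the indicator I{x_i <= x} over all observations, divided by N.
    return sum(1 for xi in sample if xi <= x) / n


# Hypothetical four-point sample, for illustration only.
data = [0.5, -1.2, 2.0, 0.5]
values = [edf(data, x) for x in (-2.0, -1.2, 0.5, 3.0)]
```

Because the value 0.5 occurs twice in this sample, the EDF jumps by 2/N at that point rather than 1/N.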

Why do we care?
The EDF estimates the true underlying cumulative distribution function that generated the points in the sample. For independent, identically distributed observations, it converges uniformly to the true distribution as the sample size grows (the Glivenko-Cantelli theorem).
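The convergence can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the NumXL workflow: it draws standard normal samples of increasing size with Python's standard library and measures the largest gap between each sample's EDF and the true CDF. The helper name `sup_distance`, the seed, and the sample sizes are all hypothetical choices.

```python
import random
from statistics import NormalDist


def sup_distance(sample, cdf):
    """Largest vertical gap between the EDF of `sample` and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The EDF equals i/n just below x and (i + 1)/n at x,
        # so both sides of the jump have to be checked.
        gap = max(gap, abs(f - i / n), abs((i + 1) / n - f))
    return gap


rng = random.Random(2013)  # fixed seed so the run is repeatable
true_cdf = NormalDist(0.0, 1.0).cdf
distances = {}
for n in (29, 1000, 100000):
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    distances[n] = sup_distance(sample, true_cdf)
```

With only 29 observations, as in this tutorial's data set, the gap is still noticeable; it shrinks roughly in proportion to one over the square root of the sample size.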

Process
First, let's organize our input data. We can start by placing the values of the sample data in a separate column. The sample may contain one or more missing values.

Empirical Distribution Function (EDF) Tutorial

Spider Financial Corp, 2013

Now we are ready to construct our EDF plot. First, select the empty cell in your worksheet where you wish the output table to be generated, then locate and click on the Descriptive Statistics icon in the NumXL tab (or toolbar). Then, select the Empirical Distribution Function item from the drop-down menu.

The EDF Wizard pops up.


Select the cell range for the values of the input variable.

Notes:

1. The cell range may optionally include the heading (label) cell, which is then used in the output tables wherever they reference that variable.
2. By default, the output table cell range is set to the currently selected cell in your worksheet.
3. By default, the output graph cell range is set to the 7 cells to the right of the currently selected cell in your worksheet.

Finally, once we select the input data (X) cell range, the Options and Missing Values tabs become available (enabled).

Next, select the Options tab.

Initially, the tab is set to the following values: Overlay Normal distribution is checked. This option in effect instructs the wizard to generate a second curve for the Gaussian distribution for comparison purposes. Leave this option checked.

Now, click on the Missing Values tab.


In this tab, you can select an approach to handle missing values in the data set (X's). By default, any observation with a missing value is excluded from the analysis.

This treatment is a good approach for our analysis, so let's leave it unchanged.

Now, click OK to generate the output tables.
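Excluding an observation means it is dropped before N is counted, so the jump size 1/N reflects only the values that are actually present. The sketch below illustrates that idea in Python and is not NumXL's code; representing a missing value as `None` or NaN is an assumption made here, and the helper name `drop_missing` and the sample column are hypothetical.

```python
import math


def drop_missing(values):
    """Keep only observations that are present (neither None nor NaN)."""
    kept = []
    for v in values:
        if v is None:
            continue  # assumed encoding of an empty cell
        if isinstance(v, float) and math.isnan(v):
            continue  # assumed encoding of a not-available value
        kept.append(v)
    return kept


# Hypothetical column with two gaps, for illustration only.
column = [1.4, None, -0.3, float("nan"), 0.8]
clean = drop_missing(column)
n = len(clean)  # N in the EDF formula counts only the retained observations
```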


Notes:

1. The values of all observations are sorted in ascending order and placed in column E.
2. The X-Bar and Y-Bar columns carry no special statistical meaning; they are merely computed to help us generate a stepwise type of graph in Excel.
3. Finally, the equivalent cumulative distribution function (CDF) of the normal distribution is computed in column I.

The generated plot of the EDF is shown below:
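The step coordinates and the normal overlay that feed such a plot can be approximated outside Excel. The following Python sketch is one plausible construction, not a description of how NumXL fills its X-Bar and Y-Bar columns: it emits two vertices per observation so that joining them in order draws a staircase. Fitting the overlay with the sample mean and sample standard deviation is an assumption, and the function name `edf_step_points` and the sample values are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def edf_step_points(sample):
    """Return (x, y) vertices that trace the EDF as a staircase."""
    xs = sorted(sample)
    n = len(xs)
    points = []
    for i, x in enumerate(xs):
        points.append((x, i / n))        # bottom of the jump at x
        points.append((x, (i + 1) / n))  # top of the jump at x
    return points


# Hypothetical sample, for illustration only.
data = [0.3, -1.1, 0.9, -0.2, 1.7]
steps = edf_step_points(data)

# Assumption: the overlay uses the sample mean and sample standard deviation.
fitted = NormalDist(mean(data), stdev(data))
overlay = [(x, fitted.cdf(x)) for x in sorted(data)]
```

Plotting `steps` as a connected line gives a vertical rise at each observation followed by a flat segment to the next one, which is the shape a plain scatter of the sorted values would not show.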

Conclusion
In this tutorial, we demonstrated the process to generate an empirical distribution function in Excel using NumXL's add-in functions.


Where do we go from here?
To obtain the probability density function (PDF), one needs to take the derivative of the CDF, but the EDF is a step function and differentiation is a noise-amplifying operation. As a result, the consequent PDF is very jagged and needs considerable smoothing for many areas of application.

In our next entry, we will look at the kernel density estimation method to obtain the probability density function of the underlying random process.
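To see why a differentiated EDF is jagged, one can look at its finite-difference slopes: the EDF rises by 1/N across every gap between neighbouring observations, so each slope is 1/N divided by a gap width, and those widths vary erratically in a small sample. The Python sketch below illustrates this with 29 simulated standard normal values; it is not the tutorial's data set, and the helper name `edf_slopes` and the seed are hypothetical.

```python
import random


def edf_slopes(sample):
    """Finite-difference slopes of the EDF between consecutive observations."""
    xs = sorted(sample)
    n = len(xs)
    slopes = []
    for left, right in zip(xs, xs[1:]):
        width = right - left
        if width > 0:  # skip ties, where the difference quotient is undefined
            # The EDF rises by 1/N over each gap between neighbours.
            slopes.append((1.0 / n) / width)
    return slopes


rng = random.Random(2013)  # fixed seed so the run is repeatable
sample = [rng.gauss(0.0, 1.0) for _ in range(29)]
slopes = edf_slopes(sample)
ratio = max(slopes) / min(slopes)  # steepest gap relative to the flattest
```

A large `ratio` means neighbouring density estimates can differ by orders of magnitude, which is the jaggedness that smoothing methods are meant to remove.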

