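Heteroskedasticity: FGLS estimator, WLS estimator
When we analyze data with the OLS estimator, we assume that the errors are i.i.d., which in particular means they have constant variance.
When the error variance changes across observations, the OLS estimator is no longer efficient and its default standard errors are invalid. This is the heteroskedasticity problem.
To address this problem, we use the FGLS estimator or the WLS estimator.
To detect heteroskedasticity and to model the error variance, we work with the residuals from an OLS regression of the dependent variable.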
Step 1: test for heteroskedasticity
Step 2: predict the residuals and use them to estimate the error variance (sigma^2)
Step 3: estimate with the FGLS approach
Step 4: estimate with the WLS approach
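The steps above can be written compactly. This is only a sketch in standard textbook notation; the symbols x_i, beta, u_i and sigma_i are introduced here for illustration and are not taken from the original text.

```latex
\begin{aligned}
y_i &= x_i'\beta + u_i, \qquad \operatorname{E}[u_i \mid x_i] = 0, \qquad \operatorname{Var}(u_i \mid x_i) = \sigma_i^2 \\
\hat\beta_{\mathrm{FGLS}} &= \arg\min_{\beta} \sum_{i=1}^{N} \frac{(y_i - x_i'\beta)^2}{\hat\sigma_i^2}
\end{aligned}
```

Here \hat\sigma_i^2 is the fitted variance from step 2, so a Stata weight of the form [aweight = 1/v] plays the role of 1/\hat\sigma_i^2. WLS, as the term is used in this post, keeps the same point estimates and only replaces the standard errors with robust ones.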
.......................//.........................................................
There are many ways to implement these estimators. I introduce two ways to implement them in Stata.
//create a specific data-generating process (DGP)
//model: y = 1 + 1*x1 + 1*x2 + u; x1, x2 ~ N(0,25)
//first way: a simple way to estimate FGLS and WLS
set seed 10101
quietly set obs 500
gen x1=5*rnormal(0)
gen x2=5*rnormal(0)
gen e = 5*rnormal(0)
gen u = sqrt(exp(-1+0.2*x2))*e
gen y = 1+x1+x2+u
summarize
//OLS
regress y x1 x2
// test heteroskedasticity
quietly regress y x1 x2
estat hettest x1 x2, mtest
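// predict the OLS residuals
predict uhat, residual
// generate the squared residuals
gen uhatsq = uhat^2
// regress the squared residuals on x2 (a linear model for the error variance)
quietly regress uhatsq x2
// predict v, the fitted variance; a linear fit can give v <= 0, which is not a usable weight
// (the FGLS log below reports 420 of the 500 observations, consistent with such cases being excluded)
predict v
// FGLS: weight each observation by the inverse of its fitted variance
regress y x1 x2 [aweight = 1/v]
// WLS here is FGLS with vce(robust), so the standard errors stay valid if the variance model is wrong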
regress y x1 x2 [aweight = 1/v], vce(robust)
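One common variant of the first way keeps every fitted variance positive by regressing the log of the squared residuals on x2 and exponentiating the fitted values. The block below is a minimal sketch of that variant, not the method used for the log output in this post. It regenerates the same DGP so that it runs on its own (note that clear wipes the data in memory), and the names lnuhatsq, lnv and vexp are hypothetical.

```stata
// Self-contained sketch: same DGP as above, exponential variance fit via logs
clear
set seed 10101
quietly set obs 500
gen x1 = 5*rnormal(0)
gen x2 = 5*rnormal(0)
gen e  = 5*rnormal(0)
gen u  = sqrt(exp(-1+0.2*x2))*e
gen y  = 1 + x1 + x2 + u

// first-stage OLS and its residuals
quietly regress y x1 x2
predict double uhat, residuals

// regress log squared residuals on x2, then exponentiate the fitted values
gen double lnuhatsq = ln(uhat^2)
quietly regress lnuhatsq x2
predict double lnv, xb
gen double vexp = exp(lnv)        // strictly positive by construction

// FGLS, then the same weights with robust standard errors
regress y x1 x2 [aweight = 1/vexp]
regress y x1 x2 [aweight = 1/vexp], vce(robust)
```

Because exp() is always positive, no observation is lost to an invalid weight. The intercept of the log regression is not a consistent estimate of the variance intercept, but analytic weights only matter up to a constant scale factor, so this does not affect the weighted regression.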
....................................//....................................................

. //this simple way estimate FGLS, WLS
. set seed 10101

. quietly set obs 500

. gen x1=5*rnormal(0)

. gen x2=5*rnormal(0)

. gen e = 5*rnormal(0)

. gen u = sqrt(exp(-1+0.2*x2))*e

. gen y = 1+x1+x2+u

. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          x1 |       500   -.3583016     4.77857  -13.72421   22.04421
          x2 |       500    .3206456    4.833256   -15.5615   15.25074
           e |       500    .1100656    5.151708  -15.95077    16.9177
           u |       500     .065142    4.155814   -21.0197   18.57779
           y |       500    1.027486    7.946687  -28.66867   29.49777

. //OLS
. regress y x1 x2

      Source |       SS       df       MS              Number of obs =     500
-------------+------------------------------           F(  2,   497) =  660.60
       Model |   22898.116     2   11449.058           Prob > F      =  0.0000
    Residual |  8613.64745   497  17.3312826           R-squared     =  0.7267
-------------+------------------------------           Adj R-squared =  0.7256
       Total |  31511.7634   499  63.1498265           Root MSE      =  4.1631

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .9875029    .039009    25.31   0.000       .91086    1.064146
          x2 |   .9850448   .0385676    25.54   0.000     .9092691     1.06082
       _cons |    1.06546   .1871315     5.69   0.000     .6977933    1.433126
------------------------------------------------------------------------------

. // test heteroskedasticity
. quietly regress y x1 x2

. estat hettest x1 x2, mtest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity 
         Ho: Constant variance

---------------------------------------
    Variable |      chi2   df      p 
-------------+-------------------------
          x1 |      0.25    1   0.6180 #
          x2 |    238.24    1   0.0000 #
-------------+-------------------------
simultaneous |    238.92    2   0.0000
---------------------------------------
                  # unadjusted p-values

. // predict residual
. predict uhat, residual

. // gen square residual
. gen uhatsq = uhat^2

. //regress squared residual and vars
. quietly regress uhatsq x2

. //predict v
. predict v
(option xb assumed; fitted values)

. //FGLS
. regress y x1 x2 [aweight = 1/v]
(sum of wgt is   1.0855e+02)

      Source |       SS       df       MS              Number of obs =     420
-------------+------------------------------           F(  2,   417) = 1824.44
       Model |   12593.879     2   6296.9395           Prob > F      =  0.0000
    Residual |  1439.24788   417  3.45143376           R-squared     =  0.8974
-------------+------------------------------           Adj R-squared =  0.8969
       Total |  14033.1269   419  33.4919496           Root MSE      =  1.8578

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .9225066   .0174488    52.87   0.000      .888208    .9568052
          x2 |   1.056485   .0375267    28.15   0.000     .9827202     1.13025
       _cons |   .9679429   .1601493     6.04   0.000     .6531423    1.282744
------------------------------------------------------------------------------

end of do-file
. //WLS is FGLS with vce(robust)
. regress y x1 x2 [aweight = 1/v], vce(robust)
(sum of wgt is   1.0855e+02)

Linear regression                                      Number of obs =     420
                                                       F(  2,   417) =  802.38
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.8974
                                                       Root MSE      =  1.8578

------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .9225066   .0360801    25.57   0.000     .8515851     .993428
          x2 |   1.056485   .0459786    22.98   0.000     .9661065    1.146864
       _cons |   .9679429   .1697225     5.70   0.000     .6343246    1.301561
------------------------------------------------------------------------------

................................................//.................................................................//................................ 

// The other way (MUS): this way uses nonlinear least squares to model the squared residuals
// The robust standard errors below are smaller than those from the first way.
set seed 10101
quietly set obs 500
gen x11=5*rnormal(0)
gen x21=5*rnormal(0)
gen e1 = 5*rnormal(0)
gen u1 = sqrt(exp(-1+0.2*x21))*e1
gen y1 = 1+1*x11+1*x21+u1
summarize
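// first-stage OLS with both regressors (the logged run below used only x11, so rerunning this gives different numbers)
quietly regress y1 x11 x21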
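predict double u11, resid
gen double u11sq = u11^2
gen double one = 1
// NLS fit of an exponential model for the squared residuals
nl (u11sq = exp({xb: x21 one})), nolog
// fitted variances, positive by construction under the exponential model
predict double varu1, yhat
// FGLS, then WLS (same weights with robust standard errors)
regress y1 x11 x21 [aweight=1/varu1]
regress y1 x11 x21 [aweight=1/varu1], vce(robust)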
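As a reference point for the nl step, the variance function implied by the DGP can be worked out directly. This is a short derivation under the DGP stated in the code (e1 is 5 times a standard normal draw, so its variance is 25); the symbols are shorthand introduced here.

```latex
\begin{aligned}
u_1 &= \sqrt{\exp(-1 + 0.2\,x_{21})}\; e_1, \qquad e_1 \sim N(0, 25) \\
\operatorname{Var}(u_1 \mid x_{21}) &= 25\,\exp(-1 + 0.2\,x_{21})
  = \exp(\ln 25 - 1 + 0.2\,x_{21})
  \approx \exp(2.219 + 0.2\,x_{21})
\end{aligned}
```

So if the first-stage residuals track u1 well, the /xb_x21 and /xb_one estimates from nl should be roughly 0.2 and 2.219.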
...............................................//......................................//.................................................

. do "C:\Users\PC\AppData\Local\Temp\STD01000000.tmp"

. set seed 10101

. quietly set obs 500

. gen x11=5*rnormal(0)

. gen x21=5*rnormal(0)

. gen e1 = 5*rnormal(0)
. gen u1 = sqrt(exp(-1+0.2*x21))*e1

. gen y1 = 1+1*x11+1*x21+u

. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         x11 |       500   -.3583016     4.77857  -13.72421   22.04421
         x21 |       500    .3206456    4.833256   -15.5615   15.25074
          e1 |       500    .1100656    5.151708  -15.95077    16.9177
          u1 |       500     .065142    4.155814   -21.0197   18.57779
          y1 |       500    1.027486    7.946687  -28.66867   29.49777

. quietly regress y1 x11 

. predict double u11,resid

. gen double u11sq = u11^2

. gen double one=1

. nl(u11sq = exp({xb: x21 one})), nolog
(obs = 500)

      Source |       SS       df       MS
-------------+------------------------------         Number of obs =       500
       Model |  977513.441     2   488756.72         R-squared     =    0.2964
    Residual |  2320297.94   498  4659.23282         Adj R-squared =    0.2936
-------------+------------------------------         Root MSE      =  68.25857
       Total |  3297811.38   500  6595.62277         Res. dev.     =  5640.238

------------------------------------------------------------------------------
       u11sq |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     /xb_x21 |   .1306607   .0154047     8.48   0.000     .1003945    .1609269
     /xb_one |   3.368359   .1151343    29.26   0.000      3.14215    3.594568
------------------------------------------------------------------------------

. predict double varu1, yhat

. regress y x11 x21 [aweight=1/varu1]
(sum of wgt is   2.0266e+01)

      Source |       SS       df       MS              Number of obs =     500
-------------+------------------------------           F(  2,   497) = 1521.75
       Model |  27819.5057     2  13909.7528           Prob > F      =  0.0000
    Residual |  4542.89717   497  9.14063817           R-squared     =  0.8596
-------------+------------------------------           Adj R-squared =  0.8591
       Total |  32362.4028   499  64.8545147           Root MSE      =  3.0233

------------------------------------------------------------------------------
          y1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         x11 |   .9816037     .02728    35.98   0.000     .9280053    1.035202
         x21 |   1.013435   .0270721    37.43   0.000     .9602454    1.066625
       _cons |   1.080826   .1556078     6.95   0.000     .7750957    1.386556
------------------------------------------------------------------------------

. regress y x11 x21 [aweight=1/varu1], vce(robust)
(sum of wgt is   2.0266e+01)

Linear regression                                      Number of obs =     500
                                                       F(  2,   497) = 1769.85
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.8596
                                                       Root MSE      =  3.0233

------------------------------------------------------------------------------
             |               Robust
          y1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         x11 |   .9816037   .0242359    40.50   0.000     .9339863    1.029221
         x21 |   1.013435   .0261269    38.79   0.000     .9621026    1.064768
       _cons |   1.080826   .1599336     6.76   0.000     .7665965    1.395055
------------------------------------------------------------------------------

end of do-file
