{smcl}
{* *! version 2.4.02 9feb2015}{...}
{cmd:help weakiv}
{hline}
{title:Title}
{p2colset 5 16 18 2}{...}
{p2col:{hi: weakiv} {hline 2}}Weak-instrument-robust tests and confidence intervals
for instrumental-variable (IV) estimation of linear, panel, probit and tobit models{p_end}
{p2colreset}{...}
{marker syntax}{...}
{title:Syntax}
{phang}
Standalone estimation (specifying model to be estimated):
{p 8 14 2}
{cmd:weakiv}
{it:iv_cmd}
{it:depvar} [{it:varlist1}]
{cmd:(}{it:varlist2}{cmd:=}{it:varlist_iv}{cmd:)} [{it:weight}]
[{cmd:if} {it:exp}] [{cmd:in} {it:range}]
{bind:[{cmd:,} {it:model_options}}
{it:test_options} {it:ci_options} {it:graph_options} {it: misc_options}]
{phang}
Obtaining model from previous call to {it:ivregress}, {it:ivreg2}, {it:ivreg2h},
{it:xtivreg}, {it:xtivreg2}, {it:xtabond2}, {it:ivprobit}, or {it:ivtobit}:
{p 8 14 2}
{cmd:weakiv}
[{cmd:,} {it:test_options} {it:ci_options} {it:graph_options} {it: misc_options}]
{phang}
Replay syntax:
{p 8 14 2}
{cmd:weakiv}
[{cmd:,} {it:graph_options} {it:project(varlist)} {it:project2(var1 var2)} {it: misc_options}]
{synoptset 20}{...}
{synopthdr:iv_cmd/model_options}
{synoptline}
{synopt:{it:iv_cmd}}
{it:ivregress}, {it:ivreg2}, {it:ivreg2h}, {it:xtivreg}, {it:xtivreg2}, {it:xtabond2}, {it:ivprobit}, or {it:ivtobit}
{p_end}
{synopt:{opt <misc>}}
options supported by {helpb ivregress}, {helpb ivreg2}, {helpb ivreg2h}, {helpb xtivreg}, {helpb xtivreg2}, {helpb xtabond2}, {helpb ivprobit} or {helpb ivtobit}
{p_end}
{synoptline}
{p2colreset}{...}
{synoptset 20}{...}
{synopthdr:test_options}
{synoptline}
{synopt:{opt null(numlist)}}
null hypotheses for tests of coefficients on weakly-identified endogenous variables in IV model
{p_end}
{synopt:{opt kwt(#)}}
weight on {it:K} test statistic in {it:K-J} test (see also {it:kjlevel(.)} option below)
{p_end}
{synopt:{opt lm}}
use Lagrange multiplier tests (available for linear models only)
{p_end}
{synopt:{opt md}}
use Wald/Minimum Distance tests (all models)
{p_end}
{synopt:{opt strong(varlist)}}
names of strongly-identified endogenous regressors (if not all are weakly identified)
{p_end}
{synopt:{opt cuestrong}}
use LIML or CUE for strongly-identified endogenous regressors (instead of default IV or 2-step GMM)
{p_end}
{synopt:{opt cuepoint}}
report LIML or CUE point estimates for weakly-identified endogenous regressors
and include in grid if grid search used
(linear models only; reporting option only, does not affect test statistics)
{p_end}
{synopt:{opt subset(varlist)}}
endogenous regressors for subset AR test (multiple-endogenous regressor case only)
{p_end}
{synopt:{opt testexog(varlist)}}
exogenous regressors to be included in the reported tests (optional)
{p_end}
{synopt:{opt clrsims(#)}}
number of reps for simulating distribution of the CLR statistic (default=use closed-form method if available, use 10,000 reps if not);
{opt clrsims(0)} means do not simulate (use closed-form or report missing value)
{p_end}
{synopt:{opt small}}
makes small-sample adjustment
{p_end}
{synopt:{opt eq(diff/lev/sys)}}
({it:xtabond2} only) use (only equation in differences / only equation in levels / both) for weak-ID-robust tests
{p_end}
{synoptline}
{p2colreset}{...}
{synoptset 20}{...}
{synopthdr:ci_options}
{synoptline}
{synopt:{opt usegrid}}
construct grid for confidence-interval estimation, graphs etc.
{p_end}
{synopt:{opt noci}}
suppress reporting/calculation of confidence intervals
{p_end}
{synopt:{opt project(varlist)}}
endogenous regressors for projection-based confidence intervals (multiple-endogenous regressor case only)
{p_end}
{synopt:{opt project2(var1 var2)}}
2 endogenous regressors for projection-based confidence set (multiple-endogenous regressor case only)
{p_end}
{synopt:{opt gridpoints(numlist)}}
number(s) of grid points (in dimensions corresponding to endogenous regressors)
{p_end}
{synopt:{opt gridmult(#)}}
multiplier of Wald confidence-interval for grid
{p_end}
{synopt:{opt gridmin(numlist)}}
lower limit(s) for grid search (in dimensions corresponding to endogenous regressors)
{p_end}
{synopt:{opt gridmax(numlist)}}
upper limit(s) for grid search (in dimensions corresponding to endogenous regressors)
{p_end}
{synopt:{opt grid(numlist [ | numlist [ | numlist ... ] ] )}}
explicit list(s) of grid points (in dimensions corresponding to endogenous regressors); if multiple lists, separated by "|"
{p_end}
{synopt:{opt level(numlist)}}
default confidence level(s) for confidence intervals and sets (max 3)
{p_end}
{synopt:{opt arlevel(#)}}
optional confidence level for AR confidence intervals
{p_end}
{synopt:{opt jlevel(#)}}
optional confidence level for J confidence intervals
{p_end}
{synopt:{opt kjlevel(#)}}
(usage 1) optional overall confidence level for K-J confidence intervals and tests
{p_end}
{synopt:{opt kjlevel(#k #j)}}
(usage 2) optional separate confidence levels for K and J in K-J confidence intervals and tests
{p_end}
{synoptline}
{p2colreset}{...}
{synoptset 20}{...}
{synopthdr:graph_options}
{synoptline}
{synopt:{opt graph(namelist)}}
graph test rejection probabilities and confidence intervals (ar, clr, k, j, kj, wald)
{p_end}
{synopt:{opt graphxrange(numlist)}}
lower and upper limits of x axis for graph of test statistics
(option unavailable for 2-endogenous regressor case)
{p_end}
{synopt:{opt graphopt(string)}}
graph options to pass to graph command
(applies to combined contour/surface graph in 2-endogenous regressor case)
{p_end}
{synoptline}
{col 7}{it:2-endogenous-regressor usage}
{synopt:{opt contouropt(string)}}
graph options to pass to contour graph command
{p_end}
{synopt:{opt surfaceopt(string)}}
graph options to pass to surface graph command
{p_end}
{synopt:{opt contouronly}}
do contour plot (confidence set) only; suppress surface plot
{p_end}
{synopt:{opt surfaceonly}}
do surface plot (rejection probability surface) only; suppress contour plot
{p_end}
{synoptline}
{p2colreset}{...}
{synoptset 20}{...}
{synopthdr:misc_options}
{synoptline}
{synopt:{opt estadd}[({it:prefix})]}
add main {it:weakiv} results (scalars and macros) to model estimated by IV for Wald tests;
estimation results obtained from previous call to
{helpb ivregress}, {helpb ivreg2}, {helpb ivreg2h},
{helpb xtivreg}, {helpb xtivreg2}, {helpb xtabond2},
{helpb ivprobit} or {helpb ivtobit}
remain in memory with {it:weakiv} results added;
{it:prefix} is an optional prefix added to names of scalars and macros
(not available with replay syntax)
{p_end}
{synopt:{cmdab:estuse:wald(name)}}
obtain IV model from stored previous estimation by
{helpb ivregress}, {helpb ivreg2}, {helpb ivreg2h},
{helpb xtivreg}, {helpb xtivreg2}, {helpb xtabond2},
{helpb ivprobit} or {helpb ivtobit}
{p_end}
{synopt:{cmdab:eststore:wald(name)}}
store IV model used for Wald tests
(estimated by {helpb ivregress}, {helpb ivreg2}, {helpb ivreg2h},
{helpb xtivreg}, {helpb xtivreg2}, {helpb xtabond2},
{helpb ivprobit} or {helpb ivtobit})
under {it:name}
{p_end}
{synopt:{cmdab:display:wald}}
display model estimated by IV for Wald tests
(not available with replay syntax)
{p_end}
{synoptline}
{p2colreset}{...}
{title:Contents}
{phang}{help weakiv##description:Description}{p_end}
{phang}{help weakiv##tests:Tests, confidence intervals, rejection probabilities}{p_end}
{phang}{help weakiv##interpretation1:Summary interpretations of {it:weakiv} output: 1 endogenous regressor}{p_end}
{phang}{help weakiv##interpretation2:Summary interpretations of {it:weakiv} output: 2 endogenous regressors}{p_end}
{phang}{help weakiv##project:Summary interpretations of {it:weakiv} output: Projection-based inference for 2+ endogenous regressors}{p_end}
{phang}{help weakiv##options:Options}{p_end}
{p 8}{help weakiv##model_options:Model options}{p_end}
{p 8}{help weakiv##test_options:Test options}{p_end}
{p 8}{help weakiv##ci_options:Confidence interval estimation}{p_end}
{p 8}{help weakiv##graph_options:Graphing options}{p_end}
{p 8}{help weakiv##misc_options:Miscellaneous options}{p_end}
{phang}{help weakiv##examples:Examples}{p_end}
{p 8}{help weakiv##gen_examples:General use}{p_end}
{p 8}{help weakiv##ci_examples:Confidence interval and grid examples}{p_end}
{p 8}{help weakiv##graph_examples:Graphing examples}{p_end}
{p 8}{help weakiv##misc_examples:{it:estadd} option and other miscellaneous examples}{p_end}
{phang}{help weakiv##saved_results:Saved results}{p_end}
{phang}{help weakiv##acknowledgements:Acknowledgements}{p_end}
{phang}{help weakiv##references:References}{p_end}
{phang}{help weakiv##citation:Citation of weakiv}{p_end}
{marker description}{...}
{title:Description}
{pstd}
{opt weakiv} performs a set of tests of the coefficient(s) on the endogenous variable(s)
in an instrumental variables (IV) model,
and constructs confidence sets for these coefficients.
These tests and confidence sets are robust to weak instruments
in the sense that identification of the coefficients is not assumed.
This is in contrast to the traditional IV/GMM estimation methods,
where the validity of tests on estimated coefficients requires the assumption that they are identified.
{pstd}
{opt weakiv} can be used to estimate linear
(including panel fixed effects, first differences, and dynamic panel data),
probit and tobit IV models.
{opt weakiv} supports a range of variance-covariance estimators for linear IV models
including heteroskedastic-, autocorrelation-, and one- and two-way cluster-robust VCEs.
{opt weakiv} also provides graphics options that allow the plotting of
confidence intervals and rejection probabilities (one endogenous regressor),
and confidence regions and rejection surfaces (two endogenous regressors).
{pstd}
{opt weakiv} can estimate models with any number of endogenous regressors.
There are several options available for models with 2 or more endogenous regressors.
(a) The user can specify, using the {opt strong(.)} option,
that some coefficients are strongly identified,
in which case {opt weakiv} will report tests for the
weakly-identified subset of coefficients.
(b) The {opt project(.)} option requests the reporting
of conservative projection-based confidence intervals
for the specified endogenous regressors.
(c) The {opt project2(.)} option requests the construction of
2-dimensional projection-based confidence sets
for the 2 specified endogenous regressors.
These 2-D confidence sets can be graphed using the {opt graph(.)} option.
(d) The {opt subset(.)} option specifies that
only the listed subset of endogenous regressors is tested.
The subset test is available only for i.i.d. linear models (including linear panel models)
and only the subset AR test is available.
{pstd}
For estimations with 1 weakly-identified coefficient,
{opt weakiv} reports tests of the specified null,
confidence intervals, and provides graphing options.
For estimations with 2 weakly-identified coefficients,
{opt weakiv} reports tests of the specified null
and provides graphing options for confidence sets and rejection surfaces.
For estimations with more than 2 weakly-identified coefficients,
{opt weakiv} reports only tests of the specified null hypothesis.
The default null is that all weakly-identified coefficients are zero.
{pstd}
{opt weakiv} can be used either as a standalone estimator
where the user provides the specification of the model,
or after previous IV/GMM estimation by {helpb ivregress}, {helpb ivreg2}, {helpb ivreg2h},
{helpb xtivreg}, {helpb xtivreg2}, {helpb xtabond2}, {helpb ivprobit} or {helpb ivtobit}.
{pstd}
When used as a standalone estimator,
{opt weakiv} works by calling {helpb ivregress}, {helpb ivreg2}, {helpb ivreg2h},
{helpb xtivreg}, {helpb xtivreg2}, {helpb xtabond2}, {helpb ivprobit} or {helpb ivtobit}
depending on what the user has specified as {it:iv_cmd}.
{opt weakiv} passes all user-specified model estimation options
to the estimation command:
variable lists, VCE specification, estimation method, etc.
{pstd}
When used with a model previously estimated by
{helpb ivregress}, {helpb ivreg2}, {helpb ivreg2h},
{helpb xtivreg}, {helpb xtivreg2}, {helpb xtabond2},
{helpb ivprobit} or {helpb ivtobit},
{opt weakiv} obtains the model specification from the previous estimation.
This is either the model currently in memory,
or the stored model provided by the user in {opt estusewald(name)}.
{pstd}
{opt weakiv} also supports Stata {it:replay} syntax.
If the {opt weakiv} results are the current estimation in memory,
{opt weakiv} with no model specified will replay them.
This can be used to tweak the graph options
without {opt weakiv} having to recalculate the full set of estimation results (see below).
The {opt project(.)} and {opt project2(.)} options are also available with {it:replay} syntax.
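{pstd}
The three usage modes described above can be sketched as follows.
This is a minimal illustration, not a recommended specification:
it assumes {opt weakiv} and {helpb avar} are installed,
uses the {cmd:auto} dataset shipped with Stata purely as stand-in data,
and the choice of {cmd:mpg} as the endogenous regressor
with {cmd:weight} and {cmd:length} as instruments is hypothetical.
{p_end}

```stata
* Illustration only: stand-in data, not a sensible economic model.
sysuse auto, clear

* (1) Standalone: weakiv calls ivregress itself with the given specification.
weakiv ivregress 2sls price (mpg = weight length)

* (2) Postestimation: estimate first, then let weakiv pick up the model in memory.
*     usegrid builds a grid so that graphs can be drawn later.
ivregress 2sls price (mpg = weight length)
weakiv, usegrid

* (3) Replay: the weakiv results are now the current estimates, so this call
*     reuses them and only changes the graph options.
weakiv, graph(ar)
```

{pstd}
Note that once {opt weakiv} has run, its results replace the IV results in memory
(unless {opt estadd} is used), so a second call without a model is treated as a replay
and accepts only the options listed under the replay syntax.
{p_end}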
{pstd}
{opt weakiv} requires {helpb avar} (Baum and Schaffer 2013) to be installed.
The graphing options for the 2-endogenous-regressor case require Stata 12 or higher
for the contour plots of confidence regions,
and require version 1.06 or higher of {helpb surface} (Mander 2005)
for the 3-D plots of rejection surfaces.
{opt weakiv} will prompt the user for installation of
{helpb avar} and {helpb surface} if necessary.
{helpb avar} is an essential component and {opt weakiv} will not run without it.
Neither {helpb graph contour} nor {helpb surface} is an essential component;
{opt weakiv} will run but will not provide the corresponding graphs.
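{pstd}
A minimal sketch of installing the required and optional components by hand,
assuming that all three packages are distributed through the SSC archive
(if they are obtained from elsewhere, the install commands will differ):
{p_end}

```stata
* Required: weakiv will not run without avar.
ssc install avar, replace
ssc install weakiv, replace

* Optional: only needed for 3-D rejection-surface plots (version 1.06 or higher).
ssc install surface, replace

* Confirm which versions were found on the ado-path.
which avar
which weakiv
```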
{pstd}
Estimator notes:
{pstd}
Except where noted below,
{opt weakiv} supports the variance-covariance estimation options available
with {helpb ivregress}, {helpb ivreg2}, {helpb ivreg2h}, {helpb xtivreg}, {helpb xtabond2} and {helpb xtivreg2}.
Weights that are supported by each IV command are also supported by {opt weakiv}.
{p2col 5 16 17 0: {helpb ivtobit}}
Only variance-covariance estimation options that assume homoskedasticity are supported.
{p_end}
{p2col 5 16 17 0: {helpb ivprobit}}
The {opt twostep} option (Newey's (1987) two-step estimator) is required.
Only variance-covariance estimation options that assume homoskedasticity are supported.
{p_end}
{p2col 5 16 17 0: {helpb ivreg2h}}
(IV estimation using heteroskedasticity-based instruments)
The {opt gen} option of {helpb ivreg2h} is required
in order to generate the new instruments as variables.
{p_end}
{p2col 5 16 17 0: {helpb xtivreg}}
Only the fixed-effects and first-differences estimators are supported.
{p_end}
{p2col 5 16 17 0: {helpb xtabond2}}
If {opt weakiv} is used as a postestimation command after {helpb xtabond2},
the preceding estimation by {helpb xtabond2} requires the {opt svmat}
option to save the {helpb xtabond2}-transformed data as {it:e(.)} matrices.
The {opt svmat} option is {it:not} required
if {opt weakiv} is used for standalone estimation ({opt weakiv xtabond2 ...}).
Support for {helpb xtabond2} requires matafavor to be set for speed;
to do this, type or click
{stata mata mata set matafavor speed : mata mata set matafavor speed}.
{p_end}
{marker tests}{...}
{title:Tests, confidence intervals, rejection probabilities}
{pstd}
{opt weakiv} calculates Lagrange multiplier (LM) or minimum distance (MD)
versions of weak-instrument-robust tests of the coefficient
on the endogenous variable {it:beta} in an instrumental variables (IV) estimation.
In an exactly-identified model where the number of instruments
equals the number of endogenous regressors,
it reports the Anderson-Rubin ({it:AR}) test statistic.
When the IV model contains more instruments than endogenous regressors
(the model is overidentified),
{opt weakiv} also conducts the conditional likelihood ratio ({it:CLR}) test,
the Lagrange multiplier {it:K} test, the {it:J} overidentification test,
and a combination of the {it:K} and overidentification tests ({it:K-J}).
{pstd}
The default behavior of {opt weakiv} is to report LM versions of these tests for linear models,
and MD versions for the IV probit and IV tobit models.
The MD versions of these tests can be requested by the {opt md} option;
for linear models, these are equivalent to Wald-type tests.
The LM versions of these tests are not available for IV probit/tobit models.
In the current implementation of {opt weakiv},
the {it:CLR} test is available for the single-endogenous-regressor case only.
For reference, {opt weakiv} also reports a Wald test
using the relevant traditional IV parameter and VCE estimators;
this Wald test is identical to what would be obtained by
standard estimation using
{helpb ivregress}, {helpb ivreg2}, {helpb ivreg2h},
{helpb xtivreg}, {helpb xtivreg2}, {helpb xtabond2},
{helpb ivprobit} or {helpb ivtobit}.
{pstd}
The {it:AR} test is a joint test of the structural parameter
({it:beta=b0}, where {it:beta} is the coefficient on the endogenous regressor)
and the exogeneity of the instruments
({it:E(Zu)=0}, where {it:Z} are the instruments
and {it:u} is the disturbance in the structural equation).
The {it:AR} statistic can be decomposed into the {it:K} statistic
(which tests only {it:H0:beta=b0},
assuming the exogeneity conditions {it:E(Zu)=0} are satisfied)
and the {it:J} statistic
(which tests only {it:H0:E(Zu)=0},
assuming that {it:beta=b0} is true).
This {it:J} statistic is evaluated at the null hypothesis,
as opposed to the Hansen {it:J} statistic from GMM estimation,
which is evaluated at the parameter estimate.
{pstd}
The {it:CLR} test is a related approach to testing {it:H0:beta=b0}.
It has good power properties,
and in particular is the most powerful test for the linear model under homoskedasticity
(within a class of invariant similar tests).
An important advantage of the {it:CLR} test over the {it:K} test is that
the {it:K} test can lose power in some regions of the parameter space
when the objective function has a local extremum or inflection point;
the {it:CLR} test does not suffer from this problem.
The {it:CLR} test is a function of a rank statistic {it:rk}.
For the case of more than one endogenous regressor,
there are several such rank tests available;
{opt weakiv} employs the SVD-based test of Kleibergen and Paap (2006)
(see e.g. {helpb ranktest}),
which has the computational advantage of a closed-form solution.
The rank statistic {it:rk} can also be interpreted
as a test of underidentification of the model (see Kleibergen 2005);
under the null hypothesis that the model is underidentified,
{it:rk} has a chi-squared distribution.
{marker underid}{...}
{pstd}
The {it:CLR} test statistic has a non-standard distribution.
For the case of i.i.d. linear models with a single weakly-identified endogenous regressor,
{opt weakiv} uses the fast and accurate algorithm
implemented by Mikusheva and Poi (2006).
The default behavior of {opt weakiv} for non-i.i.d. and nonlinear models
with a single weakly-identified endogenous regressor
is to use the same algorithm to obtain p-values for the {it:CLR} test;
although this is not the correct p-value function for these cases,
the simulations by Finlay and Magnusson (2009) suggest it provides
a good approximation.
For all models with multiple endogenous regressors,
{opt weakiv} obtains the p-value by simulation;
the seed for the random number generator is temporarily set to the value 12345
so that the resulting p-values are replicable.
The default number of simulations is 10,000;
this can be altered using the {opt clrsims(#)} option.
This option can also be used to override the default behavior
of the i.i.d.-linear-model algorithm
with single-endogenous regressor models.
A larger number of simulations will give a more accurate p-value
but can slow execution, especially in grid searches.
The simulation method can be turned off completely
by specifying {opt clrsims(0)}.
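{pstd}
As a hedged illustration of the {opt clrsims(#)} option described above,
the following sketch requests a simulated {it:CLR} p-value in a non-i.i.d. model
instead of the default closed-form approximation.
The data, the variable roles and the value 2000 are assumptions chosen for illustration.
{p_end}

```stata
* Illustration only: stand-in data, not a sensible economic model.
sysuse auto, clear

* Heteroskedasticity-robust IV estimation (a non-i.i.d. case).
ivregress 2sls price (mpg = weight length), vce(robust)

* Simulate the CLR distribution with 2,000 draws rather than using the
* i.i.d.-linear-model algorithm; more draws are more accurate but slower.
weakiv, clrsims(2000)
```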
{pstd}
The {it:K-J} test combines the {it:K} and {it:J} statistics to jointly test
the structural parameter and the exogeneity of the instruments.
It is more efficient than the {it:AR} test and allows different weights or test levels
to be put on the parameter and overidentification hypotheses.
Unlike the {it:K} test,
the {it:K-J} test does not suffer from the problem of spurious power losses.
To perform the {it:K-J} test, the researcher specifies
the significance levels {it:alpha_K} and {it:alpha_J} for
the {it:K} and {it:J} statistics.
Because the {it:K} and {it:J} tests are independent,
the null of the {it:K-J} test is rejected
if either {it:p_K}<{it:alpha_K}
or {it:p_J}<{it:alpha_J},
where {it:p_K} and {it:p_J} are the {it:K} and {it:J} p-values, respectively.
The overall size of the {it:K-J} test is given by (1-(1-{it:alpha_K})*(1-{it:alpha_J})).
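{pstd}
The size formula above can be checked with a few lines of scalar arithmetic.
The values of {it:alpha_K} and {it:alpha_J} below are chosen for illustration only.
{p_end}

```stata
* Component significance levels (illustrative values).
scalar alpha_K = 0.04
scalar alpha_J = 0.01

* Overall size of the K-J test: 1 - (1 - alpha_K)*(1 - alpha_J).
scalar size_KJ = 1 - (1 - alpha_K)*(1 - alpha_J)
display "Overall K-J test size: " %6.4f size_KJ
```

{pstd}
With these values the overall size is 0.0496, slightly below the sum 0.04 + 0.01.
{p_end}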
{pstd}
The default behavior of {opt weakiv} is for the user to choose
the overall size of the {it:K-J} test
and the weights {it:kwt} and (1-{it:kwt}) to put on
the {it:K} and {it:J} components, respectively.
Alternatively, the user may specify
the separate significance levels {it:alpha_K} and {it:alpha_J},
from which the overall {it:K-J} test size and weights are calculated.
The p-value function for the {it:K-J} test is
{it:p=min(p1,p2)}, where
{it:p1=(p_K/kwt)*(1-(1-kwt)*p_K)} and
{it:p2=(p_J/(1-kwt))*(1-kwt*p_J)}.
For a high confidence level {it:L%} (e.g., 95%),
this is approximately equivalent to
rejecting the null at the {it:(100-L)%} significance level if
{it:K} is greater than the {it:kwt*(100-L)%} critical value or
{it:J} is greater than the {it:(1-kwt)*(100-L)%} critical value.
For example, if {it:kwt}=0.8 and {it:(100-L)}=5%,
then the test rejects if the p-value for the {it:K} test is below 4%
or the p-value for the {it:J} test is below 1%
(because (1-(1-0.04)*(1-0.01))=0.0496 which is approximately 0.05).
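{pstd}
A worked sketch of the {it:K-J} p-value function given above,
using hypothetical component p-values {it:p_K}=0.03 and {it:p_J}=0.20 with {it:kwt}=0.8,
followed by a call that passes the same weight to {opt weakiv}.
The data and variable roles in the {opt weakiv} call are stand-ins for illustration.
{p_end}

```stata
* Hypothetical inputs.
scalar kwt = 0.8
scalar p_K = 0.03
scalar p_J = 0.20

* p = min(p1, p2) as defined in the text.
scalar p1 = (p_K/kwt)*(1 - (1 - kwt)*p_K)
scalar p2 = (p_J/(1 - kwt))*(1 - kwt*p_J)
scalar p_KJ = min(p1, p2)
display "K-J p-value: " %8.6f p_KJ

* The same weight passed to weakiv (overidentified model: 2 instruments, 1 regressor).
sysuse auto, clear
ivregress 2sls price (mpg = weight length)
weakiv, kwt(0.8)
```

{pstd}
Here {it:p1} is about 0.0373 and {it:p2} is 0.84, so the {it:K-J} p-value is about 0.0373
and the null is rejected at the 5% level because the {it:K} component falls below its 4% threshold.
{p_end}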
{marker closedform}{...}
{pstd}
For the single-endogenous-regressor case,
{opt weakiv} also inverts these tests to obtain and report
weak-instrument-robust confidence intervals and
(with the {opt graph(.)} option),
the corresponding rejection probabilities.
In a graph of rejection probabilities,
an L% confidence interval is readily visualized
as the range of values for {it:b0}
such that the rejection probability for the statistic
lies below a horizontal line drawn at L%.
In the case of estimation of the i.i.d. linear model,
{opt weakiv} uses a closed-form solution for these confidence intervals.
Closed-form solutions for the {it:J} and {it:K-J} tests are unavailable;
to obtain confidence intervals for these tests,
specify the {opt usegrid} option.
In all other specifications (nonlinear or non-i.i.d.),
{opt weakiv} estimates confidence intervals by grid search.
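{pstd}
A minimal sketch of requesting grid-based confidence intervals,
which also makes the {it:J} and {it:K-J} intervals available.
The data, variable roles and the number of grid points are assumptions for illustration;
a finer grid gives more precise interval endpoints at the cost of speed.
{p_end}

```stata
* Illustration only: stand-in data, not a sensible economic model.
sysuse auto, clear
ivregress 2sls price (mpg = weight length)

* Invert the tests over a grid of 400 hypothesized values for the coefficient on mpg.
weakiv, usegrid gridpoints(400)
```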
{pstd}
For the 2-endogenous-regressors case,
{opt weakiv} uses a grid search and graphical methods
to report the corresponding confidence regions
and rejection probabilities.
In this case, the rejection probabilities form a 3-D rejection surface
where {it:beta1}, the coefficient on endogenous regressor 1, is plotted against the x-axis,
{it:beta2}, the coefficient on endogenous regressor 2, is plotted against the y-axis,
and the rejection probability is plotted against the z-axis (vertical axis).
An L% confidence region is the set of values
for {it:b1} and {it:b2} such that
the null hypothesis {it:H0:beta1=b1 and beta2=b2} cannot be rejected.
In a 3-D graph of the rejection probability surface,
an L% confidence region is readily visualized
as the range of values for {it:b1} and {it:b2}
such that the rejection surface lies below a horizontal plane drawn at L%.
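{pstd}
For the 2-endogenous-regressor case, a hedged sketch of a grid search with a contour plot
of the {it:AR} confidence region might look as follows.
The second endogenous regressor, the extra instrument and the 25-by-25 grid are hypothetical choices;
{opt contouronly} is used so that the example does not depend on {helpb surface} being installed.
{p_end}

```stata
* Illustration only: stand-in data, not a sensible economic model.
sysuse auto, clear

* Two endogenous regressors (mpg, headroom) and three excluded instruments.
ivregress 2sls price (mpg headroom = weight length turn)

* 25 grid points in each dimension; plot the AR confidence region only.
weakiv, usegrid gridpoints(25 25) graph(ar) contouronly
```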
{pstd}
{opt weakiv} can also accommodate models with multiple endogenous regressors,
where some coefficients are weakly identified and some are strongly identified.
{opt weakiv} supports this via the {opt strong(.)}, {opt subset(.)},
{opt project(.)} and {opt project2(.)} options.
{pstd}
The {opt strong(.)} option is available for linear models only.
In effect, the strongly-identified coefficients are removed from the testing,
and {opt weakiv} reports tests, confidence intervals, graphs etc.
for the remaining weakly-identified coefficient(s).
Testing in this case follows the method of Kleibergen (2004);
see Mikusheva (2013) for a concise description.
Briefly, the method uses the standard formulae for weak-identification-robust statistics,
but for the strongly-identified coefficients replaces hypothesized values
with estimates obtained under the null from an efficient estimator.
The default efficient estimators are IV in the i.i.d. case
and 2-step efficient GMM in the non-i.i.d. case;
the {opt cuestrong} option replaces these with
the LIML and CUE estimators, respectively
(note that the CUE estimator requires numerical optimization
and grid searches in particular will be slow with this option).
To obtain confidence intervals and rejection probabilities,
the procedure is repeated for each hypothesized value
of the weakly-identified coefficient(s).
Note that if a specific null hypothesis is to be tested
using the {opt null(numlist)} option,
the parameter values in {it:numlist} correspond to
the weakly-identified coefficients only.
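{pstd}
A minimal sketch of the {opt strong(.)} option,
treating one of two endogenous regressors as strongly identified.
Which regressor is strongly identified is an assumption made here for illustration only;
in practice this is a substantive judgment.
{p_end}

```stata
* Illustration only: stand-in data, not a sensible economic model.
sysuse auto, clear
ivregress 2sls price (mpg headroom = weight length turn)

* headroom is assumed strongly identified, so tests are reported for mpg only;
* null(0) therefore refers to the single weakly-identified coefficient on mpg.
weakiv, strong(headroom) null(0)
```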
{pstd}
{marker SSAR}{...}
The {opt subset(.)} option implements the subset AR test of Guggenberger et al. (2012).
They show that a weak-instrument-robust AR test
of a subset of endogenous regressors can be performed by,
in effect, using LIML to estimate the coefficients on the regressors not in the subset.
This test is available for i.i.d. linear models only.
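{pstd}
A hedged sketch of the subset {it:AR} test in an i.i.d. linear model with two endogenous regressors.
The data and the choice of {cmd:mpg} as the tested subset are illustrative assumptions.
{p_end}

```stata
* Illustration only: stand-in data, not a sensible economic model.
sysuse auto, clear

* Default (i.i.d.) VCE, as required for the subset AR test.
ivregress 2sls price (mpg headroom = weight length turn)

* Test only the coefficient on mpg; headroom is left untested.
weakiv, subset(mpg)
```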
{pstd}
The {opt project(.)} option implements projection-based
confidence intervals for the listed weakly-identified endogenous variables.
A projection-based test for a coefficient {it:H0:beta=b0} rejects the null
if the test statistic exceeds the test critical value
for every configuration of hypothesized coefficients
on all the other weakly-identified regressors.
It is conservative in the sense that it has asymptotic size
less than or equal to nominal size.
Intuitively, whereas a standard correctly-sized test
at the 5% significance level
will commit a Type I error 5% of the time,
a conservative projection-based test
will commit a Type I error at most 5% of the time.
See Guggenberger et al. (2012) for a discussion and references.
The projection-based confidence intervals implemented by
{opt weakiv} require grid search.
To get an accurate projection-based CI for a variable,
the user should specify a suitably large number of grid points
in that dimension.
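{pstd}
A minimal sketch of projection-based confidence intervals for both endogenous regressors.
The data, variable roles and the 40-by-40 grid are assumptions for illustration;
a coarser grid runs faster but gives less accurate interval endpoints.
{p_end}

```stata
* Illustration only: stand-in data, not a sensible economic model.
sysuse auto, clear
ivregress 2sls price (mpg headroom = weight length turn)

* Grid search over both coefficients, then project onto each dimension.
weakiv, usegrid gridpoints(40 40) project(mpg headroom)
```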
{pstd}
For models with 3 or more weakly-identified endogenous regressors,
the {opt project2(var1 var2)} option implements projection-based
confidence sets for the 2 listed weakly-identified endogenous variables.
These confidence sets can be graphed with the {opt graph(.)} option.
{pstd}
To include exogenous regressors in the hypotheses tested,
the {opt testexog(.)} option can be used.
This also allows the user to construct and graph confidence sets
where one regressor is endogenous and one is exogenous.
Projection-based CIs for exogenous regressors can also be calculated.
These exogenous regressors are treated in the same way
as the possibly-weakly-identified endogenous regressors;
the only difference with the tests discussed above is that
an exogenous variable that is a regressor
and the coefficient for which appears in the null hypothesis
also appears in the orthogonality conditions {it:E(Zu)=0}.
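{pstd}
A hedged sketch of including an exogenous regressor in the tested hypothesis with {opt testexog(.)}.
The data and the treatment of {cmd:foreign} as an exogenous regressor are illustrative assumptions.
{p_end}

```stata
* Illustration only: stand-in data, not a sensible economic model.
sysuse auto, clear

* foreign is an included exogenous regressor; mpg is endogenous.
ivregress 2sls price foreign (mpg = weight length)

* The reported tests now cover the coefficients on both mpg and foreign.
weakiv, testexog(foreign)
```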
{pstd}
The {it:K} and {it:CLR} confidence intervals and sets
are centered around the point estimates from the CUE
(continuously-updated GMM) estimator;
in the i.i.d. case, the CUE estimator reduces to the LIML estimator.
The CUE estimates cannot be used directly for inference,
but provide a useful reference point.
The {opt cuepoint} option requests that CUE estimates
for the weakly-identified endogenous regressors are reported
and included as points in the grid search
(if a grid is used, and only if the CUE estimates lie within the grid limits).
This option is primarily for reporting and will have no impact
on the weak-identification-robust tests.
It can be useful in graphing because
the {it:K} and {it:CLR} statistics are zero at the CUE point estimates
and the {opt cuepoint} option will guarantee
that this is visible in graphs of rejection probabilities and surfaces.
The option is available for linear models only.
The default CUE is the standard GMM CUE;
if Wald/MD tests are requested instead of the default LM tests,
the point estimates are those of the CUE-MD estimator
described in Magnusson (2010).
In both cases, the exogenous regressors are partialled out
before the CUE estimates are calculated.
(NB: In exactly-identified models,
the CUE, LIML and IV estimators coincide.)
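{pstd}
As a minimal sketch of the {opt cuepoint} option described above,
the following requests a grid search that reports the CUE point estimate
and includes it as a grid point.
It assumes that {opt weakiv} and the packages it depends on are installed;
the auto dataset and the choice of variables are purely illustrative
and are not taken from the original documentation.

```stata
* Illustrative only: an overidentified linear IV model on the auto data
* (1 endogenous regressor, 3 excluded instruments; variables chosen arbitrarily)
sysuse auto, clear

* usegrid forces grid-based test inversion; cuepoint reports the CUE estimate
* and adds it to the grid if it lies within the grid limits
weakiv ivregress 2sls price weight (mpg = headroom trunk turn), usegrid cuepoint
```

{pstd}
Because the model is i.i.d. and linear, the CUE estimate reported here
should coincide with the LIML estimate.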
{pstd}
{marker method}{...}
The LM and MD weak-identification-robust tests implemented by {opt weakiv} are due to
Anderson and Rubin (1949),
Kleibergen (2002, 2005), Moreira (2003),
Magnusson (2010) and Guggenberger et al. (2012).
For the linear models supported by {opt weakiv},
including the panel data fixed effects, first differences and dynamic panel data models,
the MD versions are equivalent to Wald versions of these tests.
In the construction of the LM versions of the tests,
any exogenous regressors are first partialled out.
For further discussion of these tests,
see Finlay and Magnusson (2009),
Kleibergen (2002, 2005),
Mikusheva (2013),
Moreira (2003),
Chernozhukov and Hansen (2005, 2008),
Magnusson (2010),
and the references therein.
{marker interpretation1}{...}
{title:Summary interpretations of {it:weakiv} output: 1 endogenous regressor}
{pstd}
The following summarizes what the various statistics assume and test
in the single-endogenous-regressor case, and how to interpret the results.
The interpretations are similar for the multiple-endogenous-regressor case
when only one coefficient is weakly identified.
The structural parameter, {it:beta}, is the coefficient on the endogenous regressor;
{it:b0} is a hypothesized value for {it:beta};
the excluded instruments are {it:Z};
the assumption that the instruments are exogenous is {it:E(Zu)=0}.
Roughly speaking,
a well-specified model is one in which
{it:H0:beta=b0} cannot be rejected for a narrow range of hypothesized values {it:b0}
and the assumption of instrument exogeneity
cannot be rejected for a wide range of hypothesized values {it:b0}
(i.e., the exogeneity assumption is generally satisfied).
{marker cset}{...}
{pstd}
An {it:L%} confidence interval is
the range of {it:b0} such that the rejection probability (=1-{it:pvalue}) is below {it:L%}.
Users can use the {opt graph(.)} option to plot rejection probabilities.
The confidence intervals reported by {opt weakiv} can, in overidentified models,
be empty, disjoint (composed of unconnected segments), open-ended,
or cover the entire range of possible values for {it:beta}.
An empty confidence interval (null set) means there is no possible value {it:b0}
that is consistent with the model;
this is an indication of misspecification when {it:L%} is fairly high.
Disjoint confidence intervals arise when the plot of the rejection probability
dips below {it:L%} in more than one range;
an example is when the {it:K} statistic has inflection points or local minima
that cause spurious power losses
(inspection of a graph of the {it:K} rejection probability is a way of detecting this).
Open-ended confidence intervals commonly arise when the grid
does not extend far enough to capture the point where the rejection probability
crosses above the {it:L%} line.
Interpretation of an {it:L%} confidence interval that covers
the entire grid range of possible values for {it:beta}
depends on the null hypothesis tested:
if the null hypothesis is {it:H0:beta=b0},
it suggests the parameter {it:beta} is poorly identified or unidentified;
if the null hypothesis is {it:H0:E(Zu)=0},
it suggests the exogeneity conditions are generally satisfied.
{pstd}
Summary of specific tests:
{marker CLR}{...}
{p2col 5 11 12 0: {it:CLR}}
The null hypothesis is {it:H0:beta=b0}.
The exogeneity conditions {it:E(Zu)=0} are assumed to be satisfied.
An {it:L%} confidence interval is the set of all values {it:b0} such that
the null hypothesis {it:beta=b0} cannot be rejected at the {it:(100-L)%} significance level.
{p_end}
{marker K}{...}
{p2col 5 11 12 0: {it:K}}
The null hypothesis is {it:H0:beta=b0}.
The exogeneity conditions {it:E(Zu)=0} are assumed to be satisfied.
An {it:L%} confidence interval is the set of all values {it:b0} such that
the null hypothesis {it:beta=b0} cannot be rejected at the {it:(100-L)%} significance level.
{p_end}
{marker J}{...}
{p2col 5 11 12 0: {it:J}}
The null hypothesis is {it:H0:E(Zu)=0}.
The structural parameter is assumed to be {it:beta=b0}.
An {it:L%} confidence interval is the set of all values {it:b0} such that
the null hypothesis {it:E(Zu)=0} cannot be rejected at the {it:(100-L)%} significance level.
Note the differences in the null hypothesis
and interpretation of confidence intervals
vs. the {it:CLR} and {it:K} statistics.
{p_end}
{marker K-J}{...}
{p2col 5 11 12 0: {it:K-J}}
(usage 1, specifying overall test level and weights on {it:K} and {it:J})
The null hypothesis is
{it:H0:beta=b0} {it:and} {it:H0:E(Zu)=0}.
For a test at significance {it:(100-L)%}
with weights on the {it:K} and {it:J} tests
of {it:kwt} and {it:(1-kwt)}, respectively,
the null is rejected if
{it:either}
(a) {it:K} is greater than the critical value for
a test at the {it:kwt*(100-L)%} significance level,
{it:or}
(b) {it:J} is greater than the critical value for
a test at the {it:(1-kwt)*(100-L)%} significance level.
(This interpretation is an approximation;
see the text above for the exact definition.)
An {it:L%} confidence interval is the set of all values {it:b0} such that
the composite null hypothesis cannot be rejected at the {it:(100-L)%} significance level.
{p_end}
{marker K-J}{...}
{p2col 5 11 12 0: {it:K-J}}
(usage 2, specifying test levels separately for {it:K} and {it:J})
The null hypothesis is
{it:H0:beta=b0} {it:and} {it:H0:E(Zu)=0}.
For a test at significance {it:alpha_K} for the {it:K} test
and {it:alpha_J} for the {it:J} test,
the null is rejected if
{it:either}
(a) {it:K} is greater than the critical value for
a test at the {it:alpha_K} significance level,
{it:or}
(b) {it:J} is greater than the critical value for
a test at the {it:alpha_J} significance level.
The significance level for the overall test is
(1-(1-{it:alpha_K})*(1-{it:alpha_J})).
An {it:L%} confidence interval is the set of all values {it:b0} such that
the composite null hypothesis cannot be rejected at the {it:(100-L)%} significance level.
{p_end}
{marker AR}{...}
{p2col 5 11 12 0: {it:AR}}
The null hypothesis is
{it:H0:beta=b0} {it:and} {it:H0:E(Zu)=0}.
For a test at the {it:(100-L)%} significance level,
a rejection of the null indicates that
{it:either} (a) {it:beta<>b0},
{it:or} (b) {it:E(Zu)<>0}, or both.
An {it:L%} confidence interval is the set of all values {it:b0} such that
the composite null hypothesis cannot be rejected at the {it:(100-L)%} significance level.
{p_end}
{marker Wald}{...}
{p2col 5 11 12 0: {it:Wald}}
The null hypothesis is
{it:H0:beta=b0}.
Identification of {it:beta} in the IV estimation is assumed to be strong.
An {it:L%} confidence interval is the set of all values {it:b0} such that
the null hypothesis cannot be rejected at the {it:(100-L)%} significance level.
{p_end}
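{pstd}
The arithmetic behind the two {it:K-J} usages can be sketched as follows.
The values are hypothetical: {it:L}=95 and {it:kwt}=0.8,
and the chi-squared degrees of freedom (1 for {it:K}, 2 for {it:J})
assume one endogenous regressor and three excluded instruments.
The sketch uses only built-in Stata functions and does not call {opt weakiv}.

```stata
* Hypothetical inputs: 95% confidence level, default weight of 0.8 on K
scalar alpha   = 0.05
scalar kwt     = 0.8

* Usage 1: split the overall level between the K and J tests
scalar alpha_K = kwt*alpha
scalar alpha_J = (1-kwt)*alpha

* Critical values, assuming K ~ chi2(1) and J ~ chi2(2)
display "K critical value: " invchi2tail(1, alpha_K)
display "J critical value: " invchi2tail(2, alpha_J)

* Usage 2 formula for the overall level implied by alpha_K and alpha_J
display "overall level:    " 1-(1-alpha_K)*(1-alpha_J)
```

{pstd}
With these inputs the implied overall level is 0.0496, slightly below 0.05,
which is consistent with the usage 1 description being an approximation.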
{marker interpretation2}{...}
{title:Summary interpretations of {it:weakiv} output: 2 endogenous regressors}
{pstd}
The following summarizes what the various statistics assume and test
in the two-endogenous-regressors case, and how to interpret the results.
The interpretations are similar for the multiple-endogenous-regressor case
when only two coefficients are weakly identified.
The structural parameters, {it:beta1} and {it:beta2},
are the coefficients on the endogenous regressors;
{it:b1} and {it:b2} are hypothesized values for {it:beta1} and {it:beta2}, respectively.
Roughly speaking,
a well-specified model is one in which
{it:H0:beta1=b1 and beta2=b2} cannot be rejected
for a narrow range of hypothesized values {it:b1} and {it:b2}
(i.e., {it:beta1} and {it:beta2} are precisely estimated)
and the assumption of instrument exogeneity
cannot be rejected for a wide range of hypothesized values {it:b1} and {it:b2}
(i.e., the exogeneity assumption is generally satisfied).
{pstd}
A 2-dimensional confidence set is a straightforward extension
of a 1-dimensional confidence interval.
An {it:L%} confidence set is
the range of {it:b1} and {it:b2} such that
the rejection probability (=1-{it:pvalue}) is below {it:L%}.
{opt weakiv} uses graphical methods to report confidence sets and rejection probabilities;
these are specified using the {opt graph(.)} option.
A confidence set is graphed in x-y space as a contour plot
using Stata 12's {helpb graph twoway contour}.
Up to 3 confidence levels can be specified using the {opt levels(.)} option;
these will be plotted as lower/higher contours in the contour plot.
The rejection probability is graphed in x-y-z space using Mander's (2005) {helpb surface};
the contours plotted by {helpb contour} are the contours of this surface.
The confidence sets plotted by {opt weakiv} can, in overidentified models,
be empty, disjoint (composed of unconnected regions), open-ended, etc.
An empty confidence set (null set) means
there is no possible combination of values {it:b1} and {it:b2}
that is consistent with the model;
this is an indication of misspecification when {it:L%} is fairly high.
Disjoint confidence regions arise when the rejection probability surface
dips below {it:L%} in more than one range.
Open-ended confidence regions commonly arise when the grid
does not extend far enough to capture the point where the rejection probability
crosses above the {it:L%} plane.
{marker project}{...}
{title:Summary interpretations of {it:weakiv} output: Projection-based inference for 2+ endogenous regressors}
{pstd}
A standard {it:L%} confidence interval is properly sized if
the probability that it will contain the true value of the parameter beta is {it:L%}.
By contrast, the probability that a projection-based {it:L%} confidence interval
will contain the true value of the parameter beta is {it:at least} {it:L%}.
In this sense the projection-based confidence intervals reported by {opt weakiv} are conservative.
Projection-based intervals for the 2-endogenous-regressor case are easily visualized.
For a 2-dimensional confidence set plotted in x-y space,
the projection-based confidence interval for variable x
is the range of the x axis where some part of the confidence set lies above or beneath;
the projection-based confidence interval for variable y
is the range of the y axis where some part of the confidence set lies to the left or to the right.
See below for an example.
{pstd}
A 2-dimensional projection-based confidence set for two endogenous variables
is an extension of the above:
the probability that an {it:L%} 2-D projection-based confidence set
contains the true values of the parameters beta1 and beta2
is {it:at least} {it:L%}.
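{pstd}
A minimal sketch of requesting projection-based confidence intervals
in a model with 2 endogenous regressors is given below.
It assumes that {opt weakiv} and the packages it depends on are installed;
the auto dataset and the variables are purely illustrative
and are not taken from the original documentation.

```stata
* Illustrative only: 2 endogenous regressors, 4 excluded instruments
sysuse auto, clear

* project(_all) requests conservative projection-based CIs
* for every weakly-identified coefficient; usegrid makes the grid search explicit
weakiv ivregress 2sls price (mpg weight = headroom trunk turn length), usegrid project(_all)
```

{pstd}
Each reported interval is the shadow of the 2-dimensional confidence set
on the corresponding axis, so its coverage is at least the nominal level.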
{marker options}{...}
{title:Options}
{marker model_options}{...}
{dlgtab:Model options (when used for standalone estimation)}
{phang} {it:iv_cmd} specifies the IV estimator to use:
{helpb ivregress}, {helpb ivreg2}, {helpb ivreg2h},
{helpb xtivreg}, {helpb xtivreg2}, {helpb xtabond2}, {helpb ivprobit} or {helpb ivtobit}.
This option is valid only when {opt weakiv}
is used for standalone estimation and the details of the model
are also provided; see {help weakiv##syntax:Syntax} above.
{marker test_options}{...}
{dlgtab:Testing}
{phang} {opt null(numlist)} specifies the null hypothesis for the coefficient on the
endogenous variable(s) in the IV model. The default is
{cmd:null(0)} for 1 weakly-identified coefficient,
{cmd:null(0 0)} for 2 weakly-identified coefficients, etc.
{phang} {opt kwt(#)} is the weight put on the {it:K} test statistic in the {it:K-J} test.
The default is {opt kwt(0.8)}.
It may not be used with the {opt kjlevel(#k #j)} option; see below.
{phang} {opt lm} specifies that LM tests instead of
the default Wald/Minimum Distance tests are reported (linear models only).
{phang} {opt strong(varlist)} specifies that,
in a multiple-endogenous-regressor estimation,
the coefficient(s) on the endogenous regressor(s) in {it:varlist}
is (are) strongly-identified (linear models only).
Tests and graphs are reported for the weakly-identified regressor(s) only.
{phang} {opt cuestrong} requests, for i.i.d. linear models,
that the LIML estimator is used for the strongly-identified coefficients
in preference to the default IV estimator;
and for non-i.i.d. models,
that the CUE estimator is used in preference to
the default 2-step efficient GMM estimator.
The CUE estimator requires numerical optimization methods
and is noticeably slower than 2-step GMM;
this will be particularly noticeable when a grid search is used.
{phang} {opt subset(varlist)} specifies that,
in a multiple-endogenous-regressor estimation,
the weak-identification-robust subset AR test
is reported for the endogenous regressor(s)
in {it:varlist} (i.i.d. linear models only).
{phang} {opt clrsims(#)} specifies the number of draws to be used
in obtaining p-values for the {it:CLR} test by simulation.
The default is to use the linear-i.i.d.-single-endogenous-regressor
algorithm for models with one endogenous variable,
and 10,000 simulations otherwise.
The simulation method can be turned off by specifying {opt clrsims(0)}.
{phang} {opt small} specifies that small-sample adjustments be made when test
statistics are calculated for linear IV estimation.
When used in standalone estimation by {opt weakiv},
the default is not to employ small-sample adjustments.
When used after estimation of linear models,
the default is given by whatever small-sample
adjustment option was chosen in the IV command.
Small-sample adjustments are always made for IV probit and IV tobit estimation.
The default small-sample adjustment is N/(N-L)
where L is the number of exogenous variables (regressors and instruments);
for the fixed effects estimator, L includes the number of fixed effects;
if a cluster-robust VCE is used, the small-sample adjustment is
N_clust/(N_clust-1)*(N-1)/(N-L),
where N_clust is the number of clusters.
{phang} {opt eq(diff/lev/sys)} is specific to estimation by {helpb xtabond2}.
This requests that only the specified equation(s) (differences, levels, or both)
are used for weak-identification-robust testing.
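{pstd}
The small-sample adjustment factors described under {opt small} above
can be evaluated directly.
The sketch below uses hypothetical values for the number of observations,
exogenous variables and clusters, and relies only on built-in Stata commands.

```stata
* Hypothetical values: 500 observations, 10 exogenous variables
* (regressors plus instruments), 50 clusters
scalar N       = 500
scalar L       = 10
scalar N_clust = 50

* Default adjustment: N/(N-L)
display "default adjustment:        " N/(N-L)

* Cluster-robust adjustment: N_clust/(N_clust-1)*(N-1)/(N-L)
display "cluster-robust adjustment: " N_clust/(N_clust-1)*(N-1)/(N-L)
```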
{marker ci_options}{...}
{dlgtab:Confidence interval estimation}
{phang} {opt usegrid} specifies that a grid search is conducted.
This will override the default analytic solution for
constructing confidence intervals for the i.i.d. linear model.
Under the other models, grid-based estimation
is the only available method for constructing confidence sets.
{phang} {opt noci} requests that confidence intervals not be estimated/reported.
Grid-based test inversion can be time-intensive,
so this option can save time if a grid search is not required,
either because confidence intervals are not needed
or because a closed-form solution for confidence intervals is available
(the i.i.d. linear model only).
{phang} {opt project(varlist)} requests reporting,
for a multiple-endogenous-regressor estimation,
conservative projection-based confidence intervals
for the endogenous regressor(s) in {it:varlist}.
{opt project(_all)} requests reporting these
for all weakly-identified coefficients.
{phang} {opt project2(var1 var2)} requests construction,
for a multiple-endogenous-regressor estimation,
of a 2-dimensional projection-based confidence set
for endogenous regressors {it:var1} and {it:var2}.
This confidence set can be graphed using the {opt graph(.)} option.
{marker gridpoints}{...}
{phang} {opt gridpoints(numlist)} specifies the number of equally spaced values over
which to calculate the confidence sets.
The default number of gridpoints is
100, 25, 11, 7 and 5
for the cases of 1, 2, 3, 4 and 5 endogenous regressors, respectively.
If more than 5 endogenous regressors are specified,
the number of gridpoints must be explicitly provided by the user,
e.g., for 6 endogenous regressors the user can specify {cmd:gridpoints(5 5 5 5 5 5)}.
The number of gridpoints can easily get large in this case;
in this example, the total number of grid points searched is 5^6 = 15,625.
A large number of grid points will increase the required computation time,
but a greater number of grid points
will improve the precision of both graphs and confidence intervals.
{pmore} {bf:Note:} The default grid is centered around the Wald point estimate
(or the CUE point estimate if the {opt cuepoint} option is specified)
with a width equal to twice the Wald confidence interval.
With weak instruments,
this may often be too small a grid to estimate the confidence intervals and sets.
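{pmore}
To see how quickly the grid grows with the number of endogenous regressors,
the total number of points searched under the default settings listed above
can be computed as the per-dimension default raised to the number of dimensions.
This is a sketch using only built-in Stata commands.

```stata
* Default gridpoints per dimension for 1 to 5 endogenous regressors
local k = 1
foreach g of numlist 100 25 11 7 5 {
    display "`k' endogenous regressor(s): " `g'^`k' " grid points in total"
    local ++k
}

* The gridpoints(5 5 5 5 5 5) example for 6 endogenous regressors
display "6 endogenous regressors: " 5^6 " grid points in total"
```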