
BAYESIAN MODEL AVERAGING IN R

SHAHRAM M. AMINI AND CHRISTOPHER F. PARMETER

Abstract. Bayesian model averaging has seen a growing number of applications across an array of empirical contexts. However, the statistical software available for conducting a model averaging exercise is limited. It is common for consumers of these methods to develop their own code, which has obvious appeal. However, canned statistical software can aid the analysis of researchers who are not intimately familiar with the nuances of computer coding. Moreover, many researchers would prefer ready-to-use software to mitigate the inevitable time costs that arise when hard coding an econometric estimator. To that end, this paper describes the relative merits and attractiveness of several competing packages in the statistical environment R for implementing a Bayesian model averaging exercise.

1. Introduction

Bayesian model averaging (BMA) is an empirical tool for dealing with model uncertainty in various milieus of applied science. In general, BMA is employed when there exist a variety of models which may all be statistically reasonable but most likely result in different conclusions about the key questions of interest to the researcher. As Raftery (1995, p. 113) notes, “In this situation, the standard approach of selecting a single model and basing inference on it underestimates uncertainty about quantities of interest because it ignores uncertainty about model form.” Typically, though not always, BMA focuses on which regressors to include in the analysis. The allure of BMA is that one can quickly determine models, or more specifically sets of explanatory variables, which possess high likelihoods. By averaging across a large set of models one can determine those variables which are relevant to the data generating process for a given set of priors used in the analysis. Each model (a set of variables) receives a weight, and the final estimates are constructed as a weighted average of the parameter estimates from each of the models. BMA includes all of the variables within the analysis, but shrinks the impact of certain variables towards zero through the model weights. These weights are the key feature of estimation via BMA and depend on several aspects of the averaging exercise, including the choice of prior.

Date: May 17, 2011.

Key words and phrases. Model Averaging, Zellner’s g Prior.

We would like to thank James MacKinnon, Achim Zeileis, Martin Feldkircher, and Stefan Zeugner for constructive and insightful comments.

Shahram M. Amini, Department of Economics, Virginia Polytechnic Institute and State University, e-mail: shahram@vt.edu. Christopher F. Parmeter, Corresponding Author, Department of Economics, University of Miami, 305-284-4397, e-mail: cparmeter@bus.miami.edu.

BMA was first proposed by Leamer (1978, Sections 4.4-4.6). Its implementation for linear regression models is as follows. Consider a linear regression model with a constant term, $\beta_0$, and $k$ potential explanatory variables $x_1, x_2, \dots, x_k$:

$$ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon. $$

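Given the number of regressors, there are $2^k$ different combinations of right-hand-side variables, indexed by $M_j$ for $j = 1, 2, 3, \dots, 2^k$. Once the model space has been constructed, the posterior distribution for any coefficient of interest, say $\beta_h$, given the data $D$ is

$$ \Pr(\beta_h \mid D) = \sum_{j=1}^{2^k} \Pr(\beta_h \mid M_j, D)\, \Pr(M_j \mid D), $$

where $\Pr(M_j \mid D)$ is the posterior probability of model $M_j$ and serves as its weight in the average.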
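As a rough illustration of the averaging step, the sketch below enumerates the model space for a small number of regressors and forms the weighted average of per-model coefficient estimates. It is written in Python with the standard library only, so that it does not depend on any of the R packages reviewed here. The function names, the posterior model probabilities, and the coefficient values are all hypothetical; in practice the weights would come from marginal likelihoods and model priors.

```python
from itertools import product


def enumerate_models(k):
    """Return all 2**k inclusion patterns as tuples of 0/1 flags."""
    return list(product((0, 1), repeat=k))


def model_average(models, pmp, coef):
    """Posterior mean of each coefficient, averaged over models.

    models : list of inclusion tuples
    pmp    : posterior model probabilities, one per model (sum to 1)
    coef   : per-model coefficient estimates; excluded regressors are 0
    """
    k = len(models[0])
    return [sum(p * b[h] for p, b in zip(pmp, coef)) for h in range(k)]


# Hypothetical example with k = 2 regressors (all numbers are made up).
models = enumerate_models(2)            # (0, 0), (0, 1), (1, 0), (1, 1)
pmp = [0.10, 0.20, 0.30, 0.40]          # assumed posterior model probabilities
coef = [(0.0, 0.0), (0.0, 2.0), (1.0, 0.0), (1.5, 1.0)]
averaged = model_average(models, pmp, coef)
print(averaged)
```

A regressor that appears only in low-probability models ends up with an averaged coefficient close to zero, which is the shrinkage effect described in the introduction.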

For further discussions on BMA, including its limitations and implementation, we refer the reader to the comprehensive review of Bayesian model averaging by Hoeting, Madigan, Raftery & Volinsky (1999).

The following sections provide an overview of three currently available packages in the statistical computing language of R (R Development Core Team 2010) that can implement a BMA empirical exercise. The main features under the user’s control for each of the packages, including the set of prior probabilities and model sampling algorithms as well as the plot diagnostics available to visualize the results, are described. Several detailed examples to compare the performance of these different packages are also provided along with functioning R code in an appendix.

To our knowledge, R is the only mainstream statistical platform that offers a suite of routines for conducting a BMA analysis. The availability of BMA routines in other statistical software is limited. Neither Gauss nor Stata possesses built-in packages that allow the user to implement a genuine, linear regression BMA [1, 2]. Matlab, while lacking a comprehensive BMA toolbox [3], supplies users with the core functionality of the BMS package (discussed below) via installation of the BMS toolbox for Matlab. Fortran users have access to a ready-to-use BMA toolbox stemming from Fernandez, Ley & Steel’s (2001b) publicly available code. Finally, while SAS provides some functionality for implementing BMA, it is incapable of handling a large-scale BMA analysis.

Beyond our review of the functionality of the three available packages, we contrast estimates and posterior inclusion probabilities across the three packages with a mock empirical example. We do this both with a set of covariates small enough to allow full enumeration of the model space and with a set large enough to require a model space search mechanism, which is what truly distinguishes the three packages. We also report the time performance of the three packages as both the sample size and the covariate space increase. Finally, we examine whether these packages can replicate the results of recently published econometric research that employs BMA techniques. Each of the three packages has relative advantages over its peers, yet we advocate for the BMS package given its versatility with user-defined priors as well as its numerous options to customize one’s BMA analysis.





2. Available Packages

2.1. The BMS Package. The BMS (an acronym for Bayesian Model Selection) package employs the standard Bayesian normal-conjugate linear model as the base model and “Zellner’s g prior” as the prior structure for the regression coefficients (Feldkircher & Zeugner 2009). The form of the hyperparameter g is crucial in BMA analyses; by default, the BMS package sets g equal to the sample size, a choice usually known as the unit information prior (UIP). BMS also provides alternative formulations for the choice of g. The main function in the BMS package for implementing a BMA regression analysis is bms().
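To see why the choice of g matters, recall that under Zellner’s g prior with a zero prior mean, the posterior mean of a model’s coefficients is the OLS estimate scaled by g/(1 + g). The sketch below evaluates this shrinkage factor for the unit information prior g = N. It is plain Python rather than a call into the BMS package, and the function names are illustrative only.

```python
def uip_g(n_obs):
    """Unit information prior: g equals the sample size N."""
    return float(n_obs)


def shrinkage_factor(g):
    """Factor g / (1 + g) that scales the OLS estimate under Zellner's g prior."""
    return g / (1.0 + g)


for n_obs in (10, 100, 1000):
    g = uip_g(n_obs)
    print(n_obs, round(shrinkage_factor(g), 4))
```

Larger samples push the factor towards one, so under the UIP the prior exerts less pull on the coefficients as N grows.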

2.1.1. Model Sampling. Since enumerating all potential variable combinations quickly becomes infeasible for a large number of covariates, the BMS package uses a Markov Chain Monte Carlo (MCMC) sampler to gather results on the most important part of the posterior distribution when more than 14 covariates exist. The MCMC sampler walks through the model space using the Metropolis-Hastings algorithm [4], which works as follows. Suppose that the current model at step $i$ is $M_i$ with posterior model probability $p(M_i \mid y, X)$. The MCMC sampler for the BMS package randomly draws a candidate model $M_j$ and moves to it with probability $\min\{1,\, p(M_j \mid y, X) / p(M_i \mid y, X)\}$; in particular, the move is always made when the candidate is superior to the current model. In this algorithm, the relative frequency with which each model is kept converges to the distribution of posterior model probabilities $p(M_i \mid y, X)$.

The BMS package offers two different MCMC samplers for exploring the model space. The two methods differ in the way they propose candidate models. The first is the birth-death sampler (mcmc="bd"). In this case, one of the potential regressors is randomly chosen. If the chosen variable is already in the current model $M_i$, then the candidate model $M_j$ has the same set of covariates as $M_i$ but drops the chosen variable. If the chosen covariate is not contained in $M_i$, then the candidate model contains all the variables from $M_i$ plus the chosen covariate. Hence the appearance (birth) or disappearance (death) of the chosen variable depends on whether it already appears in the model. The second is the reversible-jump sampler (mcmc="rev.jump"). This sampler draws a candidate model by the birth-death method with 50% probability; with the remaining 50% probability, the candidate model randomly drops one covariate from $M_i$ and randomly adds one of the potential covariates that were not included in $M_i$.

[1] Millar (2011) has recently published a Stata module that uses the Bayesian Information Criterion (BIC) for estimating the probability that a variable is a part of the final model. This module, available at http://fmwww.bc.edu/repec/bocode/b/bic.ado, calculates the BIC statistic for all possible combinations of the independent variables.

[2] Gauss users can find the code used in Sala-i-Martin, Doppelhofer & Miller (2004) to implement the BACE technique at http://www.nhh.no/Default.aspx?ID=3075.

[3] Matlab’s Econometrics Toolbox comes with a function called bma_g that provides very basic BMA functionality.

[4] See Metropolis, Rosenbluth, Rosenbluth, Teller & Teller (1953), Hastings (1970), Chib & Greenberg (1995), and Liu (2008).
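The two proposal mechanisms can be sketched in a few lines. The code below is an illustrative Python rendering, not the BMS implementation: models are represented as sets of regressor indices, the acceptance step assumes the standard Metropolis-Hastings rule for a symmetric proposal, and returning the current model when no swap is possible is our own assumption about the boundary cases.

```python
import random


def birth_death(current, k, rng):
    """Flip the inclusion status of one randomly chosen regressor."""
    chosen = rng.randrange(k)
    # Symmetric difference: drop `chosen` if present (death), add it otherwise (birth).
    return current ^ frozenset([chosen])


def swap(current, k, rng):
    """Drop one included regressor and add one excluded regressor."""
    included = sorted(current)
    excluded = [h for h in range(k) if h not in current]
    if not included or not excluded:
        return current  # assumption: no swap for the empty or the full model
    dropped = rng.choice(included)
    added = rng.choice(excluded)
    return (current - {dropped}) | {added}


def reversible_jump(current, k, rng):
    """Birth-death move or swap move, each with probability 1/2."""
    if rng.random() < 0.5:
        return birth_death(current, k, rng)
    return swap(current, k, rng)


def accept(pmp_current, pmp_candidate, rng):
    """Metropolis-Hastings rule: accept with probability min(1, ratio)."""
    return rng.random() < min(1.0, pmp_candidate / pmp_current)


rng = random.Random(0)
model = frozenset({0, 2})               # regressors 0 and 2 out of k = 4
print(sorted(birth_death(model, 4, rng)))
print(sorted(reversible_jump(model, 4, rng)))
```

Both proposals are symmetric, in the sense that moving from one model to another is as likely as the reverse move, which is why the acceptance probability involves only the ratio of posterior model probabilities.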

The precision of any MCMC sampling mechanism depends on the number of draws the procedure runs through. Because the MCMC algorithms used in the BMS package may start from models that would not be classified as ‘good’ models, the first set of iterations does not usually draw models with high posterior model probabilities (PMP). The sampler only converges to regions of the model space with the largest marginal likelihoods after some initial set of draws (known as the burn-in). This first set of iterations is therefore omitted from the computation of results. In the BMS package the argument burn specifies the number of burn-in draws (models omitted), and the argument iter specifies the number of subsequent iterations to be retained. The default number of burn-in draws for either MCMC sampler is 1000, and the default number of iteration draws (excluding burn-ins) is 3000.
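The role of the burn-in can be illustrated with a toy chain. The sketch below is plain Python, not the BMS sampler: it runs a birth-death Metropolis-Hastings chain over a model space with two regressors, discards the first `burn` draws, and compares the visit frequencies of the retained `iters` draws with the normalized weights. The weights stand in for posterior model probabilities and are made up, as are the function and argument names.

```python
import random
from collections import Counter


def run_chain(weights, k, burn, iters, seed=0):
    """Birth-death Metropolis-Hastings chain over subsets of k regressors.

    weights : dict mapping a frozenset (a model) to an unnormalized
              posterior model probability (hypothetical numbers here)
    burn    : number of initial draws to discard
    iters   : number of subsequent draws to keep
    """
    rng = random.Random(seed)
    current = frozenset()               # start from the empty model
    visits = Counter()
    for step in range(burn + iters):
        candidate = current ^ frozenset([rng.randrange(k)])
        ratio = weights[candidate] / weights[current]
        if rng.random() < min(1.0, ratio):
            current = candidate
        if step >= burn:                # burn-in draws are not counted
            visits[current] += 1
    return {m: n / iters for m, n in visits.items()}


# Toy model space with k = 2 regressors; the weights are made up.
weights = {
    frozenset(): 1.0,
    frozenset({0}): 2.0,
    frozenset({1}): 3.0,
    frozenset({0, 1}): 4.0,
}
freq = run_chain(weights, k=2, burn=1000, iters=3000)
total = sum(weights.values())
for model, w in weights.items():
    print(sorted(model), round(w / total, 3), round(freq.get(model, 0.0), 3))
```

With only four models the chain mixes quickly; in a realistic model space the agreement between frequencies and analytical probabilities depends much more heavily on the number of retained draws.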

2.1.2. Model Priors. The BMS package offers considerable freedom in the choice of model prior. One can employ the uniform model prior, which assigns equal prior probability to every model (mprior="uniform"), the Binomial model prior, where the prior probability of a model is the product of inclusion and exclusion probabilities (mprior="fixed"), the Beta-Binomial model prior (mprior="random"), which places a hyperprior drawn from a Beta distribution on the inclusion probability, or a custom model size and prior inclusion probabilities (mprior="customk"). Of the three packages currently available in R, this is the only one that allows for custom priors.
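The three standard model priors can be written down directly as functions of the model size. The sketch below is illustrative Python, independent of the BMS package; the inclusion probability `theta` and the Beta hyperparameters `a` and `b` are assumed values chosen for the example and are not claimed to be the package defaults.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def uniform_prior(size, k):
    """Every one of the 2**k models gets the same prior probability."""
    return 0.5 ** k


def binomial_prior(size, k, theta):
    """Each regressor enters independently with fixed probability theta."""
    return theta ** size * (1.0 - theta) ** (k - size)


def beta_binomial_prior(size, k, a=1.0, b=1.0):
    """Binomial prior with theta integrated out against a Beta(a, b) hyperprior."""
    return exp(log_beta(a + size, b + k - size) - log_beta(a, b))


k = 5
for size in range(k + 1):
    n_models = comb(k, size)            # number of models with this size
    print(size,
          round(n_models * uniform_prior(size, k), 4),
          round(n_models * binomial_prior(size, k, theta=0.5), 4),
          round(n_models * beta_binomial_prior(size, k), 4))
```

Each printed row gives the implied prior probability of a model size. The uniform prior concentrates mass on mid-sized models simply because there are more of them, whereas the Beta-Binomial prior with a = b = 1 spreads mass evenly across model sizes.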

2.1.3. Alternative Zellner’s g Priors. Different mechanisms have been proposed in the literature for specifying g priors. The options in the BMS package are as follows:

(1) g="UIP"; Unit Information Prior (UIP), which corresponds to $g = N$, the sample size [5].
(2) g="RIC"; sets $g = K^2$ and conforms to the risk inflation criterion [6].
(3) g="BRIC"; a mechanism that asymptotically converges to the unit information prior ($g = N$) or the risk inflation criterion ($g = K^2$). That is, the g prior is set to $g = \max(N, K^2)$ [7].
(4) g="HQ"; follows the Hannan-Quinn criterion asymptotically and sets $g = (\log N)^3$.
(5) g="EBL"; estimates a local empirical Bayes g-parameter [8].
(6) g="hyper"; takes the “hyper-g” prior distribution [9].

[5] See Fernandez, Ley & Steel (2001a).

[6] See ? for more details.

2.1.4. Outputs. The main objects returned by bms() are the posterior inclusion probabilities (PIP) and the posterior means and standard deviations of the coefficients. In addition, a call to this function returns a list of aggregate statistics including the number of draws, burn-ins, models visited, the top models, and the size of the model space. It also returns the correlation between iteration counts and analytical PMPs for the best models [10], that is, those with the highest posterior model probabilities (PMP).
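Two of the reported quantities are easy to reproduce by hand. The sketch below, in plain Python and not tied to the return structure of bms(), computes posterior inclusion probabilities as sums of posterior model probabilities and the correlation between MCMC visit frequencies and analytical PMPs. The models and all numbers are hypothetical, and the function names are ours.

```python
from math import sqrt


def inclusion_probabilities(models, pmp, k):
    """PIP of regressor h = total probability of the models that contain h."""
    return [sum(p for m, p in zip(models, pmp) if h in m) for h in range(k)]


def pearson(x, y):
    """Plain Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical models (sets of regressor indices) with made-up numbers.
models = [{0, 1}, {0}, {1, 2}, {0, 2}]
analytical_pmp = [0.40, 0.30, 0.20, 0.10]   # from marginal likelihoods
mcmc_freq = [0.38, 0.33, 0.18, 0.11]        # from iteration counts
print(inclusion_probabilities(models, analytical_pmp, k=3))
print(pearson(analytical_pmp, mcmc_freq))
```

A correlation close to one suggests that the sampler’s visit frequencies agree with the analytical probabilities for the models considered, which is the sense in which this statistic is used as a convergence indicator.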

2.1.5. Plot Diagnostics. BMS package users have access to plots of the prior and posterior model size distributions, as well as a plot of posterior model probabilities, based on the corresponding marginal likelihoods and on MCMC frequencies for the best models, that visualizes how well the sampler has converged. Also available are a grid showing the inclusion and signs of coefficients against posterior model probabilities for the best models, and plots of predictive densities for conditional forecasts.


