Working Papers

Yurinskii's coupling is a popular tool for finite-sample distributional approximation in mathematical statistics and applied probability, offering a Gaussian strong approximation for sums of random vectors under easily verified conditions with an explicit rate of approximation. Originally stated for sums of independent random vectors in ℓ2-norm, it has recently been extended to the ℓp-norm, where 1≤p≤∞, and to vector-valued martingales in ℓ2-norm under some rather strong conditions. As our main result, we provide a generalization of all of the previous forms of Yurinskii's coupling, giving a Gaussian strong approximation for martingales in ℓp-norm under relatively weak conditions. We apply this result to some areas of statistical theory, including high-dimensional martingale central limit theorems and uniform strong approximations for martingale empirical processes. Finally, we give a few illustrative examples in statistical methodology, applying our results to partitioning-based series estimators for nonparametric regression, distributional approximation of ℓp-norms of high-dimensional martingales, and local polynomial regression estimators. We address issues of feasibility, demonstrating implementable statistical inference procedures in each section.
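Though the coupling itself is a theoretical device, the flavor of the ℓ∞-norm (p = ∞) distributional approximation it underwrites can be previewed in a short simulation; the design below (independent heavy-tailed increments and a two-sample Kolmogorov distance) is purely illustrative and is not a construction from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, reps = 400, 50, 2000

# Max-norm of standardized sums of heavy-tailed (Student-t, 6 df) increments,
# compared against the max-norm of a d-dimensional standard Gaussian vector
scale = np.sqrt(6 / 4)  # standard deviation of a t(6) variable
stats = np.empty(reps)
for r in range(reps):
    increments = rng.standard_t(df=6, size=(n, d)) / scale
    stats[r] = np.max(np.abs(increments.sum(axis=0) / np.sqrt(n)))
gauss = np.max(np.abs(rng.normal(size=(reps, d))), axis=1)

# Two-sample Kolmogorov distance between the two max-norm samples
grid = np.sort(np.concatenate([stats, gauss]))
F1 = np.searchsorted(np.sort(stats), grid, side="right") / reps
F2 = np.searchsorted(np.sort(gauss), grid, side="right") / reps
ks_distance = np.max(np.abs(F1 - F2))
```

With independent increments this closeness is just the central limit theorem at work; the paper's contribution is to certify such approximations with explicit finite-sample rates, for martingales, uniformly over ℓp-norms.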

The density weighted average derivative (DWAD) of a regression function is a canonical parameter of interest in economics. Classical first-order large-sample distribution theory for kernel-based DWAD estimators relies on tuning parameter restrictions and model assumptions leading to an asymptotic linear representation of the point estimator. Such conditions can be restrictive, and the resulting distributional approximation may not be representative of the underlying sampling distribution of the statistic of interest; in particular, it may not be robust to bandwidth choices. Small bandwidth asymptotics offers an alternative, more general distributional approximation for kernel-based DWAD estimators that allows for, but does not require, asymptotic linearity. The resulting inference procedures based on small bandwidth asymptotics were found to exhibit superior finite-sample performance in simulations, but no formal theory justifying that empirical success is available in the literature. Employing Edgeworth expansions, this paper shows that small bandwidth asymptotics lead to inference procedures with demonstrably superior higher-order distributional properties relative to procedures based on asymptotic linear approximations.
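For concreteness, a one-dimensional version of a canonical kernel-based DWAD point estimator (a U-statistic in the spirit of Powell, Stock, and Stoker) can be computed as follows; the Gaussian design, Gaussian kernel, and bandwidth are illustrative choices only, not the paper's setting:

```python
import numpy as np

rng = np.random.default_rng(1)
n, h = 2000, 0.25
x = rng.normal(size=n)
y = x + rng.normal(scale=0.5, size=n)  # g(x) = x, so the DWAD equals E[f(x)]

def kprime(u):
    # Derivative of the Gaussian kernel: K'(u) = -u * phi(u)
    return -u * np.exp(-u**2 / 2) / np.sqrt(2 * np.pi)

# Leave-out U-statistic form of the density weighted average derivative estimator
diff = (x[:, None] - x[None, :]) / h
W = kprime(diff) / h**2
np.fill_diagonal(W, 0.0)
theta_hat = -(W * (y[:, None] - y[None, :])).sum() / (n * (n - 1))

# For this design the target is E[phi(x)] = 1 / (2 * sqrt(pi)), about 0.28
```

The small-bandwidth question the abstract refers to is precisely how the distribution of `theta_hat` behaves when `h` shrinks fast enough that the estimator is no longer asymptotically linear.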


In the context of treatment effect estimation, this paper proposes a new methodology to recover the counterfactual distribution when there is a single (or a few) treated unit and a possibly high-dimensional number of potential controls observed in a panel structure. The methodology accommodates, although it does not require, the number of units being larger than the number of time periods (high-dimensional setup). Rather than modeling only the conditional mean, we propose to model the entire conditional quantile function (CQF) in the absence of intervention and to estimate it from the pre-intervention period via penalized regression. We derive non-asymptotic bounds for the estimated CQF that are valid uniformly over the quantiles, allowing the practitioner to reconstruct the entire counterfactual distribution. Moreover, we bound the probability coverage of this estimated CQF, which can be used to construct valid confidence intervals for the (possibly random) treatment effect for every post-intervention period, or simultaneously. We also propose a new hypothesis test for the sharp null of no effect based on the ℓp-norm of the deviation of the estimated CQF from the population one. Interestingly, the null distribution is quasi-pivotal in the sense that it depends only on the estimated CQF, the ℓp-norm, and the number of post-intervention periods, but not on the size of the pre-intervention period. For that reason, critical values can be easily simulated. We illustrate the methodology by revisiting the empirical study in Acemoglu et al. (2016).

Fan, Jianqing, et al. “Bridging Factor and Sparse Models”. Revise and Resubmit, Annals of Statistics; ArXiv preprint.

Factor and sparse models are two widely used methods to impose a low-dimensional structure in high dimensions, yet they are seemingly mutually exclusive. In this paper, we propose a simple lifting method that combines the merits of these two models in a supervised learning methodology that allows us to efficiently explore all the information in high-dimensional datasets. The method is based on a flexible linear model for panel data, called the factor-augmented regression model, with observable and latent common factors as well as idiosyncratic components as high-dimensional covariates. This model not only includes both factor regression and sparse regression as special cases but also significantly weakens the cross-sectional dependence, which facilitates model selection and interpretability.
The methodology consists of three steps. At each step, the remaining cross-sectional dependence can be inferred by a novel test for covariance structure in high dimensions. We develop asymptotic theory for the factor-augmented sparse regression model and demonstrate the validity of the multiplier bootstrap for testing high-dimensional covariance structure; this is further extended to testing high-dimensional partial covariance structures. The theory and methods are supported by an extensive simulation study and by applications to the construction of a partial covariance network of financial returns for the constituents of the S&P 500 index and to a prediction exercise for a large panel of macroeconomic time series from the FRED-MD database.
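A minimal sketch of the factor-augmented idea is to estimate factors by principal components and then run a sparse (LASSO) regression on the estimated factors plus idiosyncratic components; the synthetic data, scikit-learn estimators, and penalty level below are assumptions for illustration, not the paper's three-step procedure or its covariance tests:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
T, p, k = 200, 50, 3
F = rng.normal(size=(T, k))                 # latent common factors
B = rng.normal(size=(k, p))                 # factor loadings
U = rng.normal(scale=0.3, size=(T, p))      # idiosyncratic components
X = F @ B + U                               # observed high-dimensional panel
y = F @ np.array([1.0, -0.5, 0.2]) + U[:, 0] + rng.normal(scale=0.1, size=T)

# Step 1: estimate factors by principal components
pca = PCA(n_components=k)
F_hat = pca.fit_transform(X)
U_hat = X - pca.inverse_transform(F_hat)    # estimated idiosyncratic parts

# Step 2: sparse regression on estimated factors and idiosyncratic components
Z = np.hstack([F_hat, U_hat])
fit = Lasso(alpha=0.05).fit(Z, y)
```

Because the factors soak up the strong cross-sectional dependence in X, the LASSO in the second step faces nearly uncorrelated idiosyncratic regressors, which is what makes selection tractable.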


Optimal pricing, i.e., determining the price level that maximizes the profit or revenue of a given product, is a vital task for the retail industry. To select such a price, one first needs to estimate the price elasticity from the product demand. Regression methods usually fail to recover such elasticities due to confounding effects and price endogeneity, so randomized experiments are typically required. However, elasticities can be highly heterogeneous depending on, for example, the location of stores. As the randomization frequently occurs at the municipal level, standard difference-in-differences methods may also fail. Possible solutions are based on methodologies that measure the effects of treatments on a single (or just a few) treated unit(s) using counterfactuals constructed from artificial controls. For example, for each city in the treatment group, a counterfactual may be constructed from the untreated locations. In this paper, we apply a novel high-dimensional statistical method to measure the effects of price changes on daily sales from a major retailer in Brazil. The proposed methodology combines principal components (factors) and sparse regressions, resulting in a method called the Factor-Adjusted Regularized Method for Treatment evaluation (FarmTreat). The data consist of daily sales and prices of five different products over more than 400 municipalities. The products considered belong to the sweets and candies category, and the experiments were conducted during 2016 and 2017. Our results confirm the hypothesis of a high degree of heterogeneity, yielding very different pricing strategies across municipalities.

In this paper, we survey the most recent advances in supervised machine learning (ML) and high-dimensional models for time-series forecasting. We consider both linear and nonlinear alternatives. Among the linear methods, we pay special attention to penalized regressions and ensembles of models. The nonlinear methods considered in the paper include shallow and deep neural networks, in their feedforward and recurrent versions, and tree-based methods such as random forests and boosted trees. We also consider ensemble and hybrid models that combine ingredients from different alternatives. Tests for superior predictive ability are briefly reviewed. Finally, we discuss the application of ML in economics and finance and provide an illustration with high-frequency financial data.

There has been considerable advance in understanding the properties of sparse regularization procedures in high-dimensional models. In the time-series context, however, the literature is mostly restricted to Gaussian autoregressions or mixing sequences. We study oracle properties of LASSO estimation of weakly sparse vector-autoregressive models with heavy-tailed, weakly dependent innovations and virtually no assumptions on the conditional heteroskedasticity. In contrast to the current literature, our innovation process satisfies an L1 mixingale-type condition on the centered conditional covariance matrices. This condition covers L1-NED sequences and strong (α-)mixing sequences as particular examples. From a modeling perspective, it covers several multivariate GARCH specifications, such as the BEKK model, and other factor stochastic volatility specifications that were ruled out by assumption in previous studies.
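An equation-by-equation LASSO fit of a sparse VAR with heavy-tailed (Student-t) innovations can be sketched as follows; the simple VAR(1) design and penalty level are illustrative choices, not the paper's setting (which also allows rich conditional heteroskedasticity):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(4)
T, d = 300, 5
A = np.zeros((d, d))
A[0, 0], A[1, 0] = 0.5, 0.3               # sparse VAR(1) coefficient matrix

# Simulate the VAR with heavy-tailed t(5) innovations
Y = np.zeros((T, d))
for t in range(1, T):
    Y[t] = Y[t - 1] @ A.T + rng.standard_t(df=5, size=d)

# Equation-by-equation LASSO regression on lagged values
X_lag, Y_resp = Y[:-1], Y[1:]
A_hat = np.vstack([
    Lasso(alpha=0.1, fit_intercept=False).fit(X_lag, Y_resp[:, j]).coef_
    for j in range(d)
])
```

The oracle-type results described above concern exactly this estimator: how well the support and magnitudes of A are recovered when the innovations are heavy-tailed and dependent rather than i.i.d. Gaussian.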

Recently, there has been growing interest in developing econometric tools to conduct counterfactual analysis with aggregate data when a “treated” unit suffers an intervention, such as a policy change, and there is no obvious control group. Usually, the proposed methods are based on the construction of an artificial counterfactual from a pool of “untreated” peers, organized in a panel data structure. In this paper, we consider a general framework for counterfactual analysis in high dimensions with potentially non-stationary data and deterministic and/or stochastic trends, which nests well-established methods such as the synthetic control. Furthermore, we propose a resampling procedure to test intervention effects that does not rely on post-intervention asymptotics and that can be used even if there is only a single observation after the intervention. A simulation study is provided, as well as an empirical application where the effects of price changes on the sales of a product are measured.
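To convey why inference is possible with a single post-intervention observation, here is a stylized residual-resampling test in the spirit of the framework above; the OLS counterfactual fit and i.i.d. residual resampling are simplifying assumptions made for illustration, not the paper's actual procedure:

```python
import numpy as np

rng = np.random.default_rng(5)
T0, n = 120, 8
X = rng.normal(size=(T0 + 1, n))                  # peers; one post-intervention period
y = X[:, :2] @ np.array([0.7, 0.3]) + rng.normal(scale=0.3, size=T0 + 1)

# Fit the counterfactual on the pre-intervention sample (OLS for simplicity)
beta, *_ = np.linalg.lstsq(X[:T0], y[:T0], rcond=None)
resid_pre = y[:T0] - X[:T0] @ beta
stat = abs(y[T0] - X[T0] @ beta)                  # single post-intervention residual

# Resample pre-intervention residuals to approximate the null distribution
null = np.abs(rng.choice(resid_pre, size=5000, replace=True))
p_value = np.mean(null >= stat)
```

The key point is that the null distribution is built entirely from the pre-intervention fit, so no asymptotics in the number of post-intervention periods is needed.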


Masini, Ricardo, and Marcelo Medeiros. “Counterfactual Analysis and Inference With Nonstationary Data”. Journal of Business & Economic Statistics, vol. 40, no. 1, 2020, pp. 227–39.


We consider a new, flexible, and easy-to-implement method to estimate the causal effects of an intervention on a single treated unit when a control group is not available, and which nests previous proposals in the literature. It is a two-step methodology: in the first step, a counterfactual is estimated based on a large-dimensional set of variables from a pool of untreated units by means of shrinkage methods, such as the least absolute shrinkage and selection operator (LASSO). In the second step, we estimate the average intervention effect on a vector of variables; the resulting estimator is consistent and asymptotically normal. Our results are valid uniformly over a wide class of probability laws. We show that these results hold even when the exact date of the intervention is unknown. Tests for multiple interventions and for contamination effects are derived. By a simple transformation of the variables, it is possible to test for multivariate intervention effects on several moments of the variables of interest. Existing methods in the literature usually test for intervention effects on a single variable and assume that the time of the intervention is known. In addition, high-dimensionality is frequently ignored, and inference is conducted either under a set of more stringent hypotheses and/or by permutation tests. A Monte Carlo experiment evaluates the properties of the method in finite samples and compares it with other alternatives. As an application, we evaluate the effects of an anti-tax-evasion program on inflation, GDP growth, retail sales, and credit.
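The two-step logic can be sketched in a few lines; the synthetic panel, scikit-learn's Lasso, and the penalty level are illustrative assumptions, not the paper's implementation or its uniform inference theory:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(6)
T0, T1, n = 100, 20, 30        # pre/post-intervention periods, untreated units
X = rng.normal(size=(T0 + T1, n))
w = np.zeros(n)
w[:3] = [0.6, 0.3, 0.1]        # the treated unit loads on only a few peers
y = X @ w + rng.normal(scale=0.2, size=T0 + T1)
y[T0:] += 1.0                  # intervention shifts the treated outcome after T0

# Step 1: estimate the counterfactual on pre-intervention data via LASSO
lasso = Lasso(alpha=0.05).fit(X[:T0], y[:T0])

# Step 2: average intervention effect over the post-intervention periods
delta_hat = np.mean(y[T0:] - lasso.predict(X[T0:]))
```

Averaging the post-intervention gaps is what yields the asymptotically normal effect estimator described above; here `delta_hat` should be close to the simulated shift of 1.0.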