Thursday, September 23, 17:00, room 20.237.
Friedrich Pukelsheim (Universität Augsburg)
From Ramon Llull to biproportional representation
Abstract:
The Catalan poet and religious writer Ramon Llull
(1232-1316) was among the first to design a formal
electoral system. We review Llull's work on the
subject, report on our re-discovery of a manuscript of
his that was considered lost, and describe our
electronic edition of Llull's electoral writings, at
"www.uni-augsburg.de/llull". Today's electoral systems
pose different problems originating from the
universality of the suffrage. We present a system of
biproportional representation honoring not only vote
counts that parties win through an election, but also
population counts that geographical districts put
forward based on census data. The system forms part of
the new 2003 electoral law for the Swiss Canton Zurich,
and has been implemented in our Java program BAZI that
is available at "www.uni-augsburg.de/bazi".
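The alternating-scaling idea behind biproportional apportionment can be sketched in a simplified, continuous form; the actual Zurich method additionally rounds seat fractions with divisor methods, and the vote and seat numbers below are made up for illustration:

```python
import numpy as np

def biproportional_fit(votes, district_seats, party_seats, iters=200):
    """Continuous relaxation of biproportional apportionment:
    alternately rescale rows and columns of the vote matrix until
    row sums match district seat totals and column sums match the
    party seat totals (no rounding step in this sketch)."""
    x = votes.astype(float).copy()
    for _ in range(iters):
        x *= (district_seats / x.sum(axis=1))[:, None]  # fit district totals
        x *= party_seats / x.sum(axis=0)                # fit party totals
    return x

# Hypothetical example: 2 districts x 2 parties
votes = np.array([[1000.0, 400.0],
                  [300.0, 900.0]])
alloc = biproportional_fit(votes, np.array([5.0, 5.0]), np.array([4.0, 6.0]))
```

Each entry of `alloc` is then a fractional seat allocation honoring both the district and the party totals simultaneously.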
Friday, October 22, 17:00, room 20.137.
Paul Switzer (Stanford University)
Trend Decomposition in Multiple Time Series
Abstract:
Regional environmental or meteorological data are commonly available as
parallel time series from multiple monitoring stations. Each of the
station time series embodies aspects of a common smoothly varying
regional time trend, with weights that are specific to each monitoring
location. The regional trends are extracted from the station data by
identifying linear combinations with maximal structure. The
location-specific time-trend coefficients may be regarded as a spatial
field. The interpolated spatial field provides a means for inferring
the time trend at unmonitored locations.
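Switzer's notion of "maximal structure" is not spelled out in the abstract; as a simplified, hypothetical stand-in, the leading singular vectors of the centered station-by-time data matrix already separate a shared time trend from its location-specific weights:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 200)
trend = np.sin(2 * np.pi * t)              # common regional time trend
weights = np.array([1.0, 0.6, -0.4, 0.8])  # station-specific loadings (made up)
data = weights[:, None] * trend + 0.1 * rng.standard_normal((4, 200))

# Leading singular pair: right vector = estimated common trend,
# scaled left vector = location-specific time-trend coefficients.
u, s, vt = np.linalg.svd(data - data.mean(axis=1, keepdims=True))
est_trend = vt[0]             # shared time trend (up to sign and scale)
est_weights = u[:, 0] * s[0]  # station loadings, interpolable as a spatial field
```

The estimated loadings play the role of the spatial field that can be interpolated to unmonitored locations.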
Tuesday, November 2, 13:30, room 20.137.
Gábor Lugosi (UPF)
Regret minimization under partial monitoring.
Abstract:
We consider repeated games in which the player,
instead of observing the action chosen by the opponent in each
game round, receives a feedback generated by the combined choice
of the two players.
We study Hannan consistent players for such games, that is,
randomized playing strategies whose per-round regret vanishes
with probability one as the number $n$ of game rounds goes
to infinity. We prove a general lower bound of $\Omega(n^{-1/3})$
on the convergence rate of the regret, and exhibit a specific
strategy that attains this rate on any game for which a
Hannan consistent player exists.
Joint work with Nicolò Cesa-Bianchi and Gilles Stoltz.
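The talk is about the harder partial-monitoring setting; for contrast, here is a sketch of a Hannan consistent strategy under full monitoring, the exponentially weighted average forecaster, whose per-round regret vanishes at the faster O(n^{-1/2}) rate. The loss sequence below is an arbitrary simulated stand-in for the opponent:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 10000, 3
losses = rng.random((n, k))          # opponent-generated losses in [0, 1]
eta = np.sqrt(8 * np.log(k) / n)     # standard tuning when n is known

cum = np.zeros(k)
total = 0.0
for t in range(n):
    w = np.exp(-eta * (cum - cum.min()))  # shift exponent for stability
    p = w / w.sum()                       # randomized playing strategy
    total += p @ losses[t]                # expected loss of the mixed action
    cum += losses[t]

per_round_regret = (total - cum.min()) / n  # vanishes as n grows
```

Hannan consistency means exactly that this per-round quantity goes to zero almost surely as n grows.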
Thursday, November 25, 17:00, room 20.137.
Karl G. Jöreskog (University of Uppsala)
Factor analysis of ordinal variables: BIML vs. FIML
Abstract
Wednesday, December 1, 17:00, room 20.199.
Sarunas Raudys
(Institute of Mathematics and Informatics, Vilnius)
"Two approaches to consider small sample - high dimensionality problems
in pattern recognition."
Abstract: The "statistical" approach, championed by C. R. Rao, G. McLachlan,
A. N. Kolmogorov and the many researchers around his Laboratory of
Statistical Methods at Moscow University, among others, makes definite
assumptions about the distribution density functions of the pattern classes
and derives analytical expressions for expected error rates; these are
rather exact, but valid only when the assumptions hold. Another approach,
represented by V. Vapnik, L. Devroye, G. Lugosi, S. Amari, D. Haussler,
T. Hastie and others, makes minimal assumptions and derives bounds on the
error. A certain misunderstanding persists between supporters of these two
approaches. The speaker will review results from "S. Raudys and D. Young
(2004). Results in statistical discriminant analysis: A review of the
former Soviet Union literature. Journal of Multivariate Analysis, 89,
1-35", and would like to use the seminar to open a discussion of these
approaches and the practical use of their results.
Thursday, December 16, 17:00, room 20.237.
Erika Massimiliani (Università di Bologna)
"Multidimensional scaling on nonlinear manifolds"
Abstract:
We frequently encounter large sets of high-dimensional data, and hence the
problem of dimensionality reduction. The task, much as in everyday human
perception, is to find meaningful low-dimensional structures hidden in the
high-dimensional observation space. Many datasets
have the observed data lying on an embedded submanifold of the high-
dimensional space. The degrees of freedom along this submanifold correspond
to the underlying variables. These datasets may contain nonlinear
structures that are invisible to classical techniques for dimensionality
reduction, such as principal component analysis (PCA) and multidimensional
scaling (MDS). These methods are designed to operate when the submanifold
is embedded linearly, or almost linearly, in the observation space.
More generally there is a wider class of techniques, involving iterative
optimization procedures, by which unsatisfactory linear representations
obtained by PCA or MDS may be improved towards more successful nonlinear
representations of the data. New approaches recently devised to address
this problem are described in full in [1-4]. One of them,
Isomap, attempts to preserve geometry on the manifold, mapping nearby
points on the manifold to nearby points in low-dimensional space, and
distant points to distant points. It is capable of discovering the
nonlinear degrees of freedom that underlie complex natural observations. It
combines the major algorithmic features of PCA and MDS: computational
efficiency, global optimality and asymptotic convergence. The algorithm
rests on the idea that only geodesic distances (sometimes called
curvilinear distances) reflect the true low-dimensional geometry of the
manifold.
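The three Isomap steps just described (neighborhood graph, geodesic distances, classical MDS) can be sketched in a few lines of NumPy; the spiral dataset is a made-up one-dimensional manifold embedded in 3-D:

```python
import numpy as np

def isomap(X, n_neighbors=6, n_components=2):
    """Minimal Isomap: kNN graph -> geodesic distances -> classical MDS."""
    n = len(X)
    d = np.linalg.norm(X[:, None] - X[None, :], axis=-1)  # Euclidean distances
    # Keep only each point's k nearest neighbors; other edges start at infinity.
    g = np.full((n, n), np.inf)
    nn = np.argsort(d, axis=1)[:, 1:n_neighbors + 1]
    rows = np.repeat(np.arange(n), n_neighbors)
    g[rows, nn.ravel()] = d[rows, nn.ravel()]
    g = np.minimum(g, g.T)                                # symmetrize the graph
    np.fill_diagonal(g, 0.0)
    for k in range(n):                                    # Floyd-Warshall geodesics
        g = np.minimum(g, g[:, k:k + 1] + g[k:k + 1, :])
    # Classical MDS on the squared geodesic distance matrix.
    h = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * h @ (g ** 2) @ h
    vals, vecs = np.linalg.eigh(b)
    idx = np.argsort(vals)[::-1][:n_components]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))

# Toy spiral: a 1-D manifold embedded in 3-D, parameterized by t
t = np.linspace(0, 3 * np.pi, 80)
X = np.column_stack([t * np.cos(t), t * np.sin(t), np.zeros_like(t)])
Y = isomap(X, n_neighbors=4, n_components=1)
```

The recovered coordinate `Y[:, 0]` tracks the underlying parameter t along the spiral, which linear PCA or classical MDS on Euclidean distances would not unroll.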
Many recent publications [5-8] build on this algorithm, and new
applications [9-10] have been studied. The goal of my research is to
provide empirical evidence of its good performance in different scientific
areas.
References:
[1] J. B. Tenenbaum, V. De Silva and J. C. Langford (2000). A global
geometric framework for nonlinear dimensionality reduction. Science,
290 (5500), 2319-2323.
https://www.sciencemag.org/cgi/reprint/290/5500/2319.pdf
[2] V. de Silva, J. B. Tenenbaum (2002). Global versus local methods
in nonlinear dimensionality reduction. Advances in Neural Information
Processing Systems 15, S. Becker, S. Thrun, and K. Obermayer (eds.),
Cambridge, MIT Press, 705-712.
https://web.mit.edu/cocosci/Papers/nips02-localglobal-in-press.pdf
[3] L. Saul and S. Roweis (2002). Think globally, fit locally:
unsupervised learning of low dimensional manifolds. Journal of Machine
Learning Research, 4, 119-155.
https://www.cs.toronto.edu/~roweis/papers/lle_tr02.pdf
[4] S. T. Roweis and L. K. Saul (2000). Nonlinear dimensionality
reduction by locally linear embedding. Science, 290 (5500), 2323-2326.
https://www.sciencemag.org/cgi/reprint/290/5500/2323.pdf
[5] J. A. Lee, A. Lendasse, M. Verleysen (2004). Nonlinear
projection with curvilinear distances: Isomap versus curvilinear
distance analysis. Neurocomputing, 57, 49-76.
https://www.dice.ucl.ac.be/~verleyse/papers/neurocom04jl.pdf
[6] Y. Bengio, J. Paiement, P. Vincent, O. Delalleau, N. Le Roux, M.
Ouimet (2003). Out-of-Sample Extensions for LLE, Isomap, MDS,
Eigenmaps, and Spectral Clustering. NIPS 2003.
https://www.iro.umontreal.ca/~lisa/pointeurs/tr1238.pdf
[7] J. Nilsson, T. Fioretos, M. Höglund and M. Fontes (2004).
Approximate geodesic distances reveal biologically relevant structures
in microarray data. Bioinformatics, 20 (6), 874-880.
https://lifesciences.asu.edu/bio494/mrosenberg/Nov08-2.pdf
[8] M-H Yang (2002). Extended Isomap for Pattern Classification.
Proceedings of the Eighteenth National Conference on Artificial
Intelligence (AAAI 2002), pp. 224-229,
https://vision.ai.uiuc.edu/mhyang/papers/aaai02.pdf
[9] D. J. Navarro, M. D. Lee (2001). Spatial Visualization of
Document Similarity. Defence Human Factors Special Interest Group
Meeting, 16-17 August, 2001.
https://quantrm2.psy.ohio-state.edu/Navarro/visual.pdf
[10] I. S. Lim, P. H. Ciechomski, S. Sarni, D. Thalmann (2003).
Planar arrangement of High-dimensional Biomedical Data Sets by Isomap
Coordinates. Proceedings of the 16th IEEE Symposium on Computer-Based
Medical Systems (CBMS 2003)
https://ligwww.epfl.ch/Publications/pdf/Lim_and_al_CBMS_03.pdf
Thursday, January 20, 17:00, room 20.137.
Stefan Hoderlein (University of Mannheim)
"Nonparametric Demand Analysis in a Heterogeneous
Population using LPR based Estimators"
Abstracts of the two papers covered in the seminar:
1. This paper is concerned with empirically modelling
the demand behavior of a population with heterogeneous
preferences under a weak conditional independence
assumption. More specifically, we characterize the
testable implications of negative semidefiniteness and
symmetry of the Slutsky matrix across a heterogeneous
population without assuming anything about the functional
form of individual preferences. In the same spirit, the
implications of a linear budget set are also considered.
Since the conditional independence assumption is the
only substantial restriction in this model, we analyze
possible alternatives and solutions if this assumption
is violated. In particular, we consider in detail the
concept of instruments in this framework.
Finally, we provide asymptotic distribution theory for
the new test statistics that emerge out of this
framework, and apply these to Canadian data.
2. In this paper, we introduce a kernel-based
estimation principle for nonparametric models, named
local partitioned regression (LPR). This principle is a
nonparametric generalization of the familiar partitioned
regression in linear models. It has several key
advantages:
First, it generates estimators for a very large class
of semi- and nonparametric models. A number of
examples which are particularly relevant for economic
applications will be discussed in this paper. This
class contains the additive, partially linear and
varying coefficient models as well as several other
models that have not been discussed in the literature.
Second, LPR based estimators generally achieve
optimality criteria: They have optimal speed of
convergence and are oracle-efficient. Moreover, they
are simple in structure, widely applicable and
computationally inexpensive.
The LPR estimation principle involves preestimation of
conditional expectations and derivatives of densities.
We establish that the asymptotic distribution of the
estimator remains unaffected by preestimation if the
total number of regressors is smaller than ten, in the
sense that we do not require additional smoothness
assumptions in preestimation. Finally, a Monte-Carlo
simulation underscores these advantages.
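LPR itself is not specified in the abstract; as a hedged illustration of the partitioned-regression idea it generalizes, the classical double-residual (Robinson-type) estimator for a partially linear model uses kernel preestimates of conditional expectations to partial the nonparametric component out of both y and x, then regresses residual on residual. All data, coefficients, and bandwidths below are made up:

```python
import numpy as np

def nw(x, y, x_eval, h):
    """Nadaraya-Watson kernel preestimate of E[y | x] at x_eval."""
    w = np.exp(-0.5 * ((x_eval[:, None] - x[None, :]) / h) ** 2)
    return (w @ y) / w.sum(axis=1)

rng = np.random.default_rng(2)
n = 2000
z = rng.uniform(0, 1, n)
x = z + rng.standard_normal(n)       # regressor, correlated with z
g = np.sin(2 * np.pi * z)            # unknown nonparametric component
beta = 1.5                           # parametric coefficient of interest
y = beta * x + g + 0.1 * rng.standard_normal(n)

# Double residuals: partial z out of both y and x with kernel
# preestimates, then recover beta by a residual-on-residual regression.
ry = y - nw(z, y, z, h=0.05)
rx = x - nw(z, x, z, h=0.05)
beta_hat = (rx @ ry) / (rx @ rx)
```

The point of the construction is that the preestimation step does not contaminate the parametric part, which is the kind of result the abstract's preestimation theorem formalizes.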
Thursday, March 17, 17:00, room 20.179.
Diego Ruiz (UPF)
Some Indexable Families of Restless Bandit Problems
The abstracts of the two corresponding papers are available separately.
Thursday, March 31, 17:00, room 40.041.
Robin Hogarth and Natalia Karelaia (UPF)
Simple models of bounded rationality: Predicting when and why
they are effective
Abstract:
We explore environmental conditions under which simple, boundedly rational
models produce effective responses. Specifically, we derive probabilities
that models identify the best of m alternatives (m > 2) characterized by k
attributes (k > 1). The models include a single variable (lexicographic),
variations of elimination-by-aspects, equal weighting, hybrids of the
preceding, and models exploiting dominance. We compare all with multiple
regression. Four environmental factors affect relative performance: how
attributes are weighted; characteristics of choice sets (e.g., correlational
structure); whether attributes are continuous or binary; and error. We
illustrate the theory with twenty simulated and four empirical datasets.
Fits between predictions and realizations are excellent. No single model is
"best." We further provide an overview by regressing the performance of the
different models on factors characterizing environments. We conclude with
suggestions for further research as well as some economic implications.
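The models and environments are specified precisely in the paper; the following made-up simulation is merely in their spirit, comparing a single-variable (lexicographic) rule with equal weighting at identifying the best of m alternatives when the true attribute weights are non-compensatory:

```python
import numpy as np

rng = np.random.default_rng(3)
m, k, trials = 3, 4, 5000              # alternatives, attributes, choice sets
w = np.array([0.6, 0.25, 0.1, 0.05])   # true non-compensatory weights (made up)

hits_single, hits_equal = 0, 0
for _ in range(trials):
    a = rng.standard_normal((m, k))    # attribute values of the m alternatives
    best = np.argmax(a @ w)            # true best under the weighted criterion
    hits_single += np.argmax(a[:, 0]) == best        # lexicographic: top cue only
    hits_equal += np.argmax(a.mean(axis=1)) == best  # equal weighting

p_single = hits_single / trials
p_equal = hits_equal / trials
```

With weights this skewed, the single-variable rule tends to beat equal weighting, illustrating how the weighting structure of the environment drives relative model performance.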
Thursday, June 2, 17:00, room 20.137.
Maxwell Stinchcombe
(University of Texas at Austin)
The Unbearable Flightiness of Bayesians:
Generically Erratic Updating
Abstract:
A decision maker tries to learn the distribution of an observed,
utility-relevant, independent and identically distributed (iid) sequence
of random variables. The random variables have infinite support, and the
decision maker learns by updating a prior distribution on the set of
distributions of the sequence. For a generic set of priors, Bayesian
updating and the corresponding optimizing behavior are wildly erratic.
Thursday, June 16, 17:00, room 20.237.
Nicolas Vayatis
(Université Paris 6)
Recursive aggregation of many classifiers
Abstract:
We consider a recursive algorithm to construct an aggregated estimator
from a finite number of base decision rules in the classification
problem. Similarly to regularized boosting methods, the estimator
approximately minimizes a convex risk functional under the
l1-constraint. It is defined by a stochastic version of the mirror
descent algorithm (i.e., of the method which performs gradient descent
in the dual space) with an additional averaging step. The main result is
an upper bound for the expected accuracy of the proposed estimator. A
similar bound is proved in a more general setting that covers, in
particular, the regression model with squared loss. Finally, we
present computer simulations that illustrate the performance of the
method on artificial data and provide a comparison with existing algorithms.
Joint work with A. Juditsky, A. Nazin, and A. Tsybakov.
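A minimal sketch of the method's ingredients (an entropic mirror map onto the l1-simplex, stochastic gradient steps in the dual space, and the averaging step), applied here to aggregating hypothetical base predictors under squared loss rather than a classification loss; the data, step size, and target combination are all made up:

```python
import numpy as np

rng = np.random.default_rng(4)
n, M = 5000, 5                        # samples, base rules
F = rng.uniform(-1, 1, (n, M))        # base rules' predictions on each sample
theta = np.array([0.5, 0.3, 0.2, 0.0, 0.0])  # target convex combination
y = F @ theta + 0.1 * rng.standard_normal(n)

lam = np.zeros(M)                     # dual variable (unnormalized log-weights)
avg = np.zeros(M)
eta = 0.5
for t in range(n):
    w = np.exp(lam - lam.max())
    w /= w.sum()                      # mirror map: dual point -> simplex
    grad = 2 * (w @ F[t] - y[t]) * F[t]  # stochastic gradient of squared loss
    lam -= eta * grad                 # gradient step taken in the dual space
    avg += w
avg /= n                              # the additional averaging step

risk = np.mean((F @ avg - y) ** 2)    # empirical risk of the aggregate
```

The averaged weight vector stays on the simplex by construction, which is how the l1-constraint is enforced without any projection.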