Statistics Seminars 2007-2008,
Department of Economics, Pompeu Fabra University
Schedule.
Thursday, October 25, 16:00,
room 40.039.
Ab Mooijaart
(Leiden University)
Interaction in Latent Variable Models: Some aspects of estimation and testing
Abstract: Over the last decade, models with latent interaction
variables have received a great deal of attention. Such models allow
one to analyze, for example, moderator effects. A variety of methods
have been proposed for analyzing these kinds of models; however, they
are rather complicated to apply. See, for instance, Jöreskog and Yang
(1996), Arminger and Muthén (1998), Klein and Moosbrugger (2000), and
Lee and Zhu (2002). Easier-to-handle types of analyses have also been
proposed, but these may have rather poor statistical properties; see
Marsh, Wen and Hau (2004).
In this talk some aspects of estimating the parameters and testing the
goodness of fit of these models will be discussed. In particular, the
talk will concentrate on two topics:
1. What happens if we analyze the data by a normal-theory method,
ignoring the fact that there is a latent interaction variable? It will
be shown that, even though the test statistic indicates a proper
model, this may result in a wrong decision.
2. An alternative estimation procedure will be presented, based on
fitting the means, covariances and some higher-order moments. Some
advantages and disadvantages of this method will be discussed.
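As a toy illustration of the moment-fitting idea, consider a model with an interaction term estimated by matching second- and third-order cross moments. This is a simplified sketch with observed (not latent) regressors, so the numbers and setup here are our own and not the speaker's specification; the point is only that the third-order moment identifies the interaction coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
x1, x2 = rng.standard_normal(n), rng.standard_normal(n)
beta = np.array([1.0, 0.5, 0.8])           # true effects, incl. interaction
y = beta[0]*x1 + beta[1]*x2 + beta[2]*x1*x2 + rng.standard_normal(n)

# Method of moments: match E[y*x1], E[y*x2] (second order) and the
# third-order cross moment E[y*x1*x2], which identifies the interaction.
Z = np.column_stack([x1, x2, x1*x2])
M = Z.T @ Z / n                            # sample moments of the regressors
m = Z.T @ y / n                            # sample cross moments with y
beta_hat = np.linalg.solve(M, m)
```

With latent regressors the model-implied moments would instead be functions of the structural parameters, but the fitting principle is the same.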
Tuesday, November 27, 17:00,
room 20.175.
Victor Panaretos
(Ecole Polytechnique Fédérale de Lausanne)
A Statistical Approach to Random Tomography in Structural Biology
Abstract: Single particle electron microscopy is a powerful method
that biophysicists employ to learn about the structure of biological
macromolecules. In contrast to the more traditional crystallographic
methods, this method images unconstrained particles, thus posing a
variety of statistical problems. We formulate and study such a
problem, one that is essentially of a random tomographic nature, where
a structural model for a biological particle is to be constructed
given random projections of its Coulomb potential density, observed
through the electron microscope. Although unidentifiable (ill-posed),
this problem can be seen to be amenable to a statistical solution,
once parametric assumptions are imposed. It can also be seen to
present challenges both from a data-analysis point of view
(e.g. uncertainty estimation and presentation) and from a
computational one.
Tuesday, February 12, 11:00,
room 40.041.
Jorge León
(CINVESTAV)
Itô's formula for the solution of the fractional heat equation.
Abstract: In this talk, we define a stochastic integral with respect
to the solution of the heat equation driven by a fractional Brownian
motion with Hurst parameter bigger than 1/2. This integral satisfies a
transfer principle with respect to the Skorohod integral, so we can
use Malliavin calculus techniques to analyze an Itô formula for the
solution of the fractional heat equation.
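For concreteness, one common form of the equation in question is the following; the abstract does not write it out, so the exact specification here is an assumption:

```latex
\frac{\partial u}{\partial t}(t,x) \;=\; \Delta u(t,x) \;+\; \dot{W}^{H}(t,x),
\qquad t \ge 0,\quad H > \tfrac{1}{2},
```

where $W^{H}$ denotes a fractional Brownian motion with Hurst parameter $H$, and $u$ is the solution with respect to which the stochastic integral is defined.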
Tuesday, February 12, 16:30,
room 20.237.
Wolfgang Polasek
(Institute for Advanced Studies, Vienna, Austria)
Long term regional forecasting with spatial
equation systems
Abstract:
We use a spatial econometric extension of the traditional
regression-based gravity model introduced by LeSage and Pace (2005) to
model commodity flows between 35 regions in Austria. Our focus is on a
formal methodology for incorporating information regarding the highway
network into the spatial connectivity structure of the spatial
autoregressive econometric model. We show that our simple approach to
incorporating this information in the model produces improved model
fit and higher likelihood function values. The model introduced by
LeSage and Pace (2005) accounts for spatial dependence in the
origin-destination flows by introducing a spatial connectivity matrix
that allows for three types of spatial dependence in the flows from
origins to destinations. We modify this origin-destination
connectivity structure to include information regarding the presence
or absence of a major highway artery that passes through the
regions. Empirical estimates of the relative importance of the
different types of origin-destination connectivity between regions
indicate that the strongest spatial autoregressive effects arise when
both origin and destination regions have neighboring regions located
on the highway network. This is an intuitively plausible result that
should be viewed in the context of past regression-based
origin-destination gravity models, which assume that the flows between
the origin-destination pairs making up the sample data observations
are independent. Our approach builds on that of LeSage and Pace (2005)
to provide a formal spatial econometric methodology that can easily
incorporate network connectivity information in spatial autoregressive
models, which can be estimated using slightly altered versions of
widely available conventional algorithms.
(Joint work with Richard Sellner and
Wolfgang Schwarzbauer.)
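The idea of folding network information into a spatial weight matrix can be sketched as follows. This is a toy of our own (five hypothetical regions, a made-up highway indicator), not the authors' actual specification: contiguity links are up-weighted when both regions lie on the highway artery, and the matrix is then row-standardized as is conventional for spatial autoregressive models.

```python
import numpy as np

# Toy first-order contiguity structure for 5 regions (hypothetical).
contiguity = np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

on_highway = np.array([1, 1, 0, 1, 0])     # hypothetical highway indicator

# Double the weight of a link when both endpoints are on the highway.
W = contiguity * (1 + on_highway[:, None] * on_highway[None, :])
W = W / W.sum(axis=1, keepdims=True)       # row-standardize
```

The row-standardized W can then enter a spatial autoregressive specification in the usual way.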
Wednesday, March 12, 16:00,
room 40.039.
Alex Beskos
(University of Warwick)
Monte-Carlo Maximum Likelihood for diffusions.
Abstract: In several statistical applications the likelihood function is
not analytically available but can be estimated with Monte-Carlo methods
(see e.g. Geyer & Thompson (1992), "Constrained Monte Carlo maximum
likelihood for dependent data" for a classic reference). An interesting
question arising then is the relationship between the computational
effort (as measured by the Monte-Carlo iterations $N$) and the data size
$n$ to achieve consistency. Typically, as the data size increases so
does the computational effort to obtain reliable estimates of the
likelihood.
We look at a particular context when the likelihood derives from
observing a continuous time process (diffusion) at discrete time
instances. In such cases, the transition density of the process is
typically unknown and the likelihood function unavailable. In earlier
work we have provided an unbiased estimator of the transition density of
the process. The Markov structure of the process is critical in
showing that the optimal choice of computing effort is $N=O(n^{1/2})$. This
contrasts with an exponential rate reported in the literature for
general applications. We show the asymptotic results in some detail. In
the process, we present the basic principles of maximum likelihood
theory for ergodic processes. We also hint at connections of the main
ideas with other Monte-Carlo applications requiring tuning.
The talk is primarily based on the article "Monte Carlo maximum
likelihood estimation for discretely observed diffusion processes" by
Beskos, Papaspiliopoulos and Roberts, Annals of Statistics (2008).
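The general Monte-Carlo maximum likelihood idea of Geyer & Thompson, which the abstract cites, can be sketched in a toy exponential-family setting rather than the diffusion setting of the talk (the model, reference parameter and grid below are our own illustrative choices): the intractable normalizing constant is estimated by importance sampling from a reference parameter, and the resulting Monte-Carlo log-likelihood is maximized.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy model: p_theta(x) proportional to exp(theta*x - x^2/2), i.e. N(theta, 1),
# treating the normalizer Z(theta) as if it were intractable.
data = rng.normal(2.0, 1.0, size=500)      # observations, true theta = 2

theta0 = 1.0                               # reference parameter
N = 100_000                                # Monte-Carlo effort
ref = rng.normal(theta0, 1.0, size=N)      # draws from p_theta0

def mc_loglik(theta):
    # Z(theta)/Z(theta0) estimated by importance sampling from p_theta0.
    log_z_ratio = np.log(np.mean(np.exp((theta - theta0) * ref)))
    # Log-likelihood up to a theta-independent constant.
    return theta * data.sum() - len(data) * log_z_ratio

grid = np.linspace(1.2, 2.8, 161)
theta_hat = grid[np.argmax([mc_loglik(t) for t in grid])]
```

The talk's point is about how large N must grow with the data size n for such estimators to remain reliable; in this toy the Monte-Carlo error simply adds noise to the likelihood surface being maximized.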
Thursday, April 10, 15:00,
room 40.273.
Eran Shmaya (California Institute of Technology)
Many inspections are manipulable
Abstract: A self-proclaimed expert uses past observations of a
stochastic process to make probabilistic predictions about the
process. An inspector applies a test function to the infinite sequence
of predictions provided by the expert and the observed realization of
the process in order to check the expert's reliability. If the test
function is Borel and the inspection is such that a true expert will
always pass it, then it is also manipulable by an ignorant expert. The
proof uses Martin's theorem about determinacy of Blackwell
games. Under the axiom of choice, there exist non-Borel test functions
that are not manipulable.
Monday, May 26, 12:00,
room 20.287.
José Enrique Chacón (Universidad de Extremadura)
Plug-in choice for non-fixed-shape kernel density estimators
Abstract: There exist kernel density estimators that do not need a
bandwidth, because for every sample size n the whole kernel function
is allowed to be chosen, so that its shape may vary freely with n. In
this sense, these estimators are different from the classical ones,
where the shape of the kernel is fixed and, apart from the averaging
process, only a scaling parameter, the bandwidth, is allowed to depend
on n. In this talk we will introduce a plug-in data-dependent approach
to select the kernel function for non-fixed-shape kernel density
estimators. (Joint work with Alberto Rodríguez Casal.)
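For contrast with the non-fixed-shape estimators of the talk, the classical setup the abstract describes, a fixed kernel shape with only the bandwidth depending on n, can be sketched as follows (the normal-reference bandwidth rule below is a standard textbook choice, not the speakers' selector):

```python
import numpy as np

# Classical fixed-shape kernel density estimator: the Gaussian kernel's
# shape is fixed for all n; only the bandwidth h depends on the sample.
def kde(x_grid, sample, h):
    u = (x_grid[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(sample) * h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)
sample = rng.standard_normal(1000)
h = 1.06 * sample.std() * len(sample) ** (-1 / 5)   # normal-reference rule
grid = np.linspace(-4, 4, 81)
density = kde(grid, sample, h)
```

The estimators discussed in the talk instead allow the whole kernel function, not just h, to be chosen from the data for each n.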
Thursday, June 5, 15:00,
room 40.S01.
Michael Greenacre (UPF)
Voyages in N-dimensional space:
Dynamic transitions between multivariate methods
Abstract: In this presentation I will show several animations
depicting smooth transitions from the visualized results of one
multivariate method to those of another. Several well-known methods,
often presented as alternative approaches, are shown to be linkable
through a parameter that embeds the methods into a common
family. Generally, we define the parameter so that the first analysis
is obtained when the parameter has value 1, while the value of 0 (or
at the limit of 0) yields the second analysis. Then, by varying the
parameter from 1 to 0 (or the limit of 0), and calculating the
results at each intermediate step, we can assemble a series of frames
that can be concatenated into an animated video sequence. For each
pair of analyses, we shall first present the way the parameter is
defined, and then show the video as the parameter is varied. Methods
include principal component analysis, discriminant
analysis, correspondence analysis (simple, multiple and canonical),
log-ratio analysis, partial least squares and multidimensional
scaling. Seeing the connections between methods can facilitate
understanding of the methods as well as the interpretation of the
data themselves. This talk is aimed at applied statisticians and
researchers in economics, business, social and environmental
sciences.
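The frame-assembly recipe described above can be sketched as follows. The family used here (interpolating the column scaling of a principal component analysis between standardized and unstandardized variables) is our own toy choice of embedding parameter, not necessarily one of the pairs shown in the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 4))           # toy data: 30 cases, 4 variables

def analysis(alpha):
    # Hypothetical one-parameter family: alpha = 1 gives PCA on
    # standardized variables, alpha = 0 gives PCA on raw (centered) data.
    scales = X.std(axis=0) ** alpha
    Xs = (X - X.mean(axis=0)) / scales
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    return Xs @ Vt[:2].T                   # 2-D coordinates for one frame

# Vary the parameter from 1 to 0 and collect one frame per value.
frames = [analysis(a) for a in np.linspace(1.0, 0.0, 21)]
```

Concatenating the frames (with any plotting library) then yields the animated transition between the two analyses.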
Tuesday, June 10, 17:00,
room 20.237.
José Fernando Álvarez (UPF)
Joint routing and deployment of a fleet of container vessels
Abstract: Liner companies face a complex problem in determining the
optimal routing and deployment of a fleet of container vessels. This
paper presents a model and an algorithm that address the two
problems jointly. The model captures the revenues and operating
expenses of a global liner company. The model allows for the
representation of vessel types with different cost and operating
properties; transhipment hubs and associated costs; port delays;
regional trade imbalances; and the possibility of rejecting
transportation demand selectively. The proposed algorithm is applied
in a case study with 120 ports of call distributed throughout the
globe. The case study explores the sensitivity of optimal fleet
deployment and routing to varying bunker costs.
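The joint decision the abstract describes can be illustrated with a deliberately tiny brute-force sketch (the routes, revenues and costs are invented numbers; the paper's actual model and algorithm are far richer): for each candidate route, choose which vessel type to deploy, if any, maximizing profit subject to fleet availability, with unprofitable demand rejected by leaving a route unserved.

```python
from itertools import product

# Hypothetical candidate routes with revenue if served, and per-route
# operating cost by vessel type.
routes = ["ASIA-EUR", "TRANSPAC", "INTRA-MED"]
revenue = {"ASIA-EUR": 120, "TRANSPAC": 90, "INTRA-MED": 40}
cost = {"small": 30, "large": 70}
fleet = {"small": 1, "large": 1}           # vessels available per type

best_profit, best_plan = 0, {}
for plan in product([None, "small", "large"], repeat=len(routes)):
    used = {v: plan.count(v) for v in fleet}
    if any(used[v] > fleet[v] for v in fleet):
        continue                           # fleet constraint violated
    profit = sum(revenue[r] - cost[v]
                 for r, v in zip(routes, plan) if v is not None)
    if profit > best_profit:
        best_profit, best_plan = profit, dict(zip(routes, plan))
```

At realistic scale (120 ports of call), enumeration is of course infeasible, which is why a dedicated algorithm is needed.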