2012 Seminar Series – 1st Semester
Seminars of the Departamento de Métodos Estatísticos - Instituto de Matemática - UFRJ
1st semester of 2012
The seminars took place in the Auditorium of the Laboratório de Sistemas Estocásticos (LSE), room I-044b, at 15:30, apart from a few exceptions duly indicated.
Frozen percolation was introduced by D. Aldous as a probabilistic model for the formation of a gel. Given a sequence of iid random variables (U_v), uniformly distributed on [0,1], where v ranges over the vertices of the graph, the model is defined as follows: at time t = 0 all sites are inactive; site v becomes active at time U_v; and when an infinite cluster of active sites forms, all sites of that cluster become frozen. Thus, at time t = 1 every site is either active or frozen.
We study a modification of this process, on the square lattice, in which a cluster freezes when its diameter reaches the value N. We show that, in the limit as N tends to infinity, the probability that the origin is still active is strictly positive. Joint work with J. van den Berg (CWI, Amsterdam) and Pierre Nolin (ETH, Zurich).
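As an illustration of the finite-N variant described in the abstract, here is a minimal Python sketch. It is not the authors' construction: it makes two simplifying assumptions, namely that the diameter of a cluster is the largest side of its bounding box, and that a site adjacent to a frozen cluster simply stays inactive forever.

```python
import random

def frozen_percolation(L, N, seed=0):
    """Modified frozen percolation on an L x L box of the square lattice.
    Site v becomes active at time U_v ~ Uniform(0,1); a cluster freezes as
    soon as its diameter reaches N (diameter = largest bounding-box side,
    an illustrative convention)."""
    rng = random.Random(seed)
    clocks = sorted(((rng.random(), x, y) for x in range(L) for y in range(L)))
    active = [[False] * L for _ in range(L)]
    parent, bbox, frozen = {}, {}, set()

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path compression
            v = parent[v]
        return v

    for _, x, y in clocks:
        nbrs = [(x + dx, y + dy)
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= x + dx < L and 0 <= y + dy < L and active[x + dx][y + dy]]
        if any(find(nb) in frozen for nb in nbrs):
            continue                 # blocked by a frozen neighbour (assumption)
        active[x][y] = True
        v = (x, y)
        parent[v], bbox[v] = v, [x, x, y, y]
        for nb in nbrs:              # merge with active neighbouring clusters
            r = find(nb)
            if r != v:
                parent[r] = v
                b, c = bbox[r], bbox[v]
                bbox[v] = [min(b[0], c[0]), max(b[1], c[1]),
                           min(b[2], c[2]), max(b[3], c[3])]
        b = bbox[v]
        if max(b[1] - b[0], b[3] - b[2]) >= N:
            frozen.add(v)            # whole cluster freezes at this moment

    labels = {}
    for x in range(L):
        for y in range(L):
            if not active[x][y]:
                labels[(x, y)] = 'inactive'
            elif find((x, y)) in frozen:
                labels[(x, y)] = 'frozen'
            else:
                labels[(x, y)] = 'active'
    return labels
```

At time t = 1 every site is inactive, active, or frozen, and for L > N at least one frozen cluster must exist, since otherwise all sites would have activated into a single cluster of diameter exceeding N.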
We develop a novel computational framework for Bayesian optimal sequential design for nonparametric regression. This computational framework is based on evolutionary Markov chain Monte Carlo (EMCMC), which combines ideas from genetic or evolutionary algorithms with the power of Markov chain Monte Carlo. Our framework can accommodate general models for the observations, such as exponential family distributions and scale mixtures of normals. In addition, it allows optimality criteria with general utility functions that may include competing objectives, such as minimization of costs, minimization of the distance between the true and estimated functions, and minimization of the prediction error. Finally, we illustrate our methodology with applications to experimental design for nonparametric function estimation.
In medical diagnostic testing, it is common to apply more than one diagnostic test to the same individual. Usually these tests are assumed to be independent, and important performance measures, such as the sensitivities and specificities of the tests, are estimated in the presence or absence of a reference test usually known as the "gold standard". These tests may in fact be dependent, since they are applied to the same individual, and this assumption can modify the estimation of the performance measures. Considering two diagnostic tests, we could assume a bivariate Bernoulli distribution. Alternatively, we propose the use of different copula functions to model the association between tests. Under the Bayesian paradigm, the posterior summaries of interest are obtained using Markov chain Monte Carlo (MCMC) methods. A detailed discussion of the elicitation of prior distributions for the test performance and copula parameters is included in this study. We illustrate the proposed methodology with two medical data sets from the literature.
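As a sketch of the copula idea, the snippet below builds the joint distribution of two test results among diseased subjects from the two sensitivities, using a Farlie-Gumbel-Morgenstern (FGM) copula. The FGM family is just one convenient choice, not necessarily among the copulas considered in the talk.

```python
def fgm_joint(se1, se2, theta):
    """2x2 joint distribution of two binary test results among diseased
    subjects: marginals equal the sensitivities se1, se2, and dependence
    comes from an FGM copula with parameter theta in [-1, 1]
    (theta = 0 recovers independent tests).  Illustrative choice of copula."""
    # FGM copula evaluated at the marginals: C(u, v) = uv(1 + theta(1-u)(1-v))
    p11 = se1 * se2 * (1.0 + theta * (1.0 - se1) * (1.0 - se2))
    return {(1, 1): p11,            # both tests positive
            (1, 0): se1 - p11,      # only test 1 positive
            (0, 1): se2 - p11,      # only test 2 positive
            (0, 0): 1.0 - se1 - se2 + p11}  # both negative
```

The four cell probabilities sum to one by construction, and the marginals are preserved for any theta, so the dependence parameter only redistributes mass between agreement and disagreement cells.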
18/04 Inter-institutional colloquium "Modelos estocásticos e aplicações" (exceptionally at 13:30 in room C116 of IM)
The ability to trap bosonic and fermionic atoms in optical lattices, whose crystalline potential is generated by counter-propagating lasers, at ultra-low temperatures, has launched a new research area at the frontier between condensed matter physics, atomic physics, and optics. In contrast to what happens in condensed matter systems, in optical lattices there is great control over the parameters involved: the interactions between the atoms are controlled through a magnetic field and can be attractive or repulsive, the chemical potential is easily tunable, and there is no disorder. As a result, a new development in this area is the possibility of realizing in the laboratory models for strongly correlated fermions, among which the most studied is the Hubbard model. Currently, the main challenge in this area is to achieve the cooling necessary to observe ordered phases, such as antiferromagnetism, superconductivity, or superfluidity. In this colloquium I will discuss the most recent experimental and theoretical advances in this area.
In this talk we take a tour through the theory of percolative growth models and their competition interfaces. We will see classical results, such as the shape theorem, as well as recent results on the shape of the competition interface, along with fundamental problems that remain open.
This talk concerns an exploratory study carried out to provide items to be submitted to aphasic patients, in order to evaluate the degree of their disease. This preliminary study is devoted to the identification of the images to be submitted, selecting them from an internationally adopted set of images. The selection proceeded in two steps: i) selecting the images according to how easily they could be recognized by the patients; and ii) evaluating the primitiveness of the objects' nouns to be verbalized, aiming to limit attention to the most primitive ones. Both steps were carried out by submitting items to non-aphasic judges, in order to evaluate the quality of the items themselves in a neutral way. The images were first submitted to Correspondence Analysis, to identify those least recognized by the judges and exclude them from the next step. Then the selected objects were submitted to two sets of judges to evaluate their degree of primitiveness, according to: i) a predefined seven-step age scale, and ii) a free 1-7 scale. The results were first submitted to both Principal Component and Multiple Correspondence Analyses, to remove any judge who turned out to be an outlier with respect to the others. The remaining data were then analysed through Multiple Factor Analysis, to compare the extent to which the two measurement scales gave different results: it appeared that the free scale allowed the judges to use the whole range, whereas the predefined one led to the selection of a limited number of steps. Nevertheless, its first principal component, that is, the objects' scores along the first axis, could be taken as a measure of primitiveness.
09/03 (exceptionally on a Friday, at 13:30)
We propose two new classes of links for modeling mixed models for binary responses. We show that these extensions are appropriate for the analysis of several types of correlated data structures, in particular for clustered and/or longitudinal data and, more generally, in multilevel models. The proposed links can be named power and reciprocal power, given the relationship between them. Both include the usual symmetric links, such as the logit and probit, as special cases. Also, the univariate models and the random-effects models with symmetric links in binary regression are special cases of the models considered here. A Bayesian inference approach using MCMC is developed.
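A minimal sketch of what such links might look like, assuming a logistic baseline cdf F (the talk's exact parameterization may differ): the power link takes the success probability to be F(eta)^lambda, and the reciprocal power link takes it to be 1 - (1 - F(eta))^lambda, so lambda = 1 recovers the logit in both cases while other values introduce asymmetry.

```python
import math

def logistic(eta):
    """Logistic cdf, the baseline of the usual logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

def power_link_inv(eta, lam):
    """Inverse of a power link: success probability F(eta)**lam.
    Sketch under the assumption of a logistic baseline F; lam = 1 is logit."""
    return logistic(eta) ** lam

def reciprocal_power_link_inv(eta, lam):
    """Inverse of a reciprocal-power link: 1 - (1 - F(eta))**lam.
    Same assumption; lam = 1 again recovers the logit."""
    return 1.0 - (1.0 - logistic(eta)) ** lam
```

At eta = 0 the logit gives probability 1/2, while the power link with lam = 2 gives 1/4 and the reciprocal-power link gives 3/4, which shows how the extra parameter skews the response curve in opposite directions.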
18/01 - (exceptionally at 13:30)
Matthias Kormaksson (Cornell)
For many high-dimensional data sets a common goal is to test thousands of features against some null hypothesis. This simultaneous testing problem has been studied extensively over the last decade in the context of continuous microarray data. However, little attention has been given to a new class of data arising in several fields, including genomics, epigenomics, and proteomics. These data, so-called next-generation sequencing data, are measured at a much higher resolution than regular microarray data and are not continuous, but rather come in the form of counts or proportions. For these data, new methods are needed to discover features that show a statistical difference across conditions. To address this need, we have developed a three-group mixture model, which can be applied to data that follow distributions in the exponential dispersion family. The proposed model fits into a framework that we call Mixture of Generalized Linear Mixed Models (MGLMM) and applies to a variety of high-dimensional data. In this talk I will present the MGLMM model and apply it to two different data sets arising in methylation sequencing and proteomics analysis. I will also present simulation results that suggest superior performance over the methods currently employed.
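To make the three-group idea concrete, here is a toy sketch, not the MGLMM itself: a single count-valued feature is assigned posterior probabilities of belonging to a null, a down-regulated, or an up-regulated Poisson component. The rates and weights below are illustrative placeholders, not fitted quantities.

```python
import math

def posterior_group_probs(count, rates=(10.0, 3.0, 30.0), weights=(0.8, 0.1, 0.1)):
    """Posterior membership probabilities of one count-valued feature under a
    three-group Poisson mixture: group 0 = null, 1 = down, 2 = up.
    rates/weights are hypothetical values chosen only for illustration."""
    # Poisson log-likelihood of the count under each component, plus log-weight
    logliks = [math.log(w) - r + count * math.log(r) - math.lgamma(count + 1)
               for w, r in zip(weights, rates)]
    m = max(logliks)                       # log-sum-exp for numerical stability
    unnorm = [math.exp(ll - m) for ll in logliks]
    total = sum(unnorm)
    return [u / total for u in unnorm]
```

A count of 30 is assigned overwhelmingly to the "up" component and a count of 10 to the null component, which is the kind of feature-level classification the mixture framework delivers.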
06/06 Inter-institutional colloquium "Modelos estocásticos e aplicações" (exceptionally at 14:00 in Auditorium 1 of IMPA)
Suppose that in a close election a small (random) proportion of the votes are accidentally miscounted; is this random 'noise' likely to change the outcome of the election? It turns out that the answer depends in interesting ways on the rule (i.e., the Boolean function f) by which the winner is selected. To take three simple examples, the answer is "no" if f is 'majority' or 'dictator', but "yes" if it is 'parity'. The systematic study of this problem was begun in 1999 by Benjamini, Kalai and Schramm, who gave a sufficient condition (based on the discrete Fourier coefficients of f) for the answer to be "yes", and used this result to prove that bond percolation on Z² is noise sensitive at criticality. More precisely, suppose that we perform critical (i.e., p = 1/2) bond percolation on Z², observe that there is a horizontal crossing of a particular n x n square, and then re-randomize each edge with probability epsilon > 0. Then the probability of having a horizontal crossing in the new configuration is close to 1/2. In this talk we consider the corresponding question for continuum percolation, and in particular for the Poisson Boolean model (also known as the Gilbert disc model). Let eta be a Poisson process of density lambda in the plane, and connect two points of eta by an edge if they are at distance at most 1. We prove that, at criticality, the event that there is a crossing of an n x n square is noise sensitive. The proof is based on two extremely general tools: a version of the BKS theorem for product measures, and a new extremal result on hypergraphs. This is joint work with Daniel Ahlberg, Erik Broman and Simon Griffiths.
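The election example at the start of the abstract can be explored numerically. The sketch below (all parameters illustrative) estimates the probability that f gives the same answer before and after epsilon-noise, for the three Boolean functions mentioned:

```python
import random

def noise_agreement(f, n, eps, trials=3000, seed=0):
    """Monte Carlo estimate of P(f(x) = f(y)), where x is uniform on {0,1}^n
    and y re-randomizes each bit of x independently with probability eps."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(trials):
        x = [rng.randint(0, 1) for _ in range(n)]
        y = [rng.randint(0, 1) if rng.random() < eps else b for b in x]
        agree += f(x) == f(y)
    return agree / trials

def majority(bits):
    return 2 * sum(bits) > len(bits)   # n odd, so no ties

def dictator(bits):
    return bits[0]                     # only the first voter matters

def parity(bits):
    return sum(bits) % 2
```

With n = 101 and eps = 0.1, dictator agrees with itself roughly 95% of the time, while parity agrees essentially at chance level, matching the "no"/"yes" answers stated above.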
Ivan S. Oliveira (CBPF)
Quantum computing, or more generally quantum information processing, emerged as an area of theoretical physics in the early 1980s. From 1994 onwards, with the discovery of Shor's factoring algorithm, a large number of researchers were drawn to the field, and in 1997 nuclear magnetic resonance (NMR) emerged as one of the most promising experimental techniques for implementing quantum computation and communication protocols. It was soon realized, however, that the so-called scalability problem would be very hard to overcome by any existing experimental technique, NMR in particular. Work then concentrated on basic aspects of quantum information processing in systems with a small number of qubits, the unit of quantum information. There, NMR found an extraordinary niche for fundamental studies of entanglement, simulation of quantum systems, and decoherence. In this colloquium we present the fundamentals of quantum information processing by NMR, with several examples of studies in a system with only 2 qubits of information, the simplest of all: the chloroform molecule. Emphasis will be given to the work done by the NMR Quantum Information Group at the Centro Brasileiro de Pesquisas Físicas.
Luca Martino (Carlos III)
Rejection sampling (RS) is a standard technique for universal Monte Carlo sampling. It can be used to generate i.i.d. samples from a target probability density function (pdf) by drawing from a simpler proposal density. The class of adaptive rejection sampling (ARS) methods is particularly appealing because these methods ensure high acceptance rates: they produce a sequence of proposal functions that converge toward the target pdf as the procedure is iterated. We will discuss a novel family of generalized ARS algorithms which are applicable to a broad range of target densities and, furthermore, admit an efficient combination with other sampling techniques, such as the "ratio of uniforms" method. In many practical applications, rejection samplers cannot provide a complete solution to the inference problem at hand (e.g., when the target distribution is high-dimensional), but they can still serve as useful building blocks in the design of more sophisticated algorithms, such as Markov chain Monte Carlo (MCMC) methods. We will describe certain MCMC algorithms, such as the Multiple Try Metropolis-Hastings (MTM) technique, and show how the latter can be generalized using either generic weight functions or ARS building blocks. Some numerical examples will be provided for illustration.
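A minimal self-contained example of plain (non-adaptive) rejection sampling, with an illustrative Beta(2,2) target and a uniform proposal:

```python
import random

def rejection_sample(n, seed=0):
    """Draw n i.i.d. samples from the Beta(2,2) density p(x) = 6x(1-x) on
    [0,1] by rejection from a Uniform(0,1) proposal.  The envelope constant
    M = 1.5 is the maximum of p, attained at x = 1/2."""
    rng = random.Random(seed)
    M = 1.5
    samples = []
    while len(samples) < n:
        x = rng.random()                        # proposal draw, q(x) = 1
        if rng.random() < 6 * x * (1 - x) / M:  # accept w.p. p(x) / (M q(x))
            samples.append(x)
    return samples
```

The expected acceptance rate is 1/M = 2/3; the point of ARS methods described in the abstract is to adapt the proposal toward the target so that this rate approaches one.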
27/04 (exceptionally on a Friday)
We introduce trap models on a finite-volume k-level tree as a class of Markov jump processes whose state space is the set of leaves of that tree. They serve to describe the GREM-like trap model of Sasaki and Nemoto. Under suitable conditions on the parameters of the trap model, we establish its infinite-volume limit, given by what we call a K-process on an infinite k-level tree. From this we deduce that the K-process is also the scaling limit of the GREM-like trap model on extreme time scales, under a fine-tuning assumption on the volumes. This is joint work with L. R. G. Fontes and V. Gayrard.
After a few remarks about what we mean by quantization, I will explain the powerful role that operator-valued measures can play in quantizing any set equipped with a measure, for instance a group equipped with its (left) Haar measure. Integral quantizations based on the Weyl-Heisenberg group and on the affine group are compared. I will insist on the probabilistic aspects of such a procedure. An interesting application in quantum cosmology will be presented.
Luigi Ippoliti (Pescara)
This talk discusses a spatial dynamic structural equation model for the analysis of house prices at the state level in the USA. The study contributes to the existing literature by extending the use of dynamic factor models to the econometric analysis of multivariate lattice data. One of the main advantages of our model formulation is that, by modeling the spatial variation via spatially structured factor loadings, we entertain the possibility of identifying "similarity regions" that share common time series components. The factor loadings are modeled as conditionally independent multivariate Gaussian Markov random fields, while the common components are modeled by latent dynamic factors. The general model is proposed in a state-space formulation in which both stationary and nonstationary autoregressive distributed-lag processes for the latent factors are considered. For the latent factors that exhibit a common trend, and hence are cointegrated, an error correction specification of the (vector) autoregressive distributed-lag process is proposed. Full probabilistic inference for the model parameters is facilitated by adapting standard Markov chain Monte Carlo (MCMC) algorithms for dynamic linear models to our model formulation. The fit of the model is discussed for a data set of 48 states, for which we model the relationship between housing prices and the macroeconomy, using state-level unemployment and per capita personal income.