www.rizbicki.ufscar.br
2801:b0:100::38
Public Scan
Submitted URL: http://www.rizbicki.ufscar.br/
Effective URL: https://www.rizbicki.ufscar.br/
Submission: On April 07 via api from US — Scanned from DE
Rafael Izbicki | PhD

RAFAEL IZBICKI
ASSISTANT PROFESSOR OF STATISTICS
FEDERAL UNIVERSITY OF SÃO CARLOS (UFSCAR)

BIOGRAPHY
I’m an Assistant Professor in the Department of Statistics at the Federal University of São Carlos (UFSCar), Brazil. From 2010 to 2014, I was a PhD student in the Department of Statistics & Data Science at Carnegie Mellon University (CMU), USA (PhD thesis). Prior to that, I received my undergraduate and Master’s degrees at the University of São Paulo (USP) (Master’s dissertation). I am a Research Fellow at CNPq (2017-2024).

I am interested in theory, methodology, applications, and foundations of statistics and machine learning.

I am a member of the following research groups/collaborations:

Book: Aprendizado de máquina: uma abordagem estatística

In case you are looking for Rafael Stern, his site is here.

Interests
* Statistical Machine Learning
* High-dimensional Inference
* Nonparametric Statistics
* Bayesian Inference
* Foundations of Statistics
* Astrostatistics

Education
* PhD in Statistics, 2014, Carnegie Mellon University
* Master in Statistics, 2010, University of São Paulo
* BSc in Statistics, 2009, University of São Paulo

FEATURED PUBLICATIONS
See all my publications here.

CD-split and HPD-split: Efficient conformal regions in high dimensions
Conformal methods create prediction bands that control average coverage assuming solely i.i.d. data. We introduce CD-split and HPD-split, which yield general prediction regions and converge to the optimal highest predictive density set.
Rafael Izbicki, Gilson Shimizu, Rafael B.
Stern. May 2022. Journal of Machine Learning Research.

Confidence Sets and Hypothesis Testing in a Likelihood-Free Inference Setting
Parameter estimation, statistical tests and confidence sets are the cornerstones of classical statistics that allow scientists to make inferences about the underlying process that generated the observed data. A key question is whether one can still construct hypothesis tests and confidence sets with proper coverage and high power in a so-called likelihood-free inference (LFI) setting; that is, a setting where the likelihood is not explicitly known but one can forward-simulate observable data according to a stochastic model. We present ACORE, a frequentist approach to LFI that first formulates the classical likelihood ratio test (LRT) as a parametrized classification problem, and then uses the equivalence of tests and confidence sets to build confidence regions for parameters of interest. We also present a goodness-of-fit procedure for checking whether the constructed tests and confidence regions are valid.
Niccolò Dalmasso, Rafael Izbicki, Ann B. Lee. February 2020. Proceedings of Machine Learning Research (ICML Track).

Quantification under prior probability shift: the ratio estimator and its extensions
The quantification problem consists of determining the prevalence of a given label in a target population. However, one often has access to the labels in a sample from the training population but not in the target population. A common assumption in this situation is that of prior probability shift, that is, once the labels are known, the distribution of the features is the same in the training and target populations. In this paper, we derive a new lower bound for the risk of the quantification problem under the prior shift assumption. Using a weaker version of the prior shift assumption, which can be tested, we show that ratio estimators can be used to build confidence intervals for the quantification problem.
Afonso F. Vaz, Rafael Izbicki, Rafael B. Stern. May 2019. Journal of Machine Learning Research.

ABC-CDE: Toward Approximate Bayesian Computation with Complex High-Dimensional Data and Limited Simulations
We show how a nonparametric conditional density estimation (CDE) framework helps address three nontrivial challenges in ABC: (i) how to efficiently estimate the posterior distribution with limited simulations and different types of data, (ii) how to tune and compare the performance of ABC and related methods in estimating the posterior itself, rather than just certain properties of the density, and (iii) how to efficiently choose among a large set of summary statistics based on a CDE surrogate loss.
Rafael Izbicki, Taylor Pospisil, Ann B. Lee. February 2019. Journal of Computational and Graphical Statistics.

Converting High-Dimensional Regression to High-Dimensional Conditional Density Estimation
Here we propose a fully nonparametric approach to conditional density estimation that reformulates CDE as a nonparametric orthogonal series problem where the expansion coefficients are estimated by regression. By taking such an approach, one can efficiently estimate conditional densities and not just expectations in high dimensions by drawing upon the success in high-dimensional regression. We show applications to photometric galaxy data, Twitter data, and line-of-sight velocities in a galaxy cluster.
Rafael Izbicki, Ann B. Lee. November 2017. Electronic Journal of Statistics.

Photo-z estimation: An example of nonparametric conditional density estimation under selection bias
We describe a general framework for properly constructing and assessing nonparametric conditional density estimators under selection bias, and for combining two or more estimators for optimal performance. This leads to new improved photo-z estimators. We illustrate our methods on data from the Sloan Digital Sky Survey and an application to galaxy-galaxy lensing.
Rafael Izbicki, Ann B. Lee, Peter E. Freeman. February 2017. The Annals of Applied Statistics.

RECENT PUBLICATIONS
* T. McNeely, G. Vincente, K. M. Wood, Rafael Izbicki, A. B. Lee (2023). Detecting Distributional Differences in Labeled Sequence Data with Application to Tropical Cyclone Satellite Imagery. Annals of Applied Statistics.
* A. Shen, L. Masserano, Rafael Izbicki, T. Dorigo, M. Doro, A. B. Lee (2023). Classification under Prior Probability Shift in Simulator-Based Inference: Application to Atmospheric Cosmic-Ray Showers. NeurIPS (Machine Learning and the Physical Sciences Workshop; Best Poster Award).
* F. M. Polo, Rafael Izbicki, E. G. Lacerda Jr, J. P. Ibieta-Jimenez, R. Vicente (2023). A unified framework for dataset shift diagnostics. Information Sciences.
* L. M. C. Cabezas, Rafael Izbicki, R. B. Stern (2023). NLS: Hierarchical clustering: visualization, feature importance and model selection. Applied Soft Computing Journal.
* L. Masserano, T. Dorigo, Rafael Izbicki, M. Kuusela, A. B. Lee (2023). Simulation-Based Inference with Waldo: Confidence Regions by Leveraging Prediction Algorithms or Posterior Estimators for Inverse Problems. Proceedings of Machine Learning Research (AISTATS track).
See all publications

LECTURE NOTES
* Introduction to Probability Theory and Random Processes, with Márcio A. Diniz, Luís E. B. Salasar and Rafael B. Stern
* Statistical Inference, with Rafael B. Stern (incomplete)
* Inferência Bayesiana, with Luís Gustavo Esteves and Rafael B. Stern
* Aprendizado de máquina: uma abordagem estatística, with Tiago Mendonça dos Santos
* Introdução à Causalidade

TEACHING
Undergraduate courses
* Bayesian Inference (19/2, 23/1)
* Computational Statistics (14/2, 15/2, 16/2)
* Data Mining (14/2, 15/1, 16/1, 17/1, 18/1, 19/1, 20/1, 21/1, 22/1, 23/1)
* Introduction to Statistics (15/1, 17/2, 18/1, 19/1, 20/2, 21/1)
* Perspectives in Data Science (21/2)
* Statistical Inference (22/2)
Graduate courses
* Decision Theory (15/2, 16/2)
* Probability Theory (16/1)
* Statistical Inference (18/1)
* Statistical Machine Learning (17/2, 18/2, 19/2, 20/1, 20/2, 21/2)

TALKS (some videos)
* In English:
  * Likelihood-Free Frequentist Inference: Constructing Confidence Sets with Correct Conditional Coverage
  * CD-split and HPD-split: efficient conformal regions in high dimensions
  * Pragmatic Hypotheses in the Evolution of Science
  * Prior distributions as a regularization tool: an application to crowdsourcing
* In Portuguese:
  * FlexCode: modelando incertezas em problemas de predição
  * Astroestatística
  * Uma introdução ao machine learning usando o R
  * Quantificação sobre prior probability shift
  * Inferência conformal - uma abordagem flexível
  * Aprendizado supervisionado para além de predições pontuais

RECENT POSTS
Recomendações para meus orientandos (Recommendations for my advisees)
The work is your responsibility. Whenever you need it, I will be available to help, but I will not chase you. Keep an eye on deadlines; they are also your responsibility.
Jan 13, 2023, 3 min read, Blog Post

Recomendações para meus alunos de disciplinas (Recommendations for students in my courses)
Poor attendance is associated with poor performance in the course. If you miss a class, it is your responsibility to find out what was discussed. Do the exercise lists without looking at the solutions (it is hard, but hang in there!)
Jan 12, 2023, 2 min read, Blog Post

Base rate fallacy
“Almost everyone who is hospitalized is vaccinated. Vaccines don’t work =(” This thought crosses many people’s minds and is widely used by anti-vaxxers. But is the argument correct?
Jan 26, 2022, 2 min read, Blog Post

Evolução da Covid-19 no Brasil (Evolution of Covid-19 in Brazil)
I built a dashboard that shows the evolution of the coronavirus by Brazilian municipality and state, because I missed having some plots that describe this dynamic in a more localized way. It is useful not only for looking at each municipality separately, but also for understanding how the virus is spreading in Brazil.
May 2, 2020, 1 min read, Blog Post

Covid-19 Einstein Analysis
In this post I analyse the covid-19 data from https://www.kaggle.com/einsteindata4u/covid19, which contains information about patients from Albert Einstein’s Hospital, in São Paulo (Brazil). My main assumptions in the following analysis are that:
Mar 29, 2020, 8 min read, R, machine learning, Blog Post

STUDENTS
PhD
* Matheus Dorival Leonardo Bombonato Menes (current student)
* Rafael Peçanha Waissman (current student)
* Everton Artuso (current student)
* João Flávio Andrade Silva (current student)
* Luben Miguel Cruz Cabezas (current student)
* Milene Regina dos Santos (current student)
* Gabriel Oliveira (co-advisor, current student)
* Tiago Mendonça dos Santos (co-advisor, current student)
* Gilson Shimizu - Bandas de predição usando densidade condicional estimada e um modelo lda com covariáveis (2017-2021)
* Marco Henrique de Almeida Inacio - Conditional independence testing, two sample comparison and density estimation using neural networks (2017-2020)
Master
* Maria Luiza Matos Silva (current student)
* Mateus Borges Comito (current student)
* Cristina Precioso do Amaral Melo - Análise de Sentimento na Cobertura sobre China pelo New York Times: Uma Comparação entre Multinomial Naive Bayes e DistilBERT (MBA in Data Science 2023-2024)
* Gedalias Hugo de Oliveira Valentim - Perfil profissiográfico dos auditores da Controladoria Geral da União (MBA in Data Science 2023-2024)
* Rodrigo Vidi (MBA in Data Science 2023-2024)
* Tobias de São Pedro - Central de Recuperação do Crédito Tributário: estudo de modelo de predição de
pagamento após contato telefônico com contribuintes devedores de ICMS (MBA in Data Science 2023-2024)
* Rodrigo Vidi - Algoritmos de agrupamento e classificação para a identificação de empresas emissoras de notas fiscais inidôneas (MBA in Data Science 2023-2024)
* Mateus Piovezan Otto - Scalable and interpretable kernel methods based on random Fourier features (2022-2023)
* Víctor Candido Reis - Small and time-efficient distribution-free predictive regions (2021-2023)
* Carlos Miguel Toste Sisto - Uso de Conformal Predictions para mensurar incertezas em previsões de modelos de Machine Learning (MBA in Data Science 2022-2023)
* Marcela Musetti - FBST em problemas de likelihood-free (co-advisor, 2021-2023)
* Bruno Tardelli - Sistema de Recomendação de produtos bancários: estudo de caso em uma cooperativa de crédito (MBA in Data Science 2021-2022)
* Felipe Hernandez Bisca - Multivariate conditional density estimation with copulas (2019-2021)
* Fabiane Yassukawa - Aplicações de machine learning para diagnóstico de covid-19: análise de imagens tomográficas (MBA in Data Science 2020-2020)
* Suleimy Cristina Mazin - Técnicas de machine learning para predizer dor pélvica crônica (MBA in Data Science 2020-2020)
* Deborah Bassi Stern - Vector representation of texts applied to prediction models (2018-2020)
* Victor Coscrato - Neural networks as an optimization tool for regression (2018-2019)
* Rafael de Carvalho Ceregatti - A Bayesian nonparametric approach for the two-sample problem (2016-2019, co-advisor)
* Afonso Fernandes Vaz - Improved quantification under domain shift (2016-2018)
* Marco Henrique de Almeida Inacio - Comparing two populations using Bayesian Fourier series density estimation (2016-2017)
* Gretta Rossi Ferreira - Estimação de densidades condicionais com aplicações à astronomia (2015-2017)
Undergraduate
* Fernanda Waltrs Freitas (current student)
* Lucas Sala Bastinni (current student)
* Guilherme Pedrilho Soares (current student)
* Gabriela Soares - Uma abordagem estatística sobre a estimação de redshifts de quasares usando dados do S-PLUS (2022)
* Luben Miguel Cruz Cabezas - Métodos de Aprendizado Ativo (2021-2022)
* Luben Miguel Cruz Cabezas (FAPESP) - A data-splitting approach for comparing hierarchical clustering algorithms (2020-2021)
* Maria Luiza Matos Silva - Estudo de interações genéticas relacionadas à Esclerose Lateral Amiotrófica (2020-2020)
* Víctor Candido Reis - Processos Gaussianos com enfoque em análise de regressão (2019-2019)
* Mateus Borges Comito (CNPq) - Estudo de pessoas desaparecidas através de técnicas de aprendizado de máquina (2019-2020)
* Víctor Candido Reis (CNPq; FAPESP) - Testes de hipóteses suaves para problemas multivariados (2018-2019)
* Marcela Musetti - Combining photometric redshift estimators (2018)
* Daniel Simionato (CNPq) - Inferência Via Métodos Preditivos (2017-2018)
* Andressa de Jesus Dantas - Understanding Zika patients (2017-2018)
* João Dantas - Optimal strategies in poker (2017-2018)
* Victor Coscrato - Word2Vec vs Bag-of-Words (2017)
* Rafael Catoia - Collective posterior: can the updating time change it?
(2017)
* Mauricio Najjar Da Silveira (CNPq) - Comparação não-paramétrica de grupos com base em estimação de densidades (2016-2017; co-advisor)
* Ana Molina - Comparação entre métodos de construção de árvores filogenéticas (2016-2017)
* Victor Coscrato (CNPq) - Testes de Hipóteses Agnósticos (2016-2017)
* Douglas Raul de Freitas - Alguns aspectos sobre o bigdata na estatística (2016-2017)
* Letícia Octaviano da Cruz (CNPq) - Monitoramento Online da Dengue (2015-2016)
* Paula Ianishi - Técnicas de predição para dados desbalanceados aplicadas ao problema de classificação morfológica de galáxias (2015-2016)
* Felipe Henrique Mosquetta Oliveira - Tratamento e Classificação de Dados do Twitter sobre Política e Clima (2015)
* Bruno Roberto Guimarães - Classificação automática de resenhas sobre jogos na Google Play Store (2015)

CONTACT
* rafaelizbicki at gmail dot com

MISCELLANEA
Twitter Threads
* Click here
Outreach Articles
* “Vidas Salvas: Projeção aponta que, após vacina, mais de mil mortes foram evitadas no Rio” (O Globo, 16/06/21). Article; Cover
* “Levantamento mostra queda na idade média dos internados no Rio por Covid-19 após doses de reforço em idosos” (O Globo, 16/11/21). Article
* “Alta de casos e baixa ocupação de leitos” (CBN, 8/6/22). Interview
The secular nano-Hagaddah
* A very small Passover Hagaddah I created for my daughters. In English and in Portuguese

POPULAR TOPICS
ABC Agnostic Test Approximate Likelihood Astrostatistics Bayesian Inference Bayesian Statistics Conditional Density Estimation Crowdsourcing Density Ratio Diagnostics Ecology Hypothesis Test Hypothesis Tests Logical Coherence Logical Consistency machine learning Nonparametric Inference Nonparametric Statistics Portuguese Selection Bias

© 2022 Rafael Izbicki. Published with Wowchemy, the free, open source website builder that empowers creators.