Submitted URL: http://www.rizbicki.ufscar.br/
Effective URL: https://www.rizbicki.ufscar.br/
Rafael Izbicki | PhD

RAFAEL IZBICKI


ASSISTANT PROFESSOR OF STATISTICS


FEDERAL UNIVERSITY OF SÃO CARLOS (UFSCAR)



BIOGRAPHY

I’m an Assistant Professor at the Department of Statistics of the Federal
University of São Carlos (UFSCar), Brazil. From 2010 to 2014, I was a PhD
student in the Department of Statistics & Data Science at Carnegie Mellon
University (CMU) (PhD thesis), USA. Prior to that, I graduated and received my
Master’s degree at the University of São Paulo (USP) (Master’s dissertation). I
am a Research Fellow at CNPq (2017-2024).

I am interested in theory, methodology, applications, and foundations of
statistics and machine learning. I am a member of the following research
groups/collaborations:




Book: Aprendizado de máquina: uma abordagem estatística

In case you are looking for Rafael Stern, his site is here.

Interests
 * Statistical Machine Learning
 * High-dimensional Inference
 * Nonparametric Statistics
 * Bayesian Inference
 * Foundations of Statistics
 * Astrostatistics

Education

 * PhD in Statistics, 2014
   
   Carnegie Mellon University

 * Master in Statistics, 2010
   
   University of São Paulo

 * BSc in Statistics, 2009
   
   University of São Paulo


FEATURED PUBLICATIONS

See all my publications here.

CD-split and HPD-split: Efficient conformal regions in high dimensions
Conformal methods create prediction bands that control average coverage assuming
solely i.i.d. data. We introduce CD-split and HPD-split, which yield general
prediction regions and converge to the optimal highest predictive density set.
Rafael Izbicki, Gilson Shimizu, Rafael B. Stern
May, 2022 Journal of Machine Learning Research
PDF Cite Code

Confidence Sets and Hypothesis Testing in a Likelihood-Free Inference Setting
Parameter estimation, statistical tests and confidence sets are the cornerstones
of classical statistics that allow scientists to make inferences about the
underlying process that generated the observed data. A key question is whether
one can still construct hypothesis tests and confidence sets with proper
coverage and high power in a so-called likelihood-free inference (LFI) setting;
that is, a setting where the likelihood is not explicitly known but one can
forward-simulate observable data according to a stochastic model. We present
ACORE, a frequentist approach to LFI that first formulates the classical
likelihood ratio test (LRT) as a parametrized classification problem, and then
uses the equivalence of tests and confidence sets to build confidence regions
for parameters of interest. We also present a goodness-of-fit procedure for
checking whether the constructed tests and confidence regions are valid.
Niccolò Dalmasso, Rafael Izbicki, Ann B. Lee
February, 2020 Proceedings of Machine Learning Research (ICML Track)
Preprint PDF

Quantification under prior probability shift: the ratio estimator and its
extensions
The quantification problem consists of determining the prevalence of a given
label in a target population. However, one often has access to the labels in a
sample from the training population but not in the target population. A common
assumption in this situation is that of prior probability shift, that is, once
the labels are known, the distribution of the features is the same in the
training and target populations. In this paper, we derive a new lower bound for
the risk of the quantification problem under the prior shift assumption. Using a
weaker version of the prior shift assumption, which can be tested, we show that
ratio estimators can be used to build confidence intervals for the
quantification problem.
Afonso F. Vaz, Rafael Izbicki, Rafael B. Stern
May, 2019 Journal of Machine Learning Research
PDF Cite
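
As a rough illustration of the ratio-estimator idea mentioned in the abstract above (not the paper's own code), the sketch below uses the classical adjusted-count form. Under prior probability shift, the rate at which a fixed classifier predicts the positive label in the target population is p * TPR + (1 - p) * FPR, where p is the unknown target prevalence, so p can be recovered by solving for it. The function name, the use of hard 0/1 classifier outputs, and the toy data are assumptions made here for illustration only.

```python
def ratio_estimator(train_labels, train_preds, target_preds):
    """Estimate the prevalence of label 1 in the target population.

    Illustrative sketch (hypothetical helper, not from the paper).
    train_labels: true 0/1 labels in the labeled training sample
    train_preds:  0/1 classifier outputs on that same sample
    target_preds: 0/1 classifier outputs on the unlabeled target sample
    """
    pos = [p for y, p in zip(train_labels, train_preds) if y == 1]
    neg = [p for y, p in zip(train_labels, train_preds) if y == 0]
    if not pos or not neg or not target_preds:
        raise ValueError("need both classes in training data and a non-empty target sample")

    tpr = sum(pos) / len(pos)   # estimated P(pred = 1 | label = 1)
    fpr = sum(neg) / len(neg)   # estimated P(pred = 1 | label = 0)
    if tpr == fpr:
        raise ValueError("classifier carries no information: TPR equals FPR")

    # Observed rate of positive predictions in the target sample.
    q = sum(target_preds) / len(target_preds)

    # Solve q = p * tpr + (1 - p) * fpr for p, then clip to [0, 1].
    p_hat = (q - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, p_hat))


if __name__ == "__main__":
    # Toy data: TPR = 0.8 and FPR = 0.1 on the training sample,
    # and 31% positive predictions in the target sample.
    train_labels = [1] * 10 + [0] * 10
    train_preds = [1] * 8 + [0] * 2 + [1] * 1 + [0] * 9
    target_preds = [1] * 31 + [0] * 69
    print(ratio_estimator(train_labels, train_preds, target_preds))  # approximately 0.3
```

The clipping step is a convenience of this sketch; it keeps the estimate inside the valid range when sampling noise pushes the raw ratio below 0 or above 1.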

ABC-CDE: Toward Approximate Bayesian Computation with Complex High-Dimensional
Data and Limited Simulations
We show how a nonparametric conditional density estimation (CDE) framework helps
address three nontrivial challenges in ABC: (i) how to efficiently estimate the
posterior distribution with limited simulations and different types of data,
(ii) how to tune and compare the performance of ABC and related methods in
estimating the posterior itself, rather than just certain properties of the
density, and (iii) how to efficiently choose among a large set of summary
statistics based on a CDE surrogate loss.
Rafael Izbicki, Taylor Pospisil, Ann B. Lee
February, 2019 Journal of Computational and Graphical Statistics
Preprint PDF

Converting High-Dimensional Regression to High-Dimensional Conditional Density
Estimation
Here we propose a fully nonparametric approach to conditional density estimation
that reformulates CDE as a non-parametric orthogonal series problem where the
expansion coefficients are estimated by regression. By taking such an approach,
one can efficiently estimate conditional densities and not just expectations in
high dimensions by drawing upon the success in high-dimensional regression. We
show applications to photometric galaxy data, Twitter data, and line-of-sight
velocities in a galaxy cluster.
Rafael Izbicki, Ann B. Lee
November, 2017 Electronic Journal of Statistics
Preprint PDF Code

Photo-z estimation: An example of nonparametric conditional density estimation
under selection bias
We describe a general framework for properly constructing and assessing
nonparametric conditional density estimators under selection bias, and for
combining two or more estimators for optimal performance. This leads to new
improved photo-z estimators. We illustrate our methods on data from the Sloan
Digital Sky Survey and an application to galaxy-galaxy lensing.
Rafael Izbicki, Ann B. Lee, Peter E. Freeman
February, 2017 The Annals of Applied Statistics
Preprint PDF



RECENT PUBLICATIONS

T. McNeely, G. Vincent, K. M. Wood, Rafael Izbicki, A. B. Lee (2023). Detecting
Distributional Differences in Labeled Sequence Data with Application to Tropical
Cyclone Satellite Imagery. Annals of Applied Statistics.

Preprint PDF

A. Shen, L. Masserano, Rafael Izbicki, T. Dorigo, M. Doro, A. B. Lee (2023).
Classification under Prior Probability Shift in Simulator-Based Inference:
Application to Atmospheric Cosmic-Ray Showers. NeurIPS (Machine Learning and the
Physical Sciences Workshop; Best Poster Award).

PDF

F. M. Polo, Rafael Izbicki, E. G. Lacerda Jr, J. P. Ibieta-Jimenez, R. Vicente
(2023). A unified framework for dataset shift diagnostics. Information Sciences.

Preprint

L. M. C. Cabezas, Rafael Izbicki, R. B. Stern (2023). NLS: Hierarchical
clustering: visualization, feature importance and model selection. Applied Soft
Computing Journal.

Preprint

L. Masserano, T. Dorigo, Rafael Izbicki, M. Kuusela, A. B. Lee (2023).
Simulation-Based Inference with Waldo: Confidence Regions by Leveraging
Prediction Algorithms or Posterior Estimators for Inverse Problems. Proceedings
of Machine Learning Research (AISTATS track).

Preprint PDF

See all publications


LECTURE NOTES

 * Introduction to Probability Theory and Random Processes, with Márcio A.
   Diniz, Luís E. B. Salasar and Rafael B. Stern

 * Statistical Inference, with Rafael B. Stern (incomplete)

 * Inferência Bayesiana, with Luís Gustavo Esteves and Rafael B. Stern

 * Aprendizado de máquina: uma abordagem estatística, with Tiago Mendonça dos
   Santos

 * Introdução à Causalidade


TEACHING

Undergraduate courses

 * Bayesian Inference (19/2, 23/1)
 * Computational Statistics (14/2, 15/2, 16/2)
 * Data Mining (14/2, 15/1, 16/1, 17/1, 18/1, 19/1, 20/1, 21/1, 22/1, 23/1)
 * Introduction to Statistics (15/1, 17/2, 18/1, 19/1, 20/2, 21/1)
 * Perspectives in Data Science (21/2)
 * Statistical Inference (22/2)

Graduate courses

 * Decision Theory (15/2, 16/2)
 * Probability Theory (16/1)
 * Statistical Inference (18/1)
 * Statistical Machine Learning (17/2, 18/2, 19/2, 20/1, 20/2, 21/2)


TALKS

(some videos)

 * In English:
   * Likelihood-Free Frequentist Inference: Constructing Confidence Sets with
     Correct Conditional Coverage
   * CD-split and HPD-split: efficient conformal regions in high dimensions
   * Pragmatic Hypotheses in the Evolution of Science
   * Prior distributions as a regularization tool: an application to
     crowdsourcing
 * In Portuguese:
   * FlexCode: modelando incertezas em problemas de predição
   * Astroestatística
   * Uma introdução ao machine learning usando o R
   * Quantificação sobre prior probability shift
   * Inferência conformal - uma abordagem flexível
   * Aprendizado supervisionado para além de predições pontuais


RECENT POSTS

Recomendações para meus orientandos
The work is your responsibility. Whenever you need it, I will be available to
help, but I will not chase you. Keep an eye on deadlines; they are also your
responsibility.
Jan 13, 2023 3 min read Blog Post

Recomendações para meus alunos de disciplinas
Poor attendance is associated with poor performance in the course. If you miss
a class, it is your responsibility to find out what was discussed. Do the
exercise lists without looking at the solutions (it is hard, but hang in there!
Jan 12, 2023 2 min read Blog Post

Base rate fallacy
“Almost everyone who is hospitalized is vaccinated. Vaccines don’t work =(”
This thought crosses many people’s minds and is widely used by anti-vaxxers.
But is the argument correct?
Jan 26, 2022 2 min read Blog Post
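
To make the base rate point concrete, here is a small worked calculation. All numbers are hypothetical and chosen only for illustration (they are not taken from the post): 90% vaccine coverage, a 1% hospitalization risk for unvaccinated people, and a 0.2% risk for vaccinated people. The function name is likewise an assumption of this sketch.

```python
def vaccinated_share_of_hospitalized(coverage, risk_vaccinated, risk_unvaccinated):
    """Fraction of hospitalized people who are vaccinated.

    coverage:          fraction of the population that is vaccinated
    risk_vaccinated:   hospitalization probability for a vaccinated person
    risk_unvaccinated: hospitalization probability for an unvaccinated person
    """
    hospitalized_vaccinated = coverage * risk_vaccinated
    hospitalized_unvaccinated = (1 - coverage) * risk_unvaccinated
    return hospitalized_vaccinated / (hospitalized_vaccinated + hospitalized_unvaccinated)


if __name__ == "__main__":
    # Hypothetical inputs: the vaccine cuts individual risk five-fold,
    # yet most hospitalized people are vaccinated because coverage is high.
    share = vaccinated_share_of_hospitalized(0.9, 0.002, 0.01)
    print(f"{share:.1%} of hospitalized people are vaccinated")  # about 64%
```

With these assumed inputs, vaccinated people form the majority of hospitalizations even though each vaccinated person is five times less likely to be hospitalized, which is why the raw share among the hospitalized says little about whether the vaccine works.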

Evolução da Covid-19 no Brasil
I built a dashboard that shows the evolution of the coronavirus by Brazilian
municipality and state, because I felt some charts were missing to describe
this dynamic in a more localized way. It is useful not only for viewing each
municipality separately, but also for understanding how the virus is spreading
in Brazil.
May 2, 2020 1 min read Blog Post

Covid-19 Einstein Analysis
In this post I analyse the covid-19 data from
https://www.kaggle.com/einsteindata4u/covid19, which contains information about
patients from Albert Einstein’s Hospital, in São Paulo (Brazil). My main
assumptions in the following analysis are that:
Mar 29, 2020 8 min read R, machine learning, Blog Post



STUDENTS

PhD

 * Matheus Dorival Leonardo Bombonato Menes – (current student)
 * Rafael Peçanha Waissman – (current student)
 * Everton Artuso – (current student)
 * João Flávio Andrade Silva – (current student)
 * Luben Miguel Cruz Cabezas – (current student)
 * Milene Regina dos Santos – (current student)
 * Gabriel Oliveira - (co-advisor, current student)
 * Tiago Mendonça dos Santos - (co-advisor, current student)
 * Gilson Shimizu – Bandas de predição usando densidade condicional estimada e
   um modelo lda com covariáveis (2017-2021)
 * Marco Henrique de Almeida Inacio – Conditional independence testing, two
   sample comparison and density estimation using neural networks (2017-2020)

Master

 * Maria Luiza Matos Silva – (current student)
 * Mateus Borges Comito - (current student)
 * Cristina Precioso do Amaral Melo - Análise de Sentimento na Cobertura sobre
   China pelo New York Times: Uma Comparação entre Multinomial Naive Bayes e
   DistilBERT - (MBA in Data Science 2023-2024)
 * Gedalias Hugo de Oliveira Valentim - Perfil profissiográfico dos auditores da
   Controladoria Geral da União - (MBA in Data Science 2023-2024)
 * Tobias de São Pedro - Central de Recuperação do Crédito Tributário: estudo de
   modelo de predição de pagamento após contato telefônico com contribuintes
   devedores de ICMS (MBA in Data Science 2023-2024)
 * Rodrigo Vidi - Algoritmos de agrupamento e classificação para a identificação
   de empresas emissoras de notas fiscais inidôneas - (MBA in Data Science
   2023-2024)
 * Mateus Piovezan Otto – Scalable and interpretable kernel methods based on
   random Fourier features - (2022-2023)
 * Víctor Candido Reis – Small and time-efficient distribution-free predictive
   regions (2021-2023)
 * Carlos Miguel Toste Sisto – Uso de Conformal Predictions para mensurar
   incertezas em previsões de modelos de Machine Learning - (MBA in Data Science
   2022-2023)
 * Marcela Musetti - FBST em problemas de likelihood-free - (co-advisor,
   2021-2023)
 * Bruno Tardelli – Sistema de Recomendação de produtos bancários: estudo de
   caso em uma cooperativa de crédito - (MBA in Data Science 2021-2022)
 * Felipe Hernandez Bisca – Multivariate conditional density estimation with
   copulas (2019-2021)
 * Fabiane Yassukawa - Aplicações de machine learning para diagnóstico de
   covid-19: análise de imagens tomográficas (MBA in Data Science 2020-2020)
 * Suleimy Cristina Mazin - Técnicas de machine learning para predizer dor
   pélvica crônica (MBA in Data Science 2020-2020)
 * Deborah Bassi Stern – Vector representation of texts applied to prediction
   models (2018-2020)
 * Victor Coscrato – Neural networks as an optimization tool for regression
   (2018-2019)
 * Rafael de Carvalho Ceregatti – A bayesian nonparametric approach for the
   two-sample problem (2016-2019, co-advisor)
 * Afonso Fernandes Vaz – Improved quantification under domain shift (2016-2018)
 * Marco Henrique de Almeida Inacio – Comparing two populations using Bayesian
   Fourier series density estimation (2016-2017)
 * Gretta Rossi Ferreira – Estimação de densidades condicionais com aplicações à
   astronomia (2015-2017)

Undergraduate

 * Fernanda Waltrs Freitas - (current student)
 * Lucas Sala Bastinni - (current student)
 * Guilherme Pedrilho Soares - (current student)
 * Gabriela Soares - Uma abordagem estatística sobre a estimação de redshifts de
   quasares usando dados do S-PLUS - (2022)
 * Luben Miguel Cruz Cabezas - Métodos de Aprendizado Ativo (2021-2022)
 * Luben Miguel Cruz Cabezas (FAPESP) - A data-splitting approach for comparing
   hierarchical clustering algorithms (2020-2021)
 * Maria Luiza Matos Silva - Estudo de interações genéticas relacionadas à
   Esclerose Lateral Amiotrófica (2020-2020)
 * Víctor Candido Reis - Processos Gaussianos com enfoque em análise de
   regressão (2019-2019)
 * Mateus Borges Comito (CNPq) - Estudo de pessoas desaparecidas através de
   técnicas de aprendizado de máquina (2019-2020)
 * Víctor Candido Reis (CNPq; FAPESP) - Testes de hipóteses suaves para
   problemas multivariados (2018-2019)
 * Marcela Musetti - Combining photometric redshift estimators (2018)
 * Daniel Simionato (CNPq) – Inferência Via Métodos Preditivos (2017-2018)
 * Andressa de Jesus Dantas – Understanding Zika patients (2017-2018)
 * João Dantas – Optimal strategies in poker (2017-2018)
 * Victor Coscrato – Word2Vec vs Bag-of-Words (2017)
 * Rafael Catoia – Collective posterior: can the updating time change it? (2017)
 * Mauricio Najjar Da Silveira (CNPq) – Comparação não-paramétrica de grupos com
   base em estimação de densidades (2016-2017; co-advisor)
 * Ana Molina – Comparação entre métodos de construção de árvores filogenéticas
   (2016-2017)
 * Victor Coscrato (CNPq) – Testes de Hipóteses Agnósticos (2016-2017)
 * Douglas Raul de Freitas – Alguns aspectos sobre o bigdata na estatística
   (2016-2017)
 * Letícia Octaviano da Cruz (CNPq) – Monitoramento Online da Dengue (2015-2016)
 * Paula Ianishi – Técnicas de predição para dados desbalanceados aplicadas ao
   problema de classificação morfológica de galáxias (2015-2016)
 * Felipe Henrique Mosquetta Oliveira – Tratamento e Classificação de Dados do
   Twitter sobre Política e Clima (2015)
 * Bruno Roberto Guimarães – Classificação automática de resenhas sobre jogos na
   Google Play Store (2015)


CONTACT

 * rafaelizbicki at gmail dot com


MISCELLANEA

Twitter Threads

 * Click here

Outreach Articles

 * “Vidas Salvas: Projeção aponta que, após vacina, mais de mil mortes foram
   evitadas no Rio” (O Globo, 16/06/21). Article; Cover

 * “Levantamento mostra queda na idade média dos internados no Rio por Covid-19
   após doses de reforço em idosos” (O Globo, 16/11/21). Article

 * “Alta de casos e baixa ocupação de leitos” (CBN, 8/6/22). Interview

The secular nano-Haggadah

 * A very small Passover Haggadah I created for my daughters. In English and in
   Portuguese


POPULAR TOPICS

ABC Agnostic Test Approximate Likelihood Astrostatistics Bayesian Inference
Bayesian Statistics Conditional Density Estimation Crowdsourcing Density Ratio
Diagnostics Ecology Hypothesis Test Hypothesis Tests Logical Coherence Logical
Consistency machine learning Nonparametric Inference Nonparametric Statistics
Portuguese Selection Bias

© 2022 Rafael Izbicki.


