- 26
- 145 556
Tamara Broderick
United States
Joined Jun 22, 2020
Tin Nguyen: "Sensitivity of MCMC-based analyses to small-data removal"
Talk Title: Sensitivity of MCMC-based analyses to small-data removal
Thesis Committee: Tamara Broderick, Ashia Wilson, and Stefanie Jegelka
Talk Abstract: If the conclusion of a data analysis is sensitive to dropping very few data points, that conclusion might hinge on the particular data at hand rather than representing a more broadly applicable truth. How could we check whether this sensitivity holds? One idea is to consider every small subset of data, drop it from the dataset, and re-run our analysis. But running MCMC to approximate a Bayesian posterior is already very expensive; running multiple times is prohibitive, and the number of re-runs needed here is combinatorially large. Recent work proposes a fast and accurate approximation to find the worst-case dropped data subset, but that work was developed for problems based on estimating equations and does not directly handle Bayesian posterior approximations using MCMC. We make two principal contributions in the present work. We adapt the existing data-dropping approximation to estimators computed via MCMC. Observing that Monte Carlo errors induce variability in the approximation, we use a variant of the bootstrap to quantify this uncertainty. We demonstrate how to use our approximation in practice to determine whether there is non-robustness in a problem. Empirically, our method is accurate in simple models, such as linear regression. In models with complicated structure, such as hierarchical models, the performance of our method is mixed.
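To make the brute-force idea in the abstract concrete, here is a minimal sketch. It is not the approximation from the talk and it does not use MCMC: it assumes ordinary least squares as a stand-in analysis, and the toy data and the helper names `ols_slope` and `worst_case_drop` are hypothetical. It refits after removing every size-k subset and then counts how many refits the same check would need on a larger dataset.

```python
from itertools import combinations
from math import comb


def ols_slope(xs, ys):
    """Least-squares slope of y on x (model with an intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def worst_case_drop(xs, ys, k):
    """Brute force: refit after dropping every size-k subset and
    return the subset whose removal lowers the slope the most."""
    n = len(xs)
    best_subset, best_slope = None, ols_slope(xs, ys)
    for dropped in combinations(range(n), k):
        keep = [i for i in range(n) if i not in dropped]
        slope = ols_slope([xs[i] for i in keep], [ys[i] for i in keep])
        if slope < best_slope:
            best_subset, best_slope = dropped, slope
    return best_subset, best_slope


# Toy data, made up for illustration: a flat trend plus one high-leverage point.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 10.0]
ys = [1.0, 0.0, 1.0, 0.0, 1.0, 6.0]

full_slope = ols_slope(xs, ys)
dropped, slope_after = worst_case_drop(xs, ys, k=1)
print(f"slope on all data: {full_slope:.3f}")
print(f"worst single drop: index {dropped}, slope after: {slope_after:.3f}")

# Number of refits the same brute force would need for n = 1000, k = 10.
n_refits = comb(1000, 10)
print(f"refits for n=1000, k=10: {n_refits:.3e}")
```

On this toy dataset, removing the single high-leverage point erases the positive slope. Each refit here is cheap; when a refit is an MCMC run, the count in the last line is what makes exhaustive checking impractical.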
Views: 148
Videos
Tamara Broderick: "Toward a taxonomy of trust for probabilistic data analysis"
301 views, 8 months ago
Title: Toward a taxonomy of trust for probabilistic data analysis Abstract: Probabilistic data analysis increasingly informs critical decisions in medicine, economics, education, and beyond. A major concern is generalization: if we conclude that an economic or health intervention helps people based on a data analysis, we hope that it will indeed help people when deployed in the future. We might...
Soumya Ghosh: "Are you using test log-likelihood correctly?"
101 views, 8 months ago
Title: Are you using test log-likelihood correctly? Corresponding Paper, Transactions on Machine Learning Research (TMLR) 2024: openreview.net/pdf?id=n2YifD4Dxo Authors: Sameer Deshpande*, Soumya Ghosh*, Tin D. Nguyen*, Tamara Broderick (*contributed equally) TMLR Infinite Conference: tmlr.infinite-conf.org/paper_pages/n2YifD4Dxo.html Arxiv: arxiv.org/abs/2212.00219 Code linked in supplementary...
Brian Trippe: "Advances in Bayesian Linear Modeling in High Dimensions"
655 views, 2 years ago
Title: Bayesian Linear Modeling in High Dimensions: Advances in Hierarchical Modeling, Inference, and Evaluation Principal corresponding paper, NeurIPS 2021: proceedings.neurips.cc/paper/2021/hash/6ffad86b9a8dd4a3e98df1b0830d1c8c-Abstract.html Thesis Committee: Tamara Broderick, Youssef Marzouk, Jeff Miller, and Hilary Finucane Abstract: Across the sciences, social sciences and engineering, app...
William Stephenson: "Can we globally optimize cross-validation loss?"
271 views, 2 years ago
Title: Can we globally optimize cross-validation loss? Quasiconvexity in ridge regression Corresponding paper, NeurIPS 2021: proceedings.neurips.cc/paper/2021/hash/cc298d5bc587e1b650f80e10449ee9d5-Abstract.html Authors: William Stephenson, Zachary Frangella, Madeleine Udell, Tamara Broderick Arxiv: arxiv.org/abs/2107.09194 Code zip: openreview.net/attachment?id=4Il6i0jdrvP&name=code Abstract: M...
Lorenzo Masoero: "Bayesian nonparametrics for maximizing power in rare variants association studies"
258 views, 2 years ago
Title: Bayesian nonparametric strategies for power maximization in rare variants association studies Corresponding paper, on the arXiv: arxiv.org/abs/2112.02032 Authors: Lorenzo Masoero, Joshua Schraiber, Tamara Broderick Abstract: Rare variants are hypothesized to be largely responsible for heritability and susceptibility to disease in humans. So rare variants association studies hold promise ...
Soumya Ghosh: "Approximate Cross-Validation for Structured Models"
212 views, 3 years ago
Title: "Approximate Cross-Validation for Structured Models" Corresponding paper, at NeurIPS 2020: papers.nips.cc/paper/2020/hash/636efd4f9aeb5781e9ea815cdd633e52-Abstract.html Authors: Soumya Ghosh*, Will Stephenson*, Tin D. Nguyen, Sameer Deshpande, Tamara Broderick (*joint first authorship) Abstract: Many modern data analyses benefit from explicitly modeling dependence structure in data such ...
Raj Agrawal: "High-Dimensional Variable Selection & Nonlinear Interaction Discovery in Linear Time"
319 views, 3 years ago
Title: "The SKIM-FA Kernel: High-Dimensional Variable Selection and Nonlinear Interaction Discovery in Linear Time" Corresponding paper, on the arXiv: arxiv.org/abs/2106.12408 Authors: Raj Agrawal, Tamara Broderick Abstract: Many scientific problems require identifying a small set of covariates that are associated with a target response and estimating their effects. Often, these effects are non...
Nicholas Bonaker: "Nomon: A Flexible, Bayesian Interface for Motor-Impaired Users"
462 views, 3 years ago
Title: Nomon: A Flexible, Bayesian Interface for Motor-Impaired Users Authors: Nicholas Bonaker, Emli-Mari Nel, Keith Vertanen, Tamara Broderick Demo and more information: nomon.app (formerly at nomon.csail.mit.edu) Code: github.com/tbroderick/Nomon
MIT: Machine Learning 6.036, Lecture 14: Guest lecture (David Sontag) (Fall 2020)
2.4K views, 3 years ago
* Lecture 14 for the MIT course 6.036: Introduction to Machine Learning (Fall 2020 Semester) * Full lecture information and slides: tamarabroderick.com/ml.html * Lecture date: 2020 / 12 / 08 * Speaker for Lecture 14: David Sontag * Lecture TAs: Crystal Wang and Satvat Jagwani If you find any ways to improve how well the video captions reflect the live lectures, please submit a pull request to: ...
MIT: Machine Learning 6.036, Lecture 13: Clustering (Fall 2020)
7K views, 3 years ago
* Lecture 13 for the MIT course 6.036: Introduction to Machine Learning (Fall 2020 Semester) * Full lecture information and slides: tamarabroderick.com/ml.html * Lecture date: 2020 / 12 / 01 * Lecturer: Tamara Broderick * Lecture TAs: Crystal Wang and Satvat Jagwani If you find any ways to improve how well the video captions reflect the live lectures, please submit a pull request to: github.com...
MIT: Machine Learning 6.036, Lecture 12: Decision trees and random forests (Fall 2020)
18K views, 3 years ago
* Lecture 12 for the MIT course 6.036: Introduction to Machine Learning (Fall 2020 Semester) * Full lecture information and slides: tamarabroderick.com/ml.html * Lecture date: 2020 / 11 / 17 * Lecturer: Tamara Broderick * Lecture TAs: Crystal Wang and Satvat Jagwani If you find any ways to improve how well the video captions reflect the live lectures, please submit a pull request to: github.com...
MIT: Machine Learning 6.036, Lecture 11: Recurrent neural networks (Fall 2020)
2.6K views, 3 years ago
* Lecture 11 for the MIT course 6.036: Introduction to Machine Learning (Fall 2020 Semester) * Full lecture information and slides: tamarabroderick.com/ml.html * Lecture date: 2020 / 11 / 10 * Lecturer: Tamara Broderick * Lecture TAs: Crystal Wang and Satvat Jagwani If you find any ways to improve how well the video captions reflect the live lectures, please submit a pull request to: github.com...
MIT: Machine Learning 6.036, Lecture 10: Reinforcement learning (Fall 2020)
3.3K views, 3 years ago
* Lecture 10 for the MIT course 6.036: Introduction to Machine Learning (Fall 2020 Semester) * Full lecture information and slides: tamarabroderick.com/ml.html * Lecture date: 2020 / 11 / 03 * Lecturer: Tamara Broderick * Lecture TAs: Crystal Wang and Satvat Jagwani If you find any ways to improve how well the video captions reflect the live lectures, please submit a pull request to: github.com...
MIT: Machine Learning 6.036, Lecture 9: State machines and Markov decision processes (Fall 2020)
4.3K views, 3 years ago
* Lecture 9 for the MIT course 6.036: Introduction to Machine Learning (Fall 2020 Semester) * Full lecture information and slides: tamarabroderick.com/ml.html * Lecture date: 2020 / 10 / 27 * Lecturer: Tamara Broderick * Lecture TAs: Crystal Wang and Satvat Jagwani If you find any ways to improve how well the video captions reflect the live lectures, please submit a pull request to: github.com/...
MIT: Machine Learning 6.036, Lecture 8: Convolutional neural networks (Fall 2020)
4.4K views, 3 years ago
MIT: Machine Learning 6.036, Lecture 7: Brief intermission (Fall 2020)
2.1K views, 3 years ago
MIT: Machine Learning 6.036, Lecture 6: Neural networks (Fall 2020)
7K views, 3 years ago
MIT: Machine Learning 6.036, Lecture 5: Regression (Fall 2020)
7K views, 3 years ago
MIT: Machine Learning 6.036, Lecture 4: Logistic regression (Fall 2020)
16K views, 3 years ago
MIT: Machine Learning 6.036, Lecture 3: Features (Fall 2020)
9K views, 3 years ago
MIT: Machine Learning 6.036, Lecture 2: Perceptrons (Fall 2020)
17K views, 3 years ago
MIT: Machine Learning 6.036, Lecture 1: Basics (Fall 2020)
44K views, 3 years ago
Brian Trippe: "Bayes Estimates for Multiple Related Regressions" (JSM 2020)
232 views, 4 years ago
Lorenzo Masoero: "Predicting and maximizing genomic variety discovery via Bayesian nonparametrics"
548 views, 4 years ago
Tamara Broderick: "Approximate Cross-Validation for Complex Models"
236 views, 4 years ago