Seminars

The Department of Statistics Seminar Series will be conducted remotely via Zoom this semester. Scheduled speakers include:

 

Tuesday, September 15th at 4 PM

Prof. Vivak Patel, University of Wisconsin Department of Statistics 

Title: When Do We Stop SGD?

Abstract: Stochastic gradient descent (SGD) and related methods have become a staple in a broad class of optimization problems, even though the question of when to terminate the iterations is still open. In this talk, we will discuss why termination criteria are important, what the specific challenges are for developing rigorous termination criteria for SGD, and our recent efforts to address these challenges.

 

Tuesday, September 22nd at 3:30 PM

Claire McKay Bowen, Lead Data Scientist, Privacy and Data Security at the Urban Institute 

Title: Data Privacy in the Real World

Abstract: With recent misuses of data access such as the Facebook-Cambridge Analytica scandal, society has raised valid data privacy concerns about private companies and other entities gathering personal information. Statistical disclosure control (SDC), also called statistical disclosure limitation, comprises methods that aim to release high-quality data products while preserving the confidentiality of sensitive data. These techniques have existed within the statistics field since the mid-twentieth century, but over the past two decades the data landscape has changed dramatically. With advances in modern information infrastructure and computation, data adversaries (or intruders) can more easily reconstruct datasets and identify individuals from supposedly anonymized data. While traditional SDC methods and secure data centers are still used extensively, varying opinions about procedures have developed across academia, government, and industry, and in different countries. A definition known as differential privacy (DP) has garnered much attention, and many researchers and data maintainers are moving to develop and implement differentially private methods. In this talk, I will introduce and survey SDC and DP and the current challenges in applying these methods to real-world data. I will provide motivating examples such as the current collaboration between the Urban Institute and the IRS to generate synthetic data (pseudo record data) from tax return data, which is invaluable for analyzing US presidential candidates’ proposed tax policies.

 

Tuesday, September 29th at 3:30 PM

Jesús Arroyo Relión, Postdoctoral Fellow, Center for Imaging Science at Johns Hopkins University

Title: Simultaneous prediction and community detection for networks with application to neuroimaging

Abstract: Community structure in networks is observed in many different domains, and unsupervised community detection has received a lot of attention in the literature. Increasingly, the focus of network analysis is shifting towards using network information in some other prediction or inference task rather than just analyzing the network itself. In neuroimaging applications, brain networks are available for multiple subjects, and the goal is often to predict a phenotype of interest. Community structure is well known to be a feature of brain networks, typically corresponding to different regions of the brain responsible for different functions. There are standard parcellations of the brain into such regions, usually obtained by applying clustering methods to brain connectomes of healthy subjects. However, when the goal is predicting a phenotype or distinguishing between different conditions, these unsupervised communities from an unrelated set of healthy subjects may not be useful.

In this talk, I will present a method for supervised community detection, aiming to find a partition of the network into communities that are most useful for predicting a particular response. We use a block-structured regularization penalty combined with a prediction loss function, and compute the solution with a combination of a spectral method and an ADMM optimization algorithm. We show that the spectral clustering method recovers the correct communities under a weighted stochastic block model. The method performs well on both simulated and real brain networks, providing support for the idea of task-dependent brain regions. This is joint work with Elizaveta Levina.

 


 

Check back for announcements of additional seminar speakers. A Zoom link for each event will be sent via email to Department of Statistics faculty and students. Other interested Pitt community members should email srh75@pitt.edu to request the link.