Selected Publications

EBiCop: Ensemble Bivariate Copulas for Modeling Multivariate Cyber Data Breach Risks

Published in The Annals of Applied Statistics, 2024

Modeling cyber data breach risks poses a formidable challenge, primarily because of intricate multivariate dependencies and limited data. This study proposes a novel ensemble learning approach that effectively captures both the temporal and cross-sectional dependence inherent in cyber risks. The proposed approach differs significantly from traditional ones that directly model the multivariate dependence among risks. Instead, it leverages bivariate copulas to generate predictive members that capture the multivariate dependence, and the resulting predictive distribution is calibrated by minimizing the distribution score. Furthermore, the proposed model is applied to insurance pricing, and the results show that it can lead to more profitable contracts. Through extensive simulations and analysis of real-world data, we find that the proposed model has satisfactory fitting and predictive performance and outperforms existing models in the literature.

A Multivariate Frequency–Severity Framework for Healthcare Data Breaches

Published in The Annals of Applied Statistics, 2023

Data breaches in healthcare have become a substantial concern in recent years and cause millions of dollars in financial losses each year. It is fundamental for government regulators, insurance companies, and stakeholders to understand the breach frequency and the number of affected individuals in each state, as these are directly related to the federal Health Insurance Portability and Accountability Act (HIPAA) and state data breach laws. However, an obstacle to studying data breaches in healthcare is the lack of suitable statistical approaches. In this work, we develop a novel multivariate frequency-severity framework to analyze breach frequency and the number of affected individuals.

Data Breach CAT Bonds: Modeling and Pricing

Published in North American Actuarial Journal, 2021

Data breaches cause millions of dollars in financial losses each year. The insurance industry has been exploring ways to transfer such extreme risk. In this work, we investigate data breach catastrophe (CAT) bonds by developing a multiperiod pricing model. We find that a nonstationary extreme value model captures the statistical pattern of the monthly maximum data breach size very well and, in particular, reveals a positive time trend. For the financial risks, we propose data-driven time series approaches, which differ from those in the literature, to model the complex patterns exhibited by the financial data. Simulation studies are performed to determine the bond prices and cash flows. Our results show that the data breach CAT bond can be an attractive financial product and an effective instrument for transferring extreme data breach risk.

Cybersecurity Insurance: Modeling and Pricing

Published in North American Actuarial Journal, 2019

Cybersecurity risk has attracted considerable attention in recent decades. However, the modeling of cybersecurity risk is still in its infancy, mainly because of its unique characteristics. In this study, we develop a framework for modeling and pricing cybersecurity risk. The proposed model consists of three components: an epidemic model, a loss function, and a premium strategy. We study dynamic upper bounds for the infection probabilities based on both Markov and non-Markov models. A simulation approach is proposed to compute the premium for cybersecurity risk in practice. The effects of different infection distributions and of dependence among infection processes on the losses are also studied. This paper won the Best Paper Award in 2019!