Adin Urtasun, Aritz

Last Name

Adin Urtasun

First Name

Aritz

Department

Estadística, Informática y Matemáticas

Institute

InaMat2. Instituto de Investigación en Materiales Avanzados y Matemáticas

Search Results

  • PublicationOpen Access
    Automatic cross-validation in structured models: is it time to leave out leave-one-out?
    (Elsevier, 2024-07-01) Adin Urtasun, Aritz; Krainski, Elias Teixeira; Lenzi, Amanda; Liu, Zhedong; Martínez-Minaya, Joaquín; Rue, Håvard; Estadística, Informática y Matemáticas; Estatistika, Informatika eta Matematika; Institute for Advanced Materials and Mathematics - INAMAT2; Universidad Pública de Navarra / Nafarroako Unibertsitate Publikoa
    Standard techniques such as leave-one-out cross-validation (LOOCV) might not be suitable for evaluating the predictive performance of models incorporating structured random effects. In such cases, the correlation between the training and test sets could have a notable impact on the model's prediction error. To overcome this issue, an automatic group construction procedure for leave-group-out cross-validation (LGOCV) has recently emerged as a valuable tool for enhancing predictive performance measurement in structured models. The purpose of this paper is (i) to compare LOOCV and LGOCV within structured models, emphasizing model selection and predictive performance, and (ii) to provide real data applications in spatial statistics using complex structured models fitted with INLA, showcasing the utility of the automatic LGOCV method. First, we briefly review the key aspects of the recently proposed LGOCV method for automatic group construction in latent Gaussian models. We also demonstrate the effectiveness of this method for selecting the model with the highest predictive performance by simulating extrapolation tasks in both temporal and spatial data analyses. Finally, we provide insights into the effectiveness of the LGOCV method in modeling complex structured data, encompassing spatio-temporal multivariate count data, spatial compositional data, and spatio-temporal geospatial data.
  • PublicationOpen Access
    Dealing with risk discontinuities to estimate cancer mortality risks when the number of small areas is large
    (SAGE, 2021-02-17) Santafé Rodrigo, Guzmán; Adin Urtasun, Aritz; Lee, Duncan; Ugarte Martínez, María Dolores; Estadística, Informática y Matemáticas; Estatistika, Informatika eta Matematika; Institute for Advanced Materials and Mathematics - INAMAT2
    Many statistical models have been developed in recent years to smooth risks in disease mapping. However, most of these modeling approaches do not take possible local discontinuities into consideration or, if they do, they are computationally prohibitive or simply do not work when the number of small areas is large. In this paper, we propose a two-step method to deal with discontinuities and to smooth noisy risks in small areas. In the first stage, a novel density-based clustering algorithm is used. In contrast to previous proposals, this algorithm is able to automatically detect the number of spatial clusters, thus providing a single cluster structure. In the second stage, a Bayesian hierarchical spatial model that takes the cluster configuration into account is fitted, which accounts for the discontinuities in disease risk. To evaluate the performance of this new procedure in comparison to previous proposals, a simulation study has been conducted. Results show competitive risk estimates at a much better computational cost. The new methodology is used to analyze stomach cancer mortality data in Spanish municipalities.
  • PublicationOpen Access
    Bayesian modeling approach in Big Data contexts: an application in spatial epidemiology
    (IEEE, 2020) Orozco Acosta, Erick; Adin Urtasun, Aritz; Ugarte Martínez, María Dolores; Estatistika, Informatika eta Matematika; Institute for Advanced Materials and Mathematics - INAMAT2; Estadística, Informática y Matemáticas
    In this work we propose a novel scalable Bayesian modeling approach to smooth mortality risks, borrowing information from neighbouring regions in high-dimensional spatial disease mapping contexts. The method is based on the well-known divide-and-conquer approach, so that the spatial domain is divided into D subregions where local spatial models can be fitted simultaneously. Model fitting and inference have been carried out using the integrated nested Laplace approximation (INLA) technique. Male colorectal cancer mortality data in the municipalities of continental Spain have been analyzed using the new model proposals. Results show that the new modeling approach is very competitive in terms of model fitting criteria when compared with a global spatial model, and it is computationally much more efficient.
  • PublicationOpen Access
    Alleviating confounding in spatio-temporal areal models with an application on crimes against women in India
    (SAGE Publications, 2021) Adin Urtasun, Aritz; Goicoa Mangado, Tomás; Hodges, James S.; Schnell, Patrick M.; Ugarte Martínez, María Dolores; Estatistika, Informatika eta Matematika; Institute for Advanced Materials and Mathematics - INAMAT2; Estadística, Informática y Matemáticas
    Assessing associations between a response of interest and a set of covariates in spatial areal models is the leitmotiv of ecological regression. However, the presence of spatially correlated random effects can mask or even bias estimates of such associations due to confounding effects if they are not carefully handled. Though potentially harmful, confounding issues have often been ignored in practice leading to wrong conclusions about the underlying associations between the response and the covariates. In spatio-temporal areal models, the temporal dimension may emerge as a new source of confounding, and the problem may be even worse. In this work, we propose two approaches to deal with confounding of fixed effects by spatial and temporal random effects, while obtaining good model predictions. In particular, restricted regression and an apparently—though in fact not—equivalent procedure using constraints are proposed within both fully Bayes and empirical Bayes approaches. The methods are compared in terms of fixed-effect estimates and model selection criteria. The techniques are used to assess the association between dowry deaths and certain socio-demographic covariates in the districts of Uttar Pradesh, India.
  • PublicationOpen Access
    Identifying extreme COVID-19 mortality risks in English small areas: a disease cluster approach
    (Springer, 2022) Adin Urtasun, Aritz; Congdon, P.; Santafé Rodrigo, Guzmán; Ugarte Martínez, María Dolores; Estatistika, Informatika eta Matematika; Institute for Advanced Materials and Mathematics - INAMAT2; Estadística, Informática y Matemáticas
    The COVID-19 pandemic is having a huge impact worldwide and has highlighted the extent of health inequalities between countries but also in small areas within a country. Identifying areas with high mortality is important both for public health mitigation in COVID-19 outbreaks and for longer-term efforts to tackle social inequalities in health. In this paper we consider different statistical models and an extension of a recent method to analyze COVID-19-related mortality in English small areas during the first wave of the epidemic in the first half of 2020. We seek to identify hotspots, and where they are most geographically concentrated, taking account of observed area factors as well as spatial correlation and clustering in regression residuals, while also allowing for spatial discontinuities. Results show an excess of COVID-19 mortality cases in small areas surrounding London and in other small areas in the North-East and North-West of England. Models alleviating spatial confounding show ethnic isolation, air quality and area morbidity covariates having a significant and broadly similar impact on COVID-19 mortality, whereas nursing home location seems to be slightly less important.
  • PublicationOpen Access
    Space-time analysis of ovarian cancer mortality rates by age groups in Spanish provinces (1989-2015)
    (BioMed Central, 2020) Trandafir, Paula Camelia; Adin Urtasun, Aritz; Ugarte Martínez, María Dolores; Estadística, Informática y Matemáticas; Estatistika, Informatika eta Matematika; Institute for Advanced Materials and Mathematics - INAMAT2
    Background: Ovarian cancer is a silent and largely asymptomatic cancer, leading to late diagnosis and worse prognosis. Late-stage detection and low survival rates make the study of the space-time evolution of ovarian cancer particularly relevant. In addition, research on this cancer in small areas (like provinces or counties) is still scarce. Methods: The study presented here covers all ovarian cancer deaths for women over 50 years of age in the provinces of Spain during the period 1989-2015. Spatio-temporal models have been fitted to smooth ovarian cancer mortality rates in age groups [50,60), [60,70), [70,80), and [80,+), borrowing information from spatial and temporal neighbours. Model fitting and inference have been carried out using the Integrated Nested Laplace Approximation (INLA) technique. Results: Large differences in ovarian cancer mortality among the age groups have been found, with higher mortality rates in the older age groups. Striking differences are observed between northern and southern Spain. The global temporal trends (by age group) reveal that the evolution of ovarian cancer over the whole of Spain has remained nearly constant since the early 2000s. Conclusion: Differences in ovarian cancer mortality exist among the Spanish provinces, years, and age groups. As the exact causes of ovarian cancer remain unknown, spatio-temporal analyses by age groups are essential to discover inequalities in ovarian cancer mortality. Women over 60 years of age should be the focus of follow-up studies, as their mortality rates have remained constant since 2002. High-mortality provinces should also be monitored to look for specific risk factors.