Adin Urtasun, Aritz
Last Name
Adin Urtasun
First Name
Aritz
Department
Estadística, Informática y Matemáticas
Institute
InaMat2. Instituto de Investigación en Materiales Avanzados y Matemáticas
- Publications
Now showing 1 - 10 of 19
Publication Open Access Automatic cross-validation in structured models: is it time to leave out leave-one-out? (Elsevier, 2024-07-01) Adin Urtasun, Aritz; Krainski, Elias Teixeira; Lenzi, Amanda; Liu, Zhedong; Martínez-Minaya, Joaquín; Rue, Håvard; Estadística, Informática y Matemáticas; Estatistika, Informatika eta Matematika; Institute for Advanced Materials and Mathematics - INAMAT2; Universidad Pública de Navarra / Nafarroako Unibertsitate Publikoa
Standard techniques such as leave-one-out cross-validation (LOOCV) might not be suitable for evaluating the predictive performance of models incorporating structured random effects. In such cases, the correlation between the training and test sets could have a notable impact on the model's prediction error. To overcome this issue, an automatic group construction procedure for leave-group-out cross-validation (LGOCV) has recently emerged as a valuable tool for enhancing predictive performance measurement in structured models. The purpose of this paper is (i) to compare LOOCV and LGOCV within structured models, emphasizing model selection and predictive performance, and (ii) to provide real data applications in spatial statistics using complex structured models fitted with INLA, showcasing the utility of the automatic LGOCV method. First, we briefly review the key aspects of the recently proposed LGOCV method for automatic group construction in latent Gaussian models. We also demonstrate the effectiveness of this method for selecting the model with the highest predictive performance by simulating extrapolation tasks in both temporal and spatial data analyses.
Finally, we provide insights into the effectiveness of the LGOCV method in modeling complex structured data, encompassing spatio-temporal multivariate count data, spatial compositional data, and spatio-temporal geospatial data.

Publication Open Access Online relative risks/rates estimation in spatial and spatio-temporal disease mapping (Elsevier, 2019) Adin Urtasun, Aritz; Goicoa Mangado, Tomás; Ugarte Martínez, María Dolores; Estatistika, Informatika eta Matematika; Institute for Advanced Materials and Mathematics - INAMAT2; Estadística, Informática y Matemáticas
Background and objective: Spatial and spatio-temporal analyses of count data are crucial in epidemiology and other fields to unveil spatial and spatio-temporal patterns of incidence and/or mortality risks. However, fitting spatial and spatio-temporal models is not easy for non-expert users. The objective of this paper is to present an interactive and user-friendly web application (named SSTCDapp) for the analysis of spatial and spatio-temporal mortality or incidence data. Although SSTCDapp is simple to use, the underlying statistical theory is well founded and all key issues such as model identifiability, model selection, and several spatial priors and hyperpriors for sensitivity analyses are properly addressed. Methods: The web application is designed to fit an extensive range of fairly complex spatio-temporal models to smooth standardized incidence/mortality risks or crude rates, which are often extremely variable. The application is built with the R package shiny and relies on the well-founded integrated nested Laplace approximation technique for model fitting and inference. Results: The use of the web application is shown through the analysis of Spanish spatio-temporal breast cancer data. Different possibilities for the analysis regarding the type of model, model selection criteria, and a range of graphical as well as numerical outputs are provided.
Conclusions: Unlike other software used in disease mapping, SSTCDapp allows non-expert users to fit complex statistical models without installing any software on their own computers, since all the analyses and computations are run on a powerful remote server. In addition, a desktop version is also available to run the application locally in those cases in which data confidentiality is a serious issue.

Publication Open Access Hierarchical and spline-based models in space-time disease mapping (2017) Adin Urtasun, Aritz; Ugarte Martínez, María Dolores; Goicoa Mangado, Tomás; Estadística e Investigación Operativa; Estatistika eta Ikerketa Operatiboa
Disease mapping is a research area of great interest in epidemiology and public health. The high variability inherent in classical risk estimation measures, such as the standardized mortality ratio, makes it necessary to use statistical techniques that stabilize these ratios. In recent years, many statistical models have been developed to study the geographical distribution of a disease and its evolution over time. However, the availability of high-quality data collected in many regions and over long periods of time, together with the emergence of new and increasingly sophisticated models, has revealed new difficulties that need to be investigated in depth. Chapter 1 describes some spatio-temporal models relevant to the remaining chapters of the thesis and details the constraints required to solve the identifiability problems of these models. Chapter 1 also describes the Bayesian inference technique used throughout the thesis, based on Laplace approximations and numerical integration (known as INLA), and its implementation in R. In Chapter 2, five spatio-temporal models used in disease mapping are compared.
To compare the different terms of these models, a decomposition of the logarithm of the estimated risks is computed, defining posterior spatial, temporal, and spatio-temporal patterns. The results are illustrated with brain cancer mortality data in the Spanish provinces during the period 1986-2010. In addition, a simulation study is carried out to compare the performance of the models in terms of sensitivity (the ability to detect true high-risk regions) and specificity (the ability to discard false high-risk regions). It is concluded that when the number of expected cases is very small (which is common when analyzing rare diseases or very small domains such as municipalities), the P-spline models perform better at detecting high-risk areas. Chapter 3 proposes a new family of spatio-temporal models that include random effects for two spatial levels, making it possible to model spatial and spatio-temporal effects at different levels of aggregation (for example, municipalities within provinces or health areas that are affected by similar health policies). These models are used to analyze mortality data in the municipalities of the Basque Country and Navarre during the period 1986-2008. A simulation study concludes that, when different levels of spatial aggregation exist, the new two-level models perform better than models previously proposed in the literature. Chapter 4 presents new P-spline models that include spatial and temporal correlations from a fully Bayesian approach. Specifically, it describes models that include one-dimensional temporal B-splines that may (or may not) have spatial correlation, as well as two-dimensional spatial B-spline models that may (or may not) have temporal correlation.
The results are illustrated with breast cancer mortality data in mainland Spain during the period 1990-2010. In general, using models with different temporal B-splines for each area gives better results in terms of fit. However, as the number of areas increases, these models may not be computationally feasible. In contrast, three-dimensional P-spline models (previously proposed in the literature and formulated in this thesis from a fully Bayesian point of view) are a promising alternative, yielding accurate risk estimates in much shorter computing times.

Publication Open Access A two-stage approach to estimate spatial and spatio-temporal disease risks in the presence of local discontinuities and clusters (SAGE, 2018-04-13) Adin Urtasun, Aritz; Lee, Duncan; Goicoa Mangado, Tomás; Ugarte Martínez, María Dolores; Estadística, Informática y Matemáticas; Estatistika, Informatika eta Matematika; Institute for Advanced Materials and Mathematics - INAMAT2
Disease risk maps for areal unit data are often estimated from Poisson mixed models with local spatial smoothing, for example by incorporating random effects with a conditional autoregressive prior distribution. However, one of the limitations is that local discontinuities in the spatial pattern are not usually modelled, leading to over-smoothing of the risk maps and a masking of clusters of hot/coldspot areas. In this paper, we propose a novel two-stage approach to estimate and map disease risk in the presence of such local discontinuities and clusters. We propose approaches in both spatial and spatio-temporal domains, where for the latter the clusters can either be fixed or allowed to vary over time.
In the first stage, we apply an agglomerative hierarchical clustering algorithm to training data to provide sets of potential clusters, and in the second stage, a two-level spatial or spatio-temporal model is applied to each potential cluster configuration. The superiority of the proposed approach with regard to a previous proposal is shown by simulation, and the methodology is applied to two important public health problems in Spain, namely stomach cancer mortality across Spain and brain cancer incidence in the Navarre and Basque Country regions of Spain.

Publication Open Access High-dimensional order-free multivariate spatial disease mapping (Springer, 2023) Vicente Fuenzalida, Gonzalo; Adin Urtasun, Aritz; Goicoa Mangado, Tomás; Ugarte Martínez, María Dolores; Estadística, Informática y Matemáticas; Estatistika, Informatika eta Matematika; Institute for Advanced Materials and Mathematics - INAMAT2; Universidad Pública de Navarra / Nafarroako Unibertsitate Publikoa, PJUPNA2001
Despite the amount of research on disease mapping in recent years, the use of multivariate models for areal spatial data remains limited due to difficulties in implementation and computational burden. These problems are exacerbated when the number of areas is very large. In this paper, we introduce an order-free multivariate scalable Bayesian modelling approach to smooth mortality (or incidence) risks of several diseases simultaneously. The proposal partitions the spatial domain into smaller subregions, fits multivariate models in each subdivision and obtains the posterior distribution of the relative risks across the entire spatial domain. The approach also provides posterior correlations among the spatial patterns of the diseases in each partition that are combined through a consensus Monte Carlo algorithm to obtain correlations for the whole study region.
We implement the proposal using integrated nested Laplace approximations (INLA) in the R package bigDM and use it to jointly analyse colorectal, lung, and stomach cancer mortality data in Spanish municipalities. The new proposal allows for the analysis of large datasets and yields superior results compared to fitting a single multivariate model. Additionally, it facilitates statistical inference through local homogeneous models, which may be more appropriate than a global homogeneous model when dealing with a large number of areas.

Publication Open Access A scalable approach for short-term disease forecasting in high spatial resolution areal data (Wiley-VCH, 2023) Orozco Acosta, Erick; Riebler, Andrea; Adin Urtasun, Aritz; Ugarte Martínez, María Dolores; Estadística, Informática y Matemáticas; Estatistika, Informatika eta Matematika; Institute for Advanced Materials and Mathematics - INAMAT2; Universidad Pública de Navarra / Nafarroako Unibertsitate Publikoa
Short-term disease forecasting at specific discrete spatial resolutions has become a high-impact decision-support tool in health planning. However, when the number of areas is very large, obtaining predictions can be computationally intensive or even unfeasible using standard spatiotemporal models. The purpose of this paper is to provide a method for short-term predictions in high-dimensional areal data based on a newly proposed "divide-and-conquer" approach. We assess the predictive performance of this method and other classical spatiotemporal models in a validation study that uses cancer mortality data for the 7907 municipalities of continental Spain. The new proposal outperforms traditional models in terms of mean absolute error, root mean square error, and interval score when forecasting cancer mortality 1, 2, and 3 years ahead.
Models are implemented in a fully Bayesian framework using the well-known integrated nested Laplace approximation technique.

Publication Open Access Temporal evolution of brain cancer incidence in the municipalities of Navarre and the Basque Country, Spain (BioMed Central, 2015) Ugarte Martínez, María Dolores; Adin Urtasun, Aritz; Goicoa Mangado, Tomás; Casado, Itziar; Ardanaz, Eva; Larrañaga, Nerea; Estatistika eta Ikerketa Operatiboa; Institute for Advanced Materials and Mathematics - INAMAT2; Estadística e Investigación Operativa; Gobierno de Navarra / Nafarroako Gobernua: proyecto 113 Res. 2186/2014
Background: Brain cancer incidence rates in Spain are below the European average. However, there are two regions in the north of the country, Navarre and the Basque Country, ranked among the European regions with the highest incidence rates for both males and females. Our objective here was two-fold. Firstly, to describe the temporal evolution of the geographical pattern of brain cancer incidence in Navarre and the Basque Country, and secondly, to look for specific high risk areas (municipalities) within these two regions in the study period (1986–2008). Methods: A mixed Poisson model with two levels of spatial effects (municipalities and local health areas) is used. Model fitting was carried out using penalized quasi-likelihood. High risk regions were detected using upper one-sided confidence intervals. Results: Results revealed a group of high risk areas surrounding Pamplona, the capital city of Navarre, and a few municipalities with significantly high risks in the northern part of the region, specifically on the border between Navarre and the Basque Country (Gipuzkoa). The global temporal trend was found to be increasing. Differences were also observed among specific risk evolutions in certain municipalities.
Conclusions: Brain cancer incidence in Navarre and the Basque Country (Spain) is still increasing with time. The number of high risk areas within those two regions is also increasing. Our study highlights the need for continuous surveillance of this cancer in the areas of high risk. However, due to the low percentage of cases explained by the known risk factors, primary prevention should be applied as a general recommendation in these populations.

Publication Open Access In spatio-temporal disease mapping models, identifiability constraints affect PQL and INLA results (Springer, 2018) Goicoa Mangado, Tomás; Adin Urtasun, Aritz; Ugarte Martínez, María Dolores; Hodges, James S.; Institute for Advanced Materials and Mathematics - INAMAT2
Disease mapping studies the distribution of relative risks or rates in space and time, and typically relies on generalized linear mixed models (GLMMs) including fixed effects and spatial, temporal, and spatio-temporal random effects. These GLMMs are typically not identifiable and constraints are required to achieve sensible results. However, automatic specification of constraints can sometimes lead to misleading results. In particular, the penalized quasi-likelihood fitting technique automatically centers the random effects even when this is not necessary. In the Bayesian approach, the recently introduced integrated nested Laplace approximations computing technique can also produce wrong results if constraints are not well specified. In this paper the spatial, temporal, and spatio-temporal interaction random effects are reparameterized using the spectral decompositions of their precision matrices to establish the appropriate identifiability constraints.
Breast cancer mortality data from Spain is used to illustrate the ideas.

Publication Open Access Análisis espacio-temporal de los accidentes mortales con tractor en España durante el período 2010-2019 [Spatio-temporal analysis of fatal tractor accidents in Spain during the period 2010-2019] (Interempresas Media, 2023) Arazuri Garín, Silvia; Ibarrola, Alicia; Mangado Ederra, Jesús; Adin Urtasun, Aritz; Arnal Atarés, Pedro; López Maestresalas, Ainara; Jarén Ceballos, Carmen; Estadística, Informática y Matemáticas; Estatistika, Informatika eta Matematika; Ingeniería; Ingeniaritza
The agricultural and construction sectors have the highest incidence rates of fatal occupational accidents in Spain, according to data collected by the Instituto Nacional de Seguridad y Salud en el Trabajo (INSST) (2021), which reports to the Ministerio de Trabajo y Economía Social (Cirauqui, 2022). Looking at how these rates have evolved, the agricultural sector is the only one whose rate has not improved since Law 31/1995 on occupational risk prevention came into force, and its accident rate continues to rise (Fundación Mapfre 2020). But what happens when the accident is suffered by people who do not fit the legal definition of a worker? These accidents are not considered 'occupational accidents' and therefore escape all official INSST statistics and data. This is often the case for accidents suffered by retired people, children under 16, collaborating family members, and others who are not linked to work activity as defined in the legislation. According to Arana et al. (2010), of a total of 388 fatal accidents involving agricultural machinery that occurred in Spain during the years 2004-2008, only 61.85% were officially recorded. Older people were the population group at greatest risk, followed by children and people from outside the agricultural sector.
Most of the deaths were due to the overturning of tractors without protective structures.

Publication Open Access Flexible Bayesian P-splines for smoothing age-specific spatio-temporal mortality patterns (SAGE, 2019) Goicoa Mangado, Tomás; Adin Urtasun, Aritz; Etxeberria Andueza, Jaione; Militino, Ana F.; Ugarte Martínez, María Dolores; Estadística, Informática y Matemáticas; Estatistika, Informatika eta Matematika; Institute for Advanced Materials and Mathematics - INAMAT2
In this paper age-space-time models based on one and two-dimensional P-splines with B-spline bases are proposed for smoothing mortality rates, where both fixed relative scale and scale-invariant two-dimensional penalties are examined. Model fitting and inference are carried out using integrated nested Laplace approximations (INLA), a recent Bayesian technique that speeds up computations compared to MCMC methods. The models will be illustrated with Spanish breast cancer mortality data during the period 1985-2010, where a general decline in breast cancer mortality has been observed in Spanish provinces in the last decades. The results reveal that mortality rates for the oldest age groups do not decrease in all provinces.