Event Abstract

eScience Infrastructure for sharing neuroimaging data and running validated analysis pipelines on a high performance cloud

  • 1 Netherlands eScience Center, Netherlands
  • 2 Erasmus MC, Biomedical Imaging Group Rotterdam, Depts. Medical Informatics and Radiology, Netherlands
  • 3 VU University Medical Center, Dept. of Radiology and Nuclear Medicine, Neuroscience Campus Amsterdam, Netherlands
  • 4 VU University Medical Center, Dept. of Physics and Medical Technology, Neuroscience Campus Amsterdam, Netherlands
  • 5 Maastricht University, Dept. of Psychiatry and Neuropsychology/Alzheimer Center Limburg, Netherlands
  • 6 Radboud University Nijmegen Medical Centre, Dept. of Geriatric Medicine/Radboud Alzheimer Centre, Donders Institute for Brain, Cognition and Behaviour, Netherlands
  • 7 Radboud University Nijmegen Medical Centre, Dept. of Neurology, Donders Institute for Brain, Cognition and Behaviour, Netherlands
  • 8 Radboud University Nijmegen, Dept. of Neuroinformatics, Donders Institute for Brain, Cognition and Behaviour, Netherlands

Neuroimaging biomarkers are features derived from neuroimaging data that can be used for disease detection, staging, and prognosis, or to monitor the effect of treatment. Discovering and validating these biomarkers requires sample sizes that exceed those of typical single-site studies. Such sample sizes can be reached by combining data sets from different medical centers, which requires an eScience infrastructure for standardized image analysis and for the exchange of imaging data, metadata and analysis results. We present a Dutch pilot study to build such an infrastructure, with emphasis on patient privacy, access control, user agreements and the dispatching of analysis pipelines to a high performance cloud. The use case is hippocampal volume extraction (Van der Lijn et al. 2008). This analysis pipeline will be validated and applied to a combined data set from four cohorts.

Three types of data need to be stored: (1) raw medical image data (DICOM), (2) subject-related metadata fields such as age and gender, and (3) derived data, which includes the biomarkers. We have selected the XNAT platform (Marcus et al. 2007) as the basis, and created a Java-based tool for client-side anonymization and batch upload of the DICOM files. Patient IDs are first encrypted with a passphrase known to the data-owner, then hashed, following the guidelines in Noumeir et al. (2007). Face scrambling is optional, and is carried out on the server. A remaining challenge is to refine the XNAT permission system such that (1) data-owners can allow users to run a pre-installed analysis pipeline without giving them access to the actual data files, and (2) pipeline-owners can allow users to run the pipeline without making the code open access. Although these features run counter to current trends towards open science, they allow the inclusion of protected data sets that would otherwise be unavailable.
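
The patient ID step can be illustrated with a minimal sketch. It is not the Java tool described above, and it swaps the encrypt-then-hash scheme for a single keyed hash (HMAC-SHA256) whose key is derived from the data-owner's passphrase with PBKDF2, because the Python standard library offers no symmetric cipher. The function name, the salt value and the iteration count are assumptions made for illustration only.

```python
import hashlib
import hmac

# Hypothetical project-wide salt; a real deployment would choose and store its own.
PROJECT_SALT = b"example-project-salt"


def pseudonymize(patient_id: str, passphrase: str, salt: bytes = PROJECT_SALT) -> str:
    """Map a patient ID to a stable pseudonym that depends on a secret passphrase.

    Swapped-in technique: a keyed hash stands in for "encrypt, then hash".
    """
    # Derive a fixed-length secret key from the data-owner's passphrase.
    key = hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt, 100_000)
    # Keyed hash of the ID: without the passphrase, the pseudonym cannot be
    # recomputed by simply hashing candidate patient IDs.
    return hmac.new(key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    # The same ID and passphrase always give the same pseudonym, so repeated
    # uploads for one subject can still be linked on the server.
    print(pseudonymize("PAT-0001", "passphrase known to the data-owner"))
```

Because the pseudonym is deterministic for a given passphrase, the data-owner can recompute it to trace a record back to a patient, while other users of the archive cannot.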

Biomarker extraction pipelines are often computationally intensive; the hippocampal volume pipeline takes several hours on a present-day compute node for a single image. We will dispatch jobs to a high performance cloud, where each job runs in a virtual machine (VM). This solution has two advantages over traditional grid computing: (1) the pipeline developer can choose the operating system and software running on the VM; and (2) since the VM is destroyed after the pipeline finishes, no sensitive data can accidentally be left on the compute node.
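
The second advantage, that nothing survives the job, can be sketched as follows. This is only an illustration of the teardown guarantee: a temporary directory stands in for the VM, whereas a real deployment would create and destroy VMs through the cloud provider's own interface. The names `ephemeral_workspace` and `run_job`, and the `pipeline` callable, are hypothetical.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def ephemeral_workspace():
    """Stand-in for a per-job VM: scratch space that is always destroyed."""
    workdir = Path(tempfile.mkdtemp(prefix="job-"))
    try:
        yield workdir
    finally:
        # Runs whether the pipeline succeeded or raised, so no image data
        # is left behind on the compute node.
        shutil.rmtree(workdir, ignore_errors=True)


def run_job(image_bytes: bytes, pipeline):
    """Stage one image, run the pipeline on it, and return only the result."""
    with ephemeral_workspace() as workdir:
        image_path = workdir / "input.dcm"
        image_path.write_bytes(image_bytes)
        # `pipeline` is any callable taking the image path and returning
        # the derived value (for example, a volume measurement).
        return pipeline(image_path)
```

Only the return value of the pipeline leaves the workspace; the staged image and any intermediate files are removed with it.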

Acknowledgements

Supported by the Netherlands eScience Center, grant 027.011.304

References

Van der Lijn F, den Heijer T, Breteler MM, Niessen WJ (2008) Hippocampus segmentation in MR images using atlas registration, voxel classification, and graph cuts. Neuroimage 43, 708-20.

Marcus DS, Olsen T, Ramaratnam M, Buckner RL (2007) The Extensible Neuroimaging Archive Toolkit (XNAT): An informatics platform for managing, exploring, and sharing neuroimaging data. Neuroinformatics 5(1), 11-34.

Noumeir R, Lemay A, Lina JM (2007) Pseudonymization of radiology data for research purposes. Journal of Digital Imaging 20(3), 284-295.

Keywords: neuroimaging, data sharing, imaging biomarkers, XNAT, high performance cloud, DICOM, pseudonymization

Conference: Neuroinformatics 2013, Stockholm, Sweden, 27 Aug - 29 Aug, 2013.

Presentation Type: Poster

Topic: Neuroimaging

Citation: De Boer P, Ranguelova E, Ivanova M, Koek M, Van Der Lijn F, Niessen W, Versteeg A, Vrenken H, Burgmans S, Van Boxtel M, Meulenbroek O, De Leeuw F, Bakker R and Tiesinga P (2013). eScience Infrastructure for sharing neuroimaging data and running validated analysis pipelines on a high performance cloud. Front. Neuroinform. Conference Abstract: Neuroinformatics 2013. doi: 10.3389/conf.fninf.2013.09.00085

Copyright: The abstracts in this collection have not been subject to any Frontiers peer review or checks, and are not endorsed by Frontiers. They are made available through the Frontiers publishing platform as a service to conference organizers and presenters.

The copyright in the individual abstracts is owned by the author of each abstract or his/her employer unless otherwise stated.

Each abstract, as well as the collection of abstracts, are published under a Creative Commons CC-BY 4.0 (attribution) licence (https://creativecommons.org/licenses/by/4.0/) and may thus be reproduced, translated, adapted and be the subject of derivative works provided the authors and Frontiers are attributed.

For Frontiers’ terms and conditions please see https://www.frontiersin.org/legal/terms-and-conditions.

Received: 08 Apr 2013; Published Online: 11 Jul 2013.

* Correspondence: Dr. Rembrandt Bakker, Radboud University Nijmegen, Dept. of Neuroinformatics, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, Netherlands, r.bakker@donders.ru.nl