
NTOG Federated Dataset & Visualisation Builder

A sister teaching tool that starts from a structured Concept JSON and turns it into a common data model, synthetic country datasets, aggregate federated outputs, and R visualisation scripts.

Author: Heidi Andersén, MD, PhD — Docent, Tampere University; Clinical Lecturer, University of Turku · Clinical lead, Finnish lung cancer registry · Last reviewed 10 June 2026

Federated dataset specification

The specification describes how a structured study concept is converted into synthetic country-level datasets, local aggregate outputs, pooled summary tables, and R visualisation code.

The teaching workflow follows a federated principle: patient-level rows remain local, and only aggregate summaries are pooled. In this prototype, all data are synthetic and generated in the browser for education.
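The federated principle can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the country names, the field names (`stage`, `treated`), and the functions `local_aggregate` and `pool` are hypothetical and are not taken from the builder itself. The point is only that the pooling step receives counts, never rows.

```python
from collections import Counter

# Hypothetical synthetic patient-level rows, one list per country.
# Field names and values are illustrative assumptions, not the tool's schema.
LOCAL_ROWS = {
    "Country A": [
        {"stage": "I", "treated": True},
        {"stage": "IV", "treated": False},
        {"stage": "IV", "treated": True},
    ],
    "Country B": [
        {"stage": "I", "treated": True},
        {"stage": "III", "treated": True},
    ],
}


def local_aggregate(rows):
    """Runs inside one country: reduce patient-level rows to counts."""
    return {
        "n": len(rows),
        "n_treated": sum(1 for r in rows if r["treated"]),
        "by_stage": dict(Counter(r["stage"] for r in rows)),
    }


def pool(aggregates):
    """Runs centrally: combines aggregate summaries only, never rows."""
    by_stage = Counter()
    for agg in aggregates:
        by_stage.update(agg["by_stage"])
    return {
        "n": sum(a["n"] for a in aggregates),
        "n_treated": sum(a["n_treated"] for a in aggregates),
        "by_stage": dict(by_stage),
    }


# Only the per-country summaries cross the boundary to the pooling step.
summaries = [local_aggregate(rows) for rows in LOCAL_ROWS.values()]
pooled = pool(summaries)
print(pooled)
```

In a real federated study the two functions would run in different places, with `local_aggregate` executed by each registry and `pool` by the coordinating centre.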

A federated study is not built by collecting everything first and thinking later. It is built by defining a common question, common variables, local derivation rules, local checks, allowed outputs, and pooled summaries before data extraction.
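One way to make this "define first, extract later" rule concrete is a checklist object that refuses to call a study ready until every item is filled in. The sketch below is hypothetical: the class `StudySpec`, its field names, and the helper functions are assumptions that mirror the six items listed above, not the builder's actual data model.

```python
from dataclasses import dataclass, field, fields


@dataclass
class StudySpec:
    """Hypothetical container for the items fixed before data extraction."""
    common_question: str = ""
    common_variables: list = field(default_factory=list)
    local_derivation_rules: dict = field(default_factory=dict)
    local_checks: list = field(default_factory=list)
    allowed_outputs: list = field(default_factory=list)
    pooled_summaries: list = field(default_factory=list)


def missing_items(spec):
    """Return the names of specification items that are still empty."""
    return [f.name for f in fields(spec) if not getattr(spec, f.name)]


def ready_for_extraction(spec):
    """A study is ready only when no specification item is empty."""
    return not missing_items(spec)


# A draft with only the question and variables defined (placeholder values).
draft = StudySpec(
    common_question="Illustrative placeholder question",
    common_variables=["stage", "treated"],
)
print(missing_items(draft))
print(ready_for_extraction(draft))
```

The draft above is not ready for extraction because the derivation rules, local checks, allowed outputs, and pooled summaries are still undefined.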

Data standard, governance & ownership

A federated Nordic dataset is, above all, a shared standard and a governance agreement, not just software. The standard and the agreement are the part that belongs to the Nordic community.

Who owns what

The common data elements and the Nordic codebook are an NTOG asset (© Nordic Thoracic Oncology Group) — a community standard that only the registry and society network can own and maintain. The interactive builder is stateless software powered by Vahtian: it generates a specification in your browser, retains nothing, and never holds, processes, or controls registry or patient data.

This separation is deliberate. It keeps data controllership and clinical authority with NTOG and the national registries, and keeps the software a replaceable instrument.

Governance the specification must satisfy (before any real data)

Interoperability standards to align with

This prototype is for education and early design only. It does not replace ethics review, data permissions, a DPIA, statistical review, or patient-level data governance. Do not enter patient-level, identifiable, or confidential data.

1. Study concept input

Paste the Concept JSON from the Study Design & Simulation Builder. This page uses that structured input to generate a federated-style teaching project.

An example concept is loaded when the page opens.

2. Country dataset generation assumptions

These editable teaching assumptions define denominators, centre counts, practice-pattern heterogeneity, missingness, and the random seed.

What this section controls

The synthetic countries differ by design. The differences show why federated studies must pre-specify local denominators, missingness patterns, coding rules, and allowed aggregate outputs before analysis.
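The role of these assumptions can be sketched as a small seeded generator. The builder generates its data in the browser, so the Python below is not its implementation; it is an illustration in which the parameter names (`denominator`, `centres`, `treatment_rate`, `missing_rate`), the country names, and all numeric values are invented for the example.

```python
import random

# Hypothetical teaching assumptions per synthetic country.
# All names and numbers are illustrative, not the tool's defaults.
ASSUMPTIONS = {
    "Country A": {"denominator": 200, "centres": 4,
                  "treatment_rate": 0.6, "missing_rate": 0.05},
    "Country B": {"denominator": 120, "centres": 2,
                  "treatment_rate": 0.4, "missing_rate": 0.20},
}


def generate_country(name, params, seed):
    """Generate synthetic rows for one country from its assumptions."""
    # A per-country generator keeps results reproducible for a given seed.
    rng = random.Random(f"{seed}:{name}")
    rows = []
    for _ in range(params["denominator"]):
        # Practice-pattern heterogeneity: each country has its own rate.
        treated = rng.random() < params["treatment_rate"]
        # Missingness: the value is recorded as None with the given rate.
        if rng.random() < params["missing_rate"]:
            treated = None
        centre = rng.randrange(params["centres"]) + 1
        rows.append({"centre": centre, "treated": treated})
    return rows


def generate_all(assumptions, seed):
    """Generate every country's synthetic dataset from one random seed."""
    return {name: generate_country(name, p, seed)
            for name, p in assumptions.items()}


data = generate_all(ASSUMPTIONS, seed=2026)
for name, rows in data.items():
    n_missing = sum(1 for r in rows if r["treated"] is None)
    print(name, len(rows), "rows,", n_missing, "missing")
```

Rerunning with the same seed reproduces the same rows, while changing the seed or any assumption yields a different teaching dataset.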

Cite this tool: Andersén HH. NTOG Federated Dataset & Visualisation Builder (v1.0). Zenodo; 2026. doi:10.5281/zenodo.20632632. Licensed under CC BY-NC 4.0.