Dataset service (1.0.0)


E-mail: ds.chaimeleon-eu@i3m.upv.es License: Apache 2.0

API to manage datasets.

Project source code

datasets

Operations with datasets

Create a new dataset

Creates a new dataset in the system.

Authorizations:

bearerAuth

Request Body schema: application/json

name	string [ 3 .. 128 ] characters Short descriptive name.
version	string [ 0 .. 16 ] characters Optional for backward compatibility, but should be included in new implementations.
project	string [ 2 .. 16 ] characters ^[a-zA-Z-]+$ (Optional) The code of the project which the dataset is assigned to. If not provided, it is assumed there is only one project in the system, so the first project found for the user will be taken.
previousId	string <uuid> (Optional) Specified when it is a new version of a previous dataset.
description	string Long explanation of dataset details, statistics, links, references, improvements/modifications if it's a new version, etc.
provenance	string Short text with the provenance of the data. Optional for backward compatibility, but should be included in new implementations.
purpose	string Short text with the intended purpose of the dataset. Optional for backward compatibility, but should be included in new implementations.
type	Array of strings Items Enum: "original" "annotated" "processed" "personal-data" Optional for backward compatibility, but should be included in new implementations.
collectionMethod	Array of strings Items Enum: "patient-based" "cohort" "only-image" "longitudinal" "case-control" "disease-specific" Optional for backward compatibility, but should be included in new implementations.
studies	Array of objects (StudyCreationObject)
subjects	Array of objects (Subject) Subjects of studies and their clinical data.

Responses

Request samples

Payload

Content type

application/json

{"name": "Lung cancer",
"version": "20240415",
"project": "MY-PROJECT",
"previousId": "00e821c4-e92b-48f7-a034-ba2df547e2bf",
"description": "string",
"provenance": "string",
"purpose": "string",
"type": ["annotated"
],
"collectionMethod": ["only-image",
"disease-specific"
],
"studies": [{"studyId": "5e5629835938d12160636353",
"studyName": "TCPEDITRICOABDOMINOPLVICOCONCONTRASTE",
"subjectName": "17B76FEW",
"pathInDatalake": "blancagomez/17B76FEW_Neuroblastoma/TCPEDITRICOABDOMINOPLVICOCONCONTRASTE20150129",
"series": [{"folderName": "AXT1XL",
"tags": ["Axial",
"T2W"
]
}
],
"url": "https://www.quibim.com/studies?id=5e5629835938d12160636353"
}
],
"subjects": [{"subjectName": "string",
"subjectId": "string",
"eForm": { }
}
]
}

Response samples

201

Content type

application/json

{"url": "https://chaimeleon-eu.i3m.upv.es/dataset-service/datasets/f99017af-9015-4222-b064-77f3c1b49d8b/details",
"apiUrl": "/api/datasets/f99017af-9015-4222-b064-77f3c1b49d8b"
}
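
As a rough illustration of the request described above, the following Python sketch builds (without sending) a create-dataset request and checks the documented length and pattern limits first. The `base_url` value, the `/datasets` path appended to it, the `Authorization: Bearer` header form and the function name are assumptions made for this example, not confirmed details of the service.

```python
import json
import re
import urllib.request

# Pattern and length limits copied from the request body schema above.
PROJECT_CODE = re.compile(r"^[a-zA-Z-]+$")


def build_create_dataset_request(base_url, token, name, version, studies, subjects,
                                 project=None, previous_id=None):
    """Build, but do not send, a request that creates a dataset."""
    if not 3 <= len(name) <= 128:
        raise ValueError("name must be 3 to 128 characters long")
    if len(version) > 16:
        raise ValueError("version must be at most 16 characters long")
    body = {"name": name, "version": version,
            "studies": studies, "subjects": subjects}
    if project is not None:
        if not (2 <= len(project) <= 16 and PROJECT_CODE.match(project)):
            raise ValueError("project must be 2 to 16 letters or hyphens")
        body["project"] = project
    if previous_id is not None:
        # Only set when the new dataset is a new version of an existing one.
        body["previousId"] = previous_id
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/datasets",  # assumed path
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,  # assumed form for bearerAuth
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` by the caller; keeping construction separate from sending makes the validation easy to test offline.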

List datasets

You can list all available datasets in the system and optionally add some filters for flags, tags, name, id, author, project, etc. There are also some parameters for pagination and sorting the results.

Authorizations:

bearerAuth

query Parameters

draft	boolean (Optional filter) If true, only draft datasets will be shown (those created by the user). If false, only non-draft datasets will be shown. If not set, both types of dataset will be shown.
public	boolean (Optional filter) If true, only public datasets will be shown. If false, only non-public datasets will be shown (depending on the user permissions). If not set, both types of dataset will be shown.
invalidated	boolean (Optional filter) If true, only invalidated datasets will be shown (depending on the user permissions). If false, only non-invalidated datasets will be shown. If not set, both types of dataset will be shown.
tags	Array of strings[^[a-zA-Z-]+$] Example: tags=train-partition&tags=annotated (Optional filter) If set, only datasets with all those tags will be shown. Because the type is an array, you can repeat the parameter in the URL to send more than one tag. Tags can contain alphanumeric characters or '-', with a maximum length of 20.
project	string Example: project=MY-PROJECT (Optional filter) If set, only datasets of that project will be shown. You can do "GET /projects" to obtain all the possible values.
searchString	string (Optional, default is empty) A case-insensitive search string (or substring) matched against the name, id or author of the dataset.
searchSubject	string (Optional, default is empty) A case-insensitive search string (or substring) matched against the name of a subject (actually the code, because subjects are anonymised) which must be contained in the dataset. It can be useful for finding datasets to invalidate when a subject must be removed from the platform.
onlyLastVersions	string (Optional, default is false) If true, "old" datasets will not be listed. An "old" dataset is one whose "nextId" property is not null, i.e. there is another dataset which is its next version.
skip	integer <int32> >= 0 (Optional, default=0) Number of records to skip for pagination.
limit	integer <int32> >= 0 (Optional, default=30) Maximum number of records to return (records per page), value of 0 means no limit.
sortBy	string Enum: "name" "authorName" "creationDate" "studiesCount" "subjectsCount" "timesUsed" (Optional, default=creationDate) The list will be sorted by this property.
sortDirection	string Enum: "ascending" "descending" (Optional) The list will be sorted in this direction.

Responses

Response samples

200

Content type

application/json

{"total": 0,
"returned": 0,
"skipped": 0,
"limit": 0,
"list": [{"id": "00e821c4-e92b-48f7-a034-ba2df547e2bf",
"name": "Lung cancer",
"version": "20240415",
"authorName": "James Gordon",
"creationDate": "2016-08-29T09:12:33.001Z",
"project": "MY-PROJECT",
"draft": true,
"public": true,
"invalidated": true,
"corrupted": true,
"tags": ["train-partition",
"annotated"
],
"studiesCount": 0,
"subjectsCount": 0,
"timesUsed": 0
}
],
"allowedActionsForTheUser": ["create"
]
}
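
The query parameters above, including the repeated `tags` parameter, can be assembled with the standard library. This is a minimal sketch: the function name is hypothetical, only a subset of the filters is covered, and serialising booleans as `true`/`false` is an assumption about what the service accepts.

```python
from urllib.parse import urlencode


def dataset_list_query(tags=(), project=None, public=None, skip=0, limit=30,
                       sort_by="creationDate", sort_direction=None):
    """Return the query string for a dataset listing request."""
    params = []
    for tag in tags:
        # An array parameter is sent by repeating its name once per value.
        params.append(("tags", tag))
    if project is not None:
        params.append(("project", project))
    if public is not None:
        # Assumed boolean spelling; leaving the filter out shows both types.
        params.append(("public", "true" if public else "false"))
    params.append(("skip", skip))
    params.append(("limit", limit))
    params.append(("sortBy", sort_by))
    if sort_direction is not None:
        params.append(("sortDirection", sort_direction))
    return urlencode(params)


query = dataset_list_query(tags=["train-partition", "annotated"], project="MY-PROJECT")
```

Filters left at `None` are omitted from the query entirely, which corresponds to the "If not set" behaviour described for each optional filter.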

Get a dataset by its id

Returns the details of a dataset specified by its id.

Authorizations:

bearerAuth

path Parameters

id required	string The id of the dataset.

Responses

Response samples

200

Content type

application/json

{"id": "00e821c4-e92b-48f7-a034-ba2df547e2bf",
"name": "Lung cancer",
"version": "20240415",
"project": "MY-PROJECT",
"previousId": "efa2cba6-4a17-4612-8074-7e9eb9c9d7ca",
"nextId": "4bda04db-8b73-4a65-b1bb-d2011769a91a",
"authorId": "d290f1ee-6c54-4b01-90e6-d701748f0851",
"authorName": "James Gordon",
"authorEmail": "james@email.com",
"creationDate": "2016-08-29T09:12:33.001Z",
"description": "string",
"tags": ["train-partition",
"annotated"
],
"provenance": "string",
"purpose": "string",
"type": ["annotated"
],
"collectionMethod": ["only-image",
"disease-specific"
],
"license": {"title": "CC BY 4.0",
"url": "https://creativecommons.org/licenses/by/4.0/"
},
"pids": {"preferred": "zenodoDoi",
"urls": {"zenodoDoi": "https://doi.org/10.5072/zenodo.1081030",
"custom": "https://myDatasetsDB.com/ds327"
}
},
"contactInfo": "James Gordon (james@email.com)",
"draft": true,
"creating": true,
"public": true,
"invalidated": true,
"invalidationReason": "It is a discarded/useless draft",
"corrupted": true,
"lastIntegrityCheck": "2016-08-29T09:12:33.001Z",
"editablePropertiesByTheUser": ["public",
"invalidated",
"name",
"description"
],
"allowedActionsForTheUser": ["use",
"checkIntegrity"
],
"studiesCount": 0,
"subjectsCount": 0,
"ageLow": 54,
"ageHigh": 82,
"ageUnit": ["years",
"years"
],
"ageNullCount": 2,
"sex": ["Male",
"Female",
"Unknown"
],
"sexCount": [348,
170,
2
],
"diagnosis": ["Colon cancer",
"Rectum cancer",
"Unknown"
],
"diagnosisCount": [348,
170,
2
],
"bodyPart": ["LUNG",
"BRAIN",
"Unknown"
],
"bodyPartCount": [348,
170,
2
],
"modality": ["CT",
"MR",
"Unknown"
],
"modalityCount": [348,
170,
2
],
"manufacturer": ["Philips",
"Siemens",
"Unknown"
],
"manufacturerCount": [348,
170,
2
],
"diagnosisYearLow": 0,
"diagnosisYearHigh": 0,
"diagnosisYearNullCount": 2,
"seriesTags": ["Axial",
"T2W"
],
"sizeInBytes": 2834502270,
"timesUsed": 0
}
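
In the sample above, several statistics come as a pair of arrays such as "sex" and "sexCount". Assuming the two arrays are index-aligned (the sample suggests this, but it is not stated explicitly), a client could pair them as in this short Python sketch. The function name is hypothetical.

```python
def histogram(details, field):
    """Pair the labels in `field` with the numbers in `field + "Count"`."""
    labels = details.get(field, [])
    counts = details.get(field + "Count", [])
    if len(labels) != len(counts):
        # The index-alignment assumption does not hold for this response.
        raise ValueError("%s and %sCount have different lengths" % (field, field))
    return dict(zip(labels, counts))


sample = {
    "sex": ["Male", "Female", "Unknown"],
    "sexCount": [348, 170, 2],
}
sex_histogram = histogram(sample, "sex")
```

The same call works for "diagnosis", "bodyPart", "modality" and "manufacturer", which follow the same naming pattern in the sample.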

Change a property of a dataset by its id

Changes a property of a dataset in the system.

Authorizations:

bearerAuth

path Parameters

id required	string The id of the dataset to modify.

Request Body schema: application/json

property	string Enum: "draft" "public" "invalidated" "invalidationReason" "name" "version" "description" "tags" "provenance" "purpose" "type" "collectionMethod" "previousId" "license" "pids" "contactInfo" "authorId" The name of the property to change. The properties that can be changed depend on the current state of the dataset and the current user; see the property "editablePropertiesByTheUser" returned by GET. See the DatasetDetails schema for the definition and value type of each of these properties.
value	object The new value to assign. The type depends on the property; see the DatasetDetails schema.

Responses

Request samples

Payload

Content type

application/json

{"property": "public",
"value": true
}
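
Since the set of changeable properties depends on the user and the dataset state, a client may want to check "editablePropertiesByTheUser" before sending a change. A minimal sketch, assuming `details` is the already-parsed object returned by the GET operation; the function name is hypothetical.

```python
def build_property_patch(details, prop, value):
    """Return the request body, or raise if the user may not edit the property."""
    editable = details.get("editablePropertiesByTheUser", [])
    if prop not in editable:
        raise PermissionError(
            "property %r is not editable here; editable properties: %s"
            % (prop, editable)
        )
    return {"property": prop, "value": value}


details = {"editablePropertiesByTheUser": ["public", "invalidated", "name", "description"]}
patch_body = build_property_patch(details, "public", True)
```

The server remains the authority on what is allowed; the local check only avoids sending a request that is expected to be rejected.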

Delete a dataset by its id

Deletes a dataset in the system. Normal users can only delete a dataset while it is still being created or when its creation ended with an error (i.e. the "creating" flag is true). Once created, a dataset can be used and that usage is traced, so it cannot be deleted (the deletion would hide the usage), but it can be invalidated instead. Check whether "allowedActionsForTheUser" contains "delete" to know if the user can delete.
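
The choice between deleting and invalidating can be sketched as follows. The function name is hypothetical, `details` is assumed to be the parsed object returned by the GET operation, and representing invalidation as a change of the "invalidated" property to true is one reading of the property-change operation, not a confirmed recipe.

```python
def plan_removal(details):
    """Choose between deleting and invalidating, based on the details object."""
    if "delete" in details.get("allowedActionsForTheUser", []):
        return ("DELETE", None)
    if "invalidated" in details.get("editablePropertiesByTheUser", []):
        # Deletion is not allowed, so fall back to invalidating the dataset.
        return ("PATCH", {"property": "invalidated", "value": True})
    raise PermissionError("the user can neither delete nor invalidate this dataset")
```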

Authorizations:

bearerAuth

path Parameters

id required	string The id of the dataset to delete.

Responses

List studies of a dataset

Returns the list of studies in a dataset specified by its id.

Authorizations:

bearerAuth

path Parameters

id required	string The id of the dataset.

query Parameters

skip	integer <int32> >= 0 (Optional, default=0) Number of studies to skip for pagination.
limit	integer <int32> >= 0 (Optional, default=30) Maximum number of studies to return (records per page), value of 0 means no limit.

Responses

Response samples

200

Content type

application/json

{"total": 0,
"returned": 0,
"skipped": 0,
"limit": 0,
"list": [{"studyId": "5e5629835938d12160636353",
"studyName": "TCPEDITRICOABDOMINOPLVICOCONCONTRASTE",
"subjectName": "17B76FEW",
"series": [{"folderName": "AXT1XL",
"tags": ["Axial",
"T2W"
]
}
],
"url": "https://www.quibim.com/studies?id=5e5629835938d12160636353",
"sizeInBytes": 182460520
}
]
}
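
The `skip` and `limit` parameters can be combined into a simple paging loop. In this sketch the network call is abstracted as `fetch_page(skip, limit)`, a hypothetical callable that returns the parsed response object, and "total" is assumed to be the number of all matching records.

```python
def iter_all(fetch_page, page_size=30):
    """Yield every record of a paginated listing, one page at a time."""
    skip = 0
    while True:
        page = fetch_page(skip, page_size)
        items = page["list"]
        yield from items
        skip += len(items)
        # Stop on an empty page or once every record has been seen.
        if not items or skip >= page["total"]:
            break


def fake_fetch(skip, limit):
    """Stand-in for the real request, serving five made-up studies."""
    studies = [{"studyId": str(i)} for i in range(5)]
    chunk = studies[skip:skip + limit]
    return {"total": len(studies), "returned": len(chunk),
            "skipped": skip, "limit": limit, "list": chunk}


study_ids = [study["studyId"] for study in iter_all(fake_fetch, page_size=2)]
```

The same loop applies to the other listings in this API that return "total", "returned", "skipped", "limit" and "list".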

Get the status of creation of a dataset by its id

Returns the details of the creation of a dataset specified by its id.

Authorizations:

bearerAuth

path Parameters

id required	string The id of the dataset.

Responses

Response samples

200

Content type

application/json

{"status": "pending",
"lastMessage": "string"
}
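
A client that waits for a dataset to finish its creation could poll this operation. The sketch below is only an outline: `get_status` is a hypothetical callable returning the parsed response, and "pending" is the only status value shown in this section, so the set of in-progress values is left as a parameter rather than guessed.

```python
import time


def wait_for_creation(get_status, in_progress=("pending",), interval=5.0,
                      max_polls=120, sleep=time.sleep):
    """Poll until the status leaves the in-progress values, then return it."""
    for _ in range(max_polls):
        status = get_status()
        if status["status"] not in in_progress:
            return status
        sleep(interval)
    raise TimeoutError("dataset creation is still in progress")
```

The `sleep` parameter exists so that tests can replace the real delay with a no-op.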

Check the integrity of a dataset

(This operation is intended only for the superadmin_datasets role; check if "allowedActionsForTheUser" contains "checkIntegrity".) Launches a process that reads the entire dataset, calculates its hash and compares it with the one stored at creation. The result will be in the property "DatasetDetails.lastIntegrityCheck".

Authorizations:

bearerAuth

path Parameters

id required	string The id of the dataset.

Responses

Response samples

200

Content type

application/json

{"success": true,
"msg": "string"
}

Relaunch the creation job of a dataset

(This operation is intended only for the superadmin_datasets role; check if "allowedActionsForTheUser" contains "restartCreation".) When a creation job is interrupted (and fails) for any reason and the admin fixes the problem, another creation job can be launched in k8s with this operation to restart and complete the creation process.

Authorizations:

bearerAuth

path Parameters

id required	string The id of the dataset.

Responses

Readjust the file permissions of a dataset

(This operation is intended only for the superadmin_datasets role; check if "allowedActionsForTheUser" contains "readjustFilePermissions".) When the permissions of files or directories of a dataset in the datalake are changed (for example, if a study was re-uploaded), a job can be launched in k8s with this operation to readjust these permissions.

Authorizations:

bearerAuth

path Parameters

id required	string The id of the dataset.

Responses

Recollect the metadata of a dataset

(This operation is intended only for the superadmin_datasets role; check if "allowedActionsForTheUser" contains "recollectMetadata".) When the service is updated to a new version that adds new metadata fields, a job can be launched in k8s with this operation to rescan the dataset, collect the metadata again and fill in all the fields.

Authorizations:

bearerAuth

path Parameters

id required	string The id of the dataset.

Responses

List accesses of a dataset

(This operation is intended only for the admin_datasetAccess role; check if "allowedActionsForTheUser" contains "viewAccessHistory".) Returns the list of accesses to a dataset specified by its id.

Authorizations:

bearerAuth

path Parameters

id required	string The id of the dataset.

query Parameters

skip	integer <int32> >= 0 (Optional, default=0) Number of accesses to skip for pagination.
limit	integer <int32> >= 0 (Optional, default=30) Maximum number of accesses to return (records per page), value of 0 means no limit.

Responses

Response samples

200

Content type

application/json

{"total": 0,
"returned": 0,
"skipped": 0,
"limit": 0,
"list": [{"creationTime": "2023-09-19T09:52:13.001Z",
"username": "James Gordon",
"accessType": "b",
"instanceName": "my-job",
"toolName": "jupyter-tensorflow",
"toolVersion": "2.2.9",
"image": "chaimeleon-library/ubuntu-python-tensorflow-desktop-jupyter:3.12",
"resourcesFlavor": "large-gpu",
"duration": 23,
"startTime": "2023-09-19T09:52:13.001Z",
"endTime": "2023-09-19T10:29:19.001Z",
"endStatus": "succeeded",
"cmdLine": "# ls -lh",
"openchallengeJobType": "training"
}
]
}

Get the ACL of a dataset by its id

(This operation is intended only for the admin_datasetAccess role; check if "allowedActionsForTheUser" contains "manageACL".) Returns the access control list (ACL) of a dataset specified by its id. The ACL is a list of users who can use the dataset (in addition to the users joined to the project of the dataset). IMPORTANT: The ACL can be managed always but it is considered only when the dataset is public. If the dataset is not public, only users joined to the project can use it.

Authorizations:

bearerAuth

path Parameters

id required	string The id of the dataset.

Responses

Response samples

200

Content type

application/json

[{"uid": "d290f1ee-6c54-4b01-90e6-d701748f0851",
"username": "user1"
}
]
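
One literal reading of the ACL rule described above can be written as a small predicate. This is an interpretation of the description, not the service's implementation, and it deliberately ignores any other access rules (roles, invalidation and so on). All names are hypothetical.

```python
def can_use_dataset(username, dataset_is_public, project_members, acl_usernames):
    """Apply the described rule: project members always, ACL only if public."""
    if username in project_members:
        return True
    # The ACL can always be managed, but it only counts for public datasets.
    return dataset_is_public and username in acl_usernames


members = {"alice"}
acl = {"user1"}
```

With these sets, `user1` can use the dataset only while it is public, whereas `alice` can use it in both cases.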

Put a user to the ACL of the dataset

(This operation is intended only for the admin_datasetAccess role; check if "allowedActionsForTheUser" contains "manageACL".) Adds a user to the ACL of the dataset if not already present (otherwise does nothing). The user will be able to access the dataset files. This is only useful for users who have not joined the project of the dataset (users in the project can always access it).

Authorizations:

bearerAuth

path Parameters

id required	string The id of the dataset.
username required	string Example: user1 The unique username of the user.

Responses

Delete a user from the ACL of a dataset

(This operation is intended only for the admin_datasetAccess role; check if "allowedActionsForTheUser" contains "manageACL".) Removes a user from the ACL of the dataset if present (otherwise does nothing).

Authorizations:

bearerAuth

path Parameters

id required	string The id of the dataset.
username required	string Example: user1 The unique username of the user.

Responses

List datasets that can be upgraded by the user

Lists all datasets that can be upgraded by the authenticated user. Upgrading a dataset means creating another dataset that improves or corrects it; the previousId property of the new dataset is then set to the id of the original one.

Authorizations:

bearerAuth

Responses

Response samples

200

Content type

application/json

[{"id": "00e821c4-e92b-48f7-a034-ba2df547e2bf",
"name": "Lung cancer",
"version": "20240415"
}
]

users

Operations with users

List users

You can list all users in the system and optionally add some filters for disabled, name, project, etc. There are also some parameters for pagination.

Authorizations:

bearerAuth

query Parameters

disabled	boolean (Optional filter) If true, only disabled users will be shown. If false, only enabled users will be shown. If not set, both types of users will be shown.
project	string Example: project=MY-PROJECT (Optional filter) If set, only users assigned to that project will be shown. You can do "GET /projects" to obtain all the possible values.
searchString	string (Optional, default is empty) A case-insensitive search string (or substring) matched against the name, username or email of the user.
skip	integer <int32> >= 0 (Optional, default=0) Number of records to skip for pagination.
limit	integer <int32> >= 0 (Optional, default=30) Maximum number of records to return (records per page), value of 0 means no limit.

Responses

Response samples

200

Content type

application/json

{"total": 0,
"returned": 0,
"skipped": 0,
"limit": 0,
"list": [{"uid": "string",
"username": "string",
"gid": 0,
"email": "string",
"name": "string",
"creationDate": "2016-08-29T09:12:33.001Z",
"disabled": true,
"emailVerified": true
}
]
}

Create or update user

Creates a new user, or updates the user if it already exists.

Authorizations:

bearerAuth

path Parameters

username required	string Example: user1 The unique username of the user.

Request Body schema: application/json

gid	integer (Optional) The unique GID of the user. Usually you should not include that property. It is set only in special cases, in requests from k8s operator: with value -1 to autogenerate a new unique GID for the user and with some specific GID to copy from production deployment to the test deployment. New users will have the null value in GID until k8s operator assigns one.
roles	Array of strings (Optional) Array of the roles to be assigned to the user. If not provided the previous array will be kept (if new user, just an empty array will be assigned).
projects	Array of strings (Optional) Array of the projects to be assigned to the user. If not provided the previous array will be kept (if new user, just an empty array will be assigned).
siteCode	string [ 2 .. 16 ] characters (Optional) The code of site which the user belongs to. If not provided, the previous value will be kept (if new user, just null will be assigned). It can be null, which means no site (usually the site is mandatory only for some roles, those who upload data).

Responses

Request samples

Payload

Content type

application/json

{"gid": 0,
"roles": ["dataset-administrator",
"data-scientists"
],
"projects": ["CHAIMELEON"
],
"siteCode": "string"
}
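
Because omitted properties keep their previous values while an explicit null "siteCode" means "no site", a client needs to distinguish "not provided" from "null". The sketch below uses a sentinel for that; the function and sentinel names are hypothetical, and "gid" is left out on purpose since the description says it should usually not be included.

```python
import json

_KEEP = object()  # sentinel meaning "leave the stored value untouched"


def build_user_update(roles=_KEEP, projects=_KEEP, site_code=_KEEP):
    """Build a request body that only contains the properties to change."""
    body = {}
    if roles is not _KEEP:
        body["roles"] = list(roles)
    if projects is not _KEEP:
        body["projects"] = list(projects)
    if site_code is not _KEEP:
        # None is serialised as JSON null, which means "no site".
        body["siteCode"] = site_code
    return body


only_roles = json.dumps(build_user_update(roles=["data-scientists"]))
clear_site = json.dumps(build_user_update(site_code=None))
```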

Get user details

Returns the details of a user.

Authorizations:

bearerAuth

path Parameters

username required	string Example: user1 The username of the user.

query Parameters

scope	string Enum: "gid" "all" (Optional, default: gid) The scope of details:

gid - Return only the GID of the user.
all - Return all the details.

Responses

Response samples

200

Content type

application/json

{"roles": ["dataset-administrator",
"data-scientists"
],
"projects": ["CHAIMELEON"
],
"siteCode": "UPV",
"uid": "string",
"username": "string",
"gid": 0,
"name": "string",
"email": "string",
"attributesFromAuthService": [{"displayName": "string",
"attributes": [{"displayName": "string",
"values": ["string"
]
}
]
}
]
}

List user management jobs launched for the user

...

Authorizations:

bearerAuth

path Parameters

username required	string Example: user1 The unique username of the user.

Responses

Response samples

200

Content type

application/json

[{"creationDate": "2016-08-29T09:12:12+00:00",
"name": "string",
"uid": "1761bfd2-2dfa-4640-a5e9-0403d8939f0a",
"status": "running"
}
]

Get job logs

Get the logs of a user management job

Authorizations:

bearerAuth

path Parameters

username required	string Example: user1 The username of the user
uid required	string Example: 1761bfd2-2dfa-4640-a5e9-0403d8939f0a The UID of the job

Responses

List user roles

List all available roles for users.

Authorizations:

bearerAuth

Responses

Response samples

200

Content type

application/json

["string"
]

projects

Operations with projects

List projects

List the projects depending on the purpose.

Authorizations:

bearerAuth

query Parameters

purpose	string Enum: "projectList" "datasetCreation" "datasetSearchFilter" (Optional, default: projectList) List the projects depending on the purpose:

projectList - List all the projects. Useful for a general list of projects, usually available in the main menu.
datasetCreation - List the possible values for the property "project" of a new dataset (POST /datasets) according to the authenticated user. New datasets can only be assigned to one of the projects the user has joined.
userManagement - List the possible values for the property "projects" of a user (PUT /users/{username}) according to the authenticated user. Users can only be assigned to one of the projects the user-manager has joined.
datasetSearchFilter - List all available projects which visible datasets can belong to. Useful to list the possible values for the param "project" in "GET /datasets".

Responses

Response samples

200

Content type

application/json

{"list": [{"code": "MY-PROJECT",
"name": "A wonderful project.",
"logoUrl": "https://chaimeleon-eu.i3m.upv.es/dataset-service/project-logos/77f3c1bm9d7h.png"
}
],
"allowedActionsForTheUser": ["create"
]
}

Create or update project

Creates a new project, or updates the project if it already exists.

Authorizations:

bearerAuth

path Parameters

code required	string (ProjectCode) [ 2 .. 16 ] characters ^[a-zA-Z-]+$ Example: MY-PROJECT The unique code of the project: short name without spaces (usually capital letters and numbers and hyphens instead of spaces)

Request Body schema: application/json

name	string [ 3 .. 160 ] characters The long name of the project or just the codename but without restrictions of spaces and so on.
shortDescription	string
externalUrl	string <= 256 characters (Optional, default null) URL to the project web page.
logoUrl	string Optional URL to an image logo of the project. The image will be automatically downloaded and stored in the server. Set to empty string if there is no logo for the project.
projectConfig	object (ProjectConfig)

Responses

Request samples

Payload

Content type

application/json

{"name": "Accelerating the lab to market transition of AI tools for cancer management",
"shortDescription": "CHAIMELEON will set up an EU-wide structured repository for health imaging data as an open source for artificial intelligence (AI) experimentation in cancer management.",
"externalUrl": "https://some-project.org/",
"logoUrl": "https://some-project.org/img/logo.png",
"projectConfig": {"defaultContactInfo": "project-manager@some-project.org or https://some-project.org/contact-form",
"defaultLicense": {"title": "Some Project Common License 1.0",
"url": "https://some-project.org/datasets-license.pdf"
},
"zenodoAccessToken": "iHU32BZJU8nosIkln89sd4FesEqbhfu4DIHbdsibgaa",
"zenodoAuthor": "SOME-PROJECT consortium",
"zenodoCommunity": "some_project",
"zenodoGrant": "10.13039/501100000780::952172"
}
}

Get the details of a project

Authorizations:

bearerAuth

path Parameters

code required	string (ProjectCode) [ 2 .. 16 ] characters ^[a-zA-Z-]+$ Example: MY-PROJECT The unique code of the project: short name without spaces (usually capital letters and numbers and hyphens instead of spaces)

Responses

Response samples

200

Content type

application/json

{"name": "Accelerating the lab to market transition of AI tools for cancer management",
"shortDescription": "CHAIMELEON will set up an EU-wide structured repository for health imaging data as an open source for artificial intelligence (AI) experimentation in cancer management.",
"externalUrl": "https://some-project.org/",
"code": "MY-PROJECT",
"logoUrl": "https://chaimeleon-eu.i3m.upv.es/dataset-service/project-logos/77f3c1bm9d7h.png",
"editablePropertiesByTheUser": ["name",
"shortDescription",
"externalUrl",
"logoUrl"
],
"allowedActionsForTheUser": ["config",
"viewSubprojects"
]
}

Change a property of a project by its code

Authorizations:

bearerAuth

path Parameters

code required	string (ProjectCode) [ 2 .. 16 ] characters ^[a-zA-Z-]+$ Example: MY-PROJECT The unique code of the project: short name without spaces (usually capital letters and numbers and hyphens instead of spaces)

Request Body schema: application/json

property	string Enum: "name" "shortDescription" "externalUrl" "logoUrl" The name of the property to change. The properties that can be changed depend on the current user; see the property "editablePropertiesByTheUser" returned by GET. See the object returned by the GET operation for the definition and value type of each of these properties.
value	object The new value to assign. The type depends on the property, see the object returned by GET operation.

Responses

Request samples

Payload

Content type

application/json

{"property": "shortDescription",
"value": "This is an amazing project."
}

Put project logo

Uploads an image file as the project's logo, or removes the current logo when an empty string is sent. Alternatively, you can copy the image from any URL with PATCH /projects/{code} {"property": "logoUrl", "value": "http://..."}

Authorizations:

bearerAuth

path Parameters

code required	string (ProjectCode) [ 2 .. 16 ] characters ^[a-zA-Z-]+$ Example: MY-PROJECT The unique code of the project: short name without spaces (usually capital letters and numbers and hyphens instead of spaces)

Request Body schema: multipart/form-data

logo	string <binary> The contents of the file to upload. Empty string to remove the logo of the project.

Responses

Set the configuration of a project

...

Authorizations:

bearerAuth

path Parameters

code required	string (ProjectCode) [ 2 .. 16 ] characters ^[a-zA-Z-]+$ Example: MY-PROJECT The unique code of the project: short name without spaces (usually capital letters and numbers and hyphens instead of spaces)

Request Body schema: application/json

defaultContactInfo	string <= 256 characters (Optional but recommended, default is empty) The default contact info assigned to new datasets. Usually the contact information of the project manager, the project email or the url of contact form of the project. The creator of a dataset can change later that contact info for his/her dataset.
defaultLicense	object (Optional, default is empty) The name and link to a default license document assigned to new datasets. It is useful when there is a custom license defined for the datasets of the project. The creator of a dataset can change later the license of his/her dataset.
zenodoAccessToken	string <= 128 characters (Optional, default is empty) The access token of the Zenodo account used to send datasets metadata when they are published. You can get one if you register in zenodo.org: in your account settings, go to "Applications", create a "Personal access token" and set the scopes "deposit:actions", "deposit:write". If you leave it empty, the datasets from this project will not be able to be published.
zenodoAuthor	string <= 128 characters (Optional, default is empty) The text that will appear as the author in dataset publications. That way you can set a collective authorship if the data can be provided by several sources. If you leave it empty, the name of the user who created the dataset will appear as the author of publication.
zenodoCommunity	string <= 128 characters (Optional, default is empty) The community code which the dataset publications will be related to. You can create one for your project in zenodo.org. It is useful as a collection to easily find all the depositions of your project. If you leave it empty, the publications will not be related to any community.
zenodoGrant	string <= 128 characters (Optional, default is empty) The grant code which the dataset publications will be related to.

Responses

Request samples

Payload

Content type

application/json

{"defaultContactInfo": "project-manager@some-project.org or https://some-project.org/contact-form",
"defaultLicense": {"title": "Some Project Common License 1.0",
"url": "https://some-project.org/datasets-license.pdf"
},
"zenodoAccessToken": "iHU32BZJU8nosIkln89sd4FesEqbhfu4DIHbdsibgaa",
"zenodoAuthor": "SOME-PROJECT consortium",
"zenodoCommunity": "some_project",
"zenodoGrant": "10.13039/501100000780::952172"
}

Get the configuration of a project

...

Authorizations:

bearerAuth

path Parameters

code required	string (ProjectCode) [ 2 .. 16 ] characters ^[a-zA-Z-]+$ Example: MY-PROJECT The unique code of the project: short name without spaces (usually capital letters and numbers and hyphens instead of spaces)

Responses

Response samples

200

Content type

application/json

{"defaultContactInfo": "project-manager@some-project.org or https://some-project.org/contact-form",
"defaultLicense": {"title": "Some Project Common License 1.0",
"url": "https://some-project.org/datasets-license.pdf"
},
"zenodoAccessToken": "iHU32BZJU8nosIkln89sd4FesEqbhfu4DIHbdsibgaa",
"zenodoAuthor": "SOME-PROJECT consortium",
"zenodoCommunity": "some_project",
"zenodoGrant": "10.13039/501100000780::952172"
}

List subprojects of a project

Lists the subprojects of a project. Subprojects are just subsets of cases within a project. When users upload images, they have to select a subproject by putting the corresponding ID (externalId here) in a DICOM tag. Subprojects are stored here for two reasons:

to create them in the case explorer (an on-event job is launched for that)
to check that every image in a dataset has a DICOM tag matching the externalId of an existing subproject, and that the subproject belongs to the project where the dataset is created.

Authorizations:

bearerAuth

path Parameters

code required	string (ProjectCode) [ 2 .. 16 ] characters ^[a-zA-Z-]+$ Example: MY-PROJECT The unique code of the project: short name without spaces (usually capital letters and numbers and hyphens instead of spaces)

Responses

Response samples

200

Content type

application/json

{"list": [{"code": "LUNG-CANCER",
"name": "Lung cancer",
"description": "The lung cancer cases of the project.",
"externalId": "6436b3f00011ce501f0aa4fc"
}
],
"allowedActionsForTheUser": ["create",
"edit"
]
}

Create or update subproject

Creates a new subproject, or updates the subproject if it already exists.

Authorizations:

bearerAuth

path Parameters

code required	string (ProjectCode) [ 2 .. 16 ] characters ^[a-zA-Z-]+$ Example: MY-PROJECT The unique code of the project: short name without spaces (usually capital letters and numbers and hyphens instead of spaces)
subcode required	string (SubprojectCode) [ 2 .. 16 ] characters ^[a-zA-Z-]+$ Example: MY-PROJECT The code which identifies the subproject.

Request Body schema: application/json

name	string <= 50 characters The name of the subproject.
description	string <= 80 characters Short description of the subproject.
externalId	string <= 40 characters The id of the subproject in the case explorer, i.e. the value of the dicom tag (70D1, 2000) in the images.

Responses

Request samples

Payload

Content type

application/json

{"name": "Lung cancer",
"description": "The lung cancer cases of the project.",
"externalId": "6436b3f00011ce501f0aa4fc"
}

licenses

Operations with licenses

List licenses

List all available licenses for datasets.

Authorizations:

bearerAuth

Responses

Response samples

200

Content type

application/json

[{"title": "CC BY 4.0",
"url": "https://creativecommons.org/licenses/by/4.0/"
}
]

datasetAccesses

Operations with datasetAccesses

Check the access to datasets

Called when a user wants to access one or more datasets. The access will be granted or denied according to the groups of the user.

Authorizations:

bearerAuth

Request Body schema: application/json

userName	string the unique userName of the user
datasets	Array of strings <uuid> [ items <uuid > ] the ids of datasets to access

Responses

Request samples

Payload

Content type

application/json

{"userName": "user1",
"datasets": ["00e821c4-e92b-48f7-a034-ba2df547e2bf"
]
}

Response samples

403

Content type

application/json

["00e821c4-e92b-48f7-a034-ba2df547e2bf"
]
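As an illustration of the request and response samples above, the sketch below builds the request body and interprets a 403 response. It rests on an assumption that the text does not confirm: that the JSON array returned with 403 lists the ids of the datasets for which access was denied. Both function names are hypothetical.

```python
import json
import uuid


def build_access_check_payload(user_name: str, dataset_ids: list) -> str:
    """Serialize the request body, rejecting ids that are not valid UUIDs."""
    # uuid.UUID raises ValueError when a string is not a well-formed UUID.
    normalized = [str(uuid.UUID(ds_id)) for ds_id in dataset_ids]
    return json.dumps({"userName": user_name, "datasets": normalized})


def split_by_403_body(requested: list, body_403: str) -> tuple:
    """Split the requested ids into (allowed, denied).

    ASSUMPTION: the 403 body is a JSON array with the ids that were denied.
    """
    denied_ids = set(json.loads(body_403))
    allowed = [ds_id for ds_id in requested if ds_id not in denied_ids]
    denied = [ds_id for ds_id in requested if ds_id in denied_ids]
    return allowed, denied
```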

Create new access to datasets

Called when a user accesses one or more datasets. Access is granted (and recorded in Tracer) or denied according to the groups of the user.

Authorizations:

bearerAuth

path Parameters

id required	string The id of the datasetAccess; it can be the uid of the Kubernetes object (deployment, job, pod...). When the user finishes the access, the DELETE operation should be called with that same id.

Request Body schema: application/json

userName	string the unique userName of the user
datasets	Array of strings <uuid> [ items <uuid > ] the ids of datasets to access
instanceName	string the name of the k8s object where the access is requested (deployment or job)
toolName	string the name of application, framework, docker image or helm chart used to analyze or process the datasets
toolVersion	string the version of application, framework, docker image or helm chart used to analyze or process the datasets

Responses

Request samples

Payload

Content type

application/json

{"userName": "user1",
"datasets": ["00e821c4-e92b-48f7-a034-ba2df547e2bf"
],
"instanceName": "my-job",
"toolName": "tensorflow-workstation",
"toolVersion": "1.1"
}
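A client such as a Kubernetes operator could prepare this call as sketched below. The path `/datasetAccess/{id}` is taken from the description of the `use` action in DatasetDetails; the base URL, the token value and the function name are assumptions made only for illustration. The request is built but not sent.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical deployment URL; replace it with the real service address.
BASE_URL = "https://example.org/api"


def build_create_access_request(access_id, token, user_name, dataset_ids,
                                instance_name, tool_name, tool_version):
    """Prepare (without sending) the request that registers a new dataset access."""
    body = {
        "userName": user_name,
        "datasets": list(dataset_ids),
        "instanceName": instance_name,
        "toolName": tool_name,
        "toolVersion": tool_version,
    }
    # The id goes in the path, so it is percent-encoded defensively.
    url = f"{BASE_URL}/datasetAccess/{urllib.parse.quote(access_id, safe='')}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # bearerAuth
        },
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` to perform the call; the same id should be kept so that the access can be closed later.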

Close a previous access to datasets

Called when a user finishes working with one or more datasets for which access was previously requested.

Authorizations:

bearerAuth

path Parameters

id required	string The id of the datasetAccess created previously, usually the uid of the Kubernetes object (deployment, job, pod...).

Request Body schema: application/json

status	string
startTime	string the time when the access started, in ISO 8601 format
endTime	string the time when the access ended, in ISO 8601 format

Responses

Request samples

Payload

Content type

application/json

{"status": "succeded",
"startTime": "2024-01-12T13:11:48Z",
"endTime": "2024-01-12T15:31:14Z"
}
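The timestamps in the sample above are UTC times with a trailing `Z`. A minimal sketch of how a client might produce the body for the DELETE operation follows; the function names are hypothetical, and the check that the end is not earlier than the start is an added client-side precaution, not a documented rule of the service.

```python
from datetime import datetime, timezone


def iso_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime like the sample: 2024-01-12T13:11:48Z."""
    if moment.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_close_access_body(status: str, start: datetime, end: datetime) -> dict:
    """Build the request body used to close a previously created access."""
    start_text = iso_utc(start)
    end_text = iso_utc(end)
    if end < start:
        # Client-side sanity check (assumption, not stated in the API description).
        raise ValueError("endTime is earlier than startTime")
    return {"status": status, "startTime": start_text, "endTime": end_text}
```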

DatasetDetails

id	string <uuid>
name	string [ 3 .. 128 ] characters Short descriptive name.
version	string <= 16 characters
project	string [ 2 .. 16 ] characters ^[a-zA-Z-]+$ The code of the project which the dataset is assigned to.
previousId	string <uuid> Specified when this is a new version of a previous dataset. Otherwise null.
nextId	string <uuid> Specified when there is another dataset which is a new version of this one. Otherwise null. It is automatically set when the new version is released (i.e. "draft" flag changed to false). The new version is a dataset whose previousId references this one. There can be more than one new version in draft state, but only one can be released.
authorId	string The unique id of the user who created the dataset. It is not intended for display, but it can be useful for a possible new feature that shows all datasets from an author.
authorName	string [ 3 .. 128 ] characters The name of the user who created the dataset.
authorEmail	string <email> [ 3 .. 128 ] characters The email of the user who created the dataset. It is omitted for unregistered users, for privacy reasons.
creationDate	string <date-time>
description	string Long explanation of dataset details, statistics, links, references, improvements/modifications if it's a new version, etc.
tags	Array of strings[^[a-zA-Z-]+$] Free text tags assignable to datasets for better organization and filtering. Tags can contain alphanumeric characters or '-', with a maximum length of 20.
provenance	string Short text with the provenance of the data.
purpose	string Short text with the intended purpose of the dataset.
type	Array of strings Items Enum: "original" "annotated" "processed" "personal-data"
collectionMethod	Array of strings Items Enum: "patient-based" "cohort" "only-image" "longitudinal" "case-control" "disease-specific"
	object (License) Name and link to the license document.
	object Permanent IDs as links, to show the users how the dataset should be cited.
contactInfo	string <= 256 characters (Optional) Contact information of the person responsible for the dataset, usually a name and email.
draft	boolean If true then the dataset is only visible and usable by the author, and some properties can be modified.
creating	boolean (This flag only appears when "draft" is true) If true, the dataset is still being created, and "draft" cannot be changed to false until creation ends (it does not appear in "dataset.editablePropertiesByTheUser"). The progress can be retrieved with GET /datasets/{id}/creationStatus. If false, the creation process has finished and "draft" can be changed to false when the user wants to release the dataset.
public	boolean If false, the dataset is only visible to and usable by users in the project to which the dataset is assigned. If true, the dataset is accessible to and usable by all users, and visible to non-registered users.
invalidated	boolean If true, the dataset cannot be used and is only visible to the author. The UI should show a visible warning including the reason, which is in the property "invalidationReason".
invalidationReason	string <= 128 characters (This property only appears and can be set when "invalidated" is true) When the user invalidates a dataset the UI should show a field to let the user specify the reason. It is recommended to show some predefined reasons like: "It is a discarded/useless draft", "It contains some rejected subject or study", "It contains corrupted data", "Other". And if "Other" is selected, let the user write the reason.
corrupted	boolean It is true when the integrity check has failed (files modified or even deleted). In that case, the dataset should be invalidated. You can see the date of the last check in "lastIntegrityCheck". The admin can see the details in the log of that date.
lastIntegrityCheck	string <date-time> The last time the integrity of the dataset was checked by calculating the hash and comparing it with the original saved in Tracer. It is null for newly created datasets and until the first check is done. If the check fails, the flag "corrupted" will be true.
editablePropertiesByTheUser	Array of strings The properties that can be modified with PATCH operation by the current authenticated user in the current state of this dataset. The array will be empty if no properties can be modified. The UI can use this property to show which properties are editable.
allowedActionsForTheUser	Array of strings Items Enum: "use" "delete" "checkIntegrity" "restartCreation" "readjustFilePermissions" "recollectMetadata" "viewAccessHistory" "manageACL" The actions that the current authenticated user can perform in the current state of this dataset. These are operations other than editing fields/flags, which are already covered by the previous property "editablePropertiesByTheUser". This property is therefore intended to let the UI know which other actions to show apart from those related to flags, like "Release" (change the flag "public") or "Invalidate" (change the flag "invalidated"). Possible values are: `use` - To launch a workstation in the platform with access to this dataset; the k8s operator will be able to do a POST /datasetAccess/{id} for this user on this dataset. `delete` - To do a DELETE /datasets/{id}. `checkIntegrity` - To do a POST /datasets/{id}/checkIntegrity. `restartCreation` - To do a POST /datasets/{id}/restartCreation. `readjustFilePermissions` - To do a POST /datasets/{id}/readjustFilePermissions. `recollectMetadata` - To do a POST /datasets/{id}/recollectMetadata. `viewAccessHistory` - To do a GET /datasets/{id}/accessHistory. `manageACL` - To do any operation in /datasets/{id}/acl and /datasets/{id}/acl/{username}.
studiesCount	integer The number of studies contained in the dataset
subjectsCount	integer The number of different subjects which are related with the studies contained in the dataset
ageLow	integer [Miabis] Age of the youngest subject. Null if metadata still not collected or age data empty for all studies. Collected from Dicom tag (0010,1010) or from the subject clinical data (inclusion_criteria.age_at_diagnosis or .age_at_baseline).
ageHigh	integer [Miabis] Age of the oldest subject. Null if metadata still not collected or age data empty for all studies. Collected from Dicom tag (0010,1010) or from the subject clinical data (inclusion_criteria.age_at_diagnosis).
ageUnit	Array of strings [Miabis] Array of two items: unit for ageLow, unit for ageHigh. Empty if metadata still not collected or age data empty for all studies.
ageNullCount	integer The number of studies with unknown age data. Null if metadata still not collected.
sex	Array of strings [Miabis] Array of different sex values in the dataset. Empty if metadata still not collected. Collected from Dicom tag (0010,0040) or from the subject clinical data (patient_data.gender). Miabis standard defines the possible values: "Male", "Female", "Undifferentiated", "Unknown".
sexCount	Array of integers Array with the number of studies for each item of the previous 'sex' array. Empty if metadata still not collected.
diagnosis	Array of strings Array of different diagnosis values in the dataset. Empty if metadata still not collected. Collected from Dicom tag (70D1,2000), private tag (project name) defined in CHAIMELEON. Possible values: "Prostate cancer", "Breast cancer", "Lung cancer", "Colon cancer", "Rectum cancer", "Unknown".
diagnosisCount	Array of integers Array with the number of studies for each item of the previous 'diagnosis' array. Empty if metadata still not collected.
bodyPart	Array of strings Array of different body parts in the dataset. Empty if metadata still not collected. Collected from Dicom tag (0018,0015). Dicom standard defines the possible values. Additionally the value 'Unknown' will be added at the end if there is any case with empty body part data.
bodyPartCount	Array of integers Array with the number of studies for each item of the previous 'bodyPart' array. Empty if metadata still not collected.
modality	Array of strings Array of different image modalities in the dataset. Empty if metadata still not collected. Collected from Dicom tag (0008,0060). Dicom standard defines the possible values. Additionally the value 'Unknown' will be added at the end if there is any case with empty modality data.
modalityCount	Array of integers Array with the number of studies for each item of the previous 'modality' array. Empty if metadata still not collected.
manufacturer	Array of strings Array of different image equipment manufacturers in the dataset. Empty if metadata still not collected. Collected from Dicom tag (0008,0070) and harmonized. Additionally the value 'Unknown' will be added at the end if there is any case with empty manufacturer data.
manufacturerCount	Array of integers Array with the number of studies for each item of the previous 'manufacturer' array. Empty if metadata still not collected.
diagnosisYearLow	integer The year of the oldest diagnosis. Null if metadata still not collected or year of diagnosis data empty for all studies. Collected from the subject clinical data (inclusion_criteria.baseline_date or .date_baseline_ct).
diagnosisYearHigh	integer The year of the most recent diagnosis. Null if metadata still not collected or year of diagnosis data empty for all studies. Collected from the subject clinical data (inclusion_criteria.baseline_date).
diagnosisYearNullCount	integer The number of studies with unknown year of diagnosis. Null if metadata still not collected.
seriesTags	Array of strings Array of different tags in series of the studies of the dataset.
sizeInBytes	integer The total size of files in all the studies and series selected for this dataset plus the eforms file. Null if metadata still not collected.
timesUsed	integer The number of times that the dataset has been used by any user (mounted in any application or job).

{"id": "00e821c4-e92b-48f7-a034-ba2df547e2bf",
"name": "Lung cancer",
"version": "20240415",
"project": "MY-PROJECT",
"previousId": "efa2cba6-4a17-4612-8074-7e9eb9c9d7ca",
"nextId": "4bda04db-8b73-4a65-b1bb-d2011769a91a",
"authorId": "d290f1ee-6c54-4b01-90e6-d701748f0851",
"authorName": "James Gordon",
"authorEmail": "james@email.com",
"creationDate": "2016-08-29T09:12:33.001Z",
"description": "string",
"tags": ["train-partition",
"annotated"
],
"provenance": "string",
"purpose": "string",
"type": ["annotated"
],
"collectionMethod": ["only-image",
"disease-specific"
],
"license": {"title": "CC BY 4.0",
"url": "https://creativecommons.org/licenses/by/4.0/"
},
"pids": {"preferred": "zenodoDoi",
"urls": {"zenodoDoi": "https://doi.org/10.5072/zenodo.1081030",
"custom": "https://myDatasetsDB.com/ds327"
}
},
"contactInfo": "James Gordon (james@email.com)",
"draft": true,
"creating": true,
"public": true,
"invalidated": true,
"invalidationReason": "It is a discarded/useless draft",
"corrupted": true,
"lastIntegrityCheck": "2016-08-29T09:12:33.001Z",
"editablePropertiesByTheUser": ["public",
"invalidated",
"name",
"description"
],
"allowedActionsForTheUser": ["use",
"checkIntegrity"
],
"studiesCount": 0,
"subjectsCount": 0,
"ageLow": 54,
"ageHigh": 82,
"ageUnit": ["years",
"years"
],
"ageNullCount": 2,
"sex": ["Male",
"Female",
"Unknown"
],
"sexCount": [348,
170,
2
],
"diagnosis": ["Colon cancer",
"Rectum cancer",
"Unknown"
],
"diagnosisCount": [348,
170,
2
],
"bodyPart": ["LUNG",
"BRAIN",
"Unknown"
],
"bodyPartCount": [348,
170,
2
],
"modality": ["CT",
"MR",
"Unknown"
],
"modalityCount": [348,
170,
2
],
"manufacturer": ["Philips",
"Siemens",
"Unknown"
],
"manufacturerCount": [348,
170,
2
],
"diagnosisYearLow": 0,
"diagnosisYearHigh": 0,
"diagnosisYearNullCount": 2,
"seriesTags": ["Axial",
"T2W"
],
"sizeInBytes": 2834502270,
"timesUsed": 0
}
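Several properties above come in pairs of parallel arrays (`sex` with `sexCount`, `modality` with `modalityCount`, and so on), where each count corresponds to the value at the same position. The sketch below shows one way a client might pair them and consult `allowedActionsForTheUser` and `editablePropertiesByTheUser` to decide what to show; the function names are hypothetical.

```python
def paired_counts(details: dict, prop: str) -> dict:
    """Map each value of an array property to its count, e.g. 'sex' with 'sexCount'."""
    values = details.get(prop) or []
    counts = details.get(prop + "Count") or []
    if len(values) != len(counts):
        raise ValueError(f"{prop} and {prop}Count have different lengths")
    return dict(zip(values, counts))


def can_do(details: dict, action: str) -> bool:
    """True if the current user may perform the given action on this dataset."""
    return action in (details.get("allowedActionsForTheUser") or [])


def is_editable(details: dict, prop: str) -> bool:
    """True if the current user may modify the given property with PATCH."""
    return prop in (details.get("editablePropertiesByTheUser") or [])
```

With the sample above, `paired_counts(details, "sex")` pairs "Male" with 348, "Female" with 170 and "Unknown" with 2. Both arrays are empty while metadata is not yet collected, in which case the result is an empty mapping.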