Context Navigation

← Previous Ticket
Next Ticket →

#8499 closed defect (fixed)

Collate details about DICOM collections from series

Reported by:	Zach Pearson	Owned by:	Zach Pearson
Priority:	moderate	Milestone:	1.6
Component:	DICOM	Version:
Keywords:		Cc:	Elaine Meng
Blocked By:		Blocking:
Notify when closed:		Platform:	all
Project:	ChimeraX

Description

The nbia api is much more detailed at the study level than the collections level, or the data can be scraped at the same time as links are scraped.

Attachments (1)

tcia-columns.png (623.8 KB ) - added by Elaine Meng 3 years ago.: Added by email2trac

Download all attachments as: .zip

Change History (9)

comment:1 by Zach Pearson, 3 years ago

What do you think of: Collection Name, Patients, Locations, Modalities as the columns for the first table in the tool?

in reply to: 2 ; follow-up: 2 comment:2 by Elaine Meng, 3 years ago

Hi Zach,
If it's not too many things, I would vote for 

Collection (sans "Name")
Cancer Type (sometimes redundant with Location but often has more info)
Species (I saw a lot of dogs and rodents when I was browsing around)
Subjects (since dogs and rodents aren't really patients?  But I'm OK with "Patients" too if you preferred it since that is DICOM terminology)
Modality (or "Data Type" or "Data Types" ... whichever you like, really, and Modality is reasonable since it's DICOM terminology)

The image is a screenshot of the Browse Collections page at the TCIA website:

by Elaine Meng, 3 years ago

Attachment:	tcia-columns.png added

Added by email2trac

comment:3 by Zach Pearson, 3 years ago

Collection, Species, Subjects, and Modality are possible today, but I think the closest I can get to cancer type without asking them to modify their API (which I can do!) is location (body part).

Is that acceptable?

in reply to: 5 comment:4 by Elaine Meng, 3 years ago

Hi Zach,
Oh, I thought all of the columns shown in their browse-database page were equally available.  Strange, seems like "cancer type" is of primary importance (and they also must think so since it is the first column they show after Collection)! However, I understand if you only have access to Location, we can use that instead unless they improve their API.
Elaine

follow-up: 4 comment:5 by Zach Pearson, 3 years ago

I opened a pull request on the library we're using to access TCIA data. Their API client was missing an endpoint that would reduce the number of API calls we need to make per-collection from 4 down to 2. I wanted to get this done because the individual requests can take a long time. Once it's merged, I'll finish this ticket.

comment:6 by Zach Pearson, 3 years ago

Status:	assigned → closed

I have this working, but it's slow to download. Each collection (there are 114 of them) requires a separate request to download information, then we have to parse it to get it into place. I added a progress message because it takes about two minutes and I started to worry it wasn't doing anything. If this feels agonizingly slow reopen this ticket and I'll work on caching the results of one download in CxServices and set up a job to periodically retrieve new data. I'll push the changes for tomorrow's daily build.

comment:7 by Zach Pearson, 3 years ago

Status:	closed → reopened

comment:8 by Zach Pearson, 3 years ago

Resolution:	→ fixed
Status:	reopened → closed

Note: See TracTickets for help on using tickets.

Download in other formats: