Yetisgen is the project’s lead data miner. She will design software that recognizes and gathers meaningful information from the free text of radiologists’ clinical reports and notes.
“We sampled 500 CT notes to create a schema that defines what information we’re trying to extract, say, tumor findings,” Yetisgen said. “We’ll manually label descriptors, like ‘lesion’ and ‘malignant’ and references to sizes like millimeters and centimeters in these subsets of information. Then we will build a software language model that will automatically learn the lexicon associated with those labels.”
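To give a rough sense of what labeling descriptors and size references might look like, here is a minimal Python sketch. It is not the project's software: the article describes a model that learns from hand-labeled notes, whereas this sketch swaps in simple pattern matching for illustration. The label names, the `DESCRIPTORS` list, the `label_spans` function and the sample note are all hypothetical.

```python
import re

# Hypothetical descriptor list; the project's actual schema is not published in the article.
DESCRIPTORS = ("lesion", "malignant")

# A number followed by a millimeter or centimeter unit, e.g. "4 mm" or "1.2 centimeters".
SIZE_RE = re.compile(
    r"(\d+(?:\.\d+)?)\s*(mm|cm|millimeters?|centimeters?)\b", re.IGNORECASE
)
# Whole-word, case-insensitive match on any listed descriptor.
DESCRIPTOR_RE = re.compile(r"\b(" + "|".join(DESCRIPTORS) + r")\b", re.IGNORECASE)


def label_spans(text):
    """Return (label, start, end, matched_text) tuples, ordered by position in the text."""
    spans = []
    for m in DESCRIPTOR_RE.finditer(text):
        spans.append(("DESCRIPTOR", m.start(), m.end(), m.group(0)))
    for m in SIZE_RE.finditer(text):
        spans.append(("SIZE", m.start(), m.end(), m.group(0)))
    return sorted(spans, key=lambda span: span[1])


# Made-up sentence in the style of a radiology note (not real patient data).
note = "There is a 4 mm lesion in the right upper lobe, possibly malignant."
for label, start, end, matched in label_spans(note):
    print(label, start, end, matched)
```

In a learned system, spans like these would come from human annotators and serve as training examples, rather than being produced by fixed patterns.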
The model’s accuracy is crucial, so it will need to be trained and tested multiple times until it can be validated as representative of the full 4 million records.
The task may sound daunting, but Yetisgen pointed out one upside: “Radiology reports follow a standard and usually are quite nicely structured, so they are much simpler than patient admission notes or discharge notes, which can differ widely even at the same institution,” she said.
The four-year research project is being funded with an award (R01CA248422) of more than $2 million from the National Cancer Institute, part of the National Institutes of Health.