Common data determinants of recurrent cancer are broken, mislead researchers

Current algorithms used to pull the needle of patients with recurrent cancer from the haystack of patient databases are broken. Image: Flickr/jromero

To study the effectiveness or cost effectiveness of treatments for recurrent cancer, you first have to identify the patients in medical databases who have recurrent cancer. Studies generally do this with billing or treatment codes, on the assumption that certain codes identify who does and does not have recurrent cancer. A recent study published in the journal Medical Care shows that the commonly used data determinants of recurrent cancer may be misidentifying patients and potentially leading researchers astray.

“For example, a study might look in a database for all patients who had chemotherapy and then another round of chemotherapy more than six months after the first, imagining that a second round defines recurrent disease. Or a study might look in a database for all patients with a newly discovered secondary tumor, imagining that all patients with a secondary tumor have recurrent disease. Our study shows that both methods leave substantial room for improvement,” says Debra Ritzwoller, PhD, health economist at the Kaiser Permanente Colorado Institute for Health Research and investigator at the University of Colorado Cancer Center.
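The first rule Ritzwoller describes can be sketched in a few lines of Python. This is only an illustration: the function name, the input format (a list of chemotherapy claim dates for one patient), and the 183-day stand-in for "six months" are assumptions, not the definitions used in the study or in any particular database.

```python
from datetime import date, timedelta

# Assumption: "more than six months" is approximated as more than 183 days.
SIX_MONTHS = timedelta(days=183)

def flag_recurrence_by_chemo_gap(chemo_dates, gap=SIX_MONTHS):
    """Return True if any chemotherapy claim falls more than `gap` after the first.

    This mirrors the naive rule quoted above. It cannot distinguish a true
    recurrence from continued treatment of the original cancer.
    """
    if not chemo_dates:
        return False
    first = min(chemo_dates)
    return any(d - first > gap for d in chemo_dates)

# Hypothetical patients: one with a late second round, one without.
late_round = [date(2010, 1, 1), date(2010, 9, 1)]
early_only = [date(2010, 1, 1), date(2010, 3, 1)]
print(flag_recurrence_by_chemo_gap(late_round))
print(flag_recurrence_by_chemo_gap(early_only))
```

A rule like this flags every patient with a late round of chemotherapy, whatever the clinical reason for it, which is the weakness the study points to.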

The study used two unique datasets derived from HMO/Cancer Research Network and CanCORS/Medicare to check whether the widely used algorithms in fact identified the patients with recurrent disease that they were designed to detect. They did not. For example, a newly diagnosed secondary cancer may not mark a recurrence but may instead be a new cancer entirely; a second, later round of chemotherapy may be needed for continuing control of the de novo cancer, and not to treat recurrence.

“Basically, these algorithms don’t work for all cancer sites in many datasets commonly used for cancer research,” says Ritzwoller.

For example, for recurrent prostate cancer, no combination of billing codes used in this large data set identified, with adequate sensitivity and specificity, the patients whose notes in the data showed recurrent disease. The widely used algorithms performed best at identifying patients with recurrent lung, colorectal and breast cancer, and even there success rates were only between 75 and 85 percent.
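For readers unfamiliar with the terms, sensitivity is the share of patients with confirmed recurrence that an algorithm flags, and specificity is the share of patients without recurrence that it correctly leaves unflagged. The Python sketch below shows the arithmetic; the function name and the toy data are invented for illustration and do not come from the study.

```python
def sensitivity_specificity(flagged, truth):
    """Compare algorithm flags with confirmed recurrence status.

    Both arguments are equal-length sequences of booleans, one per patient.
    Returns (sensitivity, specificity); either is None if undefined.
    """
    if len(flagged) != len(truth):
        raise ValueError("flagged and truth must have the same length")
    tp = sum(1 for f, t in zip(flagged, truth) if f and t)          # flagged, recurrent
    fn = sum(1 for f, t in zip(flagged, truth) if not f and t)      # missed recurrence
    tn = sum(1 for f, t in zip(flagged, truth) if not f and not t)  # correctly unflagged
    fp = sum(1 for f, t in zip(flagged, truth) if f and not t)      # false alarm
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity

# Made-up example: five patients, algorithm flags versus confirmed status.
flagged = [True, True, False, False, True]
truth = [True, False, False, True, True]
sens, spec = sensitivity_specificity(flagged, truth)
print(sens, spec)
```

An algorithm is useful for research only when both numbers are high: low sensitivity misses recurrent patients, and low specificity mixes in patients who never recurred.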

“We need to know who in these data sets has recurrent disease. Then we can do things like look at which treatments lead to which outcomes,” Ritzwoller says. Matching patients to outcomes can help to decide who gets what treatment, and can help optimize costs in health care systems.

In a forthcoming paper, Ritzwoller and colleagues will suggest algorithms to replace those that have now proved inadequate.

About the author: Garth Sundem

In addition to writing for the University of Colorado Cancer Center, Garth is the author of the books The Geeks' Guide to World Domination, Brain Candy, and Geek Logik. Contact him at garth.sundem [at] ucdenver.edu.

1 comment

  1. I have CT scans every six months to see if my Stage IV kidney cancer is back on the move. My doctors routinely note that my diagnosis is “renal cell carcinoma” on the prescriptions, though they consider that I am currently without disease–and have been for eight years.

    If the algorithms assume that I have a recurrence because of this, it wildly overstates the situation. Billing codes and medical conditions are not the same, quite obviously. Also as more and more adjuvant drugs are being prescribed, the information will continue to be misleading.