Self-supervised learning for chest x-rays and reports
- Rationale:
- Medical images, such as chest x-rays, are rich with information. However, they are not always labeled, and even when they are, the labels may capture only a fraction of the information a radiologist considers when evaluating the image.
- While labels are not always present or ideal, radiology reports that discuss clinically relevant information often accompany medical imaging.
- Self-supervised image-text architectures are an opportunity to leverage these radiology reports to learn from rich medical imaging and paired text data.
- We believe this type of architecture can not only yield better image representations, but also allow us to:
- accurately classify relevant findings in a zero-shot manner based on text descriptions
- identify for which inputs our model can be reliably used
- reduce reliance on spurious correlations that traditional CNNs are subject to.
- Project:
- We trained a modified version of CLIP on MIMIC-CXR data and used the CheXpert and PadChest datasets for evaluation.
- We generated synthetic watermark shortcuts to evaluate shortcut learning.
- We introduced new terms to the loss to impose sparsity of image-patch and text-token embeddings.
- Outcomes:
- Using synthetically generated watermark shortcuts on chest x-rays, we demonstrated that supervised CNNs rely heavily on shortcuts and are unable to unlearn them, whereas self-supervision with text helps reduce this reliance.
- We developed a regularized version of the model that achieves SOTA zero-shot classification AUCs, better than a comparable fully-supervised CNN on external test data.
- Using descriptive text prompting, we were able to classify novel diseases not present in the training data, such as COVID-19. Our performance is competitive with, and in some cases exceeds, that of radiologists.
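As a rough illustration of the zero-shot classification described above, the sketch below scores one image embedding against a positive and a negative text prompt and applies a softmax to the two cosine similarities. It is a minimal sketch, not the project's implementation: the prompt pairing, the temperature of 0.07, the function names, and the toy 3-dimensional embeddings are all assumptions made for illustration. In practice the embeddings would come from the trained image and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_probability(image_emb, pos_text_emb, neg_text_emb, temperature=0.07):
    """Softmax over similarities to a positive and a negative prompt.

    Returns the probability assigned to the positive prompt.
    The temperature value is an assumed default, not taken from the project.
    """
    pos = cosine(image_emb, pos_text_emb) / temperature
    neg = cosine(image_emb, neg_text_emb) / temperature
    m = max(pos, neg)  # subtract the max for numerical stability
    e_pos = math.exp(pos - m)
    e_neg = math.exp(neg - m)
    return e_pos / (e_pos + e_neg)


# Hypothetical toy embeddings standing in for encoder outputs.
image_emb = [0.9, 0.1, 0.2]
finding_emb = [1.0, 0.0, 0.1]      # e.g. the embedding of a prompt naming a finding
no_finding_emb = [0.0, 1.0, 0.1]   # e.g. the embedding of the negated prompt

p = zero_shot_probability(image_emb, finding_emb, no_finding_emb)
print(f"P(positive prompt) = {p:.3f}")
```

Because the class is defined only by its text prompt, a finding that never appeared as a training label can be scored by writing a new pair of prompts.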
Quantifying the transarterial embolization endpoint
- Rationale:
- Transarterial embolization (TAE) and its variants are effective minimally invasive procedures, often used to treat liver and kidney cancers, uterine fibroids, and other conditions.
- The procedure works by inserting a catheter into an artery supplying the tumor and injecting beads that block the blood supply, essentially starving the tumor. Chemotherapy or radioactive particles are often injected alongside these embolic beads.
- While the procedure is highly effective, it is important that the correct number of beads is injected: inject too few, and the cancer won't be adequately treated; inject too many, and the beads (and chemotherapy!) can reach off-target tissue, causing additional harm to an already sick patient.
- The current standard used by many clinicians to determine when to stop injecting is "beats of stasis". Using an angiogram, they inject contrast into the vasculature and count how long it takes for the contrast to wash away.
- However, this counting process varies widely between clinicians, some of whom opt not to use the method at all, and the five-second cutoff is somewhat arbitrary.
- Project:
- We aimed to quantify an embolization endpoint that could determine the optimal point to stop injecting, when blood flow was sufficiently blocked but risk of off-target embolization was acceptable.
- We designed a catheter device that occluded a vessel before embolization, so that we could pre-emptively measure the final target pressure clinicians should aim for when injecting beads downstream.
- We conducted in-vitro research to characterize the relationship between injection pressure, vessel pressure, and off-target embolization.
- Outcomes:
- We demonstrated that the endpoint we predicted before embolization aligned well with the final vessel pressure after embolization.
- We used computer vision to identify instances of off-target embolization, and used this to create quantitative bounds on safe injection pressures that could be used at a given vessel pressure without causing off-target embolization.
Standard embolization can result in reflux.
Our system optimizes embolization level.
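One way to turn labeled trials into bounds on safe injection pressure is sketched below. This is a hedged illustration only: the binning by vessel pressure, the conservative rule (keep the highest injection pressure that stayed below every off-target observation in the same bin), the function name, and the trial values are all assumptions. The toy numbers are made up, in arbitrary units, and are not measurements from the study.

```python
from collections import defaultdict


def safe_injection_bounds(trials, bin_width=10.0):
    """Estimate a conservative safe injection pressure per vessel-pressure bin.

    trials: iterable of (vessel_pressure, injection_pressure, off_target) tuples,
            where off_target is True if off-target embolization was detected.
    Returns {bin_start: bound}; bins without a usable safe trial are omitted.
    """
    safe = defaultdict(list)
    unsafe = defaultdict(list)
    for vessel_p, inject_p, off_target in trials:
        bin_start = (vessel_p // bin_width) * bin_width
        (unsafe if off_target else safe)[bin_start].append(inject_p)

    bounds = {}
    for bin_start, safe_ps in safe.items():
        lowest_unsafe = min(unsafe.get(bin_start, []), default=float("inf"))
        # Only trust safe trials that sit below every off-target observation.
        below = [p for p in safe_ps if p < lowest_unsafe]
        if below:
            bounds[bin_start] = max(below)
    return bounds


# Hypothetical trials in arbitrary units: (vessel pressure, injection pressure, off-target?)
toy_trials = [
    (42, 60, False), (45, 80, False), (47, 90, False), (48, 100, True),
    (63, 70, False), (65, 90, True), (68, 95, False),
]
bounds = safe_injection_bounds(toy_trials)
for bin_start in sorted(bounds):
    print(f"vessel pressure {bin_start:.0f}-{bin_start + 10:.0f}: inject below {bounds[bin_start]}")
```

In the second bin the safe trial at 95 is ignored because an off-target event was already seen at 90, which keeps the bound conservative.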
Tracking Surgical Instruments in the Operating Room
- Rationale:
- Surgeries are expensive for hospitals, and one of the primary cost drivers is operational inefficiency caused by poor instrument management.
- Each instrument that is taken out for a given procedure needs to be re-sterilized. While re-sterilization costs only around $0.50 per instrument, the cost adds up quickly given potentially a hundred instruments per surgery and hundreds of surgeries per day in a hospital.
- One study demonstrated that around 70% of instruments could be safely removed without any added risk to the patient.
- Additionally, many instruments end up lost, or even left inside the patient after surgery, leading to high replacement or litigation costs.
- Perioperative administrators would love to be able to reduce these risks and inefficiencies, but they currently lack instrument-level insight into what's going on in the operating room.
- Computer vision could potentially be used to track instruments and determine which are actually needed in a given procedure.
- With a conservative 40% reduction in instruments used, this would amount to estimated savings of $1.2 million per year at Johns Hopkins Hospital alone.
- Project:
- We built a CNN-based computer vision system to track surgical instruments in the operating room.
- We collaborated with and interviewed many perioperative administrators, clinicians, nurses, and scrub techs to identify practical and regulatory requirements for our system.
- Outcomes:
- We trained a CNN that could accurately identify the surgical instrument being held in a video frame.
- We used optical flow to reduce data requirements by over 90%, implemented temporal post-processing to fix frame-level mistakes, and added rule-based event processing to produce instrument usage requirements for administrators.
- We created a small annotated video dataset of instruments being moved over a fake operating room instrument table.
Our system tracks the instrument being held.
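The temporal post-processing and event logic mentioned in the outcomes can be sketched as follows. This is an assumed, simplified version rather than the project's code: it uses a sliding-window majority vote to overwrite isolated frame-level errors, then collapses the smoothed labels into usage events. The window size, function names, and instrument labels are hypothetical.

```python
from collections import Counter


def smooth_labels(frame_labels, window=5):
    """Replace each frame's label with the most common label in a centered window."""
    half = window // 2
    smoothed = []
    for i in range(len(frame_labels)):
        lo = max(0, i - half)
        hi = min(len(frame_labels), i + half + 1)
        counts = Counter(frame_labels[lo:hi])
        top = max(counts.values())
        winners = [label for label, c in counts.items() if c == top]
        # On ties, keep the frame's own label if it is among the winners.
        smoothed.append(frame_labels[i] if frame_labels[i] in winners else winners[0])
    return smoothed


def to_events(labels):
    """Collapse runs of identical labels into (label, start_frame, end_frame) tuples."""
    events = []
    for i, label in enumerate(labels):
        if events and events[-1][0] == label:
            events[-1] = (label, events[-1][1], i)
        else:
            events.append((label, i, i))
    return events


# Hypothetical per-frame CNN predictions with one misclassified frame (index 4).
raw = ["scalpel"] * 4 + ["forceps"] + ["scalpel"] * 4 + ["forceps"] * 5
clean = smooth_labels(raw)
events = to_events(clean)
print(events)
```

A majority vote is only one option. A longer window removes more flicker but also delays or suppresses genuinely short instrument uses, so the window size would need tuning against the video frame rate.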
Epilepsy localization
- Rationale:
- Over sixty million people worldwide suffer from epilepsy. For about one third of them, seizures are focal-onset, meaning they begin in particular regions of the brain, and for about half of those people, seizures are medically refractory, meaning they don't adequately respond to medical treatment.
- For these roughly 10 million patients with medically refractory focal-onset epilepsy, the standard treatment is to surgically resect (or potentially neurostimulate) the culprit "epileptogenic zone".
- However, this resection has an abysmal success rate of about 50%, and one possible explanation is that surgeons are not targeting the correct region for resection.
- Thus, better localization of the epileptogenic zone could help improve surgical outcomes.
- Normally, this localization is done by implanting invasive electrodes directly on the brain (ECoG or iEEG), or through the brain (sEEG). The electrical activity is monitored, sometimes for days, to identify problematic regions.
- Interictal spikes, or spikes occurring between seizures, are one promising feature that can be seen on these EEG recordings to identify the epileptogenic zone.
- Better placement of the invasive electrodes themselves could also improve a clinician's ability to localize the epileptogenic zone.
- Project:
- We built a simple algorithm, based on signal processing and thresholding, to identify interictal spikes in iEEG.
- We investigated whether recordings from non-invasive EEG could be used to identify optimal invasive electrode placement.
- Outcomes:
- We were able to quickly and accurately process large volumes of iEEG data to identify the presence and location of interictal spikes.
- We determined that we could isolate gamma-band signal power in non-invasive scalp EEG. Furthermore, we showed that when the distribution of this power over the scalp regions was more similar to the distribution of the physical invasive electrodes, resection was more likely to be successful.
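A threshold-based spike detector of the kind described in the project can be sketched as below. The specifics are assumptions, not the project's algorithm: the threshold is a multiple `k` of a robust noise estimate (1.4826 times the median absolute deviation), only local peaks are kept, and a short refractory period prevents counting one spike twice. A real pipeline would typically band-pass filter each channel first, which is omitted here. The sampling rate and test signal are synthetic.

```python
import math
import statistics


def detect_spikes(signal, fs, k=5.0, refractory_s=0.1):
    """Return sample indices of local peaks that deviate from the median
    by more than k robust standard deviations.

    fs: sampling rate in Hz. k and refractory_s are assumed defaults.
    """
    med = statistics.median(signal)
    deviations = [abs(x - med) for x in signal]
    # 1.4826 * MAD approximates the standard deviation for Gaussian noise.
    sigma = 1.4826 * statistics.median(deviations)
    threshold = k * sigma
    refractory = int(refractory_s * fs)

    spikes = []
    for i in range(1, len(signal) - 1):
        is_peak = deviations[i] >= deviations[i - 1] and deviations[i] > deviations[i + 1]
        if is_peak and deviations[i] > threshold:
            if not spikes or i - spikes[-1] > refractory:
                spikes.append(i)
    return spikes


# Synthetic single-channel trace: a slow oscillation with two injected spikes.
fs = 500  # Hz, hypothetical sampling rate
trace = [math.sin(0.3 * i) for i in range(1000)]
trace[200] = 10.0
trace[700] = -10.0

spike_indices = detect_spikes(trace, fs)
print(spike_indices)
```

Running the detector per channel and tallying spike counts by electrode would give a simple map of where interictal spikes concentrate.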