Case Study: Evaluating System for Accurate and Robust Literature Review and Evidence Synthesis

Julian Baldwin (Atropos Health), Rebecca Hyde (Atropos Health), Christina Dinh (Atropos Health), Patrick Wedlock (System)

February 3, 2025

A healthtech start-up recently engaged System to test how well it could integrate into the start-up's AI co-pilot solution for physicians and medical researchers. The start-up explored multiple vendors to identify the solution best suited to its requirements for high-quality literature reviews.

Methods

Over a six-month period, the healthtech start-up assessed potential vendors, including System, to determine their capability to deliver high-quality, comprehensive literature reviews based on submitted clinical questions. This evaluation process involved:

Vendor Selection: A list of vendors offering APIs capable of generating literature reviews in response to custom clinical queries was narrowed down to three.

Citation Generation: For each vendor, the APIs were used to generate a curated list of citations tailored to specific clinical questions.

Internal Review Process: A clinical team conducted a detailed assessment of the citations and ranked them by quality, relevance, and utility. This review checked each citation against stringent criteria for quality and accuracy.

Quality Standards: A team of clinicians reviewed the results throughout the evaluation to determine which vendor provided the most relevant and reliable citations that would maintain credibility and customer trust.

Through this systematic approach, each vendor's capability to provide robust, evidence-based literature reviews was evaluated.
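The review step above can be pictured as a small scoring exercise. The following is a minimal sketch, not the study's actual tooling: it assumes each clinician judgment is recorded as a simple relevant/not-relevant flag per citation, and the vendor names and labels are invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical clinician judgments: (vendor, citation_id, is_relevant).
# Vendor names and labels are invented for illustration only;
# they are not data from the study.
judgments = [
    ("vendor_a", "c1", True),
    ("vendor_a", "c2", False),
    ("vendor_a", "c3", True),
    ("vendor_b", "c1", False),
    ("vendor_b", "c2", False),
    ("vendor_b", "c3", True),
]


def relevance_rates(rows):
    """Return {vendor: fraction of its citations judged relevant}."""
    relevant = defaultdict(int)
    total = defaultdict(int)
    for vendor, _citation_id, is_relevant in rows:
        total[vendor] += 1
        relevant[vendor] += int(is_relevant)
    return {vendor: relevant[vendor] / total[vendor] for vendor in total}


def rank_vendors(rates):
    """Sort vendor names from highest to lowest relevance rate."""
    return sorted(rates, key=rates.get, reverse=True)


rates = relevance_rates(judgments)
ranking = rank_vendors(rates)
```

A percentage like the ones reported in the Results section would correspond to a vendor's entry in `rates` multiplied by 100, assuming relevance was scored as a share of citations judged relevant.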

Results

The evaluation of three citation-generating platforms revealed significant variability in the quality and relevance of the citations produced.

System's flexibility and robust tooling enabled nuanced customization that kept the citations highly relevant to the clinical queries. While initial, un-tuned relevance scores ranged from 33% to 42% across all of the platforms, System's fine-tuning capabilities ensured that ultimately 81% of the citations it provided were deemed relevant. The analysis identified this flexibility and tooling as a key advantage of System, because it allows outputs to be fine-tuned to specific needs. Features such as detailed summaries, relevance scoring, and statistical insights provide greater control over the citation generation process. This adaptability aligns the outputs more closely with high standards, improving the overall reliability and relevance of the literature reviews and evidence synthesis included in customer-facing products.
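To illustrate how a relevance score can give control over which citations reach a product, here is a minimal sketch. It does not use System's API or describe how the tuning in this study was done: the `Citation` type, the 0.0 to 1.0 score scale, the threshold, and the placeholder titles are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    title: str
    relevance_score: float  # hypothetical score on a 0.0-1.0 scale


def filter_by_relevance(citations, min_score):
    """Keep citations scoring at least min_score, highest score first."""
    kept = [c for c in citations if c.relevance_score >= min_score]
    return sorted(kept, key=lambda c: c.relevance_score, reverse=True)


# Placeholder citations; titles and scores are invented for illustration.
candidates = [
    Citation("Placeholder study A", 0.91),
    Citation("Placeholder study B", 0.35),
    Citation("Placeholder study C", 0.72),
]

shortlist = filter_by_relevance(candidates, min_score=0.7)
```

Raising or lowering `min_score` trades coverage against precision, which is one simple way a team could adjust outputs to its own quality bar.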

About this Study

This study was conducted by Atropos Health. Atropos Health specializes in generating personalized, real-world evidence (RWE) by leveraging automation to conduct observational research studies from electronic health records (EHR) and claims data, reducing the time required from months to minutes. Our company mission is to accelerate the generation of actionable evidence to improve healthcare outcomes for everyone. We do this by enabling healthcare providers, researchers, and organizations to address specific, novel clinical questions effectively. By providing timely and actionable insights, we empower health systems and life sciences companies to advance patient care and medical research.