Case study: medical research

From mixed research inputs to hypothesis-ready graph context.

Ingest paper metadata, cohort exports, and trial summaries once, then give research agents a connected substrate for literature scouting and follow-up hypotheses.

Research agents lose time before the first useful question.

Papers, trial summaries, cohort exports, and biomarker notes arrive in different shapes. A research loop stays fragile when every iteration rebuilds a temporary retrieval layer from those sources.

Before

  • PDF chunks, CSV rows, and summaries stay isolated
  • Every agent reconstructs source context independently
  • Literature evidence and cohort evidence are hard to connect
  • Follow-up hypotheses disappear between runs

With RushDB

  • Mixed inputs become queryable records and relationships
  • Ontology gives agents a live map of available evidence
  • Semantic retrieval narrows relevant research text
  • Follow-up hypotheses persist for the next loop

Graph intelligence on ingest

Incoming data becomes queryable graph context.

RushDB turns structured data into graph-ready context without a separate modeling pipeline. Structure already encoded in a nested payload is linked immediately. For flat records imported from scattered sources, relationship analysis can propose stable cross-source patterns.

01

Normalize as data arrives

Import JSON or CSV. RushDB infers property types and adds new fields to the live, queryable ontology without a schema migration.
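As a rough illustration of what inferring property types on import could look like, the sketch below guesses a type for each incoming value and records it per label, so a field that first appears in a later row simply shows up in the ontology. This is plain Python, not RushDB's implementation; `infer_type`, `observe`, the type names, and the `enrolledOn` field are assumptions made for the example.

```python
from datetime import datetime

def infer_type(value):
    """Guess a coarse type name for one JSON or CSV value (illustrative names)."""
    if value is None or value == "":
        return "null"
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    text = str(value).strip()
    if text.lower() in ("true", "false"):
        return "boolean"
    try:
        float(text)
        return "number"
    except ValueError:
        pass
    try:
        datetime.fromisoformat(text)
        return "datetime"
    except ValueError:
        return "string"

def observe(ontology, label, row):
    """Record every property seen on a label; new fields need no migration step."""
    fields = ontology.setdefault(label, {})
    for key, value in row.items():
        fields.setdefault(key, set()).add(infer_type(value))

ontology = {}
observe(ontology, "COHORT", {"cohortId": "cohort-egfr", "sampleSize": "84"})
# A later CSV row introduces a field the ontology has not seen yet (hypothetical).
observe(ontology, "COHORT", {"cohortId": "cohort-egfr", "enrolledOn": "2026-01-15"})
print(ontology)
```

Types are kept as a set per property so that conflicting values across sources stay visible instead of being silently coerced.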

02

Auto-link nested structure

Nested objects become connected records automatically, preserving the parent-child graph structure already encoded in your payload.

03

Enrich scattered sources

After flat imports or schema changes, analyze the project ontology. RushDB can suggest join patterns and semantic relationship types for your review.

Suggested relationship analysis requires an LLM configured for the project. Suggestions stay in draft form until you approve them, so inferred domain meaning never mutates the graph silently. You can also add explicit relationships through the SDK or API.
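The draft-then-approve flow can be pictured with a small sketch. It proposes a join wherever two flat labels share a property with matching values, and creates edges only after a suggestion is marked approved. This is a simplified stand-in: it uses exact value matching rather than an LLM, and `Suggestion`, `suggest_joins`, `apply_approved`, and the `*_MATCHES_*` relationship name are hypothetical, not RushDB API.

```python
from dataclasses import dataclass

@dataclass
class Suggestion:
    source: str           # label on one side of the proposed join
    target: str           # label on the other side
    join_key: str         # property both labels share
    rel_type: str         # proposed relationship name
    status: str = "draft"

def suggest_joins(by_label):
    """Propose a draft join wherever two labels share a property with equal values."""
    out = []
    labels = sorted(by_label)
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            keys_a = set().union(*(r.keys() for r in by_label[a]))
            keys_b = set().union(*(r.keys() for r in by_label[b]))
            for key in sorted(keys_a & keys_b):
                vals_a = {r[key] for r in by_label[a] if key in r}
                vals_b = {r[key] for r in by_label[b] if key in r}
                if vals_a & vals_b:
                    out.append(Suggestion(a, b, key, f"{a}_MATCHES_{b}"))
    return out

def apply_approved(suggestions, by_label):
    """Create edges only for suggestions a reviewer has approved."""
    edges = []
    for s in suggestions:
        if s.status != "approved":
            continue
        for i, ra in enumerate(by_label[s.source]):
            for j, rb in enumerate(by_label[s.target]):
                if s.join_key in ra and ra.get(s.join_key) == rb.get(s.join_key):
                    edges.append((s.rel_type, i, j))
    return edges

# Flat rows as they might arrive from two separate exports.
flat = {
    "TRIAL": [{"trialId": "trial-9"}],
    "COHORT": [{"cohortId": "cohort-egfr", "trialId": "trial-9", "sampleSize": 84}],
}
suggestions = suggest_joins(flat)
edges_before = apply_approved(suggestions, flat)  # nothing approved yet
suggestions[0].status = "approved"                # explicit review step
edges_after = apply_approved(suggestions, flat)
print(edges_before, edges_after)
```

Keeping approval as an explicit state change is what prevents an inferred join from altering the graph on its own.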


Data model

One flexible graph for the workflow.

Start with the payload shape your product already produces. RushDB stores it as Records, infers typed properties, and keeps nested or approved domain relationships queryable.

Schema sketch
Research workspace payload

Paper chunks, trial rows, biomarkers, cohorts, and follow-up hypotheses stay connected for iterative research support.

{
  "workspaceId": "oncology-01",
  "PAPER": [{
    "paperId": "paper-egfr-2026",
    "title": "Resistance patterns in EGFR cohorts",
    "AUTHOR": [{ "name": "Dr. Lee" }],
    "PDF_CHUNK": [{ "chunkId": "p1-c4", "text": "Response varied after acquired resistance..." }]
  }],
  "TRIAL": [{
    "trialId": "trial-9",
    "COHORT": [{ "cohortId": "cohort-egfr", "sampleSize": 84, "BIOMARKER": [{ "name": "EGFR" }] }]
  }],
  "HYPOTHESIS": [{ "title": "Compare outcomes by resistance marker", "status": "draft" }]
}
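To make the nesting concrete, here is a minimal sketch of how a payload like the one above could be split into flat records plus parent-to-child links. It assumes that any key holding an object or a list of objects names a child label and that everything else is a property; the `flatten` helper, the numeric ids, and the `WORKSPACE` root label are illustrative choices, not RushDB internals.

```python
from itertools import count

def flatten(label, node, records, edges, ids, parent=None):
    """Split one nested dict into a flat record and link it to its parent."""
    record = {"id": next(ids), "label": label, "props": {}}
    records.append(record)
    if parent is not None:
        edges.append((parent, record["id"]))
    for key, value in node.items():
        children = value if isinstance(value, list) else [value]
        if children and all(isinstance(c, dict) for c in children):
            # Nested objects become child records labelled by their key.
            for child in children:
                flatten(key, child, records, edges, ids, parent=record["id"])
        else:
            record["props"][key] = value
    return record

payload = {
    "workspaceId": "oncology-01",
    "PAPER": [{
        "paperId": "paper-egfr-2026",
        "title": "Resistance patterns in EGFR cohorts",
        "AUTHOR": [{"name": "Dr. Lee"}],
        "PDF_CHUNK": [{"chunkId": "p1-c4", "text": "Response varied after acquired resistance..."}],
    }],
    "TRIAL": [{
        "trialId": "trial-9",
        "COHORT": [{"cohortId": "cohort-egfr", "sampleSize": 84, "BIOMARKER": [{"name": "EGFR"}]}],
    }],
    "HYPOTHESIS": [{"title": "Compare outcomes by resistance marker", "status": "draft"}],
}

records, edges = [], []
flatten("WORKSPACE", payload, records, edges, count(1))
labels = [r["label"] for r in records]
print(labels)
print(edges)
```

Each record keeps only scalar properties, while the edge list carries the parent-child structure that was implicit in the nesting.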

Working example

Persist the evidence trail and the next hypothesis.

A literature scout searches paper chunks for EGFR resistance evidence, then the workflow keeps the connected paper, cohort, trial, and follow-up hypothesis available for the next pass.

Input
PAPER paper-17
  TOPIC EGFR resistance
  PDF_CHUNK "Response varied after acquired resistance..."

COHORT cohort-4
TRIAL trial-9
HYPOTHESIS "Compare outcome differences by resistance marker"

Query
{
  "labels": ["PDF_CHUNK"],
  "propertyName": "text",
  "query": "EGFR resistance evidence with outcome differences",
  "where": { "workspace_id": "oncology-01" }
}

Result
{
  "paper": "paper-17",
  "topic": "EGFR resistance",
  "cohort": "cohort-4",
  "trial": "trial-9",
  "next_hypothesis": "Compare outcome differences by resistance marker"
}
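The path from that query to that result can be sketched in memory: apply the `where` filter, rank the remaining chunks, then follow links from the best chunk to its related records. Token overlap is used here only as a crude stand-in for semantic retrieval, and the helper names, the extra chunks, and the `RELATED` link layout are assumptions for illustration rather than RushDB behavior.

```python
import re

def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap_score(query, text):
    """Share of query tokens present in the text (not a real embedding score)."""
    q = tokens(query)
    return len(q & tokens(text)) / len(q) if q else 0.0

# Only the first chunk comes from the example above; the others are hypothetical.
CHUNKS = [
    {"paper": "paper-17", "workspace_id": "oncology-01",
     "text": "Response varied after acquired resistance..."},
    {"paper": "paper-17", "workspace_id": "oncology-01",
     "text": "Enrollment criteria and site logistics."},
    {"paper": "paper-99", "workspace_id": "cardiology-02",
     "text": "EGFR resistance evidence with outcome differences"},
]
# Records reachable from each paper (the link layout is an assumption).
RELATED = {
    "paper-17": {
        "topic": "EGFR resistance",
        "cohort": "cohort-4",
        "trial": "trial-9",
        "next_hypothesis": "Compare outcome differences by resistance marker",
    }
}

def search(request, chunks):
    """Apply the `where` filter first, then rank the survivors by score."""
    field = request["propertyName"]
    scoped = [c for c in chunks
              if all(c.get(k) == v for k, v in request["where"].items())]
    scored = [(overlap_score(request["query"], c[field]), c) for c in scoped]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored if score > 0]

def enrich(chunk):
    """Attach the related records so the answer keeps its provenance."""
    return {"paper": chunk["paper"], **RELATED[chunk["paper"]]}

request = {
    "labels": ["PDF_CHUNK"],
    "propertyName": "text",
    "query": "EGFR resistance evidence with outcome differences",
    "where": {"workspace_id": "oncology-01"},
}
hits = search(request, CHUNKS)
result = enrich(hits[0])
print(result)
```

The chunk from the other workspace matches the query text exactly but never reaches ranking, which shows why the workspace filter runs before scoring.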

Python SDK

Ingest once. Ground each research loop in live structure.

Keep document parsing and domain review in your application. RushDB stores the resulting records, exposes the ontology, and retrieves relevant evidence for the next step.

Implementation blueprint

Build the medical research-loop path.

Use this sequence to connect paper chunks, trial metadata, cohort rows, and follow-up hypotheses in a research workspace.

  1. Import PAPER records with PDF_CHUNK, AUTHOR, INSTITUTION, and TOPIC records
  2. Import TRIAL and COHORT records with biomarker and outcome context
  3. Create a managed index for PDF_CHUNK.text
  4. Retrieve evidence by workspace, topic, and semantic query
  5. Persist each follow-up HYPOTHESIS with SUPPORTING_EVIDENCE links

Build path

  • Keep PDF extraction and chunking outside RushDB.
  • Connect evidence across PAPER, TRIAL, COHORT, BIOMARKER, and OUTCOME records.
  • Persist hypotheses as records linked to the evidence used to create them.
  • Present this as research support, not clinical decision automation.
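The last step, persisting a hypothesis together with the evidence behind it, might look like the sketch below. A JSON file stands in for the workspace so the example stays self-contained; `persist_hypothesis`, `evidence_for`, the `hyp-N` id scheme, and the file layout are hypothetical, and in practice the records and links would be written through the RushDB SDK or API.

```python
import json
import tempfile
from pathlib import Path

def load(path):
    """Read the stand-in workspace, or start empty on the first run."""
    if path.exists():
        return json.loads(path.read_text())
    return {"records": [], "links": []}

def persist_hypothesis(path, title, evidence_ids, status="draft"):
    """Store a HYPOTHESIS record plus one SUPPORTING_EVIDENCE link per cited record."""
    store = load(path)
    n = sum(r["label"] == "HYPOTHESIS" for r in store["records"]) + 1
    hyp_id = f"hyp-{n}"
    store["records"].append(
        {"id": hyp_id, "label": "HYPOTHESIS", "title": title, "status": status}
    )
    for evidence_id in evidence_ids:
        store["links"].append(
            {"from": hyp_id, "type": "SUPPORTING_EVIDENCE", "to": evidence_id}
        )
    path.write_text(json.dumps(store, indent=2))
    return hyp_id

def evidence_for(path, hyp_id):
    """What a later loop would read back: the records a hypothesis cites."""
    return [link["to"] for link in load(path)["links"]
            if link["from"] == hyp_id and link["type"] == "SUPPORTING_EVIDENCE"]

with tempfile.TemporaryDirectory() as tmp:
    store_path = Path(tmp) / "workspace.json"
    hyp_id = persist_hypothesis(
        store_path,
        "Compare outcomes by resistance marker",
        ["p1-c4", "cohort-egfr", "trial-9"],
    )
    # A separate call re-reads the file, as the next research loop would.
    cited = evidence_for(store_path, hyp_id)

print(hyp_id, cited)
```

Storing the links alongside the hypothesis is what lets the next loop recover which chunk, cohort, and trial motivated it.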

Relevant docs

Read the exact primitives behind this pattern.

These links point to the RushDB docs pages that map directly to this case study: ingestion, labels, properties, values, SearchQuery, relationships, semantic search, MCP, and deployment.

How it works

Build the smallest useful workflow first.

01

Normalize useful source records

Import paper metadata, chunked text, cohort rows, and trial summaries in shapes the research workflow can inspect.

02

Discover and retrieve

Load ontology, search indexed text, and enrich matching evidence with related research records.

03

Persist the next question

Write the follow-up hypothesis and its cited evidence back to the workspace so the next loop starts with durable context.

Know where it fits.

Research support, not medical advice

This pattern organizes and retrieves evidence for research workflows. It does not replace clinical validation, domain review, or medical judgment.

Keep source provenance visible

Return cited paper, cohort, trial, and topic records with the answer instead of flattening evidence into an uncited summary.

Questions developers ask.