Does RushDB parse PDFs automatically?

No. Treat PDF extraction and chunking as application concerns, then import the structured metadata, chunks, and source references your retrieval strategy needs.

Can the workflow ingest CSV data?

Yes. RushDB supports CSV and JSON ingestion. Model the imported rows with the workspace and source identifiers required by your research flow.
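As a minimal sketch of that modelling step, the snippet below turns exported CSV rows into record dictionaries stamped with workspace and source identifiers before import. The column names, the `source_id` and `source_row` fields, and the `rows_to_records` helper are illustrative assumptions, not part of RushDB; pass the resulting list to whichever import call your SDK version provides.

```python
import csv
import io

def rows_to_records(csv_text, workspace_id, source_id):
    """Convert CSV text into dicts tagged with workspace and source identifiers."""
    records = []
    for row_no, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=1):
        record = dict(row)
        # csv returns strings; cast the (hypothetical) numeric column explicitly.
        if record.get("sample_size"):
            record["sample_size"] = int(record["sample_size"])
        record["workspace_id"] = workspace_id
        record["source_id"] = source_id
        record["source_row"] = row_no  # 1-based data row, kept for provenance
        records.append(record)
    return records

SAMPLE_CSV = "cohort_id,sample_size,biomarker\ncohort-egfr,84,EGFR\n"
cohort_records = rows_to_records(SAMPLE_CSV, "oncology-01", "cohort-export.csv")
```

Keeping the source file name and row number on each record means a later answer can point back to the exact export it came from.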

Is this a clinical decision system?

No. This page describes a research-workspace pattern. Any medical or clinical use requires appropriate domain review, validation, governance, and compliance work.

How do later research loops retain context?

Persist hypotheses, supporting references, and workflow state as records. Later loops can retrieve the relevant evidence rather than starting from an empty context window.

How is this different from the legal contract-review blueprint?

Both persist cited context across iterations. This blueprint connects heterogeneous research sources (papers, cohorts, trials) around evolving hypotheses, while the legal contract-review blueprint tracks facts and clause revisions within one matter over time.

Blueprint: medical research

From mixed research inputs to hypothesis-ready graph context.

Ingest PAPER metadata with nested PDF_CHUNK text, COHORT exports, and TRIAL summaries once, then connect them under shared TOPIC and workspace identifiers. A literature-scouting agent searches indexed PDF_CHUNK text for evidence such as EGFR resistance findings. The resulting HYPOTHESIS record is stored with links to that evidence, so the next research loop starts from a connected substrate instead of an empty context window.

Medical Research Loops is a RushDB pattern that turns PAPER, PDF_CHUNK, COHORT, TRIAL, and HYPOTHESIS inputs into connected, queryable graph records so research agents retrieve cited evidence and persist follow-up hypotheses between iterations.

Research agents lose time before the first useful question.

Papers, trial summaries, cohort exports, and biomarker notes arrive in different shapes from different sources. Without a shared structure, agents rebuild source context on every run, and follow-up hypotheses vanish instead of seeding the next loop.

Before

PDF chunks, CSV rows, and summaries stay isolated
Every agent reconstructs source context independently
Literature evidence and cohort evidence are hard to connect
Follow-up hypotheses disappear between runs

With RushDB

Mixed inputs become queryable records and relationships
Schema gives agents a live map of available evidence
Semantic retrieval narrows relevant research text
Follow-up hypotheses persist for the next loop

Graph intelligence on ingest

Incoming data becomes queryable graph context.

Research inputs land as PAPER records with nested PDF_CHUNK text, TOPIC tags, COHORT rows, and TRIAL summaries, all scoped to a workspace_id like oncology-01. RushDB types each field on write, auto-links PDF_CHUNK and TOPIC records to their parent PAPER, and lets suggested-relationship analysis surface connections between a paper chunk and a related COHORT or TRIAL record imported separately.

Normalize as evidence arrives

PAPER, PDF_CHUNK, COHORT, and TRIAL payloads are typed on write, so topic tags and chunk text are indexable the moment they land.

Auto-link nested structure

PDF_CHUNK and TOPIC records nested under a PAPER are automatically related to it, keeping EGFR resistance evidence attached to its source paper.
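To make the nesting convention concrete, here is a rough illustration of how a nested payload can be read as records plus parent-child links. This is not RushDB's implementation; `flatten` is a hypothetical helper that assumes uppercase keys holding lists of objects denote child labels, as in the payloads on this page.

```python
def is_child_list(key, value):
    """Treat an uppercase key holding a list as a nested child label (assumption)."""
    return key.isupper() and isinstance(value, list)

def flatten(label, payload, parent=None, records=None, links=None):
    """Collect (label, properties) records and (parent_index, child_index) links."""
    records = [] if records is None else records
    links = [] if links is None else links
    # Everything that is not a nested child list is one of the record's own properties.
    props = {k: v for k, v in payload.items() if not is_child_list(k, v)}
    index = len(records)
    records.append((label, props))
    if parent is not None:
        links.append((parent, index))
    for key, value in payload.items():
        if is_child_list(key, value):
            for child in value:
                flatten(key, child, index, records, links)
    return records, links

paper = {
    "paperId": "paper-17",
    "TOPIC": [{"name": "EGFR resistance"}],
    "PDF_CHUNK": [{"chunkId": "p1-c4", "text": "Response varied after acquired resistance..."}],
}
records, links = flatten("PAPER", paper)
```

Here `records` holds one PAPER, one TOPIC, and one PDF_CHUNK entry, and `links` ties both children back to the paper at index 0.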

Enrich scattered sources

Suggested-relationship analysis surfaces links between a PAPER, its COHORT, and a related TRIAL even when each was imported in a separate batch.

Suggested-relationship analysis requires an LLM configured for the project. Suggestions stay in draft form until you approve them, so inferred domain meaning never mutates the graph silently. You can also add explicit relationships through the SDK or API.
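The draft-then-approve idea can be mirrored on the application side when you track reviews yourself. The sketch below is an assumption-heavy illustration: the `Suggestion` class, its `status` values, and the relationship type names are hypothetical and do not describe RushDB's own suggestion objects.

```python
from dataclasses import dataclass

@dataclass
class Suggestion:
    source_id: str
    target_id: str
    rel_type: str
    status: str = "draft"  # a reviewer flips this to "approved"

def approved_links(suggestions):
    """Return only reviewer-approved links; drafts are never written to the graph."""
    return [
        (s.source_id, s.rel_type, s.target_id)
        for s in suggestions
        if s.status == "approved"
    ]

queue = [
    Suggestion("p1-c4", "cohort-4", "DESCRIBES_COHORT"),
    Suggestion("paper-17", "trial-9", "REPORTS_ON", status="approved"),
]
to_write = approved_links(queue)
```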

Review suggested relationship patterns

Data model

One flexible graph for the workflow.

Start with the payload shape your product already produces. RushDB stores it as Records, infers typed properties, and keeps nested or approved domain relationships queryable.

Schema sketch

Research workspace payload

Paper chunks, trial rows, biomarkers, cohorts, and follow-up hypotheses stay connected for iterative research support.

{
  "workspace_id": "oncology-01",
  "PAPER": [{
    "paperId": "paper-egfr-2026",
    "title": "Resistance patterns in EGFR cohorts",
    "AUTHOR": [{ "name": "Dr. Lee" }],
    "PDF_CHUNK": [{ "chunkId": "p1-c4", "text": "Response varied after acquired resistance..." }]
  }],
  "TRIAL": [{
    "trialId": "trial-9",
    "COHORT": [{ "cohortId": "cohort-egfr", "sampleSize": 84, "BIOMARKER": [{ "name": "EGFR" }] }]
  }],
  "HYPOTHESIS": [{ "title": "Compare outcomes by resistance marker", "status": "draft" }]
}
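The search examples on this page filter PDF_CHUNK records by `workspace_id`. One way to keep that filter straightforward, assuming filters match a record's own properties, is to copy the workspace identifier onto every nested record before import. `stamp_workspace` is a hypothetical helper, not an SDK function.

```python
import copy

def stamp_workspace(payload, workspace_id):
    """Return a deep copy of the payload with workspace_id set on every nested object."""
    stamped = copy.deepcopy(payload)

    def visit(node):
        if isinstance(node, dict):
            node["workspace_id"] = workspace_id
            # Snapshot the values so the traversal is unaffected by the key just added.
            for value in list(node.values()):
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(stamped)
    return stamped

payload = {"PAPER": [{"paperId": "paper-egfr-2026", "PDF_CHUNK": [{"chunkId": "p1-c4"}]}]}
stamped = stamp_workspace(payload, "oncology-01")
```

The original payload is left untouched, so the same export can be stamped for a different workspace if needed.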

Working example

Persist the evidence trail and the next hypothesis.

A literature scout searches paper chunks for EGFR resistance evidence, then the workflow keeps the connected paper, cohort, trial, and follow-up hypothesis available for the next pass.

Input

PAPER paper-17
  TOPIC EGFR resistance
  PDF_CHUNK "Response varied after acquired resistance..."

COHORT cohort-4
TRIAL trial-9
HYPOTHESIS "Compare outcome differences by resistance marker"

Query

{
  "labels": ["PDF_CHUNK"],
  "propertyName": "text",
  "query": "EGFR resistance evidence with outcome differences",
  "where": { "workspace_id": "oncology-01" }
}

Result

{
  "paper": "paper-17",
  "topic": "EGFR resistance",
  "cohort": "cohort-4",
  "trial": "trial-9",
  "next_hypothesis": "Compare outcome differences by resistance marker"
}
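If several agents issue the query shown in this example, a small builder keeps the workspace filter from being forgotten. The sketch only assembles the same dictionary shape used on this page; `build_chunk_query` and its defaults are illustrative assumptions.

```python
def build_chunk_query(query, workspace_id, limit=5, label="PDF_CHUNK", property_name="text"):
    """Assemble a semantic-search request scoped to one workspace."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if not workspace_id:
        raise ValueError("workspace_id is required so results stay scoped")
    return {
        "labels": [label],
        "propertyName": property_name,
        "query": query,
        "where": {"workspace_id": workspace_id},
        "limit": limit,
    }

request = build_chunk_query(
    "EGFR resistance evidence with outcome differences",
    "oncology-01",
)
```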

Python SDK

Ingest once. Ground each research loop in live structure.

Keep document parsing and domain review in your application. RushDB stores the resulting records, exposes schema, and retrieves relevant evidence for the next step.

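import os

from rushdb import RushDB

# Read the API key from the environment instead of hard-coding it.
db = RushDB(os.environ['RUSHDB_API_KEY'])

# Load a schema summary so the agent knows which labels and properties exist.
schema = db.ai.get_schema_markdown({'labels': ['PAPER', 'PDF_CHUNK', 'TRIAL', 'COHORT']}).data

# Semantic search over indexed chunk text, scoped to one workspace.
evidence = db.ai.search({
    'labels': ['PDF_CHUNK'],
    'propertyName': 'text',
    'query': 'EGFR resistance evidence with outcome differences',
    'where': {'workspace_id': 'oncology-01'},
    'limit': 5,
})

# Structured lookup: cohorts related to a BIOMARKER whose name contains "EGFR".
cohorts = db.records.find({
    'labels': ['COHORT'],
    'where': {'BIOMARKER': {'name': {'$contains': 'EGFR'}}},
    'limit': 10,
})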

Implementation blueprint

Build the medical research-loop path.

Use this sequence to connect paper chunks, trial metadata, cohort rows, and follow-up hypotheses in a research workspace.

01 Import PAPER records with PDF_CHUNK, AUTHOR, INSTITUTION, and TOPIC records
02 Import TRIAL and COHORT records with biomarker and outcome context
03 Create a managed index for PDF_CHUNK.text
04 Retrieve evidence by workspace, topic, and semantic query
05 Persist each follow-up HYPOTHESIS with SUPPORTING_EVIDENCE links
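The last step in this sequence can be sketched as plain data before any SDK call is involved. The snippet builds a HYPOTHESIS record and one SUPPORTING_EVIDENCE link per cited record. The `evidence_ids` argument, the `hypothesisId` field, and the link tuple layout are assumptions for illustration; writing them to RushDB would go through the record-creation and relationship calls of your SDK version.

```python
import uuid

def build_hypothesis(title, workspace_id, evidence_ids, status="draft"):
    """Return a HYPOTHESIS record and its SUPPORTING_EVIDENCE links."""
    if not evidence_ids:
        raise ValueError("a hypothesis needs at least one cited evidence record")
    record = {
        "label": "HYPOTHESIS",
        "hypothesisId": f"hyp-{uuid.uuid4().hex[:8]}",
        "title": title,
        "status": status,
        "workspace_id": workspace_id,
    }
    # One (source, relationship type, target) triple per cited record,
    # de-duplicated while preserving the order in which evidence was cited.
    links = [
        (record["hypothesisId"], "SUPPORTING_EVIDENCE", evidence_id)
        for evidence_id in dict.fromkeys(evidence_ids)
    ]
    return record, links

hypothesis, evidence_links = build_hypothesis(
    "Compare outcome differences by resistance marker",
    "oncology-01",
    ["p1-c4", "cohort-4", "p1-c4"],
)
```

Refusing to build a hypothesis without evidence keeps uncited claims out of the workspace by construction.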

Build path

Keep PDF extraction and chunking outside RushDB.
Connect evidence across PAPER, TRIAL, COHORT, BIOMARKER, and OUTCOME records.
Persist hypotheses as records linked to the evidence used to create them.
Present this as research support, not clinical decision automation.

Relevant docs

Read the exact primitives behind this pattern.

These RushDB docs pages map directly to this blueprint: JSON import, semantic search, and the Relationships API.

Import JSON data

Import papers, chunks, cohorts, biomarkers, trial rows, and hypotheses as connected records.

Semantic search

Retrieve relevant research evidence by meaning while preserving workspace and topic filters.

Relationships API

Connect hypotheses to supporting evidence, cohorts, biomarkers, papers, and trial context.

How it works

Build the smallest useful workflow first.

Normalize useful source records

Import paper metadata, chunked text, cohort rows, and trial summaries in shapes the research workflow can inspect.

Discover and retrieve

Load schema, search indexed text, and enrich matching evidence with related research records.

Persist the next question

Write the follow-up hypothesis and its cited evidence back to the workspace so the next loop starts with durable context.
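Put together, the three steps form a loop that can be exercised without a database. In the sketch below, `search`, `persist`, and `propose` are injected callables standing in for your RushDB calls and your agent; the in-memory stand-ins exist only to make the example runnable, and none of these names come from the SDK.

```python
def research_loop(question, workspace_id, search, persist, propose):
    """Run one pass: retrieve evidence, propose a follow-up, persist it with citations."""
    evidence = search(question, workspace_id)
    if not evidence:
        return None  # nothing to cite, so nothing is persisted
    hypothesis = {
        "title": propose(question, evidence),
        "status": "draft",
        "workspace_id": workspace_id,
        "evidence_ids": [item["chunkId"] for item in evidence],
    }
    persist(hypothesis)
    return hypothesis

# In-memory stand-ins so the loop can be run locally.
CHUNKS = [{
    "chunkId": "p1-c4",
    "workspace_id": "oncology-01",
    "text": "Response varied after acquired resistance...",
}]
store = []

def fake_search(question, workspace_id):
    # Stand-in only: returns every chunk in the workspace, ignoring the question.
    return [chunk for chunk in CHUNKS if chunk["workspace_id"] == workspace_id]

def fake_propose(question, evidence):
    return f"Follow up on: {question}"

first = research_loop("EGFR resistance", "oncology-01", fake_search, store.append, fake_propose)
second = research_loop("EGFR resistance", "neurology-02", fake_search, store.append, fake_propose)
```

Injecting the callables keeps the loop testable and makes it easy to swap the stand-ins for real search and write calls later.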

Know where it fits.

Research support, not medical advice

This pattern organizes and retrieves evidence for research workflows. It does not replace clinical validation, domain review, or medical judgment.

Keep source provenance visible

Return cited paper, cohort, trial, and topic records with the answer instead of flattening evidence into an uncited summary.
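One way to keep provenance attached is to return an answer object that carries its citations rather than a bare string. The `CitedAnswer` and `Citation` classes and their field names are hypothetical, chosen to mirror the labels used on this page.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Citation:
    label: str       # e.g. "PAPER", "COHORT", "TRIAL", "TOPIC"
    record_id: str

@dataclass
class CitedAnswer:
    summary: str
    citations: list = field(default_factory=list)

    def render(self):
        """Render the summary followed by one bracketed reference per cited record."""
        refs = ", ".join(f"[{c.label}:{c.record_id}]" for c in self.citations)
        # Flag answers with no citations instead of presenting them as grounded.
        return f"{self.summary} {refs}" if refs else f"{self.summary} [uncited]"

answer = CitedAnswer(
    "Response varied after acquired resistance.",
    [Citation("PAPER", "paper-17"), Citation("COHORT", "cohort-4"), Citation("TRIAL", "trial-9")],
)
rendered = answer.render()
```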

Questions developers ask.

Next step

Start with one focused workflow.

Related use cases

Explore all use cases

RAG and knowledge bases

Retrieve related context, not only the nearest similar chunks.

Explore graph-aware RAG

Legal contract-review memory

Persist contract facts and retrieve changed clauses with citation-linked context.

Explore contract-review memory