Context
Patients often get medical reports that are hard to read. The numbers are scattered, the words are technical, and even doctors explain them differently.
So I wondered — what if an AI could summarize, explain, and answer questions directly from a patient’s own report?
That was the starting idea for Medical Reports AI, a small experiment that turned into a preregistered pilot study published on OSF.
The goal was simple:
👉 Make report reading faster and clearer for patients.
👉 Keep everything auditable, transparent, and ethical.
Hypothesis
If AI can read and summarize medical text, then:
- Patients will find information faster.
- They’ll feel more confident before doctor visits.
- They’ll trust the system if it’s transparent about what it’s doing.
This became the foundation of my preregistered study: testing whether an AI assistant could reduce time-to-find and increase user confidence when reading reports.
Development
I built everything myself over a few weekends.
🔹 Backend (Core Engine)
- Python + LangChain + OpenAI API → handled text parsing, summarization, and chat-based answers (sketched right after this list).
- LangSmith → used for auditing and prompt tracing.
- OpenAI OCR → extracted text from scanned reports.
- Backblaze B2 → cloud storage for uploaded files.
- FastAPI → exposed all endpoints as a clean REST API.
- RepoCloud → hosted the backend, keeping logs visible for debugging.
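To make the backend concrete, here is a minimal sketch of how the pieces connect: FastAPI receives an upload and passes the text through a LangChain prompt-plus-model chain. The route name, model choice, and prompt wording are illustrative rather than the production code, and OCR, B2 storage, and LangSmith tracing are left out.

```python
# Minimal sketch of the summarize endpoint: FastAPI wrapping a LangChain chain.
# Route name, model, and prompt are illustrative, not the production code.
from fastapi import FastAPI, File, UploadFile
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

app = FastAPI()
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # hypothetical model choice

prompt = ChatPromptTemplate.from_messages([
    ("system", "You summarize medical reports in plain language. "
               "Do not diagnose; flag values outside the reference range."),
    ("human", "Report text:\n\n{report}\n\nSummarize the key findings."),
])
summarize_chain = prompt | llm

@app.post("/summarize")
async def summarize(file: UploadFile = File(...)) -> dict:
    # In the real pipeline the upload first goes through OCR and B2 storage;
    # here we assume the file already contains plain text.
    # LangSmith tracing is enabled via environment variables and omitted here.
    report_text = (await file.read()).decode("utf-8", errors="ignore")
    result = summarize_chain.invoke({"report": report_text})
    return {"summary": result.content}
```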
🔹 Frontend
- Generated the first version with v0.dev, then wired it up to my API by hand.
- Designed a simple upload → chat → summary flow.
- Each answer cited text from the uploaded report to improve trust.
It wasn’t fancy, but it worked. The user could upload a report, wait a few seconds, then chat with an AI about their findings — “What’s my hemoglobin level?”, “Is this normal?”, or “What’s abnormal here?”
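The citation mechanism is simple in spirit: pick the report snippets most relevant to the question, answer only from them, and return those snippets alongside the answer so the UI can show the exact source text. Here is a rough sketch of that shape, using naive keyword overlap in place of whatever retrieval the real app runs (the function and field names are mine):

```python
# Sketch of "every answer cites its source": retrieve the most relevant
# report snippets, answer only from them, and return them with the answer.
# Naive keyword overlap stands in for real retrieval; swapping in embeddings
# would not change the response shape the frontend renders.
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
qa_prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer using ONLY the excerpts provided. "
               "If the answer is not in them, say so."),
    ("human", "Excerpts:\n{snippets}\n\nQuestion: {question}"),
])

def answer_with_citations(report_text: str, question: str, k: int = 3) -> dict:
    # Split the report into chunks and rank them by word overlap with the question.
    chunks = [c.strip() for c in report_text.split("\n\n") if c.strip()]
    q_words = set(question.lower().split())
    ranked = sorted(chunks, key=lambda c: len(q_words & set(c.lower().split())),
                    reverse=True)
    snippets = ranked[:k]

    answer = (qa_prompt | llm).invoke({
        "snippets": "\n---\n".join(snippets),
        "question": question,
    })
    # The frontend displays `citations` verbatim next to the answer.
    return {"answer": answer.content, "citations": snippets}
```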
Testing & Results
I preregistered the study before collecting data (DOI: 10.17605/OSF.IO/SPJNY).
Then I ran a within-subject pilot with 5 participants (non-medical users).
They did two sessions:
- Without AI – manually finding values in sample reports.
- With AI – using the assistant to find and interpret them.
Key outcomes:
| Metric | Without AI | With AI | Change |
|---|---|---|---|
| Time-to-find (sec) | 260.0 | 98.4 | ↓ 62.2% |
| Ease of finding (1–5) | 3.2 | 4.4 | ↑ 1.2 (significant) |
| Confidence (1–5) | 3.0 | 4.0 | ↑ 1.0 (significant) |
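(The time-to-find change comes straight from the session means: (260.0 - 98.4) / 260.0 ≈ 62.2%.)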
Post-study surveys showed:
- Trust: ~4.0
- Usability: ~3.8
- Reuse Intention: ~4.4
Users liked “how fast it finds things” and “seeing everything in one place.”
The main complaints were upload latency and “how do I know it’s accurate?”
Reflection
What worked:
✅ Fast search and summarization drastically improved the experience.
✅ The audit trail from LangSmith made debugging and verification easier.
✅ Transparency (showing report text snippets) built trust.
What didn’t:
⚠️ Upload latency — heavy PDF OCR slows the first impression.
⚠️ No explicit confidence score — users wanted reassurance about accuracy.
The biggest lesson: building a usable AI tool is less about the model and more about trust design, making sure users feel the system isn't hiding anything.
Takeaway
This project started as a weekend idea but ended up as a preregistered open pilot with real data.
It suggests that small, transparent systems can genuinely improve understanding, even in a field as sensitive as healthcare.
The next step is obvious: scale it carefully, test with a larger and more diverse group of users, and explore how explainable AI can move from lab demos to patient tools.