Summary

An authenticated user can upload a malicious PDF file containing an HTML payload that is rendered unsafely in the Information Panel. This results in Stored Cross-Site Scripting (XSS).

Because Kotaemon stores the username and password in plaintext in localStorage, an attacker can use this vector to steal user credentials, escalating the impact to session hijacking and persistent compromise.

Details

Kotaemon (versions <= 0.11.0, including commit 37cdc28) does not sanitize or escape HTML when rendering content extracted from PDFs into the DOM. This affects the Information Panel and possibly other frontend components that render retrieved_content.


1. Insecure HTML Injection:

Several components render extracted content (text, tables, image URLs) from user-uploaded documents directly into the DOM without escaping.

This allows HTML/JS embedded in PDF content to be injected into the page and executed in the browser of any user who views the rendered content.
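
As a rough illustration of the missing step, the sketch below escapes extracted text before placing it into an HTML fragment. It is not Kotaemon's code: the function name `render_snippet` and the `evidence` CSS class are hypothetical, and only Python's standard `html.escape` is assumed.

```python
import html


def render_snippet(extracted_text: str) -> str:
    """Wrap extracted document text in a container, escaping it first.

    Hypothetical helper for illustration; not part of Kotaemon.
    """
    # html.escape replaces &, < and > with entities; quote=True also
    # replaces both quote characters, so the browser displays any markup
    # found in the PDF as text instead of parsing it.
    safe_text = html.escape(extracted_text, quote=True)
    return f"<div class='evidence'>{safe_text}</div>"


# A payload of the kind that could be embedded in a PDF's text layer.
payload = '<img src=x onerror="alert(1)">'
print(render_snippet(payload))
```

Escaping is only appropriate where the extracted content is meant to be shown as plain text. Components that intentionally render markup, such as tables, would likely need an allowlist-based HTML sanitizer instead.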


2. Unsafe Client-Side Credential Storage:

Kotaemon saves the username and password in plaintext in localStorage, making them readable by any injected JavaScript payload.

Relevant code: