An authenticated user can upload a malicious PDF file containing an HTML payload that is rendered unsafely in the Information Panel. This results in Stored Cross-Site Scripting (XSS).
Because Kotaemon stores the username and password in plaintext in localStorage, an attacker can steal user credentials through this vector, escalating the impact to session hijacking and persistent compromise.
Kotaemon (versions <= 0.11.0, including commit 37cdc28) does not sanitize or escape HTML when rendering content extracted from PDFs into the DOM. This affects the Information Panel and possibly other frontend components that render retrieved_content.
Several components render extracted content (text, tables, image URLs) from user-uploaded documents directly into the DOM without escaping:
libs/ktem/ktem/utils/render.py
libs/ktem/ktem/index/file/ui.py
table/image → L414–427
libs/ktem/ktem/reasoning/simple.py
react.py
libs/kotaemon/kotaemon/indices/qa/format_context.py
This allows HTML/JS to be embedded via PDF content.
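To illustrate the missing step, here is a minimal sketch of escaping extracted text before it is placed into markup. It uses only `html.escape` from the Python standard library. The function name `render_extracted_text` and the `evidence-content` class are hypothetical and are not taken from the Kotaemon codebase; the actual fix may look different.

```python
import html


def render_extracted_text(text: str) -> str:
    """Hypothetical helper: escape document text before embedding it in HTML.

    html.escape replaces &, <, > and (with quote=True) both quote characters
    with HTML entities, so markup inside the text is displayed, not parsed.
    """
    escaped = html.escape(text, quote=True)
    return f'<div class="evidence-content">{escaped}</div>'


# Example payload of the kind that could be embedded in a PDF's text layer.
payload = "<img src=x onerror=alert(1)>"
rendered = render_extracted_text(payload)
print(rendered)
```

With escaping applied, the payload reaches the browser as inert text rather than as an `img` element. Content that must remain HTML, such as rendered tables, would instead need an allowlist-based sanitizer, which this sketch does not cover.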
Kotaemon saves the username and password in plaintext in localStorage, making them accessible to any injected JavaScript payload.
Relevant code:
libs/ktem/ktem/pages/login.py
libs/ktem/ktem/assets/js/main.js