
News & Sentiment

#8 Why Raw JSON Appeared in News Cards and How I Sanitized Gemini Responses

Build Log

Model-output drift leaked raw payload text into cards; strict extraction and sanitization fixed it.

Gemini JSON parse, news card sanitize, LLM guardrails

1) TL;DR

2) What I Tried

I initially trusted the first-pass extraction: the text of the model's candidate was parsed directly and used as-is.

3) What Broke

Cards occasionally rendered raw JSON fragments, malformed strings, and wrapper artifacts from the model response instead of clean text.

4) Root Cause

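The variability of LLM output was not fully constrained before the text was cached and rendered, so any response that drifted from the expected shape flowed straight into the cards.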

5) Before (Code Path)

analysis pipeline
- parse candidate text directly
- weak boundary extraction
- limited write-time sanitization

6) After (Code Path)

analysis pipeline
+ JSON mode where supported
+ robust extraction for candidate payload
+ sanitize fields before cache
+ sanitize fields before serve
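To make the "robust extraction" step concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the code from this project: the function name `extract_json_object` and the sample field names are hypothetical. The idea is to scan the candidate text for the first position where a complete JSON object actually parses, rather than trusting that the whole string is JSON.

```python
import json
from typing import Optional


def extract_json_object(text: str) -> Optional[dict]:
    """Return the first parseable JSON object found in text, or None.

    Hypothetical helper: tolerates prose or code-fence wrappers around
    the payload by trying every "{" as a possible start of the object.
    """
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at `start`
            # and ignores whatever trails after it.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        # This brace did not open a valid object; try the next one.
        start = text.find("{", start + 1)
    return None


# Assumed example of a drifted response: prose plus a fenced payload.
raw = (
    "Sure! Here is the analysis:\n"
    "```json\n"
    '{"headline": "Fed holds rates", "sentiment": "neutral"}\n'
    "```\n"
    "Let me know if you need more."
)
card = extract_json_object(raw)
```

Returning `None` instead of raising keeps the caller's fallback path explicit: no parsed object means no card text is written.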

7) Evidence (Git History)

8) What I Learned

Treat LLM output as untrusted input until schema validation passes.
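A sketch of what "until schema validation passes" could look like in Python. The field names (`headline`, `summary`, `sentiment`) and the allowed sentiment labels are assumptions for illustration only; the real card schema is not shown in this post.

```python
from typing import Any

# Assumed label set, for illustration only.
ALLOWED_SENTIMENTS = {"positive", "neutral", "negative"}


def validate_card(payload: Any) -> dict:
    """Return a card containing only known, well-typed fields.

    Raises ValueError when the payload does not match the assumed schema,
    so nothing unvalidated reaches the cache or the renderer.
    """
    if not isinstance(payload, dict):
        raise ValueError("payload is not a JSON object")

    card = {}
    for field in ("headline", "summary"):
        value = payload.get(field)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"missing or non-string field: {field}")
        card[field] = value.strip()

    sentiment = payload.get("sentiment")
    if not isinstance(sentiment, str) or sentiment not in ALLOWED_SENTIMENTS:
        raise ValueError("sentiment is missing or not an allowed label")
    card["sentiment"] = sentiment

    # Unknown keys in the payload are dropped, not passed through.
    return card
```

Building a new dict from an allowlist of fields, instead of mutating the model's object, means unexpected keys can never leak into a card.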

9) Frequently Asked Questions

Can prompt tuning alone solve this?

No. Prompt tuning lowers the rate of malformed responses but cannot guarantee the output shape, so parser guardrails are still required.

Why sanitize twice?

Sanitizing before cache protects new writes; sanitizing again before serve protects legacy objects that were cached before the fix.

How does this improve GEO (generative engine optimization)?

Clean snippets are more reliable for AI citation and search previews.