Your PDF Docs Look Beautiful To Your Boss, But Your AI Thinks They're A Crime 🫆 Scene


Discover why allowing PDF tech docs to power AI-answer engines might not be your best idea — 💡
Scott Abel
May 6


There’s a quaint kind of optimism that appears when some of us start talking about AI and tech docs. It usually starts like this: “We already have thousands of pages of documentation in PDF, so the model can just read those and answer questions from them. Voila!”
I get the impulse. Really, I do. I ❤️ me a good PDF, but only when I need one.

PDFs are familiar. They look finished, feel official, and are what many tech docs teams ship, archive, email, and point to when someone asks where the user assistance lives. 
To a human reader, a well-made PDF can seem perfectly clear. 
- The heading introduces the procedure
- The steps appear in order
- The warning box is seriously hard to miss
- The diagram sits exactly where it belongs, quietly earning its keep
But when a large language model encounters that same file, it often isn’t getting the polished experience you imagine. It’s getting a reconstruction of the content through whatever extraction method sits between the PDF and the model, and that reconstruction can be a hot 🔥 mess.
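To make that concrete, here is a toy sketch in Python of how a reconstruction can go sideways. Nothing here is a real PDF library: the `Fragment` type, the `PAGE` data, and the `naive_extract` function are all made up for illustration, and real extractors are usually smarter than this. The assumption is simply that a page stores text as positioned fragments and that the extractor reads them top to bottom, then left to right, with no idea that the warning box is a separate thing.

```python
from typing import NamedTuple


class Fragment(NamedTuple):
    """A piece of text with a position on the page (hypothetical model)."""
    x: float  # distance from the left edge
    y: float  # distance from the top edge
    text: str


# Hypothetical page: a procedure in the left column, a warning box on the right.
PAGE = [
    Fragment(50, 100, "Replace the filter"),
    Fragment(50, 130, "1. Power off the unit."),
    Fragment(50, 160, "2. Remove the cover."),
    Fragment(350, 130, "WARNING: Hot surface."),
    Fragment(350, 160, "Wait 10 minutes."),
]


def naive_extract(fragments: list[Fragment]) -> str:
    """Join fragments top to bottom, then left to right, ignoring layout."""
    ordered = sorted(fragments, key=lambda f: (f.y, f.x))
    return " ".join(f.text for f in ordered)


print(naive_extract(PAGE))
```

A human sees two numbered steps with a warning beside them. In this sketch, the text handed to the model has the warning spliced between step 1 and step 2, and "Wait 10 minutes." reads like part of removing the cover.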