Your PDF Docs Look Beautiful To Your Boss, But Your AI Thinks They're A Crime Scene

Discover why allowing PDF tech docs to power AI-answer engines might not be your best idea 💡

There’s a quaint kind of optimism that appears when some of us start talking about AI and tech docs. It usually starts like this: “We already have thousands of pages of documentation in PDF, so the model can just read those and answer questions from them. Voilà!” I get the impulse. Really, I do. I ❤️ me a good PDF, but only when I need one. To a human reader, a well-made PDF can seem perfectly clear.
But when a large language model encounters that same file, it often isn’t getting the polished experience you imagine. It’s getting a reconstruction of the content through whatever extraction method sits between the PDF and the model, and that reconstruction can be a hot 🔥 mess.