What the … is PDF (and why do we hate it so much)?

We recently posted about file formats in general, and why they are so important in the translation industry. In that post, we mentioned that we can efficiently handle almost all document file formats in our translation processes. There is, however, always at least one exception, and we thought it would be worthwhile to briefly explain why PDF files induce such an intensely negative reaction in translators.

Open vs. read-only

Let’s quickly recap the most important point of our post on file formats: files that originate from programs like Word, InDesign or TextEdit are easily processed by our CAT software. This is because they are open. An open file format means that anyone with the right permissions can access and edit the file on any device that supports the format. It’s the reason why you’re able to send your colleague an Excel sheet, and why they can change things directly, without any additional steps. This is obviously an extremely useful feature, but it’s also a bit risky. Aside from mistakes creeping in, these kinds of files can also be damaged, especially as they are being sent back and forth via email or messaging services. This kind of damage is often called corruption.

There is more than one way to corrupt a file…

All editable text files are subject to corruption. One way around this is the Portable Document Format, or PDF. For practical purposes it is a read-only file type, meaning that the content is not meant to be changed after the file has been created. Standard PDF readers only allow for superficial changes, like highlighting or comments. Now, don’t get us wrong, we love this functionality. We recognize that it’s very important that documents can be transmitted without losing any of the content. But, and that’s a big but, PDFs are extremely difficult for the translation industry to work with. CAT software, including the one we use at Wordcraft, cannot handle read-only files, because often all it sees is a single picture and no text.

What to do?

Anyone who has ever sent us a PDF file has heard us ask if they can maybe, pretty please source an open file format instead.