Getting rid of Type 3 fonts from gnuplot outputs
I recently had a small odyssey with PDF and fonts when preparing camera-ready versions of my two most recent papers. It started with the following error message from the HotCRP format checker: “Bad font: unnamed Type3 fonts first referenced on page X”.
I could not even comprehend the error message at first sight: what is a Type 3 font? After some googling, I learned that the “types” come from PostScript, and that a Type 3 font describes its glyphs with arbitrary drawing commands, very often plain bitmaps, instead of the hinted vector outlines that you and I would expect. Bitmap glyphs are made for a specific device (say, a 300-DPI printer) and naturally look ugly on monitors, even more so when zoomed in or out. So, HotCRP and most publishers (rightfully) ban such fonts from camera-ready papers.
I was still slightly confused, though. I had always thought that PDFs are basically vector pictures. Why do they need to care about fonts? It turns out that the text is actually not lost when typeset into a PDF file. The text is encoded (with a standard or nonstandard encoding) and stored in the PDF, accompanied by a hint on which font to use to display it. It is the PDF viewer’s job to draw the glyphs of the text on the screen using a proper font. Note that the font itself may or may not be embedded in the PDF file. If it is, the embedded font will always be used. If not, the viewer will choose a locally available font as a substitute. This font-embedding business is another source of trouble by itself. Publishers (again, rightfully) want final papers to look the same on all devices regardless of what fonts are locally available, so they generally require all fonts to be embedded, which may or may not be the default for LaTeX and gnuplot. But let’s get back to Type 3 fonts for now.
So, the error message basically says “I found ugly Type 3 fonts used in the PDF”. The first question is how they got in there. The page number in the error message and some tweaking led me to believe the problem came from figures generated by gnuplot. Another round of googling led me to a bunch of other users sharing the same problem.
Solving the problem
Usually, the diagnosis from Stack Overflow would be: gnuplot on macOS does strange stuff and uses Type 3 fonts for spaces in captions and labels. It does not show this behavior on Linux. Someone did file a ticket for this bug, but unfortunately, the developer pointed out that gnuplot does not handle fonts by itself, and suggested looking at its dependencies. This looked like a rabbit hole to me. The suggested solution is usually to re-generate the plots on Linux.
As a side note, if you would like to see for yourself that there are indeed Type 3 fonts in the document, you may install the pdffonts tool from the poppler package (available in Homebrew). Then, run pdffonts your-figure.pdf and you should see something like this:
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
[none]                               Type 3            Custom           yes no  yes      6  0
FDCBUM+Helvetica                     TrueType          WinAnsi          yes yes yes      7  0
CQDHZB+Helvetica                     TrueType          WinAnsi          yes yes yes      8  0
[none]                               Type 3            Custom           yes no  yes      9  0
If you embed this figure into a paper, the PDF output of the paper will contain Type 3 fonts.
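When there are many figures to check, the Type 3 test can be scripted. The following is only a minimal sketch: count_type3 is a hypothetical helper name, and it assumes the column layout shown above, with a font name that contains no spaces followed directly by the font type.

```shell
#!/bin/sh
# count_type3 (hypothetical helper): reads `pdffonts` output on stdin and
# prints how many of the listed fonts have the type "Type 3".
count_type3() {
    # Skip the two header lines. On each remaining row the font name is
    # field 1 (assumed to contain no spaces), so "Type 3" shows up as
    # fields 2 and 3.
    awk 'NR > 2 && $2 == "Type" && $3 == "3" { n++ } END { print n + 0 }'
}

# Usage (assumes pdffonts from poppler is installed):
#   pdffonts your-figure.pdf | count_type3
```

For the sample listing above, this would print 2.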
As mentioned before, the optimal solution is to re-generate the figures and try not to let Type 3 fonts sneak in. However, that is sometimes difficult or impossible. For example, I do not even know why gnuplot on macOS would use Type 3 fonts. Or, I may have figures for which I have lost the generating script. Or, I may not want to risk the re-generated plot looking different from the original, peer-reviewed version. It would be great if there were a way to fix the PDF plots I’ve already got.
The solution is actually simple. I just remove all fonts from the figure. Recall that fonts are there to help show text, and there is little text in a figure. So, we can simply draw the plot as a pure vector image, with text rendered into vector shapes. This process is called “outlining”. Some quick googling led me to the one-liner:
gs -o output.pdf -dNoOutputFonts -sDEVICE=pdfwrite input.pdf
If you run pdffonts on the output, you will notice that all fonts are gone. HotCRP should stop complaining if you embed the output into the paper.
Note that you should definitely not run this on the full paper. It will convert your paper into a huge vector picture. Although the text will look the same, the PDF viewer will treat it as plain vector shapes, and functions like search or highlight will break.
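Because the command should only ever touch figures, one way to apply it is to loop over a figure directory and write the outlined copies somewhere else, leaving the originals intact. This is a sketch under assumptions: the function name outline_figures and the directory names are made up for illustration, and only the gs invocation itself comes from the one-liner above.

```shell
#!/bin/sh
# outline_figures SRC_DIR DST_DIR   (hypothetical helper name)
# Outlines every PDF in SRC_DIR that contains a Type 3 font and writes the
# result to DST_DIR under the same file name. Originals are never modified,
# and figures without Type 3 fonts are left alone.
outline_figures() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    for f in "$src"/*.pdf; do
        [ -e "$f" ] || continue    # the glob matched nothing
        if pdffonts "$f" | grep -q 'Type 3'; then
            # Same options as the one-liner: outline all text in this figure.
            gs -o "$dst/$(basename "$f")" -dNoOutputFonts -sDEVICE=pdfwrite "$f" || return 1
            echo "outlined: $f"
        else
            echo "skipped (no Type 3 fonts): $f"
        fi
    done
}

# Usage (directory names are placeholders):
#   outline_figures figures figures-outlined
```

Writing to a separate directory keeps the original figures around in case the outlined versions turn out to look different.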
Updated June 12, 2024.
Okay, so I needed to work on another camera-ready. This time, I did not wish to outline all glyphs in the figures, but just the ones using Type 3 fonts. (In comparison, the -dNoOutputFonts option will outline all glyphs in a file.) There are many reasons. For example, outlined glyphs are just vector shapes, and applications handle them differently from actual text, which in my opinion makes them appear uglier. After some digging, I finally figured out a solution using only ghostscript. I believe this method has not been documented elsewhere.
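The key is to use the AlwaysOutline option of ghostscript, documented in the ghostscript documentation for the pdfwrite device. It tells ghostscript to outline only the fonts in a specified list. A small issue is that the Type 3 fonts in gnuplot outputs do not have a name, so I was a bit unsure about how to refer to them. But after some trial and error, I found that the following command worked. The list contains only /, which PostScript reads as the empty name, and that appears to match the unnamed fonts.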
gs -sDEVICE=pdfwrite -o out.pdf -c "<< /AlwaysOutline [/] >> setdistillerparams" -f input.pdf
To make sure it worked, we use pdffonts again.
% pdffonts input.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
[none]                               Type 3            Custom           yes no  yes      7  0
WMFXZX+Helvetica                     TrueType          WinAnsi          yes yes yes      8  0
ZNDOST+Helvetica                     TrueType          WinAnsi          yes yes yes      9  0
[none]                               Type 3            Custom           yes no  yes     10  0
% pdffonts out.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
JGONUT+Helvetica                     TrueType          WinAnsi          yes yes yes     10  0
LLMJYR+Helvetica                     TrueType          WinAnsi          yes yes yes     14  0
Notice that the Type 3 fonts with no name are gone from the output!
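The other publisher requirement mentioned earlier, that every font be embedded, can be checked from the same pdffonts listing by looking at the emb column. The sketch below is an illustration only: list_unembedded is a hypothetical helper name, and it assumes the eight-column layout shown above, with font names that contain no spaces.

```shell
#!/bin/sh
# list_unembedded (hypothetical helper): reads `pdffonts` output on stdin
# and prints the name of every font whose "emb" column is not "yes".
# Exits with status 1 if any such font is found, 0 otherwise.
list_unembedded() {
    # Count columns from the right, because the "type" column may itself
    # contain spaces ("Type 3", "Type 1C", ...). The last five fields are:
    #   emb sub uni <object number> <generation>
    awk 'NR > 2 && NF >= 8 && $(NF - 4) != "yes" { print $1; bad = 1 }
         END { exit bad + 0 }'
}

# Usage (assumes pdffonts from poppler is installed):
#   pdffonts paper.pdf | list_unembedded || echo "some fonts are not embedded"
```

For both listings above it would print nothing, since every row has yes in the emb column.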