Saturday, January 26, 2008

On importing MATLAB figures into papers and presentations

[2008.02.01: I've had to substantially re-write this post, after I realized that my first attempt at it was horrendously convoluted and confusing. Which, in all fairness, is exactly what my own state of mind was at the time, after the hours – days! – of wrestling with the subject matter of this post.]

As a [graduate] student in engineering, I have, of course, had to write innumerable papers and reports, and deliver just as many presentations. Since most --- almost all, in fact --- of the work described in them was performed using MATLAB, the question of how best to import figures created in MATLAB into the software used to create those papers and presentations is a very important one. It is also a painful one: MATLAB, Microsoft Office (the most prevalent choice for creating documents today) and Adobe's PDF standard all handle image data differently (not to mention all the various `standalone' image formats --- [Windows] bitmap, [Windows] Enhanced Metafile (EMF), CompuServe GIF, JPEG, TIFF, PNG... the list could go on forever), and the usual result is hours upon hours of exasperating, infuriating, hair-rending, cerebro-circuit-overloading trial and error. (MS Word's own quirks when handling non-text objects and/or relatively complex documents only add to the pain, which is why I, like many others, have switched to using LaTeX --- but that's another story.) It certainly was for me, and I know it has been for many others too. (See, for example, "Matlab copy-and-paste: Still broken after all these years" by Frinkytown, "How do I import MATLAB graphics into Microsoft Office 97?" by the MathWorks, this Matlab script to create a Word document from a Matlab figure, by James Lynch, and yet more search results returned from googling "matlab figures Word".) So, to save anyone else the ordeal of re-inventing my wheel, here is what I have learned:

(Note: The following discussion applies to figures with `line' plots. For figures with shaded plot elements, like those created with surf, mesh or pcolor, see the last paragraph of this post. Also, my system is WinXP Home / SP2, MS Office 2000 and MATLAB v6.5 (R13).)

First off, MATLAB’s own display format: figures on-screen are created at a resolution of 96 dpi, at a default size of 560x420 pixels (5.833”x4.375”). I’m not quite sure how this happens, since my laptop screen itself seems to have a resolution of 117 dpi (1400x1050 pixels, 12”x9”). Maximized, the figure becomes 1400x944 pixels, but still at 96 dpi.
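These numbers can be checked from the MATLAB prompt. The following is only a sketch, assuming the handle-graphics `get`/`set` syntax and a freshly created default figure; the variable names are my own. As far as I can tell, MATLAB takes the dpi value from the operating system's display setting and not from the physical panel, which would explain the mismatch with the 117 dpi measured above.

```matlab
% Query the resolution MATLAB assumes for the screen, and the figure size in pixels.
dpi = get(0, 'ScreenPixelsPerInch');   % reported as 96 on the system described here
fig = figure;                          % new figure at the default size
set(fig, 'Units', 'pixels');
pos = get(fig, 'Position');            % [left bottom width height]

% Convert the pixel size to inches at that assumed resolution.
size_in = pos(3:4) / dpi;
fprintf('%d x %d pixels = %.3f" x %.3f" at %d dpi\n', ...
        pos(3), pos(4), size_in(1), size_in(2), dpi);
```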

If the figure is exported as a png file – or any other bitmapped (raster) format, I’m assuming – the resolution gets increased to 150 dpi. The size of the image in inches is set by the Page Setup settings of the figure window (default: 8”x6”). The size in pixels adjusts accordingly (default: 1200x900). Similarly, other Page Setup settings also apply, such as Recomputing vs. Keeping screen limits and ticks. Linewidths remain about the same, but the sizes of the text fonts and the line markers become smaller in proportion to the rest of the image. The dotted gridlines (which are actually very tiny dashes in the original MATLAB figure, with a default linewidth of 0.5) become true dotted lines – which is nice.
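The same export can be done from the command line, which makes the size and resolution explicit. This is a minimal sketch, assuming the default Page Setup values (an 8"x6" PaperPosition) and a hypothetical file name `myfigure.png`; the sample plot is only there to make the snippet self-contained.

```matlab
% Export a simple line plot as a PNG, 8" x 6" on paper at 150 dpi.
x = 0:0.1:10;
figure;
plot(x, sin(x), 'o-');
grid on;

set(gcf, 'PaperUnits', 'inches');
set(gcf, 'PaperPositionMode', 'manual');      % use PaperPosition, not the on-screen size
set(gcf, 'PaperPosition', [0.25 2.5 8 6]);    % [left bottom width height] in inches

print('-dpng', '-r150', 'myfigure.png');      % requests 8*150 by 6*150 pixels

% Read the file back to see what size was actually written.
info = imfinfo('myfigure.png');
fprintf('%d x %d pixels\n', info.Width, info.Height);
```

Font and marker sizes are specified in points, so they keep their absolute size while the figure grows from roughly 5.8" wide on screen to 8" wide on paper; that is most likely why they look smaller relative to the rest of the image. Setting PaperPositionMode to 'auto' instead should export the figure at its on-screen size.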

The results are probably the same with other bitmap-format filetypes, I’d imagine. With a tif it’s the same, except that the filesizes (in bytes) are an order of magnitude larger (two orders of magnitude larger for an uncompressed tif), with no improvement in quality (that I can observe). Jpgs are about five times larger than the pngs, again with no improvement in quality (possibly even a slight deterioration, depending on the degree of `lossiness’ chosen). Gifs are the same as pngs, and we’re not supposed to use those any more. So, in fact, with the exception of eps files (and I’ll come back to those later), pngs really are the smallest filesize format out there, smaller even than the emfs.

If the figure is exported as an emf file (which is a vector format), the [relative] sizes of all the figure objects remain the same, so the image looks *exactly* like the original figure in MATLAB. The size of the image in pixels is the same as the size of the original figure on the screen at the time it was exported. However, the resolution increases to 120 dpi (so the quality improves), and the size in inches therefore gets proportionately reduced. Strangely enough, if this emf file is imported/inserted into MS Word, the resolution of the inserted image falls back to 96 dpi (and the size in inches therefore increases back to that of the original MATLAB figure). Similarly, and similarly strangely, if the emf image is converted into the png format (which is easily done using a program like Irfanview), the resolution falls to 94 dpi. And, for some still further strangeness: if this emf2png file is inserted/imported into MS Word, the resolution once again returns to the original 96 dpi!

Coming now to the quality of the various image formats when imported into the various document formats:

In a MS Word .doc file:
png: image quality is good (still at 150 dpi?), but fontsizes and markersizes need to be increased to look the same as the original MATLAB figure. The true-dotted gridlines are almost invisible – which is nice.
emf2png: quality is not as good (96 dpi) as the imported pngs, although the sizes of the objects are `authentic’. Pretty sorry-looking, in fact – everything in the image is pixelated and `choppy’ – but just about passable in a pinch.
emf: quality is the best, even though the resolution is only at 96 dpi, and the image looks exactly like the original figure. Much better than the emf2png, and also slightly better than a png even when it’s had the font/markersizes increased. _This_ is the best of the three choices, when working with Word documents.
Coloured lines in imported pngs are slightly faded (when printed out on a B/W printer), but those in imported emfs and emf2pngs are still quite visible.
In addition to importing an external image file, when working with Word you also have the option of copying-and-pasting the figure image directly from MATLAB, via the [Windows] Clipboard. Using this method, a figure that is copied in metafile-format (“Preserve information; metafile if possible”), and with the “match screen-size” option, results in an image that has an increased resolution of 120 dpi (like an emf file before it is imported), and looks exactly like the imported emf-file image (96 dpi). If the image is copied in bitmap-format, it comes in at 96 dpi and looks just like the imported emf2png image.
One final word (no, pun not intended): “direct-copying” in metafile format is sometimes preferable to importing an emf file, even though they look the same, because in some versions of Word (Word 2000, for example, but not Word 2003), the latter method can result in the individual letters of the y-axis-label --- but not the entire label as a single unit --- getting rotated 90deg clockwise, resulting in a vertical stack of horizontally-aligned letters.
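Both routes into Word, the emf file and the Clipboard copy, can also be driven from the command line. This is a sketch only, for MATLAB on Windows; the file name `myfigure.emf` is hypothetical, and I am assuming the older behaviour in which `print` with a `-dmeta` or `-dbitmap` device and no file name copies to the Clipboard. Newer MATLAB releases document an explicit form, print('-clipboard', '-dmeta'), for the same thing.

```matlab
% Route 1: write an Enhanced Metafile to disk, then insert it into Word as a picture.
set(gcf, 'PaperPositionMode', 'auto');   % export at the on-screen size
print('-dmeta', 'myfigure.emf');         % metafile output is Windows-only

% Route 2: copy straight to the Windows Clipboard by omitting the file name.
print('-dmeta');         % metafile (vector) copy
% print('-dbitmap');     % bitmap copy instead
```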

.doc --> .pdf, using the Adobe pdfmaker macro:
pngs: look terrible, whether font/markersizes have been increased or not. The quality deteriorates very badly in the conversion from doc to pdf.
emf2png: quality deteriorates slightly, not as much as the `true’ pngs; so they look a little better, but not by very much. They’re passable, but just barely so.
emfs (and metafile-format direct-copies): look the best, although the little dashes of the gridlines get converted into big dashes, which is not very pretty at all. (Increasing the linewidth of the gridlines in the original figure from 0.5 to 1.0 may help – I’m not sure; I haven’t tried it yet.)
For all three formats, coloured lines are almost completely invisible when printed out on a B/W printer.
(All this is with the default pdfmaker settings. Maybe tweaking those will help; I've never tried it, so I don't know. Not sure if using Distiller instead of pdfmaker makes any difference, either.)

.tex --> .pdf:
pdflatex only recognizes images in jpg, png and pdf formats, so now it’s not possible to import those good-quality emfs or even directly copy from MATLAB via the Clipboard. So we’re stuck with pngs. On the other hand, we can now use figures that have been exported as EPS – Encapsulated PostScript – files, and which can be converted to pdf, either on the fly during `texification’ by Heiko Oberdiek’s epstopdf package, or previously by some other method.
png: looks decent, like it looks in a doc before it’s converted to pdf.
emf2png: also looks like it does in a doc before conversion to pdf.
Therefore, of the two, png is better. Font- and markersizes need to be increased, though, of course.
eps: the best of the lot. Quality is better than that of png. (I don’t know what the resolution of these images is – it seems to be between 110 and 120 dpi.) Here too, though, font/markersizes need to be increased prior to exporting from MATLAB, by possibly the same amount as for the pngs. The gridlines are true dotted lines. Also, the postscript `bounding box' is drawn tighter around the axes/plot, so that, for the same overall figure width, the graph/image is larger --- unless you specify the -loose option while using the MATLAB print command.
Coloured lines in all three formats are faded as much as, or even more than, in doc2pdfs.

The -loose option tells the postscript driver to use the figure's PaperPosition property value as the Bounding Box. This is important because, when multiple figures are to be aligned side-by-side in the final document, it is the bounding boxes that get aligned, and, if the different figures have axis labels with different sizes or positions, then not using the -loose option results in differently-sized bounding boxes, with the result that the axes of the different graphs don't line up with each other.
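A rough sketch of how this might look for two figures meant to sit side by side. The two plots, the axis labels and the file names are made up for illustration; the point is only that both figures get the same PaperPosition and are printed with -loose.

```matlab
% Two figures whose y-axis labels and tick labels take up different amounts of room.
x = linspace(0, 2*pi, 50);
fig1 = figure; plot(x, sin(x));       ylabel('sin');
fig2 = figure; plot(x, 1e4*cos(x));   ylabel('a much longer axis label');

figs  = [fig1 fig2];
names = {'fig_left.eps', 'fig_right.eps'};   % hypothetical file names
for k = 1:2
    % Identical PaperPosition for both figures ...
    set(figs(k), 'PaperUnits', 'inches');
    set(figs(k), 'PaperPosition', [0.25 2.5 8 6]);
    figure(figs(k));                         % make it the current figure
    % ... and -loose, so the bounding box comes from PaperPosition
    % instead of being cropped tightly around each plot.
    print('-depsc2', '-loose', names{k});
end
```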

Thus, in shorthand, for each of the three document formats, the order of preference for the image formats would be:

doc: emf/direct-copy > png (sizes increased) > emf2png
doc2pdf: emf/direct-copy > emf2png > png (whatever the size)
tex2pdf: eps2pdf > png (sizes increased) > emf2png

Finally, coming to the scaling of the various image objects (and the images themselves), these are the settings that I have found to work for me:

For a presentation (Powerpoint), using emfs:
Resize figure window to 600 x 450 for 2 graphs per slide
(For 1 graph per slide, magnify --- in Ppt --- to 125%)
Line thickness - 3
Arrow thickness - 2
Arrow text - 16, bold
Axes labels - 14 or 16, bold
Tick labels - 14, bold
Marker size - 10 (15 for x)

For a paper (doc, doc2pdf), using emfs:
Resize (in Matlab) to 600 x 450
Resize (in Word) to 65% (width = 3.17") for double-column size
Inline with text (not floating over)
Line thickness - 2
Legend font - 10 point, normal (default)
Axes labels - 14, bold
Tick labels - 12, normal
Marker size - 6 (10 for asterisks, pentagrams and hexagrams)

For a paper (tex2pdf), using pngs:
(To fit two images side-by-side on a single-column page)
Either don't resize in Matlab (default = 560 x 420), and then scale the width in TeX to 0.5\linewidth, or resize in Matlab to 600x450 and then scale in TeX to 0.45\linewidth.
Line thickness - 1
Legend font - 14 point, normal
Axes labels - 18, normal
Tick labels - 14, normal
Marker size - 8 (12 for asterisks, pentagrams and hexagrams)

For a paper (tex2pdf), using eps2pdfs:
Same figure scaling as above, and:
Line thickness - 1
Legend font - 10 point, normal
Axes labels - 14, normal
Tick labels - 11, normal
Marker size - 6 (10 for asterisks, pentagrams and hexagrams)
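Settings like these can be applied in a few lines instead of clicking through the property editor each time. The following is a minimal sketch using the eps2pdf values listed above; the sample data, the legend entries and the file name are made up, and the numbers can be swapped for any of the other lists.

```matlab
% A sample plot to apply the settings to.
x = 0:0.25:10;
figure;
plot(x, sin(x), 'o-', x, cos(x), '*-');
xlabel('x'); ylabel('y');
ax = gca;                                   % grab the axes before adding a legend

% Tick labels first: changing the axes font size can also affect the labels.
set(ax, 'FontSize', 11, 'FontWeight', 'normal');

% Axes labels.
set([get(ax, 'XLabel'), get(ax, 'YLabel')], ...
    'FontSize', 14, 'FontWeight', 'normal');

% Line thickness and marker size, with larger markers for asterisks.
set(findobj(ax, 'Type', 'line'), 'LineWidth', 1, 'MarkerSize', 6);
set(findobj(ax, 'Type', 'line', 'Marker', '*'), 'MarkerSize', 10);

% Legend font.
lg = legend('sin', 'cos');
set(lg, 'FontSize', 10, 'FontWeight', 'normal');

print('-depsc2', 'myfigure.eps');           % hypothetical file name
```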

For figures involving shading (pcolor, surf, mesh plots, etc.), there are still more things to consider, like the way in which MATLAB renders the image (Painters vs. Zbuffer vs. OpenGL), the colour of the background (transparent -- which doesn't always work out that way -- vs. white vs. `figure-color'), and, again, the method for copying/exporting the image (bitmap vs. metafile)... Oh, and the version of Word or Powerpoint that you're importing into. And each choice interacts with each other choice in completely unpredictable ways. (Of course.) I'll get to these another day.

------------------------------------------------------------------

2008.08.15 (Another Day):

For figures without shading, MATLAB's default renderer is Painters, which produces vector output. For figures with shading, the default renderer seems to be either ZBuffer or OpenGL, both of which produce bitmap (raster) output. (On my computer, it's ZBuffer.)
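To see which renderer MATLAB has picked for a given figure, or to override the choice, something like the following should work. It is a sketch that assumes the figure's Renderer and RendererMode properties; note that, as far as I know, the ZBuffer renderer has been dropped from newer MATLAB releases.

```matlab
% A shaded plot, so that MATLAB has a reason not to pick Painters.
fig = figure;
surf(peaks(25));
shading interp;

auto_choice = get(fig, 'Renderer');       % what MATLAB chose on its own
mode        = get(fig, 'RendererMode');   % 'auto' until Renderer is set by hand
fprintf('Renderer: %s (mode: %s)\n', auto_choice, mode);

% Force the vector renderer for this figure.
set(fig, 'Renderer', 'painters');

% The renderer can also be chosen for a single print call only, e.g.
% print('-depsc2', '-painters', 'myfigure.eps');
```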

According to MATLAB, when exporting figures to image files (this includes using the "print" command), the default output resolution of the image is:
- 150 dpi for figures in image formats, and when using the ZBuffer or OpenGL renderers
- screen resolution for metafiles
- 864 dpi otherwise (e.g., eps figures using the Painters renderer)

Since figures, whether shaded or not, come out at 150 dpi when exported as a png/bitmap-file, this would imply that MATLAB always uses the ZBuffer/OpenGL renderer instead of Painters for this export option.

Now, the method I've been using for my tex-->pdf documents, for images with shading, is to export the images as png files, with all the default options/settings. This means, in the Page Setup options, to:
- Use manual size and position (8"x6" => 1200x900 pixels @ 150dpi)
- Force white background
- Use the default figure rendering method (which is ZBuffer on my computer)

But:
- Keep screen limits and ticks (instead of the default "Re-compute")
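For reference, here is roughly how I would expect those Page Setup choices to map onto figure and axes properties. This is a sketch under that assumption, not a confirmed one-to-one mapping; the sample plot and the file name `shaded.png` are made up.

```matlab
% A sample shaded plot.
figure;
pcolor(peaks(40));
shading interp;
ax = gca;

% "Use manual size and position": the default PaperPosition is 8" x 6".
set(gcf, 'PaperPositionMode', 'manual');

% "Force white background" (this is already the default value).
set(gcf, 'InvertHardcopy', 'on');

% "Keep screen limits and ticks": freeze the limits and ticks as they are on screen.
set(ax, 'XLimMode', 'manual', 'YLimMode', 'manual', ...
        'XTickMode', 'manual', 'YTickMode', 'manual');

% No -r option: image formats default to 150 dpi, per the list above.
print('-dpng', 'shaded.png');
```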

The same method can also be used for importing the figures into MS Word (and on to pdf), but now there's also the option of directly copying and pasting from MATLAB into Word.

If the figure is copied in metafile format, or "Preserve information (metafile if possible)" (found under Copy Options), the text comes out looking nice, but the shading becomes blocky --- i.e., the "facecolor" property of the shaded surface gets set to the default "flat" (faceted) before the copy occurs. (This can also be set by Matlab's "shading" command. I usually use the "interp" --- for interpolated --- option, because it gives a smoother, nicer look.) This can be remedied, to an extent, by decreasing the step size of the surface matrix.
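A small sketch of that remedy, using the built-in peaks function as stand-in data. The two step sizes are arbitrary values chosen for illustration; with flat shading, the finer grid gives smaller colour blocks, so a metafile copy looks less blocky.

```matlab
% Hypothetical step sizes, for illustration only.
step_coarse = 0.25;
step_fine   = 0.05;

[Xc, Yc] = meshgrid(-3:step_coarse:3);
[Xf, Yf] = meshgrid(-3:step_fine:3);

% Coarse grid with flat shading: visibly blocky.
figure;
pcolor(Xc, Yc, peaks(Xc, Yc));
shading flat;

% Fine grid with flat shading: the blocks are small enough to pass for smooth.
figure;
h = pcolor(Xf, Yf, peaks(Xf, Yf));
shading flat;

% For bitmap or png export, interpolated shading can be used directly:
% shading interp;
```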

If the image is copied in bitmap format, then the interpolation of the shading, if set, is retained, but now the text gets pixelated and choppy. This can be fixed, to an extent, by increasing the text fontsize and changing the fontweight to "bold".

The third option --- importing a 150-dpi png image into Word --- seems to be the best compromise solution for this trade-off: interpolated shading is retained, and the text, while not as flawless as in the pure-vectorized metafile (or emf), is still better than in the bitmap-format copy.

When the .doc file is converted to .pdf, all three formats suffer a slight, and roughly equal, deterioration. The imported-png image remains the best option, in my opinion.

A fourth option (for tex-->pdf users), is exporting the shaded figure as an eps file, but this results in ugliness, and is not an option I would consider.

Wednesday, January 16, 2008

Haute couture, graduate-student style

East met West in my attire today. Spurning my usual engineering-graduate-student garb of grubby jeans and a white T-shirt (typically obtained for free at some point during my collegiate career) for something more formal, and more appropriate for the American Helicopter Society dinner meeting that I'm attending tonight, I stepped out of the house nattily dressed in a Nehru jacket, trench coat and cowboy hat.

Yes, you read that right. Nehru jacket. Cowboy hat. Oh, and a tie elegantly patterned with a flotilla of little B-52s.

The Nehru jacket because I wanted to try something different from my usual double-breasted blazer --- and lo and behold, it actually did look pretty damn good! The trench coat because my winter jacket is a bit too technical-looking for a formal dinner event (and also because it's fun to wear something I've never worn before). And, topping it all (ha ha!), the Western broad-brim because it's winter, and it's cold, and I need to keep my mostly keratin-free noggin warm somehow, and a woolly toque (`beanie', or `winter hat' for anyone not Canadian) would have been just waaay too incongruous with the rest of my attire. So, in the absence of any other formal-looking hat (maybe I should get myself one. And some leather gloves too, while I'm about it.), my very warm --- 100% wool --- and very stylish cowboy hat it had to be.

I'm quite stoked about it --- am grinning away idiotically just thinking about it, and the reactions of people who encountered me on my way to work. Dressing up is fun! And, if you're going to try out something bizarre, a college campus is the best place to do it. I'm never going to be able to walk around with one side of my face clean-shaven and the other side with a week's growth of facial hair again, once I get a real job. *sigh*

;)