MetaPost: Direct to PDF via MPlib

Introduction

Using the Cairo graphics library (under Windows/Visual Studio) I have, with some caveats, been able to create a direct-to-PDF backend for MetaPost via the brilliant MPlib C library. Unfortunately, Cairo does not support the CMYK colour space, which is a real shame given how much discussion there has been about the need for it. I might look at using LibHaru or possibly PoDoFo instead, both of which I’ve managed to build on Windows – although I found PoDoFo somewhat difficult to build as a native Windows library. In addition, I have not yet added support for including text in the MetaPost graphics which is, of course, a pretty big omission! That’s on the “TODO” list. An example PDF is included in this post, based on the MetaPost code available on this site. If you look at the example PDF you will see it was created with Cairo 1.12.16, the latest release available at the time I wrote this post (25 October 2014).

Download PDF

Quick overview of the process

At the moment, the PDF backend seems to work well, at least with the MetaPost code I’ve tried it with (minus text, of course!). The lack of CMYK support in Cairo is a nuisance and at the moment I do a very simple, and wholly inadequate, “conversion” of CMYK to RGB, which really makes me cringe. Perhaps I might put in a “callback” feature to use other PDF libraries at the appropriate points in my C code.

MPlib itself is a superb C library and the API documentation (version 1.800) that’s currently available was a helpful start but, as a very non-expert MetaPost user, I did need to resort to John Hobby’s original work in order to understand just a little more about some MetaPost internals. In writing the PDF backend I pretty much had to go through the PostScript backend and replace PostScript output with the appropriate Cairo API calls.

The trickiest part, at least for me, was implementing management of the graphics state (as MetaPost sees it). In the PostScript backend the graphics state is managed internally by the MetaPost interpreter (MPlib). In the end, I chose instead to use MPlib’s ability to register a userdata pointer (void*) with the MetaPost interpreter. I can’t quite recall why I chose to externalise the graphics state code but I think it was to give me a bit more flexibility; either way, so far it basically works well.

I chose to build MPlib as a static Windows .lib file – no particular reason, just that’s what I prefer to do – although building a DLL is no more difficult. Much of MPlib is released as a set of CWEB files so you will need to extract the C code via CTANGLE.EXE. I use Windows and Visual Studio so, not surprisingly, I found that the MPlib C code would not compile immediately “out of the box” but a few minor (pretty trivial) adjustments to the header files (and some manual #defines) soon resolved the problems and it compiled fine after that.
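To make the CMYK point above concrete, here is a minimal sketch of the kind of naive conversion people usually mean by a “simple” CMYK to RGB mapping. It assumes the textbook formula (each RGB channel is the complement of one ink, scaled by the complement of black); the name naive_cmyk_to_rgb is hypothetical and this is not necessarily the conversion used in the backend described here. It ignores colour profiles entirely, which is exactly why such a conversion is inadequate for print work.

```c
#include <assert.h>
#include <stddef.h>

/* Clamp a colour component to the range [0, 1]. */
static double clamp01(double v)
{
	if (v < 0.0) return 0.0;
	if (v > 1.0) return 1.0;
	return v;
}

/*
 * Naive CMYK -> RGB conversion (hypothetical helper, not part of
 * MPlib or Cairo). All components are in the range [0, 1].
 * No ICC profile or ink model is involved, so colours will shift.
 */
void naive_cmyk_to_rgb(double c, double m, double y, double k,
                       double *r, double *g, double *b)
{
	assert(r != NULL && g != NULL && b != NULL);
	*r = (1.0 - clamp01(c)) * (1.0 - clamp01(k));
	*g = (1.0 - clamp01(m)) * (1.0 - clamp01(k));
	*b = (1.0 - clamp01(y)) * (1.0 - clamp01(k));
}
```

The resulting values could then be handed to Cairo with cairo_set_source_rgb(cr, r, g, b) before filling or stroking.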

A little deeper

Assuming you have a working compilation of MPlib, how do you actually use it? I won’t repeat the information available in the MPlib API documentation but will give a brief summary of additional considerations that might be helpful to others. Firstly, in my implementation I instantiate an instance of the MP interpreter like this:

	MP mp = init_metapost((void*)create_mp_graphics_state());
	if ( ! mp ) exit ( EXIT_FAILURE ) ;

where create_mp_graphics_state() is a function that creates a new graphics state; its result is cast to void* and registered as the userdata item stored in the MPlib instance – see the code for init_metapost(void* userdata) below (Note: this is a work-in-progress and the error checking is very minimal!!! :-)). Provided the initialization succeeds you will get a new MetaPost interpreter instance returned to you. As part of the initialization you have to provide a callback that tells MetaPost how to find input files – my callback is called file_finder and uses recursive directory searching: no kpathsea involved at all. One very important setting in MP_options is math_mode, which affects how MetaPost performs its internal calculations: later versions of MPlib (after 1.800) support all 4 of the possible options. As part of the initialization I also preload the plain.mp macro collection.

MP init_metapost(void* userdata)
{

	MP mp;
	MP_options * opt = mp_options () ;
	opt -> command_line = NULL;
	opt -> noninteractive = 1 ;
	opt->find_file = file_finder;
	opt->print_found_names = 1;
	opt->userdata = userdata;

	/*
	typedef enum{
	mp_math_scaled_mode= 0,
	mp_math_double_mode= 1,
	mp_math_binary_mode= 2,
	mp_math_decimal_mode= 3
	}mp_math_mode;
	*/

	opt->math_mode = mp_math_scaled_mode;
	opt->ini_version = 1;
	mp = mp_initialize ( opt ) ;
	if ( ! mp ) 
		//exit ( EXIT_FAILURE )
		return NULL;
	else
	{
		char * input= "let dump = endinput ; input plain; ";
		mp_execute(mp, input, strlen(input));
		mp_run_data * res = mp_rundata(mp);
		
		if(mp->history > 0)
		{
			// Report whatever MetaPost wrote to the terminal while loading plain.mp
			printf("Error text (%s)\n", res->term_out.data);
			return NULL;
		}
		else{
		
			return mp;
		}
	}

}
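The file_finder callback registered above is not shown in this post. As a rough idea of what such a function has to do, here is a simplified, stand-alone sketch that looks for a file in a fixed list of directories rather than searching recursively. Everything in it (demo_find_file, demo_search_dirs, the directory names) is hypothetical. As far as I know, the real MPlib callback also receives the MP instance and a file-type code and must return a string allocated with malloc, so treat that as an assumption and check the MPlib headers before adapting this.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical search path; a real program would make this configurable. */
static const char *const demo_search_dirs[] = { ".", "./mp", "./inputs" };

/* Return a malloc'd copy of s (NULL if out of memory). */
static char *demo_copy_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);
	if (copy != NULL)
		memcpy(copy, s, len);
	return copy;
}

/*
 * Locate `name` for the given fopen-style mode.
 * - For reading: return a malloc'd path to the first directory in
 *   demo_search_dirs where the file can be opened, or NULL if none.
 * - For writing: the file need not exist yet, so return a copy of name.
 * The caller owns (and frees) the returned string.
 */
char *demo_find_file(const char *name, const char *mode)
{
	size_t n = sizeof demo_search_dirs / sizeof demo_search_dirs[0];
	size_t i;

	assert(name != NULL && mode != NULL);
	if (mode[0] != 'r')
		return demo_copy_string(name);

	for (i = 0; i < n; i++) {
		size_t len = strlen(demo_search_dirs[i]) + 1 + strlen(name) + 1;
		char *path = malloc(len);
		FILE *f;
		if (path == NULL)
			return NULL;
		snprintf(path, len, "%s/%s", demo_search_dirs[i], name);
		f = fopen(path, "rb");
		if (f != NULL) {
			fclose(f);
			return path;	/* found: hand ownership to the caller */
		}
		free(path);
	}
	return NULL;
}
```

A recursive search like the one described above would replace the fixed directory list with a directory walk, which is platform-specific and so is left out of this sketch.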

Got a working instance, now what?

Once you have a working MP instance the next task is, of course, to feed it some MetaPost code (using mp_execute(mp, your_code, strlen(your_code));) and to check whether MetaPost successfully interpreted your_code. Now I’m not going to give full details of the checks you need to perform as this is pretty routine and the API documentation contains enough help already. In essence, if MPlib was able to run your MetaPost code successfully, it stores the individual graphics (produced from your_code) as a linked list of so-called edge structures (mp_edge_objects). Each edge structure (mp_edge_object) is a graphic that you want to output and, in essence, each edge structure results from the successful execution of the code contained in each beginfig(x) ... endfig; pair. In turn, each edge structure (individual graphic to output) is itself made up from smaller building blocks of 8 types of fundamental graphics object (mp_graphic_object). Each mp_graphic_object has a type to tell you what sort of graphic object it is so you can call the appropriate function to render it – as the equivalent PostScript, PDF, PNG, SVG etc.

In summary

If your MetaPost interpreter instance is called, say, mp, then to gain access to the linked list of edge structures you do something like this:

 
mp_run_data * res = mp_rundata(mp);
mp_edge_object* graphics = res->edges;

Note that the edge structures form a simple linked list, and the components within each individual edge structure (the mp_graphic_object objects) are likewise followed link by link until you reach NULL, as the loop below does. The knots that make up each path, however, form a circularly-linked list, so you have to be careful to check when you get back to the start of the path: see the API docs for an example. In closing, here’s the loop from my code to process an individual edge structure into PDF – not including all the additional functions to process the various types of the mp_graphic_object objects.
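Since a circularly-linked list never ends in NULL, the usual trick is to remember where you started and stop when you come back round. The following is a minimal sketch of that pattern using a hypothetical demo_node type; it is not an MPlib structure, just a stand-in with a next pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node standing in for an element of a circular list. */
typedef struct demo_node {
	double x, y;
	struct demo_node *next;	/* the last node points back to the first */
} demo_node;

/*
 * Count the nodes of a circular list. A plain `while (p != NULL)` loop
 * would never terminate here, so we use do/while and compare against
 * the starting node instead.
 */
size_t demo_count_circular(const demo_node *start)
{
	size_t count = 0;
	const demo_node *p = start;

	if (start == NULL)
		return 0;	/* empty list */
	do {
		count++;
		p = p->next;
		assert(p != NULL);	/* a truly circular list contains no NULL link */
	} while (p != start);
	return count;
}
```

The do/while form matters: the body must run once before the comparison, otherwise the loop would stop immediately because p starts out equal to start.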

int draw_mp_graphic_on_pdf(mp_edge_object* single_graphic, cairo_t *cr)
{
	mp_graphic_object* p;
	MP mp = single_graphic->parent;

	p = single_graphic->body;

	// Inherited this weirdness from core MP engine...
	init_graphics_state(mp, 0);
	// Here we are looping over all the objects in a single graphic
	// resulting from a beginfig(x) ... endfig pair
	while (p != NULL)
	{
		// Bring the Cairo state in line with what this object needs
		mp_gr_fix_graphics_state(mp, p, cr);
		switch (gr_type(p))
		{
			case mp_fill_code:
			{
				if (gr_pen_p((mp_fill_object*)p) == NULL)
				{
					// No pen: plain fill
					//mp_dump_solved_path(gr_path_p((mp_fill_object*)p));
					cairo_gr_pdf_fill_out(mp, gr_path_p((mp_fill_object*)p), cr);
				}
				else if (pen_is_elliptical(gr_pen_p((mp_fill_object*)p)))
				{
					//mp_dump_solved_path(gr_path_p((mp_fill_object*)p));
					cairo_gr_stroke_ellipse(mp, p, true, cr);
				}
				else
				{
					// Polygonal pen: fill the path and the reversed path
					//mp_dump_solved_path(gr_path_p((mp_stroked_object*)p));
					cairo_gr_pdf_fill_out(mp, gr_path_p((mp_fill_object*)p), cr);
					cairo_gr_pdf_fill_out(mp, gr_htap(p), cr);
				}

				if (((mp_fill_object*)p)->post_script != NULL)
				{
					// just something I'm experimenting with
					//ondraw(cr, ((mp_fill_object*)p)->post_script);
				}
			}
			break;

			case mp_stroked_code:
			{
				// dumps the solved path (debugging aid)
				mp_dump_solved_path(gr_path_p((mp_stroked_object*)p));
				if (pen_is_elliptical(gr_pen_p((mp_stroked_object*)p)))
					cairo_gr_stroke_ellipse(mp, p, false, cr);
				else
				{
					//mp_dump_solved_path(gr_path_p((mp_stroked_object*)p));
					cairo_gr_pdf_fill_out(mp, gr_path_p((mp_stroked_object*)p), cr);
				}

				if (((mp_stroked_object*)p)->post_script != NULL)
				{
					ondraw(cr, ((mp_stroked_object*)p)->post_script);
				}
			}
			break;

			case mp_text_code: // not yet implemented
			{
				// Placeholders only: nothing is drawn for text yet
				mp_text_object* to;
				to = (mp_text_object*)p;
				char * po = to->post_script;
				char * ps = to->pre_script;
			}
			break;

			case mp_start_clip_code:
				// Save the state so the clip can be undone at mp_stop_clip_code
				cairo_save(cr);
				cairo_gr_pdf_path_out(mp, gr_path_p((mp_clip_object*)p), cr);
				cairo_clip(cr);
			break;

			case mp_stop_clip_code:
				cairo_restore(cr);
			break;

			case mp_start_bounds_code: // ignored
				//mp_bounds_object *sbo;
				//sbo = (mp_bounds_object *)gr;
			break;

			case mp_stop_bounds_code: //ignored
				//mp_special_object
			break;

			case mp_special_code: //just more experimenting, ignore
			{
				// Braces give the declaration its own scope inside the switch
				mp_special_object *speco;
				speco = (mp_special_object *)p;
				printf("%s", speco->pre_script);
				ondraw(cr, speco->pre_script);
			}
			break;
		}
		p = gr_link(p);
	}
	return 0;
}
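The helper functions that actually emit paths (such as cairo_gr_pdf_path_out above) are not shown in this post. As a hedged sketch of the general idea only, the code below walks a circular list of knots and emits one “move to” followed by one Bézier “curve to” per segment. The demo_knot and demo_sink types and all function names are hypothetical, loosely modelled on a knot that carries its own two control points. The drawing calls go through function pointers so the traversal can be checked without linking Cairo; in a real backend thin wrappers around cairo_move_to, cairo_curve_to and cairo_close_path would be plugged in instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical knot: a point plus its two Bezier control points. */
typedef struct demo_knot {
	double x, y;			/* the knot itself */
	double left_x, left_y;		/* control point on the incoming side */
	double right_x, right_y;	/* control point on the outgoing side */
	struct demo_knot *next;		/* circular: last knot links to the first */
} demo_knot;

/* Drawing callbacks, shaped like the corresponding Cairo path calls. */
typedef struct demo_sink {
	void (*move_to)(void *ctx, double x, double y);
	void (*curve_to)(void *ctx, double x1, double y1,
	                 double x2, double y2, double x3, double y3);
	void (*close_path)(void *ctx);
} demo_sink;

/* Emit the path starting at knot h; cyclic != 0 means a closed path. */
void demo_emit_path(const demo_knot *h, int cyclic,
                    const demo_sink *out, void *ctx)
{
	const demo_knot *p, *q;

	assert(out != NULL);
	if (h == NULL)
		return;
	out->move_to(ctx, h->x, h->y);
	p = h;
	for (;;) {
		q = p->next;
		if (q == h && !cyclic)
			break;	/* open path: skip the segment back to the start */
		out->curve_to(ctx, p->right_x, p->right_y,
		              q->left_x, q->left_y, q->x, q->y);
		p = q;
		if (p == h)
			break;	/* back at the first knot: all segments emitted */
	}
	if (cyclic)
		out->close_path(ctx);
}

/* A sink that only counts calls, handy for checking the traversal. */
typedef struct demo_counts { int moves, curves, closes; } demo_counts;

static void count_move(void *ctx, double x, double y)
{
	(void)x; (void)y;
	((demo_counts *)ctx)->moves++;
}

static void count_curve(void *ctx, double x1, double y1,
                        double x2, double y2, double x3, double y3)
{
	(void)x1; (void)y1; (void)x2; (void)y2; (void)x3; (void)y3;
	((demo_counts *)ctx)->curves++;
}

static void count_close(void *ctx)
{
	((demo_counts *)ctx)->closes++;
}

const demo_sink demo_counting_sink = { count_move, count_curve, count_close };
```

Keeping the traversal separate from the drawing calls is a design choice of this sketch, not something taken from the backend above; it also happens to fit the “callback” idea for swapping in other PDF libraries.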

Conclusion

I wish I could switch on the commenting feature but, sadly, spammers make this impossible. So, I just hope the above is a useful starting point for anyone wanting to explore the marvellous MPlib C library.

PDF file of John Hobby’s original MetaPost code (version 0.64)

MetaPost MPlib

I’m currently implementing a project built around the MetaPost library MPlib. I managed to build MPlib as a Windows .lib (library) file without “too much” difficulty… In order to understand the workings of the powerful, but complex, MPlib library I found it was very helpful to read parts of Hobby’s original code – mainly in relation to generating output from the low-level MPlib/MetaPost edge structures. I also benefitted enormously from reading the C code of the Lua binding so a huge thank you to Taco Hoekwater for his utterly brilliant work on the MPlib/lmplib source code.

I tracked down the MetaPost 0.64 source code (the .web code) and ran TIE and WEAVE to generate the TeX documentation. After a few tiny fixes (for fonts I don’t have) I produced a PDF file which I thought others might find useful. You can download it here. The MPlib API documentation (again by Taco) was also very helpful – documentation for version 1.800 of the MPlib API is available here.