STM publishing: tools, technologies and change A WordPress site for STM Publishing

30 Aug 2014

RegexBuddy and RegexMagic: Truly superb regular expression tools

Regular expressions are part of many programmers' toolkits, but they can be quite fiddly to get right. At the moment, I'm trying to "sanitize" the C code generated for TeX (via Web2C) by post-processing the TeX.c file to make the C source code far more readable. To do that, I'm using the original definitions in TeX.WEB to generate C #define statements that I can use in TeX.c. For example, in TeX.WEB you see the following "WEB macros" related to entries in TeX's "equivalence table":

@d eq_level_field(#)==#.hh.b1
@d eq_type_field(#)==#.hh.b0
@d equiv_field(#)==#.hh.rh
@d eq_level(#)==eq_level_field(eqtb[#]) {level of definition}
@d eq_type(#)==eq_type_field(eqtb[#]) {command code for equivalent}
@d equiv(#)==equiv_field(eqtb[#]) {equivalent value}

When WEB expressions using the above macros are processed by TANGLE and Web2C, the resulting C code contains many statements that look like the following:

eqtb [curval ].hh.b1 = 1 ; 
eqtb [curval ].hh.b0 = c ; 
eqtb [curval ].hh .v.RH = o ; 

Not very readable but, of course, it is machine-generated C code, so what would you expect? Using regular expressions, I'm (slowly and carefully) replacing many raw C expressions with #define macros such as the following:

#define equivalence_level(a) eqtb[a].hh.b1
#define command_code_equivalence(a) eqtb[a].hh.b0
#define set_value_of_equivalent(a) eqtb[a].hh.v.RH
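
As a rough illustration of the kind of rewrite involved, here is a minimal sketch of how one generated line could be mapped onto the macros above. It is not the code used for the actual clean-up: it uses the POSIX regex.h API rather than PCRE purely so that it stays self-contained, the function name rewrite_eqtb_access is hypothetical, and it assumes the index inside eqtb[...] is a plain identifier and that only the first match on a line needs rewriting.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* POSIX extended regex (an assumption: the post itself uses PCRE).
   Group 1 = index inside eqtb[...], restricted to a plain identifier.
   Group 2 = field suffix after ".hh": b1, b0 or v.RH.
   The lone ']' after the first group is a literal closing bracket. */
static const char *EQTB_PATTERN =
    "eqtb[[:space:]]*\\[[[:space:]]*([A-Za-z_][A-Za-z0-9_]*)[[:space:]]*]"
    "[[:space:]]*\\.hh[[:space:]]*\\.(b1|b0|v\\.RH)";

/* Map the captured field suffix to the macro names shown in the post. */
static const char *macro_for_field(const char *field, size_t len)
{
    if (len == 2 && strncmp(field, "b1", 2) == 0)
        return "equivalence_level";
    if (len == 2 && strncmp(field, "b0", 2) == 0)
        return "command_code_equivalence";
    return "set_value_of_equivalent"; /* the pattern leaves only v.RH */
}

/* Hypothetical helper: rewrites the first eqtb access found in `line`
   into `out`. Returns 1 if rewritten, 0 if the line was copied
   unchanged, -1 on error (bad pattern or output buffer too small). */
int rewrite_eqtb_access(const char *line, char *out, size_t outsz)
{
    regex_t re;
    regmatch_t m[3];
    int n;

    assert(line != NULL && out != NULL);
    if (regcomp(&re, EQTB_PATTERN, REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, line, 3, m, 0) != 0) {
        /* No match: pass the line through untouched. */
        regfree(&re);
        n = snprintf(out, outsz, "%s", line);
        return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
    }
    regfree(&re); /* the offsets in m[] remain valid after this */

    /* text before the match + macro(index) + text after the match */
    n = snprintf(out, outsz, "%.*s%s(%.*s)%s",
                 (int)m[0].rm_so, line,
                 macro_for_field(line + m[2].rm_so,
                                 (size_t)(m[2].rm_eo - m[2].rm_so)),
                 (int)(m[1].rm_eo - m[1].rm_so), line + m[1].rm_so,
                 line + m[0].rm_eo);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 1;
}
```

Calling rewrite_eqtb_access on each line of TeX.c in turn, and writing out the result, would turn a line such as "eqtb [curval ].hh.b1 = 1 ;" into "equivalence_level(curval) = 1 ;". Real Web2C output contains more complicated index expressions than a bare identifier, which is exactly where a tool for building and testing the pattern earns its keep.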

As part of this work, I use two very useful tools for building and testing regular expressions: RegexBuddy and RegexMagic (the tools are compared/explained here). They help you build, test and develop regular expressions, and they support the syntax and options of many regular expression engines. Once you have a working regex, RegexBuddy and RegexMagic will generate code that lets you use the regex in a language of your choice (many languages are supported), including C code to use the regex with PCRE – which is my favourite regex library. Again, this is not an advert for these tools, just some notes from someone who has found them extremely useful – they have saved me a considerable amount of time in building, testing and using powerful regular expressions with PCRE.

Screenshot: RegexBuddy

Processing INITEX's primitive(...) function code with RegexBuddy to extract data for preparing C #defines.
