{"id":3557,"date":"2014-08-30T14:27:18","date_gmt":"2014-08-30T14:27:18","guid":{"rendered":"http:\/\/www.readytext.co.uk\/?p=3557"},"modified":"2014-09-11T09:23:01","modified_gmt":"2014-09-11T09:23:01","slug":"regexbuddy-and-regexmagic-truly-superb-regular-expression-tools","status":"publish","type":"post","link":"https:\/\/www.readytext.co.uk\/?p=3557","title":{"rendered":"RegexBuddy and RegexMagic: Truly superb regular expression tools"},"content":{"rendered":"<p>Regular expressions are part of many programmers&#8217; toolkits, but they can be quite fiddly to get right. At the moment, I&#8217;m trying to &#8220;sanitize&#8221; the C code generated for TeX (via Web2C) by post-processing the TeX.c file to make the C source code far more readable. To do that I&#8217;m using the original definitions in TeX.WEB to generate C <code>#define<\/code> statements that I can use in TeX.c. For example, in TeX.WEB you see the following &#8220;WEB macros&#8221; related to entries in TeX&#8217;s &#8220;equivalence table&#8221;:<\/p>\n<pre class=\"brush: plain; light: false; title: ; toolbar: true; notranslate\" title=\"\">\r\n@d eq_level_field(#)==#.hh.b1\r\n@d eq_type_field(#)==#.hh.b0\r\n@d equiv_field(#)==#.hh.rh\r\n@d eq_level(#)==eq_level_field(eqtb&#x5B;#]) {level of definition}\r\n@d eq_type(#)==eq_type_field(eqtb&#x5B;#]) {command code for equivalent}\r\n@d equiv(#)==equiv_field(eqtb&#x5B;#]) {equivalent value}\r\n<\/pre>\n<p>When WEB expressions using the above macros are processed by TANGLE and Web2C, the resulting C code contains many statements that look like the following:<\/p>\n<pre class=\"brush: plain; light: false; title: ; toolbar: true; notranslate\" title=\"\">\r\neqtb &#x5B;curval ].hh.b1 = 1 ; \r\neqtb &#x5B;curval ].hh.b0 = c ; \r\neqtb &#x5B;curval ].hh .v.RH = o ; \r\n<\/pre>\n<p>Not very readable but, of course, it is machine-generated C code, so what would you expect? 
Through regular expressions I&#8217;m slowly and carefully replacing many raw C statements with <code>#define<\/code>s, such as the following:<\/p>\n<pre class=\"brush: plain; light: false; title: ; toolbar: true; notranslate\" title=\"\">\r\n#define equivalence_level(a) eqtb&#x5B;a].hh.b1\r\n#define command_code_equivalence(a) eqtb&#x5B;a].hh.b0\r\n#define set_value_of_equivalent(a) eqtb&#x5B;a].hh.v.RH\r\n<\/pre>\n<p>As part of this work, I use two very useful tools for building and testing regular expressions: <a href=\"http:\/\/www.regexbuddy.com\/\">RegexBuddy<\/a> and <a href=\"http:\/\/www.regexmagic.com\/\">RegexMagic<\/a> (the tools are compared and explained <a href=\"http:\/\/www.regexbuddy.com\/regexmagic.html\">here<\/a>). They help you build, test and develop regular expressions and support the syntax and options of many regular expression engines. Once you have a working regex, RegexBuddy and RegexMagic will generate code that allows you to use the regex in a language of your choice (many languages are supported), including C code to use the regex with <a href=\"http:\/\/www.pcre.org\/\">PCRE<\/a> &ndash; which is my favourite regex library. This is not an advert for these tools, just some notes from someone who has found them to be extremely useful &ndash; they have saved me <em>considerable<\/em> amounts of time in building, testing and using powerful regular expressions with PCRE.<\/p>\n<p><H1>Screenshot: RegexBuddy<\/H1><\/p>\n<p>Processing INITEX&#8217;s <code>primitive(...)<\/code> function code with RegexBuddy to extract data for preparing C <code>#define<\/code>s.<\/p>\n<p><a href=\"https:\/\/www.readytext.co.uk\/files\/regexbuddy.png\"><img decoding=\"async\" src=\"https:\/\/www.readytext.co.uk\/files\/regexbuddy.png\" width=\"500\" \/><\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Regular expressions are part of many programmers&#8217; toolkits, but they can be quite fiddly to get right. 
At the moment, I&#8217;m trying to &#8220;sanitize&#8221; the C code generated for TeX (via Web2C) by post-processing the TeX.c file to make the C source code far more readable. To do that I&#8217;m using the original definitions in [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[28,1],"tags":[],"class_list":["post-3557","post","type-post","status-publish","format-standard","hentry","category-c-programming-miscellaneous","category-uncategorized"],"blocksy_meta":[],"_links":{"self":[{"href":"https:\/\/www.readytext.co.uk\/index.php?rest_route=\/wp\/v2\/posts\/3557","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.readytext.co.uk\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.readytext.co.uk\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.readytext.co.uk\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.readytext.co.uk\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3557"}],"version-history":[{"count":13,"href":"https:\/\/www.readytext.co.uk\/index.php?rest_route=\/wp\/v2\/posts\/3557\/revisions"}],"predecessor-version":[{"id":3587,"href":"https:\/\/www.readytext.co.uk\/index.php?rest_route=\/wp\/v2\/posts\/3557\/revisions\/3587"}],"wp:attachment":[{"href":"https:\/\/www.readytext.co.uk\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3557"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.readytext.co.uk\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=3557"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.readytext.co.uk\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3557"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}