Genii Weblog
Redacting content with Midas regular expressions
Thu 18 Nov 2010, 11:20 AM
by Ben Langhinrichs
Sorry about the typo in the title earlier.
A customer asked whether it was possible to use our Midas Rich Text technology to search through rich text, inside tables, in different fonts and with different attributes, and replace a specified pattern with a string of X's.
It couldn't be much easier. The Midas Rich Text LSX and Midas Rich Text C++ API both support regular expressions (mostly consistent with Perl regular expressions), and provide a number of methods that use them. In this case, let's say the pattern is ORDnnnnnnnnnnnn, where the n's stand for any twelve digits. The code would simply be:
Call rtitem.ConnectBackend(doc.Handle, "Body", True)
Call rtitem.Everything.RegexReplace("ORD([0-9]{12})", "XXXXXXXXXXXXXXX")
but let's say that isn't specific enough, since it would turn CHORD1234567890123456 into CHXXXXXXXXXXXXXXX3456. Suppose instead that the string must be delimited by whitespace or by the beginning or end of the text, so that "CHORD1234567890123456" would not match but "The order# is ORD123456789012" would. Simple enough, just change the regular expression as below:
Call rtitem.ConnectBackend(doc.Handle, "Body", True)
Call rtitem.Everything.RegexReplace("(^|\s)ORD([0-9]{12})(\s|$)", "$1XXXXXXXXXXXXXXX$3")
and that's all it takes. We could make it easier, but we'd probably have to read your mind.
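If you want to try the two patterns on plain strings before pointing them at real documents, here is a minimal sketch. It uses Python's `re` module as a stand-in for the Midas regex engine, which is an assumption: Midas is only described above as mostly consistent with Perl. The function names are made up for illustration, and the delimiter is written as a plain `\s`.

```python
import re

# Python's `re` stands in for the Midas engine here (an assumption, see above).
NAIVE = r"ORD([0-9]{12})"
BOUNDED = r"(^|\s)ORD([0-9]{12})(\s|$)"
MASK = "X" * 15  # same length as "ORD" plus twelve digits


def redact_naive(text):
    """Mask every ORD + 12 digits, even inside a longer token."""
    return re.sub(NAIVE, MASK, text)


def redact_bounded(text):
    """Mask only when delimited by whitespace or the start/end of the text."""
    # \g<1> and \g<3> put the captured delimiters back,
    # playing the role of $1 and $3 in the Midas replacement string.
    return re.sub(BOUNDED, r"\g<1>" + MASK + r"\g<3>", text)


if __name__ == "__main__":
    for sample in ("CHORD1234567890123456", "The order# is ORD123456789012"):
        print(redact_naive(sample), "|", redact_bounded(sample))
```

One thing this turns up, at least in Python's engine: because the trailing whitespace is consumed as part of a match, two order numbers separated by a single space will only have the first one masked. Whether Midas behaves the same way is worth checking against your own data.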
Copyright © 2010 Genii Software Ltd.