Friday, February 28, 2003

regular expressions and irregular forms of life

"Irregular forms of life" means "programmers", just in case you didn't make the connection (but then, how could you?). Every programmer with substantial experience beyond kiddy projects has encountered regular expressions: the best example of a necessary evil.

Why evil? Because of their inherent complexity. (Yes, yes, once you figure them out, they're "cool". But do you always have the time to figure them out without your boss, instructor, teaching assistant or your own sense of the clock breathing down your neck?) Why necessary? Because the documents you end up processing with regular expressions often have quirks that render the clean, elegant, recommended approaches useless (e.g. "XML" documents that contain dozens of illegal characters make an elegant XML parsing methodology look stupid).

Despite their efficiency, regular expressions do not conform to a single standard. There are accepted conventions (mostly derived from Perl) that create the notion of expected behaviour. Clearly, if there is no standard, it isn't fair to berate and curse the regular expression support of another language (like Java, with its recent inclusion of an unsatisfying regular expression package). If you have a Perl background, your frustration is understandable, but by no means is it a justifiable reason to spend a lot of time tearing your hair out, cursing Java and getting nothing done.

Python supports Perl-style regular expressions and I like using them (especially some new nuggets in the 2.2 series), but I am sure hardcore Perl-ites will find cause for dissent. Give Python, or even Java, some credit, although regular expressions are a strange addition to the OO world of Java.
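To make the "XML" case above a little more concrete, here is a minimal Python sketch of the kind of workaround it describes: scrub the offending characters with one regular expression, then pull out a field with another, using a Perl-style non-greedy match. The function names, the `<title>` element and the sample input are all hypothetical, chosen only for illustration. The character class is also an assumption about what "illegal" means here: it covers just the ASCII control characters that XML 1.0 forbids, not every disallowed code point.

```python
import re

# Assumption: "illegal" means the ASCII control characters XML 1.0 forbids,
# i.e. everything below 0x20 except tab (0x09), LF (0x0A) and CR (0x0D).
ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")

# Perl-style non-greedy match; DOTALL lets "." span line breaks.
# The <title> element is a hypothetical example field.
TITLE = re.compile(r"<title>(.*?)</title>", re.DOTALL)


def strip_illegal_xml_chars(text):
    """Return text with the forbidden control characters removed."""
    return ILLEGAL_XML_CHARS.sub("", text)


def extract_title(raw):
    """Clean the raw document, then return the first title or None."""
    match = TITLE.search(strip_illegal_xml_chars(raw))
    return match.group(1) if match else None


if __name__ == "__main__":
    broken = "<doc><title>Reg\x07ex\x00 notes</title></doc>"
    print(extract_title(broken))
```

Whether this beats cleaning the input and handing it to a real XML parser depends on how badly broken the documents are; the sketch only shows that the regular expression route is short.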

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.