diff --git a/cpan/perlfaq/lib/perlfaq6.pod b/cpan/perlfaq/lib/perlfaq6.pod
index 346a37ead763..fb6f72818cb8 100644
--- a/cpan/perlfaq/lib/perlfaq6.pod
+++ b/cpan/perlfaq/lib/perlfaq6.pod
@@ -165,12 +165,19 @@ Here's another example of using C<..>:
 X X X X X X X
-Do not use regexes. Use a module and forget about the
-regular expressions. The L, L and
-L modules are good starts, although each namespace
-has other parsing modules specialized for certain tasks and different
-ways of doing it. Start at CPAN Search ( L )
-and wonder at all the work people have done for you already! :)
+Regular expressions in Perl versions prior to 5.10.0 could not handle
+recursion. Therefore the recommendation for recursive languages
+like C or C was to avoid using regexes, and use specific
+parsing modules instead, like L, L or
+L. If you need a full parse tree, that recommendation
+is still the best advice.
+
+Since version 5.10.0, Perl regexes I parse recursive constructs,
+either through regexes crafted by hand, or through helper modules
+like L or L.
+Therefore, if you need to extract some specific subset of information from
+an C or C document, you I construct a regexp to do so -
+but this is not an easy task and it may require a fair amount of testing.
 =head2 I put a regular expression into $/ but it didn't work. What's wrong?
 X<$/, regexes in> X<$INPUT_RECORD_SEPARATOR, regexes in>
@@ -807,31 +814,18 @@ These strings do not match /\Bam\B/
 "I am Sam" # "am" surrounded by non-word chars
-=head2 Why does using $&, $`, or $' slow my program down?
+=head2 Why did using $&, $`, or $' slow my program down in older Perls?
 X<$MATCH> X<$&> X<$POSTMATCH> X<$'> X<$PREMATCH> X<$`>
-(contributed by Anno Siegel)
-
-Once Perl sees that you need one of these variables anywhere in the
-program, it provides them on each and every pattern match. That means
-that on every pattern match the entire string will be copied, part of it
-to $`, part to $&, and part to $'. Thus the penalty is most severe with
-long strings and patterns that match often. Avoid $&, $', and $` if you
-can, but if you can't, once you've used them at all, use them at will
-because you've already paid the price. Remember that some algorithms
-really appreciate them. As of the 5.005 release, the $& variable is no
-longer "expensive" the way the other two are.
-
-Since Perl 5.6.1 the special variables @- and @+ can functionally replace
-$`, $& and $'. These arrays contain pointers to the beginning and end
-of each match (see perlvar for the full story), so they give you
-essentially the same information, but without the risk of excessive
-string copying.
-
-Perl 5.10 added three specials, C<${^MATCH}>, C<${^PREMATCH}>, and
-C<${^POSTMATCH}> to do the same job but without the global performance
-penalty. Perl 5.10 only sets these variables if you compile or execute the
-regular expression with the C</p> modifier.
+(contributed by Anno Siegel, revised by Laurent Dami)
+
+In versions prior to 5.20.0,
+once Perl saw that you needed one of these variables anywhere in the
+program, it provided them on each and every pattern match. That meant
+that on every pattern match the entire string was copied, part of it
+to $`, part to $&, and part to $'. Thus the penalty was most severe with
+long strings and patterns that match often. Since version 5.20.0 that
+problem has been solved and is no longer a concern.
 =head2 What good is C<\G> in a regular expression?
 X<\G>
diff --git a/pod/perlvar.pod b/pod/perlvar.pod
index 542925378b69..e75d63d470ea 100644
--- a/pod/perlvar.pod
+++ b/pod/perlvar.pod
@@ -1027,12 +1027,14 @@ considered to be one of many good reasons to avoid C.
 =head3 Performance issues
-Traditionally in Perl, any use of any of the three variables C<$`>, C<$&>
+In Perl prior to 5.20.0, any use of any of the three variables C<$`>, C<$&>
 or C<$'> (or their C equivalents) anywhere in the code, caused all subsequent successful pattern matches to make a copy of the matched string, in case the code might subsequently access one of those variables. This imposed a considerable performance penalty across the whole program,
-so generally the use of these variables has been discouraged.
+so generally the use of these variables was discouraged. Most Perl textbooks
+and tutorials still reflect these ancient recommendations; but under recent
+versions of Perl, they are no longer necessary, as explained below.
 In Perl 5.6.0 the C<@-> and C<@+> dynamic arrays were introduced that supply the indices of successful matches. So you could for example do
@@ -1068,8 +1070,9 @@ In Perl 5.20.0 a new copy-on-write system was enabled by default, which finally fixes most of the performance issues with these three variables, and makes them safe to use anywhere.
-The C and C modules can help you
-find uses of these problematic match variables in your code.
+If you work with older Perl versions, when these match variables were still
+problematic, then the C and C modules
+can help you find uses of these variables in your code.
 =over 8
@@ -1140,9 +1143,6 @@ X<$&> X<$MATCH>
 The string matched by the last successful pattern match. (See L.)
-See L above for the serious performance implications
-of using this variable (even once) in your code.
-
 This variable is read-only, and its value is dynamically scoped.
 Mnemonic: like C<&> in some editors.
@@ -1154,7 +1154,8 @@ It is only guaranteed to return a defined value when the pattern was compiled or executed with the C</p> modifier.
 This is similar to C<$&> (C<$MATCH>) except that to use it you must
-use the C</p> modifier when executing the pattern, and it does not incur
+use the C</p> modifier when executing the pattern, and in versions
+prior to 5.20.0 it does not incur
 the performance penalty associated with that variable. See L above.
@@ -1171,9 +1172,6 @@ X<$`> X<$PREMATCH>
 The string preceding whatever was matched by the last successful pattern match. (See L).
-See L above for the serious performance implications
-of using this variable (even once) in your code.
-
 This variable is read-only, and its value is dynamically scoped.
 Mnemonic: C<`> often precedes a quoted string.
@@ -1185,7 +1183,8 @@ It is only guaranteed to return a defined value when the pattern was executed with the C</p> modifier.
 This is similar to C<$`> ($PREMATCH) except that to use it you must
-use the C</p> modifier when executing the pattern, and it does not incur
+use the C</p> modifier when executing the pattern, and in versions
+prior to 5.20.0 it does not incur
 the performance penalty associated with that variable. See L above.
@@ -1207,9 +1206,6 @@ pattern match. (See L).
 Example: /def/; print "$`:$&:$'\n"; # prints abc:def:ghi
-See L above for the serious performance implications
-of using this variable (even once) in your code.
-
 This variable is read-only, and its value is dynamically scoped.
 Mnemonic: C<'> often follows a quoted string.
@@ -1221,7 +1217,8 @@ It is only guaranteed to return a defined value when the pattern was compiled or executed with the C</p> modifier.
 This is similar to C<$'> (C<$POSTMATCH>) except that to use it you must
-use the C</p> modifier when executing the pattern, and it does not incur
+use the C</p> modifier when executing the pattern, and in versions
+prior to 5.20.0 it does not incur
 the performance penalty associated with that variable. See L above.