JavaScript Alternative To Possessive Quantifiers In Dynamic Regex

I'm using JavaScript to extract a subset of "siblings" from a comma-delimited string of members I call a "generation" string.

Metaphorically speaking, the members are all from the same generation, but they are not all siblings (from the same parents). Here's an example:

// This is the generation string to search
var generation  = 'ABAA,ABAB,ABAC,ABAD,ABBA,ACAA,ACAB,ACAD,AEAB,AEAD,AFAA';

// This is the member for whom to extract siblings (member included)
var member      = 'ACAA';

The generation string and its members have the following characteristics:

  • Each member has the same number of characters as the others
  • All members of the string are alpha sorted
  • Each set of siblings will always be adjacent to one another
  • Siblings are those members who share the exact same combination of letters except the last letter
Continuing the example...

    // This is how I go about extracting the desired result: ACAA,ACAB,ACAD
    var mParent     = member.substr(0, member.length - 1) ;
    var mPattern    = mParent + '[A-Z]';
    var mPattern    = '(.*)((' + mPattern + ')(,$1)*)(.*)'; // Trouble is here
    var mRegex      = new RegExp(mPattern);
    var mSiblings   = generation.replace(mRegex, '$2');
    

    The trouble spot identified above concerns regex quantifiers in the constructed pattern. As it is above, everything is set to greedy, so the value of mSiblings is:

    ACAD
    

    That's only the last member. Changing mPattern to be less greedy, in hopes of extracting the other members, yields the following:

    // Reluctant first expression yields ACAA
    var mPattern = '(.*?)((' + mPattern + ')(,$1)*)(.*)'; 
    
    // Reluctant last expression yields ACAD,AEAB,AEAD,AFAA
    var mPattern = '(.*)((' + mPattern + ')(,$1)*)(.*?)'; 
    
    // Reluctant first and last yields ACAA,ACAB,ACAD,AEAB,AEAD,AFAA
    var mPattern = '(.*?)((' + mPattern + ')(,$1)*)(.*?)';
    

    If I could make the middle expression possessive, this would be problem solved. Something like this:

    // Make as many "middle" matches as possible by changing (,$1)* to (,$1)*+
    var mPattern = '(.*?)((' + mPattern + ')(,$1)*+)(.*?)';
    

    But as I have read (and have the syntax errors to prove it), JavaScript doesn't support possessive regular expression quantifiers. Can someone suggest a solution? Thank you.


    The most obvious problem is the $1. Within a regex, you refer to capturing group #1 using \1, not $1. The (,$1)* in your regex is never going to match anything. But a group reference isn't going to do any good anyway.

    When you use a group reference in a regex, you aren't applying that part of the regex again; you're simply matching the same text that it matched the first time. That is, (ACA[A-Z])(,\1)* will match ACAA,ACAA, but not ACAA,ACAB or ACAA,ACAC. If you want to match those, you need to repeat the actual regex: (ACA[A-Z])(,ACA[A-Z])*. Since you're generating the regex dynamically, that shouldn't be a problem.
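To make the difference concrete, here is a small sketch comparing the two patterns. The variable names are made up for illustration and are not from the question.

```javascript
// A backreference (\1) matches the exact text that group 1 captured,
// so only a literal repeat of the first member is accepted.
var withBackreference = /(ACA[A-Z])(,\1)*/;

// Repeating the subpattern itself lets each member end in a different letter.
var withRepeatedPattern = /(ACA[A-Z])(,ACA[A-Z])*/;

var exactRepeat = withBackreference.exec('ACAA,ACAA')[0];   // 'ACAA,ACAA'
var stopsEarly  = withBackreference.exec('ACAA,ACAB')[0];   // 'ACAA'
var fullRun     = withRepeatedPattern.exec('ACAA,ACAB')[0]; // 'ACAA,ACAB'

console.log(exactRepeat, stopsEarly, fullRun);
```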

    Note that this is the whole regex: ACA[A-Z](,ACA[A-Z])*. There's no need to match the stuff preceding or following the part that interests you; that just makes the job more complicated (and the results more confusing). You can access the match result directly, instead of using that "replace" gimmick:

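    // mRegex is assumed to be built from the simplified pattern above
    var mSiblings = '';
    var match = mRegex.exec(generation);
    if (match !== null) {
        mSiblings = match[0];  // the entire matched run of siblings
    }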
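Putting the pieces together, a minimal end-to-end sketch might look like the following. The function name extractSiblings is hypothetical, and the code assumes members consist only of uppercase letters A-Z (so the parent prefix needs no regex escaping), as in the question's example.

```javascript
// Hypothetical helper, not part of the original question or answer.
// Assumes every member is made of uppercase letters A-Z only.
function extractSiblings(generation, member) {
    // Everything except the last letter identifies the parent.
    var parent = member.slice(0, -1);

    // Pattern for a single sibling, e.g. ACA[A-Z]
    var sibling = parent + '[A-Z]';

    // One sibling followed by any number of ",sibling" repeats.
    var regex = new RegExp(sibling + '(?:,' + sibling + ')*');

    var match = regex.exec(generation);
    return match === null ? '' : match[0];
}

var generation = 'ABAA,ABAB,ABAC,ABAD,ABBA,ACAA,ACAB,ACAD,AEAB,AEAD,AFAA';
console.log(extractSiblings(generation, 'ACAA')); // ACAA,ACAB,ACAD
```

Because the siblings are adjacent and the * quantifier is greedy, the match extends across the whole run and stops at the first member with a different parent, so no possessive quantifier is needed here.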
    