Happy birthday to me :-) John Backus, father of FORTRAN, dies at 82
2007 03 18

Regular expressions are extremely powerful. This is something I read at least once or twice every day while reading articles and blogs on the Web.

 
 
 

 
 
 

While browsing today, I found this page which thoroughly describes the use of the regular expression /^1?$|^(11+?)\1+$/ in Perl to check if a number is prime or not!!!

To be frank, I was skeptical. The regular expression looks like magic! And I wanted to understand it better. I rewrote it in Ruby using irb and I tested it:

Avinash@noulakaz:~$ irb
irb(main):001:0> def is_prime(n)
irb(main):002:1> ("1" * n) !~ /^1?$|^(11+?)\1+$/
irb(main):003:1> end
=> nil
irb(main):004:0> is_prime(10)
=> false
irb(main):005:0> is_prime(11)
=> true
irb(main):006:0> is_prime(12)
=> false
irb(main):007:0> is_prime(13)
=> true
irb(main):008:0> is_prime(99)
=> false
irb(main):009:0> is_prime(100)
=> false
irb(main):010:0> is_prime(101)
=> true

Great! It also works in Ruby! This means that there is no (Perl) magic going on. The regular expression really works. But how? Let’s try to follow the logic behind it.

Is 7 prime?

To know this, the function first generates “1111111″ (from “1″ * 7) and tries to see if that string does not match /^1?$|^(11+?)\1+$/. If there is no match, then the number is prime.

Notice that the regular expression has two parts (separated with the vertical bar |).

The first part is /^1?$/ is trivial and matches with beginning of line (^), an optional 1 (1?)
and end of line ($) which implies that it matches either the empty string or “1″. This simply indicates that calling that function with n==0 or n==1 will correctly return false (as the “1″ * n will match with the first part of the regular expression)

The second part is where the magic occurs…

/^(11+?)\1+$/ matches with beginning of line (^) then by (11+?) then by \1+ and finally by end of line ($). I guess that you know that \1 is a variable which is bound to whatever was matched previously (in our case by (11+?)).

Let’s proceed slowly…

(11+?) does two things

  1. It matches with “1″ followed by one or more ones minimally. This means that it matches with “11″ initially (notice that if there was no ? (i.e. (11+) was used instead, the whole string would have matched)
  2. The string obtained (”11″ initially) is bound to the variable \1.

\1+ then matches with whatever has been matched above (”11″ initially) repeated one or more times. If this match succeeds then the number is not prime.

If you are following, you’ll realise that this eliminates all even numbers except 2 (for example, 8 is “11111111″ and therefore (11+?) will match with “11″ and \1+ will match with “111111″).

As for odd numbers (in our case 7), the (11+?) matches with “11″ initially but \1+$ cannot be true (notice the $) as there are 5 remaining ones. The regular expression engine will backtrack and will make (11+?) match “111″ and here also \1+$ won’t be true because there will be 4 remaining ones (and \1+$ will only match with a number of ones which is a multiple of 3 followed by end of line) etc. hence “1111111″ will not match the regular expression which implies that 7 will be considered as being prime :-)

When I showed this to Christina this morning (true), she told me that this only checked for a number being odd or not. This is also what I felt at the beginning. But it really works. For instance, let’s try to apply it to 9 (which is obviously not even), “1″ * 9 should match the regular expression…

“1″ * 9 = “111111111″. (11+?) matches “11″ initially. \1+$ cannot match because there are 7 remaining ones. Backtracking occurs. (11+?) now matches “111″. And here \1+$ matches the remaining 6 remaining ones! Hence, 9 is not prime.

Easy… and beautiful at the same time ;-)

Popularity: 20% [?]

written by avinash

26 Responses to “A regular expression to check for prime numbers”

  1. Ketwaroo D. Yaasir Says:

    unfortunately, I am quite allergic to regular expressions… too many bloody flavors to make any sense in the end.

  2. avinash Says:

    I fully understand. I was quite allergic to regexpr too. But little by little I have come to like and use them a lot.

    Let’s see, I use regexpr when working with vi, grep, sed, awk and, now, ruby…

  3. Ketwaroo D. Yaasir Says:

    I’ve been trying to write a bbcode parser for wordpress µ to cure it of kses.php while keeping the entries of the bloggers “safe”. So I have to figure out how to use regex.

    So i guees i too will have to learn to like it.

    It’s like voodoo.

  4. avinash Says:

    And I suppose you have realised that it is a small programming language… so it’s bound to be cool :-)

  5. curiousEngine Says:

    Programming Quiz No. 2

    A serial file named contains a series of values, each of which is how many prime numbers ther are below a particular value. Write a program to read the number of primes between and values where and are requested interactively and compare it to the value obtained from FUNCT(limit) which is defined as limit / LOG(limit).

  6. avinash Says:

    Hi,

    Seems that some of the text went missing… or you were seriously drunk :-)

  7. VagabondoDigitale Says:

    Cool.. thx for your post. I like matematics quiz :P

  8. doh Says:

    nice trick!

    I find that the “backtracking” part was equivalence to finding out if the number is prime by brute-forcing

    e.g. going from 3 to half the number and if any of that is divisible, its not a prime.

    thus the ^(11+?)\1+& trick (anyone got it?)

    i believe there are a lot of fast algorithms out there to use instead of generating 1s and parsing them with a regular expression which is kinda slow.

    however this is truly magic ^^

  9. ksivasenthil Says:

    hey avinash, what is the exact input that goes to the regex constructor. I am not a ruby programmer but my attempt to use this regex in javascript failed. The test always failed for both prime and composite numbers. If you know javascript please help show me a sample to use this in javascript pls

  10. someone Says:

    the trick is to apply the regex to a string with n 1’s, for example, for 5, apply the regex to 11111.

  11. Jay Says:

    A PHP Version:
    !preg_match(’/^1?$|^(11+?)\1+$/x’, str_repeat(’1′, $n))

    Example:

    <?php

    $n = 1;
    while ($n < 1000000) {
    if (!preg_match(’/^1?$|^(11+?)\1+$/x’, str_repeat(’1′, $n))) {
    echo “$n: Prime\n”;
    }
    $n++;
    }
    ?>

  12. Skymt Says:

    C# version:

    static void Main(string[] args)
    {
    int c;
    if (args.Length == 1 && int.TryParse(args[0], out c))
    Console.WriteLine(!Regex.IsMatch(new String(’1′, c), @”^1?$|^(11+?)\1+$”));
    }

  13. Binks Says:

    Java Version, nice tip btw, guess I should learn more about regular expressions. Java requires that the \ be \’d out, otherwise it sees it as a smiley face character and problems happen :P.

    public static void main(string[] args){
    if (args.length >= 1)
    {
    int c = Integer.parseInt(args[0]);
    String s = “”;
    while (c > 0) {c–; s+=”1″;}
    System.out.println(!Pattern.matches(”^1?$|^(11+?)\\1+$”, s));
    }}

  14. Venkatesh Says:

    This is just awesome…

  15. Steven Says:

    Hello,

  16. adamsky Says:

    To be honest - regular expressions are “overused”… And are NOT that good. For example in PHP using just common str_replace(); would be in many cases much faster then preg_replace();

    Problem is that people tend to use a lot of “sophisticated” code where it’s not necessary, only because it looks more “professional”. My thought on this is “I don’t care what others think - I just need my code to be fast, clean and as simple as it is possible”.

  17. avinash Says:

    True.

    Sometimes regular expressions are extremely cryptic (like the one above in fact ;-) ) but they are also interesting to understand because they are essentially small programs written in a domain specific language.

  18. Nick Says:

    The magic is in the backtrack…. if it can find a number a=(11+?) that multiplied N times gives our number, it’s obvious that our number can’t be prime (our number = a*N, i.e. not prime)

  19. This must be shared: /^1?$|^(11+?)1+$/ to check for prime numbers at Philipp Meier’s weblog Says:

    [...] More details on Avinash Meetoo’s Blog. [...]

  20. roscoe Says:

    How many numbers does this count up to? whats the largest prime that can be found in *real* time?

    I’m interested so mail me please.

  21. avinash Says:

    Hi Roscoe,

    I’ve not done any performance test but I guess that if you use a compiled language like C with a good Regex implementation, then, maybe, you can generate prime numbers real fast.

    On the other hand, if you really want to generate primes then have a look at The Sieve of Eratosthene.

  22. Paul Says:

    Avinash. Super, I’ve been looking around for a Regex similar to this for a while. I’ve modified it to suit.

  23. avinash Says:

    Hi Paul,

    Great to know that you find this regular expression useful. I’d like to point out that I’m not the original author though ;-)

  24. rasmus Says:

    if primenumber mod 2 is 0 then
    isprime = true

    wouldnt that be easier and faster? ^^

  25. rasmus Says:

    sorry, its getting late here heh

    if primenumber mod 2 != 0 then
    isprime = true

  26. avinash Says:

    Hi Rasmus,

    A prime number is not simply odd. Check this.

Leave a Reply