Regular expressions are extremely powerful. This is something I read at least once or twice every day while reading articles and blogs on the Web.
![]()
While browsing today, I found this page which thoroughly describes the use of the regular expression /^1?$|^(11+?)\1+$/ in Perl to check if a number is prime or not!!!
To be frank, I was skeptical. The regular expression looks like magic! And I wanted to understand it better. I rewrote it in Ruby using irb and I tested it:
Avinash@noulakaz:~$ irb
irb(main):001:0> def is_prime(n)
irb(main):002:1> ("1" * n) !~ /^1?$|^(11+?)\1+$/
irb(main):003:1> end
=> nil
irb(main):004:0> is_prime(10)
=> false
irb(main):005:0> is_prime(11)
=> true
irb(main):006:0> is_prime(12)
=> false
irb(main):007:0> is_prime(13)
=> true
irb(main):008:0> is_prime(99)
=> false
irb(main):009:0> is_prime(100)
=> false
irb(main):010:0> is_prime(101)
=> true
Great! It also works in Ruby! This means that there is no (Perl) magic going on. The regular expression really works. But how? Let’s try to follow the logic behind it.
Is 7 prime?
To know this, the function first generates “1111111″ (from “1″ * 7) and tries to see if that string does not match /^1?$|^(11+?)\1+$/. If there is no match, then the number is prime.
Notice that the regular expression has two parts (separated with the vertical bar |).
The first part is /^1?$/ is trivial and matches with beginning of line (^), an optional 1 (1?)
and end of line ($) which implies that it matches either the empty string or “1″. This simply indicates that calling that function with n==0 or n==1 will correctly return false (as the “1″ * n will match with the first part of the regular expression)
The second part is where the magic occurs…
/^(11+?)\1+$/ matches with beginning of line (^) then by (11+?) then by \1+ and finally by end of line ($). I guess that you know that \1 is a variable which is bound to whatever was matched previously (in our case by (11+?)).
Let’s proceed slowly…
(11+?) does two things
- It matches with “1″ followed by one or more ones minimally. This means that it matches with “11″ initially (notice that if there was no ? (i.e. (11+) was used instead, the whole string would have matched)
- The string obtained (”11″ initially) is bound to the variable \1.
\1+ then matches with whatever has been matched above (”11″ initially) repeated one or more times. If this match succeeds then the number is not prime.
If you are following, you’ll realise that this eliminates all even numbers except 2 (for example, 8 is “11111111″ and therefore (11+?) will match with “11″ and \1+ will match with “111111″).
As for odd numbers (in our case 7), the (11+?) matches with “11″ initially but \1+$ cannot be true (notice the $) as there are 5 remaining ones. The regular expression engine will backtrack and will make (11+?) match “111″ and here also \1+$ won’t be true because there will be 4 remaining ones (and \1+$ will only match with a number of ones which is a multiple of 3 followed by end of line) etc. hence “1111111″ will not match the regular expression which implies that 7 will be considered as being prime
When I showed this to Christina this morning (true), she told me that this only checked for a number being odd or not. This is also what I felt at the beginning. But it really works. For instance, let’s try to apply it to 9 (which is obviously not even), “1″ * 9 should match the regular expression…
“1″ * 9 = “111111111″. (11+?) matches “11″ initially. \1+$ cannot match because there are 7 remaining ones. Backtracking occurs. (11+?) now matches “111″. And here \1+$ matches the remaining 6 remaining ones! Hence, 9 is not prime.
Easy… and beautiful at the same time ![]()
Popularity: 20% [?]


March 19th, 2007 at 18:23
unfortunately, I am quite allergic to regular expressions… too many bloody flavors to make any sense in the end.
March 19th, 2007 at 22:48
I fully understand. I was quite allergic to regexpr too. But little by little I have come to like and use them a lot.
Let’s see, I use regexpr when working with vi, grep, sed, awk and, now, ruby…
March 19th, 2007 at 23:51
I’ve been trying to write a bbcode parser for wordpress µ to cure it of kses.php while keeping the entries of the bloggers “safe”. So I have to figure out how to use regex.
So i guees i too will have to learn to like it.
It’s like voodoo.
March 20th, 2007 at 09:42
And I suppose you have realised that it is a small programming language… so it’s bound to be cool
March 21st, 2007 at 09:25
Programming Quiz No. 2
A serial file named contains a series of values, each of which is how many prime numbers ther are below a particular value. Write a program to read the number of primes between and values where and are requested interactively and compare it to the value obtained from FUNCT(limit) which is defined as limit / LOG(limit).
March 21st, 2007 at 13:34
Hi,
Seems that some of the text went missing… or you were seriously drunk
September 24th, 2007 at 23:19
Cool.. thx for your post. I like matematics quiz
September 25th, 2007 at 14:53
nice trick!
I find that the “backtracking” part was equivalence to finding out if the number is prime by brute-forcing
e.g. going from 3 to half the number and if any of that is divisible, its not a prime.
thus the ^(11+?)\1+& trick (anyone got it?)
i believe there are a lot of fast algorithms out there to use instead of generating 1s and parsing them with a regular expression which is kinda slow.
however this is truly magic ^^
September 29th, 2007 at 07:50
hey avinash, what is the exact input that goes to the regex constructor. I am not a ruby programmer but my attempt to use this regex in javascript failed. The test always failed for both prime and composite numbers. If you know javascript please help show me a sample to use this in javascript pls
October 3rd, 2007 at 17:53
the trick is to apply the regex to a string with n 1’s, for example, for 5, apply the regex to 11111.
October 7th, 2007 at 10:55
A PHP Version:
!preg_match(’/^1?$|^(11+?)\1+$/x’, str_repeat(’1′, $n))
Example:
<?php
$n = 1;
while ($n < 1000000) {
if (!preg_match(’/^1?$|^(11+?)\1+$/x’, str_repeat(’1′, $n))) {
echo “$n: Prime\n”;
}
$n++;
}
?>
October 16th, 2007 at 17:19
C# version:
static void Main(string[] args)
{
int c;
if (args.Length == 1 && int.TryParse(args[0], out c))
Console.WriteLine(!Regex.IsMatch(new String(’1′, c), @”^1?$|^(11+?)\1+$”));
}
October 19th, 2007 at 04:17
Java Version, nice tip btw, guess I should learn more about regular expressions. Java requires that the \ be \’d out, otherwise it sees it as a smiley face character and problems happen :P.
public static void main(string[] args){
if (args.length >= 1)
{
int c = Integer.parseInt(args[0]);
String s = “”;
while (c > 0) {c–; s+=”1″;}
System.out.println(!Pattern.matches(”^1?$|^(11+?)\\1+$”, s));
}}
October 19th, 2007 at 09:11
This is just awesome…
January 30th, 2008 at 02:35
Hello,
February 10th, 2008 at 16:19
To be honest - regular expressions are “overused”… And are NOT that good. For example in PHP using just common str_replace(); would be in many cases much faster then preg_replace();
Problem is that people tend to use a lot of “sophisticated” code where it’s not necessary, only because it looks more “professional”. My thought on this is “I don’t care what others think - I just need my code to be fast, clean and as simple as it is possible”.
February 11th, 2008 at 01:32
True.
Sometimes regular expressions are extremely cryptic (like the one above in fact
) but they are also interesting to understand because they are essentially small programs written in a domain specific language.
February 19th, 2008 at 04:33
The magic is in the backtrack…. if it can find a number a=(11+?) that multiplied N times gives our number, it’s obvious that our number can’t be prime (our number = a*N, i.e. not prime)
February 27th, 2008 at 13:55
[...] More details on Avinash Meetoo’s Blog. [...]
March 26th, 2008 at 02:19
How many numbers does this count up to? whats the largest prime that can be found in *real* time?
I’m interested so mail me please.
March 26th, 2008 at 23:24
Hi Roscoe,
I’ve not done any performance test but I guess that if you use a compiled language like C with a good Regex implementation, then, maybe, you can generate prime numbers real fast.
On the other hand, if you really want to generate primes then have a look at The Sieve of Eratosthene.
April 2nd, 2008 at 23:38
Avinash. Super, I’ve been looking around for a Regex similar to this for a while. I’ve modified it to suit.
April 3rd, 2008 at 19:32
Hi Paul,
Great to know that you find this regular expression useful. I’d like to point out that I’m not the original author though
April 27th, 2008 at 03:22
if primenumber mod 2 is 0 then
isprime = true
wouldnt that be easier and faster? ^^
April 27th, 2008 at 03:30
sorry, its getting late here heh
if primenumber mod 2 != 0 then
isprime = true
April 27th, 2008 at 07:20
Hi Rasmus,
A prime number is not simply odd. Check this.