Tuesday, August 26, 2008

Using Back References with String replaceAll method

This is a small problem I was facing the other day. I couldn't find much information about it on the web, so I thought I would blog about it.

The problem is how to use back references in Java regular expressions.

Say I have a String like so:

"orderM8orderA3orderX2NoReturn"

and I want to turn it into a String like so:

"order M#8 order A#3 order X#2 NoReturn"

I can do this:

String test = "orderM8orderA3orderX2NoReturn";

String replaced = test.replaceAll("([A-Z])([0-9])", " $1#$2 ");

What happens here is:

  1. First, create two regular expressions: [A-Z] matches any single capital letter and [0-9] matches any single digit.
  2. Next, put each of these in a group using ( ) brackets. Grouping means that the matched text is remembered and can be referenced by the replacement string.
  3. In the replacement string, reference the matches via the $n notation, where n is the number of the group.

So what happens is: the regular expression processor moves along the string looking for a capital letter immediately followed by a digit. When it finds one, it stores the capital letter in group 1 and the digit in group 2.
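
To make the group bookkeeping visible, here is a minimal sketch that walks the same string with a Matcher and lists what ends up in each group. The class and method names (GroupWalkDemo, describeMatches) are made up for illustration; only the pattern and the test string come from the post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupWalkDemo {
    // Same pattern as in the post: group 1 = capital letter, group 2 = digit.
    static final Pattern LETTER_DIGIT = Pattern.compile("([A-Z])([0-9])");

    // Hypothetical helper: records "whole match -> group 1, group 2" per match.
    static List<String> describeMatches(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = LETTER_DIGIT.matcher(input);
        while (m.find()) {
            // group(0) is the whole match; group(1) and group(2) are the ( ) groups.
            found.add(m.group(0) + " -> " + m.group(1) + ", " + m.group(2));
        }
        return found;
    }

    public static void main(String[] args) {
        String test = "orderM8orderA3orderX2NoReturn";
        // Lists the three matches: M8, A3 and X2.
        System.out.println(describeMatches(test));
        // The replacement string refers to the same groups with $1 and $2.
        System.out.println(test.replaceAll("([A-Z])([0-9])", " $1#$2 "));
    }
}
```

Note that "NoReturn" is left alone because none of its capital letters is followed by a digit.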

So if I want to replace the original match with another string built from those groups, I can.

Also note: the whole match is automatically available as an implicit group zero ($0), which covers the whole expression.

String replaced = test.replaceAll("([A-Z])([0-9])", " '$0' ");

will give

order 'M8' order 'A3' order 'X2' NoReturn

IMPORTANT NOTE:

The Pattern javadoc describes back references as '\n' (where n = number), but that syntax only works inside the regular expression itself. It does not work in the replacement string; there you need to use '$n', as described in the Matcher.replaceAll javadoc.
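
The difference between the two notations is easy to trip over, so here is a small sketch of it, assuming standard java.util.regex behaviour. The class and method names are hypothetical and the inputs are just examples.

```java
import java.util.regex.Matcher;

class BackslashVsDollarDemo {
    // In the PATTERN, \1 means "the same text that group 1 just matched".
    // This collapses a doubled lowercase letter into a single one.
    static String collapseDoubledLetters(String s) {
        return s.replaceAll("([a-z])\\1", "$1");
    }

    // In the REPLACEMENT, a backslash only escapes the next character,
    // so the two-character replacement \1 produces a literal "1", not group 1.
    static String backslashInReplacement(String s) {
        return s.replaceAll("([A-Z])([0-9])", "\\1");
    }

    // Matcher.quoteReplacement makes '$' and '\' literal in a replacement,
    // which helps when the replacement text should contain a real dollar sign.
    static String literalDollar(String s) {
        return s.replaceAll("[0-9]", Matcher.quoteReplacement("$1"));
    }

    public static void main(String[] args) {
        System.out.println(collapseDoubledLetters("hello"));   // helo
        System.out.println(backslashInReplacement("orderM8")); // order1
        System.out.println(literalDollar("a1b"));              // a$1b
    }
}
```

So \n and $n both refer to captured groups, but \n belongs in the regex argument and $n in the replacement argument.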

Pattern javadoc

Hope this helps :)

7 comments:

Xhiris said...

As of Groovy 1.6.3, this doesn't seem to work.

You get the error "illegal string body character after dollar sign [...]

Use single quotes (') or slash (/) to delimit the replacement string and it works fine.

lopik said...

Thanks for your post, i was just struggling with \1 backreferences not working.. $1 works perfect for me. (using String.replaceAll(String,String) )

Anonymous said...

Helped a lot. Thanks

dracularKing said...

thanks, it's pretty useful to me

drac

Anonymous said...

\1 backreferences work in the regex argument, not in the replacement argument.

Chris Treber said...

Excellent; had the same problem (\n does NOT work)

ben ingledew said...

Thanks, used your blog to refresh my memory on using backreferences. Replaced about 20 lines of code with about 2.