Tuesday, August 26, 2008

Using Back References with String replaceAll method

This is a small problem I was facing the other day. I couldn't find much information about it on the web, so I thought I would blog about it.

The problem is how to use back references in Java regular expressions.

Say I have a String like so:

"orderM8orderA3orderX2NoReturn"

and I want to turn it into a String like so:

"order M#8 order A#3 order X#2 NoReturn"

I can do this:

String test = "orderM8orderA3orderX2NoReturn";

String replaced = test.replaceAll("([A-Z])([0-9])", " $1#$2 ");

What happens here is:

  1. First, create two regular expressions: [A-Z] matches any single capital letter and [0-9] matches any single digit.
  2. Next, put each of these in a group using ( ) brackets. Grouping means that the matched text is remembered and can be referenced from the replacement string.
  3. In the replacement string, reference the matches via the $n notation, where n is the number of the group.

So what happens is: the regular expression processor moves along the string looking for a capital letter immediately followed by a digit. When it finds one, it stores the capital letter in group 1 and the digit in group 2.
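
To make the grouping visible, here is a minimal sketch that walks the matches with a Matcher and prints what ended up in each group. The class name GroupDemo and the helper captured are made up for this illustration; Pattern, Matcher, find() and group(int) are the standard java.util.regex API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Hypothetical helper: returns "group1/group2" for every match in the input.
    static List<String> captured(String input) {
        Matcher m = Pattern.compile("([A-Z])([0-9])").matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            // group(1) is the capital letter, group(2) is the digit
            out.add(m.group(1) + "/" + m.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [M/8, A/3, X/2]
        System.out.println(captured("orderM8orderA3orderX2NoReturn"));
    }
}
```

Note that "NoReturn" produces no match, because neither of its capital letters is followed by a digit.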

So if I want to replace the original match with another string built from those groups, I can.

Also note: the whole match is automatically available as an implicit group 0, so $0 refers to everything the expression matched.

String replaced = test.replaceAll("([A-Z])([0-9])", " '$0' ");

will give

order 'M8' order 'A3' order 'X2' NoReturn
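
For anyone who wants to paste and run it, the two replacements above can be put together roughly like this. The class and method names (ReplaceAllDemo, numbered, quoted) are just placeholders I picked for the sketch; the regex and replacement strings are the ones from the post.

```java
class ReplaceAllDemo {
    // Uses $1 and $2 to rebuild each match as " <letter>#<digit> ".
    static String numbered(String input) {
        return input.replaceAll("([A-Z])([0-9])", " $1#$2 ");
    }

    // Uses $0, the implicit group holding the whole match.
    static String quoted(String input) {
        return input.replaceAll("([A-Z])([0-9])", " '$0' ");
    }

    public static void main(String[] args) {
        String test = "orderM8orderA3orderX2NoReturn";
        System.out.println(numbered(test)); // order M#8 order A#3 order X#2 NoReturn
        System.out.println(quoted(test));   // order 'M8' order 'A3' order 'X2' NoReturn
    }
}
```
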

IMPORTANT NOTE:

The Pattern javadoc describes back references as '\n' (where n = number), but that syntax only applies inside the regular expression itself. It does not work in the replacement string. There you need to use '$n', which is described in the Matcher javadoc rather than the Pattern one.
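
A small sketch contrasting the two notations, assuming a standard JDK: \1 is understood inside the regex argument, while in the replacement argument a backslash only escapes the next character, so \1 comes out as a literal 1. The class and method names are invented for the example, and the "hello hello world" input is not from the original problem.

```java
class BackrefSyntaxDemo {
    // \1 in the REGEX: matches a repeat of whatever group 1 captured,
    // so a doubled word is found and replaced by a single copy ($1).
    static String collapseRepeatedWord(String input) {
        return input.replaceAll("(\\w+) \\1", "$1");
    }

    // \1 in the REPLACEMENT: not a back reference. The backslash just
    // escapes the following character, leaving a literal "1".
    static String backslashInReplacement(String input) {
        return input.replaceAll("([A-Z])([0-9])", "\\1");
    }

    public static void main(String[] args) {
        System.out.println(collapseRepeatedWord("hello hello world"));
        // prints: hello world
        System.out.println(backslashInReplacement("orderM8orderA3orderX2NoReturn"));
        // prints: order1order1order1NoReturn
    }
}
```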

Pattern javadoc

Hope this helps :)

5 comments:

Xhiris said...

As of Groovy 1.6.3, this doesn't seem to work.

You get the error "illegal string body character after dollar sign [...]

Use single quotes (') or slash (/) to delimit the replacement string and it works fine.

lopik said...

Thanks for your post, i was just struggling with \1 backreferences not working.. $1 works perfect for me. (using String.replaceAll(String,String) )

Anonymous said...

Helped a lot. Thanks

dracularKing said...

thanks, it's pretty useful to me

drac

Anonymous said...

\1 backreferences work in the regex argument, not in the replacement argument.