Tim Maxey .NET Technology Blog & Resources

RegEx RegEx.Replace Regular Expressions Explained

Ok, gonna try to explain this, for me and for you:

I needed to strip apart a strap number like U-22-29-5J5-0000.20 and pass it to the property appraiser's site like: 29225J5000020U

Now how do you do that? One way is a Regular Expression Replace (Regex.Replace). First you write a pattern that describes the strap above, like this:

([0-9a-zA-Z]{1})([\w-]{1})([0-9a-zA-Z]{2})([\w-]{1})([0-9a-zA-Z]{2})([\w-]{1})([0-9a-zA-Z]{3})([\w-]{1})([0-9a-zA-Z]{4})([\w.]{1})([0-9a-zA-Z]{2})

Say what? Huh? Let's break it apart. The regex above is the "pattern" to look for. The first (), i.e. ([0-9a-zA-Z]{1}), matches a single number or letter. Note: each piece of the original strap (a run of characters or a separator) is represented by "something" in parentheses ().

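This first set ([0-9a-zA-Z]{1}) is the "U". Notice the [ ] and then the { }: what's inside the [ ] is normal regex matching for numbers and letters, and the {1} means exactly one character. The next character (a dash -) is represented by ([\w-]{1}). Careful here: \w on its own matches word characters (letters, digits and the underscore), not dashes or periods. It's the - inside the brackets that lets this set match the dash, and toward the end I use ([\w.]{1}), where the . inside the brackets matches the period in the strap string. That also means these separator sets are looser than they need to be (they would accept a letter or digit in that spot too); a plain (-) or (\.) would be stricter.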

So if you follow the above "pattern" it will match the original strap. We do this so we can replace and move stuff around. Each section (whatever is in one pair of parentheses) is a unit, and in our instance there are 11 sections, numbered left to right. Does this make sense? ([0-9a-zA-Z]{1}) is 1, ([\w-]{1}) is 2, ([0-9a-zA-Z]{2}) is 3, etc.

So our "replace" filter is represented by a $ then the "section" number, so in our case since we need U-22-29-5J5-0000.20  to look like 29225J5000020U, we need the 5th section first: $5, then we need the "22" which is the 3rd $3 and so forth until we get this: $5$3$7$9$11$1

Broken down:
U = 1
- = 2
22 = 3
- = 4
29 = 5
- = 6
5J5 = 7
- = 8
0000 = 9
. = 10
20 = 11
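
If you want to double-check the numbering before writing a replace string, you can ask .NET what landed in each section. This is just a sketch: the module name StrapGroups and the helper GroupValues are names I made up for illustration, and the pattern is the same one as above.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module StrapGroups
    ' Same 11-section pattern as in the post.
    Private Const StrapPattern As String =
        "([0-9a-zA-Z]{1})([\w-]{1})([0-9a-zA-Z]{2})([\w-]{1})([0-9a-zA-Z]{2})" &
        "([\w-]{1})([0-9a-zA-Z]{3})([\w-]{1})([0-9a-zA-Z]{4})([\w.]{1})([0-9a-zA-Z]{2})"

    ' Hypothetical helper: returns the text of sections 1..11
    ' (array index 0 holds $1, index 10 holds $11). Empty array if no match.
    Public Function GroupValues(strap As String) As String()
        Dim m As Match = Regex.Match(strap, StrapPattern)
        If Not m.Success Then Return New String() {}
        ' Groups(0) is the whole match, so there are Groups.Count - 1 sections.
        Dim values(m.Groups.Count - 2) As String
        For i As Integer = 1 To m.Groups.Count - 1
            values(i - 1) = m.Groups(i).Value
        Next
        Return values
    End Function

    Sub Main()
        Dim values As String() = GroupValues("U-22-29-5J5-0000.20")
        For i As Integer = 0 To values.Length - 1
            ' Prints one line per section, e.g. "$5 = 29"
            Console.WriteLine("$" & (i + 1).ToString() & " = " & values(i))
        Next
    End Sub
End Module
```

Groups(0) is always the whole match, which is why the loop starts at 1 and why $1 is the first set of parentheses.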

Now we can use this in a RegEx.Replace

This example is VB.NET but you could do it for whatever supports RegEx.Replace...

' Needs "Imports System.Text.RegularExpressions" at the top of the file
Dim strP As String = "([0-9a-zA-Z]{1})([\w-]{1})([0-9a-zA-Z]{2})([\w-]{1})([0-9a-zA-Z]{2})([\w-]{1})([0-9a-zA-Z]{3})([\w-]{1})([0-9a-zA-Z]{4})([\w.]{1})([0-9a-zA-Z]{2})"
Dim strR As String = "$5$3$7$9$11$1"
Dim mynewstrap As String = Regex.Replace("U-22-29-5J5-0000.20", strP, strR)

The variable mynewstrap now = 29225J5000020U
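
One thing to watch: Regex.Replace only rewrites the part of the input the pattern matches and hands everything else back untouched, so a string that isn't shaped like a strap comes back unchanged with no warning. Here is a rough sketch of a guard. The helper name ToAppraiserStrap and the ^ and $ anchors around the pattern are my additions for illustration, not part of the snippet above.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module StrapHelper
    ' The original pattern wrapped in ^...$ (an assumption on my part)
    ' so the whole input has to look like a strap.
    Private Const AnchoredPattern As String =
        "^([0-9a-zA-Z]{1})([\w-]{1})([0-9a-zA-Z]{2})([\w-]{1})([0-9a-zA-Z]{2})" &
        "([\w-]{1})([0-9a-zA-Z]{3})([\w-]{1})([0-9a-zA-Z]{4})([\w.]{1})([0-9a-zA-Z]{2})$"

    ' Hypothetical helper: returns Nothing when the input does not match.
    Public Function ToAppraiserStrap(strap As String) As String
        Dim m As Match = Regex.Match(strap, AnchoredPattern)
        If Not m.Success Then Return Nothing
        ' Match.Result expands the same $-style replace string for this match.
        Return m.Result("$5$3$7$9$11$1")
    End Function

    Sub Main()
        Console.WriteLine(ToAppraiserStrap("U-22-29-5J5-0000.20"))   ' 29225J5000020U
        Console.WriteLine(ToAppraiserStrap("not a strap") Is Nothing) ' True
    End Sub
End Module
```

Returning Nothing is just one choice; you could throw or return the input as-is, depending on what the calling page expects.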

Alternatively, if you needed the "new" strap to be 29/225J50000/20U (notice the forward slashes), how would you do that? Put the slashes straight into the replace string:

Dim strR As String = "$5/$3$7$9/$11$1"

Hope this helps. You can really do a lot with this stuff, mix and match, etc. Take the 5J5 for instance: that "section" ([0-9a-zA-Z]{3}) could be "broken" up if you needed the J in the front. The section would then become ([0-9a-zA-Z]{1})([0-9a-zA-Z]{1})([0-9a-zA-Z]{1}) and the $numbers would be:

5 = $7
J = $8
5 = $9

and the rest of the sections would now have different numbers

- = $10
0000 = $11
. = $12
20 = $13

Then if you needed this U-22-29-5J5-0000.20 to look like this J292255000020U, then the strR would look like this:

$8$5$3$7$9$11$13$1
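
Put together, the 13-section version looks something like this. It's a sketch using the same Regex.Replace call as before; the module name StrapVariants and the helper MoveJToFront are made-up names for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module StrapVariants
    ' 13 sections: the 5J5 piece is split into three one-character
    ' sections, which become $7, $8 and $9.
    Private Const Pattern13 As String =
        "([0-9a-zA-Z]{1})([\w-]{1})([0-9a-zA-Z]{2})([\w-]{1})([0-9a-zA-Z]{2})([\w-]{1})" &
        "([0-9a-zA-Z]{1})([0-9a-zA-Z]{1})([0-9a-zA-Z]{1})" &
        "([\w-]{1})([0-9a-zA-Z]{4})([\w.]{1})([0-9a-zA-Z]{2})"

    ' Hypothetical helper: moves the middle character of the 5J5 piece to the front.
    Public Function MoveJToFront(strap As String) As String
        Return Regex.Replace(strap, Pattern13, "$8$5$3$7$9$11$13$1")
    End Function

    Sub Main()
        Console.WriteLine(MoveJToFront("U-22-29-5J5-0000.20"))   ' J292255000020U
    End Sub
End Module
```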

Hopefully this all makes sense to me the next time I need it!!!




