I am horrible at writing regular expressions but, like any developer, when I need one I want a good one. Today I needed one to validate North American phone numbers. It had to accept the many different ways users format a number and also allow for extensions.
In the following blog post by Chris Nanney, I finally found the perfect one:
I also used this site to check that it worked as I expected: http://regex101.com/
I’ve been in the position of having to take an unnormalized database that had virtually no data validation or standardization in place, and migrating it to a normalized schema. I used regex to help me through the process.
This post will deal specifically with phone numbers. The data I was importing had many problems: First, there was no standard formatting—some numbers were stored (xxx) xxx-xxxx, some xxx-xxx-xxxx, some xxx.xxx.xxxx, etc. Second, there wasn’t a separate field for extensions—they were just tacked on the end by either ext., EXT, x, Ex, or some variation. If there were only 20 numbers or so you could just fix them by hand, but you need an automated process to deal with say, 15,000.
via Cleaning Phone Numbers with Regular Expressions : Code : Chris Nanney.
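To make the problem described above concrete, here is a minimal Python sketch of that kind of cleanup. It is not the regex from Chris's post. The pattern, the `normalize_phone` name, and the choice of xxx-xxx-xxxx as the output format are all assumptions made for illustration, and the pattern only covers the formats and extension markers mentioned in the excerpt.

```python
import re

# Hypothetical pattern for illustration only (not the one from the linked post).
# It accepts (xxx) xxx-xxxx, xxx-xxx-xxxx, xxx.xxx.xxxx, an optional leading 1,
# and an optional extension introduced by ext., EXT, x, Ex, and similar.
PHONE_RE = re.compile(
    r"""
    ^\s*
    (?:\+?1[\s.-]*)?                        # optional country code
    \(?(\d{3})\)?[\s.-]*                    # area code, optionally in parentheses
    (\d{3})[\s.-]*                          # exchange
    (\d{4})                                 # line number
    (?:[\s,]*(?:ext\.?|ex\.?|x)\s*(\d+))?   # optional extension
    \s*$
    """,
    re.VERBOSE | re.IGNORECASE,
)


def normalize_phone(raw):
    """Return (number, extension) or None if raw does not match the pattern.

    number is formatted as xxx-xxx-xxxx; extension is a string of digits,
    or None when no extension was present.
    """
    match = PHONE_RE.match(raw)
    if match is None:
        return None
    area, exchange, line, ext = match.groups()
    return f"{area}-{exchange}-{line}", ext
```

With this sketch, `normalize_phone("555.123.4567 ext. 89")` returns `("555-123-4567", "89")`, and an input such as `"12345"` returns `None`, so rows that fail to match can be set aside for manual review rather than silently imported.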