Huzaifa Rasheed

Software Engineer

Regex? The Minimum You Need To Know.

Posted on April 7, 2021

Repost of https://dev.to/rhuzaifa/regex-the-minimum-you-need-to-know-496l

You might have noticed that when you fill out an email or password field in an online form, you sometimes get validation errors like "email must be valid" or "password must be 8 characters long". Something like this 👇

Regex Use Example

Validations like these are places where regex is used.

What Is Regex?

Regex is short for Regular Expressions.

A regular expression is a sequence of characters that specifies a search pattern. String-searching algorithms mostly use these patterns to find, or find and replace, characters in text. That makes them a good fit for validation, which is what you will mostly be using regex for.

How To Use Regex?

You can use different functions to match a regex against your data. In PHP, the functions that work with regex mostly start with preg_ (see Regex for Php), while in JavaScript regexes are objects with their own methods. You can see some of the regex functions for JavaScript in the Javascript Regex Guide.
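To make the JavaScript side concrete, here is a minimal sketch of the most common regex calls. The sample text and variable names are made up for illustration; only the built-in RegExp and String methods are real.

```javascript
// A regex literal: one or more digits, with the g flag to find every match.
const pattern = /\d+/g;
// The same pattern built from a string with the RegExp constructor.
const viaConstructor = new RegExp("\\d+", "g");

const text = "Order 66 shipped 3 items"; // illustrative sample text

// test() answers "does it match at all?"
const hasDigits = /\d+/.test(text);
// match() with the g flag returns every matching substring.
const found = text.match(pattern);
// replace() with the g flag swaps out every match.
const masked = text.replace(viaConstructor, "#");

console.log(hasDigits, found, masked);
```

For validation you will mostly reach for test(), since it just returns true or false.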

Basic Validations

There are a number of online regex testers that you can use to quickly try out your regex. I mostly use Regex101 because I like it. The following examples are tested on Regex101.

Email

/^(([^<>()[\]\\.,;:\s@"]+(\.[^<>()[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/

Email Validation
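As a rough sketch, the email pattern above could be wrapped in a small JavaScript helper like this. The function name isValidEmail and the sample addresses are my own placeholders, not part of any library.

```javascript
// The email regex from above, used as-is.
const EMAIL_RE = /^(([^<>()[\]\\.,;:\s@"]+(\.[^<>()[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/;

// Hypothetical helper: returns true when the value looks like an email.
function isValidEmail(value) {
  return EMAIL_RE.test(value);
}

console.log(isValidEmail("user@example.com"));  // well-formed address
console.log(isValidEmail("user@example"));      // no top-level domain
console.log(isValidEmail("user@@example.com")); // two @ signs
```

Keep in mind that a regex can only check the shape of an address, not whether the mailbox actually exists.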

Password

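Minimum eight characters, at least one uppercase letter, one lowercase letter, and one number. Note that this pattern allows letters and digits only, so a password containing a special character will fail.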

/^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)[a-zA-Z\d]{8,}$/

Password Validation

For more password validations, check out the answer on Stack Overflow.
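Here is a small sketch of the password pattern above in JavaScript. The helper name isValidPassword and the sample passwords are placeholders I made up for illustration.

```javascript
// Three lookaheads check for a lowercase letter, an uppercase letter, and a
// digit; the final class then requires 8 or more letters/digits in total.
const PASSWORD_RE = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)[a-zA-Z\d]{8,}$/;

// Hypothetical helper: returns true when the password satisfies the pattern.
function isValidPassword(value) {
  return PASSWORD_RE.test(value);
}

console.log(isValidPassword("Passw0rd"));  // meets every rule
console.log(isValidPassword("password1")); // no uppercase letter
console.log(isValidPassword("Pass1"));     // too short
console.log(isValidPassword("Passw0rd!")); // "!" is outside [a-zA-Z\d]
```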

Numbers Only

To match zero or more digits (this also accepts an empty string): /^[0-9]*$/

Numbers 0 or many

To match one or more digits: /^[0-9]+$/

Numbers 1 or many

To match exactly one digit: /^[0-9]$/

Numbers exactly 1

If you add one more digit to the test string, the validation will fail.
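The difference between the three number patterns is easiest to see side by side. This is just a quick sketch; the constant names are my own.

```javascript
const ZERO_OR_MORE_DIGITS = /^[0-9]*$/;
const ONE_OR_MORE_DIGITS = /^[0-9]+$/;
const EXACTLY_ONE_DIGIT = /^[0-9]$/;

// Run each sample through all three patterns.
const numberChecks = ["", "7", "42", "4a2"].map((sample) => ({
  sample,
  zeroOrMore: ZERO_OR_MORE_DIGITS.test(sample),
  oneOrMore: ONE_OR_MORE_DIGITS.test(sample),
  exactlyOne: EXACTLY_ONE_DIGIT.test(sample),
}));

console.log(numberChecks);
```

The empty string only passes the * version, "42" fails the exactly-one version, and "4a2" fails all three because of the letter.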

Phone Numbers Only

This one is a little harder to generalize, because different countries have different phone number formats, country codes, and so on.

For my own number, which is an 11-digit Pakistani phone number, I use the following regex: /^((\+92)|(0092))-{0,1}\d{3}-{0,1}\d{7}$|^\d{11}$|^\d{4}-\d{7}$/

Phone Number validation country code

It validates the phone number with and without the country code. For example, all of the following pass:

  • 00923000000000
  • +923000000000
  • 03000000000
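The three numbers above can be checked with a short JavaScript sketch. The helper name isValidPkPhone is a placeholder, and the last two samples are extra cases I added to show the hyphenated form and a failure.

```javascript
// The Pakistani phone number regex from above, used as-is.
const PK_PHONE_RE = /^((\+92)|(0092))-{0,1}\d{3}-{0,1}\d{7}$|^\d{11}$|^\d{4}-\d{7}$/;

// Hypothetical helper: returns true when the value fits one of the formats.
function isValidPkPhone(value) {
  return PK_PHONE_RE.test(value);
}

const phoneSamples = [
  "00923000000000", // 0092 country code
  "+923000000000",  // +92 country code
  "03000000000",    // local 11-digit form
  "0300-0000000",   // local form with a hyphen
  "3000000",        // too short, should fail
];

for (const sample of phoneSamples) {
  console.log(sample, isValidPkPhone(sample));
}
```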

For a good reference, I would suggest checking out Google’s Library for validating international phone numbers. It is not regex, but it gets the job done if you are in a hurry.

Characters Only

To match one or more alphabetic characters (A-Z, a-z): /^[A-Za-z]+$/

Characters only 1 or many

  • To match zero or more alphabetic characters you can use /^[A-Za-z]*$/
  • To match exactly one alphabetic character you can use /^[A-Za-z]$/

Of course, there are other combinations you can explore.
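One thing worth seeing in practice is what [A-Za-z] does not cover. A quick sketch, with sample strings I picked for illustration:

```javascript
const LETTERS_ONLY = /^[A-Za-z]+$/;

// Pair each sample with the result of the letters-only check.
const letterChecks = ["Hello", "Hello World", "Hello123", "café"].map(
  (sample) => [sample, LETTERS_ONLY.test(sample)]
);

console.log(letterChecks);
```

Only "Hello" passes. The space, the digits, and the accented é are all outside the A-Z and a-z ranges, so keep that in mind for names.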

URL Matching

The following will match most of the URLs you are likely to want:
^(?:(?:https?|ftp):\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,}))\.?)(?::\d{2,5})?(?:[/?#]\S*)?$

URL Matching example

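This does accept a port number in the usual position after the host (the (?::\d{2,5})? part), so https://regex101.com:9000/?page=1 matches. However, it will not match a malformed URL like https://regex101:9000.com/?page=1, where the port sits in the middle of the host name. So you should have a general idea of what type of URL you want to match.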
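Here is a rough sketch of trying the URL pattern in JavaScript. The pattern only lists lowercase a-z, so I am assuming the i flag for case-insensitive matching; the helper name isValidUrl is also a placeholder of mine.

```javascript
// The URL regex from above, wrapped in slashes with an assumed i flag.
const URL_RE = /^(?:(?:https?|ftp):\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,}))\.?)(?::\d{2,5})?(?:[/?#]\S*)?$/i;

// Hypothetical helper: returns true when the value matches the URL pattern.
function isValidUrl(value) {
  return URL_RE.test(value);
}

console.log(isValidUrl("https://regex101.com/?page=1"));      // ordinary URL
console.log(isValidUrl("ftp://example.org"));                 // ftp scheme is allowed
console.log(isValidUrl("https://regex101:9000.com/?page=1")); // the failing URL from the text
console.log(isValidUrl("not a url"));                         // no scheme at all
```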

I suggest looking at Regex For URL, which has different implementations for PHP and JS that you can play with.

Which Language Supports Regex?

Almost every major language has support for regex.

To be more precise, an implementation of regex functionality is called a regex engine, and a number of engines are available as reusable libraries that different languages build on.

Regex syntax may vary slightly between languages but for the most part, it is the same.

Now Some Theory.

The fun is over, folks. Now we move on to the theory. Jokes aside, most of you don’t need to know everything about regex in detail. It is the same as with a programming language: we can’t learn everything about it.

But you should know that 👇

A Regex can have

  • Tokens
  • Anchors
  • Meta Sequences
  • Quantifiers
  • Group Constructs
  • Character Classes
  • Flag/Modifiers
  • Substitution
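To see a few of these pieces working together, here is one small made-up pattern in JavaScript. The pattern and sample string are only for illustration.

```javascript
// ^ and $        -> anchors
// [a-z]          -> character class
// \d             -> meta sequence
// + and {4}      -> quantifiers
// ( ... )        -> group constructs (capturing groups)
// i              -> flag/modifier (case-insensitive)
const partsPattern = /^([a-z]+)-(\d{4})$/i;

const partsMatch = "Regex-2021".match(partsPattern);
const word = partsMatch[1]; // first captured group
const year = partsMatch[2]; // second captured group

// Substitution: $1 and $2 refer back to the captured groups.
const swapped = "Regex-2021".replace(partsPattern, "$2-$1");

console.log(word, year, swapped);
```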

If you want to learn and practice them at the same time, then I would again suggest going to Regex101. It has a good reference for the regex operators. 👇

Regex101 regex reference

And I think I don’t need to explain more 😉

Fun Part While Writing The Article

I found a regex on StackOverflow that matches non-ASCII characters, and it matched every word I tried in any language. Even in my native language, Urdu, it matched اِسلامی جمہوریہ پاكِستان

Regex that matches every word of every language

I don’t know if it’s useful or not, but I will add it in the comments.

Tip

More like a best practice. You should do regex validations on both the frontend and backend of your code. That way, if someone bypasses or manipulates the frontend checks, your backend will still reject the invalid data.

Note: You can also use HTML5 Input Types for validating some form fields on the frontend, but for the backend you need regex.
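One way to keep both sides in sync is to keep the patterns in one place and run the same check on the frontend and the backend. This is only a sketch: the field names, the RULES object, and the validatePayload function are all hypothetical, not from any framework.

```javascript
// Hypothetical rules, reusing patterns from earlier in the article.
const RULES = {
  username: /^[A-Za-z]+$/,
  pin: /^[0-9]+$/,
  password: /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)[a-zA-Z\d]{8,}$/,
};

// Returns the names of the fields that are missing or fail their pattern.
function validatePayload(payload) {
  const invalidFields = [];
  for (const [field, pattern] of Object.entries(RULES)) {
    const value = payload[field];
    if (typeof value !== "string" || !pattern.test(value)) {
      invalidFields.push(field);
    }
  }
  return invalidFields;
}

// On the backend you would reject the request when the list is not empty.
console.log(validatePayload({ username: "huzaifa", pin: "1234", password: "Passw0rd" }));
console.log(validatePayload({ username: "huz4ifa", pin: "", password: "short" }));
```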


Conclusion

So, have you used regex for a complex match before, or did you just get to know about it? Also, give a 💖 or a 🦄 if you like the article.