Regex in JavaScript
In JavaScript, regular expressions are implemented as their own type of object, the RegExp object. These objects store patterns and options and can then be used to test and manipulate strings.
To start playing with regular expressions, the easiest thing to do is to open a JavaScript console and experiment with the values. The easiest way to get a console is to open a browser, such as Chrome, and then open the JavaScript console on any page (press Command + Option + J on a Mac or Ctrl + Shift + J on Windows or Linux).
Let's start by creating a simple regular expression; we haven't yet gotten into the specifics of the different special characters involved, so for now, we will just create a regular expression that matches a word. For example, we will create a regular expression that matches hello.
The RegExp constructor
Regular expressions can be created in two different ways in JavaScript, similar to the ones used in strings. There is a more explicit definition, where you call the constructor function and pass it the pattern of your choice (and optionally any settings as well), and then, there is the literal definition, which is a shorthand for the same process. Here is an example of both (you can type this straight into the JavaScript console):
var rgx1 = new RegExp("hello");
var rgx2 = /hello/;
Both these variables are essentially the same; which one you use is largely a matter of personal preference. The only real difference is that with the constructor you pass the pattern as a string; therefore, you have to escape any backslashes and other special characters beforehand, so that they get through to the regular expression intact.
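To make the escaping point concrete, here is a minimal sketch. It assumes the \d token (a digit), which is used purely as an illustration, and the variable names are hypothetical:

```javascript
// The literal form takes the pattern exactly as written.
const literalDigits = /\d+/;

// The constructor takes a string, so the backslash itself must be escaped.
const constructedDigits = new RegExp("\\d+");

console.log(literalDigits.source);              // \d+
console.log(constructedDigits.source);          // \d+
console.log(constructedDigits.test("room 42")); // true

// With a single backslash, the string "\d+" collapses to "d+",
// so the pattern matches the letter d instead of a digit.
const accidental = new RegExp("\d+");
console.log(accidental.source);                 // d+
console.log(accidental.test("room 42"));        // false
```

The source property shows the pattern that actually reached the regular expression, which makes it a handy way to check whether the escaping worked.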
Besides a pattern, both forms accept a string of flags: the constructor takes it as a second parameter, and the literal takes it right after the closing slash. Flags are like settings or properties, which are applied on the entire expression and can therefore change the behavior of both the pattern and its methods.
Using pattern flags
The first flag I would like to cover is the ignore case or i flag. Standard patterns are case sensitive, but if you have a pattern that can be in either case, this is a good option to set, allowing you to specify only one case and have the modifier adjust this for you, keeping the pattern short and flexible.
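As a small sketch of the difference the i flag makes (the variable names are illustrative only):

```javascript
const caseSensitive = /hello/;
const caseInsensitive = /hello/i;

// Without the flag, the capital H prevents a match.
console.log(caseSensitive.test("Hello World"));   // false

// With the flag, one lowercase pattern covers every casing.
console.log(caseInsensitive.test("Hello World")); // true
console.log(caseInsensitive.test("HELLO"));       // true
```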
The next flag is the multiline or m flag, and this makes JavaScript treat each line in the string as essentially the start of a new string. So, for example, you could say that a string must start with the letter a. Usually, JavaScript would test to see if the entire string starts with the letter a, but with the m flag, it will test this constraint against each line individually, so any of the lines can pass this test by starting with a.
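The "starts with a" example can be sketched as follows. It assumes the ^ anchor, which marks the start of the string (or, with the m flag, the start of each line); the sample text is made up for illustration:

```javascript
// Two lines: the first starts with b, the second with a.
const twoLines = "banana\napple";

const startsWithA = /^a/;
const anyLineStartsWithA = /^a/m;

// Without m, only the very beginning of the string is checked.
console.log(startsWithA.test(twoLines));        // false

// With m, the start of every line is checked, so "apple" passes.
console.log(anyLineStartsWithA.test(twoLines)); // true
```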
The last flag we will cover is the global or g flag. Without this flag, the RegExp object only checks whether there is a match in the string, returning on the first one that's found; however, in some situations, you don't just want to know whether the string matches, you want to know about all the matches specifically. This is where the global flag comes in: when it's used, it modifies the behavior of the different RegExp methods to allow you to get to all the matches, as opposed to only the first.
So, continuing from the earlier hello example, if we wanted to create the same pattern, but this time case insensitive and with the global flag set, we would write something similar to this:
var rgx1 = new RegExp("hello", "gi");
var rgx2 = /hello/gi;
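One way to confirm which flags are set, and to see what the g flag changes, is sketched below. It uses the global, ignoreCase, and multiline properties of the RegExp object, along with the String.match method that is covered later in this section; the sample string is hypothetical:

```javascript
const rgxFlags = new RegExp("hello", "gi");

// Each flag is exposed as a read-only boolean property.
console.log(rgxFlags.global);     // true
console.log(rgxFlags.ignoreCase); // true
console.log(rgxFlags.multiline);  // false

const greeting = "Hello hello HELLO";

// Without g, only the first match is reported.
console.log(greeting.match(/hello/i).length);  // 1

// With g, every match is reported.
console.log(greeting.match(/hello/gi).length); // 3
```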
Using the rgx.test method
Now that we have created our regular expression objects, let's use their simplest method, test. The test method only returns true or false, based on whether a string matches a pattern or not. Here is an example of it in action:
> var rgx = /hello/;
undefined
> rgx.test("hello");
true
> rgx.test("world");
false
> rgx.test("hello world");
true
As you can see, the first string matches and returns true, the second string does not contain hello, so it returns false, and finally the last string matches the pattern. In the pattern, we did not specify that the string had to contain only hello, so it matches the last string and returns true.
Using the rgx.exec method
The next method on the RegExp object is exec, which, instead of just checking whether the pattern matches the text or not, also returns some information about the match. For this example, let's create another regular expression and look at what exec returns:
> var rgx = /world/;
undefined
> rgx.exec("world !!");
[ 'world' ]
> rgx.exec("hello world");
[ 'world' ]
> rgx.exec("hello");
null
As you can see here, the result from the function contains the actual match as the first element (rgx.exec("world !!")[0]), and if you console.dir the results, you will see that it also contains two properties, index and input, which store the starting index of the match and the complete input text, respectively. If there are no matches, the function returns null.
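The index and input properties can also be read directly, as in this minimal sketch (the variable names are illustrative):

```javascript
const worldRgx = /world/;
const result = worldRgx.exec("hello world");

console.log(result[0]);    // world
console.log(result.index); // 6
console.log(result.input); // hello world

// exec returns null when nothing matches, so check before reading properties.
const noMatch = worldRgx.exec("hello");
console.log(noMatch);      // null
```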
The string object and regular expressions
Besides these two methods on the RegExp object itself, there are a few methods on the string object that accept the RegExp object as a parameter.
Using the String.replace method
The most commonly used method is the replace method. As an example, let's say we have the foo foo string and we want to change it to qux qux. Using replace with a string as the search value would only switch the first occurrence, giving qux foo.
In order to replace all the occurrences, we need to supply a RegExp object that has the g flag, such as /foo/g.
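A minimal sketch of the two calls described above, using the foo foo string from the text (the variable names are illustrative):

```javascript
const original = "foo foo";

// A string search value replaces only the first occurrence.
const firstOnly = original.replace("foo", "qux");
console.log(firstOnly); // qux foo

// A RegExp with the g flag replaces every occurrence.
const all = original.replace(/foo/g, "qux");
console.log(all);       // qux qux

// replace returns a new string; the original is left untouched.
console.log(original);  // foo foo
```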
Using the String.search method
Next, if you just want to find the (zero-based) index of the first match in a string, you can use the search method:
> str = "hello world";
"hello world"
> str.search(/world/);
6
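It is also worth knowing what search does when nothing matches: it returns -1 rather than null. A short sketch, with hypothetical sample values:

```javascript
const sentence = "hello world";

console.log(sentence.search(/world/));  // 6

// Flags on the pattern still apply.
console.log(sentence.search(/WORLD/i)); // 6

// No match is reported as -1.
console.log(sentence.search(/planet/)); // -1
```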
Using the String.match method
The last method I want to talk about right now is the match method. Without the g flag, it returns the same output as the exec method we saw earlier (including the index and input properties); with the g flag set, it returns a regular Array of all the matches.
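The two behaviors of match can be sketched like this, assuming a made-up sample string:

```javascript
const phrase = "foo bar foo";

// Without g: same shape as exec, with index and input properties.
const single = phrase.match(/foo/);
console.log(single[0]);    // foo
console.log(single.index); // 0
console.log(single.input); // foo bar foo

// With g: a plain array of every match, without the extra properties.
const every = phrase.match(/foo/g);
console.log(every.length); // 2
console.log(every.index);  // undefined

// As with exec, no match at all gives null.
console.log(phrase.match(/qux/g)); // null
```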
We have taken a quick pass through the most common uses of regular expressions in JavaScript (code-wise), so we are now ready to build our RegExp testing page, which will help us explore the actual syntax of Regex without combining it with JavaScript code.