Regular Expression Match using C++

This is my first blog post. I have been thinking about writing for a long time but could not get an idea about what to write. Thus today after a lot a thought, I am writing about Regular Expression matcher using C++, one of the very famous questions on leetcode. This solution uses the Dynamic Programming approach.

The problem statement:

“Given an input string () and a pattern (), implement regular expression matching with support for and where:

Matches any single character.​​​​

Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).”

Example Testcases:

Example Testcases from leetcode

The Approach:

To approach this problem, think as if how would you solve it if you were to do it the way you would have done in your Theory Of Computation classes. You would look at the pattern and the string, scan it and then apply all the rules to check whether the string satisfies the regex or not. If encountered a ‘*’, you would probably check the preceding character from the ‘*’ and compare it with the string. If it matches, then string satisfies. Similarly, if encountered a ‘.’, it means that any single character can satisfy it. The key point to use Dynamic Programming is to break down the problem into smaller sub problems and use the solutions to these sub problems to get the optimal solution to the main problem.

Here, we will make a 2D vector A with size (p_len+1 * s_len+1), p_len representing the length of the pattern and s_len representing the length of the string.

The rules for regular expression matching are:

case:
p[i] == s[j] : check diagonally up because if the p[i-1] == s[j-1], string is being satisfied by the regex.
p[i] != s[j] and p[i] != ‘*’ and p[i] != ‘.’ : False
p[i] == ‘.’ : check diagonally up. This field is generally true because ‘.’ can satisfy any one character.
p[i] == ‘*’: check 2 rows up being in the same column as ‘*’ matches zero or more of the preceding element ;
if(p[i-1] = s[j] || p[i-1] = ‘.’): check 2 rows up as well as one column left(being in the same row) because if there is a match, then ‘*’ also causes a match.
For the 0th column, if(p[i-1] = ‘*’): check 2 rows above. false otherwise.

a 2D vector A with size(p_len+1 * s_len+1)
a 2D vector A with size(p_len+1 * s_len+1)
a 2D vector A with size(p_len+1 * s_len+1)

Note that the first column and the first row are NOT the part of the string, and this is the reason that vector A is of size(p_len+1 * s_len+1). Since blank string satisfy a blank regex, this is why A[0][0] is True.

The whole vector A is filled referring to the above mentioned rules and the last index of this 2 D vector determines whether there is a match or not.

The code:

The Match function

I would also like to thank the pepcoding youtube channel.

I hope this article would have helped you.

Enjoy and stay tuned! Let the code begin!