Regex to get Everything Until a Specific Character is Found

Stripping an attribute from HTML can be time consuming if you do it manually. Since the value can change on each occurrence of the attribute, you cannot do a simple search and replace. What you need is a regular expression that searches for the attribute you want to remove along with anything between its quotes. That means you need a Regex that matches a string until a specific character is reached. In the case of an HTML attribute whose value between double quotes changes, you search for the part that does not change and let the Regex capture everything until it finds the second quote.

data-info="this is value 1"
data-info="thisIsValue2"

The Regex that matches all these strings is the following one. It takes the part that does not change and then matches every character that is not a double quote. The part of the Regex that does this is the square brackets starting with the ^ symbol: [^"] is a negated character class that matches any single character except the double quote. The star that follows the closing square bracket repeats it zero or more times, so the match continues until the closing double quote is found.

data-info=\"[^"]*\"
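Here is a minimal C# sketch of how this pattern could be used to strip the attribute. The class name AttributeStripper and the sample HTML are made up for the illustration, and the leading \s+ is my own addition so that the space in front of the attribute is removed too.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AttributeStripper
{
    // \s+     : the whitespace in front of the attribute (assumption, not in the pattern above)
    // [^""]*  : every character that is not a double quote ("" is a quote in a verbatim string)
    private static readonly Regex DataInfo = new Regex(@"\s+data-info=""[^""]*""");

    public static string Strip(string html)
    {
        // Replace every match with nothing, whatever the attribute value is.
        return DataInfo.Replace(html, string.Empty);
    }

    public static void Main()
    {
        var html = "<div data-info=\"this is value 1\" class=\"a\"><span data-info=\"thisIsValue2\">x</span></div>";
        Console.WriteLine(Strip(html));
        // prints: <div class="a"><span>x</span></div>
    }
}
```

Because the character class stops at the first double quote, each attribute is matched separately even when several of them are on the same line.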

How to Access Group by Name Instead of Index with Regex in C#

When you are searching with a Regex, you may have multiple groups and you may not want to rely on their position to access the information. The good news is that Regex allows you to name every group directly in the pattern. This is done by adding, right after the opening parenthesis of the group, a question mark followed by the less-than sign, the name, and the greater-than sign: (?<name>...).

const string PATTERN_WITHOUT_NAME = @"(parser)\((.*?)\)"; //3 groups: 0 is the whole match, 1 is "parser", 2 is the parameter
const string PATTERN_WITH_NAME = @"(parser)\((?<parameterName>.*?)\)"; //Also 3 groups, but the last one can be read by name

This code searches for everything inside the parentheses of the parser method. For example, with “This is parser(abc) and it is awesome” the result is “abc”. It also works when the string contains multiple calls to the parser method.

var regex = new Regex(PATTERN_WITH_NAME, RegexOptions.IgnoreCase);

var matches = regex.Matches(stringToSearch);
foreach (Match match in matches)
{
	var parameter = match.Groups["parameterName"].Value;
}
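To try it quickly, here is a self-contained sketch of the same loop. The class name ParserArguments, the Extract method and the test sentence are only there for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ParserArguments
{
    private const string PatternWithName = @"(parser)\((?<parameterName>.*?)\)";

    // Returns the content of the named group for every parser(...) found.
    public static List<string> Extract(string stringToSearch)
    {
        var regex = new Regex(PatternWithName, RegexOptions.IgnoreCase);
        var values = new List<string>();
        foreach (Match match in regex.Matches(stringToSearch))
        {
            values.Add(match.Groups["parameterName"].Value);
        }
        return values;
    }

    public static void Main()
    {
        var found = Extract("This is parser(abc) and it is awesome, like Parser(def).");
        Console.WriteLine(string.Join(",", found));
        // prints: abc,def
    }
}
```

The second call is found even with a capital P because of RegexOptions.IgnoreCase.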

Something that can also be interesting is to replace the named group matched by your Regex. This can be done with the MatchEvaluator delegate that the Regex Replace method accepts. This delegate is called every time a match is found. From there, you must return the string that will replace the match.

transformedString = regex.Replace(originalContent, match =>
{
	var group = match.Groups["parameterName"];
	if (group.Success) //Groups[...] never returns null, so test Success instead
	{
		var parameter = group.Value;
		//Do your transformation logic here
		var transformedParameter = parameter.Trim(); //placeholder transformation
		return string.Format("parser({0})", transformedParameter);
	}
	return match.Value;
});

In this example, every time the Regex pattern matches, the delegate is called. We get the parameter, can transform it, and return the string that replaces the whole match. It is important to understand that the returned value always replaces the entire match, never a single group. For example, if you want to modify only parameterName and not the rest of the parser call, you cannot return just the new parameter: you have to return the rebuilt parser call, as the string.Format above does.
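If you prefer not to hard-code the text around the group, one possible workaround is to rebuild the match from the position of the group. This is only a sketch: the class name GroupOnlyReplace and the upper-case transformation are made up for the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupOnlyReplace
{
    private static readonly Regex Parser =
        new Regex(@"(parser)\((?<parameterName>.*?)\)", RegexOptions.IgnoreCase);

    public static string UpperCaseParameters(string originalContent)
    {
        return Parser.Replace(originalContent, match =>
        {
            var group = match.Groups["parameterName"];
            // Group.Index is relative to the whole input, so make it relative to the match.
            int start = group.Index - match.Index;
            string before = match.Value.Substring(0, start);
            string after = match.Value.Substring(start + group.Length);
            // Only the group changes; the text around it is returned untouched.
            return before + group.Value.ToUpperInvariant() + after;
        });
    }

    public static void Main()
    {
        Console.WriteLine(UpperCaseParameters("This is Parser(abc) and parser(de)."));
        // prints: This is Parser(ABC) and parser(DE).
    }
}
```

The delegate still returns the whole match, but the original casing of Parser is preserved because the text before and after the group is copied as is.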

This article shows you how to use Regex to match and to replace a part of a string even if the replacement logic is complex. The MatchEvaluator gives you a lot of flexibility in how you handle the groups within the match.

How to convert JavaScript array access with parentheses to square brackets?

It can happen in old projects that array objects are accessed with parentheses instead of square brackets.

For example, MyArray[0] is the first element of an array in JavaScript. But IE lets you use MyArray(0). This is not a good practice and other browsers do not accept this syntax.

To convert easily, you can use a regular expression. In my case, the array name was InTran.

InTran\({(.+)}\) //Find

InTran\[\1\] //Replace

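The curly brackets are required by the Visual Studio Find and Replace syntax to create a backreference, but they are not required by all Regex tools. This is the syntax of older versions of Visual Studio; since Visual Studio 2012 the Find and Replace dialog uses .NET regular expressions, where plain parentheses capture and $1 is the backreference.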
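The same conversion can be sketched in C# with the .NET syntax. The class name ArrayAccessConverter is made up, and I replaced .+ with [^()]+ as a precaution, because a greedy .+ can run to the last closing parenthesis of the line when the array is accessed twice.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ArrayAccessConverter
{
    // [^()]+ stops at the first closing parenthesis, so two accesses on the
    // same line are converted separately. Nested parentheses are not handled.
    private static readonly Regex ParenAccess = new Regex(@"\bInTran\(([^()]+)\)");

    public static string ToBrackets(string script)
    {
        // $1 is the .NET backreference to the first capturing group.
        return ParenAccess.Replace(script, "InTran[$1]");
    }

    public static void Main()
    {
        Console.WriteLine(ToBrackets("total = InTran(0) + InTran(i + 1);"));
        // prints: total = InTran[0] + InTran[i + 1];
    }
}
```

An index that itself contains a call, like InTran(getIndex()), is left untouched by this sketch and has to be fixed by hand.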

How to remove document.all from your projects?

Recently, I had to work on pages that contained a lot of code using the famous Internet Explorer 4 document.all JavaScript collection. It is not supported by all browsers and should not be used. You should use unique identifiers instead, but I couldn't because time was limited for the change.

We already used jQuery, so I knew that I could search by attribute.

The plan was to replace every document.all["XYZ"] with $('input[name="XYZ"]'). As you can see, the XYZ part changes in each file. The solution is to use the Replace tool of Visual Studio (or any other software that can replace with a Regex) with a regular expression.

//this
document.all\[\"{(.+)}\"\]
//to
$(\'input\[name="\1"\]\')

What it does is search for the string document.all["???"] and replace it with $('input[name="???"]'), where ??? is whatever was found by the search and is reused in the replacement. This way, the name changes every time it finds a new string with document.all.
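If you would rather run the replace from code than from the Visual Studio dialog, here is a rough C# equivalent with the .NET syntax. The class name DocumentAllConverter and the sample script are made up for the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DocumentAllConverter
{
    // Captures the name between the double quotes of document.all["..."].
    private static readonly Regex DocumentAll = new Regex(@"document\.all\[""([^""]+)""\]");

    public static string ToJQuery(string script)
    {
        // In a .NET replacement string, $$ is a literal dollar sign and $1 is the captured name.
        return DocumentAll.Replace(script, "$$('input[name=\"$1\"]')");
    }

    public static void Main()
    {
        Console.WriteLine(ToJQuery("document.all[\"firstName\"].focus();"));
        // prints: $('input[name="firstName"]').focus();
    }
}
```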

This simple replace is good for some situations but not for code like this:

document.all["???"].value, because in jQuery the value is read with val() and set with val('new value').

To do this correctly, two replaces are required.

The first one is for the setter of the value:

document.all\[\"{(.+)}\"\].value(:b)@={(.+)}; //Search
$(\'input\[name="\1"\]\').val(\2); //Replace

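The second one is for the getter of the value. The character in front of document.all, which must not be a dot, is tagged as \1 and written back by the replacement, otherwise it would be deleted with the rest of the match:

{[^\.]}document.all\[\"{(.+)}\"\].value //Search
\1$(\'input\[name="\2"\]\').val() //Replace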
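For reference, the two passes could look like this in C# with the .NET syntax. It is a sketch under a few assumptions: the class name ValueAccessConverter is made up, the assigned value is taken to be everything up to the next semicolon, and (?<!\.) is a lookbehind that rejects a preceding dot without consuming any character.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ValueAccessConverter
{
    // Setter: document.all["name"].value = expression;  (=(?!=) skips == comparisons)
    private static readonly Regex Setter =
        new Regex(@"document\.all\[""([^""]+)""\]\.value\s*=(?!=)\s*([^;]+);");

    // Getter: any remaining document.all["name"].value not preceded by a dot.
    private static readonly Regex Getter =
        new Regex(@"(?<!\.)document\.all\[""([^""]+)""\]\.value\b");

    public static string Convert(string script)
    {
        // The setter pass must run first, otherwise the getter pass would
        // turn the left side of an assignment into val() as well.
        script = Setter.Replace(script, "$$('input[name=\"$1\"]').val($2);");
        return Getter.Replace(script, "$$('input[name=\"$1\"]').val()");
    }

    public static void Main()
    {
        Console.WriteLine(Convert("document.all[\"total\"].value = document.all[\"price\"].value;"));
        // prints: $('input[name="total"]').val($('input[name="price"]').val());
    }
}
```

The order matters here: the getter left inside the val(...) call by the first pass is picked up by the second one.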

This isn't perfect for all situations. Multiple concatenations of document.all on the same line may not be replaced correctly. But I think it does the job in most situations.

Regex Tool For .Net Developers

I am far from being an expert in Regex, but with good tools writing a Regex becomes easier.

First, I suggest you download RegexBuilder from Renschler. This tool is ideal once the Regex is written, to test it against a few sentences and quickly see whether something is wrong.

RegexBuilder

The second tool helps you create the Regex. This tool, from RadSoftware, is also free. You can download it here.

Regex Designer

This tool is great to see what will be in your groups and also contains a library of the Regex syntax.

With these two tools, you should enjoy creating Regex a little bit more.