I have been a programmer since 2000 and never cared to learn regular expressions until now. I watched this whole video today and started to get interested in RegEx! Thank you, Tim :)
Fantastic!
I've been programming since the 80's; my friends at work used to ask if I could write anything without them ;)
I've always tried to escape using RegEx due to its complexity. After watching this vid, it actually doesn't look scary anymore and makes much more sense. Special thanks for the cheat sheet; I will pin it somewhere for future reference.
Jeez, these comments 😂 Why does this golden channel have these fake comments?
Another helpful video for people that have not used regex. I really love how you always point out to use the right tool for the job.
Code comments are there to explain WHY you are doing something even more than WHAT you are doing. Maintaining comments is part of the change.
I have personally only used regex for these basic types of data matching. I had a developer who used it to extract data elements and to edit generated code in place to upgrade it, instead of regenerating the object classes/methods.
Thanks for sharing. I'm glad you enjoyed it.
Hey Tim, I started programming 7 months ago. I'm a German potato with decent English skills, and I only watch German tutorials. But I couldn't find a good German video on regex, so I started watching yours. After 1 minute I forgot it was in English; I understood every single word, and it felt like I was watching it in German. You talk at a perfect speed and very clearly, so it is easy for me to understand. Great video. Thanks and greetings from Germany.
Awesome! I am glad my content is helpful!
watching this and i'm already applying regex to my own code. understandable explanations, jargon-free, this is why i love your videos!
I'm glad they are so helpful.
Oh man, you're amazing. I always learn something new from you. Thanks a lot for spending time making these videos for the community. Regards from Argentina.
You are welcome.
Good way of documenting regexes - break it onto multiple lines e.g. as an array of strings.
You can then add a comment to each line explaining what each regex fragment does. Then just concatenate the array when you need to use it.
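The fragment-per-line idea above can be sketched roughly as follows. The phone-number shape (###-###-####) and the variable names are assumptions chosen for illustration, not something taken from the video.

```csharp
using System;
using System.Text.RegularExpressions;

// Each fragment of the pattern gets its own line and its own comment.
// The ###-###-#### shape is only an illustrative assumption.
string[] fragments =
{
    @"\b",      // start on a word boundary
    @"\d{3}",   // area code: exactly three digits
    @"-",       // literal dash
    @"\d{3}",   // exchange: exactly three digits
    @"-",       // literal dash
    @"\d{4}",   // line number: exactly four digits
    @"\b",      // end on a word boundary
};

// Concatenate the fragments only when the pattern is needed.
string phonePattern = string.Concat(fragments);

Console.WriteLine(phonePattern);
Console.WriteLine(Regex.IsMatch("Call 440-555-1212 today", phonePattern));
```

A related option in .NET is RegexOptions.IgnorePatternWhitespace, which lets a single multi-line pattern string carry # comments; which of the two reads better is a matter of taste.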
Thanks for sharing.
I’ve recently had something come up where I was going to have to start using RegEx. Perfect timing! This was very helpful, thanks for all your hard work and always delivering great content!
Excellent! I am glad it will be helpful.
I don't use them often but when I do it saves time, though I sometimes need a refresher on positive/negative look ahead/behind, named captures, or back references. Sometimes I use them to find code that needs to be updated or reviewed and I like to use "TextCrawler" and enable "Preview all matches together" which allows me to copy/paste results to Word to document code that I need to look at or change.
Yeah, not using them much does lead to needing to refer back to a resource.
What a real one. This helped immensely
Great!
As always, clear to understand and on point. My 55 minutes well spent. Thanks Tim
Glad it was helpful!
Great tutorial. I've used it before but never bothered to read or watch a tutorial. I've always googled when I needed to use regex. This tutorial is very informative. Thanks Tim.
Glad it was helpful!
Thanks Tim, this is a very helpful intro to regex. I think the comments thing is really important because I'm almost certain I would write some regex pattern and then a week later ask myself "what in the world did I write this pattern for and what does it do." I'm all down for maintaining those comments!
You are welcome.
You're super amazing!
I begin to love C# more and more!
Thank you!
How is your Visual Studio showing what to put in a method, like "path" here at 32:31?
That was awesome. Perfect video to get started with Regex. Thank you!
You are welcome.
Great explanation! Thank you very much!!
You are welcome.
You nailed it Tim!!! I was always scared of these.
I am glad it was so helpful.
One of the best ways to practice regular expressions is the Find and Replace tool in Visual Studio
Thanks for sharing.
But there are some differences between them. Not all of my patterns worked in the 'Find' tool.
Always thought those guys on SO writing and debugging regex were from Mars. Thanks Tim for making this quite palatable. I can now confidently write my own regex instead of copy pasting😄
Awesome!
Thanks Tim for this helpful course.
There are some benefits to starting your Regex by declaring Regex regex = new Regex() rather than using the static version:
1. You get the compiler to tell you if it's a valid expression.
2. The color coding helps with visualization.
3. It has some IntelliSense.
Thanks again!
Thanks for sharing!
Best c# tutorials on the internet 😀
Thank you!
Thank you for the great content. I learned a lot, and it was very clear for beginning RegEx.
You are welcome.
This was a great tutorial. Thank you, Tim
You are welcome.
Actually it is "\." instead of "."
"." will also find 440x555x1212 as it represents any character (except \n)
Yep, caught that too late. Good catch.
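To make the difference concrete, here is a minimal sketch. The exact pattern used in the video is not reproduced here; the dotted phone-number patterns below are assumptions for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Assumed patterns: the same shape with and without the dot escaped.
string unescaped = @"\d{3}.\d{3}.\d{4}";   // "." matches any character except newline
string escaped = @"\d{3}\.\d{3}\.\d{4}";   // "\." matches only a literal period

foreach (string input in new[] { "440.555.1212", "440x555x1212" })
{
    bool loose = Regex.IsMatch(input, unescaped);
    bool strict = Regex.IsMatch(input, escaped);
    Console.WriteLine($"{input}: unescaped={loose}, escaped={strict}");
}
```

The unescaped pattern accepts both inputs, while the escaped one accepts only the input with real periods.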
@@IAmTimCorey By the way, thank you Tim for providing us with free quality content here on YouTube! Because of your videos I got back into programming 5 years ago. Since then, I have been developing different tools to help my employer and co-workers, and I am looking forward to more videos. Keep up your good spirit!
This is amazing Tim!
Thank you for this tutorial.
You are welcome.
Thank you for this nice video, Tim. I would be happy to see a second part covering groups 🙂
Thanks for the suggestion. Please add it to the list on the suggestion site so others can vote on it as well: suggestions.iamtimcorey.com/
Very informative Tim , thanks for making this
You are welcome.
this video is so helpful. thanks Tim
You're very welcome!
At 10:30, "I Am Tim" is in the print statement, not in the search string.
Yep, I figured that out later on.
Great to watch! Awesome !
Glad you enjoyed it!
I work as a junior C# developer and I swear nobody knows RegEx in my company :)
I'm glad I just got out of that club.
Great!
Here you go: 55 minutes of my life and I learned something useful. How else would I have gotten this knowledge without spending not just 55 minutes but years somewhere else?
Glad it was helpful.
That was very helpful, Mr. Tim! Thank you so much. I'm still a beginner, so I was wondering how to compare 2 txt files.
For example, file 1 is analysed and compared with file 2, and similar strings are retrieved?
You would probably load both text files into memory as a List<string> each. Then compare each entry (which would be a line) between one list and the other. That can still be rather complicated, since one file might have an extra line at the top that throws off the entire comparison. You would need to figure out what your comparison criteria are and then build your app.
@@IAmTimCorey thank you so much mr tim i would look into it!!
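One possible shape for that comparison is sketched below. It sidesteps the "extra line at the top" problem by ignoring line position entirely; the CommonLines helper and the exact-match-after-trim criterion are assumptions, and your own criteria may well differ.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper: returns the lines that appear in both files, ignoring position.
// "Same line" means an exact, case-sensitive match after trimming (an assumption).
static List<string> CommonLines(string pathA, string pathB)
{
    List<string> first = File.ReadAllLines(pathA).Select(l => l.Trim()).ToList();
    List<string> second = File.ReadAllLines(pathB).Select(l => l.Trim()).ToList();
    return first.Intersect(second).Where(l => l.Length > 0).ToList();
}

// Demo with two throwaway files so the sketch runs on its own.
string fileA = Path.GetTempFileName();
string fileB = Path.GetTempFileName();
File.WriteAllLines(fileA, new[] { "header", "alpha", "beta", "gamma" });
File.WriteAllLines(fileB, new[] { "alpha", "gamma", "delta" });

List<string> shared = CommonLines(fileA, fileB);
Console.WriteLine(string.Join(", ", shared));

File.Delete(fileA);
File.Delete(fileB);
```

If "similar" should mean something looser than identical lines, the Intersect call is the place to swap in a different comparison.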
What? Why did the precompiled version take longer than the version without precompilation? I thought compiling it over and over would take much more time.
Great video. Just what I was looking for!
Awesome!
Again, an awesome video. You are a String Bender. I have something in mind but can't find it anywhere. Can we read from a Word file line by line? Not by paragraph or word.
Word files are actually zip files. Try it. Change the docx extension to zip and then open it up. You will see that there are multiple files inside. Usually, we use a third-party tool to read Word documents.
@@IAmTimCorey Thx a lot for the answer.
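The "docx is a zip" point can be checked from code as well as by renaming the file. The sketch below only lists the entries inside a zip container; it uses a tiny in-memory stand-in rather than a real document, and ListEntries and "report.docx" are hypothetical names. Actually extracting text would still call for parsing the XML or a third-party library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;

// Hypothetical helper: lists the entry names inside any zip-based container, such as a .docx.
static List<string> ListEntries(Stream zipStream)
{
    using var archive = new ZipArchive(zipStream, ZipArchiveMode.Read, leaveOpen: true);
    return archive.Entries.Select(e => e.FullName).ToList();
}

// Stand-in for a real document: a tiny in-memory zip with a docx-like entry.
// With a real file you would pass File.OpenRead("report.docx") instead.
using var buffer = new MemoryStream();
using (var writer = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
{
    ZipArchiveEntry entry = writer.CreateEntry("word/document.xml");
    using var entryWriter = new StreamWriter(entry.Open());
    entryWriter.Write("<document>Hello</document>");
}

buffer.Position = 0;
List<string> names = ListEntries(buffer);
Console.WriteLine(string.Join(Environment.NewLine, names));
```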
Hello Tim.
Excellent video.
I am trying to download the source code, with no success. I have provided the same email before.
Is there a limit on the code downloads in your channel?
Anyways. I am very thankful, because thanks to this video I finally got to understand how Regular expressions work.
I can't thank you enough for this help.
Sometimes email providers block emails that contain source code. You can email help@iamtimcorey.com and Tom will help you out.
@@IAmTimCorey
Hello Tim.
Got it.
Thank you very much for your prompt answer.
Great tutorial
Thanks!
I did get a boost from using the compiled Regex in the demo from this example. It clocked in at about 10 ms, while not using the compiled regex gave about 30 ms.
You didn't escape the dot, so it's matching any single character at those places, i.e., it would match: 440a555b1234
AARG! Good catch.
Great video as always. I wanted to ask, if there is any chance of making .NET MAUI tutorial or Course ?
There is, although I've been holding off for a bit because of some of the issues it has been having. It isn't quite stable yet. Blazor Hybrid is actually further along from what I've been hearing.
Thanks Tim.
You are welcome.
Thanks for the video
You are welcome.
We can use the Trim() function to eliminate the spaces in between, right?
It depends on what you mean. Trim will trim off the whitespace on the very ends of a string. If you mean using Trim on the file, it won't do much good. If you mean using trim on the items you find, that would work if you were capturing spaces. For instance, in the phone number search, we weren't capturing spaces so it was not needed. If we decided to make sure it was the whole word or number then yes, we could trim the spaces after the capture.
@@IAmTimCorey Tq
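A small sketch of the distinction in the reply above: Trim only touches the ends of a string, so removing spaces in the middle needs something else. Replace is used here as one simple option; the sample string is made up.

```csharp
using System;

string padded = "  440 555 1212  ";

string trimmed = padded.Trim();             // removes only leading and trailing whitespace
string noSpaces = padded.Replace(" ", "");  // removes every space, including the inner ones

Console.WriteLine($"[{trimmed}]");
Console.WriteLine($"[{noSpaces}]");
```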
I have followed you for a few years and found many helpful videos, thank you so much....
How can I fetch the integer value (20) from, for example, "pay 20" or "20 pay", handling both cases in a single regex pattern?
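One way this could be approached, as a sketch rather than a definitive answer: use alternation with the number captured in a named group on either side of the word. The group name "amount" and the ExtractAmount helper are hypothetical; the pattern relies on .NET allowing the same group name to appear more than once.

```csharp
using System;
using System.Text.RegularExpressions;

// Both alternatives store the digits in the same named group, "amount".
var payPattern = new Regex(
    @"\bpay\s+(?<amount>\d+)\b|\b(?<amount>\d+)\s+pay\b",
    RegexOptions.IgnoreCase);

// Hypothetical helper: returns the number next to "pay", or null when there is none.
static int? ExtractAmount(Regex pattern, string input)
{
    Match match = pattern.Match(input);
    return match.Success ? int.Parse(match.Groups["amount"].Value) : (int?)null;
}

Console.WriteLine(ExtractAmount(payPattern, "pay 20"));
Console.WriteLine(ExtractAmount(payPattern, "20 pay"));
```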
Just wondering why the uncompiled version was faster when the compiled and cached versions should have been faster there?
I'm curious about that as well. I don't have an answer for you. It is something I'll probably be asking about when I next talk to one of the developers of C#.
@@IAmTimCorey I read this: "Regex has an interpreted mode and a compiled mode. The compiled mode takes longer to start, but is generally faster." So maybe after a while, when things get going and you have more data to process, it will eventually get faster? I dunno.
@@IAmTimCorey Nothing mysterious: the overhead of the compile is the extra time. The trade-off is setup cost now for performance later (for a hamburger today, I will... :) ). So if you only use a regex once or twice, I would argue you are generally introducing a perf hit for no reason. Your example of running millions of matches is exactly where you'd want the compiled version...
Check out the first two paragraphs here (MS Docs)
learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-options#compiled-regular-expressions
Also, in .NET 6 there is a global timeout for regex:
// learn.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex.matchtimeout?view=net-6.0#remarks
AppDomain.CurrentDomain.SetData("REGEX_DEFAULT_MATCH_TIMEOUT", TimeSpan.FromSeconds(2));
This is in case people start using regex and are worried about the denial-of-service attacks that have happened, which is why the timeout was introduced... Generally I would assert that the big problem with Regex comes when using forward/backward captures: effectively you create cycles or recursive loops in the matching engine. I've used Regex for at least 40 years and never had trouble if I avoided the fore/back captures...
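As an alternative to the process-wide default shown above, a timeout can also be set per Regex instance through the constructor overload that takes a TimeSpan. The sketch below assumes a 2-second limit and a hypothetical SafeIsMatch helper that treats a timeout as "no match"; whether that is the right fallback depends on the application.

```csharp
using System;
using System.Text.RegularExpressions;

// A per-instance timeout, instead of the AppDomain-wide default.
var guarded = new Regex(@"^\d{3}-\d{3}-\d{4}$", RegexOptions.None, TimeSpan.FromSeconds(2));

// Hypothetical helper: a timeout is reported as "no match" (an assumption).
static bool SafeIsMatch(Regex regex, string input)
{
    try
    {
        return regex.IsMatch(input);
    }
    catch (RegexMatchTimeoutException)
    {
        // Hostile or pathological input should not be able to hang the caller.
        return false;
    }
}

Console.WriteLine(guarded.MatchTimeout);
Console.WriteLine(SafeIsMatch(guarded, "440-555-1212"));
```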
@Dan D - I've read those articles. The mysterious part is that the uncompiled version is the fastest one, yet according to those documents, instantiated Regex objects that are not pre-compiled do not get cached (so the less efficient version is used each time), yet that was the fastest in our small test. According to the docs, it should have been the slowest. We sacrificed a bit of startup time with the compiled in order to gain runtime performance. But we were testing both for both options. So either the compilation of the pattern is not done at all for the instantiated version and that compilation is expensive enough to not be made up in 100,000 iterations or there is something else going on.
@@IAmTimCorey I might assert that the margin of difference is well within a range where the times are effectively immaterial, for a few reasons. Not to mention, could the code execution have been affected by, say, your OBS recording influencing your tests? So honestly I would assert that your test environment is rather dirty and not a valid test environment. There's a good reason perf testing is a joy. You should really measure cold starts, warm starts, and long-haul variations, after warmup/startup and before setup... So, needless to say, starting a console app and seeing a 5 ms difference... meh? Then again there is the stopwatch: what is the resolution of the timers? It used to be that the multimedia timers had the highest resolution, and .NET did introduce high-performance timers, but I'm not sure which timer the Stopwatch uses internally. This would be the first obvious reason for the differences, or why I would assert the results are invalid... Also, some of the timers may report any number, say 1, 2, 3, but they only have a resolution of 5 ms if memory serves me, so you might see 1, 6, 11, or 3, 8, 13, all % 5...
The next problem is that the code is too simplistic, meaning the compiler behind the scenes might be doing something we are not seeing or expecting, such as variable/expression hoisting. We would need to force the compiler not to do some of the things it might do, i.e. variable or expression hoisting, or the compiler may simply see that your code effectively did nothing and, with no side effects, just remove the code. Meaning that, to be a fair performance test, you should change the regex and/or the string being evaluated on each loop and defend against the compiler being helpful 🙂
While the sentence from the docs:
"Compiled regular expressions maximize run-time performance at the expense of initialization time."
I swear the docs used to call out explicitly, and a lot more clearly, that RegexOptions.Compiled may introduce a large performance hit; it could be that they got the init time faster.
The problem is: where is the break-even point? And what about the complexity of the regex? There were no alternating groups, e.g. (this|or|that)[ \t ]*(and|more). I wouldn't be surprised if the compiler behind the scenes altered the regex to a simple if( x == "Tim"). I'm just saying that, depending on what is going on, there are a lot of optimizations that can occur, and we would be none the wiser unless you got into the IL code...
This would be my over inflated $0.02 cents ;)
that's great , thank you
You are welcome!
Great stuff.
Thanks!
We covered this topic at university (HSE Russia); thanks for the video.
You are welcome.
The perfect subject for Halloween 😂
lol
What is the difference between \w and \b? Thank you :)
Here you go: stackoverflow.com/a/11874899/733798
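In short, \w consumes one word character (letter, digit, or underscore), while \b consumes nothing and only asserts a position between a word character and a non-word character. A small sketch on a made-up string:

```csharp
using System;
using System.Text.RegularExpressions;

string text = "cat catalog";

int wordChars = Regex.Matches(text, @"\w").Count;      // every letter, digit, or underscore
int anyCat = Regex.Matches(text, @"cat").Count;        // "cat" anywhere, even inside "catalog"
int wholeCat = Regex.Matches(text, @"\bcat\b").Count;  // "cat" only as a whole word

Console.WriteLine($"word characters: {wordChars}");
Console.WriteLine($"'cat' anywhere: {anyCat}");
Console.WriteLine($"'cat' as a whole word: {wholeCat}");
```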
Please try to make a video on a Microsoft Teams bot app.
Thanks for the suggestion. Please add it to the list on the suggestion site so others can vote on it as well: suggestions.iamtimcorey.com/
Is there any way to get the actual content rather than only true or false?
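For getting the matched text itself, Regex.Match and Regex.Matches return the content where Regex.IsMatch only returns a bool. A minimal sketch, with a made-up input string and phone-number pattern:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

string text = "Call 440-555-1212 or 330-555-9876.";
string pattern = @"\d{3}-\d{3}-\d{4}";

bool found = Regex.IsMatch(text, pattern);   // only true/false
Match first = Regex.Match(text, pattern);    // the first match, with its Value and Index
List<string> all = Regex.Matches(text, pattern)
    .Cast<Match>()                           // Cast keeps this working on older frameworks
    .Select(m => m.Value)
    .ToList();

Console.WriteLine(found);
Console.WriteLine($"{first.Value} at index {first.Index}");
Console.WriteLine(string.Join(", ", all));
```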
One of those things that everybody knows but nobody really does
It's black magic. I don't f* with wizards who know regex.
It's an interesting tool, though. You specify a pattern for acceptable forms of strings and some black box algorithm performs validation for you, which saves a bunch of work you'd otherwise have to do writing up validation logic. Every language I've seen has regex libraries: C#, Python, Matlab, C++. This is something I'd definitely write a unit test for though lol
I might disagree. Regex is generally pretty easy, but there are some very subtle gotchas that will get almost everyone at some point. It can give you a false sense of simplicity, and the nuances bite most often when people haven't learned them properly. I tried looking for a series that was on MSDN but can't find it now; it might have been the best I ever saw for regex.
Good morning
Hi, your explanation of compiled regular expressions and the performance measurement is incorrect. The proper usage is to compile once, outside of the stopwatch, and reuse the instance over and over again.
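A rough sketch of that measurement shape: both Regex objects are constructed before any timing starts, each gets an untimed warm-up call, and the loop uses its result. The pattern, input, iteration count, and Measure helper are assumptions; this is not the code from the video, and a simple Stopwatch loop is still a noisy measurement.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

const string Pattern = @"\d{3}-\d{3}-\d{4}";
const string Input = "Call 440-555-1212 today";
const int Iterations = 100_000;

// Construct both Regex objects before timing, so the one-time construction
// (and, for Compiled, code generation) cost is excluded from the loop.
var interpreted = new Regex(Pattern);
var compiled = new Regex(Pattern, RegexOptions.Compiled);

// Hypothetical helper: times only the repeated IsMatch calls.
static (int Matches, TimeSpan Elapsed) Measure(Regex regex, string input, int iterations)
{
    regex.IsMatch(input); // warm-up call, not timed
    int matches = 0;
    Stopwatch timer = Stopwatch.StartNew();
    for (int i = 0; i < iterations; i++)
    {
        if (regex.IsMatch(input)) matches++; // use the result so the work is not discarded
    }
    timer.Stop();
    return (matches, timer.Elapsed);
}

var interpretedResult = Measure(interpreted, Input, Iterations);
var compiledResult = Measure(compiled, Input, Iterations);

Console.WriteLine($"Interpreted: {interpretedResult.Elapsed.TotalMilliseconds:F1} ms");
Console.WriteLine($"Compiled:    {compiledResult.Elapsed.TotalMilliseconds:F1} ms");
```

For numbers worth trusting, a dedicated benchmarking library such as BenchmarkDotNet is probably a better fit than a hand-rolled loop.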
great
Thanks!
Once you duplicated the hardcoded strings I knew you set up a trap for yourself 😅
Yep, that's always risky.
16:21
Thank you very much! This comment is for the YouTube algorithm.
Thank you!
(?i)My Name(?-i) matches 'My Name' case insensitive
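To illustrate the inline option above: (?i) switches case-insensitive matching on and (?-i) switches it back off, so only the part in between ignores case. The trailing " is Tim" in this sketch is an added assumption to show where case sensitivity resumes.

```csharp
using System;
using System.Text.RegularExpressions;

// (?i) turns case-insensitivity on; (?-i) turns it off for the rest of the pattern.
string pattern = @"(?i)My Name(?-i) is Tim";

bool mixedPrefix = Regex.IsMatch("MY NAME is Tim", pattern);  // prefix may be any case
bool wrongSuffix = Regex.IsMatch("my name IS TIM", pattern);  // suffix must match case exactly

Console.WriteLine(mixedPrefix);
Console.WriteLine(wrongSuffix);
```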
Regex can help you solve problems but it can't help you build solutions :)
You can say that about a lot of things. Regex is a tool that is helpful in certain circumstances.