Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regex improvements #1348

Open
wants to merge 10 commits into
base: develop
Choose a base branch
from

Conversation

EngRajabi
Copy link
Contributor

@EngRajabi EngRajabi commented Oct 6, 2020

Proposed changes

Regex improvements:

  • set option compile
  • set timeout
  • set static object (Regex class is in mutable)

Links

Microsoft Docs: Best Practices for Regular Expressions in .NET | Microsoft Learn

Hossein-Edraki
Hossein-Edraki previously approved these changes Oct 7, 2020
Copy link

@Hossein-Edraki Hossein-Edraki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's Ok

@EngRajabi
Copy link
Contributor Author

@TomPallister

VahidTajari
VahidTajari previously approved these changes Nov 15, 2020
@WeihanLi
Copy link
Contributor

What if timeout happens? Timeout maybe not good practice here.

@EngRajabi
Copy link
Contributor Author

Setting timeout is one of the necessary tasks of using regex. When a timeout occurs, it is best to run the entire project involved and not respond to others.

@WeihanLi
Copy link
Contributor

Is there some reason for considering 100ms as the regex timeout, and how could the client update the timeout config if the client wanna use a longer timeout?

@EngRajabi
Copy link
Contributor Author

timeout is not customizable by the customer. Like many other settings. It was 100 ms when I adjusted it according to the tests I had. But we can decide here and change this time. We can set this time to 500 ms. If we do not receive an answer at this time, it is better not to involve the whole project in its implementation and let it respond to other requests.

@raman-m raman-m changed the base branch from master to develop July 17, 2023 12:23
@raman-m raman-m changed the title Improvement regex Regex improvements Jul 17, 2023
@raman-m raman-m self-requested a review July 17, 2023 12:58
Copy link
Contributor Author

@EngRajabi EngRajabi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

review done

@raman-m
Copy link
Member

raman-m commented Jul 20, 2023

Dear Mohsen,

Why did you merge develop branch from your fork into the feature branch?
I meant these commits: ff1dbb6 5b336cd
It was unnecessary because I did the merge of ThreeMammals:develop in 329d2c3.
After that I trigger pipeline to make a valid green build for commit 03a70b3.

Now there is no build at all and pipeline cannot run new build.
The pipeline cannot recognize develop branch because you have juggled all commits in a wrong way.

Could you fix/update your fork develop branch please first?

After that we will try to fix pipeline triggering by rebasing current feature branch onto develop one.

Copy link
Contributor Author

@EngRajabi EngRajabi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did Branch Develop on Branch as Rebase. Does this solve this problem?

@raman-m
Copy link
Member

raman-m commented Jul 22, 2023

image

Now I see develop and it has zero diff, but the branch is not default. ☝️
You have to go to repo settings and make develop branch as default!
I hope that should help to fix the problem of builds, because all builds are built for develop branch (auto) and master (manually).
Currently I don't see that pipeline was triggered for top commit.
Maybe an empty or tiny commit will be required to trigger the pipeline after making develop branch default.

Copy link
Contributor Author

@EngRajabi EngRajabi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change something for build

@raman-m
Copy link
Member

raman-m commented Jul 22, 2023

Feature branch has been rebased onto develop!
The build has been passed for top commit!

Going to start code review...

I guess, this PR is not related to an issue, right?

Copy link
Member

@raman-m raman-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EngRajabi Mohsen,

I see three major issues with current code:

  • TimeSpan.FromMilliseconds(100). The hardcode of 100 is detected. Milliseconds constant should be a constant, and public static. Moving to some common options static class is highly desired.
  • No catching and processing of the RegexMatchTimeoutException object being generated by all Regex objects. A try-catch constructions should be provided.
  • No unit tests which cover new behavior of Regex. So, RegexMatchTimeoutException exception case should be tested.

Hope this doc will be helpful: Defining a Time-Out Value, to understand current issues.

@@ -107,8 +104,7 @@ private static bool AllRoutesForAggregateExist(FileAggregateRoute fileAggregateR

private static bool IsPlaceholderNotDuplicatedIn(string upstreamPathTemplate)
{
var regExPlaceholder = new Regex("{[^}]+}");
var matches = regExPlaceholder.Matches(upstreamPathTemplate);
var matches = _regExPlaceholder.Matches(upstreamPathTemplate);
Copy link
Member

@raman-m raman-m Aug 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No catching and processing of the RegexMatchTimeoutException exception.

Will it result to unhandled exception by app domain?
Will it be caught by upper context?

@@ -38,7 +37,7 @@ List<PlaceholderNameAndValue> urlPathPlaceholderNameAndValues
if (values.Data != null)
{
// dynamic claim
var match = Regex.Match(required.Value, @"^{(?<variable>.+)}$");
var match = _regexAuthorize.Match(required.Value);
Copy link
Member

@raman-m raman-m Aug 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No catching and processing of the RegexMatchTimeoutException exception.

Will it result to unhandled exception by app domain?
Will it be caught by upper context?

Comment on lines +15 to +18
private static readonly Regex _secondsRegEx = new Regex("^[0-9]+s", RegexOptions.Compiled, TimeSpan.FromMilliseconds(100));
private static readonly Regex _minutesRegEx = new Regex("^[0-9]+m", RegexOptions.Compiled, TimeSpan.FromMilliseconds(100));
private static readonly Regex _hoursRegEx = new Regex("^[0-9]+h", RegexOptions.Compiled, TimeSpan.FromMilliseconds(100));
private static readonly Regex _daysRegEx = new Regex("^[0-9]+d", RegexOptions.Compiled, TimeSpan.FromMilliseconds(100));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcode has detected!
The 100 milliseconds constant should be public static.

Comment on lines +121 to +124
return _secondsRegEx.Match(period).Success
|| _minutesRegEx.Match(period).Success
|| _hoursRegEx.Match(period).Success
|| _daysRegEx.Match(period).Success;
Copy link
Member

@raman-m raman-m Aug 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No catching and processing of the RegexMatchTimeoutException exception.

Will it result to unhandled exception by app domain?
Will it be caught by upper context?

[Obsolete("Please set BaseUrl in ocelot.json GlobalConfiguration.BaseUrl")]
public static IConfigurationBuilder AddOcelotBaseUrl(this IConfigurationBuilder builder, string baseUrl)
{
Regex.CacheSize += 100;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the default, current value of the cache in runtime?
So, why did you increase this property value?

Comment on lines +63 to +68
var pathAndQuery = CreateServiceFabricUri(downstreamRequest, downstreamRoute,
templatePlaceholderNameAndValues, response);

//todo check this works again hope there is a test..
downstreamRequest.AbsolutePath = pathAndQuery.Path;
downstreamRequest.Query = pathAndQuery.Query;
downstreamRequest.AbsolutePath = pathAndQuery.Path;
downstreamRequest.Query = pathAndQuery.Query;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Zero changes!
Revert to previous code!

@@ -51,7 +49,7 @@ public Response Remove(string key)
=> _placeholders.Remove(key);

private static string CleanKey(string key)
=> Regex.Replace(key, @"[{}]", string.Empty, RegexOptions.None);
=> _regex.Replace(key, string.Empty);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No catching and processing of the RegexMatchTimeoutException exception.

new Regex("$^", RegexOptions.Compiled | RegexOptions.Singleline) :
new Regex(template, RegexOptions.Compiled | RegexOptions.Singleline);
_reg :
_regex.AddOrUpdate(template, new Regex(template, RegexOptions.Compiled | RegexOptions.Singleline, TimeSpan.FromMilliseconds(100)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No catching and processing of the RegexMatchTimeoutException exception.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no unit tests for new approach of using Regex.
Time out should be tested!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regex Time out event should be tested!

@EngRajabi
Copy link
Contributor Author

@raman-m
Hello, I read some of your reviews, but since there were many, I will answer you in one post.
1- In the case of TimeSpan.FromMilliseconds(100), if the work is not done in 100 milliseconds, the work will not continue. Regarding the RegexMatchTimeoutException, I ask you what happens in the current situation where we have not set the timeout??? When we don't have a timeout, the requests wait so much that they receive errors, and the CPU and computing resources are busy. But with my changes, a timeout error occurs, which probably returns the error code 500 in the output. This behavior is better than the current default behavior.
2- In the case of Regex.CacheSize += 100. Its default value is 15. According to the following document
https://learn.microsoft.com/en-us/dotnet/standard/base-types/best-practices
This amount is very, very low for large projects and causes the cache to be cleared very often and this causes our patterns to come out of the cache.
3- Regarding format code changes. These changes are not my work. If you have looked, people changed my code.

@raman-m
Copy link
Member

raman-m commented Aug 9, 2023

@EngRajabi commented on Aug 4, 2023

It is better to reply in own thread of each code review issue...


When we don't have a timeout, the requests wait so much that they receive errors, and the CPU and computing resources are busy. But with my changes, a timeout error occurs, which probably returns the error code 500 in the output. This behavior is better than the current default behavior.

Thanks for improvements! Current behavior has been improved. We've done with that, but...
I'm pretty sure that each Match method is not wrapped by any try-catch construction in upper contexts. That means, Yes, we have 99% UnhandledException which will be handled globally by current OcelotPipelineExtensions.
So, in this case we have error being handled, request has been failed, and records will be added by Ocelot loggers.
But... Instead of having error logged why not to improve user experience, and let the user know that current timeout value is low, and it is better to increase time out value. For example we could log a warning with all required metrics.
Finally, we can introduce some Regex helper for Match methods to process timeouts.

Links to read


And, Could you start fixing code review issues please?
Let me know if you need a help, and we will have some pair programming sessions...

@raman-m
Copy link
Member

raman-m commented Aug 9, 2023

@WeihanLi commented on Nov 20, 2020:

Is there some reason for considering 100ms as the regex timeout, and how could the client update the timeout config if the client wanna use a longer timeout?

I wanna develop your idea further! 😉
The 100ms value will be OK if it covers all cases of Regex-matching inside of Ocelot core. But I am sure it doesn't. And, 👉

The problem with the RegexMatchTimeoutException objects being thrown by Match methods. 😩

Regarding configuring of timeout value. Well... We could introduce one common configurable value for this option in Best Practices for Regular Expressions in .NET:

  • Set a process-wide or AppDomain-wide value with code such as AppDomain.CurrentDomain.SetData("REGEX_DEFAULT_MATCH_TIMEOUT", TimeSpan.FromMilliseconds(100));.

What about such kind of Regex configuration?

@raman-m raman-m added proposal Proposal for a new functionality in Ocelot needs feedback Issue is waiting on feedback before acceptance labels Aug 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs feedback Issue is waiting on feedback before acceptance proposal Proposal for a new functionality in Ocelot
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants