Skip to content

Replace oro#700

Closed
FSchumacher wants to merge 69 commits intoapache:masterfrom
FSchumacher:replace-oro
Closed

Replace oro#700
FSchumacher wants to merge 69 commits intoapache:masterfrom
FSchumacher:replace-oro

Conversation

@FSchumacher
Copy link
Contributor

Description

Try to replace usage of Oro Regex implementation by Java Regex implementation. This is done by adding a property to let the user switch between the two. Default is to use the old Oro based one.

This is not finished. There are not all occurrences enhanced with the switch. A lot of it feels like a hack, as much code has been duplicated.

I thought about adding some interfaces to hide the Oro and Java Regex implementations behind, but currently I believe it would be overengineered. But maybe someone has a good idea to make this code less hacky.

Motivation and Context

The Oro library has been unmaintained for years and is feature-wise behind the built-in Java Regex implementation. To make a smooth transition from Oro to the Java built-in implementation the plan is to

  • make switching possible while defaulting to the old one
  • deprecate the usage of the old one and switching the default to the new one
  • removing the old implementation

Every step should be made with at least one release in between.

Discussed in Bug 57672 and Bug 65883

How Has This Been Tested?

Tests were run with the default setting (Oro Regex) and some manual tests with the setting jmeter.use_java_regex to true.

Screenshots (if appropriate):

Types of changes

  • New feature (non-breaking change which adds functionality)

Checklist:

  • My code follows the code style of this project.
  • I have updated the documentation accordingly.

Copy link
Contributor

@pmouawad pmouawad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @FSchumacher ,

Thanks for this PR which will be a great addition to JMeter.

It looks clean to me, but I think we need to have a PatternCache otherwise we will degrade JMeter performance.
Also if we could centralize the Regex implementation selection through a factory it would avoid having the property spread in multiple classes.

Thanks again for you work

@vlsi
Copy link
Collaborator

vlsi commented Feb 22, 2022

Just wondering: does it replace all oro usages? Frankly speaking, I was surprised there are many usages of oro.


jmeter.use_java_regex

What do you think if the property is something like jmeter.regex.engine=oro|java ?
Then it would be more-or-less compatible with other alternatives (if needed :) ).

@FSchumacher
Copy link
Contributor Author

Just wondering: does it replace all oro usages? Frankly speaking, I was surprised there are many usages of oro.

$ find src -name "*.java" -exec grep -rl \.oro\. {} \;
src/components/src/main/java/org/apache/jmeter/visualizers/RenderAsRegexp.java
src/components/src/main/java/org/apache/jmeter/assertions/ResponseAssertion.java
src/components/src/main/java/org/apache/jmeter/assertions/JSONPathAssertion.java
src/components/src/main/java/org/apache/jmeter/assertions/jmespath/JMESPathAssertion.java
src/components/src/main/java/org/apache/jmeter/assertions/CompareAssertion.java
src/components/src/main/java/org/apache/jmeter/extractor/RegexExtractor.java
src/core/src/test/java/org/apache/jmeter/report/dashboard/ApdexPerTransactionTest.java
src/core/src/main/java/org/apache/jmeter/engine/util/ReplaceFunctionsWithStrings.java
src/core/src/main/java/org/apache/jmeter/util/JMeterUtils.java
src/core/src/main/java/org/apache/jmeter/save/CSVSaveService.java
src/core/src/main/java/org/apache/jmeter/report/dashboard/ReportGenerator.java
src/protocol/http/src/test/java/org/apache/jmeter/protocol/http/sampler/TestHTTPSamplersAgainstHttpMirrorServer.java
src/protocol/http/src/main/java/org/apache/jmeter/protocol/http/util/accesslog/LogFilter.java
src/protocol/http/src/main/java/org/apache/jmeter/protocol/http/util/accesslog/SessionFilter.java
src/protocol/http/src/main/java/org/apache/jmeter/protocol/http/util/EncoderCache.java
src/protocol/http/src/main/java/org/apache/jmeter/protocol/http/parser/HtmlParsingUtils.java
src/protocol/http/src/main/java/org/apache/jmeter/protocol/http/parser/RegexpHTMLParser.java
src/protocol/http/src/main/java/org/apache/jmeter/protocol/http/modifier/URLRewritingModifier.java
src/protocol/http/src/main/java/org/apache/jmeter/protocol/http/config/MultipartUrlConfig.java
src/protocol/http/src/main/java/org/apache/jmeter/protocol/http/control/HttpMirrorThread.java
src/protocol/http/src/main/java/org/apache/jmeter/protocol/http/sampler/HTTPSamplerBase.java
src/protocol/http/src/main/java/org/apache/jmeter/protocol/http/proxy/ProxyControl.java
src/functions/src/main/java/org/apache/jmeter/functions/RegexFunction.java
src/functions/src/main/java/org/apache/jmeter/functions/EscapeOroRegexpChars.java

I have started with the functions and assertions. The other stuff is less user visible, but would need to replaced, too.

jmeter.use_java_regex

What do you think if the property is something like jmeter.regex.engine=oro|java ? Then it would be more-or-less compatible with other alternatives (if needed :) ).

For the bugfix, I concentrated on a feature flag to ease the transition from Oro to Java based. If we would like to be more flexible, the simplistic approach -- I took -- might not be the best and we should take more care to introduce a real abstraction instead of the if clauses.

@vlsi
Copy link
Collaborator

vlsi commented Feb 23, 2022

If we would like to be more flexible, the simplistic approach

jmeter.regex.engine=oro|java does not require changing the implementation.

What I mean is primitive values, especially boolean are quite often a poor choice for an API.
In many cases, it is hard to understand what true and false means.

See https://twitter.com/lukaseder/status/1443527683720286209

On the other hand, if the only values are oro and java, then it is way easier to understand what does the switch flip.

@FSchumacher
Copy link
Contributor Author

If we would like to be more flexible, the simplistic approach

jmeter.regex.engine=oro|java does not require changing the implementation.

What I mean is primitive values, especially boolean are quite often a poor choice for an API. In many cases, it is hard to understand what true and false means.

See https://twitter.com/lukaseder/status/1443527683720286209

On the other hand, if the only values are oro and java, then it is way easier to understand what does the switch flip.

I am not really against using a more flexible approach, but I will first try to finish the classes left :)

Giving the users more choice, will let more room for misspellings, though :)

@vlsi
Copy link
Collaborator

vlsi commented Feb 23, 2022

Giving the users more choice, will let more room for misspellings, though :)

The choice is the same: two possible values.
However, I claim oro vs java is more user-friendly than true vs false

@FSchumacher
Copy link
Contributor Author

The EncoderCache from o.a.j.p.http.util uses the Cache from Oro and could probably changed to use Caffeine instead. But I will not do that in this PR.

@codecov-commenter
Copy link

codecov-commenter commented Feb 23, 2022

Codecov Report

Merging #700 (2886281) into master (c2f29cc) will decrease coverage by 0.28%.
The diff coverage is 23.91%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #700      +/-   ##
============================================
- Coverage     55.54%   55.25%   -0.29%     
- Complexity    10350    10374      +24     
============================================
  Files          1061     1061              
  Lines         65220    65666     +446     
  Branches       7433     7511      +78     
============================================
+ Hits          36229    36287      +58     
- Misses        26432    26798     +366     
- Partials       2559     2581      +22     
Impacted Files Coverage Δ
.../org/apache/jmeter/visualizers/RenderAsRegexp.java 2.83% <4.34%> (+0.47%) ⬆️
...in/java/org/apache/jmeter/save/CSVSaveService.java 43.55% <5.00%> (-0.98%) ⬇️
...va/org/apache/jmeter/extractor/RegexExtractor.java 57.86% <5.35%> (-28.75%) ⬇️
...org/apache/jmeter/assertions/CompareAssertion.java 79.80% <6.66%> (-4.73%) ⬇️
.../jmeter/protocol/http/parser/RegexpHTMLParser.java 42.85% <6.89%> (-30.68%) ⬇️
...pache/jmeter/report/dashboard/ReportGenerator.java 68.21% <10.00%> (-1.51%) ⬇️
...meter/engine/util/ReplaceFunctionsWithStrings.java 53.19% <10.52%> (-28.96%) ⬇️
.../jmeter/protocol/http/parser/HtmlParsingUtils.java 48.92% <12.76%> (-11.92%) ⬇️
...er/protocol/http/util/accesslog/SessionFilter.java 64.06% <22.22%> (-6.85%) ⬇️
...jmeter/protocol/http/control/HttpMirrorThread.java 75.15% <26.08%> (-7.73%) ⬇️
... and 16 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c2f29cc...2886281. Read the comment docs.

Use a different approach to split the template and correct the
difference between Java Regex groupCount() and Oro groups()
Oro seems to get along with the escaped braces, so use them for testing
both implementations
Guard it for usage within a Regex
The Java Regex implementation sees an empty group at the end of
a string, when matching against .*. So let us rephrase our wish
to at least one character.
If a system property prefixed with jmeter.properties. is found, Gradle will
now pass it as a system property without that prefix to the unit tests.
That allows to set system properties (which might be used as jmeter properties)
to the unit tests, without setting them in the JVM, that is used by Gradle.
Copy link
Contributor

@pmouawad pmouawad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @FSchumacher ,
Thanks for this great PR.
I made another review.

Regards

@FSchumacher
Copy link
Contributor Author

I think I am ready with this PR. The three things left above, are done, which were

  • Measure the impact of caching the patterns vs. compiling on the fly
  • Adding documentation
  • Changing the property name to make Vladimir happy

Now, the question remains. Should we merge it before 5.5?

@asfgit asfgit closed this in 37a7639 Mar 5, 2022
@vlsi
Copy link
Collaborator

vlsi commented Mar 14, 2022

I just noticed the PR was merged as 69 commits. It would be better to squash all of them in a single commit so the git history (and git annotate is easier to reason)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants