[TMVA][Preprocessing] - Additional normalisation method #4141

josephmckenna · 2019-08-01T10:54:20Z

Add scaling VarTransform functionality to TMVA preprocessing (like normalisation, it linearly scales the data, but the sign of the input data is retained in the output).

I have extended the VariableNormalizeTransform class, in the style of the VariableGaussTransform class, to transform data so that it remains in the range [-1,1]. There is no offset, so the sign of the input data is unchanged by the transformation.

This is proving essential for my neural network analyses, which treat detector hit data like an image classification problem and use ReLU activation functions at the beginning of the network.
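To illustrate why the missing offset matters for ReLU inputs, here is a minimal sketch in plain Python. It assumes the scale factor is the largest absolute value in the training sample, which is consistent with the [-1,1] range described above but is not taken from the PR's C++ code. All function names are hypothetical.

```python
def fit_peak(train):
    """Largest absolute value in the training sample (assumed scale factor)."""
    peak = max(abs(v) for v in train)
    if peak == 0.0:
        raise ValueError("cannot scale an all-zero sample")
    return peak


def scale_no_offset(values, peak):
    """Divide by the training peak: zero stays zero and signs are kept."""
    return [v / peak for v in values]


def normalise_with_offset(values, lo, hi):
    """Map [lo, hi] linearly onto [-1, 1]; zero is shifted unless lo == -hi."""
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def relu(values):
    """Rectified linear unit: negative inputs become zero."""
    return [max(0.0, v) for v in values]


hits = [-2.0, 0.0, 0.5, 4.0]  # made-up detector values
scaled = scale_no_offset(hits, fit_peak(hits))
shifted = normalise_with_offset(hits, min(hits), max(hits))

print(scaled)         # [-0.5, 0.0, 0.125, 1.0]
print(relu(scaled))   # the small positive hit survives
print(relu(shifted))  # the small positive hit became negative and is zeroed
```

With the offset, the positive value 0.5 is mapped below zero and a ReLU layer discards it. With pure scaling, it stays positive.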

I have also added a description to the TMVA documentation.

phsft-bot · 2019-08-01T10:54:23Z

Can one of the admins verify this patch?

lmoneta · 2019-09-02T13:43:14Z

Very nice PR. Thank you very much for your contribution!
Could you also please provide a simple test, to be sure the transformation is doing the right thing?
Thank you

josephmckenna · 2019-09-03T09:29:30Z

Hi lmoneta,

I modified tutorials/tmva/keras/ClassificationKeras.py to add an 'S' transformation.
Lines 28-29 become:

factory = TMVA.Factory('TMVAClassification', output,
                       '!V:!Silent:Color:DrawProgressBar:Transformations=D,G,S:AnalysisType=Classification')

Lines 63-66 become:

factory.BookMethod(dataloader, TMVA.Types.kFisher, 'Fisher',
                   '!H:!V:Fisher:VarTransform=D,G,S')
factory.BookMethod(dataloader, TMVA.Types.kPyKeras, 'PyKeras',
                   'H:!V:VarTransform=D,G,S:FilenameModel=model.h5:NumEpochs=20:BatchSize=32')

Updated script attached:
ClassificationKerasScale.zip

Output before the changes, produced by running:
cd $ROOTSYS/tutorials/tmva/keras
python ClassificationKeras.py &> DG.log
Attached: DG.log
Output after the changes:
python ClassificationKerasScale.py &> DGS.log
Attached: DGS.log

We can see that the training sample transformation is limited to be between -1 and 1:

TFHandler_PyKeras :  Variable         Mean       RMS  [       Min       Max ]
                  : ---------------------------------------------------------
                  :     var1:    0.0015578   0.17520  [  -0.54435    1.0000 ]
                  :     var2:    0.0013889   0.17448  [  -0.54435    1.0000 ]
                  :     var3:    0.0013901   0.17452  [  -0.54435    1.0000 ]
                  :     var4:    0.0012939   0.17410  [  -0.54435    1.0000 ]
                  : ---------------------------------------------------------

Scaling is working, but is it being saved and loaded again correctly? We can check the 'Test' phase in the same script, since TMVA saves the transformations to file and then loads them to re-apply to the testing data:

TFHandler_PyKeras :  Variable         Mean       RMS  [       Min       Max ]
                  : ---------------------------------------------------------
                  :     var1:    0.0041504   0.17586  [  -0.52983    1.0000 ]
                  :     var2:    0.0048056   0.17568  [  -0.52290    1.0000 ]
                  :     var3:    0.0039114   0.17501  [   -1.0000   0.70855 ]
                  :     var4:  -0.00083735   0.17310  [   -1.0000    1.0000 ]
                  : ---------------------------------------------------------

The limits are no longer exactly -0.54435 to 1.0, but they are a linear rescaling of the D,G-transformed values (see the DG.log file). With a larger data sample, the training and test ranges would be more similar.
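The linear relationship described here can be checked numerically. The following is a minimal sketch in plain Python: it tests whether one list of values is a constant multiple of another, which is what a no-offset scaling should produce. The helper name and the sample values are hypothetical and are not taken from DG.log or DGS.log.

```python
def scale_factor(before, after, tol=1e-9):
    """Return k if after[i] == k * before[i] for every entry, else None."""
    pairs = list(zip(before, after))
    ratios = [a / b for b, a in pairs if b != 0.0]
    if not ratios:
        return None
    k = ratios[0]
    same_ratio = all(abs(r - k) <= tol for r in ratios)
    # A pure scaling must also leave zeros at zero (no offset).
    zeros_kept = all(a == 0.0 for b, a in pairs if b == 0.0)
    return k if same_ratio and zeros_kept else None


dg_values = [-1.5, 0.0, 0.75, 3.0]              # made-up D,G output
s_values = [v / 3.0 for v in dg_values]         # pure scaling
offset_values = [v / 3.0 + 0.1 for v in dg_values]  # scaling plus offset

print(scale_factor(dg_values, s_values))       # a constant close to 1/3
print(scale_factor(dg_values, offset_values))  # None
```

A check of this kind could form the core of the simple test requested earlier in the thread, applied to the values before and after the 'S' step.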

We can also see at the end of the training that the classification efficiencies for the training and test data closely match each other, which also suggests the transformation is working:

: Testing efficiency compared to training efficiency (overtraining check)
: -------------------------------------------------------------------------------
: DataSet   MVA        Signal efficiency: from test sample (from training sample)
: Name:     Method:    @b=0.01          @b=0.10          @b=0.30
: -------------------------------------------------------------------------------
: dataset   PyKeras  : 0.263 (0.228)    0.680 (0.673)    0.904 (0.908)
: dataset   Fisher   : 0.229 (0.192)    0.645 (0.640)    0.893 (0.896)
: -------------------------------------------------------------------------------
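To put a number on how closely the test and training efficiencies agree, the table can be reduced to the largest absolute gap per method. This is only an illustrative sketch: the figures are copied from the table above, and the helper name is hypothetical. TMVA itself does not report this quantity in the log shown.

```python
# (test, training) signal efficiencies at b = 0.01, 0.10, 0.30,
# copied from the overtraining check table above.
efficiencies = {
    "PyKeras": [(0.263, 0.228), (0.680, 0.673), (0.904, 0.908)],
    "Fisher": [(0.229, 0.192), (0.645, 0.640), (0.893, 0.896)],
}


def max_gap(pairs):
    """Largest absolute difference between test and training efficiency."""
    return max(abs(test - train) for test, train in pairs)


gaps = {name: round(max_gap(pairs), 3) for name, pairs in efficiencies.items()}
print(gaps)  # {'PyKeras': 0.035, 'Fisher': 0.037}
```

For both methods the largest gap occurs at the tightest background working point (b = 0.01), where the efficiencies are estimated from the fewest events.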

If you have any questions, please ask.

Thank you

lmoneta · 2021-02-12T15:23:33Z

@phsft-bot build

phsft-bot · 2021-02-12T15:23:41Z

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-fedora30/cxx14, ROOT-fedora31/noimt, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

phsft-bot · 2021-02-12T20:19:32Z

Build failed on windows10/cxx14.
Running on null:C:\build\workspace\root-pullrequests-build
See console output.

Errors:

[2021-02-12T20:19:31.987Z] CMake Error at C:/build/workspace/root-pullrequests-build/rootspi/jenkins/root-build.cmake:1049 (ctest_start):

josephmckenna · 2021-02-13T17:15:36Z

@phsft-bot build

josephmckenna · 2021-02-13T17:17:21Z

@phsft-bot build

Ahha, I don't have that power. I saw that the build failed and that this branch was way behind the master branch, so I simply merged in the current version of root-project/master.

sitongan

LGTM!

josephmckenna · 2021-06-23T08:14:24Z

Is there anything I can do to help expedite this pull request?

ferdymercury · 2025-02-17T18:13:26Z

> so simply merged in the current version of root-project/master

Please rebase onto the current master instead (e.g. with git rebase --interactive).

> Is there anything I can do to help expedite this pull request?

Based on your tutorial above, create a test in either the roottest or tmva/test folder.

guitargeek · 2025-10-20T22:08:00Z

Thank you very much! Given that this PR got a positive review by a TMVA expert, I don't think it should be held back from merging.

tmadlener · 2025-12-16T13:34:43Z

tmva/tmva/src/VariableNormalizeTransform.cxx

 {
+   TString UseOffsetOrNot;
+
+   gTools().ReadAttr(trfnode, "UseOffsetOrNot", UseOffsetOrNot );


I am fairly certain this breaks the reading of existing TMVA files that have not been written with the UseOffsetOrNot tag.

I currently get messages like:

<FATAL> : Trying to read non-existing attribute 'UseOffsetOrNot' from xml node 'Transform'

I am currently trying a simple fix locally and will open a PR once I have validated that it works.
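The general pattern behind such a fix is to treat the new attribute as optional and fall back to the legacy behaviour when it is absent. The sketch below shows that pattern with Python's standard xml.etree.ElementTree. It is not TMVA's gTools API and not the code of the follow-up PR. The attribute name UseOffsetOrNot comes from the diff above, while the attribute values and the legacy default are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical value meaning "behave as files written before the new tag".
LEGACY_DEFAULT = "UseOffset"


def read_offset_mode(trf_node):
    """Read the optional attribute, falling back to the legacy default."""
    return trf_node.get("UseOffsetOrNot", LEGACY_DEFAULT)


# An old-style node without the tag and a new-style node with it.
old_node = ET.fromstring('<Transform Name="Normalize"/>')
new_node = ET.fromstring('<Transform Name="Normalize" UseOffsetOrNot="NoOffset"/>')

print(read_offset_mode(old_node))  # UseOffset
print(read_offset_mode(new_node))  # NoOffset
```

Reading with a default keeps files written before the change loadable, at the cost of having to choose the default so that it reproduces the old behaviour.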

guitargeek · 2025-12-16

Thanks a lot! That is very kind.

josephmckenna requested review from ashlaban, couet, lmoneta and stwunsch as code owners August 1, 2019 10:54

josephmckenna changed the title ~~[TMVA][Preprocessing]~~ [TMVA][Preprocessing] - Additional normalisation method Aug 2, 2019

bellenot assigned lmoneta Aug 12, 2019

couet removed their request for review February 12, 2021 07:29

josephmckenna requested a review from sitongan as a code owner February 13, 2021 09:42

sitongan approved these changes Feb 15, 2021

guitargeek added the in:TMVA label Sep 5, 2023

Add scaling VarTransform functionality (like normalisation it linearl…

51a4a80

…y scales the data but the sign of the input and output data is retained).

guitargeek force-pushed the master branch from 5910301 to 51a4a80 Compare October 20, 2025 21:11

guitargeek approved these changes Oct 20, 2025

guitargeek merged commit ffdab66 into root-project:master Oct 20, 2025
21 of 26 checks passed

tmadlener reviewed Dec 16, 2025

tmadlener mentioned this pull request Dec 16, 2025

[tmva] Check for existence of xml node before reading it #20735

Merged

2 tasks
