[RDataFrame] RDataFrame.Redefine does not work with Snapshot #10233

@AlkaidCheng

Description

After a column is redefined with RDataFrame.Redefine, calling Snapshot without an explicit column list raises an error: the redefined column is reported as having been passed to Snapshot twice.

Reproducer:

import ROOT
import numpy as np
rdf = ROOT.RDF.MakeNumpyDataFrame({"bar": np.arange(0, 10, 1)})
rdf = rdf.Redefine("bar", "bar + 1")
rdf.Snapshot("output", "output.root")

This gives:

TypeError: Template method resolution failed:
  none of the 3 overloaded methods succeeded. Full details:
  ROOT::RDF::RResultPtr<ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void> > ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void>::Snapshot(basic_string_view<char,char_traits<char> > treename, basic_string_view<char,char_traits<char> > filename, initializer_list<string> columnList, const ROOT::RDF::RSnapshotOptions& options = ROOT::RDF::RSnapshotOptions()) =>
    TypeError: takes at least 3 arguments (2 given)
  ROOT::RDF::RResultPtr<ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void> > ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void>::Snapshot(basic_string_view<char,char_traits<char> > treename, basic_string_view<char,char_traits<char> > filename, const vector<string>& columnList, const ROOT::RDF::RSnapshotOptions& options = ROOT::RDF::RSnapshotOptions()) =>
    TypeError: takes at least 3 arguments (2 given)
  ROOT::RDF::RResultPtr<ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void> > ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void>::Snapshot(basic_string_view<char,char_traits<char> > treename, basic_string_view<char,char_traits<char> > filename, basic_string_view<char,char_traits<char> > columnNameRegexp = "", const ROOT::RDF::RSnapshotOptions& options = ROOT::RDF::RSnapshotOptions()) =>
    logic_error: Error: column "bar" was passed to Snapshot twice. This is not supported: only one of the columns would be readable with RDataFrame.
  ROOT::RDF::RResultPtr<ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void> > ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void>::Snapshot(basic_string_view<char,char_traits<char> > treename, basic_string_view<char,char_traits<char> > filename, basic_string_view<char,char_traits<char> > columnNameRegexp = "", const ROOT::RDF::RSnapshotOptions& options = ROOT::RDF::RSnapshotOptions()) =>
    logic_error: Error: column "bar" was passed to Snapshot twice. This is not supported: only one of the columns would be readable with RDataFrame.

However, passing the column list explicitly works:

import ROOT
import numpy as np
rdf = ROOT.RDF.MakeNumpyDataFrame({"bar": np.arange(0, 10, 1)})
rdf = rdf.Redefine("bar", "bar + 1")
rdf.Snapshot("output", "output.root", ["bar"])
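
For data frames with many columns, typing the list by hand is impractical. Below is a minimal sketch of a generic workaround. It assumes the error comes from the default column selection listing the redefined column twice, which this report does not confirm. The helpers `unique_columns` and `snapshot_all` are hypothetical names, not part of ROOT. Only `GetColumnNames` and the three-argument `Snapshot` call are existing RDataFrame methods.

```python
def unique_columns(names):
    """Return the names as plain str, keeping the first occurrence of each."""
    seen = set()
    columns = []
    for name in names:
        name = str(name)  # PyROOT may hand back std::string objects
        if name not in seen:
            seen.add(name)
            columns.append(name)
    return columns


def snapshot_all(rdf, treename, filename):
    """Hypothetical wrapper: snapshot every column with an explicit, de-duplicated list."""
    columns = unique_columns(rdf.GetColumnNames())
    return rdf.Snapshot(treename, filename, columns)
```

With the reproducer above, this would be called as `snapshot_all(rdf, "output", "output.root")` in place of the two-argument `rdf.Snapshot(...)`. I have not verified this against the affected ROOT versions.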

ROOT Version: 6.26/00 (conda install), 6.27/01 (SWAN bleeding edge)
