[RDataFrame] Unable to cacheread remote file #15028

Open
1 task done
AlkaidCheng opened this issue Mar 21, 2024 · 1 comment
@AlkaidCheng

Check duplicate issues.

  • Checked for duplicates

Description

When the input files to RDataFrame are remote, forced caching of remote files has no effect: the remote files are downloaded again on every run.

Reproducer

import os
import ROOT

user = os.environ['USER']
outdir = f"/eos/user/{user[0]}/{user}"
filename = os.path.join(outdir, "test.root")
# create dummy root file
ROOT.RDataFrame(100).Define("x", "1").Snapshot("test", filename)

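# enable the cache directory and force cache reads for remote files
# (arguments: cacheDir, operateDisconnected, forceCacheread)
ROOT.TFile.SetCacheFileDir("/tmp", True, True)
# expected: the file is read from the local cache; observed: no cached copy is used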
ROOT.RDataFrame("test", f"root://eosuser.cern.ch/{filename}").Sum("x").GetValue()

This happens because RDataFrame internally creates its TChain via ROOT.Internal.TreeUtils.MakeChainForMT(treename), which constructs the TChain in ROOT.TChain.kWithoutGlobalRegistration mode. That mode forces the TFile open option to be "READ_WITHOUT_GLOBALREGISTRATION". As a result, the TFile is opened without caching, because the fgCacheFileForce flag is only checked when the option is "READ".
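To make the mismatch concrete, here is a minimal pure-Python sketch of the decision as described above. It is only a simplified model, not ROOT's actual source: the helper name `opens_from_cache` and its parameters are hypothetical, and the real check in TFile::Open may involve further conditions.

```python
# Simplified model of the caching decision described in this report.
# NOTE: `opens_from_cache` is a hypothetical helper, not a ROOT API.

def opens_from_cache(option: str, cache_file_force: bool) -> bool:
    """Return True if this model would take the forced cache-read path."""
    # The forced-cache flag is only honoured when the option is "READ".
    return cache_file_force and option.upper() == "READ"


# A plain read: forced caching applies.
plain = opens_from_cache("READ", cache_file_force=True)

# The option produced by a chain without global registration:
# the comparison against "READ" fails, so caching is skipped.
chained = opens_from_cache("READ_WITHOUT_GLOBALREGISTRATION", cache_file_force=True)

print(plain, chained)  # prints: True False
```

Under this model the flag itself is set correctly; it is the decorated option string that prevents it from ever being consulted.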

ROOT version

6.30/04 (LCG105a)

Installation method

LCG (Swan)

Operating system

Linux

Additional context

No response

@AlkaidCheng
Author

I think one possible solution would be to manually edit the options (like here) inside TFile::Open (i.e. somewhere here) so that the _WITHOUT_GLOBALREGISTRATION suffix does not interfere with the remote caching decision.
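The idea can be sketched in pure Python as follows. This is an illustration of the proposal only, under the assumption that the caching decision is a comparison of the option string against "READ"; `strip_registration_suffix` and `opens_from_cache_fixed` are hypothetical names, and an actual fix would have to be made in ROOT's C++ code.

```python
# Sketch of the proposed fix: ignore the registration suffix when
# deciding whether to read through the cache.
# NOTE: all names below are hypothetical, not ROOT APIs.

SUFFIX = "_WITHOUT_GLOBALREGISTRATION"


def strip_registration_suffix(option: str) -> str:
    """Return the upper-cased option without the registration suffix."""
    upper = option.upper()
    if upper.endswith(SUFFIX):
        return upper[: -len(SUFFIX)]
    return upper


def opens_from_cache_fixed(option: str, cache_file_force: bool) -> bool:
    """Caching decision that compares the suffix-free option to "READ"."""
    return cache_file_force and strip_registration_suffix(option) == "READ"


# With the suffix stripped, the chain's option is treated like a plain read.
fixed = opens_from_cache_fixed("READ_WITHOUT_GLOBALREGISTRATION", True)
print(fixed)  # prints: True
```

Only the caching decision uses the stripped string here; the original option would still be passed on so that the file is opened without global registration.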
