'read beyond' in AwkwardForth error when reading vector<vector<object>> branch #1221
One extra piece of information: object1 has a lot of members, and I've found that the vector<vector< object1 >> can be read if we remove a lot of object1's members. Is there some limit on the size of an object we can read?
No, there's no size limit. There's evidently an error in the AwkwardForth code that was generated for one of the fields of this object, and when you removed fields, you removed the one that we're not handling correctly.
First, you can turn off the attempt to generate AwkwardForth like this:
>>> import uproot
>>> file = uproot.open("nts.owner.trkana-reco.version.sequencer.root")
>>> trkana = file["TrkAna"]["trkana"]
>>>
>>> trkana["demtsh"].interpretation._forth = False # secret back-door
>>>
>>> array = trkana["demtsh"].array()
>>> array
<Array [[[{plane: 4, panel: 1, ...}, ...]]] type='1 * var * var * struct[{p...'>
>>>
>>> array.type.show()
1 * var * var * struct[{
plane: int32,
panel: int32,
layer: int32,
straw: int32,
state: int32,
algo: int32,
frozen: bool,
usetot: bool,
usedriftdt: bool,
useabsdt: bool,
usendvar: bool,
bkgqual: float32,
signqual: float32,
driftqual: float32,
chi2qual: float32,
earlyend: int32,
edep: float32,
wdist: float32,
werr: float32,
tottdrift: float32,
etime: 2 * float32,
tot: 2 * float32,
ptoca: float32,
stoca: float32,
rdoca: float32,
rdocavar: float32,
rdt: float32,
rtocavar: float32,
udoca: float32,
udocavar: float32,
udt: float32,
utocavar: float32,
rupos: float32,
uupos: float32,
rdrift: float32,
cdrift: float32,
sderr: float32,
uderr: float32,
dvel: float32,
lang: float32,
utresid: float32,
utresidmvar: float32,
utresidpvar: float32,
udresid: float32,
udresidmvar: float32,
udresidpvar: float32,
rtresid: float32,
rtresidmvar: float32,
rtresidpvar: float32,
rdresid: float32,
rdresidmvar: float32,
rdresidpvar: float32,
wdot: float32,
poca: struct[{
fCoordinates: struct[{
fX: float32,
fY: float32,
fZ: float32
}, parameters={"__record__": "ROOT::Math::Cartesian3D<float>"}]
}, parameters={"__record__": "ROOT::Math::DisplacementVector3D<ROOT::Math::Cartesian3D<float>,ROOT::Math::DefaultCoordinateSystemTag>"}],
dhit: bool,
dactive: bool
}, parameters={"__record__": "mu2e::TrkStrawHitInfo"}]
That can help debugging, but we should fix this bug in AwkwardForth generation. If you have a large file, you can look at the time needed to load the
We can also see the AwkwardForth code that it generates, through another back-door attribute, which is only populated after it attempts to generate code and fails. So, in a new session,
>>> import uproot
>>> file = uproot.open("nts.owner.trkana-reco.version.sequencer.root")
>>> trkana = file["TrkAna"]["trkana"]
>>>
>>> trkana["demtsh"].array()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/jpivarski/irishep/uproot5/src/uproot/behaviors/TBranch.py", line 1819, in array
_ranges_or_baskets_to_arrays(
File "/Users/jpivarski/irishep/uproot5/src/uproot/behaviors/TBranch.py", line 3105, in _ranges_or_baskets_to_arrays
uproot.source.futures.delayed_raise(*obj)
File "/Users/jpivarski/irishep/uproot5/src/uproot/source/futures.py", line 38, in delayed_raise
raise exception_value.with_traceback(traceback)
File "/Users/jpivarski/irishep/uproot5/src/uproot/behaviors/TBranch.py", line 3054, in basket_to_array
basket_arrays[basket.basket_num] = interpretation.basket_array(
File "/Users/jpivarski/irishep/uproot5/src/uproot/interpretation/objects.py", line 162, in basket_array
output = self.basket_array_forth(
File "/Users/jpivarski/irishep/uproot5/src/uproot/interpretation/objects.py", line 308, in basket_array_forth
context["forth"].vm.resume()
ValueError: 'read beyond' in AwkwardForth runtime: tried to read beyond the end of an input
>>>
>>> print(trkana["demtsh"].interpretation._complete_forth_code)
input stream
input byteoffsets
input bytestops
output node1308208521282257931-offsets int64
output node1334627086166945291-offsets int64
output node8524218671830622932-data int32
output node8524218671830622933-data int32
output node8524218671830622934-data int32
output node8524218671830622935-data int32
output node8524218671830622936-data int32
output node8524218671830622937-data int32
output node8524218671830622938-data bool
output node8524218671830622939-data bool
output node8524218671830622940-data bool
output node8524218671830622941-data bool
output node8524218671830622942-data bool
output node8524218671830622943-data float32
output node8524218671830622944-data float32
output node8524218671830622945-data float32
output node8524218671830622946-data float32
output node8524218671830622947-data int32
output node8524218671830622948-data float32
output node8524218671830622949-data float32
output node8524218671830622950-data float32
output node8524218671830622951-data float32
output node8524218671830622952-data float32
output node8524218671830622953-offsets int64
output node8524218671830622954-data float32
output node8524218671830622955-offsets int64
output node8524218671830622956-data int32
output node8524218671830622957-data int32
output node8524218671830622958-data int32
output node8524218671830622959-data int32
output node8524218671830622960-data int32
output node8524218671830622961-data int32
output node8524218671830622962-data bool
output node8524218671830622963-data bool
output node8524218671830622964-data bool
output node8524218671830622965-data bool
output node8524218671830622966-data bool
output node8524218671830622967-data float32
output node8524218671830622968-data float32
output node8524218671830622969-data float32
output node8524218671830622970-data float32
output node8524218671830622971-data int32
output node8524218671830622972-data float32
output node8524218671830622973-data float32
output node8524218671830622974-data float32
output node8524218671830622975-data float32
output node1365998019369373946-data float32
output node1365998019369373947-data float32
output node1365998019369373948-data float32
output node8524218671830622976-data int32
output node8524218671830622977-data int32
output node8524218671830622978-data int32
output node8524218671830622979-data int32
output node8524218671830622980-data int32
output node8524218671830622981-data int32
output node8524218671830622982-data bool
output node8524218671830622983-data bool
output node8524218671830622984-data bool
output node8524218671830622985-data bool
output node8524218671830622986-data bool
output node8524218671830622987-data float32
output node8524218671830622988-data float32
output node8524218671830622989-data float32
output node8524218671830622990-data float32
output node8524218671830622991-data int32
output node8524218671830622992-data float32
output node8524218671830622993-data float32
output node8524218671830622994-data float32
output node8524218671830622995-data float32
0 node1308208521282257931-offsets <- stack
0 node1334627086166945291-offsets <- stack
0 node8524218671830622953-offsets <- stack
0 node8524218671830622955-offsets <- stack
0 do
byteoffsets I-> stack
stream seek
6 stream skip
stream !I-> stack
dup node1308208521282257931-offsets +<- stack
0 do
stream !I-> stack
dup node1334627086166945291-offsets +<- stack
0 do
0 stream skip
6 stream skip
4 stream skip
stream !i-> node8524218671830622932-data
stream !i-> node8524218671830622933-data
stream !i-> node8524218671830622934-data
stream !i-> node8524218671830622935-data
stream !i-> node8524218671830622936-data
stream !i-> node8524218671830622937-data
stream !?-> node8524218671830622938-data
stream !?-> node8524218671830622939-data
stream !?-> node8524218671830622940-data
stream !?-> node8524218671830622941-data
stream !?-> node8524218671830622942-data
stream !f-> node8524218671830622943-data
stream !f-> node8524218671830622944-data
stream !f-> node8524218671830622945-data
stream !f-> node8524218671830622946-data
stream !i-> node8524218671830622947-data
stream !f-> node8524218671830622948-data
stream !f-> node8524218671830622949-data
stream !f-> node8524218671830622950-data
stream !f-> node8524218671830622951-data
2 dup node8524218671830622953-offsets +<- stack
stream #!f-> node8524218671830622952-data
2 dup node8524218671830622955-offsets +<- stack
stream #!f-> node8524218671830622954-data
stream !i-> node8524218671830622956-data
stream !i-> node8524218671830622957-data
stream !i-> node8524218671830622958-data
stream !i-> node8524218671830622959-data
stream !i-> node8524218671830622960-data
stream !i-> node8524218671830622961-data
stream !?-> node8524218671830622962-data
stream !?-> node8524218671830622963-data
stream !?-> node8524218671830622964-data
stream !?-> node8524218671830622965-data
stream !?-> node8524218671830622966-data
stream !f-> node8524218671830622967-data
stream !f-> node8524218671830622968-data
stream !f-> node8524218671830622969-data
stream !f-> node8524218671830622970-data
stream !i-> node8524218671830622971-data
stream !f-> node8524218671830622972-data
stream !f-> node8524218671830622973-data
stream !f-> node8524218671830622974-data
stream !f-> node8524218671830622975-data
0 stream skip
6 stream skip
4 stream skip
0 stream skip
6 stream skip
4 stream skip
stream !f-> node1365998019369373946-data
stream !f-> node1365998019369373947-data
stream !f-> node1365998019369373948-data
stream !i-> node8524218671830622976-data
stream !i-> node8524218671830622977-data
stream !i-> node8524218671830622978-data
stream !i-> node8524218671830622979-data
stream !i-> node8524218671830622980-data
stream !i-> node8524218671830622981-data
stream !?-> node8524218671830622982-data
stream !?-> node8524218671830622983-data
stream !?-> node8524218671830622984-data
stream !?-> node8524218671830622985-data
stream !?-> node8524218671830622986-data
stream !f-> node8524218671830622987-data
stream !f-> node8524218671830622988-data
stream !f-> node8524218671830622989-data
stream !f-> node8524218671830622990-data
stream !i-> node8524218671830622991-data
stream !f-> node8524218671830622992-data
stream !f-> node8524218671830622993-data
stream !f-> node8524218671830622994-data
stream !f-> node8524218671830622995-data
loop
loop
loop
which is auto-generated and not very readable, even with reference to the AwkwardForth documentation. But somewhere, on one of those lines, is a bug.
You can remove class members from object2 and toggle whether it's successful or not? Can you find the minimum set of fields that causes it to fail? That would help in debugging, to see the error with less other stuff around.
Incidentally, the
>>> import uproot
>>> file = uproot.open("nts.owner.trkana-reco.version.sequencer.root")
>>> trkana = file["TrkAna"]["trkana"]
>>>
>>> array = trkana["demtshmc"].array()
>>> array
<Array [[[{pdg: 11, gen: 38, ...}, ...]]] type='1 * var * var * struct[{pdg...'>
>>> array.type.show()
1 * var * var * struct[{
pdg: int32,
gen: int32,
startCode: int32,
ambig: int32,
earlyend: int32,
rel: struct[{
_rel: int8,
_rem: int8
}, parameters={"__record__": "mu2e::MCRelationship"}],
pdg: int32,
gen: int32,
startCode: int32,
ambig: int32,
earlyend: int32,
cpos: struct[{
fCoordinates: struct[{
fX: float32,
fY: float32,
fZ: float32
}, parameters={"__record__": "ROOT::Math::Cartesian3D<float>"}]
}, parameters={"__record__": "ROOT::Math::DisplacementVector3D<ROOT::Math::Cartesian3D<float>,ROOT::Math::DefaultCoordinateSystemTag>"}]
}, parameters={"__record__": "mu2e::TrkStrawHitInfoMC"}]
>>>
>>> print(trkana["demtshmc"].interpretation._complete_forth_code)
input stream
input byteoffsets
input bytestops
output node8610624215831872535-offsets int64
output node1465212339864199009-offsets int64
output node2518444527281408638-data int32
output node2518444527281408639-data int32
output node2518444527281408640-data int32
output node2518444527281408641-data int32
output node2518444527281408642-data int32
output node3012499898703986055-data int8
output node3012499898703986056-data int8
output node2518444527281408643-data int32
output node2518444527281408644-data int32
output node2518444527281408645-data int32
output node2518444527281408646-data int32
output node2518444527281408647-data int32
output node3722848300881395695-data float32
output node3722848300881395696-data float32
output node3722848300881395697-data float32
0 node8610624215831872535-offsets <- stack
0 node1465212339864199009-offsets <- stack
0 do
byteoffsets I-> stack
stream seek
6 stream skip
stream !I-> stack
dup node8610624215831872535-offsets +<- stack
0 do
stream !I-> stack
dup node1465212339864199009-offsets +<- stack
0 do
0 stream skip
6 stream skip
4 stream skip
stream !i-> node2518444527281408638-data
stream !i-> node2518444527281408639-data
stream !i-> node2518444527281408640-data
stream !i-> node2518444527281408641-data
stream !i-> node2518444527281408642-data
0 stream skip
6 stream skip
4 stream skip
stream !b-> node3012499898703986055-data
stream !b-> node3012499898703986056-data
stream !i-> node2518444527281408643-data
stream !i-> node2518444527281408644-data
stream !i-> node2518444527281408645-data
stream !i-> node2518444527281408646-data
stream !i-> node2518444527281408647-data
0 stream skip
6 stream skip
4 stream skip
0 stream skip
6 stream skip
4 stream skip
stream !f-> node3722848300881395695-data
stream !f-> node3722848300881395696-data
stream !f-> node3722848300881395697-data
loop
loop
loop
It seems to have a lot fewer class members.
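One way to approach the request above (finding the minimum set of fields that still fails) is a greedy reduction: drop one class member at a time and keep the drop whenever the read still fails. The following is only a sketch under stated assumptions. The `fails` callback is hypothetical; in practice it would have to regenerate the ROOT file with the reduced struct and try to read the branch with uproot. The `toy_fails` predicate is invented purely to exercise the loop and is not a claim about the actual bug.

```python
def minimize_failing_fields(fields, fails):
    """Greedily drop fields one at a time while `fails(fields)` stays true.

    `fields` is an ordered list of member names; order is preserved.
    `fails` is a hypothetical callback returning True if reading fails.
    """
    current = list(fields)
    if not fails(current):
        raise ValueError("the starting field list does not fail")
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if fails(candidate):
                # The failure survives without this field, so drop it
                # and restart the scan on the shorter list.
                current = candidate
                changed = True
                break
    return current


def toy_fails(fields):
    """Invented stand-in: 'fails' when poca appears before dhit."""
    return (
        "poca" in fields
        and "dhit" in fields
        and fields.index("poca") < fields.index("dhit")
    )


reduced = minimize_failing_fields(["wdot", "poca", "dhit", "dactive"], toy_fails)
```

The result is minimal only in the sense that removing any single remaining field makes the failure go away; because the loop never reorders fields, an order-dependent failure is preserved.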
Thanks, Jim! I've managed to reduce the problem down but it is very strange and seems to depend on the location of the
with the following forth code
However, ordered like this, the branch can't be read:
with the following forth code:
Here's the
I also notice there's an extra node7 when it works vs when it doesn't... Does any of that help?
You've definitely narrowed in on it: when the class members are ordered differently, some of them are getting wrong data-reading instructions, and at the end, it runs off the end of the dataset.
Let me try to focus on what's going wrong here. The class members that work are
float wdot = 0; // one float
bool dhit = false; // one bool
XYZVectorF poca; // some superclass stuff and then three floats
bool dactive = false; // one bool
The generated output declarations are
output node2-data float32 ( wdot )
output node3-data bool ( dhit )
output node4-data float32 ( poca.x )
output node5-data float32 ( poca.y )
output node6-data float32 ( poca.z )
output node7-data bool ( dactive )
and the generated read instructions are
stream !f-> node2-data ( wdot )
stream !?-> node3-data ( dhit )
0 stream skip ( superclasses of XYZVectorF, probably TObject )
6 stream skip
4 stream skip
0 stream skip
6 stream skip
4 stream skip
stream !f-> node4-data ( poca.x )
stream !f-> node5-data ( poca.y )
stream !f-> node6-data ( poca.z )
stream !?-> node7-data ( dactive )
Good. The class members that don't work are
float wdot = 0; // one float
XYZVectorF poca; // some superclass stuff and then three floats
bool dhit = false; // one bool
bool dactive = false; // one bool
The generated output declarations are
output node2-data float32 ( wdot )
output node3-data float32 ( poca.x )
output node4-data float32 ( poca.y )
output node5-data float32 ( poca.z )
output node6-data float32 ( ??? )
and the generated read instructions are
stream !f-> node2-data ( wdot )
0 stream skip ( superclasses of XYZVectorF, probably TObject )
6 stream skip
4 stream skip
0 stream skip
6 stream skip
4 stream skip
stream !f-> node3-data ( poca.x )
stream !f-> node4-data ( poca.y )
stream !f-> node5-data ( poca.z )
stream !f-> node6-data ( ??? )
Okay, so the wrong one seems to consistently be turning the two booleans into a single float. That's bizarre. I'm not sure whether that float is supposed to be
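The two instruction listings above differ in how many bytes they consume per object, which would be consistent with the 'read beyond' error. The sketch below reproduces that arithmetic with Python's `struct` module rather than AwkwardForth. It assumes (this is not confirmed in the thread) that each object is stored big-endian as wdot, then 20 bytes of nested-object headers (the six `stream skip` counts, 0 + 6 + 4 + 0 + 6 + 4), then three floats for poca, then the two one-byte bools. `make_record` and `read_records` are hypothetical helpers, not uproot functions.

```python
import struct

# Sum of the six "stream skip" counts in the generated code (assumed to be
# header bytes that precede the three poca floats).
HEADER_BYTES = 0 + 6 + 4 + 0 + 6 + 4


def make_record(wdot, poca, dhit, dactive):
    """Pack one object in the assumed on-disk order of the failing layout."""
    return (
        struct.pack(">f", wdot)
        + bytes(HEADER_BYTES)
        + struct.pack(">3f", *poca)
        + struct.pack(">??", dhit, dactive)
    )


def read_records(buffer, count, tail_format):
    """Read `count` objects, decoding the trailing members with `tail_format`.

    "??" matches the class definition (two bools, 2 bytes);
    "f" mimics the mis-generated single float32 read (4 bytes).
    """
    record = struct.Struct(f">f{HEADER_BYTES}x3f{tail_format}")
    position = 0
    results = []
    for _ in range(count):
        if position + record.size > len(buffer):
            raise ValueError("tried to read beyond the end of an input")
        results.append(record.unpack_from(buffer, position))
        position += record.size
    return results


data = b"".join(make_record(1.5, (1.0, 2.0, 3.0), True, False) for _ in range(3))
good = read_records(data, 3, "??")  # 38 bytes per object, fits exactly
```

Under these assumptions, the correct layout is 38 bytes per object while the mis-generated instructions consume 40, so the reader drifts by 2 bytes per object and eventually asks for bytes past the end of the buffer; calling `read_records(data, 3, "f")` raises `ValueError` on the third object.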
The issue should be somewhere here: uproot5/src/uproot/streamers.py Lines 1000 to 1012 in d38f72a
and it's possibly going here by mistake: uproot5/src/uproot/streamers.py Lines 973 to 983 in d38f72a
I don't see anything here that seems to depend on boolean versus float, but there is a different
Actually, it's very suspicious that the multiple basic types block uses
The first thing I can test (when I get back to my computer) is to see if all of our tests pass with
I'm now 90% sure that PR #1224 is a correct fix. If using that PR branch eliminates your problem, then I'll be 99.9% sure. We've never had a case that tests this combination before (described in detail in the PR). Could you make small files (<< 1 MB) for us to add to our test suite? That would ensure that it doesn't get reverted.
Thanks, Jim! Looks like it works!
and the already-working one still works too
This is the uproot version number
Thanks again!
Thanks for the files! I've added a test and will merge the PR as soon as the tests pass.
Hi,
We are seeing the following error when we try to read a vector<vector< object1 >> branch of a TTree:
One odd thing is that we have another vector<vector< object2 >> branch which works fine. The two objects are defined as structs and there doesn't look to be any difference in how they are implemented and added to the TTree. Any advice as to what the problem might be?
I've attached a small example file and a python script to recreate the error and show that the object2 branch works. Our uproot version is 5.3.7.
Let me know if there's any more information you need.
Thanks,
Andy
nts.owner.trkana-reco.version.sequencer.root.txt
python_script.py.txt