-
It may be worthwhile to map out which concerns Tonic is trying to optimize for: memory, processing, usability, ...? I don't have hard data on it, but my gut feeling agrees with you, @biphasic, that the memory savings are negligible when moving towards byte or uint arrays. However, if there is a need to optimize for memory, I can't see a reason why the raw event-based data couldn't be encoded in byte tensors. During processing, the situation will naturally change. One simple heuristic to apply would be to use the least common denominator, in that any filter can choose to "augment" the datatype. If a filter receives a float tensor, it shouldn't truncate it (unless that's part of its purpose), but it's always easy to widen the type. I realize this won't provide any solid answer, but I have a hard time seeing how else one would go about this - without overengineering some fancy OOP hierarchy of course :-)
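To make the heuristic concrete, here is a rough NumPy sketch. The filter names `flip_x` and `jitter_time` and the x, y, t, p column order are made up for illustration and are not Tonic's API. A purely spatial filter keeps whatever dtype it receives, while a filter that needs fractional values widens the dtype and never narrows it.

```python
import numpy as np

def flip_x(events, sensor_width):
    """Purely spatial filter: returns the same dtype it was given."""
    out = events.copy()
    out[:, 0] = sensor_width - 1 - out[:, 0]
    return out

def jitter_time(events, std, rng):
    """Adds fractional noise to timestamps, so it widens the dtype if needed."""
    # Promote to a dtype that can hold both the input and float64 noise.
    out = events.astype(np.result_type(events.dtype, np.float64))
    out[:, 2] += rng.normal(0.0, std, size=len(out))
    return out

# Hypothetical column order: x, y, t, p
events = np.array([[0, 1, 10, 1], [3, 2, 20, 0]], dtype=np.uint16)
flipped = flip_x(events, sensor_width=4)        # stays uint16
jittered = jitter_time(flipped, std=0.5, rng=np.random.default_rng(0))  # becomes float64
```

The promotion is left to `np.result_type`, so a float input stays float and an integer input is only widened by the filters that actually need it.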
-
Unsurprisingly, I still think structured arrays would be the nicest solution - no idea about the issues of converting to tensors though 🤷‍♂️ I feel having integer coordinates would be nicer for building frames too
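Roughly what I have in mind, as a sketch only: the field names and widths below are assumptions and not an agreed format. With integer coordinates the events can index straight into a frame, and `structured_to_unstructured` from `numpy.lib.recfunctions` is one possible route to a plain 2D array that a tensor library could then consume.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical field names and widths; a real sensor may need different ones.
event_dtype = np.dtype(
    [("x", np.uint16), ("y", np.uint16), ("t", np.uint64), ("p", np.bool_)]
)

events = np.array(
    [(0, 1, 10, True), (3, 2, 20, False), (0, 1, 30, True)], dtype=event_dtype
)

def to_frame(events, width, height):
    """Count events per (polarity, y, x) bin, using the integer fields as indices."""
    frame = np.zeros((2, height, width), dtype=np.int32)
    # np.add.at accumulates correctly even when the same pixel occurs more than once.
    np.add.at(frame, (events["p"].astype(np.intp), events["y"], events["x"]), 1)
    return frame

frame = to_frame(events, width=4, height=3)

# One possible conversion to a plain N x 4 integer array.
dense = rfn.structured_to_unstructured(events, dtype=np.int64)
```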
-
I have a different view about this.
I think Dataset should output the raw unsigned addresses, and the relevant transforms should then be applied to convert them to the right data type. This way there is an unambiguous mapping between the address of the event and its representation in PyTorch.
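As an illustration of the idea, here is a minimal sketch of such a decoding transform. The bit layout (polarity in bit 0, x in bits 1-7, y in bits 8-14) and the name `decode_address` are invented for this example. Every sensor has its own layout, which is exactly why the decoding would live in a transform rather than in the dataset.

```python
import numpy as np

# Hypothetical layout of a 32-bit address:
# bit 0 = polarity, bits 1-7 = x, bits 8-14 = y.
X_SHIFT, Y_SHIFT, COORD_MASK = 1, 8, 0x7F

def decode_address(addresses):
    """Split raw unsigned addresses into x, y and polarity arrays."""
    addresses = np.asarray(addresses, dtype=np.uint32)
    p = addresses & 1
    x = (addresses >> X_SHIFT) & COORD_MASK
    y = (addresses >> Y_SHIFT) & COORD_MASK
    return x, y, p

# Two raw addresses built by hand: (x=5, y=2, p=1) and (x=0, y=7, p=0).
raw = np.array([(2 << 8) | (5 << 1) | 1, (7 << 8) | (0 << 1) | 0], dtype=np.uint32)
x, y, p = decode_address(raw)
```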
-
Currently, arrays of events are in NxD format and all floats. We have a separate variable, ordering, that decides which column encodes x, y, t, p. This provides maximum flexibility, but is not super user-friendly. @neworderofjamie made me rethink the options to encode event data. Here is what I have so far:
I would greatly appreciate other opinions @aMarcireau @Jegp @neworderofjamie @eneftci.
Some considerations: bear in mind that while some people use the raw event ndarrays, others use event tensors and yet others create frame representations. Also, event dataset encoding is very heterogeneous, especially in terms of ordering and timestamp resolution.
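For reference, the current NxD layout with a separate ordering can be sketched like this. The helper `column` and the sample values are only illustrative and not the actual implementation.

```python
import numpy as np

# Hypothetical ordering string; each dataset may define a different one.
ordering = "xytp"
# N x D array of events, all floats, with columns interpreted via `ordering`.
events = np.array([[0.0, 1.0, 10.0, 1.0], [3.0, 2.0, 20.0, 0.0]])

def column(events, ordering, name):
    """Look up one event attribute by its position in the ordering string."""
    return events[:, ordering.index(name)]

t = column(events, ordering, "t")
```

The same array means something different under another ordering such as "txyp", which is where the flexibility and the lack of user-friendliness both come from.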