Simple and quick duplicate check #908
base: master
Conversation
@zuckschwerdt the code seems to work but it is not tested in the field. Do you know any protocols where this would break functionality?
This is a much-requested feature and the code looks great. I can't think of a case where this might break a decoder. I'll run this for a day here.
@merbanan it looks like you accidentally put an rtl_433_tests commit on top of this?
The registered flex decoders have protocol_num==0; would it perhaps be better to use the
Force-pushed from f1dd693 to 21065fc
Gotta add that to .gitignore.
I have checked out rtl_433_tests below rtl_433 as well. Git seems to ignore directories that are a repo of their own.
It did not work for my git version, and I updated the code according to your suggestions.
Perhaps the (int)(uintptr_t) cast should be commented with something like, "intentionally drop the high-order bits on LP64". Playing with this I noticed that this silences all messages from one decoder for a time. We should perhaps also hash some data portion, like "id", if we can get at that? Maybe add a helper in data.c to iterate and hash well-known field values?
It silences it for the argument value in seconds. We could add a much more advanced dupcheck, but that would require much more parsing, which equals more CPU time. I have been pondering adding a dupcheck flag to specific decoders whose transmitting counterparts emit long signals. Then we could add this filtering by default for those signals, use -Y 0 to disable the duplicate filter, and -Y >0 to globally filter duplicates. It is like 10 more lines of code, and the out-of-the-box behavior would be that 1 signal gives 1 message, with overrides when needed. So what do you think is best? Do we want that flag to be available for decoders? It might be misused, but I think the user experience will be better if we implement it. The special cast is there to silence a warning; the reduction of bits is a side effect, but I'll add a comment.
People might have multiple sensors that only differ in the ID, and there are sensors that send multiple packets of different type (wind, rain, ...) which don't even differ in the id but only in other payload. We could implement a data_hash function to traverse the data, pick up all values, and hash that. Or we could retain the last sent message per decoder and do a tree compare. Another idea (like your flag, I guess): have the decoders which are known for long repeats add a "hash" themselves and store that. It will likely just be the whole 24 bits of message :) With no hash always pass through, otherwise if the hash matches, throttle.
The thing is I don't want to do a data_make() to create a structure that we later need to parse just to be able to hash it properly. We need to add the thing that is to be hashed in the proper place so that it is fast to get later. So no tree compare, just 32 bits with as high entropy as possible. If we add a member to data_t we can populate it after data_make() in the relevant decoders. Then later it is just a dereference and possibly a hash to know if it is a duplicate. I think that addresses all your points. If so, I will implement that.
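To illustrate the "hash well-known field values" idea, here is a minimal, self-contained sketch. It is not the actual rtl_433 data_t API: the kv_t struct, the key list, and the dup_hash name are assumptions for illustration. It derives a 32-bit FNV-1a fingerprint from identifying fields after a message has been assembled, so a later event can be compared with a single integer compare.

```c
/* Hypothetical sketch: derive a 32-bit fingerprint from selected fields of a
 * decoded message. kv_t is a simplified stand-in, not rtl_433's data_t. */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct kv {
    char const *key;
    char const *value;      /* values rendered as strings for this sketch */
    struct kv const *next;
} kv_t;

/* FNV-1a, 32-bit: cheap and good enough as a duplicate fingerprint. */
static uint32_t fnv1a32(uint32_t h, char const *s)
{
    while (*s) {
        h ^= (uint8_t)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Hash only "well known" identifying fields so repeats of the same reading
 * collide while different sensors or message types do not. */
static uint32_t dup_hash(kv_t const *data)
{
    static char const *const keys[] = {"model", "id", "channel", "type", NULL};
    uint32_t h = 2166136261u; /* FNV offset basis */
    for (kv_t const *d = data; d; d = d->next) {
        for (int i = 0; keys[i]; ++i) {
            if (strcmp(d->key, keys[i]) == 0) {
                h = fnv1a32(h, d->key);
                h = fnv1a32(h, d->value);
            }
        }
    }
    return h;
}

int main(void)
{
    kv_t ch    = {"channel", "1", NULL};
    kv_t id    = {"id", "42", &ch};
    kv_t model = {"model", "Acme-TH", &id};
    printf("fingerprint: %08" PRIx32 "\n", dup_hash(&model));
    return 0;
}
```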
Force-pushed from af2fd65 to 44a5c13
Sorry to ask, what is the status of this pull request?
It's still recommended to do duplicate (and plausibility) filtering in your consumer. The idea here needs more thought and possibly a change to all existing decoders. rtl_433's task is to decode every possible radio packet and give it to you. Coalescing and aggregation is an interesting topic but likely dependent on the specific use case. If you have some ideas and don't want to touch the C code, perhaps prototype a Python filter script (examples) for discussion.
I understand your point and agree that "multi-firing" is often part of the protocol. I saw a few different cases with my devices.
It's often just a repetition of the same message.
If the data packet is repeated in the same transmission we already only output a single event. When there is too much time between repeats we can't catch that in a single decoder call, though. We'd need to suppress subsequent events. But that raises the question of which decoders and what message contents we consider a "repeat". That's not easy to capture and decide. There are ideas like maybe some opt-in from decoders: a decoder could add a "hash" key if dups are expected, and the hash should then associate those. But that needs some prototyping and testing.
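A rough sketch of that opt-in pass-through/throttle rule, under the assumption that a decoder attaches an optional 32-bit hash (for example its raw message bits). The names and the window constant here are made up for illustration:

```c
/* Hypothetical gate for decoder-supplied hashes: no hash, always emit;
 * same hash within a short window, throttle. Names are assumptions. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define DUP_WINDOW_SECONDS 2 /* assumed throttle window */

typedef struct {
    bool     has_last;  /* a hash has been seen before */
    uint32_t last_hash;
    time_t   last_time;
} dup_state_t;

/* Returns true if the event should be emitted. */
bool dup_should_emit(dup_state_t *st, bool has_hash, uint32_t hash, time_t now)
{
    if (!has_hash)
        return true; /* decoder did not opt in, never suppress */
    if (st->has_last && st->last_hash == hash
            && now - st->last_time < DUP_WINDOW_SECONDS) {
        return false; /* repeat of the previous message, drop it */
    }
    st->has_last  = true;
    st->last_hash = hash;
    st->last_time = now;
    return true;
}

int main(void)
{
    dup_state_t st = {false, 0, 0};
    time_t now = time(NULL);
    printf("%d\n", dup_should_emit(&st, true, 0xABCDEFu, now)); /* 1: first   */
    printf("%d\n", dup_should_emit(&st, true, 0xABCDEFu, now)); /* 0: repeat  */
    printf("%d\n", dup_should_emit(&st, false, 0, now));        /* 1: no hash */
    return 0;
}
```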
I totally agree - I think we cannot avoid some manual configuration at some point :( This project is trying to build a "Tower of Babel" (and is quite successful), but I already see that some of my sensors are not well decoded and I have to blacklist some decoders :( When I look at the signal of some of my devices, it's basically just an id. Everything could be decoded as an id.
We already filter out the worst offenders (prone to false positives). That's why we really don't recommend to use
I don't follow. Do you mean beacons, motion, doorbell-style senders?
Yes, I have a doorbell and a few motion sensors :)
But I think the discussion is now out of scope for this ticket. Sorry for that!
The idea is to add dupcheck for specific decoders that have known repeats longer than the signal length, and then maybe also globally for all signals. One idea I had is that maybe the dupcheck only adds a duplicate tag. Then it is non-destructive and we help the consumer in its filtering process. Anyway, dupcheck would be a nice option for people who would like to use it.
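As a sketch of that non-destructive "only add a duplicate tag" variant: the output shape and the "duplicate" key below are assumptions for illustration, not existing rtl_433 output fields. The point is simply to tag the repeat and leave the dropping decision to the consumer.

```c
/* Hypothetical sketch: instead of dropping a repeat, tag it so the
 * consumer can decide. The JSON-ish output is illustrative only. */
#include <stdbool.h>
#include <stdio.h>

/* Emit the decoded message, adding a "duplicate" flag when the dup check
 * matched instead of suppressing the event. */
void emit_event(const char *model, int id, bool is_duplicate)
{
    printf("{\"model\": \"%s\", \"id\": %d, \"duplicate\": %s}\n",
           model, id, is_duplicate ? "true" : "false");
}

int main(void)
{
    emit_event("Acme-TH", 42, false); /* first reception */
    emit_event("Acme-TH", 42, true);  /* late repeat, tagged not dropped */
    return 0;
}
```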
There is already an option to convert values to SI units, which is helpful if you have some sensors that output °F and some °C. I would be more than happy to have this filter added as an option you can enable if you want, the same way you can for converting output to SI. I have a few sensors with this issue:
Force-pushed from 9f71ca8 to 3d36258
Force-pushed from 5f8552a to 6eb4c3e
Force-pushed from 8e31ece to e42ad33
Force-pushed from 93031a1 to ed794ec
Just chiming in that I'd love to have a flag that filters out duplicates. I guess I could do it on the client side but I'd rather not have to do that.
+1 😉 👍
Rudimentarily tested with:
rtl_433 -q -Y 1 -r rtl_433_tests/tests/honeywell_activlink/01/secret-press_868.428M_250k.cu8
It happily eats the duplicates and I am quite sure that this will filter out 99% of other signal duplicates. If you have several sensors of the same protocol that cycle and produce duplicates, then eventually some signals will be filtered that shouldn't be.
There is also an option to add selective filtering per protocol. That way every protocol decoder should be able to output one signal per sample. Doing it globally might filter some protocols that output different messages one after the other.
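To make the described behavior concrete, here is a minimal sketch of a global per-protocol time window, roughly what a "-Y <seconds>" style option could do. The function name, table size, and plumbing are assumptions, not the actual patch; note the caveat above that several sensors sharing one protocol would suppress each other.

```c
/* Hypothetical sketch of a global per-protocol duplicate window: remember
 * when each protocol number last produced output and drop anything from the
 * same protocol within the window. 0 seconds disables filtering. */
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

#define MAX_PROTOCOLS 256 /* assumed upper bound for this sketch */

static time_t last_output[MAX_PROTOCOLS]; /* 0 = never seen */

/* Returns true if a message from protocol_num should be output now. */
bool dup_check(unsigned protocol_num, unsigned window_seconds, time_t now)
{
    if (window_seconds == 0 || protocol_num >= MAX_PROTOCOLS)
        return true; /* filtering disabled or protocol out of range */
    if (last_output[protocol_num] != 0
            && now - last_output[protocol_num] < (time_t)window_seconds) {
        return false; /* same decoder fired again inside the window */
    }
    last_output[protocol_num] = now;
    return true;
}

int main(void)
{
    time_t now = time(NULL);
    printf("%d\n", dup_check(74, 1, now));     /* 1: first message passes */
    printf("%d\n", dup_check(74, 1, now));     /* 0: repeat is filtered   */
    printf("%d\n", dup_check(74, 1, now + 2)); /* 1: outside the window   */
    return 0;
}
```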