
Simple and quick duplicate check #908

Open · wants to merge 4 commits into master
Conversation

merbanan
Owner

Rudimentary tested with:

rtl_433 -q -Y 1 -r rtl_433_tests/tests/honeywell_activlink/01/secret-press_868.428M_250k.cu8

It happily eats the duplicates, and I am quite sure this will filter out 99% of other signal duplicates. If you have several sensors of the same protocol that cycle and produce duplicates, then eventually some signals will get filtered that shouldn't be.

There is also the option of adding selective filtering per protocol. That way every protocol decoder should be able to output one signal per sample. Doing it globally might filter some protocols that output different messages one after the other.

@merbanan
Owner Author

@zuckschwerdt the code seems to work, but it has not been tested in the field. Do you know of any protocols where this would break functionality?

@zuckschwerdt
Collaborator

This is a much-requested feature and the code looks great. I can't think of a case where this might break a decoder. I'll run this for a day here.

@zuckschwerdt
Collaborator

@merbanan it looks like you accidentally put an rtl_433_tests commit on top of this?

@zuckschwerdt
Collaborator

The registered flex decoders have protocol_num==0; would it perhaps be better to use the r_device pointer as the hash key? The hash function will continue to work, and 101 is chosen well, e.g. (uintptr_t)&cfg.devices[i] % 101 shows a uniform distribution.

@merbanan
Owner Author

Gotta add that to .gitignore.

@zuckschwerdt
Collaborator

I have also checked out rtl_433_tests below rtl_433. Git seems to ignore directories that are a repo of their own.

@merbanan
Owner Author

It did not work for my git version, and I updated the code according to your suggestions.

@zuckschwerdt
Collaborator

Perhaps the (int)(uintptr_t) cast should be commented with something like, "intentionally drop the high order bits on LP64".

Playing with this I noticed that this silences all messages from one decoder for a time. We should perhaps also hash some data portion, like "id" -- if we can get at that? Maybe add a helper in data.c to iterate and hash well-known field values?

@merbanan
Owner Author

It silences it for the argument value in seconds. We could add a much more advanced dupcheck, but that would require much more parsing, which equals more CPU time. I have been pondering adding a dupcheck flag to specific decoders whose transmitting counterparts emit long signals. Then we could enable this filtering by default for those signals, and use -Y 0 to disable the duplicate filter or -Y >0 to globally filter duplicates. It is about 10 more lines of code, and the out-of-the-box behavior would be one signal gives one message, with overrides when needed.

So what do you think is best? Do we want that flag to be available for decoders? It might be misused. But I think the user experience will be better if we implement it.

The special cast is there to silence a warning; the reduction of bits is a side effect, but I'll add a comment.

@zuckschwerdt
Collaborator

People might have multiple sensors that only differ in the ID, and there are sensors that send multiple packets of different types (wind, rain, ...) that don't even differ in the id but in other payload.

We could implement a data_hash function to traverse the data, pick up all values, and hash that.
It could be built similarly to a data output.

Or we could retain the last sent message per decoder and do a tree compare.

Another idea (like your flag, I guess): have the decoders which are known for long repeats add a "hash" themselves and store that. It will likely just be the whole 24 bits of the message :) With no hash, always pass through; otherwise, if the hash matches, throttle.
That would require adding this to all decoders, but it also puts the "hash" where it is most efficient and perhaps even allows that value to be passed downstream for collation there.

@merbanan
Owner Author

The thing is, I don't want to do a data_make() to create a structure that we later need to parse just to be able to hash it properly. We need to add the thing that is to be hashed in the proper place so that it is fast to get at later. So no tree compare, just 32 bits with as high entropy as possible.

If we add a member to data_t we can populate it after data_make() in the relevant decoders. Then later it is just a dereference and possibly a hash to know if it is a duplicate. I think that addresses all your points. If so, I will implement that.

@gabrielklein

Sorry to ask, what is the status of this pull request?

@zuckschwerdt
Collaborator

It's still recommended to do duplicate (and plausibility) filtering in your consumer. The idea here needs more thought and possibly a change to all existing decoders.

rtl_433's task is to decode every possible radio packet and give it to you. Coalescing and aggregation is an interesting topic but likely dependent on the specific use case.

If you have some ideas and don't want to touch the C code, perhaps prototype a Python filter script (examples) for discussion.

@gabrielklein

I understand your point and agree that "multi-firing" is often part of the protocol.

I saw a few different cases in my items.

  • Multi-firing is used as a way to add redundancy. As an example, I have a temperature module that sends a temperature 5-6 times every minute. => We should send it only once.
  • Multi-firing is used as a way to have a "still pressing" approach. => We should send the "start" and the "end" in some cases, just the start in others.

It's often just a repetition of the same message.

@zuckschwerdt
Collaborator

If the data packet is repeated in the same transmission we already only output a single event. When there is too much time between repeats we can't catch that in a single decoder call, though. We'd need to suppress subsequent events. But that raises the question of which decoders, and what message contents, we consider a "repeat". That's not easy to capture and decide.

There are ideas like maybe some opt-in from decoders: a decoder could add a "hash" key if dups are expected and the hash should then associate those. But that needs some prototyping and testing.

@gabrielklein

I totally agree - I think we cannot avoid some manual configuration at some point :( This project is trying to build a "Tower of Babel" (and is quite successful) - but I already see that some of my sensors are not well decoded and I have to blacklist some decoders :(

When I see the signal of some of my devices, it's just an id basically. Everything could be decoded as an id.

@zuckschwerdt
Collaborator

> some of my sensors not well decoded and I have to blacklist some decoders :(

We already filter out the worst offenders (prone to false positives). That's why we really don't recommend using -G. If you see systematic false positives, open an issue with samples.

> When I see the signal of some of my devices, it's just an id basically. Everything could be decoded as an id.

I don't follow. Do you mean beacons, motion, doorbell-style senders?

@gabrielklein

Yes, I have a doorbell and a few motion sensors :)

  1. Lots of problems with [86] Wireless Smoke and Heat Detector GS 558; some of these just have an id.
  2. I can get an id with urh (that is similar to the id detected by rtl_433 in a -vvvv mode).

But I think the discussion is out of the scope of this ticket now. Sorry for that!

@merbanan
Owner Author

The idea is to add dupcheck for specific decoders that have known repeats longer than the signal length, and then maybe also globally for all signals. One idea I had is that the dupcheck might only add a duplicate tag. Then it is non-destructive, and we help the consumer in its filtering process.

Anyway, dupcheck would be a nice option for people who would like to use it.

@unnilennium

There is already an option to convert values to SI; this is helpful if you have some sensors that output in °F and some in °C.

I would be more than happy to have this filter added as an option that you can enable if you want, the same way you can for converting output to SI.

I have a few sensors with this issue:

acurite_txr_decode: row 0 bits 57, bytes 8 
acurite_txr_decode: Parity: 1000000
{"time" : "2019-09-03 22:29:20", "model" : "Acurite tower sensor", "id" : 11303, "sensor_id" : 11303, "channel" : "C", "temperature_C" : 17.000, "humidity" : 55, "battery_low" : 0}
acurite_txr_decode: row 1 bits 57, bytes 8 
acurite_txr_decode: Parity: 1000000
{"time" : "2019-09-03 22:29:20", "model" : "Acurite tower sensor", "id" : 11303, "sensor_id" : 11303, "channel" : "C", "temperature_C" : 17.000, "humidity" : 55, "battery_low" : 0}
acurite_txr_decode: row 2 bits 57, bytes 8 
acurite_txr_decode: Parity: 1000000
{"time" : "2019-09-03 22:29:20", "model" : "Acurite tower sensor", "id" : 11303, "sensor_id" : 11303, "channel" : "C", "temperature_C" : 17.000, "humidity" : 55, "battery_low" : 0}
acurite_986_decode: row 0 bits 2, bytes 5 
acurite_986_decode: row 1 bits 0, bytes 5 
acurite_986_decode: row 2 bits 0, bytes 5 
acurite_986_decode: row 3 bits 0, bytes 5 
acurite_986_decode: row 4 bits 42, bytes 5 
acurite_986_decode: reversed: {40} 06 60 f7 00 15 : 00000110 01100000 11110111 00000000 00010101 
acurite_986_decode: sensor 0x60f7 - 1R: 6 F
{"time" : "2019-09-03 22:30:38", "model" : "Acurite 986 Sensor", "id" : 24823, "channel" : "1R", "temperature_C" : -14.444, "battery" : "OK", "status" : 0}
acurite_986_decode: row 5 bits 0, bytes 5 
acurite_986_decode: row 6 bits 0, bytes 5 
acurite_986_decode: row 7 bits 0, bytes 5 
acurite_986_decode: row 8 bits 39, bytes 5 
acurite_986_decode: reversed: {40} 06 60 f7 00 15 : 00000110 01100000 11110111 00000000 00010101 
acurite_986_decode: sensor 0x60f7 - 1R: 6 F
{"time" : "2019-09-03 22:30:38", "model" : "Acurite 986 Sensor", "id" : 24823, "channel" : "1R", "temperature_C" : -14.444, "battery" : "OK", "status" : 0}

@bendur

bendur commented Nov 30, 2023

Just chiming in that I'd love to have a flag that filters out duplicates. I guess I could do it on the client side but I'd rather not have to do that.

@Mart124

Mart124 commented Jan 15, 2024

> Just chiming in that I'd love to have a flag that filters out duplicates. I guess I could do it on the client side but I'd rather not have to do that.

+1 😉 👍

Labels: wip (Work In Progress, for PRs only)
6 participants