Gsoc phase2 #51
Conversation
It already looks to be in very good shape!
Hi @akshitpatel01, nice job!
A general comment: now that we have some basic functionality that uses TinyDB, I think we should test where the limit is. Can you generate a very large log file (in the order of 100s of MBs, up to 1 GB) and measure how long it takes to read, parse, filter, and so on? I'd like to know whether we should jump ship and move to something else while there's still time.
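A minimal timing harness for this could look like the sketch below. The `timed` helper and the sample line are my own (the line only loosely mimics ns-3's log prefix format), and the parse/filter calls are left as comments since their exact signatures live in this PR:

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label):
    # Print wall-clock time for the enclosed block.
    start = time.perf_counter()
    yield
    print(f"{label}: {time.perf_counter() - start:.2f}s")

# Generate a large file by repeating a made-up ns-3-style line
# (~70 bytes per line, so 12.5M lines is on the order of a GB).
sample = "+0.000000000s 0 FrameExchangeManager:FrameExchangeManager(0x5612a3b0)\n"
with open("big.log", "w") as f:
    for _ in range(12_500_000):
        f.write(sample)

with timed("read"):
    lines = open("big.log").readlines()
# with timed("parse"):
#     entries = parse_logs(lines)   # the PR's parser; arguments assumed
# with timed("filter"):
#     filter_logs(entries, ...)     # likewise assumed
```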
Next step: tests :) While playing around with the code, I tried adding the following components to logging_example2.py: 'WifiPhy': 'level_all', 'FrameExchangeManager': 'level_all'. A couple of things seemed to go wrong - I think that's a good starting point for understanding how to improve the code!
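For anyone reproducing this in plain ns-3, those two entries would correspond to an NS_LOG string like the one below. How the example's dict actually maps onto NS_LOG is my assumption about the PR code, not something confirmed in this thread:

```python
# The components dict from the example above:
components = {
    'WifiPhy': 'level_all',
    'FrameExchangeManager': 'level_all',
}

# ns-3 separates components with ':' and options with '|':
ns_log = ':'.join(f'{c}={lvl}|prefix_all' for c, lvl in components.items())
print(ns_log)
# WifiPhy=level_all|prefix_all:FrameExchangeManager=level_all|prefix_all
```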
I found this while working on tests. Consider these two logs: if we greedily match the arguments (i.e. like this: ), the parse comes out wrong. Note: the second log was made manually for testing purposes and the first log is an actual log from FrameExchangeManager. Can you think of any workarounds?
I get the following output when applying
I'm not sure I get what is incorrect in this parsing - can you explain in further detail? Is it the arguments in the first entry?
Just a couple of minor comments.
sem/utils.py (outdated)
'Time': timestamp,                      # float
'Context': context/nodeId,              # str
'Extended_Context': ,                   # str
'Component': log component,             # str
'Function': function name,              # str
'Arguments': function arguments,        # str
'Level': log level,                     # str
'Severity_class': log severity class,   # str
'Message': log message                  # str
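To make the schema concrete, here is a made-up entry in that shape (all field values are illustrative only, not taken from a real trace):

```python
entry = {
    'Time': 0.0,                          # float, simulation time in seconds
    'Context': '0',                       # str, node id
    'Extended_Context': '',               # str, empty when the log has none
    'Component': 'FrameExchangeManager',  # str
    'Function': 'FrameExchangeManager',   # str
    'Arguments': '0x5612a3b0',            # str
    'Level': 'DEBUG',                     # str
    'Severity_class': 'DEBUG',            # str
    'Message': '',                        # str, empty for function-entry logs
}
```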
I'm not sure if this is being too picky, but dictionary keys are usually lower case (lower snake case, to be precise). Any opinion, @DvdMgr?
Yes. The arguments should be: But I think I solved it by updating the regex.
That's great (I was also missing the last parenthesis - thank you Davide for asking).
The '^' and '$' ensure that the regex matches the string from beginning to end. In other words, if after matching a log with this regex there are still some trailing characters present, previously the regex would match (some part of the log would match), but now it will not.
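A toy illustration of the difference (the real log regex is much larger; this pattern is only for demonstration):

```python
import re

loose = re.compile(r"(\w+):(\w+)\(\)")       # old behaviour: a partial match is enough
anchored = re.compile(r"^(\w+):(\w+)\(\)$")  # new behaviour: the whole line must match

line = "WifiPhy:Send() some trailing garbage"
print(loose.search(line))     # <re.Match ...> - the prefix still matches
print(anchored.search(line))  # None - trailing characters reject the line
```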
The following are the statistics for reading a file, parsing it, and executing two filters: 101 MB: 7.71 s. I think these numbers will be highly system-specific; in particular, on my system the main bottleneck for bigger files was RAM (I have 16 GB).
Just a few minor comments.
sem/utils.py (outdated)
if isinstance(severity_class, str):
    severity_class = [severity_class]

if severity_class is not None or components is not None:
components is checked but never used - is there a reason for this?
See 65afd95.
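For readers without the commit at hand, the fix presumably makes both filters take effect, along these lines (a sketch under my own assumptions about filter_logs, not the actual code in 65afd95):

```python
def filter_logs(logs, severity_class=None, components=None):
    # Accept single strings as well as lists for both filters.
    if isinstance(severity_class, str):
        severity_class = [severity_class]
    if isinstance(components, str):
        components = [components]
    for log in logs:
        if severity_class is not None and log['Severity_class'] not in severity_class:
            continue
        if components is not None and log['Component'] not in components:
            continue
        yield log
```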
The execution time is almost halved if I change this line in insert_logs: Does this performance improvement justify removing the deepcopy()? Also, all these functions will be called internally by the backend, so the user will never call them directly.
I have also added profiling plots here.
deepcopy() is quite aggressive and expensive; it makes sense that the runtime increases when you use it.
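A quick toy measurement of that overhead (the numbers will vary by machine; the dict just mimics one parsed entry):

```python
import copy
import timeit

entry = {'Time': 0.0, 'Component': 'WifiPhy', 'Message': 'x' * 64}
logs = [dict(entry) for _ in range(100_000)]

with_deepcopy = timeit.timeit(lambda: [copy.deepcopy(e) for e in logs], number=1)
without_copy = timeit.timeit(lambda: list(logs), number=1)
print(f"deepcopy: {with_deepcopy:.2f}s, no copy: {without_copy:.2f}s")
```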
Akshit, good job! I'm good to merge once these final comments are addressed.
Almost there! I found a couple of minor issues to fix before merging. Another thing that was mentioned during the call was the plan to move these functions into a logging.py file - do you think you can apply that change in this pull request too?
I created a dedicated logging file in the last commit. Do we want to create a new test_logging.py, or keep the tests in test_utils?
Best to move all those tests to test_logging.py.
Squashed and merged in 6ba05d1: thanks for the effort :)
Hi,
No problem, I'll make the change myself and force-push it to the
Done!
Hi @DvdMgr, @mattia-lecci,
I have added a new example, logging-example2.py, and the phase 2 functions process_logs, insert_logs, parse_logs, and filter_logs. It would be great if you could review this code.
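For context, a hypothetical end-to-end flow of the four functions, kept as comments because the import path, argument names, and the db handle are all guesses based on this conversation, not the actual signatures:

```python
# from sem.logging import parse_logs, insert_logs, filter_logs  # assumed path
#
# lines = open('example.log').readlines()
# entries = parse_logs(lines)                      # raw lines -> list of dicts
# insert_logs(entries, db)                         # persist entries in TinyDB
# wifi = filter_logs(entries, components='WifiPhy',
#                    severity_class='DEBUG')       # narrow down by field
```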