Semantic profiler and report generation module integration #824

Open
wants to merge 6 commits into
base: dev

pankajskku
Member

Why are these changes needed?

I added the modules for generating the report based on the syntactic and semantic features present in the code. It is an extension of the existing code profiler.

Related issue number (if any).

Signed-off-by: aishwariyachakraborty <aishwariya.chakraborty@gmail.com>
	Added UAST_Package_List column

Signed-off-by: Pankaj Thorat <thorat.pankaj9@gmail.com>
Signed-off-by: aishwariyachakraborty <aishwariya.chakraborty@gmail.com>
Signed-off-by: Pankaj Thorat <thorat.pankaj9@gmail.com>
@touma-I
Collaborator

touma-I commented Nov 26, 2024

@pankajskku I noticed that the notebook was not updated. Do any of the new changes affect how end users use this transform?

@pankajskku
Member Author

Thank you for pointing that out.
I have updated the notebook in the latest amendments to the PR.

Collaborator

@touma-I touma-I left a comment

One of the files is missing the copyright block.

@pankajskku
Member Author

The copyright block has been added in the updated PR.

Collaborator

@touma-I touma-I left a comment

When trying to run the test example on macOS, I discovered a dependency on c_sharp-bindings.so. Do we want to support macOS users running the code? If so, can you please point me to the instructions for running the code properly on macOS? Thanks.

Collaborator

@pankajskku I ran make run-local-python-sample on macOS and got the error below. I could not find anything in README.md to guide me. Could you update README.md with the additional configuration needed to run the sample code?

Bindings bindings_dir: /Users/touma/data-prep-kit-code-profiler/transforms/code/code_profiler/python/src/tree-sitter-bindings-a2ed8cfe-8fa8-49fd-9ffa-f78a8b10c08c
Bindings path: /Users/touma/data-prep-kit-code-profiler/transforms/code/code_profiler/python/src/tree-sitter-bindings-a2ed8cfe-8fa8-49fd-9ffa-f78a8b10c08c/x86_64
14:41:42 ERROR - Exception creating transform  dlopen(/Users/touma/data-prep-kit-code-profiler/transforms/code/code_profiler/python/src/tree-sitter-bindings-a2ed8cfe-8fa8-49fd-9ffa-f78a8b10c08c/x86_64/c_sharp-bindings.so, 0x0006): tried: '/Users/touma/data-prep-kit-code-profiler/transforms/code/code_profiler/python/src/tree-sitter-bindings-a2ed8cfe-8fa8-49fd-9ffa-f78a8b10c08c/x86_64/c_sharp-bindings.so' (not a mach-o file), '/System/Volumes/Preboot/Cryptexes/OS/Users/touma/data-prep-kit-code-profiler/transforms/code/code_profiler/python/src/tree-sitter-bindings-a2ed8cfe-8fa8-49fd-9ffa-f78a8b10c08c/x86_64/c_sharp-bindings.so' (no such file), '/Users/touma/data-prep-kit-code-profiler/transforms/code/code_profiler/python/src/tree-sitter-bindings-a2ed8cfe-8fa8-49fd-9ffa-f78a8b10c08c/x86_64/c_sharp-bindings.so' (not a mach-o file)
Traceback (most recent call last):
  File "/Users/touma/data-prep-kit-code-profiler/data-processing-lib/python/src/data_processing/runtime/pure_python/transform_file_processor.py", line 51, in __init__
    self.transform = transform_class(self.transform_params)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/touma/data-prep-kit-code-profiler/transforms/code/code_profiler/python/src/code_profiler_transform.py", line 92, in __init__
    CSHARP_LANGUAGE = Language(os.path.join(bindings_path, 'c_sharp-bindings.so'), 'c_sharp')
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/touma/data-prep-kit-code-profiler/transforms/code/code_profiler/python/venv/lib/python3.11/site-packages/tree_sitter/__init__.py", line 132, in __init__
    self.lib = cdll.LoadLibrary(fspath(path_or_ptr))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.10/Frameworks/Python.framework/Versions/3.11/lib/python3.11/ctypes/__init__.py", line 454, in LoadLibrary
    return self._dlltype(name)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.10/Frameworks/Python.framework/Versions/3.11/lib/python3.11/ctypes/__init__.py", line 376, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: dlopen(/Users/touma/data-prep-kit-code-profiler/transforms/code/code_profiler/python/src/tree-sitter-bindings-a2ed8cfe-8fa8-49fd-9ffa-f78a8b10c08c/x86_64/c_sharp-bindings.so, 0x0006): tried: '/Users/touma/data-prep-kit-code-profiler/transforms/code/code_profiler/python/src/tree-sitter-bindings-a2ed8cfe-8fa8-49fd-9ffa-f78a8b10c08c/x86_64/c_sharp-bindings.so' (not a mach-o file), '/System/Volumes/Preboot/Cryptexes/OS/Users/touma/data-prep-kit-code-profiler/transforms/code/code_profiler/python/src/tree-sitter-bindings-a2ed8cfe-8fa8-49fd-9ffa-f78a8b10c08c/x86_64/c_sharp-bindings.so' (no such file), '/Users/touma/data-prep-kit-code-profiler/transforms/code/code_profiler/python/src/tree-sitter-bindings-a2ed8cfe-8fa8-49fd-9ffa-f78a8b10c08c/x86_64/c_sharp-bindings.so' (not a mach-o file)
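
The "not a mach-o file" message says that the loader found c_sharp-bindings.so but that the file is not in the Mach-O format macOS expects. One plausible explanation, not confirmed in this thread, is that the file is an ELF build intended for Linux. A minimal sketch for checking this before calling Language(...), assuming only the well-known file magic numbers; the helper name binary_format is hypothetical and not part of the code profiler:

```python
# Hypothetical helper: classify a shared library by its first four bytes.
ELF_MAGIC = b"\x7fELF"
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce",  # Mach-O 32-bit, big-endian
    b"\xce\xfa\xed\xfe",  # Mach-O 32-bit, little-endian
    b"\xfe\xed\xfa\xcf",  # Mach-O 64-bit, big-endian
    b"\xcf\xfa\xed\xfe",  # Mach-O 64-bit, little-endian
    b"\xca\xfe\xba\xbe",  # universal (fat) binary; Java class files share this magic
}


def binary_format(path):
    """Return 'elf', 'mach-o', or 'unknown' based on the file's magic number."""
    with open(path, "rb") as fh:
        magic = fh.read(4)
    if magic == ELF_MAGIC:
        return "elf"
    if magic in MACHO_MAGICS:
        return "mach-o"
    return "unknown"
```

Calling binary_format on the c_sharp-bindings.so path from the traceback would show whether the file is an ELF object, which macOS cannot dlopen, or something else entirely.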

Member Author

@pankajskku pankajskku Nov 27, 2024

Hello Maroun,

I have updated the README to guide users on how to run the transform on their host.
The code profiler can run on mach-arm64 and x86_64 host architectures. Please set RUNTIME_HOST_ARCH in the Makefile according to your host architecture.

# possible values: mach-arm64, x86_64
export RUNTIME_HOST_ARCH=x86_64
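
Choosing the value by hand is easy to get wrong, so the sketch below derives a suggestion from Python's platform module. The mapping is an assumption based only on the two values named above (Apple Silicon macOS to mach-arm64, any x86_64 host to x86_64); the function name runtime_host_arch is hypothetical and nothing like it is confirmed to exist in the PR.

```python
import platform


def runtime_host_arch(system=None, machine=None):
    """Suggest a RUNTIME_HOST_ARCH value for the current (or given) host.

    Assumed mapping, not taken from the Makefile itself:
      macOS on arm64 -> 'mach-arm64'
      x86_64 / amd64 -> 'x86_64'
    """
    system = system or platform.system()
    machine = (machine or platform.machine()).lower()
    if system == "Darwin" and machine in ("arm64", "aarch64"):
        return "mach-arm64"
    if machine in ("x86_64", "amd64"):
        return "x86_64"
    raise ValueError(f"no known bindings for {system}/{machine}")


if __name__ == "__main__":
    # Print a line that could be pasted into the shell before running make.
    print(f"export RUNTIME_HOST_ARCH={runtime_host_arch()}")
```

Raising on an unrecognised host, rather than silently defaulting to x86_64, surfaces the mismatch before any binding is loaded.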

As these are .so bindings, you may need to allow your Mac to load them in the system security settings. The prompt usually appears under the Security tab, as in the attached image. If it does not appear, I recommend using the x86_64 architecture.

[image]

Collaborator

@pankajskku Is there a reason that every run creates a new src/tree-sitter-bindings-* folder, each 162 MB in size? Why does the folder name include a UUID? In addition to src/tree-sitter-bindings-*, I also have copies of the same files under python3.11/site-packages/tree-sitter-bindings-*. A call to discuss how these files are used and delivered may be needed. Please give some thought to how we can simplify things, perhaps even delivering the files as part of the pip install of the transform. I will schedule a call for Monday morning if that is OK with you. Thanks.

Member Author

@pankajskku pankajskku Nov 28, 2024

Hi Maroun,

The tree-sitter bindings convert source code to an abstract syntax tree. Each language has its own bindings. The bindings are cloned from a public repo (https://github.com/pankajskku/tree-sitter-bindings/tree/main) and deleted after the program exits. However, the failure case wasn't handled properly, so the cloned folder wasn't deleted. I have added a check in the updated PR to handle the exception and clean up the .so bindings. I couldn't find python3.11/site-packages/tree-sitter-bindings in my venv. We can also discuss this on Monday. Thanks.
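
The cleanup-on-failure behaviour described above can be sketched with a context manager. This is an illustration under assumptions, not the code in the PR: temporary_bindings and its fetch parameter are hypothetical names, and fetch stands in for whatever step clones the bindings repository into the target directory.

```python
import shutil
import uuid
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temporary_bindings(fetch, parent="."):
    """Yield a freshly populated bindings directory and always remove it.

    `fetch` is a hypothetical callable that fills the given directory,
    for example by running a git clone into it.
    """
    bindings_dir = Path(parent) / f"tree-sitter-bindings-{uuid.uuid4()}"
    try:
        fetch(bindings_dir)
        yield bindings_dir
    finally:
        # Runs on normal exit and when fetch or the caller's code raises,
        # so a failed run does not leave the folder behind.
        shutil.rmtree(bindings_dir, ignore_errors=True)
```

A caller would wrap the transform setup in "with temporary_bindings(clone_fn) as bindings_dir:", where clone_fn is the caller's own fetch step. If loading a binding raises, as in the dlopen traceback earlier in this thread, the directory is still deleted before the exception propagates.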

Added the modules for generating the report based on the syntactic and semantic feature present in the code

Signed-off-by: Pankaj Thorat <thorat.pankaj9@gmail.com>