JacquesGariepy/library-analyzer

Overview

The script library_analyzer.py in the JacquesGariepy/library-analyzer repository is designed to analyze Python libraries and extract detailed information about their elements, such as classes, methods, functions, properties, and more. The analysis results can be saved to a JSON file for further inspection.

Capabilities of the Code:

  • Analyze Python Libraries: The script inspects a Python library and extracts detailed information about the elements it contains.
  • Element Types Identified: It identifies and categorizes elements such as classes, methods, functions, properties, modules, variables, enums, constants, dataclasses, coroutines, generators, descriptors, exceptions, and protocols.
  • Extract Type Information: The script can safely evaluate and extract type information for various elements.
  • Extract Signatures: It can extract function/method signatures and other relevant details such as docstrings, parameter types, and return types.
  • Class Analysis: The script provides detailed information about classes, including base classes, methods, properties, and type hints.
  • Dataclass and Enum Analysis: It can analyze dataclasses and enums, extracting field types and enum values.
  • Save Analysis Results: The analysis results can be saved to a JSON file for further inspection and documentation.
  • Error Handling: The script includes error handling to capture and log errors encountered during the analysis process.
  • Semantic Search: The script includes semantic search functionality using Whoosh and BERT to index and search extracted text data from the analysis results.
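
As an illustration of the categorization and signature extraction described above, here is a minimal sketch built on the standard inspect module. It is not the repository's implementation: the function names classify and describe, the category labels, and the returned dictionary layout are assumptions made for this example.

```python
import dataclasses
import enum
import inspect
from typing import Any, Dict


def classify(obj: Any) -> str:
    """Return a coarse category label for a Python object (labels are illustrative)."""
    if inspect.ismodule(obj):
        return "module"
    if inspect.isclass(obj):
        # Check the more specific class kinds before falling back to "class".
        if issubclass(obj, enum.Enum):
            return "enum"
        if issubclass(obj, BaseException):
            return "exception"
        if dataclasses.is_dataclass(obj):
            return "dataclass"
        return "class"
    if inspect.iscoroutinefunction(obj):
        return "coroutine"
    if inspect.isgeneratorfunction(obj):
        return "generator"
    if inspect.isroutine(obj):
        return "function"
    if isinstance(obj, property):
        return "property"
    return "variable"


def describe(name: str, obj: Any) -> Dict[str, Any]:
    """Collect the category, docstring, and signature of one element."""
    kind = classify(obj)
    info: Dict[str, Any] = {
        "name": name,
        "type": kind,
        # For plain values, getdoc would return the docstring of the value's type.
        "doc": inspect.getdoc(obj) if kind != "variable" else None,
    }
    try:
        info["signature"] = str(inspect.signature(obj))
    except (TypeError, ValueError):
        # Not callable, or a builtin without an introspectable signature.
        info["signature"] = None
    return info
```

Calling describe(name, member) for each pair returned by inspect.getmembers(some_module) yields one record per public element, which can then be serialized with json.dumps.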

This script is part of a larger ongoing project aimed at creating a comprehensive tool for analyzing and documenting Python libraries. The project aims to provide insights into the structure and content of libraries, aiding developers in understanding and utilizing various libraries efficiently.

Usage

To use the script, run it from the command line with the name of the library to analyze:

python -m library_analyzer <library_name> [search_query]
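
For readers wondering how the two arguments map onto a parser, the following is a minimal sketch using argparse. It only mirrors the command line shown above; the function name parse_args and the help strings are assumptions, and the actual script may parse its arguments differently.

```python
import argparse
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse `<library_name> [search_query]` (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="library_analyzer")
    # Required positional argument: the library to analyze.
    parser.add_argument("library_name", help="name of the library to analyze")
    # Optional positional argument: nargs="?" makes it default to None when omitted.
    parser.add_argument(
        "search_query",
        nargs="?",
        default=None,
        help="optional query to run against the indexed analysis results",
    )
    return parser.parse_args(argv)
```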

Using Docker

You can also use Docker to run the script. First, build the Docker image:

docker build -t library-analyzer .

Then, run the container with the necessary arguments:

docker run --rm -v $(pwd):/app library-analyzer <library_name> [search_query]

Output

The analysis results include metadata about the library (its name, version, file location, and documentation) and detailed information about each element in the library. The results can be saved to a JSON file for further inspection. The filename includes the library version, and a numeric suffix is added if a file with that name already exists. For example, analyzing the openai library produces files such as openai_analysis_v1.0.json or openai_analysis_v1.0_1.json.
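
The naming scheme can be sketched as follows. This is an illustration inferred from the example filenames only, not the script's own code; the function name next_output_path and its parameters are hypothetical.

```python
from pathlib import Path


def next_output_path(library: str, version: str, directory: Path = Path(".")) -> Path:
    """Return the first unused path of the form <library>_analysis_v<version>[_N].json."""
    base = f"{library}_analysis_v{version}"
    candidate = directory / f"{base}.json"
    counter = 1
    # Append _1, _2, ... until a filename that does not exist yet is found.
    while candidate.exists():
        candidate = directory / f"{base}_{counter}.json"
        counter += 1
    return candidate
```

With an empty directory, next_output_path("openai", "1.0") gives openai_analysis_v1.0.json; once that file exists, the next call gives openai_analysis_v1.0_1.json.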

[Image: JSON output]

Semantic Search

The script now includes semantic search functionality using Whoosh and BERT. This allows you to perform searches on the extracted text data from the analysis results, such as docstrings, function signatures, and class descriptions.

How to Use Semantic Search (BERT and Whoosh)

  1. Extract Text Data for Indexing: The script extracts relevant text data from the analysis results, including docstrings, function signatures, and class descriptions.
  2. Index the Extracted Data: The extracted text data is indexed using Whoosh and BERT.
  3. Perform Searches: You can perform searches on the indexed data using the search function in the LibraryAnalyzer class.
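
The three steps above can be sketched with the standard library alone. This is not the script's implementation: the field names in TEXT_KEYS and the shape of the analysis dictionary are assumptions, and the naive keyword count below is a plain stand-in for the Whoosh index and BERT ranking, used only to show how the data flows.

```python
from typing import Any, Dict, Iterator, List

# Hypothetical names of the fields that hold searchable text.
TEXT_KEYS = ("docstring", "signature", "description")


def iter_text(node: Any, path: str = "") -> Iterator[Dict[str, str]]:
    """Step 1: walk a nested analysis result and yield {"path", "text"} records."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            if key in TEXT_KEYS and isinstance(value, str) and value.strip():
                yield {"path": child, "text": value.strip()}
            else:
                yield from iter_text(value, child)
    elif isinstance(node, list):
        for i, item in enumerate(node):
            yield from iter_text(item, f"{path}[{i}]")


def search(records: List[Dict[str, str]], query: str, top_k: int = 5) -> List[Dict[str, str]]:
    """Steps 2 and 3, simplified: score records by keyword frequency, return the best top_k."""
    terms = query.lower().split()
    scored = []
    for record in records:
        text = record["text"].lower()
        score = sum(text.count(term) for term in terms)
        if score:
            scored.append((score, record))
    # Sort on the score only; the sort is stable, so ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored[:top_k]]
```

Each record carries the same path and text keys that the search results expose, so the flattening step determines what a result's path looks like.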

Example

To perform a search, you can use the following code snippet:

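# Assumes LibraryAnalyzer is importable from the library_analyzer module.
from library_analyzer import LibraryAnalyzer

analyzer = LibraryAnalyzer()
# Analyze the library, then extract and index its text data.
analysis = analyzer.analyze_library("your_library_name")
text_data = analyzer.extract_text_data(analysis)
analyzer.index_data(text_data)
# Query the index and print each matching path and text snippet.
search_results = analyzer.search("your_search_query")
for result in search_results:
    print(f"Path: {result['path']}, Text: {result['text']}")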

This will output the search results, showing the paths and text snippets that match the search query.

Configuration

The script reads its semantic search settings from a configuration file, config.yaml. The options enable or disable BERT and Whoosh and set top_k, the number of search results to return.

Example Configuration

Here is an example config.yaml file:

use_bert: true
use_whoosh: true
top_k: 5
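
As a sketch of how these three options might be validated once the file has been parsed into a dictionary (for instance with a YAML parser), consider the helper below. The function name resolve_config is hypothetical, and the defaults simply mirror the example values above; the script's real defaults and validation are not documented here.

```python
from typing import Any, Dict, Mapping, Optional

# Assumed defaults, copied from the example config.yaml above.
DEFAULTS: Dict[str, Any] = {"use_bert": True, "use_whoosh": True, "top_k": 5}


def resolve_config(raw: Optional[Mapping[str, Any]]) -> Dict[str, Any]:
    """Merge parsed settings over the defaults and check their types."""
    config = dict(DEFAULTS)
    for key, value in (raw or {}).items():
        if key not in DEFAULTS:
            raise KeyError(f"unknown option: {key}")
        config[key] = value
    for flag in ("use_bert", "use_whoosh"):
        if not isinstance(config[flag], bool):
            raise TypeError(f"{flag} must be true or false")
    top_k = config["top_k"]
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(top_k, bool) or not isinstance(top_k, int) or top_k < 1:
        raise ValueError("top_k must be a positive integer")
    return config
```

Rejecting unknown keys catches misspelled option names early instead of silently ignoring them.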

Conclusion

The library_analyzer.py script analyzes Python libraries and extracts detailed information about their elements. It gives insight into the structure and contents of a library, making the library easier to understand and work with.
