Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix mypy warnings #221

Closed
ckunki opened this issue Nov 20, 2024 · 12 comments · Fixed by #225
Closed

Fix mypy warnings #221

ckunki opened this issue Nov 20, 2024 · 12 comments · Fixed by #225
Assignees
Labels
refactoring Code improvement without behavior change

Comments

@ckunki
Copy link
Contributor

ckunki commented Nov 20, 2024

@ckunki ckunki added the refactoring Code improvement without behavior change label Nov 20, 2024
@ckunki
Copy link
Contributor Author

ckunki commented Nov 20, 2024

I found some classes initializing some of their attributes after __init__()
Other methods access the attributes that might be uninitialized.

Class BackgroundListenerThread

  • self._register_peer_connection, RegisterPeerConnection()
  • self._out_control_socket
  • self._my_connection_info

Class QueryHandlerRunnerUDF

  • QueryHandlerRunnerState.input_query_query_handler_context might be None while self._wrap_return_query() expects the argument not to be None
  • UDFParameter.temporary_schema_name is Optional while TopLevelQueryHandlerContext expects the argument not to be None

SQLStageGraphExecutionQueryHandlerState

  • self._current_query_handler_context
  • self._sql_stage_graph

Class Communicator has some attributes that potentially could be None:

  • self._value
  • self._multi_node_communicator, _create_multi_node_communicator() -> Optional[]

SQLStageGraphExecutionInput still imports and uses as attribute type-hint AbstractBucketFSLocation

  • How can this be replaced by bfs.path.PathLike?

call_udf.py contains udf = QueryHandlerRunnerUDF(exa).

  • Can we add # type: ignore here?

Class TopLevelQueryHandlerContext, attribute temporary_schema_name is set from parameters to UDF.
I propose to change the type of this attribute from str to Optional[str]

@ckunki ckunki self-assigned this Nov 20, 2024
@tkilias
Copy link
Collaborator

tkilias commented Nov 20, 2024

Regarding BackgroundListenerThread

They can only be initialized with a proper value in the run method, because the object gets transferred from one process to another process after the construction and only in the process then run is called.

Furthermore, I think there are more attributes for which the same is true , e g. self._in_control_socket, have a look at the create* methods that get called in run.

After the calls to the create methods in run they shouldn't be None anymore, before they could be None, I think. However, you might need to assert that they are not anymore before they get used otherwise, mypy might complain that they could be None. @tomuben had to do this in the itde.

@tkilias
Copy link
Collaborator

tkilias commented Nov 20, 2024

I think UDFParameter.temporary_schema_name can be None, because we don't validate the json in Lua regarding this and we probably don't need to. However, should be maybe checked , what we get if the json has no schema specified. However, the more I think, about it , also if it could be None , we need a value for it and that means either a default or an error. And, I think, tend to error here, that also should solve the complaint

@tkilias
Copy link
Collaborator

tkilias commented Nov 20, 2024

Regarding QueryHandlerRunnerState.input_query_query_handler_context check with exception in self._wrap_return_query() that it is not None, that solve the complaint

@tkilias
Copy link
Collaborator

tkilias commented Nov 21, 2024

@tkilias
Copy link
Collaborator

tkilias commented Nov 21, 2024

Regarding UDFParameter

temporary_schema_name, python_class_name, python_class_module are not optional, we need to check if the values we get from the ctx are None and if yes throw an error message.

@tkilias
Copy link
Collaborator

tkilias commented Nov 21, 2024

SQLStageGraphExecutionQueryHandlerState

  • self._current_query_handler_context -> assert

@tkilias
Copy link
Collaborator

tkilias commented Nov 21, 2024

Same for

@ckunki
Copy link
Contributor Author

ckunki commented Nov 21, 2024

Only two type check findings remaining:

Class BackgroundListenerInterface

exasol/analytics/udf/communication/peer_communicator/background_listener_interface.py: note: 
In member "my_connection_info" of class "BackgroundListenerInterface":

exasol/analytics/udf/communication/peer_communicator/background_listener_interface.py:107:16: error:
Incompatible return value type (got "ConnectionInfo | None", expected
"ConnectionInfo")  [return-value]
            return self._my_connection_info
                   ^~~~~~~~~~~~~~~~~~~~~~~~

The return value of this method potentially is used in many places.
Implications are unclear to me.

  • Simply setting the return type to Optional only propagates the typing issue.
  • Raising an exception seems inappropriate since method Class _set_my_connection_info() logs an error but enables to continue with self._my_connection_info still being set to None.

A potential minimal-invasive patch could be to return a constant INVALID_CONNECTION_INFO.

exasol/analytics/query_handler/udf/runner/udf.py: note: 
In member "_create_state" of class "QueryHandlerRunnerUDF":

exasol/analytics/query_handler/udf/runner/udf.py:271:13: error: Argument 3 to
"TopLevelQueryHandlerContext" has incompatible type "str | None"; expected "str"
 [arg-type]
                self.parameter.temporary_schema_name,
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I found two alternative initializations of self._parameter
If we stick to making temporary_schema mandatory, then these two alternatives don't make sense anymory in my eyes.

@ckunki
Copy link
Contributor Author

ckunki commented Nov 21, 2024

The current unit tests for the classes in exasol.analytics.schema verify TypeCheckErrors raised by typeguard.
This conflicts with the custom errors I introduced to be raised when asking an incomplete builde to build a type to meet the type hints.

I see the following options

  • O1) modify tests to verify the custom errors
  • O2) modify the builders to raise TypeCheckError

@ckunki
Copy link
Contributor Author

ckunki commented Nov 25, 2024

I fixed most of the findings in initial review batch
More investigation was requested in

@ckunki
Copy link
Contributor Author

ckunki commented Nov 29, 2024

Waiting until reviewer @tkilias is available.

ckunki added a commit that referenced this issue Dec 3, 2024
* Fixed mypy warnings
* Formatted code
* Made register_peer_connection optional in RegisterPeerForwarder, RegisterPeerForwarderBuilderParameter, RegisterPeerForwarderFactory
* replaced comment to ignore import-untyped by entry in pyproject.toml
* Prefixed properties with "_checked_"
* Updated pandas to version 2.2.3

Co-authored-by: Thomas Ubensee <34603111+tomuben@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactoring Code improvement without behavior change
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants