Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Service Time measurement Trail_Approach_1 #679

Closed
wants to merge 14 commits into from

Conversation

saimedhi
Copy link
Collaborator

@saimedhi saimedhi commented Feb 21, 2024

Description

  • Service time is calculated within the perform_request function of the Connection class ( This draft uses RequestsHttpConnection) at the point where the server call is made.

  • The calculated service time is added to the response or data in the connection.

  • I understand that this method is not ideal for returning service time and am exploring alternative options for improvement.

Issues Resolved

Related to #678

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@saimedhi
Copy link
Collaborator Author

saimedhi commented Feb 21, 2024

Sample usage

host = 'localhost'
port = 9200
auth = ('admin', 'admin') # For testing only. Don't store credentials in code.

client = OpenSearch(
    hosts = [{'host': host, 'port': port}],
    http_auth = auth,
    use_ssl = True,
    verify_certs = False,
    connection_class = RequestsHttpConnection,
    calculate_service_time = True,
)

client.info()

{'name': 'c3b344c0372b',
'cluster_name': 'docker-cluster',
'cluster_uuid': 'f6p2z4YRQvGW8nknOPgPeg',
'version': {'distribution': 'opensearch',
'number': '2.11.1',
'build_type': 'tar',
'build_hash': '6b1986e964d440be9137eba1413015c31c5a7752',
'build_date': '2023-11-29T21:45:35.524809067Z',
'build_snapshot': False,
'lucene_version': '9.7.0',
'minimum_wire_compatibility_version': '7.10.0',
'minimum_index_compatibility_version': '7.0.0'},
'tagline': 'The OpenSearch Project: https://opensearch.org/',
'__client': {'Service_Time': '0.43622279167175293'}}

@saimedhi saimedhi marked this pull request as draft February 22, 2024 00:05
Copy link

codecov bot commented Feb 22, 2024

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (4b69c09) 72.14% compared to head (921cd55) 72.13%.

Files Patch % Lines
opensearchpy/connection/http_requests.py 62.50% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #679      +/-   ##
==========================================
- Coverage   72.14%   72.13%   -0.02%     
==========================================
  Files          89       89              
  Lines        7945     7952       +7     
==========================================
+ Hits         5732     5736       +4     
- Misses       2213     2216       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


# Add the service time to the raw_data
if calculate_service_time:
raw_data = raw_data.rstrip()[:-1] + f', "__client": {{"Service_Time": "{duration}"}}' + '}'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a quick look, I think rather than attempting to modify the JSON string like this, it'd make more sense to just return the value from this function. ie. changing return response.status_code, response.headers, raw_data to return response.status_code, response.headers, raw_data, duration or creating some 'RequestMetricsclass that contains the duration (and potential additional future metrics) which is returned. Then in the transport class inserting that into the returned/deserialized object asmetrics` or something like that.

Adding it to the return signature of the connection class gives a clear and uniform way of returning this information across all connection implementations.

Some one with more Python prowess will hopefully be able to give a better idea of what's actually idiomatic.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for this.


# Add the service time to the raw_data
if calculate_service_time:
raw_data = raw_data.rstrip()[:-1] + f', "__client": {{"Service_Time": "{duration}"}}' + '}'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for this.

@@ -187,13 +191,15 @@ def perform_request( # type: ignore
"allow_redirects": allow_redirects,
}
send_kwargs.update(settings)
start = time.time()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the start at line 179 is fine. Any reason otherwise?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @VachaShah, I am adding start timer just before sending request. I think we don't need timer to start before prepared_request right?

try:
response = self.session.send(prepared_request, **send_kwargs)
duration = time.time() - start
raw_data = response.content.decode("utf-8", "surrogatepass")
except reraise_exceptions:
raise
except Exception as e:
duration = time.time() - start
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see this variable being used anywhere since ConnectionError is raised after.

@dblock
Copy link
Member

dblock commented Feb 24, 2024

This is a good start and shows what we're trying to achieve.

What do you think about an idea where we first expose "events", then add "metrics"? Googling I found https://stackoverflow.com/questions/443885/python-callbacks-delegates-what-is-common and https://pypi.org/project/Events/ that could be a start. The client could expose events such as "request started", "finished", "errored", etc., and then another class called Metrics or Measurements could collect time information using this event system.

@saimedhi saimedhi closed this Mar 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants