Allow retrieval of binary data from WSClient output stream #1471
It's a reasonable request. We'd like to review it if you could send a PR.
/assign @FabianNiehaus
cc @yliaog
After some trial and error, I realized that I will try to add an instance variable.

Turns out that the incoming frame data gets decoded while the frame is processed. Code:

```python
elif op_code == ABNF.OPCODE_BINARY or op_code == ABNF.OPCODE_TEXT:
    data = frame.data
    if len(data) > 1:
        channel = data[0]
        if six.PY3 and channel not in self.raw_channels:
            data = data.decode("utf-8", "replace")
        data = data[1:]
        if data:
            if channel in [STDOUT_CHANNEL, STDERR_CHANNEL]:
                # keeping all messages in the order they were received
                # for non-blocking calls
                self._all.write(data)  # !!! Data passed to StringIO object !!!
            if channel not in self._channels:
                self._channels[channel] = data
            else:
                self._channels[channel] += data
```

Error:
I will try to avoid this by converting the byte sequence into a string without decoding, so it can be cast back later.
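The idea of "converting without decoding" can be sketched with Python's built-in hex round-trip. This is a standalone illustration, not the client's actual code:

```python
# Illustrative only: binary data can be stored in a text-only buffer
# as a hex string and recovered later without loss.
raw = b"\x00\x01\xfe\xff"              # arbitrary binary payload
as_text = "0x" + raw.hex()             # text-safe representation: "0x0001feff"
restored = bytes.fromhex(as_text[2:])  # strip the "0x" prefix to reverse it
assert restored == raw                 # lossless round-trip
```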
+1 for the issue
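As background for the constraint discussed above: in Python 3, a `StringIO` buffer rejects raw bytes outright, which is why the client's capture buffer forces a decode. A minimal standalone illustration:

```python
from io import BytesIO, StringIO

text_buf = StringIO()
try:
    text_buf.write(b"\xff\x00")      # raw bytes are rejected by a text buffer
    rejected = False
except TypeError:
    rejected = True
assert rejected

byte_buf = BytesIO()
byte_buf.write(b"\xff\x00")          # a bytes buffer stores them unchanged
assert byte_buf.getvalue() == b"\xff\x00"
```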
Alright, I finally got around to working on this again. As stated before, the use of StringIO in the client bars me from storing the raw bytes. There were two issues to solve: which data type to use for conversion, and under which conditions to convert.

Data type selection:

Conditions for conversion:

For now, I have a working solution. I am, however, not quite satisfied with the result, as it is not really intuitive to use and requires some knowledge about the inner workings of the API client.

Code for WSClient:

```python
def __init__(self, configuration, url, headers, capture_all):
    """A websocket client with support for channels.

    Exec command uses different channels for different streams. For
    example, 0 is stdin, 1 is stdout and 2 is stderr. Some other API
    calls like port forwarding can forward different pods' streams to
    different channels.
    """
    self._connected = False
    self._channels = {}
    if capture_all:
        self._all = StringIO()
    else:
        self._all = _IgnoredIO()
    self.sock = create_websocket(configuration, url, headers)
    self._connected = True
    # channels to be dumped to hex rather than utf-8 decoded,
    # to be set during runtime / after client creation
    self.hexdump_channels = []

def update(self, timeout=0):
    """Update channel buffers with at most one complete frame of input."""
    if not self.is_open():
        return
    if not self.sock.connected:
        self._connected = False
        return
    r, _, _ = select.select(
        (self.sock.sock, ), (), (), timeout)
    if r:
        op_code, frame = self.sock.recv_data_frame(True)
        if op_code == ABNF.OPCODE_CLOSE:
            self._connected = False
            return
        elif op_code == ABNF.OPCODE_BINARY or op_code == ABNF.OPCODE_TEXT:
            data = frame.data
            if six.PY3:
                data = data.decode("utf-8", "ignore")
            if len(data) > 1:
                channel = ord(data[0])
                data = data[1:]
                if data:
                    if channel in self.hexdump_channels:
                        # retrieve raw data from the frame as hex
                        data = '0x' + frame.data[1:].hex()
                    if channel in [STDOUT_CHANNEL, STDERR_CHANNEL]:
                        # keeping all messages in the order they were
                        # received, for non-blocking calls
                        self._all.write(data)
                    if channel not in self._channels:
                        self._channels[channel] = data
                    else:
                        self._channels[channel] += data
```

An alternative approach:
I adjusted my code for copying to be more general. However, I don't quite know where to add it to the project.

Code:

```python
import os
import tarfile
import tempfile
import threading
from pathlib import Path

import kubernetes.stream  # ensure the stream submodule is loaded
from kubernetes.client import CoreV1Api


class MyClass:

    def copy_from_pod(self, name, namespace, source, destination, **kwargs):
        """copy_from_pod  # noqa: E501

        copy file from a pod  # noqa: E501
        This method makes a synchronous HTTP request by default. To make an
        asynchronous HTTP request, please pass async_req=True
        >>> thread = api.copy_from_pod(name, namespace, source, destination, async_req=True)
        >>> result = thread.get()

        :param async_req bool: execute request asynchronously
        :param str name: name of the PodExecOptions (required)
        :param str namespace: object name and auth scope, such as for teams and projects (required)
        :param str source: file path to retrieve from the pod
        :param str destination: destination file name on the local system
        :return: str
                 If the method is called asynchronously,
                 returns the request thread.
        """
        def cp():
            # -c  create a new archive
            # -m  don't extract file modified time
            # -f  use archive file or device ARCHIVE
            exec_command = ["tar", "cmf", "-", source]
            with tempfile.TemporaryFile(mode="w+b") as tar_buffer:
                wsclient = kubernetes.stream.stream(
                    CoreV1Api().connect_get_namespaced_pod_exec,
                    name=name,
                    namespace=namespace,
                    command=exec_command,
                    stdout=True,
                    stderr=True,
                    _preload_content=False,
                )
                wsclient.hexdump_channels = [1]
                out = ''
                while wsclient.is_open():
                    wsclient.update(timeout=1)
                    if wsclient.peek_stdout():
                        out += wsclient.read_stdout()
                    if wsclient.peek_stderr():
                        err: str = wsclient.read_stderr()
                        raise RuntimeError(err)
                wsclient.close()
                tar_buffer.write(bytes.fromhex(out[2:]))  # strip the "0x" prefix
                tar_buffer.seek(0)
                destination_dir = Path(destination).parent
                if not destination_dir.exists():
                    os.makedirs(destination_dir)
                with tarfile.open(fileobj=tar_buffer, mode="r:") as tar:
                    for member in tar.getmembers():
                        if member.isdir():
                            continue
                        tar.makefile(member, Path(destination))

        request_thread = threading.Thread(target=cp)
        request_thread.start()
        if kwargs.get("async_req") is True:
            return request_thread
        else:
            request_thread.join()
            if Path(destination).is_absolute():
                return destination
            else:
                return str(Path(os.getcwd(), destination).absolute())
```

@roycaihw Do you have a suggestion? Also, am I correct that this code would need to be made compatible with Python 2.7, even though that is EOL by now?
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
@roycaihw Any inputs on this?
/remove-lifecycle stale
/lifecycle stale
/lifecycle rotten
/remove-lifecycle rotten
/lifecycle stale
/lifecycle rotten
/remove-lifecycle rotten
@roycaihw Can you give some hints here: #1471 (comment)? There is a solution; what's missing is knowledge of how to integrate it into the project.
/lifecycle stale
These bots suck.
/lifecycle stale
/remove-lifecycle stale
/lifecycle stale
/remove-lifecycle stale
/lifecycle stale
/remove-lifecycle stale
It looks like any attempt at this was abandoned. I have eadfcaa, which takes the approach of adding a backward-compatible option to use binary data for all channels. It passes the existing tests, as well as new ones that work with binary data. I am going to try it "in the wild" and open a PR if I don't find any egregious issues.
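For readers following along, here is a rough sketch of what a backward-compatible binary option could look like. This is illustrative only, not the code in eadfcaa; the names `binary` and `_store` are assumptions:

```python
from io import BytesIO, StringIO


class BinaryCapableClient:
    """Illustrative sketch only, not the actual WSClient change."""

    def __init__(self, binary=False):
        # When binary=True, the buffer holds bytes and no decode step runs;
        # the default (False) keeps the existing text behaviour.
        self.binary = binary
        self._all = BytesIO() if binary else StringIO()

    def _store(self, frame_data):
        payload = frame_data[1:]  # strip the leading channel byte
        if not self.binary:
            payload = payload.decode("utf-8", "replace")
        self._all.write(payload)
        return payload


client = BinaryCapableClient(binary=True)
# Channel byte 0x01 (stdout) followed by bytes that are not valid UTF-8:
assert client._store(b"\x01\xd4\xc3\xb2\xa1") == b"\xd4\xc3\xb2\xa1"
```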
What is the feature and why do you need it:

The API client currently does not offer any capabilities for copying files from Pods / containers (similar to `kubectl cp`). We have created a workaround by opening a stream to the Pod, creating a tar archive of the files to copy, and outputting the data to stdout. Once retrieved, the data is then extracted from the archive. However, this does not work for binary files due to how the WSClient handles incoming data.

See `kubernetes\stream\ws_client.py:161-178` (kubernetes v17.17.0). The last line always tries to decode to UTF-8 while replacing all characters that cannot be properly decoded. In the case of binary PCAP files, this results in corrupted data.
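A quick standalone illustration of that lossiness (the pcap magic number is used only as example data):

```python
raw = b"\xd4\xc3\xb2\xa1"                  # pcap magic bytes; not valid UTF-8
decoded = raw.decode("utf-8", "replace")   # invalid bytes become U+FFFD
assert decoded.encode("utf-8") != raw      # the original bytes are gone
```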
Describe the solution you'd like to see:

I think that changing the signature of `update` to let the user choose the decode error handler might resolve the issue. Currently it is hard-coded to `replace`; according to the docs, `strict` and `ignore` are options as well. I tested out `ignore`, and it results in the desired output when decoding and then encoding again, with both byte objects being the same.

If needed, I can implement the agreed-upon solution myself and open a pull request.