QSYMBOL as unicode string in Python 3 #35

Closed
audetto opened this issue Dec 30, 2015 · 7 comments

@audetto

audetto commented Dec 30, 2015

Hi,

I am using Python 3, and when I query my employer's kdb server I get back a lot of QSYMBOL and QSYMBOL_LIST values, which are converted to numpy.string_. As I understand it, that is just bytes.

This is really annoying, as the rest of my code uses plain Python 3 strings.

Would it be possible for the user to specify an encoding and convert them to strings?
Maybe using the QReader mapping mechanism? Is it private, or can it be overridden?
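
In the meantime, one possible workaround is to decode the returned values on the client side. The sketch below is plain Python and does not touch qPython itself; `decode_symbols` is a hypothetical helper name, and it assumes the values arrive as `bytes` objects or as lists of them (a numpy array would need `.tolist()` first).

```python
def decode_symbols(value, encoding="latin-1"):
    """Recursively decode bytes, or lists/tuples of bytes, to str.

    Hypothetical helper, not part of qPython.
    """
    if isinstance(value, bytes):
        # numpy.string_ subclasses bytes on Python 3, so it is handled here too
        return value.decode(encoding)
    if isinstance(value, (list, tuple)):
        return [decode_symbols(item, encoding) for item in value]
    # numbers, str, None and other values are returned unchanged
    return value


single = decode_symbols(b"foo")
many = decode_symbols([b"foo", b"bar"])
print(single, type(single))
print(many, type(many[0]))
```

This keeps the decoding in one place, but it has to be applied to every query result, which is why a reader-level option would be more convenient.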

@maciejlach
Collaborator

Unfortunately, at the moment we don't support replacing QReader/QWriter with custom implementations. This may change in future releases.

For now, the most straightforward solution would be to introduce a new option parameter, similar to numpy_temporals. This would allow overriding the default behavior and returning QSYMBOLs as Python strings.

@maciejlach
Collaborator

The 1.2.0b1 version provides basic support for extending the QReader and QWriter classes.

Here is a code snippet with a subclassed QReader:

# Import paths below are assumed from the qPython package layout.
import numpy

from qpython import qconnection
from qpython.qreader import QReader
from qpython.qtype import QSYMBOL, QSYMBOL_LIST, Mapper


class MyQReader(QReader):
    # QReader and QWriter use decorators to map data types and corresponding function handlers
    parse = Mapper(QReader._reader_map)

    def _read_list(self, qtype):
        if qtype == QSYMBOL_LIST:
            # read the symbol list and decode each entry to a Python string
            self._buffer.skip()
            length = self._buffer.get_int()
            symbols = self._buffer.get_symbols(length)
            return [s.decode(self._encoding) for s in symbols]
        else:
            # every other list type keeps the default handling
            return QReader._read_list(self, qtype=qtype)

    @parse(QSYMBOL)
    def _read_symbol(self, qtype=QSYMBOL):
        return numpy.string_(self._buffer.get_symbol()).decode(self._encoding)


with qconnection.QConnection(host='localhost', port=5000, reader_class=MyQReader) as q:
    symbols = q.sync('`foo`bar')
    print(symbols, type(symbols), type(symbols[0]))

    symbol = q.sync('`foo')
    print(symbol, type(symbol))

@audetto
Author

audetto commented Jan 16, 2016

I will try it.

But I found another issue, about QSTRING.

The doc says they are converted to Python strings, here:

https://github.com/exxeleron/qPython/blob/master/doc/source/type-conversion.rst#string-and-symbols

This is probably true in Python 2, but in Python 3 they are byte arrays. The doc should probably be clarified.

I wonder whether Python 3 users would find it more natural if all these byte arrays were turned into strings using a customisable encoding (I see that "latin-1" is sometimes used).
Byte arrays (as a replacement for strings) are really awkward to use in Python 3.
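
To illustrate why the choice of encoding matters, here is a small standard-library sketch (the sample bytes are made up for illustration). "latin-1" maps every byte value to exactly one code point, so decoding never fails, whereas the same bytes may not be valid UTF-8.

```python
raw = b"caf\xe9"  # "café" encoded as latin-1; sample data for illustration

# latin-1 assigns a character to every byte value, so this cannot raise
text = raw.decode("latin-1")
print(text, type(text))

# the same bytes are not valid UTF-8, so a strict decode is rejected
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print("valid UTF-8:", utf8_ok)

# encoding back with the same codec restores the original bytes
round_trip = text.encode("latin-1")
```

A lossless round trip is a reasonable property for a default, but if the server actually stores UTF-8 text, decoding it as latin-1 yields mojibake, which is an argument for making the encoding configurable.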

@maciejlach
Collaborator

Thanks for the feedback. I'll update the documentation.

You can use the same approach to override the default handling of the QSTRING type. QReader now supports a custom encoding, which is passed as a constructor parameter.

@audetto
Author

audetto commented Jan 19, 2016

I see.
One thing needs to be taken into account.

There are two ways to override methods:

  • standard: via the "vtable"
  • using decorators in a global map

One needs to remember that just by writing the code above, the behaviour of the existing QReader has been changed, because the QSYMBOL parser is registered in a global map. On the other hand, _read_list is in the vtable, so overriding it only affects the subclass.

So it is somewhat hard to swap the two readers in a single running instance of Python.
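
The effect can be reproduced without qPython. The sketch below uses hypothetical stand-ins (`Mapper`, `BaseReader`, `StrReader`, `SYMBOL`) that only mimic the pattern described here; it is not the library's code. The decorator writes into the parent's dictionary, so the parent changes behaviour too, while the ordinary method override stays local to the subclass.

```python
class Mapper:
    """Decorator factory that records handlers in the dict it is given."""

    def __init__(self, call_map):
        self.call_map = call_map

    def __call__(self, *keys):
        def register(func):
            for key in keys:
                self.call_map[key] = func
            return func
        return register


SYMBOL = "symbol"  # stand-in for a q type code


class BaseReader:
    _reader_map = {}
    parse = Mapper(_reader_map)

    @parse(SYMBOL)
    def _read_symbol(self, raw):
        return raw  # default: leave the value as bytes

    def _read_list(self, raws):
        return list(raws)

    def read(self, qtype, raw):
        # dispatch through the map, not through normal method lookup
        return self._reader_map[qtype](self, raw)


base_before = BaseReader().read(SYMBOL, b"foo")


class StrReader(BaseReader):
    # same dict object as the parent: registrations leak into BaseReader
    parse = Mapper(BaseReader._reader_map)

    @parse(SYMBOL)
    def _read_symbol(self, raw):
        return raw.decode("latin-1")

    def _read_list(self, raws):
        return [r.decode("latin-1") for r in raws]


base_after = BaseReader().read(SYMBOL, b"foo")   # changed by defining StrReader
base_list = BaseReader()._read_list([b"foo"])    # unchanged: normal override
print(base_before, base_after, base_list)
```

Merely defining `StrReader` is enough to change what `BaseReader` returns for symbols, which is why the two readers cannot coexist in one process under this design.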

@maciejlach
Collaborator

Yes, that's correct. This is a design flaw, which I will aim to fix in a future release.

@maciejlach
Collaborator

I've adjusted QReader and QWriter to use the mapping dictionary from the subclass. You can use the parse-time decorators to extend or modify the default mapping, but you have to remember to create a copy of the mapping from the parent class.

The standard way of overriding, by providing implementations of protected methods (e.g. _read_list), is still allowed.

I've updated the documentation and provided an updated example.
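
As a rough, library-independent sketch of that pattern (the names `Mapper`, `BaseReader`, `StrReader` and `SYMBOL` are hypothetical stand-ins, not the actual qPython classes): the subclass copies the parent's mapping before registering its own handlers, and dispatch goes through the map found on the instance's own class.

```python
class Mapper:
    """Decorator factory that records handlers in the dict it is given."""

    def __init__(self, call_map):
        self.call_map = call_map

    def __call__(self, *keys):
        def register(func):
            for key in keys:
                self.call_map[key] = func
            return func
        return register


SYMBOL = "symbol"  # stand-in for a q type code


class BaseReader:
    _reader_map = {}
    parse = Mapper(_reader_map)

    @parse(SYMBOL)
    def _read_symbol(self, raw):
        return raw  # default: leave the value as bytes

    def read(self, qtype, raw):
        # attribute lookup on self picks up the subclass's map when it has one
        return self._reader_map[qtype](self, raw)


class StrReader(BaseReader):
    # copy first, so registrations below do not touch BaseReader's map
    _reader_map = dict(BaseReader._reader_map)
    parse = Mapper(_reader_map)

    @parse(SYMBOL)
    def _read_symbol(self, raw):
        return raw.decode("latin-1")


base_value = BaseReader().read(SYMBOL, b"foo")
str_value = StrReader().read(SYMBOL, b"foo")
print(base_value, str_value)
```

With the copy in place, both readers can be used side by side in the same process. Forgetting the copy would reintroduce the shared-map behaviour discussed earlier in the thread.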
