table.scan() with filter set will always fail in happybase >= 0.7 #54

Closed
ruhe opened this issue Jan 22, 2014 · 12 comments
@ruhe

ruhe commented Jan 22, 2014

Description:
batchSize should not be set on scans with a filter.

happybase v0.7 introduced a new batchSize argument for TScan in happybase.table.scan(). When used together with a filter, this parameter causes all scan operations to fail.

happybase always passes batch_size to TScan, even if a filter_string is present.
There is no way to set batch_size to None, since scan() validates the batch_size value:
https://github.com/wbolster/happybase/blob/0.7/happybase/table.py#L259

See corresponding HBase code:
http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hbase/hbase/0.94.9/org/apache/hadoop/hbase/client/Scan.java?av=f#311
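
A rough Python model of what the linked check appears to do, for readers without the Java source at hand. The names `ToyScan`, `ToyFilter` and `IncompatibleFilterError` are hypothetical stand-ins, and the rule itself (a batch size is rejected when the filter decides on whole rows) is an assumption based on the description above, not a copy of the HBase code.

```python
class IncompatibleFilterError(Exception):
    """Hypothetical stand-in for the exception raised on the HBase side."""


class ToyFilter(object):
    """Hypothetical filter stub; it only models a 'filters whole rows' flag."""

    def __init__(self, has_filter_row):
        self._has_filter_row = has_filter_row

    def has_filter_row(self):
        return self._has_filter_row


class ToyScan(object):
    """Hypothetical model of a scan object with a batch setter."""

    def __init__(self, scan_filter=None):
        self.filter = scan_filter
        self.batch = None

    def set_batch(self, batch):
        # Assumed rule: batching splits a row into chunks, so it is refused
        # when the filter needs to see complete rows.
        if self.filter is not None and self.filter.has_filter_row():
            raise IncompatibleFilterError(
                "cannot set batch on a scan using a row-level filter")
        self.batch = batch
```

Under this model, any client that sets a batch size unconditionally will fail as soon as such a filter is attached, which matches the behaviour reported here.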

Steps to reproduce:

import happybase
conn = happybase.Connection(host='localhost', port=9090)
conn.create_table('project', {'f': dict()})
table = conn.table('project')

table.put('row1', {'f:qual1': 'val1'})
table.put('row2', {'f:qual1': 'val2'})
table.put('row3', {'f:qual1': 'val1'})

# this operation always fails
for k, v in table.scan(filter="SingleColumnValueFilter ('f', 'qual1', =, 'binary:val1')"): 
    print v
@wbolster
Member

Would a simple removal of the batch size argument in case a filter argument was supplied introduce other issues?

wbolster added a commit that referenced this issue Jan 25, 2014
Allow None as a valid value for the batch_size argument to Table.scan(),
since HBase does not support specifying a batch size when some scanner
filters are used.

Fixes issue #54.
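
The idea in this commit message can be sketched as below. This is a minimal illustration, not the code of the commit: `build_scan_kwargs` is a hypothetical helper, the default of 1000 is an assumption, and the dictionary keys merely mirror the `batchSize` and filter string fields mentioned in this issue.

```python
def build_scan_kwargs(batch_size=1000, filter=None):
    """Hypothetical helper: collect only the scan arguments that were set."""
    # None is accepted and means "do not send a batch size at all".
    if batch_size is not None and batch_size < 1:
        raise ValueError("'batch_size' must be >= 1 or None")

    kwargs = {}
    if filter is not None:
        kwargs['filterString'] = filter
    if batch_size is not None:
        # Forward a batch size only when the caller asked for one.
        kwargs['batchSize'] = batch_size
    return kwargs
```

With this shape, a caller who combines a filter with `batch_size=None` sends no batch size to the server.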
@wbolster
Member

@ruhe Please test these changes and let me know whether this is a good enough fix. Thanks.

@wbolster
Member

@ruhe I'm not completely sure how the batchSize argument is actually used when using Thrift. If you have suggestions that improve the current implementation, please share.

@ruhe
Author

ruhe commented Jan 28, 2014

@wbolster Thanks a lot for fixing it so fast!
Your fix is what I had in mind. The only concern I have is the default value of 'how_many', but I need to dig deeper into the HBase code to provide more detailed comments.

I'll definitely test this fix, stay tuned :)

@wbolster
Member

wbolster commented Feb 1, 2014

It's a bit more complicated; see issue #56.

openstack-gerrit pushed a commit to openstack/openstack that referenced this issue Feb 12, 2014
Project: openstack/requirements  051bd0cca12c57f2fd016f56db2be724c30499f9
Fix happybase version

Since version 0.7 happybase contains a bug python-happybase/happybase#54
It makes impossible HBase table scanning with filters. Version 0.6 works ok in this scenario.

Change-Id: I33bad6447f6bc1241f3168a3df14e6f5bf028f5b
openstack-gerrit pushed a commit to openstack/requirements that referenced this issue Feb 12, 2014
Since version 0.7 happybase contains a bug python-happybase/happybase#54
It makes impossible HBase table scanning with filters. Version 0.6 works ok in this scenario.

Change-Id: I33bad6447f6bc1241f3168a3df14e6f5bf028f5b
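
A project that cannot pin the dependency could guard against the affected release at runtime instead. The sketch below assumes only what this thread reports: 0.6 works, 0.7 is broken, and 0.8 is said to contain the fix. The function name is hypothetical, and the parsing assumes plain numeric version strings such as "0.7" or "0.7.1".

```python
def happybase_version_is_affected(version):
    """Return True for 0.7.x, the release line reported as broken here."""
    # Pad with '0' so that a bare major version such as "1" still parses.
    parts = (version.split('.') + ['0'])[:2]
    major_minor = (int(parts[0]), int(parts[1]))
    return major_minor == (0, 7)
```

A caller could pass the installed version string to this function and fall back to scanning without a filter, or refuse to start, when it returns True.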
wbolster added a commit that referenced this issue Feb 25, 2014
For details, see the comments added in this commit, and issues #54 and
issue #56.
@wbolster
Member

I've reverted my previous fix (8481d31) in commit da109ab, and implemented (hopefully) the right fix in 106dcf0.

@wbolster
Member

Should be fixed in 0.8 (just released).

@bachvtuan

Hi. I appreciate your work, but it is not fixed. I upgraded to v0.8, and below is my code, which runs against HBase 0.96:

connection = happybase.Connection(host='localhost', port=9090,autoconnect=False,compat='0.96',transport='buffered')
connection.open()

tables = connection.tables()
print "All available tables"
print tables

user_table = connection.table('users')

for key, data in user_table.scan():
    print key, data

And below is the result in the terminal:

All available tables
['inboxes', 'invitations', 'links', 'logs', 'module_categories', 'modules', 'projects', 'todo_comments', 'todo_groups', 'todo_tasks', 'users', 'workspaces']
Traceback (most recent call last):
  File "test.py", line 19, in <module>
    for key, data in user_table.scan():
  File "/usr/local/lib/python2.7/dist-packages/happybase/table.py", line 368, in scan
    self.name, scan, {})
  File "/usr/local/lib/python2.7/dist-packages/happybase/hbase/Hbase.py", line 1889, in scannerOpenWithScan
    return self.recv_scannerOpenWithScan()
  File "/usr/local/lib/python2.7/dist-packages/happybase/hbase/Hbase.py", line 1914, in recv_scannerOpenWithScan
    raise result.io
happybase.hbase.ttypes.IOError: IOError(_message='users')

And the Thrift log:

2014-02-26 08:43:56,244 WARN  [thrift-worker-1] client.HConnectionManager$HConnectionImplementation: Encountered problems when prefetch hbase:meta table: 
org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in hbase:meta for table: users, row=users,,99999999999999
  at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:146)
  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:1102)
  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1162)
  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1054)
  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1011)
  at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:326)
  at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:192)
  at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:165)
  at org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.getTable(ThriftServerRunner.java:462)
  at org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.getTable(ThriftServerRunner.java:468)
  at org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.scannerOpenWithScan(ThriftServerRunner.java:1200)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.hadoop.hbase.thrift.HbaseHandlerMetricsProxy.invoke(HbaseHandlerMetricsProxy.java:67)
  at com.sun.proxy.$Proxy7.scannerOpenWithScan(Unknown Source)
  at org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$scannerOpenWithScan.getResult(Hbase.java:4433)
  at org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$scannerOpenWithScan.getResult(Hbase.java:4417)
  at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
  at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
  at org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:289)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:744)
2014-02-26 08:43:56,247 WARN  [thrift-worker-1] thrift.ThriftServerRunner$HBaseHandler: users
org.apache.hadoop.hbase.TableNotFoundException: users
  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1181)
  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1054)
  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1011)
  at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:326)
  at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:192)
  at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:165)
  at org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.getTable(ThriftServerRunner.java:462)
  at org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.getTable(ThriftServerRunner.java:468)
  at org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.scannerOpenWithScan(ThriftServerRunner.java:1200)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.hadoop.hbase.thrift.HbaseHandlerMetricsProxy.invoke(HbaseHandlerMetricsProxy.java:67)
  at com.sun.proxy.$Proxy7.scannerOpenWithScan(Unknown Source)
  at org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$scannerOpenWithScan.getResult(Hbase.java:4433)
  at org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$scannerOpenWithScan.getResult(Hbase.java:4417)
  at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
  at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
  at org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:289)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:744)

Thrift reports that the 'users' table was not found, but it is in the list of tables.

@bachvtuan

I downgraded to HBase 0.94. v0.8 worked well.

@wbolster
Member

I doubt that that is the same issue as the incorrect scan arguments, since for that I have a test that fails before and passes after my most recent fixes. Are you sure your database is valid? Can you access it using the HBase shell, for instance?

@bachvtuan

Sorry, that's my fault. I created all the tables using compat 0.94 in the connection method (I forgot to update it to 0.96). Now it works.

@wbolster
Member

wbolster commented Mar 2, 2014

@bachvtuan ok, glad to hear that it works.

tanaypf9 pushed a commit to tanaypf9/pf9-requirements that referenced this issue May 20, 2024
Since version 0.7 happybase contains a bug python-happybase/happybase#54
It makes impossible HBase table scanning with filters. Version 0.6 works ok in this scenario.

Change-Id: I33bad6447f6bc1241f3168a3df14e6f5bf028f5b
tanaypf9 pushed a commit to tanaypf9/pf9-requirements that referenced this issue May 20, 2024
Patch Set 1:

@sean, there is a bug in 0.7 python-happybase/happybase#54 . It make impossible to work with hbase

Patch-set: 1