Pandas import error #17
Comments
It is a bit weird that the specific error message is not visible. Did you cut it out by any chance? It should be somewhere in this block:
Also, you do not really need pandas to import this data. Just try to call
Maybe you do not even need HTTP transport at all. It is possible to import data directly from the HTTP address of the file in the bucket.
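For anyone reading along, here is a minimal sketch of the pandas-free route suggested above. It assumes the connection object exposes pyexasol's `import_from_file` with an `import_params` dictionary, and that the CSV uses `;` as separator, `"` as quote character and has one header line, as in the script in this issue. The helper name `import_csv_without_pandas` is made up for illustration.

```python
def import_csv_without_pandas(connection, csv_path, table, header_rows=1):
    """Load a local CSV file into an Exasol table without going through pandas.

    `connection` is assumed to be a pyexasol connection (or anything with a
    compatible `import_from_file` method).
    """
    import_params = {
        "column_separator": ";",   # matches delimiter=';' in the read_csv call
        "column_delimiter": '"',   # matches quotechar='"' in the read_csv call
        "skip": header_rows,       # assumption: the file starts with a header line
    }
    return connection.import_from_file(csv_path, table, import_params)
```

With a real connection this would be called as `import_csv_without_pandas(C, "file.csv", "file")`. The server then parses the CSV itself, so pandas dtype inference is not involved.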
No, I didn't cut the error. I am using pandas because I will handle some of the data in the file.
Hmm, OK. Let's try to find the actual error in EXA_DBA_AUDIT_SQL. If
If it does not help, could you also try to enable
Let's see if this error is truly anonymous. Maybe you discovered a completely new type of
Do you have a firewall on your local laptop which blocks connections on custom ports? Could you try to disable all firewalls?
I spent yesterday working on it. The problem was mismatched data types between the default DataFrame and the data itself. Thank you for your help.
@Salehflaconi , could you attach minimal reproducible code (1 row of data in pandas and the table structure) or the request log produced by the connection option
Lack of an error message is not normal in such a trivial case. I would be happy to investigate it. Thank you.
Of course, I will be happy to do that with you.
How I solved part of the errors:
#Changing the data types to be able to import
And now
df[16] = df[16].astype(str)
Sample of my data is
I set the delimiter as
but the delimiter now is not "
And the error is
Traceback (most recent call last):
Thank you. To enable debugging, just add an extra connection option.
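The option name and the example from the comment above did not survive in this copy of the thread. As a hedged sketch, assuming the option meant here is pyexasol's `debug` flag on `pyexasol.connect`, the connection arguments could be assembled like this. The helper name `build_debug_connection_kwargs` is hypothetical.

```python
def build_debug_connection_kwargs(dsn, user, password, schema):
    """Return keyword arguments for pyexasol.connect with debugging switched on.

    Assumption: `debug=True` is the connection option referred to in the
    comment above; it makes pyexasol print the requests it sends and the
    responses it receives.
    """
    return {
        "dsn": dsn,
        "user": user,
        "password": password,
        "schema": schema,
        "debug": True,
    }
```

Usage would then be `C = pyexasol.connect(**build_debug_connection_kwargs(exdsn, exuser, expwd, schema))`, with the resulting output attached to the issue.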
I had the same error. For me, downgrading to pandas 0.23.4 helped. So maybe there is an issue with pyexasol + pandas 0.24.x?
@cyroxx I will share my solution with you today. I created a workaround to complete my task and have just finished it!
The problem I faced was converting the data types from the pandas DataFrame and loading it into Exasol. That's why I added temporary columns with random values at the beginning of my script and deleted them at the end, and it works! I don't know the reason, but I will send the debug output to you soon. This is part of my script:
@wildraid I couldn't send the first debug output because I solved the problem and I don't know how to reproduce the bug.
@Salehflaconi , @cyroxx , maybe it is related to this bug in pandas: pandas-dev/pandas#25048 Could you try versions 0.24.0, 0.24.1 and 0.24.2? Maybe it was fixed in the latest version. I do not have easy access to Windows machines at the moment.
@cyroxx , please try updating pyexasol to the latest version.
@Salehflaconi , I suspect you have a genuine mismatch in the number of columns between the DataFrame and the target table in Exasol. Try using
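One way to confirm or rule out the suspected column-count mismatch before calling `import_from_pandas` is a small check like the one below. This is only an illustrative sketch: the helper name `describe_column_mismatch` is made up, and how the list of table columns is obtained is left to the caller.

```python
def describe_column_mismatch(frame_columns, table_columns):
    """Return None if the column counts match, otherwise a readable message.

    `frame_columns` can be `df.columns`; `table_columns` is a list of the
    target table's column names, supplied by the caller.
    """
    frame_columns = list(frame_columns)
    table_columns = list(table_columns)
    if len(frame_columns) == len(table_columns):
        return None
    return (
        f"DataFrame has {len(frame_columns)} column(s) {frame_columns}, "
        f"but the target table has {len(table_columns)} column(s) {table_columns}"
    )
```

For the script in this issue, the slice `df.iloc[[0], [0,1,2]]` has three columns, so the check would report a problem if the target table has any other number of columns.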
@Salehflaconi , @cyroxx , hi guys. Is this issue still relevant?
I am using the script below to move data from a local file into Exasol.
The error appears in the import part.
import boto3
import botocore
import pyexasol
import pandas as pd
import exa_cred as exa
import Bucket_info as Bk_info
import datetime
import base64

# Read credentials and bucket info from local files
exuser = exa.cred['exuser']
expwd = exa.cred['expwd']
exdsn = exa.cred['exdsn']
schema = exa.cred['schema']
C = pyexasol.connect(dsn=exdsn, user=exuser, password=expwd, schema=schema)

# Bucket information
# Assumption: Bucket_name is defined in the Bucket_info module imported above
Bucket = Bk_info.Bucket_name
Key = "file.csv"
outPutName = "file.csv"

# Download the file locally and handle the "not found" case
s3 = boto3.resource('s3')
try:
    s3.Bucket(Bucket).download_file(Key, outPutName)
except botocore.exceptions.ClientError as e:
    if e.response['Error']['Code'] == "404":
        print("The object does not exist.")
    else:
        raise

# Read the data with pandas
print(str(datetime.datetime.now()) + " - Start script ...\n")
df = pd.read_csv(outPutName, delimiter=';', skipinitialspace=True, engine='python', quotechar='"')
print("Testing that pandas is working")
print(df.head(10))

# Import part of the pandas DataFrame into the Exasol table
print(df.iloc[[0, 1], [0, 1, 2]])
C.import_from_pandas(df.iloc[[0], [0, 1, 2]], 'file')
The import error is:
Traceback (most recent call last):
File "C:/...../script.py", line 55, in <module>
C.import_from_pandas(df.iloc[[0], [0,1,2]], 'script')
File "C:....\lib\site-packages\pyexasol\connection.py", line 213, in import_from_pandas
return self.import_from_callback(cb.import_from_pandas, src, table, callback_params)
File "C:\lib\site-packages\pyexasol\connection.py", line 300, in import_from_callback
raise sql_thread.exc
File "C:...\site-packages\pyexasol\connection.py", line 290, in import_from_callback
sql_thread.join()
File "C:...\site-packages\pyexasol\http_transport.py", line 53, in join
raise self.exc
File "C:...\site-packages\pyexasol\http_transport.py", line 37, in run
self.run_sql()
File "C:...\site-packages\pyexasol\http_transport.py", line 165, in run_sql
self.connection.execute(query)
File "C:...\site-packages\pyexasol\connection.py", line 140, in execute
self.last_stmt = self.cls_statement(self, query, query_params)
File "C:...\site-packages\pyexasol\statement.py", line 47, in __init__
self._execute()
File "C:.....\statement.py", line 141, in _execute
'sqlText': self.query,
File "C:...\site-packages\pyexasol\connection.py", line 442, in req
raise cls_err(self, req['sqlText'], ret['exception']['sqlCode'], ret['exception']['text'])
pyexasol.exceptions.ExaQueryError:
(
'] (Session: xxxx)
dsn => xxxx
user => xxxx
schema => xxxx
code => xxxx
session_id => xxxx
query => xxxx
AT 'Server_IP:port' FILE '000.csv'
)
Process finished with exit code 1
Following versions are used:
Exasol = 6.0.8
Python = Python 3.7.2
rsa = 3.4.2
pandas = 0.24.1
websocket-client = 0.55.0
pyexasol = 0.5.2
OS = Windows