Variable name not imported correctly #165

Open
Ov-ille opened this issue Feb 25, 2022 · 2 comments
Labels
bug (Something isn't working), requires changes in Readstat (waiting for changes in the C library Readstat)

Comments


Ov-ille commented Feb 25, 2022

I came across a very strange problem today.

When reading an SPSS file, at least one of the variables is assigned a different name than it has in the SPSS file.
I had never noticed this behaviour before, except while struggling with the known issue #119, where long string variables are split into multiple variables on import.
Oddly, the variable in this case is numeric with only 0/1 values, so I doubt it is related to the same problem.

A workaround I have found for this issue, as well as for #119, is to open the file in SPSS, make any change to it, and save it again. After that, everything is imported correctly.

Reproducing the issue

import pyreadstat as sav

# load the data and its metadata
data, meta = sav.read_sav("test.sav")
# access the variable in question --> this raises a KeyError!
print(data["BRANDAA_SUN_1"])
# instead, the variable was renamed while importing into Python:
print(data["BRANDAA"])
  • Open the SPSS file, make any change to it, and save it again (or download the following file, in which I made a minor change by adding a variable label to one variable: https://www.dropbox.com/s/32du1xh0523yzb2/test2.SAV?dl=0).
  • Execute the same code against the re-saved file; this time there is no error when accessing the column "BRANDAA_SUN_1".
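
To see whether other variables are affected beyond the one noticed here, the column names of the original and the re-saved file could be compared directly. The following is only a rough sketch: the helper names `diff_column_names` and `compare_sav_files` are made up for illustration, and it assumes both files are available locally as `test.sav` and `test2.SAV`. It relies on `read_sav` with `metadataonly=True` and on the `column_names` attribute of the returned metadata object.

```python
def diff_column_names(expected, actual):
    """Return (missing, unexpected).

    missing:    names present in `expected` but not in `actual`
    unexpected: names present in `actual` but not in `expected`
    Order follows the input lists.
    """
    expected_set, actual_set = set(expected), set(actual)
    missing = [name for name in expected if name not in actual_set]
    unexpected = [name for name in actual if name not in expected_set]
    return missing, unexpected


def compare_sav_files(original_path, resaved_path):
    """Compare the column names pyreadstat reports for two .sav files.

    Hypothetical helper: the re-saved file is treated as the reference,
    since its names were imported correctly in the report above.
    """
    # Imported here so diff_column_names can be used without pyreadstat.
    import pyreadstat

    # metadataonly=True skips loading the data rows; only names are needed.
    _, meta_original = pyreadstat.read_sav(original_path, metadataonly=True)
    _, meta_resaved = pyreadstat.read_sav(resaved_path, metadataonly=True)
    return diff_column_names(meta_resaved.column_names,
                             meta_original.column_names)


# Example call (file names are assumptions, adjust to your paths):
# missing, unexpected = compare_sav_files("test.sav", "test2.SAV")
# print("missing from original import:", missing)
# print("unexpected in original import:", unexpected)
```

If the problem is limited to the one variable described above, `missing` should contain "BRANDAA_SUN_1" and `unexpected` should contain "BRANDAA"; any additional entries would point to further renamed variables.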

Setup Information:
pyreadstat 1.1.4 was installed with pip (I also tried it with 1.1.2)
a virtual environment created with venv
Python3.8 (plain)
Windows10, 64bit

Collaborator

ofajardo commented Mar 4, 2022

I think the file was created using the IBM SPSS DLL files rather than the full application, but it should be possible to read it correctly, since PSPP does so. I have submitted a ticket to Readstat; we have to wait for them to fix it.

ofajardo added the labels "bug" and "requires changes in Readstat" on Mar 4, 2022
Collaborator

ofajardo commented Jan 10, 2024

Another user reports a similar issue with a different file, this time apparently created with SPSS (#250). The file has been added to the bug report in Readstat.
