Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

yfinance should not write to the filesystem unless explicitly instructed by the user #1708

Closed
rickturner2001 opened this issue Sep 30, 2023 · 4 comments

Comments

@rickturner2001
Copy link
Contributor

rickturner2001 commented Sep 30, 2023

Describe bug

I think yfinance should not write to the filesystem unless instructed by the user. When I use yfinance, I don't expect it to create databases for caching purposes without being told, it seems to me that that's what everyone using the library would expect.

Caching is a great feature but it should be turned off by default.

if the code is executed in a read-only filesystem then an error will be thrown by the python interpreter. Whether the error is handled internally (thus seamlessly preventing the creation of the caching dir) or by the user, this event should not happen to begin with

import yfiance as yf

symbol= "MSFT"

# tell yf to cache these (tz, ticker, etc...)
df = yf.download(symbol, cache=True)
ticker = yf.download(symbol, cache=True)

Related issues

#1700

Simple code that reproduces your problem

# On a readonly filesystem
import yfinance as yf

symbol = "MSFT"

df =yf.Download(symbol)
ticker = yf.Ticker(symbol)

Debug log

Traceback (most recent call last):
File "/app/main.py", line 1, in
import yfinance as yf
File "/usr/local/lib/python3.11/site-packages/yfinance/init.py", line 23, in
from .ticker import Ticker
File "/usr/local/lib/python3.11/site-packages/yfinance/ticker.py", line 29, in
from .base import TickerBase
File "/usr/local/lib/python3.11/site-packages/yfinance/base.py", line 38, in
from . import utils
File "/usr/local/lib/python3.11/site-packages/yfinance/utils.py", line 991, in
class _KV(_peewee.Model):
File "/usr/local/lib/python3.11/site-packages/yfinance/utils.py", line 995, in _KV
class Meta:
File "/usr/local/lib/python3.11/site-packages/yfinance/utils.py", line 996, in Meta
database = _DBManager.get_database()
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/yfinance/utils.py", line 953, in get_database
cls._initialise()
File "/usr/local/lib/python3.11/site-packages/yfinance/utils.py", line 972, in _initialise
_os.mkdir(cls._cache_dir)
OSError: [Errno 30] Read-only file system: '/root/.cache/py-yfinance'

Bad data proof

No response

yfinance version

0.2.30

Python version

3.11

Operating system

Linux

@ValueRaider
Copy link
Collaborator

ValueRaider commented Sep 30, 2023

As a general principle I 100% agree. But here we have to consider load on Yahoo and risk of them blocking/gatekeeping access. If we assume less load = less risk, then yfinance should try to reduce total load. That timezone cache reduces #requests per history()/download() call by half.

@ValueRaider
Copy link
Collaborator

A solution is don't fetch timezone if user provides timestamps or timezone-aware datetimes.

@rickturner2001
Copy link
Contributor Author

we have to consider load on Yahoo and risk of them blocking/gatekeeping access

This is a great point that I failed to consider.

I suppose a solution for a related issue would be to help users deploying on lambda functions or read-only filesystems (as discussed on issue #1700 and #1702) would be to allow them to pass a dns string to connect to, say a Postgres database and then generate the shema.

since the OSError is thrown as we import yfinance, we should at least handle that differently (maybe generate the tables with the first db statement exectuion).

# db cache is not instantiated
import yfinance as yf

# db cache is instantiated
ticker =  yf.Ticker("MSFT")

#...

# db cache is already instntiated
ticker = yf.Ticker("GOOGL")

As per the DNS string, I could see something like this happening

import yfinance as yf

yf.CacheConfig(my_provider, my_dns_string)

@ValueRaider
Copy link
Collaborator

ValueRaider commented Oct 1, 2023

since the OSError is thrown as we import yfinance, we should at least handle that differently

Or fully implement lazy-loading database, then nothing to handle #1709

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants