🎊 Feature: Get and Store Splits 🎊 #32

Merged
merged 76 commits into from
Aug 27, 2020
76 commits
546fbc6
Feature: init function (#7)
suchak1 Jun 13, 2020
652613b
Add logo (#8)
suchak1 Jun 13, 2020
a7c2754
Create python-app.yml
suchak1 Jun 18, 2020
d0b29ba
organization and linting
suchak1 Jun 18, 2020
104abce
pytest-check
suchak1 Jun 18, 2020
38c1c18
flatten tests
suchak1 Jun 18, 2020
2bd7834
flatten tests
suchak1 Jun 18, 2020
d58eed1
new tests
suchak1 Jun 18, 2020
2091273
test command update
suchak1 Jun 18, 2020
4b649cf
just run pytest
suchak1 Jun 18, 2020
d8c2cfc
init and copy token bash script
suchak1 Jun 22, 2020
4909976
env vars
suchak1 Jun 22, 2020
8e9ad7c
workflow
suchak1 Jun 22, 2020
1a15801
secrets
suchak1 Jun 22, 2020
256463b
progress towards automation
suchak1 Jun 24, 2020
dc3e67d
handled non ci case
suchak1 Jun 24, 2020
779368f
update token script almost done
suchak1 Jun 24, 2020
8a66d88
chmod +x
suchak1 Jun 24, 2020
550eb86
make tokens dir
suchak1 Jun 24, 2020
184eb43
split prebuild and postbuild scripts
suchak1 Jun 25, 2020
5e585ba
fix workflow
suchak1 Jun 25, 2020
68f82bc
script file path in workflow
suchak1 Jun 25, 2020
0ce77a0
housecleaning
suchak1 Jun 25, 2020
f6b66e1
get windows username on wsl
suchak1 Jun 25, 2020
943f4de
new workflow
suchak1 Jun 26, 2020
fb4b0a3
rename workflows
suchak1 Jun 26, 2020
4153708
yml syntax
suchak1 Jun 26, 2020
fdb0a64
schedule
suchak1 Jun 26, 2020
92e53e9
try again
suchak1 Jun 26, 2020
0be4e57
try again
suchak1 Jun 26, 2020
43870c2
daily update
suchak1 Jun 26, 2020
c05d0e8
update todo
suchak1 Jun 26, 2020
4604436
code coverage
suchak1 Jun 26, 2020
459b164
import sys
suchak1 Jun 26, 2020
7f97643
lower coverage threshold
suchak1 Jun 26, 2020
ada1e50
test_load_portfolio
suchak1 Jun 26, 2020
89c7cae
auto versioning
suchak1 Jun 26, 2020
b884b8b
writing last tests
suchak1 Jun 26, 2020
2ef7365
tests done
suchak1 Jun 26, 2020
f5652d7
locked reqs
suchak1 Jun 26, 2020
f27a646
wrong dotenv
suchak1 Jun 26, 2020
f95a1e6
try again
suchak1 Jun 26, 2020
e88a815
update function name
suchak1 Jun 26, 2020
b7a43c7
fix interval def
suchak1 Jun 26, 2020
c8b442f
comment out csv creation
suchak1 Jun 28, 2020
607a602
yml descriptions
suchak1 Jun 28, 2020
9238253
⚙️ AUTOMATION: CI, tests, and auto token refresh ⚙️ (#10)
suchak1 Jun 28, 2020
77a8aba
🐞 Bug: Commit New OAuth 🐞 (#12)
suchak1 Jul 1, 2020
8c523af
[AUTO] OAuth token refresh #patch
suchak1 Jul 13, 2020
9fab02f
Update update.yml
suchak1 Jul 24, 2020
0690c00
Bug: Fix Auth Flow / Add 2FA (#16)
suchak1 Jul 25, 2020
2738398
Feature: Get and Store Dividends (#14)
suchak1 Aug 3, 2020
8b72440
[AUTO] Symbol update #patch
suchak1 Aug 3, 2020
01a771e
[AUTO] Symbol update #patch
suchak1 Aug 4, 2020
594ff3e
[AUTO] Symbol update #patch
suchak1 Aug 6, 2020
40b87ef
[AUTO] Symbol update #patch
suchak1 Aug 10, 2020
aaac177
Feature: S3 Integration (#19)
suchak1 Aug 12, 2020
1cd36b1
hotfix
suchak1 Aug 13, 2020
43faa43
💽 Feature: Cache pip dependencies 💽 (#20)
suchak1 Aug 13, 2020
c07ec05
still need to write get_splits fx
suchak1 Aug 14, 2020
b46e346
merge conflicts
suchak1 Aug 20, 2020
e82ac8a
merge conflicts
suchak1 Aug 22, 2020
fbedfdc
fix save_symbols test
suchak1 Aug 22, 2020
baa0d98
fixing tests
suchak1 Aug 23, 2020
43661f4
tests fixed?
suchak1 Aug 24, 2020
12271d9
merge
suchak1 Aug 24, 2020
f59264d
iex get splits done
suchak1 Aug 26, 2020
32a21f6
splits functions done
suchak1 Aug 26, 2020
10db573
adding splits update to the pipeline
suchak1 Aug 26, 2020
a3cd839
writing tests
suchak1 Aug 27, 2020
85bfaab
higher order fx refactor and almost done w tests
suchak1 Aug 27, 2020
3ceb498
tests done?
suchak1 Aug 27, 2020
36150eb
fix test_get_splits_path
suchak1 Aug 27, 2020
1ebe74e
fix columns naming
suchak1 Aug 27, 2020
0d0e1d5
refactor
suchak1 Aug 27, 2020
ea3f20b
try again
suchak1 Aug 27, 2020
6 changes: 6 additions & 0 deletions .github/dependabot.yml
@@ -9,3 +9,9 @@ updates:
     directory: "/" # Location of package manifests
     schedule:
       interval: "monthly"
+
+  - package-ecosystem: "github-actions"
+    directory: "/"
+    schedule:
+      # Check for updates to GitHub Actions monthly
+      interval: "monthly"
3 changes: 3 additions & 0 deletions .github/workflows/sandbox.yml
@@ -65,5 +65,8 @@ jobs:
       - name: Update dividends
         run: python scripts/update_dividends.py

+      - name: Update splits
+        run: python scripts/update_splits.py
+
       - name: Upload repo to S3
         run: python3 scripts/update_repo.py
47 changes: 47 additions & 0 deletions .github/workflows/splits.yml
@@ -0,0 +1,47 @@
+# This workflow will automatically update data files
+# For more information see: https://help.github.com/en/actions/reference/events-that-trigger-workflows#scheduled-events-schedule
+
+name: Splits
+
+on:
+  schedule:
+    - cron: "30 12 1 * *"
+    # 12:30 UTC on the 1st of each month (8:30am EDT, 7:30am EST)
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout repo
+        uses: actions/checkout@v2
+        with:
+          ref: ${{ github.head_ref }}
+
+      - name: Set up Python 3.8
+        uses: actions/setup-python@v2
+        with:
+          python-version: 3.8
+
+      - name: Cache pip dependencies
+        uses: actions/cache@v2
+        with:
+          path: ~/.cache/pip
+          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
+          restore-keys: |
+            ${{ runner.os }}-pip-
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
+
+      - name: Update splits
+        env:
+          IEXCLOUD: ${{ secrets.IEXCLOUD }}
+          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
+          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
+          AWS_DEFAULT_REGION: ${{ secrets.AWS_DEFAULT_REGION }}
+          S3_BUCKET: ${{ secrets.S3_BUCKET }}
+          APCA_API_KEY_ID: ${{ secrets.APCA_API_KEY_ID }}
+        run: python scripts/update_splits.py
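GitHub Actions evaluates `schedule` cron expressions in UTC, so the local wall-clock time of this monthly run shifts with daylight saving time. The following is a minimal sketch of that conversion, assuming Python 3.9+ (for `zoneinfo`), an installed tz database, and that the intended local zone is America/New_York (the workflow does not say). `local_run_time` is a hypothetical helper, not part of the repository.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumption: the schedule comment refers to US Eastern time.
NEW_YORK = ZoneInfo('America/New_York')


def local_run_time(year, month):
    """Return the New York local time of the '30 12 1 * *' run in a month."""
    # The cron fields mean: minute 30, hour 12 (UTC), day-of-month 1.
    run_utc = datetime(year, month, 1, 12, 30, tzinfo=timezone.utc)
    return run_utc.astimezone(NEW_YORK)


for month in (1, 7):
    print(local_run_time(2020, month).strftime('%Y-%m-%d %H:%M %Z'))
```

For 2020 this prints 07:30 EST for the January run and 08:30 EDT for the July run, which is why a single fixed local time cannot describe the schedule year-round.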
4 changes: 2 additions & 2 deletions README.md
@@ -70,9 +70,9 @@ Using Robinhood 2FA, we can simply provide our MFA one-time password in the `.en
 - [x] ![Symbols](https://github.com/suchak1/scarlett/workflows/Symbols/badge.svg)
 - [ ] EOD OHLCV
 - [ ] Intraday OHLCV 5 min ticks
-- [ ] Actions
+- [x] Actions
 - [x] ![Dividends](https://github.com/suchak1/scarlett/workflows/Dividends/badge.svg)
-- [ ] Splits
+- [x] ![Splits](https://github.com/suchak1/scarlett/workflows/Splits/badge.svg)
 - [ ] Sentiment
 - [ ] News Sentiment
 - [ ] Social Sentiment
24 changes: 24 additions & 0 deletions scripts/update_splits.py
@@ -0,0 +1,24 @@
+import sys
+sys.path.append('src')
+from DataSource import IEXCloud, Polygon  # noqa autopep8
+
+iex = IEXCloud()
+poly = Polygon()
+symbols = iex.get_symbols()
+
+# Double redundancy
+
+for symbol in symbols:
+    # 1st pass
+    try:
+        iex.save_splits(symbol=symbol, timeframe='3m')
+    except Exception as e:
+        print(f'IEX Cloud split update failed for {symbol}.')
+        print(e)
+
+    # 2nd pass
+    try:
+        poly.save_splits(symbol=symbol, timeframe='max')
+    except Exception as e:
+        print(f'Polygon.io split update failed for {symbol}.')
+        print(e)
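The two-pass loop in this script could be factored into a small helper that runs every provider for every symbol and records failures instead of stopping. This is only a sketch of that pattern: `save_all` and `FakeProvider` are hypothetical names that do not exist in the repository, and the stub stands in for `IEXCloud` and `Polygon` so the example runs without API keys.

```python
class FakeProvider:
    """Hypothetical stand-in for IEXCloud / Polygon (no network calls)."""

    def __init__(self, name, failing=()):
        self.name = name
        self.failing = set(failing)
        self.saved = []

    def save_splits(self, symbol, timeframe):
        if symbol in self.failing:
            raise RuntimeError(f'no data for {symbol}')
        self.saved.append((symbol, timeframe))


def save_all(symbols, passes):
    """Run every (provider, timeframe) pass for every symbol.

    Returns a list of (provider name, symbol, error message) tuples so
    that one failing symbol never aborts the rest of the update.
    """
    failures = []
    for symbol in symbols:
        for provider, timeframe in passes:
            try:
                provider.save_splits(symbol=symbol, timeframe=timeframe)
            except Exception as e:
                failures.append((provider.name, symbol, str(e)))
    return failures


iex = FakeProvider('IEX Cloud', failing={'TSLA'})
poly = FakeProvider('Polygon.io')
failures = save_all(['AAPL', 'TSLA'], [(iex, '3m'), (poly, 'max')])
print(failures)
```

Returning the failures rather than only printing them would let a caller decide whether the workflow run should be marked as failed.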
6 changes: 5 additions & 1 deletion src/Constants.py
@@ -32,6 +32,9 @@
 DEC = 'Dec'  # Declaration Date
 PAY = 'Pay'  # Payment Date

+# Splits
+RATIO = 'Ratio'
+

 class PathFinder:
     def make_path(self, path):
@@ -54,12 +57,13 @@ def get_dividends_path(self, symbol, provider='iexcloud'):
             f'{symbol.upper()}.csv'
         )

-    def get_splits_path(self, symbol):
+    def get_splits_path(self, symbol, provider='iexcloud'):
         # given a symbol
         # return the path to its stock splits
         return os.path.join(
             DATA_DIR,
             SPLT_DIR,
+            folders[provider],
             f'{symbol.upper()}.csv'
         )

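A standalone sketch of the provider-aware path lookup may help when reading this hunk. The values of `DATA_DIR`, `SPLT_DIR`, and the `folders` mapping are not shown in the diff, so the ones below are assumptions chosen to match the paths asserted in test/test_Constants.py. Only the `iexcloud` entry can be inferred from those tests.

```python
import os

# Assumed values (not shown in the diff), inferred from the test paths.
DATA_DIR = 'data'
SPLT_DIR = 'splits'
folders = {'iexcloud': 'iexcloud'}  # hypothetical provider -> folder map


def get_splits_path(symbol, provider='iexcloud'):
    """Return the CSV path for a symbol's splits under a provider folder."""
    return os.path.join(
        DATA_DIR,
        SPLT_DIR,
        folders[provider],
        f'{symbol.upper()}.csv'
    )


print(get_splits_path('aapl'))
```

Because the provider is looked up in a dictionary, an unknown provider name raises `KeyError` rather than silently writing to an unexpected directory.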
117 changes: 86 additions & 31 deletions src/DataSource.py
@@ -30,25 +30,36 @@ def get_dividends(self, symbol, timeframe='max'):
         filtered = self.reader.data_in_timeframe(df, C.EX, timeframe)
         return filtered

+    def standardize(self, symbol, df, full_mapping, fx, columns, default):
+        mapping = {k: v for k, v in full_mapping.items() if k in df}
+
+        df = df[list(mapping)].rename(columns=mapping)
+        filename = fx(symbol, self.provider)
+        time_col, val_col = columns
+
+        if time_col in df and val_col in df:
+            df = self.reader.update_df(
+                filename, df, time_col).sort_values(by=[time_col])
+            df[val_col] = df[val_col].apply(
+                lambda val: float(val) if val else default)
+
+        return df
+
     def standardize_dividends(self, symbol, df):
         full_mapping = dict(
             zip(
                 ['exDate', 'paymentDate', 'declaredDate', 'amount'],
                 [C.EX, C.PAY, C.DEC, C.DIV]
             )
         )
-        mapping = {k: v for k, v in full_mapping.items() if k in df}
-        columns = list(mapping)
-
-        df = df[columns].rename(columns=mapping)
-        filename = self.finder.get_dividends_path(symbol, self.provider)
-
-        if C.EX in df and C.DIV in df:
-            df = self.reader.update_df(
-                filename, df, C.EX).sort_values(by=[C.EX])
-            df[C.DIV] = df[C.DIV].apply(lambda amt: float(amt) if amt else 0)
-
-        return df
+        return self.standardize(
+            symbol,
+            df,
+            full_mapping,
+            self.finder.get_dividends_path,
+            [C.EX, C.DIV],
+            0
+        )

     def save_dividends(self, **kwargs):
         # given a symbol, save its dividend history
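The core of the shared `standardize` helper (keep only the mapped columns that are present, rename them, sort by the time column, and coerce the value column to float with a default) can be illustrated without pandas. The sketch below works on plain dict rows and deliberately omits the merge with the cached CSV that `self.reader.update_df` performs. `standardize_rows` is a hypothetical name, and `'Ex'` is an assumed value for `C.EX`, which this diff does not show.

```python
def standardize_rows(rows, full_mapping, time_col, val_col, default):
    """Rename known keys, sort by time, and coerce values to float."""
    # Keep only keys listed in the mapping, under their new names.
    renamed = [
        {new: row[old] for old, new in full_mapping.items() if old in row}
        for row in rows
    ]
    if all(time_col in row and val_col in row for row in renamed):
        renamed.sort(key=lambda row: row[time_col])
        for row in renamed:
            val = row[val_col]
            # Falsy values (None, '', 0) fall back to the default.
            row[val_col] = float(val) if val else default
    return renamed


# 'Ex' is an assumption; 'Pay', 'Dec', and 'Ratio' appear in src/Constants.py.
mapping = {
    'exDate': 'Ex',
    'paymentDate': 'Pay',
    'declaredDate': 'Dec',
    'ratio': 'Ratio',
}
rows = [
    {'exDate': '2020-08-31', 'ratio': '0.25', 'refid': 1},
    {'exDate': '2014-06-09', 'ratio': None},
]
result = standardize_rows(rows, mapping, 'Ex', 'Ratio', 1)
print(result)
```

The default differs by data type in the diff: a missing dividend amount becomes 0, while a missing split ratio becomes 1, the value that leaves a share count unchanged.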
@@ -57,18 +68,35 @@ def save_dividends(self, **kwargs):
         self.writer.update_csv(
             self.finder.get_dividends_path(symbol, self.provider), df)

-    # def get_splits(self, symbol, timeframe='max'):
-    #     # given a symbol, return a cached dataframe
-    #     df = self.reader.load_csv(self.finder.get_splits_path(symbol))
-    #     filtered = self.reader.data_in_timeframe(df, C.EX, timeframe)
-    #     return filtered
+    def get_splits(self, symbol, timeframe='max'):
+        # given a symbol, return a cached dataframe
+        df = self.reader.load_csv(
+            self.finder.get_splits_path(symbol, self.provider))
+        filtered = self.reader.data_in_timeframe(df, C.EX, timeframe)
+        return filtered

-    # def save_splits(self, **kwargs):
-    #     # given a symbol, save its splits history
-    #     symbol = kwargs['symbol']
-    #     df = self.get_splits(**kwargs)
-    #     self.writer.update_csv(self.finder.get_splits_path(symbol), df)
+    def standardize_splits(self, symbol, df):
+        full_mapping = dict(
+            zip(
+                ['exDate', 'paymentDate', 'declaredDate', 'ratio'],
+                [C.EX, C.PAY, C.DEC, C.RATIO]
+            )
+        )
+        return self.standardize(
+            symbol,
+            df,
+            full_mapping,
+            self.finder.get_splits_path,
+            [C.EX, C.RATIO],
+            1
+        )

+    def save_splits(self, **kwargs):
+        # given a symbol, save its splits history
+        symbol = kwargs['symbol']
+        df = self.get_splits(**kwargs)
+        self.writer.update_csv(
+            self.finder.get_splits_path(symbol, self.provider), df)

     # make tiingo OR IEX CLOUD!! version of get dividends which
     # fetches existing dividend csv and adds a row if dividend
@@ -121,15 +149,34 @@ def get_dividends(self, symbol, timeframe='3m'):

         return self.standardize_dividends(symbol, df)

-    # def get_splits(self, symbol):
-    #     # given a symbol, return the stock splits
-    #     ticker = yf.Ticker(symbol.replace('.', '-'))
-    #     df = ticker.actions.reset_index().drop(
-    #         'Dividends',
-    #         axis=1
-    #     )
-    #     df = df[df['Stock Splits'] != 0]
-    #     return df
+    def get_splits(self, symbol, timeframe='3m'):
+        # given a symbol, return the stock splits
+        category = 'stock'
+        dataset = 'splits'
+        parts = [
+            self.base,
+            self.version,
+            category,
+            symbol.lower(),
+            dataset,
+            timeframe
+        ]
+        endpoint = self.get_endpoint(parts)
+        response = requests.get(endpoint)
+        empty = pd.DataFrame()
+
+        if response.ok:
+            data = response.json()
+            # self.writer.save_json(f'data/{symbol}.json', data)
+        else:
+            print(f'Invalid response from IEX for {symbol} splits.')
+
+        if not response.ok or data == []:
+            return empty
+
+        df = pd.DataFrame(data)
+
+        return self.standardize_splits(symbol, df)


 class Polygon(MarketData):
@@ -144,3 +191,11 @@ def get_dividends(self, symbol, timeframe='max'):
         raw = pd.DataFrame(response.results)
         df = self.standardize_dividends(symbol, raw)
         return self.reader.data_in_timeframe(df, C.EX, timeframe)
+
+    def get_splits(self, symbol, timeframe='max'):
+        response = self.client.reference_stock_splits(symbol)
+        raw = pd.DataFrame(response.results)
+        df = self.standardize_splits(symbol, raw)
+        return self.reader.data_in_timeframe(df, C.EX, timeframe)
+
+        # newShares = oldShares / ratio
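The closing comment in this hunk gives the convention for the `Ratio` column: `newShares = oldShares / ratio`. A minimal sketch of that arithmetic follows, assuming every stored ratio uses this orientation (the diff does not confirm that both providers report it the same way). `shares_after_split` is a hypothetical helper.

```python
def shares_after_split(old_shares, ratio):
    """Apply the convention from the comment: newShares = oldShares / ratio."""
    if ratio <= 0:
        raise ValueError('split ratio must be positive')
    return old_shares / ratio


# Under this convention a ratio of 0.25 is a 4-for-1 split.
print(shares_after_split(100, 0.25))  # → 400.0
# A ratio above 1 is a reverse split, e.g. 10 means 1-for-10.
print(shares_after_split(100, 10))    # → 10.0
```

This also shows why the default ratio in `standardize_splits` is 1: dividing by 1 leaves the share count unchanged.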
6 changes: 4 additions & 2 deletions test/test_Constants.py
@@ -20,5 +20,7 @@ def test_get_dividends_path(self):
             'AMD') == 'data/dividends/iexcloud/AMD.csv'

     def test_get_splits_path(self):
-        assert finder.get_splits_path('aapl') == 'data/splits/AAPL.csv'
-        assert finder.get_splits_path('AMD') == 'data/splits/AMD.csv'
+        assert finder.get_splits_path(
+            'aapl') == 'data/splits/iexcloud/AAPL.csv'
+        assert finder.get_splits_path(
+            'AMD') == 'data/splits/iexcloud/AMD.csv'