Added treemap example #531

Merged
merged 3 commits into from
Oct 6, 2023
130 changes: 130 additions & 0 deletions examples/tools/treemap/polygon_sic_code_data_gatherer.py
@@ -0,0 +1,130 @@
import json
import concurrent.futures
from polygon import RESTClient

# Initialize Polygon API client
client = RESTClient(
    trace=True
)  # Assumes the POLYGON_API_KEY environment variable is set

# Initialize the data structure to hold SIC code groups
sic_code_groups = {}


# https://en.wikipedia.org/wiki/Standard_Industrial_Classification
# https://www.investopedia.com/terms/s/sic_code.asp
def sic_code_to_group(sic_code):
    """
    Maps a given SIC code to its industry group.
    Returns None if the code falls outside every known range.
    """
    sic_code = int(sic_code)
    if 100 <= sic_code <= 999:
        return "Agriculture, Forestry and Fishing"
    elif 1000 <= sic_code <= 1499:
        return "Mining"
    elif 1500 <= sic_code <= 1799:
        return "Construction"
    # Note: 1800-1999 not used
    elif 2000 <= sic_code <= 3999:
        return "Manufacturing"
    elif 4000 <= sic_code <= 4999:
        return "Transportation and Public Utilities"
    elif 5000 <= sic_code <= 5199:
        return "Wholesale Trade"
    elif 5200 <= sic_code <= 5999:
        return "Retail Trade"
    elif 6000 <= sic_code <= 6799:
        return "Finance, Insurance and Real Estate"
    elif 7000 <= sic_code <= 8999:
        return "Services"
    elif 9100 <= sic_code <= 9729:
        return "Public Administration"
    elif 9900 <= sic_code <= 9999:
        return "Nonclassifiable"
    else:
        return None


def process_ticker(ticker_snapshot):
    """Fetch details for one ticker and add it to its SIC industry group."""
    ticker = ticker_snapshot.ticker

    try:
        details = client.get_ticker_details(ticker)

        # Only process common stock ('CS') that has a known market cap
        if (
            getattr(details, "type", None) != "CS"
            or getattr(details, "market_cap", None) is None
        ):
            return

        # Use getattr so a missing attribute yields None instead of AttributeError
        sic_code = getattr(details, "sic_code", None)
        market_cap = getattr(details, "market_cap", None)

        if sic_code:
            sic_group = sic_code_to_group(sic_code)
            if sic_group is None:
                return

            # Create the group entry on first use with one setdefault call rather
            # than check-then-insert, reducing the chance that concurrent workers
            # overwrite each other's entry
            group = sic_code_groups.setdefault(
                sic_group, {"sic_description": sic_group, "companies": []}
            )

            # Append the company details to the corresponding group entry
            group["companies"].append({"ticker": ticker, "market_cap": market_cap})

    except Exception as e:
        print(f"Error processing ticker {ticker}: {e}")


# Get snapshot data
snapshot = client.get_snapshot_all("stocks")

# Execute the data processing in parallel, limited to 100 workers
with concurrent.futures.ThreadPoolExecutor(max_workers=100) as executor:
    executor.map(process_ticker, snapshot)

# Modify the SIC Code Groups Dictionary to include the weights
for sic_group, group_data in sic_code_groups.items():
    companies = group_data["companies"]
    total_market_cap = sum(
        company["market_cap"] for company in companies if company["market_cap"]
    )

    # Skip the weight calculation when the total is 0 to avoid division by zero
    if total_market_cap == 0:
        continue

    for company in companies:
        # A company whose market cap is None or 0 gets a weight of 0
        if company["market_cap"]:
            company["weight"] = company["market_cap"] / total_market_cap
        else:
            company["weight"] = 0

# Save the enhanced data structure to a JSON file
with open("sic_code_groups.json", "w") as f:
    json.dump(sic_code_groups, f)

print("Data collection complete and saved to 'sic_code_groups.json'")
60 changes: 60 additions & 0 deletions examples/tools/treemap/readme.md
@@ -0,0 +1,60 @@
# Mapping Market Movements with Polygon.io and D3.js Treemap

This example is a tutorial on building a Treemap visualization of current stock market conditions. Using D3.js Treemap, Polygon.io's [Snapshot API](https://polygon.io/docs/stocks/get_v2_snapshot_locale_us_markets_stocks_tickers), and the [python-client library](https://github.com/polygon-io/client-python), we'll guide you through building an interactive visualization. The Snapshot API lets us fetch the most recent market data for all US-traded stocks, which the Treemap then renders as color-coded nested rectangles. The result is an interactive snapshot of the market's current status.

![Treemap Visualization](./market-wide-treemap.png)

Please see the [tutorial](https://polygon.io/blog/market-movements-with-treemap) for more details.

## Structure

The repo consists of:

- `polygon_sic_code_data_gatherer.py`: Builds the ticker-to-SIC-code mapping used for the treemap groups.
- `sic_code_groups.json`: Pre-built JSON file containing tickers grouped by SIC code.
- `treemap_server.py`: Simple server that hosts the treemap visualization (requires `sic_code_groups.json`).

For those interested in the underlying mechanics, the `polygon_sic_code_data_gatherer.py` script retrieves a snapshot of all ticker symbols, processes each one to obtain its SIC code via the Ticker Details API, and then saves these classifications into the file named `sic_code_groups.json`.

Mapping each SIC code to a broad industry group lets us turn a large dataset into a neatly structured visualization, which makes it easy to read market conditions at a glance. Because `sic_code_groups.json` is pre-built, you don't need to run the script yourself; it is included in case you want to modify the grouping.
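The grouping and weighting steps described above can be sketched offline, without any API calls. This is a minimal illustration rather than the script itself: the helper names `sic_group` and `build_groups` are hypothetical, and the sample tickers and market caps are made up. The SIC ranges are the ones used in `polygon_sic_code_data_gatherer.py`.

```python
# Illustrative sketch only: helper names and sample data are assumptions.
SIC_RANGES = [
    (100, 999, "Agriculture, Forestry and Fishing"),
    (1000, 1499, "Mining"),
    (1500, 1799, "Construction"),
    (2000, 3999, "Manufacturing"),
    (4000, 4999, "Transportation and Public Utilities"),
    (5000, 5199, "Wholesale Trade"),
    (5200, 5999, "Retail Trade"),
    (6000, 6799, "Finance, Insurance and Real Estate"),
    (7000, 8999, "Services"),
    (9100, 9729, "Public Administration"),
    (9900, 9999, "Nonclassifiable"),
]


def sic_group(sic_code):
    """Return the industry group for a SIC code, or None if no range matches."""
    code = int(sic_code)
    for low, high, name in SIC_RANGES:
        if low <= code <= high:
            return name
    return None


def build_groups(records):
    """Group (ticker, sic_code, market_cap) records and add per-group weights."""
    groups = {}
    for ticker, sic_code, market_cap in records:
        name = sic_group(sic_code)
        # Skip unmapped codes and missing or zero market caps
        if name is None or not market_cap:
            continue
        entry = groups.setdefault(name, {"sic_description": name, "companies": []})
        entry["companies"].append({"ticker": ticker, "market_cap": market_cap})

    # Each company's weight is its share of the group's total market cap
    for entry in groups.values():
        total = sum(c["market_cap"] for c in entry["companies"])
        for c in entry["companies"]:
            c["weight"] = c["market_cap"] / total
    return groups


# Made-up sample: two manufacturers, one bank, and one code in the unused 1800-1999 gap
sample = [
    ("AAA", "3571", 300.0),
    ("BBB", "3674", 100.0),
    ("CCC", "6021", 50.0),
    ("DDD", "1900", 10.0),
]
groups = build_groups(sample)
```

In this sample, `AAA` ends up with a weight of 0.75 within "Manufacturing", and `DDD` is dropped because its code falls in no group.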

## Getting Started

Setting up and visualizing the stock market's current conditions is straightforward. All you'll need to do is clone the repository, secure an API key from Polygon.io, install the required Python library, launch the visualization server example, and then dive into the visualization through your web browser.

### Prerequisites

- Python 3.x
- Polygon.io account and API key

### Setup

1. Clone the repository:
```
git clone https://github.com/polygon-io/client-python.git
```

2. Install the necessary Python packages:
```
pip install -U polygon-api-client
```

3. Store your Polygon.io API key securely, or set it as an environment variable:
```
export POLYGON_API_KEY=YOUR_API_KEY_HERE
```
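Before running the scripts, it can help to confirm that the key from step 3 is actually visible to Python. The following is a small sketch under that assumption; `require_api_key` is a hypothetical helper, not part of the client library.

```python
import os


def require_api_key(env=os.environ, name="POLYGON_API_KEY"):
    """Return the API key from the environment, or raise with a clear message.

    `require_api_key` is an illustrative helper, not part of polygon-api-client.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the scripts")
    return key
```

The returned value could then be passed explicitly, for example `RESTClient(api_key=require_api_key())`, instead of relying on the client picking up the environment variable on its own.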

### Running the Treemap Server

Change into the treemap example directory and execute the `treemap_server.py` script:
```
cd examples/tools/treemap
python3 treemap_server.py
```

Upon successful execution, the server will start, and you can view the treemap visualization by navigating to:
```
http://localhost:8889
```

That’s it. You'll now see a Treemap that organizes more than 4,000 US-traded stocks into 10 distinct categories. Use the dropdown box in the top left corner to select different categories and delve deeper into the data.
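To check what the treemap is built from, the groups file can also be inspected directly. The sketch below assumes the layout written by `polygon_sic_code_data_gatherer.py` (group name mapped to a `companies` list with `market_cap` fields); `summarize_groups` is a hypothetical helper name.

```python
import json


def summarize_groups(path="sic_code_groups.json"):
    """Return (group, company_count, total_market_cap) rows, largest group first.

    Assumes the JSON layout produced by polygon_sic_code_data_gatherer.py.
    """
    with open(path) as f:
        groups = json.load(f)

    rows = []
    for name, data in groups.items():
        companies = data.get("companies", [])
        # Treat a missing or null market cap as 0
        total = sum(c.get("market_cap") or 0 for c in companies)
        rows.append((name, len(companies), total))

    return sorted(rows, key=lambda row: row[2], reverse=True)


# Example usage, run from examples/tools/treemap:
# for name, count, total in summarize_groups():
#     print(f"{name}: {count} companies, total market cap {total:,.0f}")
```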
1 change: 1 addition & 0 deletions examples/tools/treemap/sic_code_groups.json

Large diffs are not rendered by default.
