Grosoiu_Andrei_Submission

This repo was created for the pre-interview challenge at LSEG.

Stock Exchange Data Processor

This project reads and processes stock exchange data from CSV files. It identifies outliers in the data and saves the results to new CSV files. The main requirements of the task are implemented in main.py. In addition, plot_stocks.py creates plots of a function fitted to each stock's values; the coefficients for that function were generated with Matlab.

Features

Read Multiple CSV Files: Reads one or two CSV files for each stock exchange directory.
Data Sampling: Extracts 30 data points starting from a random timestamp from each CSV file.
Outlier Detection: Processes the sampled data to identify outliers based on statistical analysis.
Logging and Error Handling: Logs warnings and errors with timestamps for better traceability.
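
The sampling step can be sketched as below. This is a minimal illustration using only the standard library, not the code in main.py: the function name `sample_consecutive` and its parameters are hypothetical, and rows are represented as a plain list rather than a pandas dataframe.

```python
import random

SAMPLE_SIZE = 30  # the README states 30 data points per file


def sample_consecutive(rows, sample_size=SAMPLE_SIZE, rng=random):
    """Return `sample_size` consecutive rows starting at a random index.

    Returns None when there are fewer rows than `sample_size`
    (assumed behaviour; the real script may handle this differently).
    """
    if len(rows) < sample_size:
        return None
    # Choose a start index that still leaves room for a full window.
    start = rng.randrange(len(rows) - sample_size + 1)
    return rows[start:start + sample_size]
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible, which is convenient when testing.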

Directory Structure

project-root/
│
├── inputs/
│ ├── exchange1/
│ │ ├── file1.csv
│ │ └── file2.csv
│ ├── exchange2/
│ │ ├── file1.csv
│ │ └── file2.csv
│ └── ...
├── output/
│ └── (outlier files will be saved here)
├── main.py
└── README.md

Prerequisites

Python 3.x
Pandas library

Installation

Clone the repository:

git clone https://github.com/Grosoiu/Grosoiu_Andrei_Submission.git

Install the required dependencies:
```
pip install -r requirements.txt
```

Run the script:

python main.py 1 or python main.py 2, where the number is how many files to process per stock exchange.
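
One way to accept only 1 or 2 as the argument is shown below. This is a sketch under assumptions: how main.py actually parses its argument is not described here, and the name `parse_args` is hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line, accepting only 1 or 2 as the file count."""
    parser = argparse.ArgumentParser(
        description="Find outliers in sampled stock exchange data."
    )
    parser.add_argument(
        "num_files",
        type=int,
        choices=[1, 2],
        help="how many CSV files to process per stock exchange",
    )
    return parser.parse_args(argv)
```

With `choices=[1, 2]`, argparse rejects any other value and exits with a usage message, so the rest of the script does not need its own range check.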

Optionally, to run the script that plots the fitted function for each stock, install the extra dependencies:
```
pip install -r requirements_exta.txt
```

Run the script:

python plot_stocks.py

Detailed Description

Create a directory for each stock exchange in the inputs folder and add the CSV files for each stock to it. After the script runs, the results appear in the output folder.

`read_csv_files(num_files)`

Parameters:
- num_files (int): Number of files to read (1 or 2).
Returns:
- A dictionary with the stock exchange directory names as keys and lists of dataframes with 30 data points each as values for the stocks.
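
The directory walk behind this dictionary might look like the sketch below. It is an assumption-laden illustration, not the function from main.py: `list_exchange_files` is a hypothetical helper that returns file paths, whereas the real function returns dataframes of 30 sampled rows.

```python
from pathlib import Path


def list_exchange_files(inputs_dir, num_files):
    """Map each exchange directory name to its first `num_files` CSV paths."""
    result = {}
    for exchange_dir in sorted(Path(inputs_dir).iterdir()):
        if not exchange_dir.is_dir():
            continue  # ignore stray files next to the exchange folders
        csv_paths = sorted(exchange_dir.glob("*.csv"))
        # Assumption: "first" means first in alphabetical order.
        result[exchange_dir.name] = csv_paths[:num_files]
    return result
```

Each path could then be loaded and sampled, so the final dictionary has the same keys with lists of sampled data as values.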

`parse_exchanges(processed_data)`

Parameters:
- processed_data (dict): A dictionary with stock exchange directory names as keys and lists of dataframes with 30 data points each as values.
Returns:
- Writes each stock's outliers to the output folder, using the naming rule {stock_exchange}_{stock}_outlier.csv
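
As an illustration of this step, the sketch below flags outliers and builds the output filename. The exact statistical rule used by main.py is not stated here, so the rule shown (values more than two standard deviations from the sample mean) is an assumption, and both function names are hypothetical.

```python
import statistics


def find_outliers(prices, threshold=2.0):
    """Return (index, price) pairs lying more than `threshold`
    standard deviations from the mean of `prices` (assumed rule)."""
    mean = statistics.mean(prices)
    stdev = statistics.pstdev(prices)
    if stdev == 0:
        return []  # all values identical, so nothing can be an outlier
    return [
        (i, p) for i, p in enumerate(prices)
        if abs(p - mean) > threshold * stdev
    ]


def outlier_filename(stock_exchange, stock):
    """Build the output filename following the documented naming rule."""
    return f"{stock_exchange}_{stock}_outlier.csv"
```

For example, `outlier_filename("exchange1", "GSK")` gives `exchange1_GSK_outlier.csv`.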

Logging

The script logs messages at several levels, each with a timestamp, to help with debugging and tracking the process flow. Log messages include warnings for missing files, critical errors for empty or insufficient data, and info messages for successfully saved outlier files. From my experience working in monitoring, I know how important logs are for tracking the health of software.
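
A timestamped logging setup of this kind can be sketched with the standard logging module. The format string, the logger name `stock_processor`, and the example messages are assumptions for illustration, not copied from main.py.

```python
import logging

# Assumed format: timestamp, level, then the message.
LOG_FORMAT = "%(asctime)s - %(levelname)s - %(message)s"


def build_logger(name="stock_processor"):
    """Return a logger that prints timestamped messages at INFO and above."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(LOG_FORMAT))
        logger.addHandler(handler)
    return logger


logger = build_logger()
logger.warning("Expected %d CSV files in %s, found %d", 2, "exchange1", 1)
logger.critical("%s is empty or has fewer than 30 rows", "file1.csv")
logger.info("Saved outliers to %s", "output/exchange1_GSK_outlier.csv")
```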

Output Images

The script generates fitting plots for each stock and saves them in the output directory. The coefficients were calculated with Matlab's polyfit function. Examples:

ASH Fitting Plot:
FLTR Fitting Plot:
GSK Fitting Plot:
NMR Fitting Plot:
TSLA Fitting Plot:
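
Matlab's polyfit returns coefficients in descending powers, so a fitted curve can be evaluated in Python with Horner's method, as sketched below. The function name `evaluate_polynomial` and the coefficients in the usage note are made up for illustration; how this repository stores its coefficients and how plot_stocks.py evaluates them is not described here.

```python
def evaluate_polynomial(coefficients, x):
    """Evaluate a polynomial at x using Horner's method.

    `coefficients` are ordered from the highest power to the constant
    term, which is the order Matlab's polyfit uses.
    """
    result = 0.0
    for c in coefficients:
        result = result * x + c
    return result
```

For instance, with the made-up coefficients `[2, 0, 1]` (that is, 2x² + 1), `evaluate_polynomial([2, 0, 1], 3)` returns 19.0.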
