ianmihura/biwak-hack

TenX


This project was built as part of the Data-Driven VC Hackathon organized by Red River West & Bivwak! by BNP Paribas

Boot project

  1. python -m venv .
  2. source ./bin/activate
  3. pip install -r requirements.txt
  4. Add your Harmonic and OpenAI keys to the .env file
  5. Launch the project with streamlit run interface.py
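Step 4 assumes the secrets live in a .env file at the project root. A minimal sketch of what that file might look like (the exact variable names are assumptions; check how the code reads them):

```
HARMONIC_API_KEY=your-harmonic-key
OPENAI_API_KEY=your-openai-key
```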

Main parts of the code to look at:

  • UI entry: interface.py
  • The innovative part of our solution: backend.py (entry point) + BossExecutor.py (main logic). The key idea is to use an LLM to generate steps and queries from the user's question, chain-of-thought style. This is not integrated with the UI yet, but you can run it and watch the process with python backend.py

Class: BossExecutor

Concept description

In today's VC world, extracting insights from data takes time. Analysts either manually pull data from different vendor platforms and work in Excel, or the more tech-driven VCs build their own database by integrating data from the vendors and translate business needs into big-data queries. Both approaches cost time and money, and usually require technical knowledge too.

TenX lets any non-technical VC analyst extract insights from data using natural language.

TenX interprets your question, breaks the problem down into a sequence of API calls (and, potentially, database queries in the future, though that would be more complicated), and runs validations to ensure the result answers what the user asked.

The one hard requirement for TenX to work is well-documented API contracts or a Swagger spec, like harmonic_api_doc.txt in our example.

Modular APIs

We only support the Harmonic and OpenAI APIs for now.

To integrate your own data provider APIs, you must:

  1. Add the secrets to the .env file
  2. Add the name of the endpoint to api_config.py
  3. Provide a list of callable endpoints, or a Swagger spec
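Step 2 above could be sketched as follows. This is a hypothetical shape for an api_config.py entry, not the repo's actual structure; all names here (API_PROVIDERS, env_key, doc_file) are assumptions for illustration:

```python
# Hypothetical registry sketch for api_config.py; the real file may differ.
API_PROVIDERS = {
    "harmonic": {
        "env_key": "HARMONIC_API_KEY",       # secret name expected in .env
        "doc_file": "harmonic_api_doc.txt",  # endpoint docs / Swagger spec
        "endpoints": ["get_company", "get_similar_sites"],
    },
    # A new data provider would follow the same shape:
    "my_provider": {
        "env_key": "MY_PROVIDER_API_KEY",
        "doc_file": "my_provider_swagger.json",
        "endpoints": ["search_companies"],
    },
}

def endpoints_for(provider: str) -> list[str]:
    """Return the callable endpoints registered for a provider."""
    return API_PROVIDERS.get(provider, {}).get("endpoints", [])
```

Keeping the doc file path next to the endpoint list means the step generator can load exactly the documentation it needs per provider.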

For SQL queries

SQL queries are not supported yet.

Demo explanation

We created a mock of how the backend will work in main.py. It performs a search for competitors of a specified domain using various client APIs.

The main function processes user input to find similar companies for a given company. It retrieves company information, finds similar sites, and evaluates them using a vector search engine.

  1. Domain Extraction: Uses a regular expression to extract the domain from the user input.
  2. Client Initialization: Initializes the HarmonicClient to fetch company information based on the domain.
  3. API Calls:
    • Retrieves company info using Harmonic API
    • Fetches similar sites using Harmonic API
    • Gathers detailed information for similar companies using Harmonic API
  4. Validation: Initializes OpenAIClient and validates the results of the API calls with the LLM.

Boss Executor

Overview

The BossExecutor class orchestrates the execution of tasks using various executors: API calls, generic tasks, and, in theory, SQL queries too (though SQL would be harder). For now we only implement HarmonicClient and OpenAIClient to generate and execute steps based on user queries. The class is particularly useful for workflows that require stepwise execution.
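The dispatch idea can be shown in miniature. This is a toy sketch in the style of BossExecutor, not its actual code; the step shape ("type", "endpoint", "task") and class name are assumptions:

```python
# Toy orchestrator sketch: each LLM-generated step names an executor
# ("api", "generic", ...) that the orchestrator dispatches to in order.
class MiniOrchestrator:
    def __init__(self):
        # In the real project these would wrap HarmonicClient / OpenAIClient.
        self.executors = {
            "api": lambda step: f"called {step['endpoint']}",
            "generic": lambda step: f"ran {step['task']}",
        }

    def run(self, steps):
        """Execute steps sequentially, collecting each executor's result."""
        results = []
        for step in steps:
            executor = self.executors[step["type"]]
            results.append(executor(step))
        return results
```

A run over two generated steps, one API call and one generic task, would return one result string per step.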

Improvements

  1. Accuracy of the steps and queries generated by the LLM: The current model is not perfect, and it can sometimes generate API calls that contain unwanted tokens (i.e., placeholders rather than real values, or bad JSON strings). Our design calls for an LLM validator (not yet implemented) that checks the generated API calls/queries before they are executed. In retrospect, we could also make the orchestrator smarter: after each step is executed and results are returned, feed those results back to the orchestrator and ask it to update the subsequent steps. This would make the remaining steps more accurate as results come in step by step.

  2. Scalability: As TenX supports more and more data providers, more API documentation will need to be fed into the OpenAI (or any other LLM) context to generate steps and queries, which can become costly over time. However, there is an alternative way to retrieve the APIs relevant to each question: a combination of an LLM and a vector database. A vector database lets us handle a large number of documents without being limited by the context window.
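Improvement 1 above (the not-yet-implemented validator) could start as a cheap pre-check before any LLM call: reject generated API calls that are malformed JSON or still contain placeholder tokens. The marker list and function name below are assumptions for illustration:

```python
import json

# Tokens that suggest the LLM left a placeholder instead of a real value.
# Illustrative list; a real validator would likely add an LLM-based check.
PLACEHOLDER_MARKERS = ("<", ">", "{{", "}}", "TODO")

def validate_generated_call(raw: str) -> bool:
    """Return True only if raw is a JSON object with no placeholder values."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return False  # bad JSON string
    if not isinstance(call, dict):
        return False
    for value in call.values():
        if isinstance(value, str) and any(m in value for m in PLACEHOLDER_MARKERS):
            return False  # unfilled placeholder like "<DOMAIN>" or "{{company}}"
    return True
```

Calls that fail this check could be sent back to the orchestrator for regeneration instead of being executed.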
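The retrieval idea in improvement 2 can be illustrated in miniature: score each API doc snippet against the question and keep only the top matches for the LLM context. A real system would use learned embeddings and a vector database; here bag-of-words cosine similarity stands in for both, and all names are illustrative:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Stand-in embedding: bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def top_docs(question: str, docs: dict[str, str], k: int = 1) -> list[str]:
    """Return the names of the k doc snippets most similar to the question."""
    q = embed(question)
    ranked = sorted(docs, key=lambda name: cosine(q, embed(docs[name])), reverse=True)
    return ranked[:k]
```

Only the retrieved snippets would then be placed in the step-generation prompt, keeping context size roughly constant as the number of providers grows.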
