leloykun/README.md

Hi! I'm Franz Louis Cesista フランズ

Personal Site GitHub LinkedIn Twitter Ponder Curriculum Vitae

Building something 👨‍🍳🚀 • Former Machine Learning (AI) Research Scientist, Full-Stack Software Engineer, & Data Engineer at Expedock Software Inc. • 2x IOI & 2x ICPC World Finalist • Mathematics at the Ateneo de Manila University

At Expedock, I was in charge of researching, building, training, and managing the deployment of hundreds of multi-modal machine learning models fine-tuned for information extraction on semi-structured documents in the logistics industry. More generally, I was also responsible for improving our entire ML system--from making our data collection jobs more robust, to managing our data warehouse and feature stores, to building the charts and dashboards we present to our customers.

Recently, I've also been exploring model inference optimizations at lower levels of abstraction. I know how to implement most machine learning building blocks in C++ (see my implementation of Meta's Llama 2 in C++ and Flash Attention 1 & 2 in CUDA). At Expedock, I also worked on reducing the memory consumption of PyTorch (and its CUDA kernels) so we could run more inference jobs in parallel per GPU instance. Tl;dr: I'm very comfortable working on every level of abstraction in machine learning.

Before Expedock, I studied Mathematics at the Ateneo de Manila University. I also dabbled a lot in competitive programming. In fact, I managed to be a 2-time IOI and a 2-time ICPC World Finalist representing the Philippines.

Tech Stack

| Layer | Tools |
| --- | --- |
| Cloud | AWS, GCP |
| Infra | Docker, Terraform |
| DB | Firebase, PostgreSQL, MySQL, Snowflake, BigQuery, DBT |
| Backend | C++, Python, SQLAlchemy, Alembic |
| API | REST, Flask, GraphQL, Strawberry |
| Frontend | TypeScript, React, Vite, Redux, MUI |
| ML Platform | AWS SageMaker, Weights And Biases, HuggingFace, Modal |
| ML Inference Server | Nvidia Triton, Alembic |
| ML APIs | OpenAI API, AWS Textract, GCP Vision AI |
| ML Frameworks | Keras, PyTorch, Tensorflow, Scikit-Learn, AutoGluon |
| Data Viz | Metabase, Seaborn, Streamlit, VisX |

Research Interests

  • Information Retrieval from Semi-Structured Documents. Research on information retrieval (colloquially, "Search") mostly focuses on purely text-based documents and structured documents--both of which are now largely solved problems. For context, structured documents are PDFs, scanned documents, screenshots of Excel sheets, etc. where (1) the borders of the tables (if present) and (2) the ordering of the word-blocks are very clear. But most real-world documents, especially in the logistics industry, are semi-structured. That is, they are documents where (1) the tables don't have very clear borders (or may even be implicit tables) and/or (2) the word-blocks are scattered all over the place. This is a surprisingly difficult problem, and even the big cloud platforms (GCP, AWS, & Azure) have difficulty handling such documents. But it can be very profitable if you can get it right, which is why Expedock is now a multi-million-dollar startup.
  • ML on Non-Euclidean Geometry. More specifically, I'm interested in embedding high-dimensional data into lower-dimensional non-Euclidean spaces. Although embedding into Euclidean spaces, $\mathbb{R}^n$, is good enough for most cases, there are cases where non-Euclidean spaces might be more appropriate. For example:
    • Embedding hierarchical data such as the phylogenetic tree-representation of single-cell specialization data. Real-world hierarchical data are usually tree-like with near-constant branching factors. Thus, they grow exponentially with respect to the depth (e.g. the $k^{th}$ level of a binary tree has $2^{k}$ nodes). However, the volume of a ball in a Euclidean space, $\mathbb{R}^n$, only grows polynomially with respect to its radius. On the other hand, in negatively-curved spaces such as the Poincaré disk, it grows exponentially. Thus, it's better to embed hierarchical data into them; we just need to be careful with floating-point errors.
    • Embedding complex cyclical data. In my stint at ExoraPH, I used UMAP to uncover the lower-dimensional, torus-like structure of the Philippines' energy supply-and-demand curves.
  • Geometric Deep Learning. I'm interested in unifying various concepts in machine learning through the lens of the Erlangen Program. I'm especially fascinated with the following:
    • How we can derive linear regression, convolution, the attention mechanism, and message-passing from the geometric transformations we want our models to preserve. For example:
      • If we want translation-equivariance, then we have to use convolutions, as they're the only linear operators that commute with translations.
      • If we want color- and shade-invariance, then we can use batch-normalization.
      • If we let the weights of the convolutions be learnable (and depend on the neighbors' weights), then we'd end up with the attention mechanism. And
      • If we generalize the attention mechanism to all graph structures (not just regular graphs), then we'd end up with message-passing.
    • In almost all unsupervised learning models, we just fix two of (a) the manifold $X$, (b) the metric on the manifold $d_X$, and (c) the probability measure $\mu_X$ over the metric space $(X, d_X)$ and then try to estimate the remaining one of the three. For example:
      • In dimensionality reduction, we usually fix $d_{X, p}(x, y) = \sqrt[p]{\sum_i |x_i - y_i|^p}$ and $\mu_X =$ the uniform distribution (such as in UMAP), then try to find a low-dimensional manifold $X$ that preserves the local distances in the original graph as much as possible.
      • In metric learning, we usually fix $X = \mathbb{R}^n$ and $\mu_X =$ the uniform distribution, then try to find $d_X$ such that similar datapoints are close together and dissimilar datapoints are far away from each other. And, finally,
      • In density estimation, we usually fix $X = \mathbb{R}^n$ and $d_{X, p}(x, y) = \sqrt[p]{\sum_i |x_i - y_i|^p}$, then try to find the probability distribution $\mu_X$ of our dataset.
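
To make the hierarchical-embedding point above concrete, here is a minimal sketch in plain Python. It uses the standard closed-form distance on the unit Poincaré disk (curvature $-1$) and compares how fast a circle's circumference grows with its radius in hyperbolic versus Euclidean space. The function names are illustrative only and do not come from any of the projects listed here.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincaré disk."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # Closed form: arcosh(1 + 2 |u - v|^2 / ((1 - |u|^2) (1 - |v|^2)))
    return math.acosh(1 + 2 * sq_diff / ((1 - sq_norm_u) * (1 - sq_norm_v)))


def hyperbolic_circumference(r):
    """Circumference of a circle of hyperbolic radius r (curvature -1)."""
    return 2 * math.pi * math.sinh(r)


def euclidean_circumference(r):
    """Circumference of a circle of Euclidean radius r."""
    return 2 * math.pi * r


if __name__ == "__main__":
    # A binary tree has 2**k nodes at depth k; compare that count with the
    # room available on a circle of radius k in each geometry.
    for depth in (2, 5, 10):
        print(depth, 2 ** depth,
              euclidean_circumference(depth),
              hyperbolic_circumference(depth))
```

The floating-point caveat shows up in the denominator: as points approach the boundary of the disk, `1 - sq_norm_u` gets very close to zero and loses precision, so practical implementations usually clip norms slightly below 1.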

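The relationship between convolutions and translations mentioned in the Geometric Deep Learning bullets can be checked numerically. The sketch below (plain Python, with hypothetical helper names) applies a circular convolution to a 1-D signal and verifies that shifting the input and then convolving gives the same result as convolving and then shifting. A global sum over the output is then unchanged by the shift.

```python
def circular_conv(signal, kernel):
    """Circular (wrap-around) convolution of a 1-D signal with a kernel."""
    n = len(signal)
    return [
        sum(kernel[j] * signal[(i - j) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


def shift(signal, s):
    """Cyclically translate a 1-D signal by s positions."""
    n = len(signal)
    return [signal[(i - s) % n] for i in range(n)]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    k = [1, 0, -1]
    # Convolution commutes with translation (equivariance).
    print(circular_conv(shift(x, 2), k) == shift(circular_conv(x, k), 2))
    # Pooling the convolved signal with a global sum removes the dependence on the shift.
    print(sum(circular_conv(shift(x, 2), k)) == sum(circular_conv(x, k)))
```

Integer inputs are used so that the equality checks are exact. With floating-point signals, a tolerance-based comparison would be more appropriate.
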
If you're interested in collaborating on a research project with me, just email me at franzlouiscesista@gmail.com

Portfolio [WIP]

Please visit my personal website at leloykun.github.io for a more detailed portfolio.

Personal Projects

| Project | Description |
| --- | --- |
| ProgVar Library | ProgVar Library is a collection of algorithms, data structures, and other useful information for competitive programming. It also contains the team notebook our team used to reach the ICPC World Finals twice in a row. I lead the team in maintaining the project. |

Open-source Contributions

| Project | Description |
| --- | --- |
| Llama V2 | A C++ implementation of Meta's Llama 2 generative large language model. I also optimized the original C implementation by adding parallelization on the multi-head attention component. |
| Flash Hyperbolic Attention Minimal | A minimal implementation of Flash Attention 1 & 2 in just ~350 lines of CUDA code. This is still a work in progress, but the ultimate goal is to implement the different variants of Hyperbolic Attention in CUDA. |
| AutoGluon | An AutoML tool that supports multi-modal inputs. I helped trace and squash a bug which prevented interpretability metrics (such as permutation importances) on quantile regressors from being calculated. |
| BERTopic | An automated topic modelling tool. I added customizability options to the visualizations. |
| Hurado | NOI.PH's (the Philippines' National Olympiad of Informatics) online judge and problem manager. I added developer tools. I also help out younger developers in our private Discord server. |

Misc

| Project | Description |
| --- | --- |
| Expedock AutoML Library | Expedock's AutoML Library. Train a model on data from Snowflake with just one line of code and run predictions with another line of code. |

Pinned

  1. mmsg (Public)

    Generate interleaved text and image content in a structured format you can directly pass to downstream APIs.

    Python 25 3

  2. llama2.cpp (Public)

    Forked from karpathy/llama2.c

    Inference Llama 2 in one file of pure C++

    Python 80 9

  3. flash-hyperbolic-attention-minimal (Public)

    Forked from tspeterkim/flash-attention-minimal

    Flash Hyperbolic Attention in ~[...] lines of CUDA

    Cuda 13 1

  4. admu-progvar/progvar-library (Public)

    Competitive programming library and team notebook maintained by AdMU Programming Varsity

    C++ 14 2

  5. shopee-codeleague-2020 (Public)

    Team Bruh's solutions to the Shopee CodeLeague 2020 - 11th Place in all of Southeast Asia

    Jupyter Notebook

  6. booking-demand-prediction (Public archive)

    Geotemporal booking demand prediction for Grab's AI for SEA challenge 2019.

    Jupyter Notebook 5 1