balanz24/serverless_benchmarks


Lithops Serverless Benchmarks

This repository collects serverless benchmarks and pipelines designed to measure the performance of serverless frameworks such as Lithops.

| Benchmark | Description | Data set | Data size |
|---|---|---|---|
| General | | | |
| Lithops Benchmarks | Analyze Lithops compute and storage performance. | Autogenerated | |
| Montecarlo | Monte Carlo methods for computations over large amounts of random data. | Autogenerated | |
| Hyperparameter tuning | Hyperparameter tuning using the grid search algorithm. | Amazon customer reviews (link) | 516.93 MB |
| Geospatial | | | |
| NDVI | Calculate NDVI from Object Storage images. | Sentinel2 satellite image from the AWS Sentinel2 open data repository | |
| Model creation from LiDAR pre-processing | Create terrain models using LiDAR partitioner. | laz files | 431 MB |
| Metabolomics | | | |
| METASPACE | Run the METASPACE metabolite annotation pipeline on cloud resources. | Examples of datasets and databases in the link below | |
| Genomics | | | |
| Variant Calling | Alignment of sequencing reads, stored as FASTQ files, to a reference genome, stored as a FASTA file. | Trypanosome, Human, Bos Taurus (see links below) | 703 MB, 14.184 GB, 17.263 GB |
| Astronomics | | | |
| Astronomica-interferometry | Radio interferometric data processing. | SB205.MS, SB206.MS, SB207.MS, SB208.MS, SB209.MS, SB210.MS | 5.5 GB each sample |

Some benchmarks link to an external repository that contains their code; the code for the others is in this repository.

All workflows use Lithops to deploy and run code on any major cloud serverless platform.

Benchmarks

This benchmark estimates the floating-point performance of the system for matrix multiplication using NumPy. It measures how many floating-point operations per second (FLOPS) the system achieves for this operation.
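As a rough illustration of that measurement, the sketch below times a square matrix multiplication with NumPy and divides an approximate operation count by the elapsed time. The function name `estimate_matmul_flops`, the matrix size, and the `2 * n**3` operation count are assumptions made for this example; the benchmark in the repository may use different sizes and run the work inside Lithops functions.

```python
import time

import numpy as np


def estimate_matmul_flops(n=256, repeats=3, seed=0):
    """Estimate FLOPS for an n x n matrix multiplication (illustrative only)."""
    rng = np.random.default_rng(seed)
    a = rng.random((n, n))
    b = rng.random((n, n))

    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        np.dot(a, b)
        best = min(best, time.perf_counter() - start)

    # Assumption: count roughly n^3 multiplications plus n^3 additions.
    flop_count = 2 * n ** 3
    # Guard against a zero reading from the timer on very small inputs.
    return flop_count / max(best, 1e-12)


if __name__ == "__main__":
    print(f"{estimate_matmul_flops() / 1e9:.2f} GFLOPS")
```

Taking the best of several repeats reduces the influence of warm-up and scheduling noise on a single run.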

This benchmark contains two applications in which Monte Carlo methods are used to compute over large amounts of random data using cloud functions with Lithops.
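To show the general shape of a parallel Monte Carlo computation, here is a minimal sketch that estimates pi by sampling random points. Estimating pi is an assumption chosen for illustration and is not necessarily one of the two applications in the repository; `count_hits` and `estimate_pi` are hypothetical names. The tasks are run with Python's built-in `map` so the example works without any cloud setup.

```python
import random


def count_hits(task):
    """Count random points that fall inside the unit quarter circle."""
    seed, samples = task
    rng = random.Random(seed)  # one independent stream per task
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(workers=8, samples_per_worker=20000):
    """Combine the per-task counts into a single estimate of pi."""
    tasks = [(seed, samples_per_worker) for seed in range(workers)]
    # Local stand-in for a serverless map: each task is independent,
    # so the same function could be dispatched to separate cloud functions.
    total_hits = sum(map(count_hits, tasks))
    return 4.0 * total_hits / (workers * samples_per_worker)


if __name__ == "__main__":
    print(estimate_pi())
```

Because each task only needs its seed and sample count, the built-in `map` could plausibly be swapped for a Lithops `FunctionExecutor.map` call followed by `get_result`, with the results summed in the same way.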

Pre-processing of Sentinel2 images to enable serverless massive parallel processing with many workers consuming data from Object Storage using the Cloud-Optimized GeoTIFF format.

A use case of serverless image processing that consumes data from Object Storage: the NDVI (Normalized Difference Vegetation Index) is calculated over many images to demonstrate high throughput and performance.
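For reference, NDVI is defined per pixel as (NIR - Red) / (NIR + Red). The sketch below applies that formula to two NumPy arrays; the function name `ndvi` and the choice to return 0 where both bands are 0 are assumptions for this example, and reading the bands from Object Storage is left out.

```python
import numpy as np


def ndvi(red, nir):
    """Compute NDVI = (NIR - Red) / (NIR + Red) element-wise."""
    red = np.asarray(red, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    denom = nir + red
    out = np.zeros_like(denom)
    # Assumption: pixels where both bands are 0 are reported as 0
    # instead of raising a division warning.
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out


if __name__ == "__main__":
    red_band = [[0.1, 0.0], [0.2, 0.4]]
    nir_band = [[0.3, 0.0], [0.6, 0.4]]
    print(ndvi(red_band, nir_band))
```

For Sentinel-2 imagery the red and near-infrared inputs usually correspond to bands B04 and B08.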

LIDAR is a novel tool to partition LiDAR files based on the density of points. The partitions are similar in size, which is convenient for serverless processing because task granularity determines execution time and cost. With this partitioned data we create several terrain models used in many geospatial workflows. We study the impact of load balancing by partitioning LiDAR data using the aforementioned density-based partitioner.
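The idea of partitions with similar point counts can be sketched as follows. This is not the partitioner used in the repository: it is a simplified stand-in that sorts points along the x axis and cuts the sorted order into nearly equal groups, so dense areas get narrow partitions and sparse areas get wide ones. The name `partition_by_density` is hypothetical.

```python
import numpy as np


def partition_by_density(points, n_parts):
    """Split points into n_parts strips along x with near-equal point counts."""
    points = np.asarray(points, dtype=np.float64)
    # Sort by the x coordinate so each partition covers a contiguous strip.
    order = np.argsort(points[:, 0], kind="stable")
    # array_split yields groups whose sizes differ by at most one.
    return [points[idx] for idx in np.array_split(order, n_parts)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic cloud that is much denser near x = 0 (assumption for the demo).
    cloud = np.column_stack([rng.exponential(1.0, 1000), rng.random(1000)])
    for part in partition_by_density(cloud, 4):
        print(len(part), part[:, 0].min(), part[:, 0].max())
```

Each partition can then be handed to one serverless task, which keeps the work per task roughly balanced.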

Demonstrates using Lithops to run the METASPACE metabolite annotation pipeline on cloud resources. METASPACE is a spatial metabolomics cloud platform that conducts molecular annotation of imaging mass spectrometry data.

In genomics, variant calling entails the alignment process, which is essentially a search for string similarities. This process aligns sequencing reads, typically stored as FASTQ files, with a reference genome, which is stored as a FASTA file. The reference genome and reads are split into smaller chunks for alignment.
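The chunking step can be sketched as below, assuming the common FASTQ layout of four lines per read and a reference sequence already loaded as a plain string. The helper names `fastq_chunks` and `sequence_chunks` and the chunk sizes are hypothetical; the actual pipeline may split the data differently, for example with overlapping reference chunks.

```python
from itertools import islice


def fastq_chunks(lines, reads_per_chunk):
    """Yield lists of FASTQ lines, each holding up to reads_per_chunk reads."""
    it = iter(lines)
    while True:
        # Assumption: every read occupies exactly four lines.
        chunk = list(islice(it, 4 * reads_per_chunk))
        if not chunk:
            return
        yield chunk


def sequence_chunks(sequence, chunk_size):
    """Yield (offset, subsequence) pairs covering the reference sequence."""
    for start in range(0, len(sequence), chunk_size):
        yield start, sequence[start:start + chunk_size]


if __name__ == "__main__":
    reads = ["@r1", "ACGT", "+", "IIII", "@r2", "GGCC", "+", "IIII"]
    print(list(fastq_chunks(reads, 1)))
    print(list(sequence_chunks("ACGTACGTAC", 4)))
```

Keeping the offset with each reference chunk lets alignment positions found within a chunk be mapped back to coordinates in the full genome.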

Processes radio interferometric data through all of its phases (rebinning, calibration and imaging) using Lithops.
