GSoC 2024 Project Ideas

henry senyondo edited this page Mar 20, 2024 · 8 revisions

Please ask questions through issues on the respective project's repo.

Tags available: @henrykironde, @bw4sz, @ethanwhite

  • Preferred names (Henry, Ben, Ethan)
  • Preferred_greeting (Hi|Hello|Dear|Thanks|Thank you [First_name])

The code of conduct should be your first read.

Proposal: Developing an Advanced Image Recognition Model for Bird Nest Detection in UAV Imagery

Rationale

The goal is to create a robust image recognition model that accurately detects nests in UAV (Unmanned Aerial Vehicle) imagery of the Everglades ecosystem. We currently have nest annotations but no dedicated nest model for this region.

Approach

  1. Curate Test Dataset: Collect and curate a comprehensive test dataset to assess the accuracy of the proposed image recognition model.

  2. Between-Week Accuracy Measurement: Evaluate the model's performance by measuring its accuracy in detecting nests across different weeks. This will provide insights into the model's consistency over time.

  3. Raw Imagery Analysis: Explore raw UAV imagery for potential egg detections, a task not feasible in orthophotos due to blur. Utilize this information for future model refinement and annotation mining.
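As a minimal sketch of the between-week measurement in step 2, the Python below turns per-week match tallies into precision and recall, then summarizes how much recall varies from week to week. The function names, the tally format, and the numbers are illustrative assumptions, not part of DeepForest or the project's data.

```python
from statistics import mean, pstdev


def weekly_scores(tallies):
    """Return {week: (precision, recall)} from per-week match tallies.

    `tallies` maps a week label to (true_pos, false_pos, false_neg).
    This input format is an assumption made for the sketch.
    """
    scores = {}
    for week, (tp, fp, fn) in tallies.items():
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores[week] = (precision, recall)
    return scores


def recall_spread(scores):
    """Mean and population standard deviation of recall across weeks."""
    recalls = [recall for _, recall in scores.values()]
    return mean(recalls), pstdev(recalls)


# Made-up tallies for three survey weeks (illustrative only).
example = {
    "week_1": (40, 10, 10),
    "week_2": (30, 10, 20),
    "week_3": (45, 5, 5),
}
scores = weekly_scores(example)
avg_recall, sd_recall = recall_spread(scores)
```

A small standard deviation relative to the mean would suggest the model behaves consistently across weeks; a large one points to weeks that deserve a closer look.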

Expected Outcomes:

  • A state-of-the-art image recognition model specifically designed for nest detection in Everglades UAV imagery.
  • Improved accuracy and consistency, validated through between-week accuracy assessments.
  • Identification of egg detections in raw imagery, contributing to the enhancement of the model and expanding the dataset for future research.

This proposal aims to address the existing gap in nest detection capabilities within the Everglades region, providing valuable insights into the avian population and ecosystem dynamics through advanced image recognition technology.

Source Code: DeepForest Associated Code:

Degree of difficulty

  • Intermediate, long (350 hours)

Skills:

  • git/GitHub
  • Machine learning
  • Software testing
  • Python and Python package deployment

Mentors

  • @bw4sz
  • @henrysenyondo
  • @ethanwhite

Proposal: Advancing Bird Detection and Classification in Hand-Held Airborne Imagery

Rationale

Fine-tune an existing UAV-based model for bird classification, specifically targeting six bird species within the Everglades ecosystem. This project aims to enhance accuracy and performance in identifying avian species from images captured by an observer from a low-flying aircraft.

Approach:

Model Fine-Tuning:

  • Implement fine-tuning techniques on the existing UAV-based model to optimize it for bird classification within the Everglades ecosystem.

Data Annotation and Extraction:

Use imagery captured from low-flying piloted aircraft that has not yet been labeled for model training but already carries Photoshop annotations. Develop a script to extract these annotations and turn them into labeled data for model training.

Test Dataset Creation:

Curate a comprehensive test dataset to evaluate the accuracy of the fine-tuned model. The focus will be on comparing bird count accuracy across the six specified species.
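One way to frame the per-species count comparison is sketched below in Python, assuming each detection has been reduced to a species label. The function name and the placeholder species names are hypothetical; the six target species are not listed here, so none are assumed.

```python
from collections import Counter


def count_errors(true_labels, predicted_labels):
    """Return {species: (true_count, predicted_count, difference)}.

    The difference is predicted minus true, so a negative value
    means the model undercounted that species.
    """
    true_counts = Counter(true_labels)
    pred_counts = Counter(predicted_labels)
    species = sorted(set(true_counts) | set(pred_counts))
    return {
        name: (
            true_counts[name],
            pred_counts[name],
            pred_counts[name] - true_counts[name],
        )
        for name in species
    }


# Placeholder labels (illustrative only).
true_labels = ["species_a"] * 5 + ["species_b"] * 3
predicted_labels = ["species_a"] * 4 + ["species_b"] * 4 + ["species_c"]
errors = count_errors(true_labels, predicted_labels)
```

Counts alone can hide compensating mistakes (one missed bird offset by one false detection), so a fuller evaluation would likely pair this with box-level matching.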

Integration with Annotation Platform:

Incorporate the deepforest model into the annotation platform set up on Label Studio. This integration streamlines the annotation process and supports collaboration for continuous model improvement.

Expected Outcomes:

  • A refined and optimized deepforest model for accurate bird detection and classification in hand-held airborne imagery.
  • An annotated dataset for training and testing, contributing to the improvement of the model's performance.
  • Comparative analysis of bird count accuracy by species, providing valuable insights into the model's effectiveness.

This proposal seeks to advance bird detection capabilities, providing a valuable tool for monitoring and understanding bird populations within the Everglades ecosystem using hand-held airborne imagery.

Source Code: DeepForest Associated Code:

Degree of difficulty

  • Intermediate, long (350 hours)

Skills:

  • git/GitHub
  • Machine learning
  • Software testing
  • Python and Python package deployment

Mentors

  • @bw4sz
  • @henrysenyondo
  • @ethanwhite

Proposal: Modernizing Tree Detection with Advanced Object Detection Models

Rationale:

Evaluate and update the existing one-stage object detection backbone in the DeepForest model for tree detection, and assess the viability of transitioning to vision transformers. Using existing training, test, and semi-supervised annotations, this project will compare deep learning architectures that have been developed since our initial model.

Approach:

  1. Update One-Stage Object Detection Backbone: Evaluate the outdated one-stage object detection backbone used for tree detection in the DeepForest model, considering advances in object detection methods since 2017.

  2. Explore Transformers for Image Representation: Investigate vision transformers and other modern architectures. Leverage a large amount of unsupervised weak labels for tree detection alongside traditional supervised classification.

  3. Train and Compare Two-Stage Detectors: Use the torchvision library to train two-stage detectors and compare them against the existing deepforest backbone. Evaluate their performance on tree detection tasks.

  4. Organize Training and Test Data: Systematically organize training and test datasets to facilitate comprehensive evaluation and comparison of the updated models.
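To make the comparison in steps 3 and 4 concrete, here is a hedged Python sketch that scores two detectors on the same ground-truth boxes using intersection over union (IoU) and greedy one-to-one matching. The box format, the 0.5 threshold, and all names are assumptions for illustration; this is not the evaluation code that DeepForest ships.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = overlap_w * overlap_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, truths, threshold=0.5):
    """Greedily match each prediction to its best unmatched truth box.

    Predictions are taken in the order given; a fuller evaluation
    would usually sort them by confidence score first.
    """
    unmatched = list(truths)
    true_pos = 0
    for box in predictions:
        best = max(unmatched, key=lambda t: iou(box, t), default=None)
        if best is not None and iou(box, best) >= threshold:
            unmatched.remove(best)
            true_pos += 1
    precision = true_pos / len(predictions) if predictions else 0.0
    recall = true_pos / len(truths) if truths else 0.0
    return precision, recall


# Made-up boxes for one image (illustrative only).
truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
model_a = [(1, 1, 10, 10), (50, 50, 60, 60)]
model_b = [(0, 0, 10, 10), (21, 21, 30, 30)]
results = {
    "model_a": precision_recall(model_a, truths),
    "model_b": precision_recall(model_b, truths),
}
```

Running every candidate architecture through the same function on the same organized test set keeps the comparison like for like.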

Expected outcomes:

  • An updated and optimized DeepForest model with a modernized one-stage object detection backbone.
  • Insights into the potential benefits of transformers for tree detection compared to traditional methods.
  • Comparative analysis of two-stage detectors against the existing deepforest backbone, providing a basis for model selection.

This proposal aims to enhance tree detection capabilities by embracing contemporary object detection models and methodologies, ultimately improving accuracy and efficiency in the analysis of tree-related imagery.

Source Code: DeepForest Associated Code:

Degree of difficulty

  • Intermediate, long (350 hours)

Skills:

  • git/GitHub
  • Machine learning
  • Software testing
  • Python and Python package deployment

Mentors

  • @bw4sz
  • @henrysenyondo
  • @ethanwhite

Proposal: Optimizing Forecasting: High-Performance Parallel Computing for Model Fitting and Prediction in the Portalcasting R package

Rationale

Portalcasting, an open-source R package, aids in ecological forecasting of biodiversity within a long-term ecological research program that has studied desert biodiversity for over 45 years. The package facilitates automated data integration and modular models for generating forecasts across various ecological outcomes. At present, the forecasting system runs its many forecasts sequentially. This project aims to parallelize the codebase so that forecasts can run concurrently on multiple cores, on both individual machines and HPC systems.

Approach

Portalcasting relies on supporting packages like PortalData and Portalr. PortalData contains all Portal project data, while Portalr offers functions for summarizing this data. The portalcasting package integrates PortalData and Portalr into a streamlined pipeline, used by portal-forecasts. The forecast results are displayed on the interactive dashboard. Currently, the forecast takes about four hours, with 98% of the time consumed by the portalcast() function. Our aim is to reduce the time by enabling parallel execution of the function, considering the shared data used by all models.
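The pattern described above (load the shared data once, then run independent model fits concurrently) can be sketched in a language-neutral way. The Python below is only an illustration of that pattern: the actual work would be done in R inside portalcasting, and every function and model name here is a hypothetical stand-in. A thread pool is used so the sketch runs anywhere; for CPU-bound model fitting a process pool would be the natural swap.

```python
from concurrent.futures import ThreadPoolExecutor


def fit_and_forecast(model_name, shared_data):
    """Hypothetical stand-in for fitting one model and forecasting.

    The placeholder "model" simply forecasts the mean of the series.
    """
    return model_name, sum(shared_data) / len(shared_data)


def run_all(model_names, shared_data, executor):
    """Submit one task per model and collect forecasts by model name."""
    futures = [
        executor.submit(fit_and_forecast, name, shared_data)
        for name in model_names
    ]
    return dict(future.result() for future in futures)


# Shared data is prepared once and only read by each task.
shared = [2.0, 4.0, 6.0]
with ThreadPoolExecutor(max_workers=2) as pool:
    forecasts = run_all(["model_a", "model_b", "model_c"], shared, pool)
```

Because each task only reads the shared data, no locking is needed. The main design question for the real project is how that data reaches each worker without being copied or reloaded for every model.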

Source Code: https://github.com/weecology/portalcasting

Degree of difficulty

  • Intermediate, long (350 hours)

Skills:

  • R programming
  • Knowledge of designing and implementing parallel algorithms
  • Parallelization frameworks in R, such as 'parallel' or 'future' packages
  • High-Performance Computing (HPC) Knowledge
  • git/GitHub
  • Software Development

Expected outcomes

  • A parallelized forecasting pipeline that significantly reduces the current runtime of about four hours.

Mentors

  • @henrysenyondo
  • @ethanwhite