Merge pull request #105 from taorye/dev
feat: app gesture classifier
Neutree authored Jan 16, 2025
2 parents 9d6dfe9 + b495106 commit 2b6050e
Showing 16 changed files with 489 additions and 2 deletions.
Binary file added docs/doc/assets/handposex_14class.jpg
2 changes: 2 additions & 0 deletions docs/doc/en/sidebar.yaml
Original file line number Diff line number Diff line change
@@ -85,6 +85,8 @@ items:
label: Hand landmarks
- file: vision/body_pose_classification.md
label: Human pose classifier
- file: vision/hand_gesture_classification.md
label: Hand gesture classifier
- file: vision/maixhub_train.md
label: MaixHub online AI training
- file: vision/customize_model_yolov5.md
2 changes: 1 addition & 1 deletion docs/doc/en/video/uvc_streaming.md
@@ -68,7 +68,7 @@ uvcs.show(img)
This approach offers high performance with a single-process implementation, but USB functionality will only be available when the process is running. Therefore, when stopping this process, it's important to note that the enabled `Rndis` and `NCM` functionalities will temporarily become inactive, causing a network disconnection.

**Reference example source code path:**
-`MaixPy/examples/vision/streaming/uvc_stream.py`
+`MaixPy/examples/vision/streaming/uvc_server.py`

**Also packaged as an app source code path:**
`MaixCDK/projects/app_uvc_camera/main/src/main.cpp`
46 changes: 46 additions & 0 deletions docs/doc/en/vision/hand_gesture_classification.md
@@ -0,0 +1,46 @@
---
title: MaixCAM MaixPy Hand Gesture Classification Based on Hand Keypoint Detection
---

## Introduction

This guide describes how MaixCAM with MaixPy classifies hand gestures based on the results of hand keypoint detection.

The dataset currently used is the `14-class static hand gesture dataset`, which contains 2850 samples divided into 14 categories.
[Dataset Download Link (Baidu Netdisk, Password: 6urr)](https://pan.baidu.com/s/1Sd-Ad88Wzp0qjGH6Ngah0g)

![](../../assets/handposex_14class.jpg)

This app is implemented in `MaixPy/projects/app_hand_gesture_classifier/main.py`, and the main logic is as follows:

1. Load the `14-class static hand gesture dataset` as preprocessed by the **Hand Keypoint Detection** model, that is, the `20` keypoint coordinate offsets relative to the wrist.
2. Train initially on the first `4` classes only, to support basic gesture recognition.
3. Use the **Hand Keypoint Detection** model to process the camera input and visualize classification results on the screen.
4. Tap the top-right `class14` button to add more samples and retrain the model for full `14-class` gesture recognition.
5. Tap the bottom-right `class4` button to remove the added samples and retrain the model back to the `4-class` mode.
6. Tap the small area between the buttons to display the last training duration at the top of the screen.
7. Tap the remaining large area to show the currently supported gesture classes on the left side—**green** for supported, **yellow** for unsupported.
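
The feature extraction mentioned in step 1 can be sketched as follows. This is a minimal illustration, not the app's actual code: it assumes the keypoint model returns 21 `(x, y)` points with the wrist at index 0, and the function name `wrist_relative_offsets` is hypothetical. Check `main.py` for the real layout.

```python
import numpy as np

def wrist_relative_offsets(keypoints):
    """Return the offsets of the 20 non-wrist keypoints relative to the wrist.

    Assumption: `keypoints` is a (21, 2) array-like with the wrist at index 0.
    """
    pts = np.asarray(keypoints, dtype=float)
    if pts.shape != (21, 2):
        raise ValueError("expected 21 (x, y) keypoints")
    wrist = pts[0]
    offsets = pts[1:] - wrist      # (20, 2): independent of where the hand is
    return offsets.reshape(-1)     # flat feature vector of length 40


# Usage: translating the whole hand leaves the features unchanged.
hand = np.arange(42, dtype=float).reshape(21, 2)
features = wrist_relative_offsets(hand)
shifted = wrist_relative_offsets(hand + 100.0)
```

Subtracting the wrist makes the features independent of the hand's position in the frame, which is presumably why the dataset stores offsets rather than absolute coordinates.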

## Demo Video

<video playsinline controls autoplay loop muted preload src="/static/video/hand_gesture_demo.mp4" type="video/mp4">
Classifier Result Video
</video>

1. The video shows the `14-class` mode after step `4` has been executed. It recognizes gestures `1-10` (labeled by default with other English meanings), **OK**, **thumbs up**, **finger heart** (requires showing the back of the hand, which is hard to demonstrate in the video but can be verified yourself), and **pinky stretch**, for a total of `14` gestures.

2. Then, step `5` is executed, reverting to the `4-class` mode, where only gestures **1**, **5**, **10** (fist), and **OK** are recognizable. Other gestures fail to produce correct results. During this process, step `7` was also executed, showing the current `4-class` mode—only the first 4 gestures are green, and the remaining 10 are yellow.

3. Step `4` is executed again, restoring the `14-class` mode, and previously unrecognized gestures in the `4-class` mode are now correctly identified.

4. Finally, dual-hand recognition is demonstrated, and both hands' gestures are accurately recognized simultaneously.

## Others

The demo video captures the **maixvision** screen preview window in the top-right corner, matching the actual on-screen display.

For detailed implementation, please refer to the source code and the above analysis.

Further development or modification can be directly done based on the source code, which includes comments for guidance.

If you need additional assistance, feel free to leave a message on **MaixHub** or send an email to the official company address.
2 changes: 2 additions & 0 deletions docs/doc/zh/sidebar.yaml
@@ -85,6 +85,8 @@ items:
label: Hand keypoint detection
- file: vision/body_pose_classification.md
label: Human pose classifier
- file: vision/hand_gesture_classification.md
label: Hand gesture classifier
- file: vision/maixhub_train.md
label: MaixHub online AI model training
- file: vision/customize_model_yolov5.md
2 changes: 1 addition & 1 deletion docs/doc/zh/video/uvc_streaming.md
@@ -61,7 +61,7 @@ uvcs.show(img)

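This is a high-performance single-process implementation, but full USB functionality is only available while the process is running. When stopping the process, note that the still-enabled `Rndis` and `NCM` functions will temporarily stop working and the network connection will drop.

-Reference example source code path: `MaixPy/examples/vision/streaming/uvc_stream.py`
+Reference example source code path: `MaixPy/examples/vision/streaming/uvc_server.py`

Also packaged as an app, source code path: `MaixCDK/projects/app_uvc_camera/main/src/main.cpp`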

50 changes: 50 additions & 0 deletions docs/doc/zh/vision/hand_gesture_classification.md
@@ -0,0 +1,50 @@
---
title: MaixCAM MaixPy Hand Gesture Classification Based on Hand Keypoint Detection Results
---


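## Introduction

`MaixCAM MaixPy Hand Gesture Classification Based on Hand Keypoint Detection Results` can classify hand gestures.

The dataset currently used is the `14-class static hand gesture dataset` [Dataset download link (Baidu Netdisk, Password: 6urr)](https://pan.baidu.com/s/1Sd-Ad88Wzp0qjGH6Ngah0g), which contains 2850 samples in 14 classes.


![](../../assets/handposex_14class.jpg)


The app is implemented in `MaixPy/projects/app_hand_gesture_classifier/main.py`. Its main logic is:

1. Load the `14-class static hand gesture dataset` as processed by `hand keypoint detection`, that is, the `20` coordinate offsets relative to the wrist
2. Initially train on the first `4` classes to support gesture recognition
3. Load the `hand keypoint detection` model to process the camera input, and visualize the classifier's results on the screen
4. Tap `class14` in the top-right corner to add the samples of the remaining classes and retrain, reaching `14`-class gesture recognition
5. Tap `class4` in the bottom-right corner to remove the samples added in the previous step and retrain, returning to `4`-class gesture recognition
6. Tap the small area between the buttons to show the duration of the classifier's last training at the top of the screen
7. Tap the remaining large area to show the currently supported classes on the left; green means supported, yellow means unsupported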



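## Demo Video
<video playsinline controls autoplay loop muted preload src="/static/video/hand_gesture_demo.mp4" type="video/mp4">
Classifier Result video
</video>

1. The video shows the `14`-class mode after step `4` above has been executed. It can recognize gestures `1-10` (which by default correspond to other English meanings), ok, thumbs up, finger heart (requires the back of the hand, which was hard to demonstrate while filming; you can verify it yourself), and pinky stretch, `14` gestures in total.

2. Step `5` is executed next, falling back to the `4`-class mode, in which only 1, 5, 10 (fist), and ok can be recognized; the other gestures no longer produce correct results. Step `7` is also executed along the way, showing that the app is in `4`-class mode: only the first 4 gestures are green and the remaining 10 are yellow.

3. Step `4` is then executed again, restoring the `14`-class mode, and the gestures that could not be recognized in `4`-class mode are recognized correctly again.

4. The end of the video shows two-hand recognition; in testing, the gestures of both hands are recognized correctly at the same time.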


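## Others

The demo video was captured from the screen preview window in the top-right corner of maixvision, and matches what is actually shown on the screen.

For implementation details, see the source code and the analysis above.

Further development or modification can also be done directly on the source code, which includes comments.

If you still need assistance, you can post a message on maixhub or send an email to the company mailbox.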
Binary file added docs/static/video/hand_gesture_demo.mp4
Binary file not shown.
5 changes: 5 additions & 0 deletions projects/app_hand_gesture_classifier/.gitignore
@@ -0,0 +1,5 @@

build
dist
/CMakeLists.txt

184 changes: 184 additions & 0 deletions projects/app_hand_gesture_classifier/LinearSVC.py
@@ -0,0 +1,184 @@
import numpy as np

class LinearSVC:
    class StandardScaler:
        mean: np.ndarray
        std: np.ndarray

        def transform(self, X):
            return (X - self.mean) / self.std

        def fit_transform(self, X):
            self.mean = np.mean(X, axis=0)
            self.std = np.std(X, axis=0)
            return self.transform(X)

    def __init__(self, C=1.0, learning_rate=0.01, max_iter=1000):
        self.C = C
        self.learning_rate = learning_rate
        self.max_iter = max_iter
        self.scaler = self.StandardScaler()

    def save(self, filename: str):
        np.savez(filename,
            C = self.C,
            learning_rate = self.learning_rate,
            max_iter = self.max_iter,
            scaler_mean = self.scaler.mean,
            scaler_std = self.scaler.std,
            classes = self.classes,
            _W = self._W,
            _B = self._B,
        )

    @classmethod
    def load(cls, filename: str):
        npzfile = np.load(filename)
        self = cls(
            C=float(npzfile["C"]),
            learning_rate=float(npzfile["learning_rate"]),
            # max_iter is passed to range() during training, so it must be an int
            max_iter=int(npzfile["max_iter"])
        )
        self.scaler.mean = npzfile["scaler_mean"]
        self.scaler.std = npzfile["scaler_std"]
        self.classes = npzfile["classes"]
        self._W = npzfile["_W"]
        self._B = npzfile["_B"]
        return self

    def _train_binary_svm(self, X, y):
        """
        Train a binary SVM.
        """
        n_samples, n_features = X.shape
        w = np.zeros(n_features)
        b = 0
        for _ in range(self.max_iter):
            scores = np.dot(X, w) + b  # predicted scores for all samples
            margin = y * scores  # (n_samples,) margin of each sample
            mask = margin < 1  # samples that violate the margin; these act as the support vectors
            X_support = X[mask]  # support vectors
            y_support = y[mask]  # labels of the support vectors
            if len(X_support) > 0:  # vectorized update rule
                w -= self.learning_rate * (2 * w / n_samples - self.C * np.dot(X_support.T, y_support))  # batch update of w
                b -= self.learning_rate * (-self.C * np.sum(y_support))  # batch update of b
        return w, b

    def fit(self, X, y):
        """
        Train a multi-class SVM.
        Parameters:
        - X: feature matrix of shape (n_samples, n_features)
        - y: label array of shape (n_samples,), containing multiple classes
        """
        self.classes = np.unique(y)  # extract all classes
        self._W = np.zeros((len(self.classes), X.shape[1]))
        self._B = np.zeros(len(self.classes))
        for i, cls in enumerate(self.classes):
            binary_y = np.where(y == cls, 1, -1)  # build one-vs-rest labels
            w, b = self._train_binary_svm(X, binary_y)
            self._W[i] = w
            self._B[i] = b

    def forward(self, X):
        return np.dot(X, self._W.T) + self._B

    def predict(self, X):
        return self.classes[np.argmax(self.forward(X), axis=1)]  # return the highest-scoring class

    def predict_with_confidence(self, X):
        def softmax(x):
            x_max = np.max(x, axis=-1, keepdims=True)  # subtract the maximum for numerical stability
            exp_x = np.exp(x - x_max)
            return exp_x / np.sum(exp_x, axis=-1, keepdims=True)
        res = self.forward(X)  # (n_samples, n_classes)
        confidences = softmax(res)  # (n_samples, n_classes)
        return self.classes[np.argmax(res, axis=1)], np.max(confidences, axis=1)  # highest-scoring class and its confidence


class LinearSVCManager:
    def __init__(self, clf: LinearSVC = None, X=None, Y=None, pretrained=False):
        if X is None:
            X = np.empty((0, 0))
        if Y is None:
            Y = np.empty((0,))

        # Convert lists to NumPy arrays
        if isinstance(X, list):
            X = np.array(X)
        if isinstance(Y, list):
            Y = np.array(Y)

        # Type checks
        if not isinstance(X, np.ndarray):
            raise TypeError("X must be a list or numpy array.")
        if not isinstance(Y, np.ndarray):
            raise TypeError("Y must be a list or numpy array.")

        if len(X) != len(Y):
            raise ValueError("Length of X and Y must be equal.")
        if len(Y) == 0:
            raise ValueError("A classifier (clf) must be provided with training samples X and Y.")

        # Create the default classifier here rather than in the signature,
        # so that managers never share one default instance.
        if clf is None:
            if pretrained:
                raise ValueError("A pretrained classifier (clf) can't be `None`.")
            clf = LinearSVC()

        self.clf = clf
        self.samples = (X, Y)

        if not pretrained:
            self.train()

    def train(self):
        X_scaled = self.clf.scaler.fit_transform(self.samples[0])
        self.clf.fit(X_scaled, self.samples[1])
        print(f"{len(self.samples[1])} samples have been trained.")

    def test(self, X):
        X = np.array(X)
        if X.shape[-1] != self.samples[0].shape[1]:
            raise ValueError("Tested data dimension mismatch.")
        X_scaled = self.clf.scaler.transform(X)
        return self.clf.predict_with_confidence(X_scaled)

    def add(self, X, Y):
        X = np.array(X)
        Y = np.array(Y)

        if X.shape[-1] != self.samples[0].shape[1]:
            raise ValueError("Added data dimension mismatch.")

        if len(self.samples[0]) > 0:
            self.samples = (
                np.vstack([self.samples[0], X]),
                np.concatenate([self.samples[1], Y])
            )
        else:
            self.samples = (X, Y)

        self.train()

    def rm(self, indices):
        X, Y = self.samples

        if any(idx < 0 or idx >= len(Y) for idx in indices):
            raise IndexError("Index out of bounds.")

        mask = np.ones(len(Y), dtype=bool)
        mask[indices] = False

        self.samples = (X[mask], Y[mask])

        if len(self.samples[1]) > 0:
            self.train()
        else:
            print("Warning: All data has been removed. Model is untrained now.")

    def clear_samples(self):
        self.samples = (np.empty((0, self.samples[0].shape[1])), np.empty((0,)))
        print("All training samples have been cleared.")
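
The training scheme in `LinearSVC.py` (standardize, train one hinge-loss classifier per class, take the arg-max score, and use softmax for a confidence value) can be tried in isolation with the sketch below. It is a standalone illustration that mirrors the update rule above rather than importing the file; the toy clusters and the function names `train_binary_svm` and `softmax` are made up for this example.

```python
import numpy as np

def train_binary_svm(X, y, C=1.0, lr=0.01, max_iter=1000):
    """Batch subgradient descent on a regularised hinge loss; y must be in {-1, +1}."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(max_iter):
        mask = y * (X @ w + b) < 1             # samples violating the margin
        if mask.any():
            w -= lr * (2 * w / n_samples - C * X[mask].T @ y[mask])
            b -= lr * (-C * y[mask].sum())
    return w, b

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)      # subtract the max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Toy data: three well-separated 2-D clusters (illustrative only).
rng = np.random.default_rng(0)
centers = np.array([[-4.0, 0.0], [4.0, 0.0], [0.0, 6.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((20, 2)) for c in centers])
y = np.repeat(np.arange(3), 20)

# Standardise the features, as StandardScaler.fit_transform does.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)

# One-vs-rest: one binary classifier per class.
classes = np.unique(y)
W = np.zeros((len(classes), Xs.shape[1]))
B = np.zeros(len(classes))
for i, c in enumerate(classes):
    W[i], B[i] = train_binary_svm(Xs, np.where(y == c, 1, -1))

scores = Xs @ W.T + B                          # (n_samples, n_classes)
pred = classes[scores.argmax(axis=1)]
conf = softmax(scores).max(axis=1)
print("training accuracy:", (pred == y).mean())
```

Note that the softmax output is only a normalized score, not a calibrated probability, since the SVM scores are raw margins.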
15 changes: 15 additions & 0 deletions projects/app_hand_gesture_classifier/README.md
@@ -0,0 +1,15 @@
The touchscreen is segmented into four sections:

1. The first two are circles located in the upper-right and lower-right corners.

2. The third section is the area between these two circles.

3. The fourth section is the largest, covering the entire left area.

Pressing each section has the following effect:

1. The circles are activated when you release the touch without moving away from them.

2. The area between the circles shows the elapsed time of the last training session.

3. The left area shows the number of active classes.
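
The four-way split described above can be sketched as a simple hit test. This is an illustration only: the geometry (circles of a given `radius` touching the right edge, with the strip being the rest of that right-hand column) and the function name `classify_touch` are assumptions, not taken from `main.py`.

```python
from math import hypot

def classify_touch(x, y, width, height, radius):
    """Map a touch point to one of the four screen sections.

    Assumed layout: both circles are centred `radius` pixels in from the
    right edge, one at the top and one at the bottom; any other point in the
    right-hand column of width 2 * radius counts as the strip between them.
    """
    cx = width - radius
    if hypot(x - cx, y - radius) <= radius:
        return "class14"        # upper-right circle
    if hypot(x - cx, y - (height - radius)) <= radius:
        return "class4"         # lower-right circle
    if x >= width - 2 * radius:
        return "train_time"     # area between the two circles
    return "class_list"         # large area on the left


# Usage with a hypothetical 640x480 screen and 40-pixel circles.
print(classify_touch(600, 40, 640, 480, 40))
print(classify_touch(100, 240, 640, 480, 40))
```

Testing the circles first keeps the strip check simple, because any point that reaches it is already known to lie outside both buttons.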
14 changes: 14 additions & 0 deletions projects/app_hand_gesture_classifier/app.yaml
@@ -0,0 +1,14 @@
id: gesture_classifier
name: Gesture Classifier
name[zh]: 手势分类
version: 1.0.0
author: Taorye@Sipeed
icon: icon.png
desc: Classify the hand gesture.
files:
- app.yaml
- icon.png
- main.py
- LinearSVC.py
- clf_dump.npz
- trainSets.npz
Binary file added projects/app_hand_gesture_classifier/clf_dump.npz
Binary file not shown.
Binary file added projects/app_hand_gesture_classifier/icon.png
