An example of JAX usage with the new Keras 3.0 API release
As a fan of the Keras framework, the announcement of Keras 3 is a big YEAH! moment for me. I can now experiment with other deep learning frameworks, like PyTorch or JAX, with just two or three lines of modification in my current codebase. Yes, Keras is no longer limited to TensorFlow alone!
Any Keras model that only uses built-in layers will immediately work with all supported backends. In fact, your existing `tf.keras` models that only use built-in layers can start running in JAX and PyTorch right away! That's right, your codebase just gained a whole new set of capabilities.
I've been using JAX because it's the only backend that works on my local CUDA 12 setup without extra cuDNN downloads. I intend to create a simple password strength classifier project to show how easy it is to configure the JAX backend for Keras.
- Install the CUDA 12.x compatible JAX and the latest version of Keras from pip:

  ```shell
  pip install -U pip
  pip install -U "jax[cuda12_local]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
  pip install keras
  ```
- Configure the backend before importing Keras:

  ```python
  import os
  os.environ["KERAS_BACKEND"] = "jax"

  import keras
  ```
Note: If you set the environment variable outside the code, you can call `keras.backend.backend()` to retrieve the name of the active backend.
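For example, a quick sanity check after setting the variable:

```python
import keras

# Should print "jax" if the KERAS_BACKEND environment variable was picked up.
print(keras.backend.backend())
```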
Keras 3 is a total rewrite of the previous codebase. Here are a few of the highlights I want to share with you.
- Framework agnostic — You can pick the framework that suits you best, and switch from one to another based on your current goals.
- Model parallelism — Distribution was previously accomplished with TensorFlow only, but Keras 3 now includes a `keras.distribution` namespace, making it simple to perform model parallelism, data parallelism, and combinations of the two.
- Universal data pipelines — The Keras 3 `fit()`/`evaluate()`/`predict()` routines are compatible with `tf.data.Dataset` objects, PyTorch `DataLoader` objects, NumPy arrays, and Pandas dataframes, regardless of the backend you're using.
- Ops namespace — In case you don't want to extend the existing layers, you can use `keras.ops` to create components (like arbitrary custom layers or pretrained models) that work across all backends. It provides the NumPy API (not something "NumPy-like", just literally the NumPy API) and neural-network functions (softmax, conv, and what not); see the sketch after this list.
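To make `keras.ops` concrete, here is a minimal sketch of a backend-agnostic custom layer. The layer itself (`ScaledDense`) is a toy example of mine, not part of Keras or of this project:

```python
import keras
from keras import ops

class ScaledDense(keras.layers.Layer):
    """Toy dense layer written purely with keras.ops, so it runs on any backend."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        # add_weight behaves the same under the TensorFlow, JAX, and PyTorch backends.
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="glorot_uniform",
            trainable=True,
        )

    def call(self, inputs):
        # ops.matmul mirrors numpy.matmul; ops.softmax is one of the NN functions.
        return ops.softmax(ops.matmul(inputs, self.w))
```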
The project assumes the following environment:
- Python 3.11.x
- (Optional) Poetry 1.7.0
- CUDA 12.2, cuDNN 8.9, NCCL 2.16
- Clone the repository:

  ```shell
  git clone --depth=1 https://github.com/tbhaxor/password-strength-clf.git
  cd password-strength-clf
  ```
- (Optional: Using pip) Configure and activate the virtual environment:

  ```shell
  virtualenv venv
  source venv/bin/activate
  ```
- Install the dependencies:

  ```shell
  pip install -r requirements.txt
  ```

  Or, with Poetry:

  ```shell
  poetry install
  ```
Note: Poetry automatically creates a virtual environment if one does not exist, and installs the packages into it.
- Download the dataset from Kaggle or create a CSV with the following two columns:

  | Column Name | Order | Description |
  | --- | --- | --- |
  | `password` | 1 | Textual password in raw format; hashed passwords are not allowed. |
  | `strength` | 2 | Strength class of the corresponding password, where 0 means WEAK, 1 means MODERATE, and 2 means STRONG. |
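For illustration, a minimal dataset file could look like this (the rows are made-up samples, labeled to match the strength classes above):

```csv
password,strength
hello,0
H3ll00W@rld,1
h3LL00@@102030,2
```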
The `train.py` script takes the input CSV file as a required argument:

```
$ python train.py /path/to/data.csv --passwords HELLO hello h3ll000 H3ll00W@rld h3LL00@@102030
....
Running user provided tests
Password: HELLO Strength: WEAK
Password: hello Strength: WEAK
Password: h3ll000 Strength: WEAK
Password: H3ll00W@rld Strength: MODERATE
Password: h3LL00@@102030 Strength: STRONG
```
Note: At a bare minimum, only the path to the input CSV file is required to begin training and save the best model. However, I recommend passing the `--passwords` argument to predict the classes of custom passwords with the best model.
Use "python train.py -h
" to get the arguments provided by train.py script
usage: train.py [-h] [--model-path] [--learning-rate] [--dropout-rate]
[--epochs] [--batch-size] [--vocab CHARACTERS]
[--validation-split] [--passwords PASSWORD [PASSWORD ...]]
file
Password strength classifier trainer script
positional arguments:
file path of the csv dataset file
options:
-h, --help show this help message and exit
--model-path path to save only weights of the best model (default: trained-models/best_model)
--learning-rate learning rate for adam optimizer (default: 0.01)
--dropout-rate dropout rate to be used between last hidden layer and the output layer (default: 0.3)
--epochs number of epochs to train for (default: 5)
--batch-size batch size to use (default: 32)
--vocab CHARACTERS vocabulary for the passwords to train the model on (default: None)
--validation-split validation split during model training (default: 0.2)
--passwords PASSWORD [PASSWORD ...]
sample passwords to predict and show the performance (default: None)
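As an illustration, a run that overrides a few of these defaults (the values here are arbitrary) could look like:

```shell
python train.py /path/to/data.csv --epochs 10 --learning-rate 0.001 --batch-size 64
```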
If you've already trained the model and want to use it in your application, use the following code. Note that `CHARSET` below is an assumption on my part; it must match the vocabulary the model was trained with:

```python
import string
from pathlib import Path

from keras import ops
from utils import get_vectorized, get_model, CLASS_NAMES

# Assumed vocabulary: letters, digits, and punctuation (must match --vocab from training).
CHARSET = string.ascii_letters + string.digits + string.punctuation

saved_model = Path("/path/to/weights")
if not saved_model.exists():
    raise FileNotFoundError(f"Path '{saved_model}' does not exist!")

X = get_vectorized(CHARSET, ["your password"])               # vectorize the raw password
model = get_model(X.shape[1:], model_path=str(saved_model))  # rebuild model, load best weights
y_predicted = model.predict(X, verbose=False)
y_class_id = ops.argmax(y_predicted, axis=1)                 # index of the highest-probability class
y_class = CLASS_NAMES[int(y_class_id[0])]                    # WEAK, MODERATE, or STRONG
```
- Email: tbhaxor at gmail dot com
- Discord: @tbhaxor.com
- Twitter: @tbhaxor
- LinkedIn: @tbhaxor