Feature extraction - wrapper around schema.apply_udf #198

Merged: 8 commits, Nov 27, 2023

FelipeAdachi
Contributor

@FelipeAdachi FelipeAdachi commented Nov 21, 2023

Currently, a user who wants to use Langkit for a Feature Extraction scenario needs to run:

from langkit import toxicity
from whylogs.experimental.core.udf_schema import udf_schema
import pandas as pd

df = pd.DataFrame({"prompt": ["I love you", "I hate you"]})
schema = udf_schema()

df_enhanced, _ = schema.apply_udfs(df)

This unnecessarily exposes the user to whylogs' udf_schema and returns a confusing tuple.

This PR wraps the code above into a langkit.extract function, so the same task becomes:

import langkit
from langkit import toxicity
import pandas as pd

df = pd.DataFrame({"prompt": ["I love you", "I hate you"]})
enhanced_df = langkit.extract(data=df)

or, for the row case:

import langkit
from langkit import toxicity

row = {"prompt": "I love you", "response": "I hate you"}
enhanced_row = langkit.extract(data=row)
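
For reviewers, the idea behind the wrapper can be sketched roughly as follows. This is not the code in langkit/extract.py; it is a minimal illustration that assumes the schema's apply_udfs accepts `pandas=` and `row=` keyword arguments and returns a `(dataframe, row)` tuple. `StubSchema`, its character-count "UDF", and the explicit `schema` parameter are hypothetical stand-ins so the snippet runs without whylogs or Langkit installed.

```python
from typing import Any, Dict, Optional, Tuple


class StubSchema:
    """Hypothetical stand-in for the object returned by udf_schema()."""

    def apply_udfs(
        self, pandas: Optional[Any] = None, row: Optional[Dict[str, Any]] = None
    ) -> Tuple[Optional[Any], Optional[Dict[str, Any]]]:
        # Pretend UDF: add a character-count feature, for the row case only.
        if row is not None:
            row = dict(row)  # copy so the caller's dict is left untouched
            for key in list(row):
                row[f"{key}.char_count"] = len(row[key])
        # Mirrors the tuple-shaped output the wrapper is meant to hide.
        return pandas, row


def extract(data: Any, schema: Any) -> Any:
    """Return only the enhanced object that matches the input type."""
    if isinstance(data, dict):
        # Row case: keep the second element of the tuple.
        _, enhanced_row = schema.apply_udfs(row=data)
        return enhanced_row
    # Anything else is treated as a DataFrame-like table in this sketch.
    enhanced_df, _ = schema.apply_udfs(pandas=data)
    return enhanced_df


row = {"prompt": "I love you", "response": "I hate you"}
enhanced_row = extract(row, StubSchema())
```

The caller gets back a single object of the same kind it passed in, and never has to unpack the tuple or import udf_schema directly.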

Also:

  • Incidental error handling in the hallucination module

langkit/extract.py (outdated, resolved)
Collaborator

@jamie256 jamie256 left a comment

This looks great @FelipeAdachi, thanks!

@jamie256 jamie256 merged commit 23497fa into main Nov 27, 2023
12 checks passed
@jamie256 jamie256 deleted the dev/felipe/extract branch November 27, 2023 17:14