Demo v2: Auto-correct JSON format via loop #23

tholor · 2023-09-20T15:38:54Z

Motivation

Pipelines in Haystack 2.0 are more flexible and more robust than before.
They now allow you to run loops within your pipeline. Let's look at why this matters.

Example Use Case: Validate & auto-correct the output format of an LLM

Imagine you want to extract structured information from a document and get it back as JSON.
Because your downstream processing steps depend on this JSON being well-formed and matching the expected schema, you want to make sure the LLM produces it reliably.

Approach

We validate the output of the LLM against a pydantic schema. If the output is valid, we return it to the user. If not, we create another prompt that includes the invalid output and the validation error message, and send it back to the LLM. To avoid infinite loops, we limit the pipeline to a maximum of 5 iterations.
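
The loop described above can be sketched without any pipeline framework. This is a minimal illustration, not the code from this PR: to stay standard-library only, a hand-written `validate` function stands in for the pydantic schema check, and `REQUIRED_FIELDS`, `build_prompt`, `extract`, and the `llm` callable are all hypothetical names introduced for the example.

```python
import json
from typing import Callable, Optional, Tuple

MAX_LOOPS = 5  # the approach above caps the loop at 5 iterations

# Hypothetical schema: field name -> expected Python type.
# In the PR this role is played by a pydantic model.
REQUIRED_FIELDS = {"name": str, "population": int}


def validate(raw: str) -> Tuple[Optional[dict], Optional[str]]:
    """Return (data, None) if raw matches the schema, else (None, error message)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"Invalid JSON: {exc}"
    if not isinstance(data, dict):
        return None, "Expected a JSON object"
    for field, expected in REQUIRED_FIELDS.items():
        if field not in data:
            return None, f"Missing field: {field}"
        if not isinstance(data[field], expected):
            return None, f"Field {field} should be of type {expected.__name__}"
    return data, None


def build_prompt(passage: str,
                 invalid_reply: Optional[str] = None,
                 error: Optional[str] = None) -> str:
    """Build the extraction prompt; on a retry, append the bad reply and the error."""
    prompt = (
        f"Extract a JSON object with the fields {sorted(REQUIRED_FIELDS)} "
        f"from this passage:\n{passage}\n"
    )
    if invalid_reply is not None and error is not None:
        prompt += (
            f"Your previous reply was:\n{invalid_reply}\n"
            f"It failed validation with this error:\n{error}\n"
            "Reply with corrected JSON only.\n"
        )
    return prompt


def extract(passage: str, llm: Callable[[str], str], max_loops: int = MAX_LOOPS) -> dict:
    """Call the LLM until its reply validates or max_loops attempts are used up."""
    invalid_reply, error = None, None
    for _ in range(max_loops):
        reply = llm(build_prompt(passage, invalid_reply, error))
        data, error = validate(reply)
        if data is not None:
            return data
        invalid_reply = reply  # feed the bad reply into the next prompt
    raise RuntimeError(f"No valid JSON after {max_loops} attempts; last error: {error}")


# Usage with a fake LLM that returns broken JSON first, then a valid reply.
# The passage and values are made up for illustration.
_replies = iter(['{"name": "Exampletown", "population": ',
                 '{"name": "Exampletown", "population": 1000}'])
result = extract("Exampletown has 1000 inhabitants.", lambda prompt: next(_replies))
```

Raising once the limit is reached is one possible design choice; a pipeline could just as well return the last invalid reply together with the error and let the caller decide.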

Demo: Colab

To Dos:

We need optional variables in PromptTemplates (for now we are using a Haystack branch)
Allow three variables in the prompt template so that we can also include the error message (so far only the incorrectly generated output is passed back)


ZanSara

Left two small fixes that make this work on Canals 0.10

auto_fixing_parser_v2/auto_fix_parser.py

Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>

bilgeyucel · 2024-02-07T09:07:05Z

Closing this one as we have the example as a tutorial: https://github.com/deepset-ai/haystack-tutorials/blob/main/tutorials/28_Structured_Output_With_Loop.ipynb

tholor and others added 22 commits September 20, 2023 11:59

initial script

e3b712b

Set max_loops_allowed to 5

05a1b51

Update the OutputParser to check for json validity and output error msg

1758c97

final result component

b93292b

improve prompt. add some random failing in validator for testing. clean up comments

df97b67

change randomness of corrupt json

d89a4fc

Add Pydantic schema validation + schema in prompt

0ec3260

Add requirements.txt

35e2c10

custom pydantic basemodel

d5b2881

Merge branch 'parsing_loop_v2' of https://github.com/deepset-ai/haystack-demos into parsing_loop_v2

ba907bd

add create_pipeline

c3b65fb

fix error

0e6b8a1

switch to single prompt with jinja

99a91fa

fix merge conflicts

1fc5525

fix method

608e833

Add passage input

77aea3b

Merge branch 'parsing_loop_v2' of https://github.com/deepset-ai/haystack-demos into parsing_loop_v2

411f932

switch to information extraction case

35cc176

(ugly) schema customization via UI

e286522

Add streamlit UI

2a86e99

Add pipeline image

e1a9331

Add output file path as a config

6421842

This was referenced Oct 6, 2023

self correction loop for json parser deepset-ai/haystack#5989

Closed

Self correction loop for RAG when "answer is not possible given context" deepset-ai/haystack#5988

Closed

fix GPTGenerator import; lower probability of corrupt json for debugging

76a74ff

ZanSara reviewed Nov 21, 2023

auto_fixing_parser_v2/auto_fix_parser.py (outdated, resolved)

auto_fixing_parser_v2/auto_fix_parser.py (outdated, resolved)

tholor and others added 4 commits November 21, 2023 14:59

Remove returning "None

86095dc

Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>

Remove returning "None

b52ff3e

Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>

simplify pipeline

37ffbe2

switch from streamlit demo to notebook

fa31354

tholor requested a review from bilgeyucel November 23, 2023 16:25

bilgeyucel closed this Feb 7, 2024
