An experiment evaluating how different Large Language Models (LLMs) perform in simulated job interviews. The project creates a controlled environment where multiple LLM models play the role of job candidates while a consistent interviewer model conducts the interviews.
- Process Flow
- Project Overview
- Models Used
- Installation
- Usage
- Features
- Project Structure
- Sample Output
- Contributing
- License
- Acknowledgments
```mermaid
flowchart TD
    Start([Start Simulation]) --> Config[Initialize Configuration]
    Config --> |Set Job Title| Setup[Setup Interview Environment]

    subgraph InitSetup [Initialization]
        Setup --> InitInterviewer[Initialize Interviewer Agent<br/>llama-3.1-8b-instant]
        Setup --> InitCandidates[Initialize Candidate Models]
    end

    InitCandidates --> |Model 1| C1[Candidate 1<br/>llama-3.1-8b-instant]
    InitCandidates --> |Model 2| C2[Candidate 2<br/>llama3-8b-8192]
    InitCandidates --> |Model 3| C3[Candidate 3<br/>mixtral-8x7b-32768]
    InitCandidates --> |Model 4| C4[Candidate 4<br/>gemma-7b-it]

    subgraph InterviewLoop [Interview Process]
        StartInt[Start Interview] --> GenQ[Generate Question]
        GenQ --> |Dynamic Question| GetResp[Get Candidate Response]
        GetResp --> |Store Response| UpdateHist[Update Interview History]
        UpdateHist --> |Check Questions| CheckCount{More Questions?}
        CheckCount --> |Yes| GenQ
        CheckCount --> |No| Evaluate
    end

    C1 & C2 & C3 & C4 --> StartInt

    subgraph Evaluation [Evaluation Process]
        Evaluate[Evaluate Candidate] --> GenScore[Generate Scores]
        GenScore --> GenStrength[Identify Strengths]
        GenStrength --> GenWeakness[Identify Weaknesses]
        GenWeakness --> GenTips[Generate Interview Tips]
        GenTips --> FinalEval[Final Evaluation Report]
    end

    subgraph Analysis [Comparative Analysis]
        FinalEval --> CompAnalysis[Compare All Candidates]
        CompAnalysis --> RankModels[Rank Model Performance]
        RankModels --> PatternAnalysis[Analyze Response Patterns]
        PatternAnalysis --> Recommendations[Generate Recommendations]
    end

    Recommendations --> SaveResults[Save Results to JSON]
    SaveResults --> End([End Simulation])

    style InitSetup fill:#e1f5fe,stroke:#01579b
    style InterviewLoop fill:#f3e5f5,stroke:#4a148c
    style Evaluation fill:#f1f8e9,stroke:#33691e
    style Analysis fill:#fff3e0,stroke:#e65100

    classDef process fill:#fff,stroke:#333,stroke-width:2px
    classDef decision fill:#fffde7,stroke:#f57f17,stroke-width:2px
    classDef endpoint fill:#006064,color:#fff,stroke:#00838f

    class Start,End endpoint
    class CheckCount decision
    class GenQ,GetResp,Evaluate,CompAnalysis process
```
```mermaid
flowchart LR
    A[Start] -->|Initialize| B[System Setup]
    B --> C[Load Models]
    B --> D[Configure Job Roles]
    C --> E[Setup Interviewer]
    C --> F[Setup Candidates]

    style A fill:#4CAF50,color:white
    style B fill:#2196F3,color:white
    style C fill:#9C27B0,color:white
    style D fill:#FF9800,color:white
    style E fill:#E91E63,color:white
    style F fill:#673AB7,color:white
```
```mermaid
flowchart LR
    A[Start Interview] -->|Generate| B[Questions]
    B -->|Collect| C[Responses]
    C -->|Analyze| D[Feedback]
    D -->|Next Question| B
    D -->|Complete| E[Evaluation]

    style A fill:#4CAF50,color:white
    style B fill:#2196F3,color:white
    style C fill:#9C27B0,color:white
    style D fill:#FF9800,color:white
    style E fill:#E91E63,color:white
```
```mermaid
flowchart LR
    A[Evaluation] -->|Generate| B[Scores]
    B -->|Identify| C[Strengths]
    B -->|Identify| D[Weaknesses]
    C --> E[Final Report]
    D --> E

    style A fill:#4CAF50,color:white
    style B fill:#2196F3,color:white
    style C fill:#9C27B0,color:white
    style D fill:#FF9800,color:white
    style E fill:#E91E63,color:white
```
This project simulates job interviews using different LLM models as candidates, while maintaining a consistent interviewer (Co-founder/CEO) model. The simulation:

- Conducts structured interviews for various job positions
- Uses different LLM models to simulate candidate responses
- Provides comprehensive evaluation and feedback
- Generates comparative analysis across different models
- Offers practical interview improvement suggestions
- Model: `groq/llama-3.1-8b-instant`
- Role: Co-founder and CEO
- Purpose: Consistent evaluation across all interviews
| Model |
|---|
| `groq/llama-3.1-8b-instant` |
| `groq/llama3-8b-8192` |
| `groq/mixtral-8x7b-32768` |
| `gemma-7b-it` |
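The candidate pool above can be represented as a simple model list paired with candidate labels, matching the `candidate1`…`candidate4` naming used in the results. This is an illustrative sketch; the `CANDIDATE_MODELS` and `label_candidates` names are not from the project code.

```python
# Candidate model identifiers, mirroring the table above.
CANDIDATE_MODELS = [
    "groq/llama-3.1-8b-instant",
    "groq/llama3-8b-8192",
    "groq/mixtral-8x7b-32768",
    "gemma-7b-it",
]

def label_candidates(models):
    """Pair each model with a candidate label (candidate1, candidate2, ...)."""
    return {f"candidate{i}": model for i, model in enumerate(models, start=1)}
```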
- Clone the repository:

```bash
git clone https://github.com/yourusername/llm-interview-simulator.git
cd llm-interview-simulator
```

- Create and activate a virtual environment:

```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

- Set up environment variables:

```bash
echo "GROQ_API_KEY=your_api_key_here" > .env
```

- Run the simulation:

```bash
python main.py
```
- Choose a job title:

```python
job_titles = [
    "Marketing Associate",
    "Business Development Representative",
    "Product Manager",
    "Customer Success Representative",
    "Data Analyst",
    "AI Engineer",
]
```
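Selecting from the list above can be sketched as a small helper with bounds checking; the `choose_job_title` function is illustrative and assumes zero-based indexing, not the project's actual prompt logic.

```python
job_titles = [
    "Marketing Associate",
    "Business Development Representative",
    "Product Manager",
    "Customer Success Representative",
    "Data Analyst",
    "AI Engineer",
]

def choose_job_title(index: int) -> str:
    """Return the job title at the given zero-based index, validating bounds."""
    if not 0 <= index < len(job_titles):
        raise ValueError(f"index must be in range 0..{len(job_titles) - 1}")
    return job_titles[index]
```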
- Dynamic question generation
- Natural conversation flow
- Technical skill assessment
- Cultural fit evaluation
| Metric | Description |
|---|---|
| Decision | Pass/Fail outcome |
| Score | 0-100 numerical rating |
| Strengths | Key positive attributes |
| Improvements | Areas for development |
| Tips | Interview improvement suggestions |
| Reasoning | Detailed evaluation logic |
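The metrics above map naturally onto a small record type. A minimal sketch, assuming the fields in the table; the `Evaluation` dataclass is illustrative, not the project's actual data model.

```python
from dataclasses import dataclass

@dataclass
class Evaluation:
    """One candidate's evaluation, mirroring the metrics table."""
    decision: str          # "Pass" or "Fail"
    score: int             # 0-100 numerical rating
    strengths: list        # key positive attributes
    improvements: list     # areas for development
    tips: list             # interview improvement suggestions
    reasoning: str         # detailed evaluation logic

    def __post_init__(self):
        # Enforce the 0-100 score range from the table.
        if not 0 <= self.score <= 100:
            raise ValueError("score must be between 0 and 100")
```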
```
llm-interview-simulator/
├── agents.py                 # Agent definitions
├── tasks.py                  # Interview tasks
├── interview_simulation.py   # Core logic
├── main.py                   # Entry point
├── requirements.txt          # Dependencies
└── README.md                 # Documentation
```
```json
{
  "job_title": "AI Engineer",
  "interview_date": "2024-03-31 14:30:22",
  "candidates": {
    "candidate1": {
      "model": "groq/llama-3.1-8b-instant",
      "score": 85,
      "evaluation": "..."
    }
  },
  "comparative_analysis": "..."
}
```
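Once results are saved to JSON, ranking candidates by score takes only a few lines of stdlib Python. A sketch assuming the structure shown above; the `rank_candidates` helper is illustrative, not part of the project code.

```python
import json  # results are typically loaded with json.load(open("results.json"))

def rank_candidates(results: dict) -> list:
    """Return (candidate, model, score) tuples sorted by score, best first."""
    return sorted(
        ((name, c["model"], c["score"]) for name, c in results["candidates"].items()),
        key=lambda t: t[2],
        reverse=True,
    )
```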
We welcome contributions! Areas for improvement:

- New job positions
- Additional LLM models
- Enhanced metrics
- Feature additions
This project is licensed under the MIT License - see the LICENSE file for details.
- Built with the CrewAI framework
- Powered by Groq's LLM models
- Inspired by real-world interviews
This is an experimental project for research and educational purposes. The simulations should not be used as the sole basis for actual hiring decisions.
Key metrics displayed:
- Response quality scores
- Cultural fit assessment
- Technical knowledge evaluation
- Communication style analysis
Made with ❤️ by KT