A web application that extracts and analyzes webpage content using AI models. Built with React and Node.js, with support for multiple AI providers, including OpenAI and Anthropic.

- Multi-Model Analysis: Choose between different AI models:
  - GPT-4 Turbo
  - GPT-3.5 Turbo
  - Claude 3 Sonnet
- Content Extraction:
  - Page summaries and key insights
  - Main topics and target audience
  - Content structure (headings, paragraphs)
  - Important links and metadata
  - Product information (if available)
- Modern UI/UX:
  - Clean, minimalist design with burnt orange accents
  - Dark/Light mode support
  - Real-time loading feedback with humorous messages
  - Responsive layout for all devices
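
The content extraction fields listed above suggest a structured result object. The sketch below shows one way such a result could be typed on the frontend. Every name in it (`PageAnalysis`, `headingOutline`, and all field names) is hypothetical and chosen for illustration; the actual schema returned by the backend may differ.

```typescript
// Hypothetical result shape. Field names are illustrative, not the project's actual schema.
interface Heading {
  level: number; // 1 for a top-level heading, 2 for a subheading, and so on
  text: string;
}

interface PageLink {
  text: string;
  href: string;
}

interface ProductInfo {
  name: string;
  price?: string;
}

interface PageAnalysis {
  summary: string;
  keyInsights: string[];
  mainTopics: string[];
  targetAudience: string;
  structure: { headings: Heading[]; paragraphCount: number };
  links: PageLink[];
  metadata: Record<string, string>;
  product?: ProductInfo; // present only when the page describes a product
}

// Render the heading structure as an indented plain-text outline.
function headingOutline(analysis: PageAnalysis): string {
  return analysis.structure.headings
    .map((h) => `${"  ".repeat(Math.max(0, h.level - 1))}${h.text}`)
    .join("\n");
}

// Made-up sample data, used only to show the shape.
const example: PageAnalysis = {
  summary: "A product page for widgets.",
  keyInsights: ["Two pricing tiers are offered"],
  mainTopics: ["widgets", "pricing"],
  targetAudience: "Small businesses",
  structure: {
    headings: [
      { level: 1, text: "Widgets" },
      { level: 2, text: "Pricing" },
    ],
    paragraphCount: 4,
  },
  links: [{ text: "Contact", href: "https://example.com/contact" }],
  metadata: { title: "Widgets" },
  product: { name: "Widget" },
};
```

Marking `product` as optional mirrors the "if available" note in the feature list: consumers have to check for it before rendering product details.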

- Node.js (v14 or higher)
- npm or yarn
- API keys for:
  - OpenAI
  - Anthropic

- Clone the repository:
    git clone <repository-url>
    cd webscraper
- Install dependencies for both frontend and backend:
    # Install backend dependencies
    cd backend
    npm install
    # Install frontend dependencies
    cd ../frontend
    npm install
- Set up environment variables:
  Create .env files in both the backend and root directories.
  Backend .env:
    PORT=3000
    OPENAI_API_KEY=your_openai_key
    ANTHROPIC_API_KEY=your_anthropic_key
- Start the development servers:
  In the backend directory:
    npm start
  In the frontend directory:
    npm run dev
  The application will be available at http://localhost:5173
webscraper/
├── backend/
│ ├── server.js # Express server setup
│ ├── services/
│ │ └── aiAnalyzer.js # AI integration logic
│ └── package.json
├── frontend/
│ ├── src/
│ │ ├── components/ # React components
│ │ ├── App.tsx # Main app component
│ │ └── main.tsx # Entry point
│ └── package.json
└── README.md

- Start both the backend and frontend servers
- Enter a URL in the input field
- Select your preferred AI model
- Click "Extract Data" to analyze the webpage
- View the structured results, including:
  - Page summary and key insights
  - Content analysis
  - Metadata and structure

- POST /api/analyze
  - Analyzes a webpage using the selected AI model
  - Body: { url: string, modelId: string }
  - Returns structured content analysis
- GET /api/models
  - Returns available AI models and their status
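
As a rough illustration of calling the two endpoints above, here is a minimal client sketch. It assumes the backend is reachable at `http://localhost:3000` (the default `PORT`) and that responses are JSON. The helper names (`buildAnalyzeRequest`, `buildModelsRequest`, `send`) and the `FetchLike` type are hypothetical, and the response shape is left as `unknown` because the README does not specify it.

```typescript
// Assumed base URL: the backend's default PORT (3000) on localhost.
const API_BASE = "http://localhost:3000";

// Request body documented for POST /api/analyze.
interface AnalyzeBody {
  url: string;
  modelId: string;
}

interface HttpRequest {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
  body?: string;
}

// Build the POST /api/analyze request without sending it.
function buildAnalyzeRequest(pageUrl: string, modelId: string, base: string = API_BASE): HttpRequest {
  // new URL() throws a TypeError on malformed input, so bad URLs fail before any request is sent.
  const normalized = new URL(pageUrl).toString();
  const body: AnalyzeBody = { url: normalized, modelId };
  return {
    url: `${base}/api/analyze`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

// Build the GET /api/models request.
function buildModelsRequest(base: string = API_BASE): HttpRequest {
  return { url: `${base}/api/models`, method: "GET", headers: {} };
}

// Minimal structural type so the sketch does not depend on a specific fetch typing.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body?: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Send a prepared request and return the parsed JSON, or throw on a non-2xx status.
async function send(request: HttpRequest, fetchImpl: FetchLike): Promise<unknown> {
  const response = await fetchImpl(request.url, {
    method: request.method,
    headers: request.headers,
    body: request.body,
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

In a browser or a recent Node.js version, the global `fetch` can be passed as `fetchImpl`, for example `send(buildModelsRequest(), fetch)`. The valid `modelId` values are not listed in this README; they would presumably come from the `GET /api/models` response.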

Backend:
- PORT: Server port (default: 3000)
- OPENAI_API_KEY: OpenAI API key
- ANTHROPIC_API_KEY: Anthropic API key
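
The variables above could be read and validated at startup along these lines. This is a sketch under assumptions, not the project's actual startup code: `loadConfig` and `BackendConfig` are hypothetical names, and treating both API keys as required is a deliberately strict choice. The real server may instead start with only one provider configured.

```typescript
interface BackendConfig {
  port: number;
  openaiApiKey: string;
  anthropicApiKey: string;
}

// Read the documented variables from an environment-like object.
// Assumption: both API keys are required; the actual server may be more lenient.
function loadConfig(env: Record<string, string | undefined>): BackendConfig {
  const missing = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"].filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  // PORT falls back to the documented default of 3000 when unset.
  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`Invalid PORT: ${env["PORT"]}`);
  }

  return {
    port,
    openaiApiKey: env["OPENAI_API_KEY"] as string,
    anthropicApiKey: env["ANTHROPIC_API_KEY"] as string,
  };
}
```

On Node.js this would typically be called as `loadConfig(process.env)` after the .env file has been loaded. Failing fast on a missing key surfaces configuration mistakes at startup rather than on the first analysis request.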

- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI for GPT models
- Anthropic for Claude models
- Material-UI for the component library