
HTTP/1.1 web server supporting GET, POST, and DELETE requests


zelhajou/42cursus-webserv

42cursus-webserv

Overview

This project involves developing a robust HTTP server in C++98, designed to handle basic web traffic and serve static content. Our server will conform to HTTP/1.1 standards and utilize non-blocking I/O operations for efficient resource management.

Features

  • HTTP Protocol Support: Implements essential features of the HTTP/1.1 protocol, including methods such as GET, POST, and DELETE.
  • Concurrency Handling: Utilizes non-blocking sockets along with I/O multiplexing techniques (select, poll, epoll, and kqueue) to manage multiple client connections efficiently.
  • Static Content Serving: Capable of serving HTML, CSS, and JavaScript files, enabling the hosting of static websites.
  • Error Handling: Provides meaningful error responses and default error pages, ensuring the server robustly handles incorrect or unexpected client requests.
  • Customizable Configuration: Includes a flexible configuration system inspired by NGINX, allowing adjustments for ports, server names, and routing behaviors without altering the server code.

Technical Requirements

  • Programming Language: C++98
  • Development Tools: Git, Make, GCC, Valgrind
  • Testing Tools: Stress testing with siege or ab (Apache Bench)

Team Members and Roles

Task Assignments

  • hsobane - Network Infrastructure and Server Setup

    • Socket Creation and Management
    • Connection Handling
    • I/O Multiplexing
    • Error Handling and Resilience
  • beddinao - HTTP Protocol Handling

    • Request Parsing
    • Response Generation
    • Support HTTP Methods
    • Static File Serving
    • File Upload Handling
  • zelhajou - Configuration Management and Logging

    • Configuration File Parsing
    • Apply Configuration Settings
    • Logging System

Topics

Basic concepts

What is a Web Server?

A web server is a software application that serves content to clients over the internet or an intranet using the HTTP protocol. It processes incoming requests from clients, retrieves the requested resources, and sends them back as responses. Web servers can serve static content like HTML, CSS, and images, as well as dynamic content generated by applications running on the server.

Endianness

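Endianness refers to the order in which the bytes of a multi-byte value are stored in memory. There are two main types: big-endian (most significant byte first) and little-endian (least significant byte first). Network protocols use big-endian, also called network byte order, so a program converts values such as port numbers with htons() and htonl() before placing them in socket address structures.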
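
As a minimal sketch of the byte-order conversion described above, the following C functions check the host's endianness and show how a port number is laid out on the wire. The helper names `host_is_little_endian` and `port_to_wire` are illustrative and not part of this project.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if this host stores the least significant byte first. */
int host_is_little_endian(void)
{
    uint16_t probe = 0x0102;
    unsigned char bytes[2];

    memcpy(bytes, &probe, sizeof(probe));
    return bytes[0] == 0x02;
}

/* Copies the two bytes of `port` into `out` in network (big-endian) order. */
void port_to_wire(uint16_t port, unsigned char out[2])
{
    uint16_t net = htons(port); /* host order -> network byte order */

    memcpy(out, &net, sizeof(net));
}
```

Port 8080 is 0x1F90, so `port_to_wire(8080, out)` stores 0x1F followed by 0x90 regardless of the host's own byte order.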


TCP/IP Protocol Suite

The TCP/IP protocol suite is a set of communication protocols used to connect devices over the internet. It consists of several layers, each responsible for different aspects of network communication.

  • IP (Internet Protocol):

    • An IP address is a unique identifier for a device on a network.
    • IPv4 addresses are 32-bit numbers usually represented in dotted-decimal format (e.g., 192.168.1.1).
    • IPv6 addresses are 128-bit numbers represented in hexadecimal (e.g., 2001:0db8:85a3:0000:0000:8a2e:0370:7334).
  • TCP (Transmission Control Protocol):

    • A connection-oriented protocol that provides reliable, ordered, and error-checked delivery of a stream of bytes.
    • Three-way handshake (connection establishment):
      1. Client sends a SYN packet to the server.
      2. Server sends a SYN-ACK packet to the client.
      3. Client sends an ACK packet to the server.
    • Four-way handshake (connection termination, shown here with the client closing first):
      1. Client sends a FIN packet to the server.
      2. Server sends an ACK packet to the client.
      3. Server sends a FIN packet to the client.
      4. Client sends an ACK packet to the server.
  • UDP (User Datagram Protocol):

    • A connectionless protocol that provides unreliable, unordered delivery of individual datagrams, with a checksum for error detection.
    • Connectionless: No connection is established before data is sent.
    • Unreliable: No guarantee that data will be delivered.
    • Unordered: No guarantee that data will be received in the order it was sent.
  • Ports:

    • A port is a communication endpoint that identifies a specific process or service on a host.
    • Ports are identified by a 16-bit number ranging from 0 to 65535.
    • Well-known ports:
      • Ports ranging from 0 to 1023 are reserved for system services.
    • Registered ports:
      • Ports ranging from 1024 to 49151 are registered with the IANA.
    • Dynamic ports:
      • Ports ranging from 49152 to 65535 are used for private or temporary purposes.
  • Socket:

    • A socket is an endpoint for communication between two machines.
    • A socket is identified by an IP address and a port number.
    • Server socket:
      • A server socket listens for incoming connections.
    • Client socket:
      • A client socket initiates a connection to a server socket.
  • What is a Protocol? (Deepdive)
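
To make the address and port rules above concrete, here is a small sketch in C. It parses a dotted-decimal IPv4 address into its 32-bit value using the standard inet_pton() call and names the range a port number falls into. The helper names `parse_ipv4` and `port_range_name` are illustrative assumptions, not part of this project.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <sys/socket.h>

/* Parses dotted-decimal IPv4 text into a host-order 32-bit value.
   Returns 1 on success, 0 if the text is not a valid IPv4 address. */
int parse_ipv4(const char *text, uint32_t *out)
{
    struct in_addr addr;

    if (inet_pton(AF_INET, text, &addr) != 1)
        return 0;
    *out = ntohl(addr.s_addr); /* network order -> host order */
    return 1;
}

/* Names the range a port number belongs to. */
const char *port_range_name(unsigned int port)
{
    if (port > 65535)
        return "invalid";
    if (port <= 1023)
        return "well-known";
    if (port <= 49151)
        return "registered";
    return "dynamic";
}
```

For example, "192.168.1.1" parses to 0xC0A80101, port 80 is a well-known port, and port 8080 is a registered port.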

Socket Programming

Socket programming is a way of connecting two nodes on a network so that they can communicate with each other. One socket listens on a particular port at an IP address, while the other socket connects to it.

  • Server-Client Model:

    • A server is a program that listens for incoming connections.
    • A client is a program that initiates a connection to a server.
  • TCP Server-Client:

    • Server
      1. Create a socket using the socket() system call.
        int sockfd = socket(AF_INET, SOCK_STREAM, 0);
        
        // AF_INET: Address family for IPv4.
        // SOCK_STREAM: Type (TCP for reliable, connection-oriented service).
        // 0: Protocol (0 lets the system choose the appropriate protocol).
      2. Bind the socket to an address using the bind() system call.
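        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);
        
        bind(sockfd, (struct sockaddr *)&addr, sizeof(addr));
        
        // struct sockaddr_in: Structure for IPv4 addresses.
        // memset(): Zeroes the structure before it is filled in (requires <string.h>).
        // sin_family: Address family (AF_INET for IPv4).
        // sin_addr.s_addr: Local IP address to bind to, in network byte order.
        // sin_port: Port number, in network byte order.
        // INADDR_ANY: Bind to all local network interfaces.
        // htons() / htonl(): Convert 16-bit / 32-bit values to network byte order.
        // bind(): Returns -1 on failure, for example when the port is already in use.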
      3. Listen for incoming connections using the listen() system call.
        listen(sockfd, SOMAXCONN);
        
        // SOMAXCONN: Maximum number of pending connections in the listen queue.
      4. Accept a connection using the accept() system call.
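        struct sockaddr_in address;
        socklen_t addrlen = sizeof(address);
        int new_socket = accept(sockfd, (struct sockaddr *)&address, &addrlen);
        
        // accept(): Waits for an incoming connection and returns a new socket descriptor.
        // address / addrlen: Filled in with the client's address and its size.
        // new_socket: Socket descriptor for the new connection.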
      5. Send and receive data using the send() and recv() system calls.
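        char buffer[1024];
        send(new_socket, "Hello, World!", 13, 0);
        ssize_t n = recv(new_socket, buffer, sizeof(buffer), 0);
        
        // send(): Sends data to the connected socket; it may send fewer bytes than requested.
        // recv(): Receives data from the connected socket; it returns 0 when the peer has closed the connection.
        // buffer: Buffer to store received data.
        // n: Number of bytes received, or -1 on error.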
    • Client
      1. Create a socket using the socket() system call.
        int sockfd = socket(AF_INET, SOCK_STREAM, 0);
        
        // AF_INET: Address family for IPv4.
        // SOCK_STREAM: Type (TCP for reliable, connection-oriented service).
        // 0: Protocol (0 lets the system choose the appropriate protocol).
      2. Connect to a server using the connect() system call.
        struct sockaddr_in addr;
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = inet_addr("a.b.c.d");
        addr.sin_port = htons(8080);
        
        connect(sockfd, (struct sockaddr *)&addr, sizeof(addr));
        
        // "a.b.c.d": Placeholder for the server's IPv4 address.
        // inet_addr(): Converts a dotted-decimal IPv4 string to a 32-bit address in network byte order.
        // htons(): Convert port number to network byte order.
      3. Send and receive data using the send() and recv() system calls.
        char buffer[1024];
        send(sockfd, "Hello, World!", 13, 0);
        recv(sockfd, buffer, sizeof(buffer), 0);
        
        // send(): Sends data to the connected socket.
        // recv(): Receives data from the connected socket.
        // buffer: Buffer to store received data.
  • UDP Server-Client:

    • Server
      1. Create a socket using the socket() system call.
        int sockfd = socket(AF_INET, SOCK_DGRAM, 0);
        
        // AF_INET: Address family for IPv4.
        // SOCK_DGRAM: Type (UDP for unreliable, connectionless service).
        // 0: Protocol (0 lets the system choose the appropriate protocol).
      2. Bind the socket to an address using the bind() system call.
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);
        
        bind(sockfd, (struct sockaddr *)&addr, sizeof(addr));
        
        // struct sockaddr_in: Structure for IPv4 addresses.
        // memset(): Zeroes the structure before it is filled in (requires <string.h>).
        // sin_family: Address family (AF_INET for IPv4).
        // sin_addr.s_addr: Local IP address to bind to, in network byte order.
        // sin_port: Port number, in network byte order.
        // INADDR_ANY: Receive datagrams sent to any local network interface.
        // htons() / htonl(): Convert 16-bit / 32-bit values to network byte order.
        // bind(): Assigns the address specified by addr to the socket sockfd.
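      3. Receive and send data using the recvfrom() and sendto() system calls.
        char buffer[1024];
        struct sockaddr_in client_addr;
        socklen_t addrlen = sizeof(client_addr);
        
        recvfrom(sockfd, buffer, sizeof(buffer), 0, (struct sockaddr *)&client_addr, &addrlen);
        sendto(sockfd, "Hello, World!", 13, 0, (struct sockaddr *)&client_addr, addrlen);
        
        // recvfrom(): Receives a datagram and records the sender's address in client_addr.
        // sendto(): Sends a datagram to a specific address, here back to the sender.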
    • Client
      1. Create a socket using the socket() system call.
        int sockfd = socket(AF_INET, SOCK_DGRAM, 0);
        
        // AF_INET: Address family for IPv4.
        // SOCK_DGRAM: Type (UDP for unreliable, connectionless service).
        // 0: Protocol (0 lets the system choose the appropriate protocol).
      2. Send and receive data using the sendto() and recvfrom() system calls.
        char buffer[1024];
        struct sockaddr_in addr;   // Server address, filled in as for the TCP client.
        socklen_t addrlen = sizeof(addr);
        
        sendto(sockfd, "Hello, World!", 13, 0, (struct sockaddr *)&addr, sizeof(addr));
        recvfrom(sockfd, buffer, sizeof(buffer), 0, (struct sockaddr *)&addr, &addrlen);
        
        // sendto(): Sends a datagram to a specific address.
        // recvfrom(): Receives a datagram and records the sender's address.
  • Sockets and Network Programming in C

  • fun with sockets: let's write a webserver!
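
The TCP steps above can be combined into one sketch that runs both ends in a single process over the loopback interface. This is an illustration under stated assumptions, not this project's server code: the function name `tcp_loopback_roundtrip` is hypothetical, port 0 asks the kernel to pick a free port, and the single recv() call assumes the short message arrives in one piece, which a real server must not rely on.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sends `msg` from a client socket to a server socket over 127.0.0.1 and
   copies what the server side received into `out` (NUL-terminated).
   Returns the number of bytes received, or -1 on failure. */
ssize_t tcp_loopback_roundtrip(const char *msg, char *out, size_t out_size)
{
    struct sockaddr_in addr;
    socklen_t addrlen = sizeof(addr);
    ssize_t received = -1;
    int listener = -1, client = -1, conn = -1;

    if (out_size == 0)
        return -1;

    /* Server side: socket(), bind(), listen(). */
    listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0); /* 0: let the kernel choose a free port */
    if (bind(listener, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(listener, 1) < 0 ||
        getsockname(listener, (struct sockaddr *)&addr, &addrlen) < 0)
        goto cleanup;

    /* Client side: socket(), connect() to the port the kernel chose. */
    client = socket(AF_INET, SOCK_STREAM, 0);
    if (client < 0 ||
        connect(client, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto cleanup;

    /* Server side: accept() the queued connection. */
    conn = accept(listener, NULL, NULL);
    if (conn < 0)
        goto cleanup;

    /* Client sends, server receives. */
    if (send(client, msg, strlen(msg), 0) != (ssize_t)strlen(msg))
        goto cleanup;
    received = recv(conn, out, out_size - 1, 0);
    if (received >= 0)
        out[received] = '\0';

cleanup:
    if (conn >= 0)
        close(conn);
    if (client >= 0)
        close(client);
    if (listener >= 0)
        close(listener);
    return received;
}
```

The connect() call succeeds before accept() is called because the kernel completes the handshake and places the connection in the listen queue.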

Non-blocking I/O and Multiplexing

Non-blocking I/O and multiplexing are techniques used to handle multiple I/O operations simultaneously without blocking the execution of the program.
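
The difference between blocking and non-blocking calls can be observed with a short C sketch. It puts one end of a socket pair into non-blocking mode and reads from it while no data is available. The helper names `set_nonblocking` and `empty_read_would_block` are illustrative, and the sketch reads the current flags with F_GETFL first so that other flags are preserved, which is one common variation rather than a requirement.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Adds O_NONBLOCK to the descriptor's existing status flags.
   Returns 0 on success, -1 on failure. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Returns 1 if reading from an empty non-blocking socket fails at once
   with EAGAIN/EWOULDBLOCK, 0 if it does not, -1 on a setup error. */
int empty_read_would_block(void)
{
    int pair[2];
    char byte;
    ssize_t n;
    int result;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) < 0)
        return -1;
    if (set_nonblocking(pair[0]) < 0) {
        close(pair[0]);
        close(pair[1]);
        return -1;
    }
    n = recv(pair[0], &byte, 1, 0); /* nothing has been written yet */
    result = (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK));
    close(pair[0]);
    close(pair[1]);
    return result;
}
```

Without O_NONBLOCK the same recv() call would suspend the process until the other end wrote data or closed the socket.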

  • Non-blocking I/O:
    • Blocking I/O: The process waits until the I/O operation is complete.

    • Non-blocking I/O:

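      • A call that cannot complete immediately returns right away (with -1 and errno set to EAGAIN or EWOULDBLOCK) instead of suspending the process, so the process can do other work and retry later.
         fcntl(sockfd, F_SETFL, O_NONBLOCK);
        
         // fcntl(): Performs operations on file descriptors; here it sets the O_NONBLOCK flag on sockfd.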
    • Multiplexing:

      • select():
        • Monitors multiple file descriptors for I/O readiness.
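        • Read Set: File descriptors to watch for readability; on return, it holds those that are ready for reading.
        • Write Set: File descriptors to watch for writability; on return, it holds those that are ready for writing.
        • Error Set: File descriptors to watch for exceptional conditions; on return, it holds those on which one occurred.
        • Timeout: Specifies the maximum time to wait for an event.
        • select() returns the number of ready file descriptors and overwrites the sets, so they must be rebuilt before each call.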
        • FD_ISSET(): Checks if a file descriptor is in a set.
        • FD_SET(): Adds a file descriptor to a set.
        • FD_CLR(): Removes a file descriptor from a set.
        • FD_ZERO(): Clears a set.
         fd_set readfds;
         FD_ZERO(&readfds);
         FD_SET(sockfd, &readfds);
        
         select(sockfd + 1, &readfds, NULL, NULL, NULL);
        
         if (FD_ISSET(sockfd, &readfds)) {
         	// sockfd is ready for reading.
         }
        
         // FD_ZERO: Clears a set.
         // FD_SET: Adds a file descriptor to a set.
         // FD_CLR: Removes a file descriptor from a set.
         // FD_ISSET: Checks if a file descriptor is in a set.
      • poll():
        • Monitors multiple file descriptors for I/O readiness.
        • Timeout:
          • Specifies the maximum time to wait for an event.
        • poll() returns the number of ready file descriptors.
         struct pollfd fds;
         fds.fd = sockfd;
         fds.events = POLLIN;
        
         poll(&fds, 1, -1);
        
         if (fds.revents & POLLIN) {
         	// sockfd is ready for reading.
         }
        
         // struct pollfd: Structure for poll events.
         // fds.fd: File descriptor to monitor.
         // fds.events: Events to monitor.
         // fds.revents: Events that occurred.
      • epoll (Linux):
        • Monitors multiple file descriptors for I/O readiness.
        • Timeout:
          • Specifies the maximum time to wait for an event.
        • epoll_wait() returns the number of ready file descriptors.
         int epfd = epoll_create(1);
         struct epoll_event event;
         event.events = EPOLLIN;
         event.data.fd = sockfd;
        
         epoll_ctl(epfd, EPOLL_CTL_ADD, sockfd, &event);
         epoll_wait(epfd, &event, 1, -1);
        
         // epoll_create(): Creates an epoll instance.
         // struct epoll_event: Structure for epoll events.
         // event.events: Events to monitor.
         // event.data.fd: File descriptor to monitor.
         // epoll_ctl(): Modifies an epoll instance.
         // epoll_wait(): Waits for an event on an epoll instance.
      • kqueue (BSD and macOS):
        • Monitors multiple file descriptors for I/O readiness.
        • Timeout: Specifies the maximum time to wait for an event.
        • kevent() returns the number of events placed in the event list.
         int kq = kqueue();
         struct kevent event;
         EV_SET(&event, sockfd, EVFILT_READ, EV_ADD, 0, 0, NULL);
         
         kevent(kq, &event, 1, &event, 1, NULL);
         
         // kqueue(): Creates a kqueue instance.
         // struct kevent: Structure for kqueue events.
         // EV_SET: Initializes a kevent structure.
         // kevent(): Registers the changes in the change list and waits for events.
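
As a runnable illustration of readiness checking, the sketch below uses poll() with a zero timeout on a socket pair: the descriptor is reported as not readable before anything is written and as readable afterwards. The helper names `is_readable_now` and `poll_demo` are illustrative, and a real server would monitor many descriptors in a loop rather than one at a time.

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if `fd` can be read without blocking, 0 if not, -1 on error. */
int is_readable_now(int fd)
{
    struct pollfd pfd;
    int ready;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    ready = poll(&pfd, 1, 0); /* timeout 0: return immediately */
    if (ready < 0)
        return -1;
    return (ready > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Checks readiness of one end of a socket pair before and after a write
   to the other end. Returns 0 on success, -1 on a setup error. */
int poll_demo(int *before, int *after)
{
    int pair[2];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) < 0)
        return -1;
    *before = is_readable_now(pair[0]);
    if (write(pair[1], "x", 1) != 1) {
        close(pair[0]);
        close(pair[1]);
        return -1;
    }
    *after = is_readable_now(pair[0]);
    close(pair[0]);
    close(pair[1]);
    return 0;
}
```

Calling recv() only after a descriptor has been reported ready is what lets a single process serve many clients without stalling on any one of them.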

HTTP Protocol

The Hypertext Transfer Protocol (HTTP) is an application-layer protocol used for transmitting hypermedia documents, such as HTML files, over the internet. It is the foundation of data communication on the World Wide Web.

Request Methods:

Request methods indicate the desired action to be performed on a resource. This server supports three of them: GET retrieves a resource, POST submits data to a resource, and DELETE removes a resource.
  • Request Headers:

Example:

GET /index.html HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0
Accept: text/html
Accept-Language: en-US
Accept-Encoding: gzip, deflate
Connection: keep-alive
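
The first line of the request above is the request line: a method, a request target, and a protocol version separated by spaces. A minimal sketch of splitting it in C follows. The struct and function names are illustrative, the field sizes are arbitrary assumptions, and the parser is more lenient than a strict HTTP/1.1 implementation, which should reject extra whitespace.

```c
#include <stdio.h>
#include <string.h>

struct request_line {
    char method[16];
    char target[256];
    char version[16];
};

/* Splits "METHOD TARGET VERSION" into its three fields.
   Returns 1 on success, 0 if a field is missing or the version
   does not start with "HTTP/". */
int parse_request_line(const char *line, struct request_line *out)
{
    int fields = sscanf(line, "%15s %255s %15s",
                        out->method, out->target, out->version);

    if (fields != 3)
        return 0;
    return strncmp(out->version, "HTTP/", 5) == 0;
}
```

The header lines that follow the request line each have the form `Name: value` and end with CRLF; an empty line marks the end of the headers.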
  • Response Headers:

Example:

HTTP/1.1 200 OK
Date: Sat, 01 Jan 2022 00:00:00 GMT
Server: Apache/2.4.51
Content-Type: text/html
Content-Length: 13
Connection: keep-alive
Set-Cookie: session=123
Last-Modified: Sat, 01 Jan 2022 00:00:00 GMT
Location: /index.html
Content-Language: en-US
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
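
A response like the one above is a status line, a set of header lines, an empty line, and then the body. The following C sketch assembles a minimal response with only a few of those headers. The function name `build_response` and its parameters are illustrative assumptions, and the body is treated as a NUL-terminated string, which would not work for binary files.

```c
#include <stdio.h>
#include <string.h>

/* Writes a complete HTTP/1.1 response into `out`.
   Returns the response length, or -1 if `out` is too small. */
int build_response(char *out, size_t out_size, int status, const char *reason,
                   const char *content_type, const char *body)
{
    int n = snprintf(out, out_size,
                     "HTTP/1.1 %d %s\r\n"
                     "Content-Type: %s\r\n"
                     "Content-Length: %lu\r\n"
                     "Connection: keep-alive\r\n"
                     "\r\n"
                     "%s",
                     status, reason, content_type,
                     (unsigned long)strlen(body), body);

    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return n;
}
```

Content-Length must match the body size in bytes exactly, because on a persistent connection the client uses it to find where one response ends and the next begins.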
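  • Status Codes:
    • 1xx: Informational.
    • 2xx: Success (for example, 200 OK).
    • 3xx: Redirection (for example, 301 Moved Permanently).
    • 4xx: Client error (for example, 404 Not Found).
    • 5xx: Server error (for example, 500 Internal Server Error).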
  • HTTP/1.1:

    • Persistent Connections: Allows multiple requests and responses to be sent over a single connection.
    • Pipelining: Allows multiple requests to be sent without waiting for the responses.
    • Chunked Transfer Encoding: Allows data to be sent in chunks.
    • Content Negotiation: Allows the server to send different content based on the client's preferences.
    • Caching: Allows the client to store a copy of the response for future use.
    • Compression: Allows the server to compress the response before sending it to the client.
    • Authentication: Allows the server to require user authentication.
    • Cookies: Allows the server to store information on the client's computer.
    • Redirects: Allows the server to redirect the client to a different URL.
    • Error Handling: Allows the server to send error messages to the client.
    • Security: Allows the server to enforce security policies.
  • HTTP Headers - MDN Web Docs
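
Of the HTTP/1.1 features listed above, chunked transfer encoding has the most mechanical wire format: each chunk is its size in hexadecimal, CRLF, the data, and CRLF, and a zero-length chunk ends the body. A minimal encoder sketch in C follows. The function name `encode_chunk` is illustrative, and chunk extensions and trailers are left out.

```c
#include <stdio.h>
#include <string.h>

/* Encodes one chunk as "<hex length>\r\n<data>\r\n" into `out`
   (NUL-terminated for convenience).
   Returns the encoded length, or -1 if `out` is too small. */
int encode_chunk(char *out, size_t out_size, const char *data, size_t len)
{
    int head = snprintf(out, out_size, "%lx\r\n", (unsigned long)len);

    if (head < 0 || (size_t)head + len + 2 >= out_size)
        return -1;
    memcpy(out + head, data, len);
    memcpy(out + head + len, "\r\n", 2);
    out[head + len + 2] = '\0';
    return head + (int)len + 2;
}
```

Encoding the 13-byte string "Hello, World!" gives `d\r\nHello, World!\r\n`, and calling the function with a length of 0 gives the terminating sequence `0\r\n\r\n`.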

