Mcdd-Big-Data-Study

Study project for big data (Hadoop, Zookeeper, Kafka, Flink, Spark)


Features ✨

Supported Technologies:

  • Hadoop 3.3.6 (with JDK 8.0.352-zulu, Maven 3.6.3)
  • Zookeeper 3.9.2
  • Kafka 3.7.1 (Scala 2.12 build, kafka_2.12-3.7.1)

Installation 📦

  1. Clone the repository:
    git clone https://github.com/mcddhub/mcdd-big-data-study.git --depth=1 && cd mcdd-big-data-study
  2. Build the Docker image, replacing x.x.x with the appropriate version number:
    cd docker
    docker build -t caobaoqi1029/big-data-study:x.x.x .
  3. Start the containers:
    docker compose up -d
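
Before moving on to configuration, it can help to confirm that the container is actually running. The following is a minimal sketch, not part of the repository: the function name `wait_for_container`, the retry count, and the one-second interval are illustrative assumptions. The container name `master` is taken from the `docker exec` command used later in this README.

```shell
#!/usr/bin/env bash
# Sketch: poll until a named container reports State.Running == true.
# wait_for_container is a hypothetical helper, not something the project ships.

wait_for_container() {
  local name="$1"
  local retries="${2:-10}"   # assumed default: 10 checks, one second apart
  local i=0
  local state

  while [ "$i" -lt "$retries" ]; do
    # docker inspect -f prints "true" while the container is running.
    state="$(docker inspect -f '{{.State.Running}}' "$name" 2>/dev/null)"
    if [ "$state" = "true" ]; then
      echo "$name is running"
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done

  echo "$name did not start after $retries checks" >&2
  return 1
}
```

A possible use is `wait_for_container master && docker exec -it master bash`, which opens a shell only once the container is up. `docker compose ps` gives a quick overview of all services if a one-off check is enough.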

Configuration 🛠

  1. Connect to the remote server with VS Code and attach to a running container.
  2. Install the Java Dev extension in VS Code.
  3. Restart the extension host to apply the changes.
  4. Initialize the Hadoop environment by formatting the NameNode. Do this only on first setup, because formatting erases existing HDFS metadata:
    docker exec -it master bash
    hdfs namenode -format
  5. Start the Hadoop services:
    start-all.sh
  6. Create an input file, upload it to the HDFS root, and confirm that it is there:
    vim input.txt
    hdfs dfs -put -f ./input.txt /
    hdfs dfs -ls /
  7. Build and run the Hadoop job:
    mvn clean package
    cd target/
    hadoop jar big-data.jar

    Tip: To run compiled classes directly with java, add the directory that contains them (here /tmp/) to the CLASSPATH environment variable:

    export CLASSPATH=$CLASSPATH:/tmp/
    # Add this to .bashrc for persistence.

  8. View the output:
    hdfs dfs -ls /output
    hdfs dfs -cat /output/part-r-00000
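
The upload, run, and view steps above can be chained into one helper when iterating on the job. This is a hedged sketch: `run_job` is a hypothetical name, and it assumes the job writes to `/output` through Hadoop's standard `FileOutputFormat`, which refuses to start when the output directory already exists. That is why the sketch removes the directory first. It also assumes a single reducer, so the result lands in `part-r-00000`.

```shell
#!/usr/bin/env bash
# Sketch: upload the input, clear stale output, run the jar, print the result.
# run_job and its defaults are illustrative; paths mirror the steps above.

run_job() {
  local input="${1:-./input.txt}"
  local jar="${2:-big-data.jar}"
  local out_dir="${3:-/output}"

  # Overwrite any previous copy of the input in the HDFS root.
  hdfs dfs -put -f "$input" / || return 1

  # Assumption: the job fails if the output directory exists, so remove it.
  if hdfs dfs -test -e "$out_dir"; then
    hdfs dfs -rm -r -f "$out_dir" || return 1
  fi

  hadoop jar "$jar" || return 1

  # Assumption: one reducer, so there is a single part-r-00000 file.
  hdfs dfs -cat "$out_dir/part-r-00000"
}
```

For example, `run_job ./input.txt target/big-data.jar /output` could be run inside the `master` container from the project directory after `mvn clean package`. Note that the helper deletes the output directory without asking, so point it only at paths that are safe to discard.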

Contributing 🤝

We welcome contributions! Feel free to submit a pull request. For more details, see the Contribution Guide.

Thanks to all contributors!

License 📄

This project is licensed under the MIT License. See the LICENSE file for details.


Support 💖

If you find this project helpful, consider giving it a ⭐️ on GitHub!

