diff --git a/docs_ws/src/codebase/codebase_overview.rst b/docs_ws/src/codebase/codebase_overview.rst new file mode 100644 index 0000000..39ee7c2 --- /dev/null +++ b/docs_ws/src/codebase/codebase_overview.rst @@ -0,0 +1,5 @@ +.. _codebase_overview_link: + +CodeBase Overview +================= + diff --git a/docs_ws/src/codebase/hydra_ws.rst b/docs_ws/src/codebase/hydra_ws.rst new file mode 100644 index 0000000..44b04cf --- /dev/null +++ b/docs_ws/src/codebase/hydra_ws.rst @@ -0,0 +1,7 @@ +Hydra Workspace +=============== + +.. note:: + + This workspace is primarily adapted from Hydra. + For more detailed information, please refer to the `original github repository `_ or read the `paper `_. \ No newline at end of file diff --git a/docs_ws/src/codebase/kobuki_ws.rst b/docs_ws/src/codebase/kobuki_ws.rst new file mode 100644 index 0000000..d4f34f8 --- /dev/null +++ b/docs_ws/src/codebase/kobuki_ws.rst @@ -0,0 +1,2 @@ +Kobuki Workspace +================ \ No newline at end of file diff --git a/docs_ws/src/codebase/llm_query_ws.rst b/docs_ws/src/codebase/llm_query_ws.rst new file mode 100644 index 0000000..b077263 --- /dev/null +++ b/docs_ws/src/codebase/llm_query_ws.rst @@ -0,0 +1,2 @@ +LLM Query Workspace +=================== \ No newline at end of file diff --git a/docs_ws/src/codebase/realsense_ws.rst b/docs_ws/src/codebase/realsense_ws.rst new file mode 100644 index 0000000..172acf3 --- /dev/null +++ b/docs_ws/src/codebase/realsense_ws.rst @@ -0,0 +1,2 @@ +Realsense Workspace +=================== \ No newline at end of file diff --git a/docs_ws/src/codebase/ros1_bridge_ws.rst b/docs_ws/src/codebase/ros1_bridge_ws.rst new file mode 100644 index 0000000..b3adaea --- /dev/null +++ b/docs_ws/src/codebase/ros1_bridge_ws.rst @@ -0,0 +1,2 @@ +ROS1 Bridge Workspace +===================== \ No newline at end of file diff --git a/docs_ws/src/codebase/vlplidar_ws.rst b/docs_ws/src/codebase/vlplidar_ws.rst new file mode 100644 index 0000000..2182a2d --- /dev/null +++ 
b/docs_ws/src/codebase/vlplidar_ws.rst @@ -0,0 +1,2 @@ +VLP-16 LiDAR Workspace +======================= \ No newline at end of file diff --git a/docs_ws/src/demo/demo.rst b/docs_ws/src/demo/demo.rst index 928e16e..f430a06 100644 --- a/docs_ws/src/demo/demo.rst +++ b/docs_ws/src/demo/demo.rst @@ -1,5 +1,22 @@ Demo ==== -Youtube -------- \ No newline at end of file +Gazebo +------- + +.. raw:: html + + + +Habitat Sim +------------------ + +.. raw:: html + + diff --git a/docs_ws/src/index.rst b/docs_ws/src/index.rst index e5d6c7a..a42e0ea 100644 --- a/docs_ws/src/index.rst +++ b/docs_ws/src/index.rst @@ -21,18 +21,36 @@ Links introduction/introduction introduction/architecture - introduction/how_to_use + introduction/ros2 + introduction/quick_start .. toctree:: :caption: Installation :maxdepth: 2 :numbered: - installation/installation + installation/docker + installation/devcontainer + installation/ros2 + +.. toctree:: + :caption: CodeBase + :maxdepth: 2 + :numbered: + + codebase/codebase_overview + codebase/hydra_ws + codebase/kobuki_ws + codebase/llm_query_ws + codebase/realsense_ws + codebase/ros1_bridge_ws + codebase/vlplidar_ws .. toctree:: :caption: Demo :maxdepth: 2 :numbered: - demo/demo \ No newline at end of file + demo/demo + +If you have any questions, please contact the author by email: ``yuzhong1214@gmail.com``, or create an issue on the `GitHub repository `_. diff --git a/docs_ws/src/installation/devcontainer.rst b/docs_ws/src/installation/devcontainer.rst new file mode 100644 index 0000000..11bf920 --- /dev/null +++ b/docs_ws/src/installation/devcontainer.rst @@ -0,0 +1,16 @@ +Devcontainer +============ + +.. note:: + + For more information about installation, please visit the official `devcontainer website `_. + +Press the link below to install the Devcontainer extension: + +`Install the Devcontainer extension `_ + +Or you can install it from the Visual Studio Code Marketplace: + +.. 
image:: ./images/devcontainer.png + :align: center + :alt: Devcontainer Extension diff --git a/docs_ws/src/installation/docker.rst b/docs_ws/src/installation/docker.rst new file mode 100644 index 0000000..344cc7e --- /dev/null +++ b/docs_ws/src/installation/docker.rst @@ -0,0 +1,44 @@ +Docker +====== + +.. note:: + + For more information about installation, please visit the official `docker website `_. + +To keep the code maintainable, we have packaged all required packages and dependencies into Docker and split the code into multiple workspaces based on functionality. +Normally, you only need to navigate to the desired workspace and run ``docker compose up`` to complete the entire environment setup. + +Follow these steps to install Docker and Docker Compose on your system: + + +1. **Update your package index**:: + + sudo apt-get update + +2. **Install required packages**:: + + sudo apt install apt-transport-https ca-certificates curl software-properties-common + +3. **Add the Docker GPG key**:: + + sudo install -m 0755 -d /etc/apt/keyrings + sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc + sudo chmod a+r /etc/apt/keyrings/docker.asc + +4. **Add the Docker repository**:: + + echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \ + $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \ + sudo tee /etc/apt/sources.list.d/docker.list > /dev/null + +5. **Update your package index again**:: + + sudo apt-get update + +6. **Install Docker**:: + + sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin + +7. 
**Verify the installation**:: + + sudo docker --version \ No newline at end of file diff --git a/docs_ws/src/installation/images/devcontainer.png b/docs_ws/src/installation/images/devcontainer.png new file mode 100644 index 0000000..08c6d05 Binary files /dev/null and b/docs_ws/src/installation/images/devcontainer.png differ diff --git a/docs_ws/src/installation/installation.rst b/docs_ws/src/installation/installation.rst deleted file mode 100644 index ac29989..0000000 --- a/docs_ws/src/installation/installation.rst +++ /dev/null @@ -1,8 +0,0 @@ -Docker -====== - -Setup Docker ------------- - -Setup Docker Compose --------------------- \ No newline at end of file diff --git a/docs_ws/src/installation/ros2.rst b/docs_ws/src/installation/ros2.rst new file mode 100644 index 0000000..24dc8fe --- /dev/null +++ b/docs_ws/src/installation/ros2.rst @@ -0,0 +1,46 @@ +ROS2 Humble +=========== + +.. note:: + + For more information about installation, please visit the official `ROS2 website `_. + +Please note that we have already installed ROS 2 Humble within the Docker environment. +Generally, you do not need to install ROS 2 yourself. You will only need to manually install ROS 2 in a few exceptional cases, +such as when you want to add features to our codebase but have your own Dockerfile. + +Follow these steps to install ROS 2 Humble on your system: + +1. **Set Up Your System** + Ensure that your system is up to date:: + + sudo apt update + sudo apt upgrade + sudo apt install -y software-properties-common + sudo add-apt-repository universe + +2. **Add the ROS2 GPG Key** + Download and add the GPG key for the ROS 2 repository:: + + sudo apt install -y curl + sudo curl -sSL https://raw.githubusercontent.com/ros/rosdistro/master/ros.key -o /usr/share/keyrings/ros-archive-keyring.gpg + +3. 
**Add the ROS 2 Repository** + Add the ROS 2 repository to your sources list:: + + echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/ros-archive-keyring.gpg] http://packages.ros.org/ros2/ubuntu $(. /etc/os-release && echo $UBUNTU_CODENAME) main" | sudo tee /etc/apt/sources.list.d/ros2.list > /dev/null + + +4. **Install ROS 2 Humble** + Update your package list and install the full desktop version of ROS 2 Humble:: + + sudo apt update + sudo apt upgrade + sudo apt install ros-humble-desktop + + +5. **Set Up Environment Variables** + Add the ROS 2 environment variables to your shell:: + + echo "source /opt/ros/humble/setup.bash" >> ~/.bashrc + source ~/.bashrc diff --git a/docs_ws/src/introduction/architecture.rst b/docs_ws/src/introduction/architecture.rst index 38d857f..302be96 100644 --- a/docs_ws/src/introduction/architecture.rst +++ b/docs_ws/src/introduction/architecture.rst @@ -4,5 +4,27 @@ Architecture Software -------- +.. image:: ./images/software_arch.png + :align: center + :alt: Software Architecture + +Our software architecture consists of four main modules. The low-level control module receives goal points and control signals to direct the robot's movements. +The segmentation module processes RGBD images from the robot's sensors (camera, LiDAR, odometry) and identifies different objects and regions in the environment. +The scene graph (SG) builder module constructs a 3D scene graph from the segmented data, representing spatial relationships and object attributes. +Finally, the language query module, powered by a large language model, interprets high-level instructions and converts them into goal points for the low-level controller, +enabling the robot to understand and execute complex tasks. The following sections describe our implementation in more detail. + Hardware --------- \ No newline at end of file +-------- + +.. 
image:: ./images/hardware_arch.png + :align: center + :alt: Hardware Architecture + +In our hardware architecture, various components work together to enable the robot to navigate and interact with its environment efficiently. +The AMD KR260 serves as the core component, interfacing with multiple sensors including the Realsense camera, VLP16 LiDAR, and the Kobuki wheeled robot. +The KR260 processes input from these sensors, utilizing SLAM to construct a basic 2D costmap for navigation and building a 3D scene graph from labeled images. +Due to the high computational load, a local server is employed for semantic segmentation, +sending the labeled images back to the KR260 to build a semantic 3D scene graph. The scene graph is encoded into a hierarchical YAML file, +which is then queried by a remote server hosting a Large Language Model (LLM) with goal descriptions such as "I want to go to bed." +The LLM processes these queries and returns a goal point, which the KR260 uses to navigate the robot to the desired location. diff --git a/docs_ws/src/introduction/how_to_use.rst b/docs_ws/src/introduction/how_to_use.rst deleted file mode 100644 index 2a42b56..0000000 --- a/docs_ws/src/introduction/how_to_use.rst +++ /dev/null @@ -1,7 +0,0 @@ -How to use the code -=================== - -Step By Step ------------- - -1. Clone the repository ... 
\ No newline at end of file diff --git a/docs_ws/src/introduction/images/hardware_arch.png b/docs_ws/src/introduction/images/hardware_arch.png new file mode 100644 index 0000000..3472c3b Binary files /dev/null and b/docs_ws/src/introduction/images/hardware_arch.png differ diff --git a/docs_ws/src/introduction/images/kr260_board.png b/docs_ws/src/introduction/images/kr260_board.png new file mode 100644 index 0000000..4811c81 Binary files /dev/null and b/docs_ws/src/introduction/images/kr260_board.png differ diff --git a/docs_ws/src/introduction/images/software_arch.png b/docs_ws/src/introduction/images/software_arch.png new file mode 100644 index 0000000..46a7682 Binary files /dev/null and b/docs_ws/src/introduction/images/software_arch.png differ diff --git a/docs_ws/src/introduction/introduction.rst b/docs_ws/src/introduction/introduction.rst index e987a10..29289b8 100644 --- a/docs_ws/src/introduction/introduction.rst +++ b/docs_ws/src/introduction/introduction.rst @@ -16,12 +16,34 @@ the agent is tasked to navigate to the specified goal location using its forward Pipeline +Research in 3D scene graphs (3DSG) offers valuable insights for our mapping approach. +3DSG can hierarchically organize multiple levels of semantic meanings as nodes, including floors, places, and objects. +The edges in a 3DSG can represent relationships between objects and the traversability between locations. +Additionally, the real-time 3DSG construction approach presented by Hydra suggests the feasibility of integrating 3D scene graphs into our navigation task. + +This detailed semantic structure can serve as a mental map for the agent, facilitating query-to-point reasoning. +For example, given a goal query like “I want to go to sleep, where should I go?” and the current 3D scene graph, +the LLM can guide the agent towards the appropriate point. 
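The query-to-point idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the node names, coordinates, the ``SCENE_GRAPH`` layout, and the ``build_prompt`` and ``pick_goal`` helpers are all hypothetical, and a simple keyword lookup stands in for the LLM call.

```python
"""Toy query-to-point lookup over a hierarchical scene graph (illustrative only)."""

# Hypothetical 3DSG: floor -> places -> objects, each object with an (x, y) position.
SCENE_GRAPH = {
    "floor_1": {
        "bedroom": {"bed": (4.0, 1.5), "wardrobe": (5.2, 0.3)},
        "kitchen": {"fridge": (0.5, 3.0), "table": (1.8, 2.2)},
    }
}

# Stand-in for the LLM (an assumption): maps words in the query to object labels.
KEYWORD_TO_OBJECT = {"sleep": "bed", "bed": "bed", "eat": "table", "food": "fridge"}


def build_prompt(graph, query):
    """Flatten the graph into indented text and append the user's query."""
    lines = []
    for floor, places in graph.items():
        lines.append(floor)
        for place, objects in places.items():
            lines.append(f"  {place}")
            for name, (x, y) in objects.items():
                lines.append(f"    {name}: ({x}, {y})")
    lines.append(f"Query: {query}")
    return "\n".join(lines)


def pick_goal(graph, query):
    """Return (place, object, position) for the first keyword match, else None."""
    words = query.lower().replace(",", " ").replace("?", " ").split()
    for word in words:
        target = KEYWORD_TO_OBJECT.get(word)
        if target is None:
            continue
        for places in graph.values():
            for place, objects in places.items():
                if target in objects:
                    return place, target, objects[target]
    return None


if __name__ == "__main__":
    query = "I want to go to sleep, where should I go?"
    print(build_prompt(SCENE_GRAPH, query))
    print(pick_goal(SCENE_GRAPH, query))
```

In a real pipeline the text produced by ``build_prompt`` would be sent to the language model, and the returned position would be passed on as the navigation goal; the keyword table only keeps the sketch self-contained.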
+ +There are three main components in our architecture: + +- The AMD KR260 acts as the core component: it processes the inputs from several sensors and updates the 3D scene graph. +- A local server is responsible for the semantic segmentation module. +- An LLM reasons over the 3DSG based on the given query and outputs the location the agent should go to. + +Compared to Hydra, our approach supports language queries. Unlike a traditional navigation pipeline, +which requires the user to input a specific goal, our pipeline significantly enhances the flexibility of the user interface. +Moreover, Hydra's semantic segmentation method limits its scene understanding capabilities across diverse environments. + +Initially, we aimed to address our navigation task using Reinforcement Learning (RL) methods +due to their proven performance in obstacle avoidance and their ability to learn from various inputs +(in our case, RGBD images, natural language, and the 3DSG). However, we found that RL methods are unstable during training, +largely due to the sparsity of reward signals and the noisy nature of real-world observations. Consequently, +we transitioned from RL methods to 3DSG-based methods. Features -------- -- Automatically navigate the robot to a specific goal without any high-cost sensors. -- Based on a single camera and use deep learning methods. -- Use Sim-to-Real technology to eliminate the gap between virtual environment and real world. -- Introduce Virtual guidance to entice the agent to move toward a specific direction. -- Use Reinforcement learning to avoid obstacles while driving through crowds of people. 
\ No newline at end of file +- Real-Time 3D scene graph construction +- Automatically navigate the robot to a specific goal with the help of the 3D scene graph +- Navigate to the goal location based on the user's natural language instructions \ No newline at end of file diff --git a/docs_ws/src/introduction/quick_start.rst b/docs_ws/src/introduction/quick_start.rst new file mode 100644 index 0000000..85ef464 --- /dev/null +++ b/docs_ws/src/introduction/quick_start.rst @@ -0,0 +1,7 @@ +Quick start +=========== + +Before you start, make sure you have read the :ref:`codebase_overview_link` section. + +1. Clone the repository ... + diff --git a/docs_ws/src/introduction/ros2.rst b/docs_ws/src/introduction/ros2.rst new file mode 100644 index 0000000..14f95ff --- /dev/null +++ b/docs_ws/src/introduction/ros2.rst @@ -0,0 +1,75 @@ +ROS2 +============ + +For more information about ROS2, please visit the official ROS2 website: `ROS2 Humble `_. + +What is ROS? +------------ + +ROS (Robot Operating System) is a flexible framework designed to build robot applications. +It enables developers to create complex robotics systems by providing tools and libraries for building, testing, and deploying robotic software. +As an evolution of ROS (ROS1), ROS2 introduces major improvements, focusing on scalability, real-time performance, and improved communication mechanisms. + +Why Use ROS2? +------------- + +ROS2 offers a robust platform for those developing both simple and complex robot applications. +It is especially useful in environments that require distributed systems, real-time control, or that use heterogeneous hardware. + +Key reasons to adopt ROS2 include: + +- **Cross-platform compatibility**: Works on Linux, Windows, and macOS. +- **Modularity**: Provides a component-based architecture, allowing developers to reuse existing modules and customize them for their needs. +- **Scalability**: Capable of handling large-scale distributed systems across multiple machines. 
+- **Real-time capabilities**: ROS2 is designed with real-time communication in mind, enabling more predictable and timely responses. +- **Flexible communication layers**: ROS2 supports different communication middleware, such as DDS (Data Distribution Service), making it more versatile for various applications. + +ROS2 is ideal for robotics researchers, hobbyists, and companies building robotic systems in diverse industries such as agriculture, healthcare, logistics, and manufacturing. +It is particularly well-suited for: + +- Developers building real-time systems. +- Teams working with distributed robotics applications. +- Anyone needing a scalable platform that supports multiple robots and machines working in sync. + +Core Features of ROS2 +---------------------- + +1. **Node-based Architecture** + ROS2 applications are divided into *nodes*, where each node performs a specific task. + These nodes communicate with each other using topics, services, or actions, providing a clear and maintainable structure for building complex systems. + +2. **Data Distribution Service (DDS) for Communication** + ROS2 uses DDS, a standardized middleware, to ensure reliable communication between nodes, + whether they are on the same device or distributed across multiple machines. + DDS offers better performance for real-time systems and is capable of handling high-volume data transfers, crucial for robotics applications. + +3. **Real-Time Systems** + Unlike ROS1, ROS2 can meet real-time constraints, making it suitable for applications that require fast and deterministic responses, + such as autonomous vehicles, drones, and industrial robots. + +4. **Multi-Robot Support** + ROS2 is designed to support multiple robots operating together in the same environment. + This is achieved by leveraging DDS, which allows communication across multiple networks, making it easy to scale up robot fleets. 
+ +Basic ROS2 CLI Commands +------------------------ + +To interact with ROS2, you can use several command-line interface (CLI) commands. Here are some fundamental commands to get you started: + +1. **Listing Topics** + To see all active topics: ``ros2 topic list`` + +2. **Echoing Topic Data** + To view messages published on a specific topic: ``ros2 topic echo `` + +3. **Publishing to a Topic** + To publish messages to a specific topic: ``ros2 topic pub """`` + +4. **Launching the Node by Launch File** + To start a set of nodes defined in a launch file: ``ros2 launch `` + +5. **Listing Nodes** + To see all active nodes: ``ros2 node list`` + +6. **Inspecting Node Information** + To get information about a specific node: ``ros2 node info `` \ No newline at end of file
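As a concrete illustration of the commands listed above, the sketch below assembles each one with example arguments and prints the resulting command line. The topic, message type, message data, node, package, and launch file names are hypothetical placeholders chosen for illustration, not names taken from this codebase; the ``render`` helper is likewise an assumption. The script only builds strings and does not call ``ros2`` itself.

```python
"""Assemble the ROS 2 CLI calls listed above with example arguments (illustrative only)."""
import shlex

# Hypothetical example values; substitute the names used in your own system.
TOPIC = "/cmd_vel"
MSG_TYPE = "geometry_msgs/msg/Twist"
MSG_DATA = "{linear: {x: 0.1}}"
NODE = "/example_node"
PACKAGE = "example_pkg"
LAUNCH_FILE = "example.launch.py"

COMMANDS = {
    "list_topics": ["ros2", "topic", "list"],
    "echo_topic": ["ros2", "topic", "echo", TOPIC],
    # --once publishes a single message and exits instead of publishing repeatedly.
    "publish_once": ["ros2", "topic", "pub", "--once", TOPIC, MSG_TYPE, MSG_DATA],
    "launch": ["ros2", "launch", PACKAGE, LAUNCH_FILE],
    "list_nodes": ["ros2", "node", "list"],
    "node_info": ["ros2", "node", "info", NODE],
}


def render(name):
    """Return the shell-quoted command line for one entry of COMMANDS."""
    return shlex.join(COMMANDS[name])


if __name__ == "__main__":
    for name in COMMANDS:
        print(f"{name}: {render(name)}")
```

The printed lines can be pasted into a terminal where ROS 2 has been sourced. Note that ``ros2 topic echo`` and ``ros2 launch`` keep running until interrupted with Ctrl+C.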