Add troubleshooting links to support repo
philogicae committed Oct 15, 2024
1 parent 848eb14 commit 6fd488c
Showing 5 changed files with 114 additions and 88 deletions.
35 changes: 19 additions & 16 deletions docs/index.md
@@ -1,62 +1,64 @@
# Overview

Welcome to the Aleph.im documentation. This site will provide you with all the necessary
information, resources, and tools to get started with the Aleph.im project.

## What is Aleph.im?

[Aleph.im](https://aleph.im) is an open-source off-chain P2P (peer-to-peer) network.
It offers a decentralized key-value store, file storage, function execution and virtual machine provisioning.
Interactions with the network rely on decentralized identities that are interoperable with many major blockchain networks,
such as Ethereum, Tezos, and Solana.

This enables interactions between the Aleph.im network and blockchain networks. Bridges also allow smart contracts to
interact with the Aleph.im network.

Aleph.im also provides a blockchain indexing framework, allowing developers to index data from any blockchain network
by leveraging the Aleph.im network's decentralized storage and compute capabilities.

### The Aleph.im project has the following components:

* The Aleph peer-to-peer network, comprised of [Compute Resource Nodes or CRNs](nodes/compute/index.md) and [Core Channel Nodes, or CCNs](nodes/core/index.md)
* [Python](libraries/python-sdk/index.md) and [TypeScript](libraries/typescript-sdk/index.md) SDKs to integrate Aleph.im's decentralized compute and storage solutions into your project
* A [Python command-line tool](tools/aleph-client/index.md) to interact with the Aleph.im network directly from a terminal
* A [Web Console](https://console.twentysix.cloud/) to create and manage cloud resources
* A [Node Operator Dashboard](https://account.aleph.im/)
* A [Message Explorer](https://explorer.aleph.im/)
- The Aleph peer-to-peer network, comprised of [Compute Resource Nodes or CRNs](nodes/compute/index.md) and [Core Channel Nodes, or CCNs](nodes/core/index.md)
- [Python](libraries/python-sdk/index.md) and [TypeScript](libraries/typescript-sdk/index.md) SDKs to integrate Aleph.im's decentralized compute and storage solutions into your project
- A [Python command-line tool](tools/aleph-client/index.md) to interact with the Aleph.im network directly from a terminal
- A [Web Console](https://console.twentysix.cloud/) to create and manage cloud resources
- A [Node Operator Dashboard](https://account.aleph.im/)
- A [Message Explorer](https://explorer.aleph.im/)

## The Aleph.im network

![The Aleph.im network](./network-overview.svg)

The Aleph.im network is composed of 2 sets of nodes:

* [CCNs](nodes/core/index.md), the backbone of the P2P network. They serve as an entry point into the network through an API (similar to a blockchain node's RPC).
* [CRNs](nodes/compute/index.md), responsible for the actual compute and storage available on Aleph.im. CRNs must be tied manually to a single CCN, and each CCN is incentivized to tie up to 3 CRNs.
- [CCNs](nodes/core/index.md), the backbone of the P2P network. They serve as an entry point into the network through an API (similar to a blockchain node's RPC).
- [CRNs](nodes/compute/index.md), responsible for the actual compute and storage available on Aleph.im. CRNs must be tied manually to a single CCN, and each CCN is incentivized to tie up to 3 CRNs.

### Messages

In Aleph.im terminology, a "_message_" is similar to a "_transaction_" for a blockchain: it is a set of data sent by an end user, propagated through the entire peer-to-peer network.
A message can be generated using either the [Python SDK](libraries/python-sdk/index.md) or [TypeScript SDK](./libraries/typescript-sdk/index.md), or through [aleph-client](tools/aleph-client/index.md) or the [Console](https://console.aleph.im/).

These messages can contain several different instructions, such as reading or writing [posts](libraries/python-sdk/posts/create.md), [programs/functions](computing/index.md), or [indexing data](tools/indexer/index.md) created on external blockchains.

### Payment

Aleph does not operate as a blockchain but utilizes its native cryptocurrency,
referred to as the _ALEPH_ token, which functions across various blockchains.

This token serves two primary purposes: supporting users' payments for the resources they
allocate on the network, and incentivizing node operators to maintain the network's integrity.

The first payment implementation is achieved through a staking mechanism,
where users must hold a certain amount of ALEPH tokens to use the network's resources.
This mechanism is in place for file storage and for persistent virtual machines.

In January 2024, the network started supporting a new payment model, together with the launch
of the [TwentySix Cloud](https://www.twentysix.cloud/) platform,
where users pay using streams of ALEPH tokens on compatible chains.

### Example

Let's take the example of a user who wants to run a program on the Aleph.im network:

1. The user makes sure to have an Ethereum wallet holding a sufficient number of ALEPH tokens
@@ -75,5 +77,6 @@ Let's take the example of a user who wants to run a program on the Aleph.im netw

## Community

- Found an issue? [Report it here](https://github.com/aleph-im/support/issues).
- Chat on our [Telegram group](https://t.me/alephim).
- Engage on our [Discourse Channel](https://community.aleph.im/).
6 changes: 4 additions & 2 deletions docs/libraries/typescript-sdk/troubleshooting.md
@@ -1,7 +1,5 @@
# Troubleshooting Guide

If you encounter any issues, please let us know by creating an issue on the [GitHub repository](https://github.com/aleph-im/aleph-sdk-ts/issues).

## Wagmi-compatible Web3Provider

The SDK is based on Ethers; if you are using Wagmi instead, a quick setup is required.
@@ -61,3 +59,7 @@ export default defineConfig({
plugins: [nodePolyfills()],
});
```

## Found an issue?

If the documentation didn't help, you can [report an issue](https://github.com/aleph-im/support/issues).
155 changes: 85 additions & 70 deletions docs/nodes/compute/troubleshooting.md
@@ -1,4 +1,5 @@
# Troubleshooting Guide

Setting up a [Compute Resource Node](index.md) can be a daunting task. This page is here to help you troubleshoot the most common issues.

- Be sure to back up configuration files before making changes.
@@ -7,7 +8,9 @@ Setting up a [Compute Resource Node](index.md) can be a daunting task. This page
- If you are unable to resolve the issue, then please check out the latest issues on the [Discourse Forum](https://community.aleph.im/c/node-operators/7) for support.

## 1) 404: Invalid message reference

### Issue Summary

After setting up a CRN, users may encounter a `404: Invalid message reference` error when attempting to connect to the node's diagnostic page.

### Probable Cause
@@ -18,29 +21,32 @@ After setting up a CRN, users may encounter a `404: Invalid message reference` e
### Troubleshooting Steps

1. **Recheck SSL Configuration:**

- Confirm that SSL certificates are correctly installed and configured.
- Review the SSL configuration in the web server (e.g., Caddy, Nginx) to ensure it's correctly pointing to the intended ports with the right certificate paths.

2. **Configure Hostname Correctly:**

- Ensure the hostname is properly configured as per the [CRN installation guide](./installation/debian-11.md#2-installation).
- Make sure the domain name in the supervisor.env file matches the domain used in your SSL configuration.

3. **Restart Services:**

- After updating the hostname, restart the relevant services to apply the changes.
- This may include restarting the Docker container and the web server service.

4. **Review Log Files:**

- If the problem still persists, check the log files of both the Docker container and the web server for any specific error messages related to SSL or hostname configurations.
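
The hostname check from step 2 can be scripted. The following is a minimal sketch, assuming the domain is set through an `ALEPH_VM_DOMAIN_NAME=` line in `supervisor.env` (the variable name is an assumption here; use whichever key your installation guide specifies). `configured_domain` and `domain_matches` are hypothetical helper names, not part of any Aleph.im tool.

```shell
#!/usr/bin/env bash

# Print the domain configured in a supervisor.env-style file.
# Assumption: the domain is stored as ALEPH_VM_DOMAIN_NAME=<domain>.
configured_domain() {
    local env_file="$1"
    # Keep the last matching assignment, then strip the key and any quotes.
    grep -E '^ALEPH_VM_DOMAIN_NAME=' "$env_file" | tail -n 1 | cut -d= -f2- | tr -d "\"'"
}

# Succeed only if the configured domain equals the domain used in the
# SSL configuration of the web server.
domain_matches() {
    local env_file="$1" expected="$2"
    [ "$(configured_domain "$env_file")" = "$expected" ]
}
```

For example, `domain_matches /etc/aleph-vm/supervisor.env crn.example.org || echo "hostname mismatch"` prints a warning when the two values differ (`crn.example.org` is a placeholder).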

## 2) SQUASHFS Errors in Diagnostic VM

### Issue Summary

Users may encounter SQUASHFS errors indicating a failure to decompress data, suggesting possible corruption of the runtime diagnostic VM.

#### Symptoms

Repeated SQUASHFS errors in the logs such as

- `Failed to read block`
@@ -50,6 +56,7 @@ Repeated SQUASHFS errors in the logs such as
related to a specific block.

### Probable Cause

The runtime of the new diagnostic VM appears to be improperly downloaded or corrupted.

### Troubleshooting Steps
@@ -58,23 +65,24 @@ The runtime of the new diagnostic VM appears to be improperly downloaded or corr

2. **Clear Cache**: Remove the cached copy of the problematic file using the diagnostic VM hash. This can be done by deleting the file located at `/var/cache/aleph/vm/runtime/$RUNTIME_HASH`.

- Navigate to the cache directory: `cd /var/cache/aleph/vm/runtime/`.
- Locate the file with the corresponding `$RUNTIME_HASH`.
- Remove the file:

```shell
sudo rm -f $RUNTIME_HASH
```

3. **Restart Supervisor**: After deleting the problematic file, restart the supervisor system. This should trigger the re-download of the runtime file.

- Restart the supervisor: `sudo systemctl restart supervisor` (or `aleph-vm-supervisor.service` when installing from source).

4. **Re-download**: Upon restart, the system will automatically attempt to re-download the runtime, replacing the corrupted file.

- If the problem persists, further investigation into network stability or hardware integrity may be necessary.
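
The cache-clearing step above can be wrapped in a small helper. This is a sketch only: `clear_runtime_cache` is a hypothetical function name, and the default directory is the `/var/cache/aleph/vm/runtime/` path mentioned in step 2. The second argument exists so the function can be tried against a scratch directory first.

```shell
#!/usr/bin/env bash

# Remove a cached runtime file so the supervisor re-downloads it on restart.
# Usage: clear_runtime_cache <runtime_hash> [cache_dir]
clear_runtime_cache() {
    local runtime_hash="$1"
    local cache_dir="${2:-/var/cache/aleph/vm/runtime}"

    # Refuse an empty hash or one containing a slash, so nothing outside
    # the cache directory can be deleted by mistake.
    case "$runtime_hash" in
        ""|*/*) echo "invalid runtime hash" >&2; return 2 ;;
    esac

    if [ -e "$cache_dir/$runtime_hash" ]; then
        rm -f -- "$cache_dir/$runtime_hash"
        echo "removed $cache_dir/$runtime_hash"
    else
        echo "not cached: $cache_dir/$runtime_hash"
    fi
}
```

Run it from a root shell on the node, then restart the supervisor as described in step 3.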

## 3) Missing Diagnostic VM Metrics

### Issue Summary

The `diagnostic_vm_latency` metrics data is missing for your CRN, even though virtualization is reportedly operational.
@@ -100,32 +108,32 @@ Check that both work on your node, on an URL similar to

1. **Upgrade Node Software:**

- Ensure the node is running the latest CRN version.

2. **Disable IPv6 Forwarding:**

- If upgrading does not resolve the issue, try disabling IPv6 forwarding:
- Set `ALEPH_VM_IPV6_FORWARDING_ENABLED=False` in `/etc/aleph-vm/supervisor.env`.
- Manually check if IPv6 forwarding is still active:
```shell
cat /proc/sys/net/ipv6/conf/all/forwarding
```
If the output is 1, disable it with:
```shell
echo 0 > /proc/sys/net/ipv6/conf/all/forwarding
```

3. **Clear Cache:**

- See [SQUASHFS Errors in running diagnostic VM](#2-squashfs-errors-in-diagnostic-vm).

4. **Contact Cloud Provider:**

- If the issue persists, ask your Cloud Provider:
"I tried to enable IPv6 forwarding on my server. This makes my machine unreachable over IPv6. Why is that?"
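
The manual check in step 2 can be turned into a readable status line. The helper below is a sketch; `ipv6_forwarding_state` is a hypothetical name, and the optional argument is there only so the function can be tried on a test file instead of the kernel's flag file.

```shell
#!/usr/bin/env bash

# Report whether IPv6 forwarding is enabled, based on the kernel's flag file.
# Usage: ipv6_forwarding_state [flag_file]
ipv6_forwarding_state() {
    local flag_file="${1:-/proc/sys/net/ipv6/conf/all/forwarding}"
    local value
    value="$(cat "$flag_file")" || return 2
    if [ "$value" = "0" ]; then
        echo "disabled"
    else
        echo "enabled"
    fi
}
```

Note that writing to the flag file requires root privileges. From a non-root shell, `sudo echo 0 > ...` does not work because the redirection is performed by the unprivileged shell; `echo 0 | sudo tee /proc/sys/net/ipv6/conf/all/forwarding` is a common alternative.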

## 4) IPv6 Unreachable

### Issue Summary

When using IPv6 on a node, the network is unreachable.
@@ -145,40 +153,43 @@ When using IPv6 on a node, the network is unreachable.

1. **Check IPv6 Configuration:**

- Ensure that IPv6 is enabled on the network interface.
- Verify that the IPv6 address is correctly assigned to the interface.
- Confirm that the gateway for IPv6 is set up correctly.

2. **Review Netplan Configuration (for Ubuntu systems):**

- Open the Netplan configuration file, typically located at /etc/netplan/\*.yaml.
- Check for proper syntax and settings for IPv6, including address, gateway, and nameservers.
- Example of a Netplan configuration for IPv6:

```yaml
network:
  version: 2
  ethernets:
    eth0:
      dhcp4: no
      dhcp6: no
      addresses:
        - "2602:2940:0:1f::2/64"
      gateway6: "2602:2940:0:1f::1"
      nameservers:
        addresses: ["2001:4860:4860::8888", "2001:4860:4860::8844"]
```

After making changes, apply them with `sudo netplan apply`.

3. **Check Network Interface:**

- Use `ip -6 addr show` to check if the IPv6 address is assigned to the network interface.
- Use `ip -6 route show` to verify the default route for IPv6.

4. **Test Network Connectivity:**

- Use `ping6` to ping the local IPv6 gateway or known IPv6 addresses like Google's DNS `2001:4860:4860::8888` to test connectivity.
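
The route check from step 3 can be automated by filtering the output of `ip -6 route show`. This is a minimal sketch: `has_default_ipv6_route` is a hypothetical helper that reads the routing table on standard input, so it can also be tried on saved output.

```shell
#!/usr/bin/env bash

# Read `ip -6 route show` output on stdin and succeed if it contains
# a default route (a line starting with the word "default").
has_default_ipv6_route() {
    grep -qE '^default( |$)'
}
```

For example, `ip -6 route show | has_default_ipv6_route || echo "no default IPv6 route"` prints a warning when the gateway is missing from the routing table.
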
## 5) Persistent Storage Corruption
### Issue Summary
A Compute Resource Node exhibits issues with the `persistent_storage` feature.
@@ -193,39 +204,43 @@ A Compute Resource Node exhibits issues with the `persistent_storage` feature.
The diagnostic VM tests the capability of the VM to persist data on the host. This is done by incrementing a counter in a JSON file, itself stored in a persistent volume.
When a diagnostic virtual machine happens to be stopped while writing data to this file, it is possible to end up with a corrupt file that, for example, only contains part of the expected JSON data and cannot be parsed.
### Troubleshooting Steps
1. **Identify Corrupted Volumes:**
- Find the identifiers of the two diagnostic VMs in the variables `CHECK_FASTAPI_VM_ID` and `LEGACY_CHECK_FASTAPI_VM_ID` in the [configuration of aleph-vm](https://github.com/aleph-im/aleph-vm/blob/main/src/aleph/vm/conf.py#L292-L293).
2. **Stop the service:**
- Stop the service to avoid any further corruption:
```shell
sudo systemctl stop aleph-vm-supervisor.service
```
3. **Remove Corrupted Volumes:**
- Remove the corrupted files. Here are the commands to remove the identified corrupted volumes:
```shell
sudo rm /var/lib/aleph/vm/volumes/persistent/63faf8b5db1cf8d965e6a464a0cb8062af8e7df131729e48738342d956f29ace/increment-storage.ext4
sudo rm /var/lib/aleph/vm/volumes/persistent/67705389842a0a1b95eaa408b009741027964edc805997475e95c505d642edd8/increment-storage.ext4
```
4. **Restart Services:**
- After removing the corrupted volume files, restart the affected services to trigger the recreation of the necessary storage files:
```shell
sudo systemctl restart aleph-vm-supervisor.service
```
5. **Verify System Stability:**
- Check the dashboard on the index page of the CRN, or open the storage test endpoint of both VMs:
```
https://$YOUR_CRN_HOSTNAME/vm/$CHECK_FASTAPI_VM_ID/state/increment
```
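
The verification step can be repeated for both diagnostic VMs with a short loop. The sketch below assumes the identifiers are those found in step 1 and that an HTTP error status (400 or above) means the storage check failed; `increment_url` and `check_increment` are hypothetical helper names.

```shell
#!/usr/bin/env bash

# Build the storage test URL for one diagnostic VM on a given CRN.
increment_url() {
    local crn_hostname="$1" vm_id="$2"
    printf 'https://%s/vm/%s/state/increment\n' "$crn_hostname" "$vm_id"
}

# Query the endpoint for each VM identifier and report the outcome.
# Assumption: curl's --fail exit status is a good enough signal here.
check_increment() {
    local crn_hostname="$1"; shift
    local vm_id
    for vm_id in "$@"; do
        if curl --fail --silent --show-error --max-time 30 \
            "$(increment_url "$crn_hostname" "$vm_id")" > /dev/null; then
            echo "ok: $vm_id"
        else
            echo "FAILED: $vm_id"
        fi
    done
}
```

A call such as `check_increment "$YOUR_CRN_HOSTNAME" "$CHECK_FASTAPI_VM_ID" "$LEGACY_CHECK_FASTAPI_VM_ID"` then prints one line per VM.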
## Found an issue?
If the documentation didn't help, you can [report an issue](https://github.com/aleph-im/support/issues).
5 changes: 5 additions & 0 deletions docs/tools/aleph-client/troubleshooting.md
@@ -0,0 +1,5 @@
# Troubleshooting Guide

## Found an issue?

If the documentation didn't help, you can [report an issue](https://github.com/aleph-im/support/issues).
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -89,6 +89,7 @@ nav:
- 'Aleph CLI':
  - 'Getting started': tools/aleph-client/index.md
  - 'Usage': tools/aleph-client/usage.md
  - 'Troubleshooting': tools/aleph-client/troubleshooting.md
- 'Web Console':
  - 'Getting started': tools/webconsole/index.md
  - 'Upload': tools/webconsole/upload.md