Fabric Node Setup
Intro
Eluvio Fabric Nodes are systems that run two pieces of software: elvmasterd and qfab. The elvmasterd process is the blockchain component of the Fabric Node. Transactions and information held on the blockchain are processed through a local copy of the chain. Through it, the qfab process obtains information about the parts that make up the content fabric. Both software components are expected to run on the same hardware. While not strictly necessary, it is common to proxy access to both services through a web proxy. This document describes how to use nginx for this, but any reverse proxy, such as haproxy, can also be used.
Server Requirements
Detailed documentation of fabric node specs is provided here.
While a cloud node configuration is possible, it may be expensive to run long-term. The basic specs are:
- 32 physical CPU Cores (Intel/AMD)
- 256GB Ram
- 0-8TB of NVMe storage (depending on node type – explained below)
- 50TB-100TB of long-term storage (depending on node type – explained below)
- Video encode/decode offload hardware – not necessary, but recommended
- 1-25Gbps Networking to the Public Internet (depending on node type – explained below)
With these resources, the fabric node can accommodate ingest, video processing, and the serving of high volumes of web traffic.
Currently, the tested and recommended operating system is Ubuntu 18.04. Ubuntu 20.04 and other distributions such as RedHat/CentOS are likely to work but have not been tested. Work is underway to validate on Ubuntu 22.04.
Hardening and Security Considerations
Full server hardening is not in the scope of this document since each organization has their own processes, tools, and baseline configurations. The following is meant to illustrate the best practices used in a typical Fabric Node deployment. The hope is to provide information that can be layered on top of participant-specific hardening.
A Fabric Node should run a server-side firewall. This firewall should allow incoming traffic on the common web ports 80/TCP (HTTP) and 443/TCP (HTTPS). The HTTP port is supported only for HTTP 301 redirection to the HTTPS port.
Fabric Nodes also need to allow Ethereum wire protocol access. The port is configurable at install time, but it is typically 40304 on both TCP and UDP.
Remote access is up to the organization setting up the Fabric Node. Remote access over SSH may be needed from time to time. It is recommended that password auth for SSH is disabled at a minimum.
Types of Node
Fabric Nodes can be deployed in one of two configurations. The basic software configuration is the same between both, but the hardware requirements and the network access needs are different. The two deployment options are:
- Full Fabric nodes need public IP access from the entire Internet on HTTP and HTTPS. While malicious use can be safeguarded against, the general idea is that all fabric users can access a node for content serving. They can also fulfill publishing tasks.
- Publish Only Fabric nodes need to be visible to the public IP addresses of other fabric nodes, but not broadly accessible to public traffic.
Network Configuration
Fabric Nodes are public systems that other blockchain participants may connect to. While a server may reject a connection due to load, in practice the servers can accept traffic from any number of hosts via the Ethereum protocol. Because of this, limiting connectivity to the Ethereum ports is not advised, even when the node is set up as a Publish Only node, because doing so undermines the spirit and connectedness of the blockchain.
Given the previous notes on hardening and the ports used, firewall rules should allow the following inbound:
- 80/TCP from the entire Internet
- 443/TCP from the entire Internet
- 40304/TCP from the entire Internet
- 40304/UDP from the entire Internet
Outbound, these ports should be allowed egress if egress filtering is desired, but note that all nodes define their own ports. Egress filtering will prevent connectivity to nodes that do not use the above ports. Some of the original nodes on the Eluvio blockchain opted to use different ports (like 40303 and 30303).
Network Configuration Notes by Type
Full Fabric Node
A Full Fabric Node is expected to deal with part distribution and traffic serving. So, in addition to the Ethereum ports listed above, the HTTP/S ports need to be open to the entire Internet. It is expected that Full Fabric Nodes will block traffic from known bad IPs and IP blocks, but as a rule, a Full Fabric Node will service traffic from any legitimate user on the Internet. Put another way, access to ports 80 and 443 is restricted only by a blacklist.
Publish Only Fabric Node
A Publish Only Fabric Node only needs to communicate with other Fabric nodes to ensure parts are distributed across the content fabric. Access to the HTTP/S ports can be limited to a list of known IPs: the set comprising Full and Publish Only nodes. Eluvio can provide a list of IPs maintained and operated by Eluvio and other partners that run Fabric Nodes. Put another way, access to ports 80 and 443 is restricted to a whitelist of Fabric IPs.
Tuning
Content Fabric nodes experience various workloads. Of the workloads that a node may perform, content serving necessitates kernel and network tuning. Detailed information will be provided in the Tuning Guide, but generally the tuning is split into 3 types:
- Power State tuning: Depending on the processor used, it may be useful to enable or disable power state settings to ensure the system can be used fully. The specifics are dependent on the processor platform. The tuning guide details the steps Eluvio has taken on Intel hardware.
- Network tuning: Network tuning is intended to prepare nodes for serving large numbers of connections over a WAN. To do so, TCP parameters are increased and the BBR congestion algorithm is set to take advantage of the latest in TCP/IP networking in the Linux kernel.
- Process Limit tuning: The qfab process will need to support a large number of connections. This will force the process to use more than the default number of file descriptors. This limit is increased to support the expected load a server may encounter.
In all three cases, if a Node is Publish Only, these tuning steps should be considered optional.
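As a minimal sketch of the network and process-limit items above, the following shell functions print a sysctl fragment that selects BBR and report the current file descriptor limit. Pairing BBR with the fq queueing discipline, the function names, and the file name in the usage comment are assumptions made for illustration; the Tuning Guide remains the reference for the values Eluvio actually uses.

```shell
# Print a sysctl fragment that selects the BBR congestion control algorithm.
# Pairing BBR with the fq qdisc is a common convention and is assumed here.
bbr_sysctl_fragment() {
  printf '%s\n' \
    'net.core.default_qdisc = fq' \
    'net.ipv4.tcp_congestion_control = bbr'
}

# Report the soft limit on open file descriptors for the current shell.
current_fd_limit() {
  ulimit -n
}

# Example (review the output before applying it as root; the file name
# below is hypothetical):
#   bbr_sysctl_fragment | sudo tee /etc/sysctl.d/90-fabric-bbr.conf
#   sudo sysctl --system
```

Printing the fragment rather than writing it directly keeps the step reviewable before anything on the host changes.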
Software Installation
Dependencies
The A/V libraries used by avpipe are mostly packaged with qfab, but some libraries are taken from the stock OS via the following installed dependencies.
sudo apt-get update
sudo apt-get install -y zlib1g libc6 libtinfo5 libstdc++6 libgcc1 libmp3lame0 libx265-dev libnuma-dev libxvidcore4 libva2 libxv1 libx11-6 libxext6 libxcb1 libxcb-shm0 libxcb-shape0 libxcb-xfixes0 libasound2 libsdl2-2.0-0 libsndio6.1 libfribidi0 libva2 ocl-icd-libopencl1 libfontconfig1 libfreetype6 libva2 libva-drm2 libva-x11-2 libx11-6 libvdpau1 ocl-icd-libopencl1 software-properties-common
NetInt/Codensity
TBD
NVidia and CUDA
If the node platform uses NVidia hardware, the following packages should be installed. The build toolchain is on NVidia 440 and CUDA 10.2.
Plans are underway to move to newer NVidia and CUDA versions.
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get install -y libnvidia-cfg1-440 libnvidia-common-440 libnvidia-compute-440 libnvidia-decode-440 libnvidia-encode-440 libnvidia-fbc1-440 libnvidia-gl-440 libnvidia-ifr1-440 nvidia-compute-utils-440 nvidia-dkms-440 nvidia-kernel-common-440 nvidia-kernel-source-440 nvidia-utils-440
sudo apt-get purge -y nvidia-driver-440 xorg-video-nvidia-440 libnvidia-extra-440
sudo apt-key del 7fa2af80
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub
curl -L -O https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo add-apt-repository "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
sudo apt-get update
sudo apt-get install -y freeglut3 freeglut3-dev libxi-dev libxmu-dev cuda-compiler-10-2 cuda-libraries-10-2 cuda-libraries-dev-10-2 cuda-npp-10-2 cuda-npp-dev-10-2
elvmasterd
The elvmasterd binary is a standalone binary that can run on an Ubuntu 18.04 system. To use it, load the binary onto the system and run it with a valid config.
The following instructions will allow for a globally available elvmasterd by installing into /usr/local/bin. If the files are placed elsewhere, the bin directory should be set in PATH.
MISSING COMMAND HERE
Note: An elvmasterd installer script is in beta, and the above steps will be deprecated over time. Click this link to download the elvmasterd installation script.
qfab
qfab is packaged as a binary plus custom A/V libraries that support avpipe. It is built on an Ubuntu 18.04 system and is currently only expected to work on that platform.
The following instructions will allow for a globally available qfab by installing the binary into /usr/local/bin and the libraries into /usr/local/lib. If the files are placed elsewhere, the bin directory should be set in PATH, and ldconfig or LD_LIBRARY_PATH will need to be set accordingly.
MISSING COMMAND HERE
The libraries that are installed from the tarball are placed in /usr/local/lib (based on these instructions). The build system makes the files user readable, but to make use of them system-wide from /usr/local/lib, they need to be changed to 0744 permissions. The following command will work, provided no other libraries in /usr/local/lib need different permissions. On a barebones Ubuntu system, this directory is mostly empty of lib* files:
sudo chmod 744 /usr/local/lib/lib*
NOTE: A qfab installer script is in beta, and the above steps will be deprecated over time. Click this link to download the qfab installation script.
Users and Storage
This document presumes that a user is created for the purpose of running both daemons. This user can be called anything, and the details of the user permissions can be altered based on site-specific security guidelines. It is documented here only to facilitate the rest of the installation. It is also fine to run each daemon as a different OS user, but the tweaks needed to support this are not covered here.
User
Create an eluvio user:
sudo useradd -c "Eluvio Fabric Node user" -s /sbin/nologin eluvio
And create locations for configuration and logs. Feel free to change these to conform to policies. Logs should be placed on a different storage volume than the blockchain storage.
sudo mkdir /etc/elvmasterd && sudo chown eluvio:eluvio /etc/elvmasterd
sudo mkdir -p /var/elvmasterd/log && sudo chown -R eluvio:eluvio /var/elvmasterd
sudo mkdir /etc/qfab && sudo chown eluvio:eluvio /etc/qfab
sudo mkdir -p /var/qfab/log && sudo chown -R eluvio:eluvio /var/qfab
NOTE: Why /var/elvmasterd/log and /var/qfab/log and not somewhere in /var/log? These logs can be chatty, especially if set to DEBUG, so this document presumes a separate volume designed for them. Logging can be turned down and the logs can be moved.
Storage
There are 3 types of storage that need to be set up. In all cases, the storage volume needs to be owned by eluvio.
The 3 types of storage are:
- elvmasterd chain data
- qfab part data
- qfab serving cache
The chain and part data storage volumes are present in Full Fabric and Publish Only Nodes. The cache volume is only needed on a Full Fabric Node.
elvmasterd chain data
The chain data can be stored on a 200GB volume that is optimized for speed. Placing this on an NVMe disk or RAID0 volume is advised. It does not need to be reliable, since the blockchain data can always be reconstituted by peers on the network. This setup document will use /var/elvmasterd/data for the chain data.
sudo chown eluvio:eluvio /var/elvmasterd/data
qfab part data
By convention, the qfab part data is placed in a directory called QDATA (all-caps). This will be the name used here, but it is not a hard requirement.
The space requirements for the part storage are governed by the type of node:
- Full Fabric Node: 100TB
- Publish Only Fabric Node: 50TB
In both cases, the type of disk-layout/volume used should prioritize in the following order:
- raw space
- speed
- reliability
A simple RAID0 meets this requirement well. Like the chain data, reliability is not prioritized because the parts can be retrieved from peers. Having the ability to store data and access it quickly is more important. “Spinning Disk” is suitable in this application. A “reference” system at Eluvio uses 8x 14TB disks in a RAID0. The file-system used can be ext4, zfs, etc.
This setup document will use /var/qfab/QDATA for the part data.
sudo chown eluvio:eluvio /var/qfab/QDATA
qfab serving cache
By convention, the qfab data directories are all-caps. This document will continue this convention, but it is not a hard requirement.
The space requirements for the cache storage are governed by the type of node:
- Full Fabric Node: 8-10TB
- Publish Only Fabric Node: Not needed
The type of disk-layout/volume used should prioritize in the following order:
- speed
- raw space
- reliability
A simple RAID0 of NVMe meets this requirement well. Cache is transient and does not need to persist. NVMe is a perfect candidate for this over spinning disks. A “reference” system at Eluvio uses 2x 4TB disks in a RAID0. The file-system used can be ext4, zfs, etc.
This setup document will use /var/qfab/CACHE for the serving cache.
sudo chown eluvio:eluvio /var/qfab/CACHE
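Since every storage volume must be owned by the service user, a quick check after mounting can catch mistakes early. The following is a minimal sketch, assuming GNU stat as shipped with Ubuntu; the function names owned_by and check_storage are hypothetical helpers, not part of any Eluvio tooling.

```shell
# Return success if the directory $1 exists and is owned by user $2.
# Relies on GNU stat (-c), which is what Ubuntu ships.
owned_by() {
  [ -d "$1" ] && [ "$(stat -c '%U' "$1")" = "$2" ]
}

# Check a list of storage directories against the expected owner.
# Usage: check_storage USER DIR...
check_storage() {
  expected_user=$1
  shift
  status=0
  for storage_dir in "$@"; do
    if owned_by "$storage_dir" "$expected_user"; then
      echo "ok: $storage_dir"
    else
      echo "missing or not owned by $expected_user: $storage_dir"
      status=1
    fi
  done
  return "$status"
}

# Example, using the paths from this document:
#   check_storage eluvio /var/elvmasterd/data /var/qfab/QDATA /var/qfab/CACHE
```

On a Publish Only node the CACHE path would simply be left out of the list.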
Network Configuration Validation
Presuming an Ubuntu system with a server/host based firewall enabled, which is ufw by default, run the following:
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw allow 40304/tcp
sudo ufw allow 40304/udp
Specifics for setting the firewall for a Publish Only node are implementation dependent.
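One possible shape for a Publish Only whitelist is sketched below. It assumes ufw is the firewall in use and that the list of Fabric IPs has been saved one address or CIDR block per line; the function name publish_only_rules and the file name fabric-ips.txt are hypothetical. The function only prints the commands so they can be reviewed before being run.

```shell
# Print ufw commands that admit HTTP/HTTPS only from the addresses on stdin
# (one IP or CIDR block per line; blank lines and # comments are skipped).
publish_only_rules() {
  while IFS= read -r ip; do
    case $ip in
      ''|'#'*) continue ;;
    esac
    for port in 80 443; do
      printf 'ufw allow from %s to any port %s proto tcp\n' "$ip" "$port"
    done
  done
}

# Example (fabric-ips.txt is a hypothetical file holding the Fabric IP list):
#   publish_only_rules < fabric-ips.txt          # review the output
#   publish_only_rules < fabric-ips.txt | sudo sh
```

With this approach the blanket 80/tcp and 443/tcp rules would be omitted, while the 40304 rules stay open to the entire Internet as described in Network Configuration.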
Node Partition
Warning
Partition selection is important for the balance of the Content Fabric. Work with Eluvio to ensure the Node ID maps to a partition that ensures balance of part distribution.
The Content Fabric stores data as a collection of parts. Those parts are generated by a node, and they live on the node that created them and on all nodes in the same partition as that node.
The Content Fabric is currently at partition level 4, which translates into 16 partitions. In the future, as storage pressure grows, the fabric can be split into a level 5 partition scheme, equating to 32 partitions. Because of this, it is important to know how partitions are calculated. This information can be used to determine the partition a node address is in.
Tools exist to determine this, but this information is used to illustrate the principles.
The partition is calculated by taking the binary representation of the first bytes of the blockchain address of a node. The partition level indicates how many leading bits of the address are used to calculate the partition. For example:
Given a3257a963c872bc55def63152698ddecf48ec0cc, the first 3 hex digits (a32) translate to 101000110010. The resulting partitions based on partition level:
level | partition | number of partitions |
---|---|---|
1 | 1 | 2 |
2 | 10 | 4 |
3 | 101 | 8 |
4 | 1010 | 16 |
5 | 10100 | 32 |
At lower partition levels, nodes may be in the same partition. When the partition level increases, they may stay in the same partition or end up in different ones. Balancing is done to ensure no single partition lacks enough nodes to provide data storage reliability and availability. Balancing takes into account the current balance and the impact on the balance of the network when a level is increased.
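The calculation described above can be sketched in a few lines of shell. This is an illustration of the principle only, not the tool Eluvio provides; the function name partition_of is hypothetical, and it assumes the partition is simply the leading bits of the hex address, as in the worked example.

```shell
# Print the partition (as a bit string) of a node address at a given level.
# Usage: partition_of ADDRESS LEVEL
partition_of() {
  addr=${1#0x}                      # tolerate an optional 0x prefix
  level=$2
  ndigits=$(( (level + 3) / 4 ))    # hex digits needed to cover LEVEL bits
  prefix=$(printf '%s' "$addr" | cut -c1-"$ndigits")
  bits=
  while [ -n "$prefix" ]; do
    rest=${prefix#?}
    digit=${prefix%"$rest"}         # first remaining hex digit
    case $digit in
      0) bits=${bits}0000 ;;   1) bits=${bits}0001 ;;
      2) bits=${bits}0010 ;;   3) bits=${bits}0011 ;;
      4) bits=${bits}0100 ;;   5) bits=${bits}0101 ;;
      6) bits=${bits}0110 ;;   7) bits=${bits}0111 ;;
      8) bits=${bits}1000 ;;   9) bits=${bits}1001 ;;
      a|A) bits=${bits}1010 ;; b|B) bits=${bits}1011 ;;
      c|C) bits=${bits}1100 ;; d|D) bits=${bits}1101 ;;
      e|E) bits=${bits}1110 ;; f|F) bits=${bits}1111 ;;
      *) echo "not a hex digit: $digit" >&2; return 1 ;;
    esac
    prefix=$rest
  done
  # Keep only the leading LEVEL bits.
  printf '%s\n' "$bits" | cut -c1-"$level"
}

# Walk the example address through partition levels 1 to 5.
for lvl in 1 2 3 4 5; do
  echo "level $lvl: $(partition_of a3257a963c872bc55def63152698ddecf48ec0cc "$lvl")"
done
```

Two addresses share a partition at a given level exactly when this function prints the same bit string for both, which is why nodes that share a partition at level 4 may separate at level 5.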
Initial Configuration
To use elvmasterd and qfab, a configuration is needed. Barebones configurations are available in the example configs section of this documentation. Note that each config file is versioned. The versions are used to ensure that end-users know when config files need to change to support new functionality, breaking changes, or deprecations. The following configuration snippets highlight sections that should be changed within the sample.
Use those docs to customize the config files. Once the configurations are set up in accordance with your environment, follow the steps below to finish the qfab and elvmasterd setup. Once elvmasterd and qfab are set up, the last step is to set up the nginx reverse proxy.
Joining to Network and Syncing the local copy
Once a wallet is created and the node_account value is set in the config file, the necessary parameters are set to join the network.
The first step is to initialize the database:
/usr/local/bin/elvmasterd init --config /etc/elvmasterd/config.toml
This step creates the database used to store the blockchain. Once the database has been initialized, the next step is to start the daemon. This can be done in a terminal window with the command below. If done this way, before the service has been daemon-ized within systemd, the daemon will run in the foreground of the terminal. The sync can take a day or longer, and the daemon then stays connected. Interrupting the sync because a terminal window closes is not ideal, so do this only to troubleshoot a config or installation issue:
/usr/local/bin/elvmasterd start --config /etc/elvmasterd/config.toml
qfab will not work until the chain is nearly in sync. Currently sync times are 12 hours or more.
Daemon-izing the Services
To set the services to run as persistent daemons, use the systemd units in the configuration section.
Copy the samples provided into files where systemd will use them, such as /etc/systemd/system/elvmasterd.service and /etc/systemd/system/qfab.service. Edit these files to conform with changes made in the local environment.
Once these files are in place, the services need to be loaded, enabled, and started:
sudo systemctl daemon-reload
sudo systemctl enable elvmasterd
sudo systemctl enable qfab
sudo systemctl start elvmasterd
sudo systemctl start qfab
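After starting the services, it can be useful to confirm that both units are actually running. The sketch below wraps systemctl is-active; the function name units_active and the SYSTEMCTL override variable are assumptions made for illustration, and the unit names follow the service files named above.

```shell
# Command used to query systemd. The override is a convenience assumed here
# so the function can be exercised on a host without systemd.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

# Return success only if every named unit is reported active.
# Usage: units_active UNIT...
units_active() {
  status=0
  for unit in "$@"; do
    if "$SYSTEMCTL" is-active --quiet "$unit"; then
      echo "active: $unit"
    else
      echo "NOT active: $unit"
      status=1
    fi
  done
  return "$status"
}

# Example:
#   units_active elvmasterd qfab
#   sudo journalctl -u elvmasterd -f    # follow the daemon's journal output
```

A non-zero exit status makes the function easy to reuse in a monitoring or provisioning script.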
Verifying qfab
The easiest end-to-end test of qfab communicating with an in-sync elvmasterd is to query the /config endpoint on the local qfab. In this example, the localhost is being queried. The URL is quoted so that the shell does not interpret the & character.
curl 'https://127.0.0.1:8008/config?self&qspace=main'