Host Installation¶
This article describes the basic installation of DataKit.
Register/log in to Guance Cloud¶
Open the Guance Cloud registration portal in a browser, fill in the required information, and then log in to Guance Cloud.
Get the Installation Command¶
Log in to the workspace, click "Integration" on the left, and select "Datakit" at the top to see the installation commands for each platform.
Note that the Linux/Mac/Windows installers below automatically detect the hardware platform (arm/x86, 32-bit/64-bit), so you do not need to select one.
The installation command supports bash and ash (Version-1.14.0), and looks roughly as follows:
bash
DK_DATAWAY=https://openway.guance.com?token=<TOKEN> bash -c "$(curl -L https://static.guance.com/datakit/install.sh)"
ash
DK_DATAWAY=https://openway.guance.com?token=<TOKEN> ash -c "$(curl -L https://static.guance.com/datakit/install.sh)"
After the installation completes, the terminal shows a message indicating that the installation succeeded.
Installation on Windows must be done from a PowerShell command line running as administrator. Press the Windows key, type powershell, then right-click the PowerShell icon that appears and select "Run as administrator".
Remove-Item -ErrorAction SilentlyContinue Env:DK_*;
$env:DK_DATAWAY="https://openway.guance.com?token=<TOKEN>";
Set-ExecutionPolicy Bypass -scope Process -Force;
Import-Module bitstransfer;
start-bitstransfer -source https://static.guance.com/datakit/install.ps1 -destination .install.ps1;
powershell ./.install.ps1;
Install DataKit lite¶
You can set the environment variable DK_LITE to install DataKit lite (Version-1.14.0):
Remove-Item -ErrorAction SilentlyContinue Env:DK_*;
$env:DK_DATAWAY="https://openway.guance.com?token=<TOKEN>";
$env:DK_LITE="1";
Set-ExecutionPolicy Bypass -scope Process -Force;
Import-Module bitstransfer;
start-bitstransfer -source https://static.guance.com/datakit/install.ps1 -destination .install.ps1;
powershell ./.install.ps1;
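On Linux/Mac, the same variable can presumably be placed in front of the bash installation command, since the installation variables are described as supported on all platforms. A minimal sketch under that assumption; `install_datakit_lite` is an illustrative wrapper name (not part of DataKit), used so that nothing is downloaded until the function is called:

```shell
# Sketch: Linux/Mac counterpart of the DataKit lite install.
# install_datakit_lite is an illustrative name; nothing runs until it is called.
install_datakit_lite() {
  token="$1"   # workspace token, i.e. the <TOKEN> shown on the installation page
  DK_LITE=1 \
  DK_DATAWAY="https://openway.guance.com?token=${token}" \
  bash -c "$(curl -L https://static.guance.com/datakit/install.sh)"
}

# Usage (this runs the real installer):
#   install_datakit_lite "<TOKEN>"
```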
DataKit lite contains only the following collectors:
Collector Name | Description |
---|---|
cpu | Collect the CPU usage of the host |
disk | Collect disk occupancy |
diskio | Collect the disk IO status of the host |
mem | Collect the memory usage of the host |
swap | Collect Swap memory usage |
system | Collect the load of host operating system |
net | Collect host network traffic |
host_processes | Collect the list of resident (surviving for more than 10min) processes on the host |
hostobject | Collect basic information of host computer (such as operating system information, hardware information, etc.) |
DataKit(dk) | Collect Datakit running metrics |
RUM(rum) | Collect user access monitoring data |
Net dialtesting(dialtesting) | Collect the data generated by dialing test |
Prom (prom) | Collect data exposed by Prometheus Exporters |
logging | Collect file log data |
Install Specific Version¶
You can install a specific DataKit version, for example 1.2.3:
DK_DATAWAY=https://openway.guance.com?token=<TOKEN> bash -c "$(curl -L https://static.guance.com/datakit/install-1.2.3.sh)"
The same applies on Windows:
Remove-Item -ErrorAction SilentlyContinue Env:DK_*;
$env:DK_DATAWAY="https://openway.guance.com?token=<TOKEN>";
Set-ExecutionPolicy Bypass -scope Process -Force;
Import-Module bitstransfer;
start-bitstransfer -source https://static.guance.com/datakit/install-1.2.3.ps1 -destination .install.ps1;
powershell ./.install.ps1;
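Judging from the 1.2.3 example, the versioned installer URL appears to follow the pattern `install-<version>.sh`. A small sketch that builds the URL under that assumption; `installer_url` is a hypothetical helper, and whether a given version is actually published at that URL is not verified here:

```shell
# Hypothetical helper: build the installer URL for a given DataKit version.
# Assumes the "install-<version>.sh" naming seen in the 1.2.3 example.
installer_url() {
  base="https://static.guance.com/datakit"
  version="${1:-}"
  if [ -z "$version" ]; then
    echo "${base}/install.sh"              # no version given: default installer
  else
    echo "${base}/install-${version}.sh"   # version-specific installer
  fi
}

installer_url 1.2.3
# prints: https://static.guance.com/datakit/install-1.2.3.sh
```

The printed URL can then be used in place of the default one in the `curl -L` part of the installation command.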
Additional Supported Installation Variable¶
If you need to define some DataKit configuration during the installation phase, you can add environment variables to the installation command alongside DK_DATAWAY. For example, to add the DK_NAMESPACE setting:
Remove-Item -ErrorAction SilentlyContinue Env:DK_*;
$env:DK_DATAWAY="https://openway.guance.com?token=<TOKEN>";
$env:DK_NAMESPACE="<namespace>";
Set-ExecutionPolicy Bypass -scope Process -Force;
Import-Module bitstransfer;
start-bitstransfer -source https://static.guance.com/datakit/install.ps1 -destination .install.ps1;
powershell ./.install.ps1;
The format for setting multiple environment variables on each platform is:
# Windows: multiple environment variables are separated by semicolons
$env:NAME1="value1"; $env:NAME2="value2"
# Linux/Mac: multiple environment variables are separated by spaces
NAME1="value1" NAME2="value2"
The environment variables supported by the installation script are as follows (supported on all platforms).
Attention
These environment variables are not supported for fully offline installation. They can, however, be used when installing through a proxy or from a local installation address.
Most Commonly Used Environment Variables¶
- DK_DATAWAY: Specifies the DataWay address. The DataKit installation command already includes it by default.
- DK_GLOBAL_TAGS: Deprecated; use DK_GLOBAL_HOST_TAGS instead.
- DK_GLOBAL_HOST_TAGS: Sets the global host tags during installation. Format example: host=__datakit_hostname,host_ip=__datakit_ip (multiple tags are separated by commas).
- DK_GLOBAL_ELECTION_TAGS: Sets the global election tags during installation. Format example: project=my-project,cluster=my-cluster.
- DK_DEF_INPUTS: List of collector names enabled by default. Format example: cpu,mem,disk. You can also disable some default inputs by prefixing the input name with -, such as -cpu,-mem,-disk. If the two forms are mixed, such as cpu,mem,-disk,-system, only the disable list is accepted: disk and system are disabled, and all other default inputs stay enabled.
- DK_CLOUD_PROVIDER: Specifies the cloud vendor during installation (currently supported: aliyun/aws/tencent/hwcloud/azure). Deprecated: Datakit can infer the cloud type automatically.
- DK_USER_NAME: The user name that the Datakit service runs as. Default is root. See the Attention note below for details.
- DK_LITE: Set this variable to 1 to install DataKit lite. (Version-1.14.0)
Disable all default inputs Version-1.5.5
We can set DK_DEF_INPUTS to - to disable all default inputs:
DK_DEF_INPUTS="-" \
DK_DATAWAY=https://openway.guance.com?token=<TOKEN> \
bash -c "$(curl -L https://static.guance.com/datakit/install.sh)"
In addition, if Datakit has been installed before, you must delete the .conf files of all default inputs manually: during installation, Datakit can add new input configurations but cannot delete existing ones.
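The DK_DEF_INPUTS rules described above can be summarized in a few lines of shell. This is only an illustration of the documented behavior, not DataKit's actual implementation; `classify_def_inputs` and its output labels are made up for this sketch:

```shell
# Illustrative only: mimics the documented DK_DEF_INPUTS rules.
classify_def_inputs() {
  spec="$1"
  # A lone "-" disables every default input.
  [ "$spec" = "-" ] && { echo "disable-all"; return; }
  enabled=""
  disabled=""
  old_ifs=$IFS
  IFS=,
  for name in $spec; do
    case "$name" in
      -*) disabled="${disabled:+$disabled,}${name#-}" ;;  # "-" prefix: disable
      *)  enabled="${enabled:+$enabled,}$name" ;;         # plain name: enable
    esac
  done
  IFS=$old_ifs
  if [ -n "$disabled" ]; then
    # Mixed or disable-only list: only the disabled names take effect.
    echo "disable-only:$disabled"
  else
    echo "enable-only:$enabled"
  fi
}

classify_def_inputs "cpu,mem,-disk,-system"
# prints: disable-only:disk,system
```

In the mixed case the plain names (cpu, mem) are ignored, which matches the statement that only disk and system end up disabled.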
Attention
For permission reasons, setting DK_USER_NAME to a non-root user can make the following collectors unavailable:
In addition, note the following:
- Create the user and group manually before starting the installation. The commands differ between Linux distributions; the commands below are for reference:
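As a rough sketch of this step on distributions that ship the shadow-utils tools (`groupadd`/`useradd`), assuming an account named datakit: the account name and the `run` and `create_dk_user` helpers are assumptions made for this example, not names required by DataKit. The script only prints the commands by default; set DRY_RUN=0 and run it as root to execute them.

```shell
# Sketch only: "datakit" as the user/group name is an assumption.
DK_USER="${DK_USER:-datakit}"
DRY_RUN="${DRY_RUN:-1}"   # default: print the commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_dk_user() {
  # System group, then a system user without a home directory in that group.
  run groupadd --system "$DK_USER"
  run useradd --system --no-create-home -g "$DK_USER" "$DK_USER"
}

create_dk_user
```

Once the account exists, its name would be passed to the installer through DK_USER_NAME, in the same way as the other installation variables.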
On DataKit's Own Log¶
- DK_LOG_LEVEL: Optional; info or debug.
- DK_LOG: If set to stdout, the log is written to the terminal instead of a file.
- DK_GIN_LOG: If set to stdout, the log is written to the terminal instead of a file.
On DataKit pprof¶
- DK_ENABLE_PPROF (deprecated): Whether to turn on pprof.
- DK_PPROF_LISTEN: The listening address of the pprof service.
Version-1.9.2 enabled pprof by default.
On DataKit Election¶
- DK_ENABLE_ELECTION: Enables election, which is off by default. To enable it, set the environment variable to any non-empty string (e.g. True/False).
- DK_NAMESPACE: Specifies the namespace during installation (used for election).
On HTTP/API Environment¶
- DK_HTTP_LISTEN: Specifies the network interface that the DataKit HTTP service binds to during installation (default localhost).
- DK_HTTP_PORT: Specifies the port that the DataKit HTTP service binds to during installation (default 9529).
- DK_RUM_ORIGIN_IP_HEADER: RUM-specific.
- DK_DISABLE_404PAGE: Disables the DataKit 404 page (commonly used when deploying DataKit RUM on the public network; e.g. True/False).
- DK_INSTALL_IPDB: Specifies the IP library at installation time (currently only iploc and geolite2 are supported).
- DK_UPGRADE_IP_WHITELIST: Starting from Datakit 1.5.9, Datakit can be upgraded through a remote HTTP API. This variable sets the IP whitelist of clients allowed to access that API remotely (multiple IPs are separated by commas). Access from outside the whitelist is denied (no restriction by default).
- DK_HTTP_PUBLIC_APIS: Specifies which Datakit HTTP APIs can be accessed remotely; generally configured together with the RUM input. Supported since Datakit 1.9.2.
On DCA¶
- DK_DCA_ENABLE: Enables the DCA service during installation (off by default).
- DK_DCA_LISTEN: Customizes the DCA service listening address and port during installation (default 0.0.0.0:9531).
- DK_DCA_WHITE_LIST: Sets the DCA service access whitelist; multiple entries are separated by commas (e.g. 192.168.0.1/24,10.10.0.1/24).
On External Collector¶
- DK_INSTALL_EXTERNALS: Used to install external collectors not packaged with DataKit.
On Confd Configuration¶
Environment Variable Name | Type | Applicable Scenario | Description | Sample Value |
---|---|---|---|---|
DK_CONFD_BACKEND | string | All | Backend Source Type | etcdv3 , zookeeper , redis or consul |
DK_CONFD_BASIC_AUTH | string | etcdv3 , consul | Optional | |
DK_CONFD_CLIENT_CA_KEYS | string | etcdv3 , consul | Optional | |
DK_CONFD_CLIENT_CERT | string | etcdv3 , consul | Optional | |
DK_CONFD_CLIENT_KEY | string | etcdv3 , consul or redis | Optional | |
DK_CONFD_BACKEND_NODES | string | All | Backend Source Address | [IP address:2379,IP address 2:2379] |
DK_CONFD_PASSWORD | string | etcdv3 , consul | Optional | |
DK_CONFD_SCHEME | string | etcdv3 , consul | Optional | |
DK_CONFD_SEPARATOR | string | redis | Optional, default 0 | |
DK_CONFD_USERNAME | string | etcdv3 , consul | Optional | |
On Git Configuration¶
- DK_GIT_URL: The remote git repo address for managing configuration files (e.g. http://username:password@github.com/username/repository.git).
- DK_GIT_KEY_PATH: The full path of the local private key (e.g. /Users/username/.ssh/id_rsa).
- DK_GIT_KEY_PW: The password of the local private key (e.g. passwd).
- DK_GIT_BRANCH: The branch to pull. If empty, the default main branch specified by the remote is used, which is usually master.
- DK_GIT_INTERVAL: The interval between scheduled pulls (e.g. 1m).
On Sinker Configuration¶
DK_SINKER_GLOBAL_CUSTOMER_KEYS is used to set up the sinker tag/field keys. For example:
Remove-Item -ErrorAction SilentlyContinue Env:DK_*;
$env:DK_DATAWAY="https://openway.guance.com?token=<TOKEN>";
$env:DK_DATAWAY_ENABLE_SINKER="on";
$env:DK_SINKER_GLOBAL_CUSTOMER_KEYS="key1,key2";
Set-ExecutionPolicy Bypass -scope Process -Force;
Import-Module bitstransfer;
start-bitstransfer -source https://static.guance.com/datakit/install.ps1 -destination .install.ps1;
powershell ./.install.ps1;
On Resource Limit Configuration¶
Only Linux and Windows (Version-1.15.0) operating systems are supported.
- DK_LIMIT_DISABLED: Turns off the resource limit function (on by default).
- DK_LIMIT_CPUMAX: Maximum CPU power, default 30.0.
- DK_LIMIT_MEMMAX: Memory limit (including swap), default 4096 (4GB).
Other Installation Options¶
Environment Variable Name | Sample | Description |
---|---|---|
DK_INSTALL_ONLY | on | Install only, do not run |
DK_HOSTNAME | some-host-name | Sets a custom hostname during installation |
DK_UPGRADE | 1 | Upgrade to the latest version (Note: once this option is turned on, all other options except DK_UPGRADE_MANAGER are ignored) |
DK_UPGRADE_MANAGER | on | Whether to also upgrade the Remote Upgrade Service when upgrading Datakit; used in conjunction with DK_UPGRADE , supported since 1.5.9 |
DK_INSTALLER_BASE_URL | https://your-url | Selects the installation script source for different environments; defaults to https://static.guance.com/datakit |
DK_PROXY_TYPE | - | Proxy type. The options are: datakit or nginx , both lowercase |
DK_NGINX_IP | - | Proxy server IP address (fill in the IP only, not the port). It has the highest priority, is mutually exclusive with "HTTP_PROXY" and "HTTPS_PROXY", and overrides both. |
DK_INSTALL_LOG | - | Sets the installation log path; defaults to install.log in the current directory. If set to stdout , the log is output to the command line terminal. |
HTTPS_PROXY | IP:Port | Install through the Datakit proxy |
DK_INSTALL_RUM_SYMBOL_TOOLS | on | Install source map tools for RUM; supported since Datakit 1.9.2. |
DK_VERBOSE | on | Enable more verbose output during installation (Linux/Mac only) Version-1.19.0 |
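As an illustration of how the upgrade options from the table combine on Linux/Mac, here is a minimal sketch. `upgrade_datakit` is an illustrative wrapper name (not part of DataKit), used so that nothing is downloaded or executed until the function is called; setting DK_UPGRADE_MANAGER is optional and shown only because the table says it works together with DK_UPGRADE:

```shell
# Sketch: upgrade an existing Linux/Mac installation to the latest version.
# upgrade_datakit is an illustrative name; nothing runs until it is called.
upgrade_datakit() {
  # Per the table above, other DK_* options are ignored once DK_UPGRADE is set,
  # except DK_UPGRADE_MANAGER.
  DK_UPGRADE=1 \
  DK_UPGRADE_MANAGER=on \
  bash -c "$(curl -L https://static.guance.com/datakit/install.sh)"
}

# Usage (this runs the real installer):
#   upgrade_datakit
```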
FAQ¶
How to Deal with the Unfriendly Host Name¶
DataKit uses the hostname as the basis for associating data. Some host names are not very friendly, such as iZbp141ahn...., yet for various reasons they cannot be changed, which makes them awkward to work with. In DataKit, such a host name can be overridden in the main configuration.
In datakit.conf, modify the following configuration, and DataKit will read ENV_HOSTNAME to override the actual hostname:
Note: If a host has collected data for a period of time, after changing the host name, the historical data will no longer be associated with the new host name. Changing the host name is equivalent to adding a brand-new host.
Issue on macOS installation¶
If the following error appears during installation or upgrade on macOS:
"launchctl" failed with stderr: /Library/LaunchDaemons/cn.dataflux.datakit.plist: Service is disabled
# or
"launchctl" failed with stderr: /Library/LaunchDaemons/com.guance.datakit.plist: Service is disabled
Execute:
Then execute the following command:
sudo launchctl load -w /Library/LaunchDaemons/cn.dataflux.datakit.plist
# or
sudo launchctl load -w /Library/LaunchDaemons/com.guance.datakit.plist