netdata.coupang.duckdns.org
140.238.25.114
Public Scan
Submitted URL: http://netdata.coupang.duckdns.org/
Effective URL: https://netdata.coupang.duckdns.org/
Submission: On August 20 via api from KR — Scanned from US
Form analysis
5 forms found in the DOM
<form id="optionsForm1" class="form-horizontal">
<div class="form-group">
<table>
<tbody>
<tr class="option-row">
<td class="option-control">
<div class="toggle btn btn-success" data-toggle="toggle" style="width: 110px; height: 19px;"><input id="stop_updates_when_focus_is_lost" type="checkbox" checked="checked" data-toggle="toggle" data-offstyle="danger" data-onstyle="success"
data-on="On Focus" data-off="Always" data-width="110px">
<div class="toggle-group"><label class="btn btn-success toggle-on">On Focus</label><label class="btn btn-danger active toggle-off">Always</label><span class="toggle-handle btn btn-default"></span></div>
</div>
</td>
<td class="option-info"><strong>When to refresh the charts?</strong><br><small>When set to <b>On Focus</b>, the charts will stop being updated if the page / tab does not have the focus of the user. When set to <b>Always</b>, the charts will
always be refreshed. Set it to <b>On Focus</b> to lower the CPU requirements of the browser (and extend the battery life of laptops and tablets) when this page does not have your focus. Set it to <b>Always</b> to work in another window (e.g.
change the settings of something) and have the charts auto-refresh in this window.</small></td>
</tr>
<tr class="option-row">
<td class="option-control">
<div class="toggle btn btn-primary" data-toggle="toggle" style="width: 110px; height: 19px;"><input id="eliminate_zero_dimensions" type="checkbox" checked="checked" data-toggle="toggle" data-on="Non Zero" data-off="All"
data-width="110px">
<div class="toggle-group"><label class="btn btn-primary toggle-on">Non Zero</label><label class="btn btn-default active toggle-off">All</label><span class="toggle-handle btn btn-default"></span></div>
</div>
</td>
<td class="option-info"><strong>Which dimensions to show?</strong><br><small>When set to <b>Non Zero</b>, dimensions that have all their values (within the current view) set to zero will not be transferred from the netdata server (except if
all dimensions of the chart are zero, in which case this setting does nothing - all dimensions are transferred and shown). When set to <b>All</b>, all dimensions will always be shown. Set it to <b>Non Zero</b> to lower the data
transferred between netdata and your browser, lower the CPU requirements of your browser (fewer lines to draw) and increase the focus on the legends (fewer entries at the legends).</small></td>
</tr>
<tr class="option-row">
<td class="option-control">
<div class="toggle btn btn-default off" data-toggle="toggle" style="width: 110px; height: 19px;"><input id="destroy_on_hide" type="checkbox" data-toggle="toggle" data-on="Destroy" data-off="Hide" data-width="110px">
<div class="toggle-group"><label class="btn btn-primary toggle-on">Destroy</label><label class="btn btn-default active toggle-off">Hide</label><span class="toggle-handle btn btn-default"></span></div>
</div>
</td>
<td class="option-info"><strong>How to handle hidden charts?</strong><br><small>When set to <b>Destroy</b>, charts that are not in the current viewport of the browser (are above, or below the visible area of the page), will be destroyed and
re-created if and when they become visible again. When set to <b>Hide</b>, the not-visible charts will be just hidden, to simplify the DOM and speed up your browser. Set it to <b>Destroy</b>, to lower the memory requirements of your
browser. Set it to <b>Hide</b> for faster restoration of charts on page scrolling.</small></td>
</tr>
<tr class="option-row">
<td class="option-control">
<div class="toggle btn btn-default off" data-toggle="toggle" style="width: 110px; height: 19px;"><input id="async_on_scroll" type="checkbox" data-toggle="toggle" data-on="Async" data-off="Sync" data-width="110px">
<div class="toggle-group"><label class="btn btn-primary toggle-on">Async</label><label class="btn btn-default active toggle-off">Sync</label><span class="toggle-handle btn btn-default"></span></div>
</div>
</td>
<td class="option-info"><strong>Page scroll handling?</strong><br><small>When set to <b>Sync</b>, charts will be examined for their visibility immediately after scrolling. On slow computers this may impact the smoothness of page scrolling.
To update the page when scrolling ends, set it to <b>Async</b>. Set it to <b>Sync</b> for immediate chart updates when scrolling. Set it to <b>Async</b> for smoother page scrolling on slower computers.</small></td>
</tr>
</tbody>
</table>
</div>
</form>
<form id="optionsForm2" class="form-horizontal">
<div class="form-group">
<table>
<tbody>
<tr class="option-row">
<td class="option-control">
<div class="toggle btn btn-primary" data-toggle="toggle" style="width: 110px; height: 19px;"><input id="parallel_refresher" type="checkbox" checked="checked" data-toggle="toggle" data-on="Parallel" data-off="Sequential"
data-width="110px">
<div class="toggle-group"><label class="btn btn-primary toggle-on">Parallel</label><label class="btn btn-default active toggle-off">Sequential</label><span class="toggle-handle btn btn-default"></span></div>
</div>
</td>
<td class="option-info"><strong>Which chart refresh policy to use?</strong><br><small>When set to <b>parallel</b>, visible charts are refreshed in parallel (all queries are sent to netdata server in parallel) and are rendered
asynchronously. When set to <b>sequential</b> charts are refreshed one after another. Set it to parallel if your browser can cope with it (most modern browsers do), set it to sequential if you work on an older/slower computer.</small>
</td>
</tr>
<tr class="option-row" id="concurrent_refreshes_row">
<td class="option-control">
<div class="toggle btn btn-primary" data-toggle="toggle" style="width: 110px; height: 19px;"><input id="concurrent_refreshes" type="checkbox" checked="checked" data-toggle="toggle" data-on="Resync" data-off="Best Effort"
data-width="110px">
<div class="toggle-group"><label class="btn btn-primary toggle-on">Resync</label><label class="btn btn-default active toggle-off">Best Effort</label><span class="toggle-handle btn btn-default"></span></div>
</div>
</td>
<td class="option-info"><strong>Shall we re-sync chart refreshes?</strong><br><small>When set to <b>Resync</b>, the dashboard will attempt to re-synchronize all the charts so that they are refreshed concurrently. When set to
<b>Best Effort</b>, each chart may be refreshed with a little time difference to the others. Normally, the dashboard starts refreshing them in parallel, but depending on the speed of your computer and the network latencies, charts start
having a slight time difference. Setting this to <b>Resync</b> will attempt to re-synchronize the charts on every update. Setting it to <b>Best Effort</b> may lower the pressure on your browser and the network.</small></td>
</tr>
<tr class="option-row">
<td class="option-control">
<div class="toggle btn btn-success" data-toggle="toggle" style="width: 110px; height: 19px;"><input id="sync_selection" type="checkbox" checked="checked" data-toggle="toggle" data-on="Sync" data-off="Don't Sync" data-onstyle="success"
data-offstyle="danger" data-width="110px">
<div class="toggle-group"><label class="btn btn-success toggle-on">Sync</label><label class="btn btn-danger active toggle-off">Don't Sync</label><span class="toggle-handle btn btn-default"></span></div>
</div>
</td>
<td class="option-info"><strong>Sync hover selection on all charts?</strong><br><small>When enabled, a selection on one chart will automatically select the same time on all other visible charts and the legends of all visible charts will be
updated to show the selected values. When disabled, only the chart getting the user's attention will be selected. Enable it to get better insights of the data. Disable it if you are on a very slow computer that cannot actually do
it.</small></td>
</tr>
</tbody>
</table>
</div>
</form>
<form id="optionsForm3" class="form-horizontal">
<div class="form-group">
<table>
<tbody>
<tr class="option-row">
<td class="option-control">
<div class="toggle btn btn-default off" data-toggle="toggle" style="width: 110px; height: 19px;"><input id="legend_right" type="checkbox" checked="checked" data-toggle="toggle" data-on="Right" data-off="Below" data-width="110px">
<div class="toggle-group"><label class="btn btn-primary toggle-on">Right</label><label class="btn btn-default active toggle-off">Below</label><span class="toggle-handle btn btn-default"></span></div>
</div>
</td>
<td class="option-info"><strong>Where do you want to see the legend?</strong><br><small>Netdata can place the legend in two positions: <b>Below</b> charts (the default) or to the <b>Right</b> of
charts.<br><b>Switching this will reload the dashboard</b>.</small></td>
</tr>
<tr class="option-row">
<td class="option-control">
<div class="toggle btn btn-success" data-toggle="toggle" style="width: 110px; height: 19px;"><input id="netdata_theme_control" type="checkbox" checked="checked" data-toggle="toggle" data-offstyle="danger" data-onstyle="success"
data-on="Dark" data-off="White" data-width="110px">
<div class="toggle-group"><label class="btn btn-success toggle-on">Dark</label><label class="btn btn-danger active toggle-off">White</label><span class="toggle-handle btn btn-default"></span></div>
</div>
</td>
<td class="option-info"><strong>Which theme to use?</strong><br><small>Netdata comes with two themes: <b>Dark</b> (the default) and <b>White</b>.<br><b>Switching this will reload the dashboard</b>.</small></td>
</tr>
<tr class="option-row">
<td class="option-control">
<div class="toggle btn btn-primary" data-toggle="toggle" style="width: 110px; height: 19px;"><input id="show_help" type="checkbox" checked="checked" data-toggle="toggle" data-on="Help Me" data-off="No Help" data-width="110px">
<div class="toggle-group"><label class="btn btn-primary toggle-on">Help Me</label><label class="btn btn-default active toggle-off">No Help</label><span class="toggle-handle btn btn-default"></span></div>
</div>
</td>
<td class="option-info"><strong>Do you need help?</strong><br><small>Netdata can show some help in some areas to help you use the dashboard. If all these balloons bother you, disable them using this
switch.<br><b>Switching this will reload the dashboard</b>.</small></td>
</tr>
<tr class="option-row">
<td class="option-control">
<div class="toggle btn btn-primary" data-toggle="toggle" style="width: 110px; height: 19px;"><input id="pan_and_zoom_data_padding" type="checkbox" checked="checked" data-toggle="toggle" data-on="Pad" data-off="Don't Pad"
data-width="110px">
<div class="toggle-group"><label class="btn btn-primary toggle-on">Pad</label><label class="btn btn-default active toggle-off">Don't Pad</label><span class="toggle-handle btn btn-default"></span></div>
</div>
</td>
<td class="option-info"><strong>Enable data padding when panning and zooming?</strong><br><small>When set to <b>Pad</b> the charts will be padded with more data, both before and after the visible area, thus giving the impression the whole
database is loaded. This padding will happen only after the first pan or zoom operation on the chart (initially all charts have only the visible data). When set to <b>Don't Pad</b> only the visible data will be transferred from the
netdata server, even after the first pan and zoom operation.</small></td>
</tr>
<tr class="option-row">
<td class="option-control">
<div class="toggle btn btn-primary" data-toggle="toggle" style="width: 110px; height: 19px;"><input id="smooth_plot" type="checkbox" checked="checked" data-toggle="toggle" data-on="Smooth" data-off="Rough" data-width="110px">
<div class="toggle-group"><label class="btn btn-primary toggle-on">Smooth</label><label class="btn btn-default active toggle-off">Rough</label><span class="toggle-handle btn btn-default"></span></div>
</div>
</td>
<td class="option-info"><strong>Enable Bézier lines on charts?</strong><br><small>When set to <b>Smooth</b>, the charting libraries that support it will plot smooth curves instead of simple straight lines to connect the points.<br>Keep in
mind <a href="http://dygraphs.com" target="_blank">dygraphs</a>, the main charting library in netdata dashboards, can only smooth line charts. It cannot smooth area or stacked charts. When set to <b>Rough</b>, this setting can lower the
CPU resources consumed by your browser.</small></td>
</tr>
</tbody>
</table>
</div>
</form>
<form id="optionsForm4" class="form-horizontal">
<div class="form-group">
<table>
<tbody>
<tr class="option-row">
<td colspan="2" align="center"><small><b>These settings are applied gradually, as charts are updated. To force them, refresh the dashboard now</b>.</small></td>
</tr>
<tr class="option-row">
<td class="option-control">
<div class="toggle btn btn-success" data-toggle="toggle" style="width: 110px; height: 38px;"><input id="units_conversion" type="checkbox" checked="checked" data-toggle="toggle" data-on="Scale Units" data-off="Fixed Units"
data-onstyle="success" data-width="110px">
<div class="toggle-group"><label class="btn btn-success toggle-on">Scale Units</label><label class="btn btn-default active toggle-off">Fixed Units</label><span class="toggle-handle btn btn-default"></span></div>
</div>
</td>
<td class="option-info"><strong>Enable auto-scaling of select units?</strong><br><small>When set to <b>Scale Units</b> the values shown will dynamically be scaled (e.g. 1000 kilobits will be shown as 1 megabit). Netdata can auto-scale these
original units: <code>kilobits/s</code>, <code>kilobytes/s</code>, <code>KB/s</code>, <code>KB</code>, <code>MB</code>, and <code>GB</code>. When set to <b>Fixed Units</b> all the values will be rendered using the original units
maintained by the netdata server.</small></td>
</tr>
<tr id="settingsLocaleTempRow" class="option-row">
<td class="option-control">
<div class="toggle btn btn-primary" data-toggle="toggle" style="width: 110px; height: 19px;"><input id="units_temp" type="checkbox" checked="checked" data-toggle="toggle" data-on="Celsius" data-off="Fahrenheit" data-width="110px">
<div class="toggle-group"><label class="btn btn-primary toggle-on">Celsius</label><label class="btn btn-default active toggle-off">Fahrenheit</label><span class="toggle-handle btn btn-default"></span></div>
</div>
</td>
<td class="option-info"><strong>Which units to use for temperatures?</strong><br><small>Set the temperature units of the dashboard.</small></td>
</tr>
<tr id="settingsLocaleTimeRow" class="option-row">
<td class="option-control">
<div class="toggle btn btn-success" data-toggle="toggle" style="width: 110px; height: 19px;"><input id="seconds_as_time" type="checkbox" checked="checked" data-toggle="toggle" data-on="Time" data-off="Seconds" data-onstyle="success"
data-width="110px">
<div class="toggle-group"><label class="btn btn-success toggle-on">Time</label><label class="btn btn-default active toggle-off">Seconds</label><span class="toggle-handle btn btn-default"></span></div>
</div>
</td>
<td class="option-info"><strong>Convert seconds to time?</strong><br><small>When set to <b>Time</b>, charts that present <code>seconds</code> will show <code>DDd:HH:MM:SS</code>. When set to <b>Seconds</b>, the raw number of seconds will be
presented.</small></td>
</tr>
</tbody>
</table>
</div>
</form>
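The "Convert seconds to time?" option above describes rendering a raw seconds value as `DDd:HH:MM:SS`. A minimal sketch of that conversion follows. The function name `seconds_to_time` and the zero-padded two-digit day field are assumptions made for illustration; this is not taken from the dashboard's own code.

```python
def seconds_to_time(total_seconds: int) -> str:
    """Format a duration in seconds as DDd:HH:MM:SS.

    Assumption: days are zero-padded to two digits, following the
    literal pattern 'DDd:HH:MM:SS' shown in the option description.
    """
    sign = "-" if total_seconds < 0 else ""
    # Split the absolute duration into days, hours, minutes, seconds.
    days, remainder = divmod(abs(int(total_seconds)), 86400)
    hours, remainder = divmod(remainder, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{sign}{days:02d}d:{hours:02d}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    # 1 day, 2 hours, 3 minutes, 4 seconds
    print(seconds_to_time(93784))
```

With the option set to Seconds, the same value would simply be shown as the raw number.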
#
<form action="#"><input class="form-control" id="switchRegistryPersonGUID" placeholder="your personal ID" maxlength="36" autocomplete="off" style="text-align:center;font-size:1.4em"></form>
Text Content
netdata: Real-time performance monitoring, done right!
Node: oracle-e2-0. UTC -10. Playing 8/20/24, 06:56 to 07:03 (last 7 min).
NETDATA: REAL-TIME PERFORMANCE MONITORING, IN THE GREATEST POSSIBLE DETAIL
Drag charts to pan. Shift + wheel on them to zoom in and out. Double-click on them to reset. Hover on them too!
SYSTEM OVERVIEW Overview of the key system metrics. Gauges at scan time: Used Swap 25%, Disk Read 0.7 MiB/s, Disk Write 0.07 MiB/s, CPU 9.6% (scale 0.0 to 100.0), Net Inbound 0.32 megabits/s, Net Outbound 0.49 megabits/s, Used RAM 57.3%.
CPU Total CPU utilization (all cores). 100% here means there is no CPU idle time at all. You can get per-core usage at the CPUs section and per-application usage at the Applications Monitoring section. Keep an eye on iowait (0.7%). If it is constantly high, your disks are a bottleneck and they slow your system down. An important metric worth monitoring is softirq (1.99%). A constantly high percentage of softirq may indicate network driver issues. The individual metrics can be found in the kernel documentation.
[Chart: Total CPU utilization (system.cpu); y-axis: percentage, 0.0 to 100.0; x-axis: time, 06:57:30 to 07:04:00; dimensions: steal, softirq, user, system, nice, iowait. Values at Tue, Aug 20, 2024, 07:04:01: steal 3.0, softirq 2.0, user 2.0, system 1.5, nice 1.0, iowait 0.0.]
CPU Pressure Stall Information. Some indicates the share of time in which at least some tasks are stalled on CPU. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows.
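To make the 10-, 60-, and 300-second windows concrete: on Linux, the kernel exposes pressure stall information in `/proc/pressure/cpu`, `/proc/pressure/io`, and `/proc/pressure/memory`, one line per kind (`some`, `full`) with `avg10`, `avg60`, `avg300`, and a cumulative `total` in microseconds. The sketch below parses that text format. The helper names and the sample `total` value are hypothetical, and deriving a stall time in milliseconds from two `total` readings is an assumption about how such a chart could be computed, not a statement of Netdata's implementation.

```python
from typing import Dict


def parse_psi(text: str) -> Dict[str, Dict[str, float]]:
    """Parse the contents of a /proc/pressure/<resource> file.

    Each line looks like:
        some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    """
    result: Dict[str, Dict[str, float]] = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        # Turn "avg10=4.30" into ("avg10", 4.3), and so on.
        result[kind] = {
            key: float(value)
            for key, value in (field.split("=", 1) for field in fields)
        }
    return result


def stall_time_ms(previous_total_us: float, current_total_us: float) -> float:
    """Stall time between two samples, in milliseconds.

    Assumption: 'total' is a cumulative counter in microseconds, so the
    difference between two readings is the stall time in that interval.
    """
    return (current_total_us - previous_total_us) / 1000.0


if __name__ == "__main__":
    # Averages match the values visible in this scan; 'total' is made up.
    sample = "some avg10=4.30 avg60=1.90 avg300=3.70 total=123456789\n"
    psi = parse_psi(sample)
    print(psi["some"]["avg10"], psi["some"]["avg60"], psi["some"]["avg300"])
```

On a live Linux host with PSI enabled, the same function could be fed the text read from `/proc/pressure/cpu`.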
[Chart: CPU some pressure (system.cpu_some_pressure); y-axis: percentage; x-axis: time, 06:57:30 to 07:03:30; dimensions: some 10, some 60, some 300. Values at Tue, Aug 20, 2024, 07:04:00: some 10 4.3, some 60 1.9, some 300 3.7.]
The amount of time some processes have been waiting for CPU time.
[Chart: CPU some pressure stall time (system.cpu_some_pressure_stall_time); y-axis: ms; x-axis: time, 06:57:30 to 07:03:30; dimension: time. Value at Tue, Aug 20, 2024, 07:04:00: time 21.3.]
CPU Pressure Stall Information. Full indicates the share of time in which all non-idle tasks are stalled on CPU resource simultaneously. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. system.cpu_full_pressure
The amount of time all non-idle processes have been stalled due to CPU congestion. system.cpu_full_pressure_stall_time
LOAD Current system load, i.e. the number of processes using CPU or waiting for system resources (usually CPU and disk). The 3 metrics refer to 1-, 5- and 15-minute averages. The system calculates this once every 5 seconds. For more information check this wikipedia article. system.load
DISK Total Disk I/O, for all physical disks. You can get detailed information about each disk at the Disks section and per-application disk usage at the Applications Monitoring section. Physical are all the disks that are listed in /sys/block, but do not exist in /sys/devices/virtual/block. system.io
Memory paged from/to disk. This is usually the total disk I/O of the system. system.pgpgio
I/O Pressure Stall Information. Some indicates the share of time in which at least some tasks are stalled on I/O. In this state the CPU is still doing productive work. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. system.io_some_pressure
The amount of time some processes have been waiting due to I/O congestion.
system.io_some_pressure_stall_time I/O Pressure Stall Information. Full line indicates the share of time in which all non-idle tasks are stalled on I/O resource simultaneously. In this state actual CPU cycles are going to waste, and a workload that spends extended time in this state is considered to be thrashing. This has severe impact on performance. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. system.io_full_pressure The amount of time all non-idle processes have been stalled due to I/O congestion. system.io_full_pressure_stall_time RAM System Random Access Memory (i.e. physical memory) usage. system.ram Memory Pressure Stall Information. Some indicates the share of time in which at least some tasks are stalled on memory. In this state the CPU is still doing productive work. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. system.memory_some_pressure The amount of time some processes have been waiting due to memory congestion. system.memory_some_pressure_stall_time Memory Pressure Stall Information. Full indicates the share of time in which all non-idle tasks are stalled on memory resource simultaneously. In this state actual CPU cycles are going to waste, and a workload that spends extended time in this state is considered to be thrashing. This has severe impact on performance. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. system.memory_full_pressure The amount of time all non-idle processes have been stalled due to memory congestion. system.memory_full_pressure_stall_time SWAP System swap memory usage. Swap space is used when the amount of physical memory (RAM) is full. When the system needs more memory resources and the RAM is full, inactive pages in memory are moved to the swap space (usually a disk, a disk partition or a file). system.swap System swap I/O. In - pages the system has swapped in from disk to RAM. 
Out - pages the system has swapped out from RAM to disk. system.swapio
NETWORK Total bandwidth of all physical network interfaces. This does not include lo, VPNs, network bridges, IFB devices, bond interfaces, etc. Only the bandwidth of physical network interfaces is aggregated. Physical are all the network interfaces that are listed in /proc/net/dev, but do not exist in /sys/devices/virtual/net. system.net
Total IP traffic in the system. system.ip
Total IPv6 Traffic. system.ipv6
PROCESSES System processes. Running - running or ready to run (runnable). Blocked - currently blocked, waiting for I/O to complete. system.processes
The number of processes in different states. Running - Process using the CPU at a particular moment. Sleeping (uninterruptible) - Process will wake when a waited-upon resource becomes available or after a time-out occurs during that wait. Mostly used by device drivers waiting for disk or network I/O. Sleeping (interruptible) - Process is waiting either for a particular time slot or for a particular event to occur. Zombie - Process that has completed its execution, released the system resources, but its entry is not removed from the process table. Usually occurs in child processes when the parent process still needs to read its child’s exit status. A process that stays a zombie for a long time is generally an error and causes system PID space leak. Stopped - Process is suspended from proceeding further due to STOP or TSTP signals. In this state, a process will not do anything (not even terminate) until it receives a CONT signal. system.processes_state
The number of new processes created. system.forks
The total number of processes in the system. system.active_processes
Context switching is the switching of the CPU from one process, task or thread to another. If there are many processes or threads willing to execute and very few CPU cores available to handle them, the system performs more context switches to balance the CPU resources among them. The whole process is computationally intensive. The more context switches, the slower the system gets. system.ctxt
Number of times a function that starts a process or thread is called. Netdata shows process metrics per application and cgroup (systemd Services) if apps or cgroup (systemd Services) plugins are enabled. This chart is provided by the eBPF plugin. system.process_thread
Number of times a function responsible for closing a process or thread is called. Netdata shows process metrics per application and cgroup (systemd Services) if apps or cgroup (systemd Services) plugins are enabled. This chart is provided by the eBPF plugin. system.exit
Difference between the number of calls to functions that close a task and release a task. This chart is provided by the eBPF plugin. system.process_status
IDLEJITTER Idle jitter is calculated by netdata. A thread is spawned that requests to sleep for a few microseconds. When the system wakes it up, it measures how many microseconds have passed. The difference between the requested and the actual duration of the sleep is the idle jitter.
This number is useful in real-time environments, where CPU jitter can affect the quality of the service (like VoIP media gateways). system.idlejitter INTERRUPTS Interrupts are signals sent to the CPU by external devices (normally I/O devices) or programs (running processes). They tell the CPU to stop its current activities and execute the appropriate part of the operating system. Interrupt types are hardware (generated by hardware devices to signal that they need some attention from the OS), software (generated by programs when they want to request a system call to be performed by the operating system), and traps (generated by the CPU itself to indicate that some error or condition occurred for which assistance from the operating system is needed). Total number of CPU interrupts. Check system.interrupts that gives more detail about each interrupt and also the CPUs section where interrupts are analyzed per CPU core. system.intr CPU interrupts in detail. At the CPUs section, interrupts are analyzed per CPU core. The last column in /proc/interrupts provides an interrupt description or the device name that registered the handler for that interrupt. system.interrupts Total time spent servicing hardware interrupts. Based on the eBPF hardirqs from BCC tools. This chart is provided by the eBPF plugin. system.hardirq_latency SOFTIRQS Software interrupts (or "softirqs") are one of the oldest deferred-execution mechanisms in the kernel. Several tasks among those executed by the kernel are not critical: they can be deferred for a long period of time, if necessary. The deferrable tasks can execute with all interrupts enabled (softirqs are patterned after hardware interrupts). Taking them out of the interrupt handler helps keep kernel response time small. Total number of software interrupts in the system. At the CPUs section, softirqs are analyzed per CPU core. HI - high priority tasklets. TIMER - tasklets related to timer interrupts. 
NET_TX, NET_RX - used for network transmit and receive processing. BLOCK - handles block I/O completion events. IRQ_POLL - used by the IO subsystem to increase performance (a NAPI like approach for block devices). TASKLET - handles regular tasklets. SCHED - used by the scheduler to perform load-balancing and other scheduling tasks. HRTIMER - used for high-resolution timers. RCU - performs read-copy-update (RCU) processing. system.softirqs Total time spent servicing software interrupts. Based on the eBPF softirqs from BCC tools. This chart is provided by the eBPF plugin. system.softirq_latency SOFTNET Statistics for CPUs SoftIRQs related to network receive work. Break down per CPU core can be found at CPU / softnet statistics. More information about identifying and troubleshooting network driver related issues can be found at Red Hat Enterprise Linux Network Performance Tuning Guide. Processed - packets processed. Dropped - packets dropped because the network device backlog was full. Squeezed - number of times the network device budget was consumed or the time limit was reached, but more work was available. ReceivedRPS - number of times this CPU has been woken up to process packets via an Inter-processor Interrupt. FlowLimitCount - number of times the flow limit has been reached (flow limiting is an optional Receive Packet Steering feature). system.softnet_stat ENTROPY Entropy, is a pool of random numbers (/dev/random) that is mainly used in cryptography. If the pool of entropy gets empty, processes requiring random numbers may run a lot slower (it depends on the interface each program uses), waiting for the pool to be replenished. Ideally a system with high entropy demands should have a hardware device for that purpose (TPM is one such device). There are also several software-only options you may install, like haveged, although these are generally useful only in servers. 
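As a rough illustration of the entropy pool described above, the sketch below reads the kernel's available-entropy counter on Linux. The path `/proc/sys/kernel/random/entropy_avail` is the standard procfs location; the function names and the low-water threshold are hypothetical, and this is not necessarily how Netdata collects the `system.entropy` metric.

```python
from pathlib import Path

# Standard Linux procfs location of the available-entropy counter (in bits).
ENTROPY_PATH = Path("/proc/sys/kernel/random/entropy_avail")

# Hypothetical threshold chosen only for this example.
LOW_ENTROPY_BITS = 200


def read_entropy_bits(path: Path = ENTROPY_PATH) -> int:
    """Return the number of entropy bits reported in the given file."""
    return int(path.read_text().strip())


def is_entropy_low(bits: int, threshold: int = LOW_ENTROPY_BITS) -> bool:
    """True when the pool is below the (assumed) low-water threshold."""
    return bits < threshold


if __name__ == "__main__":
    # Only meaningful on Linux; skip quietly elsewhere.
    if ENTROPY_PATH.exists():
        bits = read_entropy_bits()
        state = "low" if is_entropy_low(bits) else "ok"
        print(f"entropy_avail={bits} bits ({state})")
```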
system.entropy UPTIME The amount of time the system has been running, including time spent in suspend. system.uptime CLOCK SYNCHRONIZATION NTP lets you automatically sync your system time with a remote server. This keeps your machine’s time accurate by syncing with servers that are known to have accurate times. The system clock synchronization state as provided by the ntp_adjtime() system call. An unsynchronized clock may be the result of synchronization issues by the NTP daemon or a hardware clock fault. It can take several minutes (usually up to 17) before NTP daemon selects a server to synchronize with. State map: 0 - not synchronized, 1 - synchronized. system.clock_sync_state The kernel code can operate in various modes and with various features enabled or disabled, as selected by the ntp_adjtime() system call. The system clock status shows the value of the time_status variable in the kernel. The bits of the variable are used to control these functions and record error conditions as they exist. UNSYNC - set/cleared by the caller to indicate clock unsynchronized (e.g., when no peers are reachable). This flag is usually controlled by an application program, but the operating system may also set it. CLOCKERR - set/cleared by the external hardware clock driver to indicate hardware fault. Status map: 0 - bit unset, 1 - bit set. system.clock_status A typical NTP client regularly polls one or more NTP servers. The client must compute its time offset and round-trip delay. Time offset is the difference in absolute time between the two clocks. system.clock_sync_offset IPC SEMAPHORES System V semaphores is an inter-process communication (IPC) mechanism. It allows processes or threads within a process to synchronize their actions. They are often used to monitor and control the availability of system resources such as shared memory segments. For details, see svipc(7). To see the host IPC semaphore information, run ipcs -us. For limits, run ipcs -ls. 
Number of allocated System V IPC semaphores. The system-wide limit on the number of semaphores in all semaphore sets is specified in /proc/sys/kernel/sem file (2nd field). system.ipc_semaphores Number of used System V IPC semaphore arrays (sets). Semaphores support semaphore sets where each one is a counting semaphore. So when an application requests semaphores, the kernel releases them in sets. The system-wide limit on the maximum number of semaphore sets is specified in /proc/sys/kernel/sem file (4th field). system.ipc_semaphore_arrays IPC SHARED MEMORY System V shared memory is an inter-process communication (IPC) mechanism. It allows processes to communicate information by sharing a region of memory. It is the fastest form of inter-process communication available since no kernel involvement occurs when data is passed between the processes (no copying). Typically, processes must synchronize their access to a shared memory object, using, for example, POSIX semaphores. For details, see svipc(7). To see the host IPC shared memory information, run ipcs -um. For limits, run ipcs -lm. Number of allocated System V IPC memory segments. The system-wide maximum number of shared memory segments that can be created is specified in /proc/sys/kernel/shmmni file. system.shared_memory_segments Amount of memory currently used by System V IPC memory segments. The run-time limit on the maximum shared memory segment size that can be created is specified in /proc/sys/kernel/shmmax file. system.shared_memory_bytes Number of calls to syscalls responsible to manipulate shared memories. Netdata shows shared memory metrics per application and cgroup (systemd Services) if apps or cgroup (systemd Services) plugins are enabled. This chart is provided by the eBPF plugin. system.shared_memory_calls -------------------------------------------------------------------------------- CPUS Detailed information for each CPU of the system. 
A summary of the system for all CPUs can be found at the System Overview section. UTILIZATION cpu.cpu0 cpu.cpu1 INTERRUPTS Total number of interrupts per CPU. To see the total number for the system check the interrupts section. The last column in /proc/interrupts provides an interrupt description or the device name that registered the handler for that interrupt. cpu.cpu0_interrupts cpu.cpu1_interrupts SOFTIRQS Total number of software interrupts per CPU. To see the total number for the system check the softirqs section. cpu.cpu0_softirqs cpu.cpu1_softirqs SOFTNET Statistics for CPUs SoftIRQs related to network receive work. Total for all CPU cores can be found at System / softnet statistics. More information about identifying and troubleshooting network driver related issues can be found at Red Hat Enterprise Linux Network Performance Tuning Guide. Processed - packets processed. Dropped - packets dropped because the network device backlog was full. Squeezed - number of times the network device budget was consumed or the time limit was reached, but more work was available. ReceivedRPS - number of times this CPU has been woken up to process packets via an Inter-processor Interrupt. FlowLimitCount - number of times the flow limit has been reached (flow limiting is an optional Receive Packet Steering feature). cpu.cpu0_softnet_stat cpu.cpu1_softnet_stat CPUIDLE Idle States (C-states) are used to save power when the processor is idle. cpu.cpu0_cpuidle cpu.cpu1_cpuidle -------------------------------------------------------------------------------- MEMORY Detailed information about the memory management of the system. SYSTEM Available Memory is estimated by the kernel, as the amount of RAM that can be used by userspace processes, without causing swapping. mem.available The number of processes killed by Out of Memory Killer. The kernel's OOM killer is summoned when the system runs short of free memory and is unable to proceed without killing one or more processes. 
It tries to pick the process whose demise will free the most memory while causing the least misery for users of the system. This counter also includes processes within containers that have exceeded the memory limit. mem.oom_kill Committed Memory is the sum of all memory which has been allocated by processes. mem.committed A page fault is a type of interrupt, called a trap, raised by computer hardware when a running program accesses a memory page that is mapped into the virtual address space, but not actually loaded into main memory. Minor - the page is loaded in memory at the time the fault is generated, but is not marked in the memory management unit as being loaded in memory. Major - generated when the system needs to load the memory page from disk or swap memory. mem.pgfaults KERNEL Dirty is the amount of memory waiting to be written to disk. Writeback is how much memory is actively being written to disk. mem.writeback The total amount of memory being used by the kernel. Slab - used by the kernel to cache data structures for its own use. KernelStack - allocated for each task done by the kernel. PageTables - dedicated to the lowest level of page tables (a page table is used to turn a virtual address into a physical memory address). VmallocUsed - being used as virtual address space. Percpu - allocated to the per-CPU allocator used to back per-CPU allocations (excludes the cost of metadata). When you create a per-CPU variable, each processor on the system gets its own copy of that variable. mem.kernel SLAB Slab memory statistics. Reclaimable - amount of memory which the kernel can reuse. Unreclaimable - cannot be reused even when the kernel is lacking memory. mem.slab SYNCHRONIZATION (EBPF) Number of calls to syscalls responsible for transferring modified Linux page cache to disk. This chart has a relationship with File systems and Linux Page Cache. This chart is provided by the eBPF plugin.
mem.file_sync Number of calls to the syscall responsible for syncing the in-core copy of a mapped file. This chart has a relationship with File systems and Linux Page Cache. This chart is provided by the eBPF plugin. mem.memory_map Number of calls to syscalls that sync filesystem metadata or cached data. This chart has a relationship with File systems and Linux Page Cache. This chart is provided by the eBPF plugin. mem.sync Number of calls to the syscall responsible for syncing file segments. This chart has a relationship with File systems and Linux Page Cache. This chart is provided by the eBPF plugin. mem.file_segment -------------------------------------------------------------------------------- DISKS Charts with performance information for all the system disks. Special care has been given to present disk performance metrics in a way compatible with iostat -x. By default, Netdata prevents rendering performance charts for individual partitions and unmounted virtual disks. Disabled charts can still be enabled by configuring the relevant settings in the Netdata configuration file. SDA disk.sda disk.sda disk_util.sda The amount of data transferred to and from disk. disk.sda The amount of discarded data that is no longer in use by a mounted file system. disk_ext.sda Completed disk I/O operations. Keep in mind the number of operations requested might be higher, since the system is able to merge operations that are adjacent to each other (see merged operations chart). disk_ops.sda The number (after merges) of completed discard/flush requests. Discard commands inform disks which blocks of data are no longer considered to be in use and therefore can be erased internally. They are useful for solid-state drives (SSDs) and thinly-provisioned storage. Discarding/trimming enables the SSD to handle garbage collection more efficiently, which would otherwise slow down future write operations to the involved blocks.
Flush operations transfer all modified in-core data (i.e., modified buffer cache pages) to the disk device so that all changed information can be retrieved even if the system crashes or is rebooted. Flush requests are executed by disks. Flush requests are not tracked for partitions. Before being merged, flush operations are counted as writes. disk_ext_ops.sda I/O operations currently in progress. This metric is a snapshot - it is not an average over the last interval. disk_qops.sda Backlog is an indication of the duration of pending disk operations. On every I/O event the system is multiplying the time spent doing I/O since the last update of this field with the number of pending operations. While not accurate, this metric can provide an indication of the expected completion time of the operations in progress. disk_backlog.sda Disk Busy Time measures the amount of time the disk was busy with something. disk_busy.sda Disk Utilization measures the amount of time the disk was busy with something. This is not related to its performance. 100% means that the system always had an outstanding operation on the disk.
Keep in mind that depending on the underlying technology of the disk, 100% here may or may not be an indication of congestion. disk_util.sda The average time for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them. disk_await.sda The average time for discard/flush requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them. disk_ext_await.sda The average I/O operation size. disk_avgsz.sda The average discard operation size. disk_ext_avgsz.sda The average service time for completed I/O operations. This metric is calculated using the total busy time of the disk and the number of completed operations. If the disk is able to execute multiple parallel operations, the reported average service time will be misleading. disk_svctm.sda The number of merged disk operations. The system is able to merge adjacent I/O operations, for example two 4KB reads can become one 8KB read before being given to the disk. disk_mops.sda The number of merged discard disk operations. Discard operations which are adjacent to each other may be merged for efficiency. disk_ext_mops.sda The sum of the duration of all completed I/O operations. This number can exceed the interval if the disk is able to execute I/O operations in parallel. disk_iotime.sda The sum of the duration of all completed discard/flush operations. This number can exceed the interval if the disk is able to execute discard/flush operations in parallel. disk_ext_iotime.sda / Disk space utilization. reserved for root is automatically reserved by the system to prevent the root user from getting out of space. disk_space._ Inodes (or index nodes) are filesystem objects (e.g. files and directories). On many types of file system implementations, the maximum number of inodes is fixed at filesystem creation, limiting the maximum number of files the filesystem can hold.
It is possible for a device to run out of inodes. When this happens, new files cannot be created on the device, even though there may be free space available. disk_inodes._ /BOOT/EFI Disk space utilization. reserved for root is automatically reserved by the system to prevent the root user from getting out of space. disk_space._boot_efi /DEV Disk space utilization. reserved for root is automatically reserved by the system to prevent the root user from getting out of space. disk_space._dev Inodes (or index nodes) are filesystem objects (e.g. files and directories). On many types of file system implementations, the maximum number of inodes is fixed at filesystem creation, limiting the maximum number of files the filesystem can hold. It is possible for a device to run out of inodes. When this happens, new files cannot be created on the device, even though there may be free space available. disk_inodes._dev /DEV/SHM Disk space utilization. reserved for root is automatically reserved by the system to prevent the root user from getting out of space. disk_space._dev_shm Inodes (or index nodes) are filesystem objects (e.g. files and directories). On many types of file system implementations, the maximum number of inodes is fixed at filesystem creation, limiting the maximum number of files the filesystem can hold. It is possible for a device to run out of inodes. When this happens, new files cannot be created on the device, even though there may be free space available. disk_inodes._dev_shm /RUN Disk space utilization. reserved for root is automatically reserved by the system to prevent the root user from getting out of space. disk_space._run Inodes (or index nodes) are filesystem objects (e.g. files and directories). On many types of file system implementations, the maximum number of inodes is fixed at filesystem creation, limiting the maximum number of files the filesystem can hold. It is possible for a device to run out of inodes. 
When this happens, new files cannot be created on the device, even though there may be free space available. disk_inodes._run /RUN/LOCK Disk space utilization. reserved for root is automatically reserved by the system to prevent the root user from getting out of space. disk_space._run_lock Inodes (or index nodes) are filesystem objects (e.g. files and directories). On many types of file system implementations, the maximum number of inodes is fixed at filesystem creation, limiting the maximum number of files the filesystem can hold. It is possible for a device to run out of inodes. When this happens, new files cannot be created on the device, even though there may be free space available. disk_inodes._run_lock -------------------------------------------------------------------------------- FILESYSTEM Number of filesystem events for Virtual File System, File Access, Directory cache, and file system latency (BTRFS, EXT4, NFS, XFS, and ZFS) when your disk has the file system. Filesystem charts have a relationship with SWAP, Disk, Sync, and Mount Points. VFS (EBPF) Number of calls to Virtual File System functions used to manipulate File Systems. Number of calls to VFS unlinker function. This chart may not show all file system events if it uses other functions to store data on disk. Netdata shows virtual file system metrics per application and cgroup (systemd Services) if apps or cgroup (systemd Services) plugins are enabled. This chart is provided by the eBPF plugin to monitor File Systems. filesystem.vfs_deleted_objects Number of calls to VFS I/O functions. This chart may not show all file system events if it uses other functions to store data on disk. Netdata shows virtual file system metrics per application and cgroup (systemd Services) if apps or cgroup (systemd Services) plugins are enabled. This chart is provided by the eBPF plugin to monitor File Systems. filesystem.vfs_io Total number of bytes successfully read or written using the VFS I/O functions.
This chart may not show all file system events if it uses other functions to store data on disk. Netdata shows virtual file system metrics per application and cgroup (systemd Services) if apps or cgroup (systemd Services) plugins are enabled. This chart is provided by the eBPF plugin to monitor File Systems. filesystem.vfs_io_bytes Number of calls to VFS syncer function. This chart may not show all file system events if it uses other functions to sync data on disk. Netdata shows virtual file system metrics per application and cgroup (systemd Services) if apps or cgroup (systemd Services) plugins are enabled. This chart is provided by the eBPF plugin to monitor File Systems. filesystem.vfs_fsync Number of calls to VFS opener function. This chart may not show all file system events if it uses other functions to open files. Netdata shows virtual file system metrics per application and cgroup (systemd Services) if apps or cgroup (systemd Services) plugins are enabled. This chart is provided by the eBPF plugin to monitor File Systems. filesystem.vfs_open Number of calls to VFS creator function. This chart may not show all file system events if it uses other functions to create files. Netdata shows virtual file system metrics per application and cgroup (systemd Services) if apps or cgroup (systemd Services) plugins are enabled. This chart is provided by the eBPF plugin to monitor File Systems. filesystem.vfs_create FILE ACCESS (EBPF) Number of calls to internal Linux kernel functions responsible for opening and closing files. Netdata shows file access per application and cgroup (systemd Services) if apps or cgroup (systemd Services) plugins are enabled. This chart is provided by the eBPF plugin to monitor File systems. filesystem.file_descriptor -------------------------------------------------------------------------------- MOUNT POINTS MOUNT (EBPF) Monitor calls to syscalls that are responsible for attaching (mount(2)) or removing filesystems (umount(2)).
This chart has a relationship with File systems. This chart is provided by the eBPF plugin. mount_points.call Monitor errors in calls to syscalls that are responsible for attaching (mount(2)) or removing filesystems (umount(2)). This chart has a relationship with File systems. This chart is provided by the eBPF plugin. mount_points.error -------------------------------------------------------------------------------- NETWORKING STACK Metrics for the networking stack of the system. These metrics are collected from /proc/net/netstat or by attaching kprobes to kernel functions, apply to both IPv4 and IPv6 traffic and are related to operation of the kernel networking stack. TCP TCP connection aborts. BadData - happens while the connection is in FIN_WAIT1 and the kernel receives a packet with a sequence number beyond the last one for this connection - the kernel responds with RST (closes the connection). UserClosed - happens when the kernel receives data on an already closed connection and responds with RST. NoMemory - happens when there are too many orphaned sockets (not attached to an fd) and the kernel has to drop a connection - sometimes it will send an RST, sometimes it won't. Timeout - happens when a connection times out.
Linger - happens when the kernel killed a socket that was already closed by the application and lingered around for long enough. Failed - happens when the kernel attempted to send an RST but failed because there was no memory available. ip.tcpconnaborts TCP prevents out-of-order packets by either sequencing them in the correct order or by requesting the retransmission of out-of-order packets. Timestamp - detected re-ordering using the timestamp option. SACK - detected re-ordering using Selective Acknowledgment algorithm. FACK - detected re-ordering using Forward Acknowledgment algorithm. Reno - detected re-ordering using Fast Retransmit algorithm. ip.tcpreorders TCP maintains an out-of-order queue to keep the out-of-order packets in the TCP communication. InQueue - the TCP layer receives an out-of-order packet and has enough memory to queue it. Dropped - the TCP layer receives an out-of-order packet but does not have enough memory, so drops it. Merged - the received out-of-order packet overlaps with the previous packet. The overlapping part will be dropped. All these packets will also be counted into InQueue. Pruned - packets dropped from out-of-order queue because of socket buffer overrun. ip.tcpofo BROADCAST In computer networking, broadcasting refers to transmitting a packet that will be received by every device on the network. In practice, the scope of the broadcast is limited to a broadcast domain. Total broadcast traffic in the system. ip.bcast Total transferred broadcast packets in the system. ip.bcastpkts MULTICAST IP multicast is a technique for one-to-many communication over an IP network. Multicast uses network infrastructure efficiently by requiring the source to send a packet only once, even if it needs to be delivered to a large number of receivers. The nodes in the network take care of replicating the packet to reach multiple receivers only when necessary. Total multicast traffic in the system.
ip.mcast Total transferred multicast packets in the system. ip.mcastpkts ECN Explicit Congestion Notification (ECN) is an extension to the IP and to the TCP that allows end-to-end notification of network congestion without dropping packets. ECN is an optional feature that may be used between two ECN-enabled endpoints when the underlying network infrastructure also supports it. Total number of received IP packets with ECN bits set in the system. CEP - congestion encountered. NoECTP - non ECN-capable transport. ECTP0 and ECTP1 - ECN capable transport. ip.ecnpkts KERNEL FUNCTIONS (EBPF) Number of calls to functions responsible for receiving connections. This chart is provided by the eBPF plugin. ip.inbound_conn Number of calls to TCP functions responsible for starting connections. Netdata shows TCP outbound connections metrics per application and cgroup (systemd Services) if apps or cgroup (systemd Services) plugins are enabled. This chart is provided by the eBPF plugin. ip.tcp_outbound_conn Number of calls to TCP functions responsible for exchanging data. Netdata shows TCP outbound connections metrics per application and cgroup (systemd Services) if apps or cgroup (systemd Services) plugins are enabled. This chart is provided by the eBPF plugin. ip.tcp_functions Total bytes sent and received with TCP internal functions. Netdata shows TCP bandwidth metrics per application and cgroup (systemd Services) if apps or cgroup (systemd Services) plugins are enabled. This chart is provided by the eBPF plugin. ip.total_tcp_bandwidth Number of times a TCP packet was retransmitted. Netdata shows TCP retransmit per application and cgroup (systemd Services) if apps or cgroup (systemd Services) plugins are enabled. This chart is provided by the eBPF plugin. ip.tcp_retransmit Number of calls to UDP functions responsible for exchanging data. 
Netdata shows TCP outbound connections metrics per application and cgroup (systemd Services) if apps or cgroup (systemd Services) plugins are enabled. This chart is provided by the eBPF plugin. ip.udp_functions Total bytes sent and received with UDP internal functions. Netdata shows UDP bandwidth metrics per application and cgroup (systemd Services) if apps or cgroup (systemd Services) plugins are enabled. This chart is provided by the eBPF plugin. ip.total_udp_bandwidth -------------------------------------------------------------------------------- IPV4 NETWORKING Metrics for the IPv4 stack of the system. Internet Protocol version 4 (IPv4) is the fourth version of the Internet Protocol (IP). It is one of the core protocols of standards-based internetworking methods in the Internet. IPv4 is a connectionless protocol for use on packet-switched networks. It operates on a best effort delivery model, in that it does not guarantee delivery, nor does it assure proper sequencing or avoidance of duplicate delivery. These aspects, including data integrity, are addressed by an upper layer transport protocol, such as the Transmission Control Protocol (TCP). SOCKETS The total number of used sockets for all address families in this system. ipv4.sockstat_sockets PACKETS IPv4 packets statistics for this host. Received - packets received by the IP layer. This counter will be increased even if the packet is dropped later. Sent - packets sent via IP layer, for both single cast and multicast packets. This counter does not include any packets counted in Forwarded. Forwarded - input packets for which this host was not their final IP destination, as a result of which an attempt was made to find a route to forward them to that final destination. In hosts which do not act as IP Gateways, this counter will include only those packets which were Source-Routed and the Source-Route option processing was successful. Delivered - packets delivered to the upper layer protocols, e.g. 
TCP, UDP, ICMP, and so on. ipv4.packets ERRORS The number of discarded IPv4 packets. InDiscards, OutDiscards - inbound and outbound packets which were chosen to be discarded even though no errors had been detected to prevent their being deliverable to a higher-layer protocol. InHdrErrors - input packets that have been discarded due to errors in their IP headers, including bad checksums, version number mismatch, other format errors, time-to-live exceeded, errors discovered in processing their IP options, etc. OutNoRoutes - packets that have been discarded because no route could be found to transmit them to their destination. This includes any packets which a host cannot route because all of its default gateways are down. InAddrErrors - input packets that have been discarded due to invalid IP address or the destination IP address is not a local address and IP forwarding is not enabled. InUnknownProtos - input packets which were discarded because of an unknown or unsupported protocol. ipv4.errors ICMP The number of transferred IPv4 ICMP messages. Received, Sent - ICMP messages which the host received and attempted to send. Both these counters include errors. ipv4.icmp The number of IPv4 ICMP errors.
InErrors - received ICMP messages but determined as having ICMP-specific errors, e.g. bad ICMP checksums, bad length, etc. OutErrors - ICMP messages which this host did not send due to problems discovered within ICMP such as a lack of buffers. This counter does not include errors discovered outside the ICMP layer such as the inability of IP to route the resultant datagram. InCsumErrors - received ICMP messages with bad checksum. ipv4.icmp_errors The number of transferred IPv4 ICMP control messages. ipv4.icmpmsg TCP The number of TCP connections for which the current state is either ESTABLISHED or CLOSE-WAIT. This is a snapshot of the established connections at the time of measurement (i.e. a connection established and a connection disconnected within the same iteration will not affect this metric). ipv4.tcpsock The number of TCP sockets in the system in certain states. Alloc - in any TCP state. Orphan - no longer attached to a socket descriptor in any user processes, but for which the kernel is still required to maintain state in order to complete the transport protocol. InUse - in any TCP state, excluding TIME-WAIT and CLOSED. TimeWait - in the TIME-WAIT state. ipv4.sockstat_tcp_sockets The number of packets transferred by the TCP layer. Received - received packets, including those received in error, such as checksum error, invalid TCP header, and so on. Sent - sent packets, excluding the retransmitted packets. But it includes the SYN, ACK, and RST packets. ipv4.tcppackets TCP connection statistics. Active - number of outgoing TCP connections attempted by this host. Passive - number of incoming TCP connections accepted by this host. ipv4.tcpopens TCP errors. InErrs - TCP segments received in error (including header too small, checksum errors, sequence errors, bad packets - for both IPv4 and IPv6). InCsumErrors - TCP segments received with checksum errors (for both IPv4 and IPv6). RetransSegs - TCP segments retransmitted. ipv4.tcperrors TCP handshake statistics. 
EstabResets - established connections resets (i.e. connections that made a direct transition from ESTABLISHED or CLOSE_WAIT to CLOSED). OutRsts - TCP segments sent, with the RST flag set (for both IPv4 and IPv6). AttemptFails - number of times TCP connections made a direct transition from either SYN_SENT or SYN_RECV to CLOSED, plus the number of times TCP connections made a direct transition from the SYN_RECV to LISTEN. SynRetrans - shows retries for new outbound TCP connections, which can indicate general connectivity issues or backlog on the remote host. ipv4.tcphandshake The amount of memory used by allocated TCP sockets. ipv4.sockstat_tcp_mem UDP The number of used UDP sockets. ipv4.sockstat_udp_sockets The number of transferred UDP packets. ipv4.udppackets The number of errors encountered during transferring UDP packets. RcvbufErrors - receive buffer is full. SndbufErrors - send buffer is full, no kernel memory available, or the IP layer reported an error when trying to send the packet and no error queue has been setup. InErrors - that is an aggregated counter for all errors, excluding NoPorts. NoPorts - no application is listening at the destination port. InCsumErrors - a UDP checksum failure is detected. IgnoredMulti - ignored multicast packets. ipv4.udperrors The amount of memory used by allocated UDP sockets. ipv4.sockstat_udp_mem -------------------------------------------------------------------------------- IPV6 NETWORKING Metrics for the IPv6 stack of the system. Internet Protocol version 6 (IPv6) is the most recent version of the Internet Protocol (IP), the communications protocol that provides an identification and location system for computers on networks and routes traffic across the Internet. IPv6 was developed by the Internet Engineering Task Force (IETF) to deal with the long-anticipated problem of IPv4 address exhaustion. IPv6 is intended to replace IPv4. PACKETS IPv6 packet statistics for this host. Received - packets received by the IP layer. 
This counter will be increased even if the packet is dropped later. Sent - packets sent via IP layer, for both single cast and multicast packets. This counter does not include any packets counted in Forwarded. Forwarded - input packets for which this host was not their final IP destination, as a result of which an attempt was made to find a route to forward them to that final destination. In hosts which do not act as IP Gateways, this counter will include only those packets which were Source-Routed and the Source-Route option processing was successful. Delivers - packets delivered to the upper layer protocols, e.g. TCP, UDP, ICMP, and so on. ipv6.packets Total number of received IPv6 packets with ECN bits set in the system. CEP - congestion encountered. NoECTP - non ECN-capable transport. ECTP0 and ECTP1 - ECN capable transport. ipv6.ect ERRORS The number of discarded IPv6 packets. InDiscards, OutDiscards - packets which were chosen to be discarded even though no errors had been detected to prevent their being deliverable to a higher-layer protocol. InHdrErrors - errors in IP headers, including bad checksums, version number mismatch, other format errors, time-to-live exceeded, etc. InAddrErrors - invalid IP address or the destination IP address is not a local address and IP forwarding is not enabled. InUnknownProtos - unknown or unsupported protocol. InTooBigErrors - the size exceeded the link MTU. InTruncatedPkts - packet frame did not carry enough data. InNoRoutes - no route could be found while forwarding. OutNoRoutes - no route could be found for packets generated by this host. ipv6.errors TCP6 The number of TCP sockets in any state, excluding TIME-WAIT and CLOSED. ipv6.sockstat6_tcp_sockets UDP6 The number of used UDP sockets. ipv6.sockstat6_udp_sockets The number of transferred UDP packets. ipv6.udppackets The number of errors encountered during transferring UDP packets. RcvbufErrors - receive buffer is full. 
SndbufErrors - send buffer is full, no kernel memory available, or the IP layer reported an error when trying to send the packet and no error queue has been setup. InErrors - that is an aggregated counter for all errors, excluding NoPorts. NoPorts - no application is listening at the destination port. InCsumErrors - a UDP checksum failure is detected. IgnoredMulti - ignored multicast packets. ipv6.udperrors RAW6 The number of used raw sockets. ipv6.sockstat6_raw_sockets MULTICAST6 Total IPv6 multicast traffic. ipv6.mcast Total transferred IPv6 multicast packets. ipv6.mcastpkts ICMP6 The number of transferred ICMPv6 messages. Received, Sent - ICMP messages which the host received and attempted to send. Both these counters include errors. ipv6.icmp The number of ICMPv6 errors and error messages. InErrors, OutErrors - bad ICMP messages (bad ICMP checksums, bad length, etc.). InCsumErrors - wrong checksum. ipv6.icmperrors The number of transferred ICMPv6 Router Discovery messages. Router Solicitations message is sent from a computer host to any routers on the local area network to request that they advertise their presence on the network. Router Advertisement message is sent by a router on the local area network to announce its IP address as available for routing. ipv6.icmprouter The number of transferred ICMPv6 Neighbour Discovery messages. Neighbor Solicitations are used by nodes to determine the link layer address of a neighbor, or to verify that a neighbor is still reachable via a cached link layer address. Neighbor Advertisements are used by nodes to respond to a Neighbor Solicitation message. ipv6.icmpneighbor The number of transferred ICMPv6 Multicast Listener Discovery (MLD) messages. ipv6.icmpmldv2 The number of transferred ICMPv6 messages of certain types. ipv6.icmptypes -------------------------------------------------------------------------------- NETWORK INTERFACES Performance metrics for network interfaces. 
Netdata retrieves this data reading the /proc/net/dev file and /sys/class/net/ directory. BR-98AEBA5F4348 net.br-98aeba5f4348 net.br-98aeba5f4348 The amount of traffic transferred by the network interface. net.br-98aeba5f4348 The number of packets transferred by the network interface. Received multicast counter is commonly calculated at the device level (unlike received) and therefore may include packets which did not reach the host. net_packets.br-98aeba5f4348 The current operational state of the interface. Unknown - the state can not be determined. NotPresent - the interface has missing (typically, hardware) components. Down - the interface is unable to transfer data on L1, e.g. ethernet is not plugged or interface is administratively down. LowerLayerDown - the interface is down due to state of lower-layer interface(s). Testing - the interface is in testing mode, e.g. cable test. It can’t be used for normal traffic until tests complete. Dormant - the interface is L1 up, but waiting for an external event, e.g. for a protocol to establish. Up - the interface is ready to pass packets and can be used. net_operstate.br-98aeba5f4348 The current physical link state of the interface. net_carrier.br-98aeba5f4348 The interface's currently configured Maximum transmission unit (MTU) value. MTU is the size of the largest protocol data unit that can be communicated in a single network layer transaction. net_mtu.br-98aeba5f4348 BR-BB58692CE540 net.br-bb58692ce540 net.br-bb58692ce540 The amount of traffic transferred by the network interface. net.br-bb58692ce540 The number of packets transferred by the network interface. Received multicast counter is commonly calculated at the device level (unlike received) and therefore may include packets which did not reach the host. net_packets.br-bb58692ce540 The current operational state of the interface. Unknown - the state can not be determined. NotPresent - the interface has missing (typically, hardware) components. 
Down - the interface is unable to transfer data on L1, e.g. ethernet is not plugged or interface is administratively down. LowerLayerDown - the interface is down due to state of lower-layer interface(s). Testing - the interface is in testing mode, e.g. cable test. It can’t be used for normal traffic until tests complete. Dormant - the interface is L1 up, but waiting for an external event, e.g. for a protocol to establish. Up - the interface is ready to pass packets and can be used. net_operstate.br-bb58692ce540 The current physical link state of the interface. net_carrier.br-bb58692ce540 The interface's currently configured Maximum transmission unit (MTU) value. MTU is the size of the largest protocol data unit that can be communicated in a single network layer transaction. net_mtu.br-bb58692ce540 ENS3 net.ens3 net.ens3 The amount of traffic transferred by the network interface. net.ens3 The number of packets transferred by the network interface. Received multicast counter is commonly calculated at the device level (unlike received) and therefore may include packets which did not reach the host. net_packets.ens3 The interface's latest or current speed that the network adapter negotiated with the device it is connected to. This does not give the max supported speed of the NIC. net_speed.ens3 The interface's latest or current duplex that the network adapter negotiated with the device it is connected to. Unknown - the duplex mode can not be determined. Half duplex - the communication is one direction at a time. Full duplex - the interface is able to send and receive data simultaneously. net_duplex.ens3 The current operational state of the interface. Unknown - the state can not be determined. NotPresent - the interface has missing (typically, hardware) components. Down - the interface is unable to transfer data on L1, e.g. ethernet is not plugged or interface is administratively down. LowerLayerDown - the interface is down due to state of lower-layer interface(s). 
Testing - the interface is in testing mode, e.g. cable test. It can’t be used for normal traffic until tests complete. Dormant - the interface is L1 up, but waiting for an external event, e.g. for a protocol to establish. Up - the interface is ready to pass packets and can be used. net_operstate.ens3 The current physical link state of the interface. net_carrier.ens3 The interface's currently configured Maximum transmission unit (MTU) value. MTU is the size of the largest protocol data unit that can be communicated in a single network layer transaction. net_mtu.ens3 BR-B1C5C96C9768 The current operational state of the interface. Unknown - the state can not be determined. NotPresent - the interface has missing (typically, hardware) components. Down - the interface is unable to transfer data on L1, e.g. ethernet is not plugged or interface is administratively down. LowerLayerDown - the interface is down due to state of lower-layer interface(s). Testing - the interface is in testing mode, e.g. cable test. It can’t be used for normal traffic until tests complete. Dormant - the interface is L1 up, but waiting for an external event, e.g. for a protocol to establish. Up - the interface is ready to pass packets and can be used. net_operstate.br-b1c5c96c9768 The current physical link state of the interface. net_carrier.br-b1c5c96c9768 The interface's currently configured Maximum transmission unit (MTU) value. MTU is the size of the largest protocol data unit that can be communicated in a single network layer transaction. net_mtu.br-b1c5c96c9768 DOCKER0 The current operational state of the interface. Unknown - the state can not be determined. NotPresent - the interface has missing (typically, hardware) components. Down - the interface is unable to transfer data on L1, e.g. ethernet is not plugged or interface is administratively down. LowerLayerDown - the interface is down due to state of lower-layer interface(s). Testing - the interface is in testing mode, e.g. cable test. 
It can’t be used for normal traffic until tests complete. Dormant - the interface is L1 up, but waiting for an external event, e.g. for a protocol to establish. Up - the interface is ready to pass packets and can be used. net_operstate.docker0 The current physical link state of the interface. net_carrier.docker0 The interface's currently configured Maximum transmission unit (MTU) value. MTU is the size of the largest protocol data unit that can be communicated in a single network layer transaction. net_mtu.docker0 -------------------------------------------------------------------------------- FIREWALL (NETFILTER) Performance metrics of the netfilter components. CONNECTION TRACKER Netfilter Connection Tracker performance metrics. The connection tracker keeps track of all connections of the machine, inbound and outbound. It works by keeping a database with all open connections, tracking network and address translation and connection expectations. The number of entries in the conntrack table. netfilter.conntrack_sockets NETLINK netfilter.netlink_new netfilter.netlink_changes netfilter.netlink_expect netfilter.netlink_errors netfilter.netlink_search -------------------------------------------------------------------------------- SYSTEMD SERVICES Resource utilization of systemd services. Netdata monitors all systemd services via cgroups (the resource accounting mechanism used by containers). CPU Total CPU utilization within the system-wide CPU resources (all cores). The amount of time spent by tasks of the cgroup in user and kernel modes. services.cpu MEM The amount of used RAM. services.mem_usage SWAP The amount of used swap memory. services.swap_usage DISK The amount of data transferred from specific devices as seen by the CFQ scheduler. It is not updated when the CFQ scheduler is operating on a request queue. services.io_read The amount of data transferred to specific devices as seen by the CFQ scheduler. 
It is not updated when the CFQ scheduler is operating on a request queue. services.io_write The number of read operations performed on specific devices as seen by the CFQ scheduler. services.io_ops_read The number of write operations performed on specific devices as seen by the CFQ scheduler. services.io_ops_write -------------------------------------------------------------------------------- APPLICATIONS Per application statistics are collected using apps.plugin. This plugin walks through all processes and aggregates statistics for application groups. The plugin also counts the resources of exited children. So for processes like shell scripts, the reported values include the resources used by the commands these scripts run within each timeframe. CPU Total CPU utilization (all cores). It includes user, system and guest time. apps.cpu The amount of time the CPU was busy executing code in user mode (all cores). apps.cpu_user The amount of time the CPU was busy executing code in kernel mode (all cores). apps.cpu_system DISK The amount of data that has been read from the storage layer. Actual physical disk I/O was required. apps.preads The amount of data that has been written to the storage layer. Actual physical disk I/O was required. apps.pwrites The amount of data that has been read from the storage layer. It includes things such as terminal I/O and is unaffected by whether or not actual physical disk I/O was required (the read might have been satisfied from pagecache). apps.lreads The amount of data that has been written or shall be written to the storage layer. It includes things such as terminal I/O and is unaffected by whether or not actual physical disk I/O was required. apps.lwrites The number of open files and directories. apps.files MEM Real memory (RAM) used by applications. This does not include shared memory. apps.mem Virtual memory allocated by applications. Check this article for more information. 
apps.vmem The number of minor faults which have not required loading a memory page from the disk. Minor page faults occur when a process needs data that is in memory and is assigned to another process. They share memory pages between multiple processes – no additional data needs to be read from disk to memory. apps.minor_faults apps.oomkills PROCESSES The number of threads. apps.threads The number of processes. apps.processes The period of time within which at least one process in the group has been running. apps.uptime The number of open pipes. A pipe is a unidirectional data channel that can be used for interprocess communication. apps.pipes Number of times a function that starts a process is called. Netdata gives a summary for this chart in Process, and when the integration is enabled, Netdata shows process per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.process_create Number of times a function that starts a thread is called. Netdata gives a summary for this chart in Process, and when the integration is enabled, Netdata shows process per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.thread_create Number of times a function responsible for closing tasks is called. Netdata gives a summary for this chart in Process, and when the integration is enabled, Netdata shows process per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.task_exit Number of times a function responsible for releasing tasks is called. Netdata gives a summary for this chart in Process, and when the integration is enabled, Netdata shows process per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.task_close SWAP The amount of swapped-out virtual memory by anonymous private pages. This does not include shared swap memory. apps.swap The number of major faults which have required loading a memory page from the disk. 
Major page faults occur because of the absence of the required page from the RAM. They are expected when a process starts or needs to read in additional data and in these cases do not indicate a problem condition. However, a major page fault can also be the result of reading memory pages that have been written out to the swap file, which could indicate a memory shortage. apps.major_faults NETWORK Netdata also gives a summary for eBPF charts in Networking Stack submenu. The number of open sockets. Sockets are a way to enable inter-process communication between programs running on a server, or between programs running on separate servers. This includes both network and UNIX sockets. apps.sockets Number of calls to IPV4 TCP function responsible for starting connections. Netdata gives a summary for this chart in Network Stack. When the integration is enabled, Netdata shows outbound connections per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.outbound_conn_v4 Number of calls to IPV6 TCP function responsible for starting connections. Netdata gives a summary for this chart in Network Stack. When the integration is enabled, Netdata shows outbound connections per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.outbound_conn_v6 Total bytes sent with TCP or UDP internal functions. Netdata gives a summary for this chart in Network Stack. When the integration is enabled, Netdata shows bandwidth per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.total_bandwidth_sent Total bytes received with TCP or UDP internal functions. Netdata gives a summary for this chart in Network Stack. When the integration is enabled, Netdata shows bandwidth per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.total_bandwidth_recv Number of calls to TCP functions responsible to send data. Netdata gives a summary for this chart in Network Stack. 
When the integration is enabled, Netdata shows TCP calls per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.bandwidth_tcp_send Number of calls to TCP functions responsible to receive data. Netdata gives a summary for this chart in Network Stack. When the integration is enabled, Netdata shows TCP calls per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.bandwidth_tcp_recv Number of times a TCP packet was retransmitted. Netdata gives a summary for this chart in Network Stack. When the integration is enabled, Netdata shows TCP calls per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.bandwidth_tcp_retransmit Number of calls to UDP functions responsible to send data. Netdata gives a summary for this chart in Network Stack. When the integration is enabled, Netdata shows UDP calls per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.bandwidth_udp_send Number of calls to UDP functions responsible to receive data. Netdata gives a summary for this chart in Network Stack. When the integration is enabled, Netdata shows UDP calls per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.bandwidth_udp_recv FILE ACCESS (EBPF) Netdata also gives a summary for this chart on Filesystem submenu (more details on eBPF plugin file chart section). Number of calls for internal functions on the Linux kernel responsible to open files. Netdata gives a summary for this chart in file access, and when the integration is enabled, Netdata shows virtual file system per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.file_open Number of calls for internal functions on the Linux kernel responsible to close files. Netdata gives a summary for this chart in file access, and when the integration is enabled, Netdata shows virtual file system per cgroup (systemd Services). This chart is provided by the eBPF plugin. 
apps.file_closed VFS (EBPF) Netdata also gives a summary for these charts in Filesystem submenu. Number of calls to VFS unlinker function. Netdata gives a summary for this chart in Virtual File System, and when the integration is enabled, Netdata shows virtual file system per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.file_deleted Number of successful calls to VFS writer function. Netdata gives a summary for this chart in Virtual File System, and when the integration is enabled, Netdata shows virtual file system per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.vfs_write_call Number of successful calls to VFS reader function. Netdata gives a summary for this chart in Virtual File System, and when the integration is enabled, Netdata shows virtual file system per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.vfs_read_call Total bytes successfully written using the VFS writer function. Netdata gives a summary for this chart in Virtual File System, and when the integration is enabled, Netdata shows virtual file system per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.vfs_write_bytes Total bytes successfully read using the VFS reader function. Netdata gives a summary for this chart in Virtual File System, and when the integration is enabled, Netdata shows virtual file system per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.vfs_read_bytes Number of calls to VFS syncer function. Netdata gives a summary for this chart in Virtual File System, and when the integration is enabled, Netdata shows virtual file system per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.vfs_fsync Number of calls to VFS opener function. Netdata gives a summary for this chart in Virtual File System, and when the integration is enabled, Netdata shows virtual file system per cgroup (systemd Services). 
This chart is provided by the eBPF plugin. apps.vfs_open Number of calls to VFS creator function. Netdata gives a summary for this chart in Virtual File System, and when the integration is enabled, Netdata shows virtual file system per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.vfs_create IPC SHM (EBPF) Number of calls to shmget. Netdata gives a summary for this chart in System Overview, and when the integration is enabled, Netdata shows shared memory metrics per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.shmget_call Number of calls to shmat. Netdata gives a summary for this chart in System Overview, and when the integration is enabled, Netdata shows shared memory metrics per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.shmat_call Number of calls to shmdt. Netdata gives a summary for this chart in System Overview, and when the integration is enabled, Netdata shows shared memory metrics per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.shmdt_call Number of calls to shmctl. Netdata gives a summary for this chart in System Overview, and when the integration is enabled, Netdata shows shared memory metrics per cgroup (systemd Services). This chart is provided by the eBPF plugin. apps.shmctl_call -------------------------------------------------------------------------------- USER GROUPS Per user group statistics are collected using apps.plugin. This plugin walks through all processes and aggregates statistics per user group. The plugin also counts the resources of exited children. So for processes like shell scripts, the reported values include the resources used by the commands these scripts run within each timeframe. CPU Total CPU utilization (all cores). It includes user, system and guest time. groups.cpu The amount of time the CPU was busy executing code in user mode (all cores). 
groups.cpu_user The amount of time the CPU was busy executing code in kernel mode (all cores). groups.cpu_system DISK The amount of data that has been read from the storage layer. Actual physical disk I/O was required. groups.preads The amount of data that has been written to the storage layer. Actual physical disk I/O was required. groups.pwrites The amount of data that has been read from the storage layer. It includes things such as terminal I/O and is unaffected by whether or not actual physical disk I/O was required (the read might have been satisfied from pagecache). groups.lreads The amount of data that has been written or shall be written to the storage layer. It includes things such as terminal I/O and is unaffected by whether or not actual physical disk I/O was required. groups.lwrites The number of open files and directories. groups.files MEM Real memory (RAM) used per user group. This does not include shared memory. groups.mem Virtual memory allocated per user group since the Netdata restart. Please check this article for more information. groups.vmem The number of minor faults which have not required loading a memory page from the disk. Minor page faults occur when a process needs data that is in memory and is assigned to another process. They share memory pages between multiple processes – no additional data needs to be read from disk to memory. groups.minor_faults PROCESSES The number of threads. groups.threads The number of processes. groups.processes The period of time within which at least one process in the group has been running. groups.uptime The number of open pipes. A pipe is a unidirectional data channel that can be used for interprocess communication. groups.pipes SWAP The amount of swapped-out virtual memory by anonymous private pages. This does not include shared swap memory. groups.swap The number of major faults which have required loading a memory page from the disk. 
Major page faults occur because of the absence of the required page from the RAM. They are expected when a process starts or needs to read in additional data and in these cases do not indicate a problem condition. However, a major page fault can also be the result of reading memory pages that have been written out to the swap file, which could indicate a memory shortage. groups.major_faults NET The number of open sockets. Sockets are a way to enable inter-process communication between programs running on a server, or between programs running on separate servers. This includes both network and UNIX sockets. groups.sockets -------------------------------------------------------------------------------- USERS Per user statistics are collected using apps.plugin. This plugin walks through all processes and aggregates statistics per user. The plugin also counts the resources of exited children. So for processes like shell scripts, the reported values include the resources used by the commands these scripts run within each timeframe. CPU Total CPU utilization (all cores). It includes user, system and guest time. users.cpu The amount of time the CPU was busy executing code in user mode (all cores). users.cpu_user The amount of time the CPU was busy executing code in kernel mode (all cores). users.cpu_system DISK The amount of data that has been read from the storage layer. Actual physical disk I/O was required. users.preads The amount of data that has been written to the storage layer. Actual physical disk I/O was required. users.pwrites The amount of data that has been read from the storage layer. It includes things such as terminal I/O and is unaffected by whether or not actual physical disk I/O was required (the read might have been satisfied from pagecache). users.lreads The amount of data that has been written or shall be written to the storage layer. It includes things such as terminal I/O and is unaffected by whether or not actual physical disk I/O was required. 
users.lwrites The number of open files and directories. users.files MEM Real memory (RAM) used per user. This does not include shared memory. users.mem Virtual memory allocated per user since the Netdata restart. Please check this article for more information. users.vmem The number of minor faults which have not required loading a memory page from the disk. Minor page faults occur when a process needs data that is in memory and is assigned to another process. They share memory pages between multiple processes – no additional data needs to be read from disk to memory. users.minor_faults PROCESSES The number of threads. users.threads The number of processes. users.processes The period of time within which at least one process in the group has been running. users.uptime The number of open pipes. A pipe is a unidirectional data channel that can be used for interprocess communication. users.pipes SWAP The amount of swapped-out virtual memory by anonymous private pages. This does not include shared swap memory. users.swap The number of major faults which have required loading a memory page from the disk. Major page faults occur because of the absence of the required page from the RAM. They are expected when a process starts or needs to read in additional data and in these cases do not indicate a problem condition. However, a major page fault can also be the result of reading memory pages that have been written out to the swap file, which could indicate a memory shortage. users.major_faults NET The number of open sockets. Sockets are a way to enable inter-process communication between programs running on a server, or between programs running on separate servers. This includes both network and UNIX sockets. 
users.sockets -------------------------------------------------------------------------------- ANOMALY DETECTION Charts relating to anomaly detection. An increase in anomalous dimensions or a higher than usual anomaly_rate could be a sign of abnormal behaviour. Read our anomaly detection guide for more details. DIMENSIONS Total count of dimensions considered anomalous or normal. anomaly_detection.dimensions_on_b626a0d7-867b-47b7-a3d9-5106894a46d8 ANOMALY RATE Percentage of anomalous dimensions. anomaly_detection.anomaly_rate_on_b626a0d7-867b-47b7-a3d9-5106894a46d8 DETECTOR WINDOW The length of the active window used by the detector. anomaly_detection.detector_window_on_b626a0d7-867b-47b7-a3d9-5106894a46d8 DETECTOR EVENTS Flags (0 or 1) to show when an anomaly event has been triggered by the detector. anomaly_detection.detector_events_on_b626a0d7-867b-47b7-a3d9-5106894a46d8 -------------------------------------------------------------------------------- AUTOHEAL Container resource utilization metrics. Netdata reads this information from cgroups (abbreviated from control groups), a Linux kernel feature that limits and accounts resource usage (CPU, memory, disk I/O, network, etc.) of a collection of processes. cgroups together with namespaces (that offer isolation between processes) provide what we usually call: containers. cgroup_autoheal.cpu_limit cgroup_autoheal.mem_usage_limit cgroup_autoheal.net_eth0 cgroup_autoheal.net_eth0 CPU Total CPU utilization within the configured or system-wide (if not set) limits. When the CPU utilization of a cgroup exceeds the limit for the configured period, the tasks belonging to its hierarchy will be throttled and are not allowed to run again until the next period. cgroup_autoheal.cpu_limit Total CPU utilization within the system-wide CPU resources (all cores). The amount of time spent by tasks of the cgroup in user and kernel modes. cgroup_autoheal.cpu The percentage of runnable periods when tasks in a cgroup have been throttled. 
The tasks have not been allowed to run because they have exhausted all of the available time as specified by their CPU quota. cgroup_autoheal.throttled The total time duration for which tasks in a cgroup have been throttled. When an application has used its allotted CPU quota for a given period, it gets throttled until the next period. cgroup_autoheal.throttled_duration CPU Pressure Stall Information. Some indicates the share of time in which at least some tasks are stalled on CPU. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_autoheal.cpu_some_pressure The amount of time some processes have been waiting for CPU time. cgroup_autoheal.cpu_some_pressure_stall_time CPU Pressure Stall Information. Full indicates the share of time in which all non-idle tasks are stalled on CPU resource simultaneously. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_autoheal.cpu_full_pressure The amount of time all non-idle processes have been stalled due to CPU congestion. cgroup_autoheal.cpu_full_pressure_stall_time MEM RAM utilization within the configured or system-wide (if not set) limits. When the RAM utilization of a cgroup exceeds the limit, OOM killer will start killing the tasks belonging to the cgroup. cgroup_autoheal.mem_utilization RAM usage within the configured or system-wide (if not set) limits. When the RAM usage of a cgroup exceeds the limit, OOM killer will start killing the tasks belonging to the cgroup. cgroup_autoheal.mem_usage_limit The amount of used RAM and swap memory. cgroup_autoheal.mem_usage Memory usage statistics. The individual metrics are described in the memory.stat section for cgroup-v1 and cgroup-v2. cgroup_autoheal.mem Dirty is the amount of memory waiting to be written to disk. Writeback is how much memory is actively being written to disk. cgroup_autoheal.writeback Memory page fault statistics. Pgfault - all page faults. Swap - major page faults. 
cgroup_autoheal.pgfaults Memory Pressure Stall Information. Some indicates the share of time in which at least some tasks are stalled on memory. In this state the CPU is still doing productive work. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_autoheal.mem_some_pressure The amount of time some processes have been waiting due to memory congestion. cgroup_autoheal.memory_some_pressure_stall_time Memory Pressure Stall Information. Full indicates the share of time in which all non-idle tasks are stalled on memory resource simultaneously. In this state actual CPU cycles are going to waste, and a workload that spends extended time in this state is considered to be thrashing. This has severe impact on performance. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_autoheal.mem_full_pressure The amount of time all non-idle processes have been stalled due to memory congestion. cgroup_autoheal.memory_full_pressure_stall_time DISK The amount of data transferred to and from specific devices as seen by the CFQ scheduler. It is not updated when the CFQ scheduler is operating on a request queue. cgroup_autoheal.io The number of I/O operations performed on specific devices as seen by the CFQ scheduler. cgroup_autoheal.serviced_ops I/O Pressure Stall Information. Some indicates the share of time in which at least some tasks are stalled on I/O. In this state the CPU is still doing productive work. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_autoheal.io_some_pressure The amount of time some processes have been waiting due to I/O congestion. cgroup_autoheal.io_some_pressure_stall_time I/O Pressure Stall Information. Full indicates the share of time in which all non-idle tasks are stalled on I/O resource simultaneously. In this state actual CPU cycles are going to waste, and a workload that spends extended time in this state is considered to be thrashing. 
This has severe impact on performance. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_autoheal.io_full_pressure The amount of time all non-idle processes have been stalled due to I/O congestion. cgroup_autoheal.io_full_pressure_stall_time NET ETH0 The amount of traffic transferred by the network interface. cgroup_autoheal.net_eth0 The number of packets transferred by the network interface. Received multicast counter is commonly calculated at the device level (unlike received) and therefore may include packets which did not reach the host. cgroup_autoheal.net_packets_eth0 The current operational state of the interface. Unknown - the state can not be determined. NotPresent - the interface has missing (typically, hardware) components. Down - the interface is unable to transfer data on L1, e.g. ethernet is not plugged or interface is administratively down. LowerLayerDown - the interface is down due to state of lower-layer interface(s). Testing - the interface is in testing mode, e.g. cable test. It can’t be used for normal traffic until tests complete. Dormant - the interface is L1 up, but waiting for an external event, e.g. for a protocol to establish. Up - the interface is ready to pass packets and can be used. cgroup_autoheal.net_operstate_eth0 The current physical link state of the interface. cgroup_autoheal.net_carrier_eth0 The interface's currently configured Maximum transmission unit (MTU) value. MTU is the size of the largest protocol data unit that can be communicated in a single network layer transaction. cgroup_autoheal.net_mtu_eth0 -------------------------------------------------------------------------------- DUCKDNS Container resource utilization metrics. Netdata reads this information from cgroups (abbreviated from control groups), a Linux kernel feature that limits and accounts resource usage (CPU, memory, disk I/O, network, etc.) of a collection of processes. 
cgroups together with namespaces (that offer isolation between processes) provide what we usually call: containers. cgroup_duckdns.cpu_limit cgroup_duckdns.mem_usage_limit cgroup_duckdns.net_eth0 cgroup_duckdns.net_eth0 CPU Total CPU utilization within the configured or system-wide (if not set) limits. When the CPU utilization of a cgroup exceeds the limit for the configured period, the tasks belonging to its hierarchy will be throttled and are not allowed to run again until the next period. cgroup_duckdns.cpu_limit Total CPU utilization within the system-wide CPU resources (all cores). The amount of time spent by tasks of the cgroup in user and kernel modes. cgroup_duckdns.cpu The percentage of runnable periods when tasks in a cgroup have been throttled. The tasks have not been allowed to run because they have exhausted all of the available time as specified by their CPU quota. cgroup_duckdns.throttled The total time duration for which tasks in a cgroup have been throttled. When an application has used its allotted CPU quota for a given period, it gets throttled until the next period. cgroup_duckdns.throttled_duration CPU Pressure Stall Information. Some indicates the share of time in which at least some tasks are stalled on CPU. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_duckdns.cpu_some_pressure The amount of time some processes have been waiting for CPU time. cgroup_duckdns.cpu_some_pressure_stall_time CPU Pressure Stall Information. Full indicates the share of time in which all non-idle tasks are stalled on CPU resource simultaneously. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_duckdns.cpu_full_pressure The amount of time all non-idle processes have been stalled due to CPU congestion. cgroup_duckdns.cpu_full_pressure_stall_time MEM RAM utilization within the configured or system-wide (if not set) limits. 
When the RAM utilization of a cgroup exceeds the limit, OOM killer will start killing the tasks belonging to the cgroup. cgroup_duckdns.mem_utilization RAM usage within the configured or system-wide (if not set) limits. When the RAM usage of a cgroup exceeds the limit, OOM killer will start killing the tasks belonging to the cgroup. cgroup_duckdns.mem_usage_limit The amount of used RAM and swap memory. cgroup_duckdns.mem_usage Memory usage statistics. The individual metrics are described in the memory.stat section for cgroup-v1 and cgroup-v2. cgroup_duckdns.mem Dirty is the amount of memory waiting to be written to disk. Writeback is how much memory is actively being written to disk. cgroup_duckdns.writeback Memory page fault statistics. Pgfault - all page faults. Swap - major page faults. cgroup_duckdns.pgfaults Memory Pressure Stall Information. Some indicates the share of time in which at least some tasks are stalled on memory. In this state the CPU is still doing productive work. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_duckdns.mem_some_pressure The amount of time some processes have been waiting due to memory congestion. cgroup_duckdns.memory_some_pressure_stall_time Memory Pressure Stall Information. Full indicates the share of time in which all non-idle tasks are stalled on memory resource simultaneously. In this state actual CPU cycles are going to waste, and a workload that spends extended time in this state is considered to be thrashing. This has severe impact on performance. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_duckdns.mem_full_pressure The amount of time all non-idle processes have been stalled due to memory congestion. cgroup_duckdns.memory_full_pressure_stall_time DISK The amount of data transferred to and from specific devices as seen by the CFQ scheduler. It is not updated when the CFQ scheduler is operating on a request queue. 
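The Pressure Stall Information charts described above are built from kernel PSI data, which Linux exposes as text lines of the form `some avg10=... avg60=... avg300=... total=...` (system-wide under `/proc/pressure/`, and per cgroup in files such as `memory.pressure` on cgroup-v2). The following is a minimal sketch of parsing that text, not Netdata's own collector code; the function name `parse_psi` and the sample values are illustrative assumptions.

```python
def parse_psi(text):
    """Parse PSI text into a dict such as {"some": {...}, "full": {...}}."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        values = {}
        for field in fields:
            key, _, raw = field.partition("=")
            # avg10/avg60/avg300 are percentages of wall time over
            # 10-, 60-, and 300-second windows; total is the cumulative
            # stall time in microseconds.
            values[key] = int(raw) if key == "total" else float(raw)
        result[kind] = values
    return result


# Hypothetical sample, shaped like the contents of a memory.pressure file.
SAMPLE = """\
some avg10=1.50 avg60=0.75 avg300=0.10 total=123456
full avg10=0.25 avg60=0.05 avg300=0.00 total=7890
"""

psi = parse_psi(SAMPLE)
print(psi["some"]["avg10"], psi["full"]["total"])
```

The `avg*` fields correspond to the pressure ratio charts, while differences between successive `total` readings give the stall time charts.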
cgroup_duckdns.io The number of I/O operations performed on specific devices as seen by the CFQ scheduler. cgroup_duckdns.serviced_ops I/O Pressure Stall Information. Some indicates the share of time in which at least some tasks are stalled on I/O. In this state the CPU is still doing productive work. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_duckdns.io_some_pressure The amount of time some processes have been waiting due to I/O congestion. cgroup_duckdns.io_some_pressure_stall_time I/O Pressure Stall Information. Full line indicates the share of time in which all non-idle tasks are stalled on I/O resource simultaneously. In this state actual CPU cycles are going to waste, and a workload that spends extended time in this state is considered to be thrashing. This has severe impact on performance. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_duckdns.io_full_pressure The amount of time all non-idle processes have been stalled due to I/O congestion. cgroup_duckdns.io_full_pressure_stall_time NET ETH0 The amount of traffic transferred by the network interface. cgroup_duckdns.net_eth0 The number of packets transferred by the network interface. Received multicast counter is commonly calculated at the device level (unlike received) and therefore may include packets which did not reach the host. cgroup_duckdns.net_packets_eth0 The current operational state of the interface. Unknown - the state can not be determined. NotPresent - the interface has missing (typically, hardware) components. Down - the interface is unable to transfer data on L1, e.g. ethernet is not plugged or interface is administratively down. LowerLayerDown - the interface is down due to state of lower-layer interface(s). Testing - the interface is in testing mode, e.g. cable test. It can’t be used for normal traffic until tests complete. Dormant - the interface is L1 up, but waiting for an external event, e.g. 
for a protocol to establish. Up - the interface is ready to pass packets and can be used. cgroup_duckdns.net_operstate_eth0 The current physical link state of the interface. cgroup_duckdns.net_carrier_eth0 The interface's currently configured Maximum transmission unit (MTU) value. MTU is the size of the largest protocol data unit that can be communicated in a single network layer transaction. cgroup_duckdns.net_mtu_eth0 -------------------------------------------------------------------------------- NPM Container resource utilization metrics. Netdata reads this information from cgroups (abbreviated from control groups), a Linux kernel feature that limits and accounts resource usage (CPU, memory, disk I/O, network, etc.) of a collection of processes. cgroups together with namespaces (that offer isolation between processes) provide what we usually call: containers. cgroup_npm.cpu_limit cgroup_npm.mem_usage_limit cgroup_npm.net_eth0 cgroup_npm.net_eth0 cgroup_npm.net_eth1 cgroup_npm.net_eth1 CPU Total CPU utilization within the configured or system-wide (if not set) limits. When the CPU utilization of a cgroup exceeds the limit for the configured period, the tasks belonging to its hierarchy will be throttled and are not allowed to run again until the next period. cgroup_npm.cpu_limit Total CPU utilization within the system-wide CPU resources (all cores). The amount of time spent by tasks of the cgroup in user and kernel modes. cgroup_npm.cpu The percentage of runnable periods when tasks in a cgroup have been throttled. The tasks have not been allowed to run because they have exhausted all of the available time as specified by their CPU quota. cgroup_npm.throttled The total time duration for which tasks in a cgroup have been throttled. When an application has used its allotted CPU quota for a given period, it gets throttled until the next period. cgroup_npm.throttled_duration CPU Pressure Stall Information. 
Some indicates the share of time in which at least some tasks are stalled on CPU. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_npm.cpu_some_pressure The amount of time some processes have been waiting for CPU time. cgroup_npm.cpu_some_pressure_stall_time CPU Pressure Stall Information. Full indicates the share of time in which all non-idle tasks are stalled on CPU resource simultaneously. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_npm.cpu_full_pressure The amount of time all non-idle processes have been stalled due to CPU congestion. cgroup_npm.cpu_full_pressure_stall_time MEM RAM utilization within the configured or system-wide (if not set) limits. When the RAM utilization of a cgroup exceeds the limit, OOM killer will start killing the tasks belonging to the cgroup. cgroup_npm.mem_utilization RAM usage within the configured or system-wide (if not set) limits. When the RAM usage of a cgroup exceeds the limit, OOM killer will start killing the tasks belonging to the cgroup. cgroup_npm.mem_usage_limit The amount of used RAM and swap memory. cgroup_npm.mem_usage Memory usage statistics. The individual metrics are described in the memory.stat section for cgroup-v1 and cgroup-v2. cgroup_npm.mem Dirty is the amount of memory waiting to be written to disk. Writeback is how much memory is actively being written to disk. cgroup_npm.writeback Memory page fault statistics. Pgfault - all page faults. Swap - major page faults. cgroup_npm.pgfaults Memory Pressure Stall Information. Some indicates the share of time in which at least some tasks are stalled on memory. In this state the CPU is still doing productive work. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_npm.mem_some_pressure The amount of time some processes have been waiting due to memory congestion. cgroup_npm.memory_some_pressure_stall_time Memory Pressure Stall Information. 
Full indicates the share of time in which all non-idle tasks are stalled on memory resource simultaneously. In this state actual CPU cycles are going to waste, and a workload that spends extended time in this state is considered to be thrashing. This has severe impact on performance. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_npm.mem_full_pressure The amount of time all non-idle processes have been stalled due to memory congestion. cgroup_npm.memory_full_pressure_stall_time DISK The amount of data transferred to and from specific devices as seen by the CFQ scheduler. It is not updated when the CFQ scheduler is operating on a request queue. cgroup_npm.io The number of I/O operations performed on specific devices as seen by the CFQ scheduler. cgroup_npm.serviced_ops I/O Pressure Stall Information. Some indicates the share of time in which at least some tasks are stalled on I/O. In this state the CPU is still doing productive work. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_npm.io_some_pressure The amount of time some processes have been waiting due to I/O congestion. cgroup_npm.io_some_pressure_stall_time I/O Pressure Stall Information. Full line indicates the share of time in which all non-idle tasks are stalled on I/O resource simultaneously. In this state actual CPU cycles are going to waste, and a workload that spends extended time in this state is considered to be thrashing. This has severe impact on performance. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_npm.io_full_pressure The amount of time all non-idle processes have been stalled due to I/O congestion. cgroup_npm.io_full_pressure_stall_time NET ETH0 The amount of traffic transferred by the network interface. cgroup_npm.net_eth0 The number of packets transferred by the network interface. 
Received multicast counter is commonly calculated at the device level (unlike received) and therefore may include packets which did not reach the host. cgroup_npm.net_packets_eth0 The current operational state of the interface. Unknown - the state can not be determined. NotPresent - the interface has missing (typically, hardware) components. Down - the interface is unable to transfer data on L1, e.g. ethernet is not plugged or interface is administratively down. LowerLayerDown - the interface is down due to state of lower-layer interface(s). Testing - the interface is in testing mode, e.g. cable test. It can’t be used for normal traffic until tests complete. Dormant - the interface is L1 up, but waiting for an external event, e.g. for a protocol to establish. Up - the interface is ready to pass packets and can be used. cgroup_npm.net_operstate_eth0 The current physical link state of the interface. cgroup_npm.net_carrier_eth0 The interface's currently configured Maximum transmission unit (MTU) value. MTU is the size of the largest protocol data unit that can be communicated in a single network layer transaction. cgroup_npm.net_mtu_eth0 NET ETH1 The amount of traffic transferred by the network interface. cgroup_npm.net_eth1 The number of packets transferred by the network interface. Received multicast counter is commonly calculated at the device level (unlike received) and therefore may include packets which did not reach the host. cgroup_npm.net_packets_eth1 The current operational state of the interface. Unknown - the state can not be determined. NotPresent - the interface has missing (typically, hardware) components. Down - the interface is unable to transfer data on L1, e.g. ethernet is not plugged or interface is administratively down. LowerLayerDown - the interface is down due to state of lower-layer interface(s). Testing - the interface is in testing mode, e.g. cable test. It can’t be used for normal traffic until tests complete. 
Dormant - the interface is L1 up, but waiting for an external event, e.g. for a protocol to establish. Up - the interface is ready to pass packets and can be used. cgroup_npm.net_operstate_eth1 The current physical link state of the interface. cgroup_npm.net_carrier_eth1 The interface's currently configured Maximum transmission unit (MTU) value. MTU is the size of the largest protocol data unit that can be communicated in a single network layer transaction. cgroup_npm.net_mtu_eth1 -------------------------------------------------------------------------------- SJVA Container resource utilization metrics. Netdata reads this information from cgroups (abbreviated from control groups), a Linux kernel feature that limits and accounts resource usage (CPU, memory, disk I/O, network, etc.) of a collection of processes. cgroups together with namespaces (that offer isolation between processes) provide what we usually call: containers. cgroup_sjva.cpu_limit cgroup_sjva.mem_usage_limit cgroup_sjva.net_eth0 cgroup_sjva.net_eth0 CPU Total CPU utilization within the configured or system-wide (if not set) limits. When the CPU utilization of a cgroup exceeds the limit for the configured period, the tasks belonging to its hierarchy will be throttled and are not allowed to run again until the next period. cgroup_sjva.cpu_limit Total CPU utilization within the system-wide CPU resources (all cores). The amount of time spent by tasks of the cgroup in user and kernel modes. cgroup_sjva.cpu The percentage of runnable periods when tasks in a cgroup have been throttled. The tasks have not been allowed to run because they have exhausted all of the available time as specified by their CPU quota. cgroup_sjva.throttled The total time duration for which tasks in a cgroup have been throttled. When an application has used its allotted CPU quota for a given period, it gets throttled until the next period. cgroup_sjva.throttled_duration CPU Pressure Stall Information. 
Some indicates the share of time in which at least some tasks are stalled on CPU. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_sjva.cpu_some_pressure The amount of time some processes have been waiting for CPU time. cgroup_sjva.cpu_some_pressure_stall_time CPU Pressure Stall Information. Full indicates the share of time in which all non-idle tasks are stalled on CPU resource simultaneously. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_sjva.cpu_full_pressure The amount of time all non-idle processes have been stalled due to CPU congestion. cgroup_sjva.cpu_full_pressure_stall_time MEM RAM utilization within the configured or system-wide (if not set) limits. When the RAM utilization of a cgroup exceeds the limit, OOM killer will start killing the tasks belonging to the cgroup. cgroup_sjva.mem_utilization RAM usage within the configured or system-wide (if not set) limits. When the RAM usage of a cgroup exceeds the limit, OOM killer will start killing the tasks belonging to the cgroup. cgroup_sjva.mem_usage_limit The amount of used RAM and swap memory. cgroup_sjva.mem_usage Memory usage statistics. The individual metrics are described in the memory.stat section for cgroup-v1 and cgroup-v2. cgroup_sjva.mem Dirty is the amount of memory waiting to be written to disk. Writeback is how much memory is actively being written to disk. cgroup_sjva.writeback Memory page fault statistics. Pgfault - all page faults. Swap - major page faults. cgroup_sjva.pgfaults Memory Pressure Stall Information. Some indicates the share of time in which at least some tasks are stalled on memory. In this state the CPU is still doing productive work. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_sjva.mem_some_pressure The amount of time some processes have been waiting due to memory congestion. cgroup_sjva.memory_some_pressure_stall_time Memory Pressure Stall Information. 
Full indicates the share of time in which all non-idle tasks are stalled on memory resource simultaneously. In this state actual CPU cycles are going to waste, and a workload that spends extended time in this state is considered to be thrashing. This has severe impact on performance. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_sjva.mem_full_pressure The amount of time all non-idle processes have been stalled due to memory congestion. cgroup_sjva.memory_full_pressure_stall_time DISK The amount of data transferred to and from specific devices as seen by the CFQ scheduler. It is not updated when the CFQ scheduler is operating on a request queue. cgroup_sjva.io The number of I/O operations performed on specific devices as seen by the CFQ scheduler. cgroup_sjva.serviced_ops I/O Pressure Stall Information. Some indicates the share of time in which at least some tasks are stalled on I/O. In this state the CPU is still doing productive work. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_sjva.io_some_pressure The amount of time some processes have been waiting due to I/O congestion. cgroup_sjva.io_some_pressure_stall_time I/O Pressure Stall Information. Full line indicates the share of time in which all non-idle tasks are stalled on I/O resource simultaneously. In this state actual CPU cycles are going to waste, and a workload that spends extended time in this state is considered to be thrashing. This has severe impact on performance. The ratios are tracked as recent trends over 10-, 60-, and 300-second windows. cgroup_sjva.io_full_pressure The amount of time all non-idle processes have been stalled due to I/O congestion. cgroup_sjva.io_full_pressure_stall_time NET ETH0 The amount of traffic transferred by the network interface. cgroup_sjva.net_eth0 The number of packets transferred by the network interface. 
Received multicast counter is commonly calculated at the device level (unlike received) and therefore may include packets which did not reach the host. cgroup_sjva.net_packets_eth0 The current operational state of the interface. Unknown - the state can not be determined. NotPresent - the interface has missing (typically, hardware) components. Down - the interface is unable to transfer data on L1, e.g. ethernet is not plugged or interface is administratively down. LowerLayerDown - the interface is down due to state of lower-layer interface(s). Testing - the interface is in testing mode, e.g. cable test. It can’t be used for normal traffic until tests complete. Dormant - the interface is L1 up, but waiting for an external event, e.g. for a protocol to establish. Up - the interface is ready to pass packets and can be used. cgroup_sjva.net_operstate_eth0 The current physical link state of the interface. cgroup_sjva.net_carrier_eth0 The interface's currently configured Maximum transmission unit (MTU) value. MTU is the size of the largest protocol data unit that can be communicated in a single network layer transaction. cgroup_sjva.net_mtu_eth0 -------------------------------------------------------------------------------- DOCKERD LOCAL RUNNING CONTAINERS dockerd_local.running_containers HEALTHY CONTAINERS dockerd_local.healthy_containers UNHEALTHY CONTAINERS dockerd_local.unhealthy_containers -------------------------------------------------------------------------------- FAIL2BAN Netdata keeps track of the current jail status by reading the Fail2ban log file. FAILED ATTEMPTS The number of failed attempts. This chart reflects the number of 'Found' lines. Found means a line in the service’s log file matches the failregex in its filter. fail2ban.jails_failed_attempts BANS The number of bans. This chart reflects the number of 'Ban' and 'Restore Ban' lines. Ban action happens when the number of failed attempts (maxretry) occurred in the last configured interval (findtime). 
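The counting described above can be sketched as follows. This is a minimal illustration rather than Netdata's collector: the sample log lines, the regular expression, and the function name `count_jail_events` are assumptions based on the common Fail2ban log layout.

```python
import re
from collections import Counter

# Assumed layout: "... [jail] Found <ip>", "... [jail] Ban <ip>",
# "... [jail] Restore Ban <ip>". "Unban" lines do not match.
EVENT_RE = re.compile(
    r"\[(?P<jail>[^\]]+)\]\s+(?P<event>Found|Restore Ban|Ban)\s+(?P<ip>\S+)"
)


def count_jail_events(lines):
    """Return (failed_attempts, bans) as per-jail Counters."""
    failed = Counter()
    bans = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue
        if match["event"] == "Found":
            failed[match["jail"]] += 1
        else:
            # Both "Ban" and "Restore Ban" count as bans.
            bans[match["jail"]] += 1
    return failed, bans


# Hypothetical log excerpt for illustration only.
SAMPLE_LOG = [
    "2022-08-20 10:00:01,100 fail2ban.filter [812]: INFO [sshd] Found 203.0.113.5 - 2022-08-20 10:00:01",
    "2022-08-20 10:00:03,200 fail2ban.filter [812]: INFO [sshd] Found 203.0.113.5 - 2022-08-20 10:00:03",
    "2022-08-20 10:00:03,900 fail2ban.actions [812]: NOTICE [sshd] Ban 203.0.113.5",
    "2022-08-20 10:05:00,000 fail2ban.actions [812]: NOTICE [sshd] Restore Ban 198.51.100.7",
    "2022-08-20 10:10:03,900 fail2ban.actions [812]: NOTICE [sshd] Unban 203.0.113.5",
]

failed, bans = count_jail_events(SAMPLE_LOG)
print(dict(failed), dict(bans))
```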
fail2ban.jails_bans BANNED IPS The number of banned IP addresses. fail2ban.jails_banned_ips -------------------------------------------------------------------------------- NETDATA MONITORING Performance metrics for the operation of netdata itself and its plugins. NETDATA netdata.server_cpu netdata.uptime API netdata.clients netdata.requests netdata.net The netdata API response time measures the time netdata needed to serve requests. This time includes everything, from the reception of the first byte of a request, to the dispatch of the last byte of its reply, therefore it includes all network latencies involved (i.e. a client over a slow network will influence these metrics). netdata.response_time netdata.compression_ratio QUERIES netdata.queries netdata.db_points DBENGINE netdata.dbengine_compression_ratio netdata.page_cache_hit_ratio netdata.page_cache_stats netdata.dbengine_long_term_page_stats netdata.dbengine_io_throughput netdata.dbengine_io_operations netdata.dbengine_global_errors netdata.dbengine_global_file_descriptors netdata.dbengine_ram STATSD netdata.statsd_metrics netdata.statsd_useful_metrics netdata.statsd_events netdata.statsd_reads netdata.statsd_bytes netdata.statsd_packets netdata.tcp_connects netdata.tcp_connected netdata.private_charts ML netdata.prediction_stats_b626a0d7-867b-47b7-a3d9-5106894a46d8 netdata.training_stats_b626a0d7-867b-47b7-a3d9-5106894a46d8 APPS.PLUGIN netdata.apps_cpu netdata.apps_sizes netdata.apps_fix netdata.apps_children_fix EBPF.PLUGIN eBPF (extended Berkeley Packet Filter) is used to collect metrics from inside Linux kernel giving a zoom inside your Process, Hard Disk, File systems (File Access, and Directory Cache), Memory (Swap I/O, Page Cache), IRQ (Hard IRQ and Soft IRQ ), Shared Memory, Syscalls (Sync, Mount), and Network. Show total number of threads and number of active threads. For more details about the threads, see the official documentation. 
netdata.ebpf_threads Show number of threads loaded using legacy code (independent binary) or CO-RE (Compile Once Run Everywhere). netdata.ebpf_load_methods PYTHON.D netdata.runtime_dockerd_local netdata.runtime_fail2ban ACLK This chart shows if ACLK was online during entirety of the sample duration. netdata.aclk_status This chart shows how many queries were added for ACLK_query thread to process and how many it was actually able to process. netdata.aclk_query_per_second netdata.aclk_cloud_req netdata.aclk_processed_query_type netdata.aclk_cloud_req_http_type netdata.aclk_query_time netdata.aclk_query_threads netdata.aclk_protobuf_rx_types netdata.aclk_openssl_bytes HEARTBEAT netdata.heartbeat WORKERS netdata.workers_cpu WORKERS ACLK CONTEXTS netdata.workers_time_rrdcontext netdata.workers_cpu_rrdcontext netdata.workers_jobs_by_type_rrdcontext netdata.workers_busy_time_by_type_rrdcontext WORKERS ACLK HOST SYNC netdata.workers_time_aclksync netdata.workers_cpu_aclksync netdata.workers_jobs_by_type_aclksync netdata.workers_busy_time_by_type_aclksync WORKERS ACLK QUERY netdata.workers_time_aclkquery netdata.workers_cpu_aclkquery netdata.workers_jobs_by_type_aclkquery netdata.workers_busy_time_by_type_aclkquery netdata.workers_threads_aclkquery WORKERS DBENGINE INSTANCES netdata.workers_time_dbengine netdata.workers_cpu_dbengine netdata.workers_jobs_by_type_dbengine netdata.workers_busy_time_by_type_dbengine WORKERS GLOBAL STATISTICS netdata.workers_time_stats netdata.workers_cpu_stats netdata.workers_jobs_by_type_stats netdata.workers_busy_time_by_type_stats WORKERS HEALTH ALARMS netdata.workers_time_health netdata.workers_cpu_health netdata.workers_jobs_by_type_health netdata.workers_busy_time_by_type_health WORKERS ML DETECTION netdata.workers_time_mldetect netdata.workers_cpu_mldetect netdata.workers_jobs_by_type_mldetect netdata.workers_busy_time_by_type_mldetect WORKERS ML TRAINING netdata.workers_time_mltrain netdata.workers_cpu_mltrain 
netdata.workers_jobs_by_type_mltrain netdata.workers_busy_time_by_type_mltrain WORKERS PLUGIN CGROUPS netdata.workers_time_cgroups netdata.workers_cpu_cgroups netdata.workers_jobs_by_type_cgroups netdata.workers_busy_time_by_type_cgroups WORKERS PLUGIN CGROUPS FIND netdata.workers_time_cgroupsdisc netdata.workers_cpu_cgroupsdisc netdata.workers_jobs_by_type_cgroupsdisc netdata.workers_busy_time_by_type_cgroupsdisc WORKERS PLUGIN DISKSPACE netdata.workers_time_diskspace netdata.workers_cpu_diskspace netdata.workers_jobs_by_type_diskspace netdata.workers_busy_time_by_type_diskspace WORKERS PLUGIN IDLEJITTER netdata.workers_time_idlejitter netdata.workers_cpu_idlejitter netdata.workers_jobs_by_type_idlejitter netdata.workers_busy_time_by_type_idlejitter WORKERS PLUGIN PROC netdata.workers_time_proc netdata.workers_cpu_proc netdata.workers_jobs_by_type_proc netdata.workers_busy_time_by_type_proc WORKERS PLUGIN PROC NETDEV netdata.workers_time_netdev netdata.workers_cpu_netdev netdata.workers_jobs_by_type_netdev netdata.workers_busy_time_by_type_netdev WORKERS PLUGIN STATSD netdata.workers_time_statsd netdata.workers_cpu_statsd netdata.workers_jobs_by_type_statsd netdata.workers_busy_time_by_type_statsd WORKERS PLUGIN STATSD FLUSH netdata.workers_time_statsdflush netdata.workers_cpu_statsdflush netdata.workers_jobs_by_type_statsdflush netdata.workers_busy_time_by_type_statsdflush WORKERS PLUGIN TC netdata.workers_time_tc netdata.workers_cpu_tc netdata.workers_jobs_by_type_tc netdata.workers_busy_time_by_type_tc netdata.plugin_tc_time WORKERS PLUGIN TIMEX netdata.workers_time_timex netdata.workers_cpu_timex netdata.workers_jobs_by_type_timex netdata.workers_busy_time_by_type_timex WORKERS PLUGINS.D netdata.workers_time_pluginsd netdata.workers_cpu_pluginsd netdata.workers_jobs_by_type_pluginsd netdata.workers_busy_time_by_type_pluginsd netdata.workers_threads_pluginsd WORKERS WEB SERVER netdata.workers_time_web netdata.workers_cpu_web netdata.workers_jobs_by_type_web 
netdata.workers_busy_time_by_type_web netdata.workers_threads_web --------------------------------------------------------------------------------
Every second, Netdata collects 3,362 metrics on oracle-e2-0, presents them in 534 charts and monitors them with 0 alarms. netdata v1.36.0-21-nightly Netdata Copyright 2020, Netdata, Inc. Terms and conditions Privacy Policy Released under GPL v3 or later. Netdata uses third party tools. XSS PROTECTION This dashboard is about to render data from server: To protect your privacy, the dashboard will check all data transferred for cross site scripting (XSS). This is CPU intensive, so your browser might be a bit slower. If you trust the remote server, you can disable XSS protection. In this case, any remote dashboard decoration code (javascript) will also run. If you don't trust the remote server, you should keep the protection on. The dashboard will run slower and remote dashboard decoration code will not run, but better be safe than sorry... Keep protecting me I don't need this, the server is mine × PRINT THIS NETDATA DASHBOARD netdata dashboards cannot be captured, since we are lazy loading and hiding all but the visible charts. To capture the whole page with all the charts rendered, a new browser window will pop-up that will render all the charts at once. The new browser window will maintain the current pan and zoom settings of the charts. So, align the charts before proceeding. This process will put some CPU and memory pressure on your browser. For the netdata server, we will sequentially download all the charts, to avoid congesting network and server resources. Please, do not print netdata dashboards on paper! Print Close × PREPARING DASHBOARD FOR PRINTING... 
Please wait while we initialize and render all the charts on the dashboard. The print dialog will appear as soon as we finish rendering the page. × IMPORT A NETDATA SNAPSHOT netdata can export and import dashboard snapshots. Any netdata can import the snapshot of any other netdata. The snapshots are not uploaded to a server. They are handled entirely by your web browser, on your computer. Click here to select the netdata snapshot file to import Browse for a snapshot file (or drag it and drop it here), then click Import to render it. Filename, Hostname, Origin URL, Charts Info, Snapshot Info, Time Range, Comments. Snapshot files contain both data and javascript code. Make sure you trust the files you import! Import Close × EXPORT A SNAPSHOT Please wait while we collect all the dashboard data... Select the desired resolution of the snapshot. This is the number of seconds of data per point. Filename Compression * Select Compression * uncompressed * pako.deflate (gzip, binary) * pako.deflate.base64 (gzip, ascii) * lzstring.uri (LZ, ascii) * lzstring.utf16 (LZ, utf16) * lzstring.base64 (LZ, ascii) Comments Select snapshot resolution. This controls the size of the snapshot file. The generated snapshot will include all charts of this dashboard, for the visible timeframe, so align, pan and zoom the charts as needed. The scroll position of the dashboard will also be saved. The snapshot will be downloaded as a file, to your computer, that can be imported back into any netdata dashboard (no need to import it back on this server). Snapshot files include all the information of the dashboard, including the URL of the origin server, its netdata unique ID, etc. So, if you share the snapshot file with third parties, they will be able to access the origin server, if this server is exposed on the internet. Snapshots are handled entirely by the web browser. The netdata servers are not aware of them. Export Cancel × NETDATA ALARMS * Active * All * Log loading... loading... loading... 
Close

NETDATA DASHBOARD OPTIONS

These are browser settings. Each viewer has its own. They do not affect the operation of your netdata server. Settings take effect immediately and are saved permanently to browser local storage (except the refresh on focus / always option). To reset all options (including chart sizes) to their defaults, click here.

* Performance
* Synchronization
* Visual
* Locale

On Focus / Always
When to refresh the charts? When set to On Focus, the charts will stop being updated if the page / tab does not have the focus of the user. When set to Always, the charts will always be refreshed. Set it to On Focus to lower the CPU requirements of the browser (and extend the battery of laptops and tablets) when this page does not have your focus. Set it to Always to work on another window (e.g. change the settings of something) and have the charts auto-refresh in this window.

Non Zero / All
Which dimensions to show? When set to Non Zero, dimensions that have all their values (within the current view) set to zero will not be transferred from the netdata server (except if all dimensions of the chart are zero, in which case this setting does nothing - all dimensions are transferred and shown). When set to All, all dimensions will always be shown. Set it to Non Zero to lower the data transferred between netdata and your browser, lower the CPU requirements of your browser (fewer lines to draw) and increase the focus on the legends (fewer entries at the legends).

Destroy / Hide
How to handle hidden charts? When set to Destroy, charts that are not in the current viewport of the browser (are above, or below the visible area of the page), will be destroyed and re-created if and when they become visible again. When set to Hide, charts that are not visible will just be hidden, to simplify the DOM and speed up your browser. Set it to Destroy to lower the memory requirements of your browser. Set it to Hide for faster restoration of charts on page scrolling.
Async / Sync
Page scroll handling? When set to Sync, charts will be examined for their visibility immediately after scrolling. On slow computers this may impact the smoothness of page scrolling. To update the page when scrolling ends, set it to Async. Set it to Sync for immediate chart updates when scrolling. Set it to Async for smoother page scrolling on slower computers.

Parallel / Sequential
Which chart refresh policy to use? When set to Parallel, visible charts are refreshed in parallel (all queries are sent to the netdata server in parallel) and are rendered asynchronously. When set to Sequential, charts are refreshed one after another. Set it to Parallel if your browser can cope with it (most modern browsers can); set it to Sequential if you work on an older/slower computer.

Resync / Best Effort
Shall we re-sync chart refreshes? When set to Resync, the dashboard will attempt to re-synchronize all the charts so that they are refreshed concurrently. When set to Best Effort, each chart may be refreshed with a little time difference to the others. Normally, the dashboard starts refreshing them in parallel, but depending on the speed of your computer and the network latencies, charts start having a slight time difference. Setting this to Resync will attempt to re-synchronize the charts on every update. Setting it to Best Effort may lower the pressure on your browser and the network.

Sync / Don't Sync
Sync hover selection on all charts? When enabled, a selection on one chart will automatically select the same time on all other visible charts and the legends of all visible charts will be updated to show the selected values. When disabled, only the chart getting the user's attention will be selected. Enable it to get better insights into the data. Disable it if you are on a very slow computer that cannot actually do it.

Right / Below
Where do you want to see the legend? Netdata can place the legend in two positions: Below charts (the default) or to the Right of charts.
Switching this will reload the dashboard.

Dark / White
Which theme to use? Netdata comes with two themes: Dark (the default) and White. Switching this will reload the dashboard.

Help Me / No Help
Do you need help? Netdata can show some help in some areas to help you use the dashboard. If all these balloons bother you, disable them using this switch. Switching this will reload the dashboard.

Pad / Don't Pad
Enable data padding when panning and zooming? When set to Pad, the charts will be padded with more data, both before and after the visible area, thus giving the impression the whole database is loaded. This padding will happen only after the first pan or zoom operation on the chart (initially all charts have only the visible data). When set to Don't Pad, only the visible data will be transferred from the netdata server, even after the first pan and zoom operation.

Smooth / Rough
Enable Bézier lines on charts? When set to Smooth, the chart libraries that support it will plot smooth curves instead of simple straight lines to connect the points. Keep in mind dygraphs, the main charting library in netdata dashboards, can only smooth line charts. It cannot smooth area or stacked charts. When set to Rough, this setting can lower the CPU resources consumed by your browser.

These settings are applied gradually, as charts are updated. To force them, refresh the dashboard now.

Scale Units / Fixed Units
Enable auto-scaling of select units? When set to Scale Units, the values shown will dynamically be scaled (e.g. 1000 kilobits will be shown as 1 megabit). Netdata can auto-scale these original units: kilobits/s, kilobytes/s, KB/s, KB, MB, and GB. When set to Fixed Units, all the values will be rendered using the original units maintained by the netdata server.

Celsius / Fahrenheit
Which units to use for temperatures? Set the temperature units of the dashboard.

Time / Seconds
Convert seconds to time? When set to Time, charts that present seconds will show DDd:HH:MM:SS.
When set to Seconds, the raw number of seconds will be presented.

Close

UPDATE CHECK

Your netdata version: v1.36.0-21-nightly

New version of netdata available! Latest version: v1.46.0-361-nightly

Click here for the changes log and click here for directions on updating your netdata installation.

We suggest reviewing the changes log for new features you may be interested in, or important bug fixes you may need. Keeping your netdata updated is generally a good idea.

For progress reports and key netdata updates: Join the Netdata Community. You can also follow netdata on twitter, follow netdata on facebook, or watch netdata on github.

Check Now / Close

SIGN IN

Signing-in to netdata.cloud will synchronize the list of your netdata monitored nodes known at registry . This may include server hostnames, URLs and identification GUIDs.

After you upgrade all your netdata servers, your private registry will not be needed any more.

Are you sure you want to proceed?

Cancel / Sign In

DELETE ?

You are about to delete, from your personal list of netdata servers, the following server:

Are you sure you want to do this?

Keep in mind, this server will be added back if and when you visit it again.

keep it / delete it

SWITCH NETDATA REGISTRY IDENTITY

You can copy and paste the following ID to all your browsers (e.g. work and home). All the browsers with the same ID will identify you, so please don't share this with others.

Either copy this ID and paste it to another browser, or paste here the ID you have taken from another browser.

Keep in mind that:
* when you switch ID, your previous ID will be lost forever - this is irreversible.
* both IDs (your old and the new) must list this netdata in their personal lists.
* both IDs have to be known by the registry: .
* to get a new ID, just clear your browser cookies.

cancel / impersonate

Checking known URLs for this server...
Checks may fail if you are viewing an HTTPS page and the server to be checked is HTTP only.

Close

SIGN-IN TO NETDATA CLOUD OR GET AN INVITATION!

This node is connected to Netdata Cloud but you are not. If you have a Netdata Cloud account, sign in; if not, ask for an invitation to it.

Netdata Cloud is a FREE service that complements the Netdata Agent, to provide:
* Infrastructure level dashboards (each chart aggregates data from multiple nodes)
* Central dispatch of alert notifications
* Custom dashboards editor
* Intelligence assisted troubleshooting, to help surface the root cause of issues

Have a look, you will be surprised!

Remember my choice

Sign-in or get a Netdata Cloud account / Later, stay at the Agent dashboard