Metrics exposed by Scaphandre

With Stdout exporter, you can see all metrics available on your machine with flag --raw-metrics. With prometheus exporter, all metrics have a HELP section provided on /metrics (or whatever suffix you choosed to expose them).

Here are some key metrics that you will most probably be interested in:

scaph_host_power_microwatts: Aggregation of several measurements to give a try on the power usage of the the whole host, in microwatts (GAUGE). It might be the same as RAPL PSYS (see RAPL domains) measurement if available, or a combination of RAPL PKG and DRAM domains + an estimation of other hardware componentes power usage.
scaph_process_power_consumption_microwatts{exe="$PROCESS_EXE",pid="$PROCESS_PID",cmdline="path/to/exe --and-maybe-options"}: Power consumption due to the process, measured on at the topology level, in microwatts. PROCESS_EXE being the name of the executable and PROCESS_PID being the pid of the process. (GAUGE)

For more details on that metric labels, see this section.

And some more deep metrics that you may want if you need to make more complex calculations and data processing:

scaph_host_energy_microjoules : Energy measurement for the whole host, as extracted from the sensor, in microjoules. (COUNTER)
scaph_socket_power_microwatts{socket_id="$SOCKET_ID"}: Power measurement relative to a CPU socket, in microwatts. SOCKET_ID being the socket numerical id (GAUGE)

If your machine provides RAPL PSYS domain (see RAPL domains), you can get the raw energy counter for PSYS/platform with scaph_host_rapl_psys_microjoules. Note that scaph_host_power_microwatts is based on this PSYS counter if it is available.

Since 1.0.0 the following host metrics are availalable as well ;

scaph_host_swap_total_bytes: Total swap space on the host, in bytes.
scaph_host_swap_free_bytes: Swap space free to be used on the host, in bytes.
scaph_host_memory_free_bytes: Random Access Memory free to be used (not reused) on the host, in bytes.
scaph_host_memory_available_bytes: Random Access Memory available to be re-used on the host, in bytes.
scaph_host_memory_total_bytes: Random Access Memory installed on the host, in bytes.
scaph_host_disk_total_bytes: Total disk size, in bytes.
scaph_host_disk_available_bytes: Available disk space, in bytes.

Disk metrics have the following labels : disk_file_system, disk_is_removable, disk_type, disk_mount_point, disk_name

scaph_host_cpu_frequency: Global frequency of all the cpus. In MegaHertz
scaph_host_load_avg_fifteen: Load average on 15 minutes.
scaph_host_load_avg_five: Load average on 5 minutes.
scaph_host_load_avg_one: Load average on 1 minute.

If you hack scaph or just want to investigate its behavior, you may be interested in some internal metrics:

scaph_self_memory_bytes: Scaphandre memory usage, in bytes
scaph_self_memory_virtual_bytes: Scaphandre virtual memory usage, in bytes
scaph_self_topo_stats_nb: Number of CPUStat traces stored for the host
scaph_self_topo_records_nb: Number of energy consumption Records stored for the host
scaph_self_topo_procs_nb: Number of processes monitored by scaph
scaph_self_socket_stats_nb{socket_id="SOCKET_ID"}: Number of CPUStat traces stored for each socket
scaph_self_socket_records_nb{socket_id="SOCKET_ID"}: Number of energy consumption Records stored for each socket, with SOCKET_ID being the id of the socket measured
scaph_self_domain_records_nb{socket_id="SOCKET_ID",rapl_domain_name="RAPL_DOMAIN_NAME "}: Number of energy consumption Records stored for a Domain, where SOCKET_ID identifies the socket and RAPL_DOMAIN_NAME identifies the rapl domain measured on that socket

Getting per process data with scaph_process_* metrics

Here are available labels for the scaph_process_power_consumption_microwatts metric that you may need to extract the data you need:

exe: is the name of the executable that is the origin of that process. This is good to be used when your application is running one or only a few processes.
cmdline: this contains the whole command line with the executable path and its parameters (concatenated). You can filter on this label by using prometheus =~ operator to match a regular expression pattern. This is very practical in many situations.
instance: this is a prometheus generated label to enable you to filter the metrics by the originating host. This is very useful when you monitor distributed services, so that you can not only sum the metrics for the same service on the different hosts but also see what instance of that service is consuming the most, or notice differences beteween hosts that may not have the same hardware, and so on...
pid: is the process id, which is useful if you want to track a specific process and have your eyes on what's happening on the host, but not so practical to use in a more general use case

Since 1.0.0 the following per-process metrics are available as well :

scaph_process_cpu_usage_percentage: CPU time consumed by the process, as a percentage of the capacity of all the CPU Cores
scaph_process_memory_bytes: Physical RAM usage by the process, in bytes
scaph_process_memory_virtual_bytes: Virtual RAM usage by the process, in bytes
scaph_process_disk_total_write_bytes: Total data written on disk by the process, in bytes
scaph_process_disk_write_bytes: Data written on disk by the process, in bytes
scaph_process_disk_read_bytes: Data read on disk by the process, in bytes
scaph_process_disk_total_read_bytes: Total data read on disk by the process, in bytes

Get container-specific labels on scaph_process_* metrics

The flag --containers enables Scaphandre to collect data about the running Docker containers or Kubernetes pods on the local machine. This way, it adds specific labels to make filtering processes power consumption metrics by their encapsulation in containers easier.

Generic labels help to identify the container runtime and scheduler used (based on the content of /proc/PID/cgroup):

container_scheduler: possible values are docker or kubernetes. If this label is not attached to the metric, it means that scaphandre didn't manage to identify the container scheduler based on cgroups data.

Then the label container_runtime could be attached. The only possible value for now is containerd.

container_id is the ID scaphandre got from /proc/PID/cgroup for that container.

For Docker containers (if container_scheduler is set), available labels are :

container_names: is a string containing names attached to that container, according to the docker daemon
container_docker_version: version of the docker daemon
container_label_maintainer: content of the maintainer field for this container

For containers coming from a docker-compose file, there are a bunch of labels related to data coming from the docker daemon:

container_label_com_docker_compose_project_working_dir
container_label_com_docker_compose_container_number
container_label_com_docker_compose_project_config_files
container_label_com_docker_compose_version
container_label_com_docker_compose_service
container_label_com_docker_compose_oneoff

For Kubernetes pods (if container_scheduler is set), available labels are :

kubernetes_node_name: identifies the name of the kubernetes node scaphandre is running on
kubernetes_pod_name: the name of the pod the container belongs to
kubernetes_pod_namespace: the namespace of the pod the container belongs to

Scaphandre documentation

Metrics exposed by Scaphandre

Getting per process data with scaph_process_* metrics

Get container-specific labels on scaph_process_* metrics