13 posts tagged with "monitoring"

· 12 min read
Satyadeep Ashwathnarayana

Memory-intensive applications can perform better with huge pages, which reduce TLB pressure and memory fragmentation and lower overall memory management overhead. Developers should consider using HugeTLBfs in their mmap() and shmget() calls to take advantage of huge pages.

Transparent Huge Pages (THP) is a Linux kernel feature that provides some of the benefits of huge pages without requiring any development effort. However, THP can cause latency in many applications. Although kernel developers are actively working to address these issues, many system administrators prefer to disable THP altogether.

Netdata can assist in determining whether THP is helpful or harmful to your applications, which can guide your decision regarding its use.
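
Before deciding whether to keep or disable THP, it helps to know which mode the kernel is currently using. The following is a minimal sketch, not part of Netdata, that assumes the standard Linux sysfs file `/sys/kernel/mm/transparent_hugepage/enabled`, whose content lists the available modes with the active one in square brackets (for example `always [madvise] never`). The function names are illustrative.

```python
from pathlib import Path

# Standard sysfs location on Linux kernels built with THP support.
THP_ENABLED_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text: str) -> str:
    """Return the active mode from text such as 'always [madvise] never'."""
    for token in text.split():
        # The kernel marks the active mode by wrapping it in square brackets.
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode found in {text!r}")


def current_thp_mode(path: Path = THP_ENABLED_PATH):
    """Return the active THP mode, or None if the file is missing or unreadable."""
    try:
        return parse_thp_mode(path.read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print(current_thp_mode())
```

Separating the parser from the file read keeps the logic testable on machines without THP support.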

· 5 min read
Satyadeep Ashwathnarayana

Context switching is the process of switching the CPU from one process, task or thread to another. In a multitasking operating system, such as Linux, the CPU has to switch between multiple processes or threads in order to keep the system running smoothly. This is necessary because each CPU core without hyperthreading can only execute one process or thread at a time. If there are many processes or threads running simultaneously, and very few CPU cores available to handle them, the system is forced to make more context switches to balance the CPU resources among them.

Context switching is an essential function of any multitasking operating system, but it also comes at a cost. The whole process is computationally intensive, and the more context switches that occur, the slower the system becomes. This is because each context switch involves saving the current state of the CPU, loading the state of the new process or thread, and then resuming execution of the new process or thread. This takes time and consumes CPU resources, which can slow down the system.

The impact of context switching on system performance can be significant, especially in systems with many processes or threads running simultaneously.
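
To get a feel for how often a given process is switched out, Linux exposes per-process counters in `/proc/<pid>/status`. The sketch below is an illustration under that assumption, not Netdata's collector: it reads the `voluntary_ctxt_switches` and `nonvoluntary_ctxt_switches` fields, and the helper names are made up for this example.

```python
CTXT_FIELDS = ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches")


def parse_ctxt_switches(status_text: str) -> dict:
    """Extract context switch counters from the text of /proc/<pid>/status."""
    counts = {}
    for line in status_text.splitlines():
        # Each line looks like "key:<tab>value".
        key, _, value = line.partition(":")
        if key in CTXT_FIELDS:
            counts[key] = int(value.strip())
    return counts


def read_ctxt_switches(pid="self") -> dict:
    """Read the counters for a process (defaults to the current one). Linux only."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_ctxt_switches(status_file.read())


if __name__ == "__main__":
    print(read_ctxt_switches())
```

Voluntary switches usually mean the process blocked (for example, waiting on I/O), while nonvoluntary switches mean the scheduler preempted it. Sampling the counters twice and taking the difference gives a rate.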

· 4 min read
Andrew Maguire

We have recently extended the native machine learning (ML) based anomaly detection capabilities of Netdata to support all metrics, regardless of their collection frequency (update every).

Previously only metrics collected every second were supported, but now Netdata can run anomaly detection out of the box with zero config on metrics with any collection frequency.

This post will illustrate an example of what this means using Prometheus metrics (via the Netdata Prometheus collector) since they typically have a default collection frequency of 10 seconds.

· 9 min read
Andrew Maguire

We recently got this great feedback from a dear user in our Discord:

"I would really like to use Netdata to monitor custom internal metrics that come from SQL, not a fan of having 10 diff systems doing essentially the same thing as is, Netdata is pretty much all there in that regard, just needs a few extra features."

This is great and exactly what we want: a clear problem or improvement we could work on to make that user's monitoring life a little easier.

This is also where the beauty of open source comes in, along with being able to stand on the shoulders of giants. Adding such a feature turned out to be pretty easy: we just extended our existing Pandas collector to support SQL queries by leveraging pandas' read_sql() capabilities.

Here is the PR that was merged a few days later.

This blog post will cover an example of using the Pandas collector to monitor some custom SQL metrics from a WordPress MySQL database.
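
As a rough idea of what turning a SQL query into metrics with `read_sql()` looks like, here is a minimal sketch. It is not the collector's configuration or code: it uses an in-memory SQLite database as a stand-in for MySQL so it runs anywhere pandas is installed, and the table contents and the `posts_per_status` helper are illustrative assumptions.

```python
import sqlite3

import pandas as pd


def posts_per_status(conn) -> pd.DataFrame:
    """Count posts per status; each row could be charted as one dimension."""
    query = (
        "SELECT post_status, COUNT(*) AS posts "
        "FROM wp_posts GROUP BY post_status ORDER BY post_status"
    )
    return pd.read_sql(query, conn)


def demo() -> pd.DataFrame:
    # In-memory SQLite stands in for the WordPress MySQL database (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE wp_posts (id INTEGER PRIMARY KEY, post_status TEXT)")
    conn.executemany(
        "INSERT INTO wp_posts (post_status) VALUES (?)",
        [("publish",), ("publish",), ("draft",)],
    )
    conn.commit()
    try:
        return posts_per_status(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    print(demo())
```

Against a real MySQL server, the connection would typically be a SQLAlchemy engine instead of a `sqlite3` connection. The query and the resulting DataFrame handling stay the same.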

· 3 min read
Chris Akritidis

The life of a sysadmin or SRE is often difficult, but occasionally very simple things can make a huge difference. Basic monitoring of your systemd services is one of those simple things, and one we sometimes overlook. The simplest question one would want answered is whether the thing that’s supposed to be running is actually running at all. If you use systemd services, you can get an answer to that question within minutes using Netdata.