Cloud Functions is a serverless computing service provided by the
Google Cloud Platform (GCP). It allows you to execute your code in
response to specific events without the need to manage servers or
underlying infrastructure. With Cloud Functions, you can write functions
in various programming languages such as Node.js, Python, Go, and more.
These functions can be triggered by events from GCP services, such as
changes in Cloud Storage buckets, messages received in Pub/Sub, or HTTP
triggers. The service automatically scales the execution of your
functions, ensuring they respond quickly and reliably, even during high
traffic periods. Cloud Functions also offers integration with other GCP
services, enabling you to build scalable and reactive cloud
applications, implement custom business logic, and automate tasks, all
in a simple and efficient manner.

How to monitor Functions on One Platform:

Go to the application of the product where you want to add Functions as a dependency.
Click on “Products” in the menu, then select the desired product card.
Click on the name of the desired application.
In the “External Dependencies” section, located just below the
latency graph, you can add or search for an already registered
dependency.
To add a new dependency, click on the green button with a plus symbol (+).

When you click on “Add,” a modal will appear. In this modal, you will
name your queue and choose the Environment. In the “Check type” field,
select the option “Queue,” and in the “Method” field, choose “Functions
(GCP).” After selecting the method, a field for “Healthcheck URL” will
appear.

We use GCP Logging to search for logs from Google Cloud. See the GCP Logging pricing page for details about its cost.

Note: For security reasons, it is not permitted to enter an IP address in the healthcheck field. To monitor an IP address, store it in a secret and use that secret in the healthcheck.

In external monitoring without add-ons, the platform monitors a digital product without the need for a cloud account. Geographically distributed servers check a public application using only the ElvenWorks infrastructure, with no internal cloud required. To choose this type of monitoring, click on “Product” in the platform’s side menu and then click on the card “Monitor a web application without a cloud account.”

To make this configuration, access the product for which you want to
set up monitoring and click on the “New Resource” button. Once you
click, a window will open with various monitoring options. Select:
“Receive and record hits and failures via Webhook.”

Once the configurations open, select the environment, give a name to
the application, and configure the values for “Interval in seconds” and
“Fails to incidents.”

When you finish filling in the fields, click on “Save,” and the
application monitoring page will appear. In the box located in the right
corner, click on the button “Monitoring is inactive” to activate the
monitoring. After that, click on the button “HITS AND FAILURES API” to
configure the hits and failures data transmission to the designated API.

You’ll see a pop-up window that contains the information to be used
for communication between your environment and the platform. This
communication is crucial for data analysis, development of the
resilience matrix, incident control, and other resources available on
the platform.

Just replace the <token> and follow the instructions in the pop-up window to finish the configuration and start the monitoring.

Note: Our application has a rate limit of 30 requests per second. This means that each IP can make a maximum of 30 requests to the server in a period of 1 second. If this limit is exceeded, additional requests will be temporarily blocked.
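If your environment sends hits and failures in bursts, it can help to throttle requests on the client side so that they stay under the limit. The following is a minimal sketch in Python, not part of the platform: the class name `SlidingWindowLimiter` and its parameters are hypothetical, and it only covers requests sent from a single process.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `period`-second window.

    Hypothetical helper for illustration; the defaults mirror the
    30 requests per second limit described in the note above.
    """

    def __init__(self, max_calls=30, period=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of calls still inside the window

    def wait(self):
        """Block until another request may be sent, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have already left the window.
            while self._sent and now - self._sent[0] >= self.period:
                self._sent.popleft()
            if len(self._sent) < self.max_calls:
                self._sent.append(now)
                return
            # Window is full: sleep until the oldest call expires.
            self._sleep(self.period - (now - self._sent[0]))
```

Call `limiter.wait()` immediately before each request. Because the limit is applied per IP, several processes or hosts behind the same IP would need to share one budget, which this sketch does not do.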

Elasticsearch is an open-source search and analytics engine built on top of Apache Lucene. It is designed to store, search, and analyze large volumes of data quickly and efficiently. Elasticsearch allows you to index both structured and unstructured data, offering advanced full-text search features such as relevance search, keyword matching, filtering, and result highlighting. Additionally, it supports powerful aggregations for data analysis, enabling you to gain valuable insights from your information. With its distributed and scalable architecture, Elasticsearch can be deployed in clusters to handle intensive workloads, providing high availability, fault tolerance, and optimized performance. It is widely used in various use cases, such as full-text search, log analysis, application monitoring, content personalization, and real-time data exploration. Elasticsearch has become a popular choice for companies seeking a robust and flexible solution for data indexing and searching at scale.

To monitor Elasticsearch on the One Platform:

Go to the product application where you want to add Elasticsearch as a dependency.
Click on the “Products” menu, then select the desired product card.
Click on the name of the application you want to configure.
In the “External Dependencies” section, located below the latency graph, you can add or search for an existing dependency.
To add a new dependency, click on the green button with a plus symbol (+).

When you click “Add,” a modal will appear. In this modal, you will name your dependency and choose the Environment. In the “Check type” field, select the option “Search Engine,” and in the “Method” field, choose “Elasticsearch.” After selecting the method, a field for the Healthcheck URL will appear.

Below are example healthcheck strings for Elasticsearch:

example 1: http://HOST:PORT/_cluster/health/staging_entities-orgid

example 2: http://USER:PASSWORD@HOST:PORT/_cluster/health/staging_entities-orgid

example 3: http://APIKEY@HOST:PORT/_cluster/health/staging_entities-orgid
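The three forms above differ only in the credentials placed before the host. As an illustration, the sketch below assembles them in Python. The function name `es_health_url` and its parameters are hypothetical, and percent-encoding special characters in the credentials is an assumption about what the healthcheck field accepts, so check the result against your own setup.

```python
from urllib.parse import quote


def es_health_url(host, port, index, user=None, password=None, api_key=None):
    """Build a cluster-health URL in one of the three example forms.

    Hypothetical helper: credentials are percent-encoded so that
    characters such as '@' or ':' do not break the URL structure.
    """
    if api_key is not None:
        auth = quote(api_key, safe="") + "@"            # example 3
    elif user is not None and password is not None:
        auth = f"{quote(user, safe='')}:{quote(password, safe='')}@"  # example 2
    else:
        auth = ""                                       # example 1
    return f"http://{auth}{host}:{port}/_cluster/health/{index}"


# Usage with placeholder values only:
print(es_health_url("HOST", "PORT", "staging_entities-orgid"))
```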

Note: For security reasons, it is not permitted to enter an IP address in the healthcheck field. To monitor an IP address, store it in a secret and use that secret in the healthcheck.

Elasticsearch is an open source search and analytics engine built on top of Apache Lucene. It is designed to store, search and analyze large volumes of data quickly and efficiently. Elasticsearch allows you to index structured and unstructured data, offering advanced full-text search features such as relevance search, keyword matching, filtering, and result highlighting. Additionally, it supports powerful aggregations for data analysis, allowing you to gain valuable insights from your information. With its distributed, scalable architecture, Elasticsearch can be deployed across clusters to handle intensive workloads, providing high availability, fault tolerance, and optimized performance. It is widely used in a variety of use cases such as full-text search, log analysis, application monitoring, content personalization, and real-time data search. Elasticsearch has become a popular choice for companies looking for a robust and flexible solution for indexing and searching data at scale.

How to monitor Elastic Search on One Platform

1 – In the side menu, click on Services Hub

2 – In the SearchEngine category, click on the Elastic Search card

3 – You will be directed to the Elastic Search configuration form; fill in the fields

4 – If you want, you can configure automatic incident opening. In the Open automatic incident section, fill in the fields:

Severity -> Choose between “SEV-1 – Critical”, “SEV-2 – High”, “SEV-3 – Moderate”, “SEV-4 – Low”, “SEV-5 – Informational” or “Not Classified”;
Check Interval in seconds -> This is the interval at which checking will take place (this interval cannot be less than the number of failures x the Interval configured in the monitoring form);
Failures to open automatic incident -> It is the number of failures necessary to open the automatic incident;
Check Interval in seconds -> This is the interval at which checking will take place (this interval cannot be less than the number of hits x the Interval configured in the monitoring form);
Hits to close automatic incident -> It is the number of hits needed to close the automatic incident;
Responders -> These are the teams that will be notified if there are incidents in this monitoring, and you can add one or multiple teams;

If necessary, you can create a team by clicking + RESPONDER. You will be directed to the form to create the team; then click on the button for the new team to appear in the list.

Note: Don’t forget to activate the “Enable to set up automatic incidents opening” toggle to save the automatic incident opening settings.

5 – Click on CREATE MONITORING
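The constraint on the two “Check Interval in seconds” fields in step 4 can be checked with simple arithmetic before filling in the form. The sketch below is only an illustration of that rule as stated in the text; the function names are hypothetical and are not part of the platform.

```python
def min_check_interval(count, monitoring_interval_s):
    """Smallest allowed check interval, in seconds.

    `count` is the number of failures (to open) or hits (to close),
    and `monitoring_interval_s` is the Interval configured in the
    monitoring form.
    """
    return count * monitoring_interval_s


def is_valid_check_interval(check_interval_s, count, monitoring_interval_s):
    """True when the check interval is not less than count x Interval."""
    return check_interval_s >= min_check_interval(count, monitoring_interval_s)


# Example: 3 failures with a 60-second monitoring interval
# require a check interval of at least 180 seconds.
print(min_check_interval(3, 60))
```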

DynamoDB is a fully managed NoSQL database service provided by Amazon Web Services (AWS). It is designed to be highly scalable, high-performance, and offer low latency in read and write operations. DynamoDB is based on the key-value model, where data is stored in tables and each item is identified by a unique key. It provides features such as synchronous and asynchronous replication for high availability and data durability, and offers automatic scaling to handle growing workloads. DynamoDB also offers flexible indexing capabilities, fast queries, and support for atomic transactions. It is widely used in web, mobile and IoT applications where horizontal scalability and the ability to handle large volumes of real-time data are essential.

How to monitor DynamoDB on One Platform

1 – In the side menu, click on Services Hub

2 – In the Database category, click on the DynamoDB card

3 – You will be directed to the DynamoDB configuration page; fill in the fields

4 – If you want, you can configure automatic incident opening. In the Open automatic incident section, fill in the fields:

Severity -> Choose between “SEV-1 – Critical”, “SEV-2 – High”, “SEV-3 – Moderate”, “SEV-4 – Low”, “SEV-5 – Informational” or “Not Classified”;
Check Interval in seconds -> This is the interval at which checking will take place (this interval cannot be less than the number of failures x the Interval configured in the monitoring form);
Failures to open automatic incident -> It is the number of failures necessary to open the automatic incident;
Check Interval in seconds -> This is the interval at which checking will take place (this interval cannot be less than the number of hits x the Interval configured in the monitoring form);
Hits to close automatic incident -> It is the number of hits needed to close the automatic incident;
Responders -> These are the teams that will be notified if there are incidents in this monitoring, and you can add one or multiple teams;

If necessary, you can create a team by clicking + RESPONDER. You will be directed to the form to create the team; then click on the button for the new team to appear in the list.

Note: Don’t forget to activate the “Enable to set up automatic incidents opening” toggle to save the automatic incident opening settings.

5 – Click on CREATE MONITORING

DynamoDB is a fully managed NoSQL database service provided by Amazon Web Services (AWS). It is designed to be highly scalable, high-performance, and offer low latency in read and write operations. DynamoDB is based on the key-value model, where data is stored in tables, and each item is identified by a unique key. It provides features such as synchronous and asynchronous replication for high data availability and durability, as well as automatic scalability to handle growing workloads. DynamoDB also offers flexible indexing capabilities, fast queries, and support for atomic transactions. It is widely used in web, mobile, and IoT applications where horizontal scalability and the ability to handle large volumes of real-time data are essential.

How to Monitor DynamoDB on the One Platform

To set up monitoring for DynamoDB on the platform, follow these steps:

Go to the product application where you want to add DynamoDB as a dependency in the platform.
Click on the “Products” menu and select the desired product card.
Then, click on the name of the specific application where you want to configure DynamoDB monitoring.
Look for the section called “External Dependencies,” usually located just below the latency graph of the application.
To add an already registered dependency, type the name of the
dependency in the search field and select it when it appears in the
list.
If DynamoDB is not yet registered as a dependency, click on the green button with a plus (+) symbol to add a new dependency.

Click “Add” and a modal will
appear allowing you to name the database and select the Environment. In
the “Check type” field, choose the option “DB,” and in the “Method”
field, select “DynamoDB.” After selecting the method, a field for the
Healthcheck URL will appear.

Below are two examples of strings for DynamoDB:

ex1: ACCESS_KEY:SECRET_ACCESS_KEY/AWS-REGION@tableName

ex2: ACCESS_KEY:SECRET_ACCESS_KEY/AWS-REGION@tableName:primaryKey:valueItem
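The second form is the first one with an optional `:primaryKey:valueItem` suffix. As a sketch of how the two strings are put together, the Python helper below builds either form from its parts. The function name `dynamodb_healthcheck` and its parameters are hypothetical, and the values in the usage line are placeholders rather than real credentials.

```python
def dynamodb_healthcheck(access_key, secret_key, region, table,
                         primary_key=None, value=None):
    """Build a DynamoDB healthcheck string in the ex1 or ex2 form.

    Hypothetical helper: `primary_key` and `value` must be supplied
    together to produce the ex2 form.
    """
    base = f"{access_key}:{secret_key}/{region}@{table}"
    if primary_key is None and value is None:
        return base  # ex1: table only
    if primary_key is None or value is None:
        raise ValueError("primary_key and value must be given together")
    return f"{base}:{primary_key}:{value}"  # ex2: table plus one item


# Usage with placeholder values only:
print(dynamodb_healthcheck("ACCESS_KEY", "SECRET_ACCESS_KEY",
                           "AWS-REGION", "tableName"))
```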

Note: For security reasons, it is not permitted to enter an IP address in the healthcheck field. To monitor an IP address, store it in a secret and use that secret in the healthcheck.

The constant monitoring of metrics is essential for evaluating the
performance of your digital product and ensuring that your goals are
being met. For this purpose, One Platform’s metrics dashboard becomes an essential tool in structuring your objectives. With Metrics,
you can have an overview of your organization or a specific product,
calculating an average of the resources used in the last 30 days, which
provides an excellent foundation for product planning.

In Metrics, located in the sidebar menu of our platform, you can access data related to the health of your system, DORA Metrics, and SRE Metrics for your organization.

In addition to the averages, there is also a list of all your resources and their respective metrics, with a column where you can see your resource’s metrics in more detail.

You can also access metrics exclusive to your applications by returning to the application you want and clicking the “Metrics details” button, located in the top right corner of the page.

Through the application, you can also check the Uptime metrics per hour by accessing the “Metrics Hour by Hour” link, located above the Uptime, which provides a summary of your Uptime per hour and per day.

In the Metrics Dashboards, you can use a calendar to select the date range you want, from the date the metric was implemented up to the day before the query. You can also choose the services you want to evaluate and download the performance report with the result of each metric. It is an easy and practical way to analyze the performance of each of your digital product’s resources.

Incident management’s main objective is the fastest possible
restoration of normalcy for a digital service, minimizing negative
consequences for operations, and ensuring the highest level of service
and availability. However, beyond problem resolution, having a broad
view of previous incidents is essential to ensure the health of your
digital product.

One Platform provides an incident dashboard
that presents all relevant information about incidents that have
occurred within your organization, offering detailed reports of events
from the last 30 days. You can access it through the Incidents menu and
then by clicking on the “Incident” button.

Through the dashboard, you will
have visibility and information about the occurrence of incidents within
your organization, such as the days of the week with the highest number
of incidents, the most frequent times of occurrence, and the resources
that are experiencing the most impact, along with the total number of
incidents.

You can click on the graphs to obtain more detailed insights, and you can filter the incidents by source and time period to obtain more precise results. The dashboard gives you the information you need to identify critical points, improve processes, and make strategic decisions to enhance system efficiency.

Critical Events is a global space on the One Platform where users can view, regardless of the page they are browsing, the critical events taking place in their organization at that moment, such as incidents and their respective resolution status. Critical Events can be accessed through the platform’s top bar.

By clicking the bell next to the name Critical Events, you can view the incidents that are open in your organization. Once incidents are resolved or classified as “Recognized,” they no longer appear in Critical Events.