Welcome to the Elven Platform – your complete solution for Monitoring, Incident Management and Status Pages. This guide is designed to ensure that you not only join the platform, but have an exceptional experience from the very first moment. Get ready to start a journey that will exceed all your expectations!

 

1. Registration and Initial Configuration

    Organization registration: 

    • Access Sign Up
    • Fill in your basic information using a corporate email and click “Sign Up”
    • Name your organization, accept the Terms and Conditions and click “Start Now”
    • Confirm your email to activate your account.
    • Note: add the sender domain @elven.works to your trusted senders so you don’t miss any notification or alert.

    Initial Organization Setup:

    • Log in to your new account
    • Complete your registration including your phone number
    • Invite your team to join you, ensuring smooth onboarding from the start
    • Assign a permission role to each member of your team

    2. Exploring the main modules

    The Elven Platform offers three main modules, each designed to meet your organization’s unique needs. Check out each of them below and, if you have any questions, our team is here to help you choose the module that best suits your specific goals and challenges. Click here and let’s talk and find the ideal solution for you! 

    Monitoring Module:

    Proactive, customizable solution for tracking the health and performance of systems and applications.

    • Start by choosing the type of monitoring and configure it by clicking here
    • Understand the resource screen by clicking here
    • Analyze the resource metrics by clicking here

     

    Incident Management and Response Module:

    In real time, the Elven Platform receives the alert event or incident, triggers the on-call schedule, notifies through communication channels and records all interactions within the platform until resolution.

    • Start configuring your team by clicking here
    • Choose the channels you want to be notified by clicking here
    • Configure call rotation by clicking here
    • Understand the incident screen by clicking here
    • Create a postmortem of the incident by clicking here 


    Status Pages Module:

    Customized Status Pages to keep users informed about the status of services or applications via SMS or Webhook, with availability uptime and incident history.

    • Start by configuring your Status Page here

    3. Subscription

    We integrate your consumption with your Azure Marketplace invoice. We will soon have native integration.

     

    4. Support and Continuous Improvement

    • Explore our extensive knowledge base to find answers to your most common questions by clicking here
    • Contact our dedicated support team at any time for efficient assistance, either by email at support@elvenworks.atlassian.net or through the Ticket Opening Tool
    • Send us your feedback and suggestions to contact@elven.works for continuous improvements 


    Now that you’re ready to get started, get ready to embark on a journey of unprecedented success and reliability with the Elven Platform. We are honored that you chose our platform and are here to support you every step of the way.

    Elven Observability: Operator Collector (Automatic Instrumentation)

    In order to use auto-instrumentation, we need to install the collector operator.

    Dependencies:
    cert-manager

    Install the operator

    				
    					kubectl apply -f https://github.com/open-telemetry/opentelemetry-operator/releases/latest/download/opentelemetry-operator.yaml
    				
    			
    • If you are going to instrument Golang applications:
    				
    					kubectl -n opentelemetry-operator-system patch deployment opentelemetry-operator-controller-manager \
    --type=json \
    -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--enable-go-instrumentation=true"}]'
    				
    			

    Install the collector

    It is a recommended practice to send telemetry from containers to an OpenTelemetry Collector rather than directly to a backend. The Collector helps simplify secrets management, decouples data export issues (such as the need for retries) from your applications, and allows you to add additional data to your telemetry, such as with the k8sattributesprocessor component. If you choose not to use a Collector, you can skip to the next section.

    https://opentelemetry.io/docs/kubernetes/operator/automatic/

    Configuration example

    				
    					apiVersion: opentelemetry.io/v1beta1
    kind: OpenTelemetryCollector
    metadata:
      name: otel
    spec:
      config: 
        receivers:
          otlp:
            protocols:
              grpc:
              http:
        processors:
          memory_limiter:
            check_interval: 1s
            limit_percentage: 75
            spike_limit_percentage: 15
          batch:
            send_batch_size: 10000
            timeout: 10s
        exporters:
          otlp:
            endpoint: "tempo-distributor.domain.io:443"
            tls:
              insecure: false
              insecure_skip_verify: true
          prometheusremotewrite:
            endpoint: https://mimir-distributor.domain.io/api/v1/push
            headers:
              X-Scope-OrgID: <TENANT>          
        service:
          pipelines:
            metrics:
              receivers: [otlp] 
              processors: [batch]
              exporters: [prometheusremotewrite]
            traces:
              receivers: [otlp]
              processors: []
              exporters: [otlp]
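
A collector only runs the components that a pipeline references, so it can help to cross-check the `service.pipelines` section against the declared receivers, processors, and exporters before applying the manifest. The sketch below is a minimal illustration, not part of the Elven tooling: `find_pipeline_issues` is a hypothetical helper, and the dictionary is the configuration example above reduced to its component names.

```python
# Hypothetical helper: checks that every component named in a pipeline is
# declared in the matching top-level section, and lists declared components
# that no pipeline uses.

def find_pipeline_issues(config):
    """Return (undefined, unused) messages for a collector config dict."""
    sections = {
        "receivers": set(config.get("receivers", {})),
        "processors": set(config.get("processors", {})),
        "exporters": set(config.get("exporters", {})),
    }
    used = {kind: set() for kind in sections}
    undefined = []
    for name, pipeline in config["service"]["pipelines"].items():
        for kind in sections:
            for component in pipeline.get(kind, []):
                used[kind].add(component)
                if component not in sections[kind]:
                    undefined.append(f"{name}: {kind[:-1]} '{component}' is not defined")
    unused = sorted(
        f"{kind[:-1]} '{component}'"
        for kind in sections
        for component in sections[kind] - used[kind]
    )
    return undefined, unused


# The configuration example above, reduced to component names.
example = {
    "receivers": {"otlp": {}},
    "processors": {"memory_limiter": {}, "batch": {}},
    "exporters": {"otlp": {}, "prometheusremotewrite": {}},
    "service": {
        "pipelines": {
            "metrics": {
                "receivers": ["otlp"],
                "processors": ["batch"],
                "exporters": ["prometheusremotewrite"],
            },
            "traces": {"receivers": ["otlp"], "processors": [], "exporters": ["otlp"]},
        }
    },
}

undefined, unused = find_pipeline_issues(example)
print(undefined)  # []
print(unused)     # ["processor 'memory_limiter'"]
```

In the example as written, `memory_limiter` is declared but not listed in any pipeline, so it would have no effect unless it is added to the `processors` list of the pipelines that should use it.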
    				
    			
    • Apply the collector
    				
    					kubectl apply -k otel-collector-operator/
    				
    			

    OpenTelemetry Instrumentation

    To manage auto-instrumentation, the Operator needs to be configured to know which pods to instrument and which auto-instrumentation to use for those pods. This is done through the Instrumentation CRD.

    Ex. Java

    				
    					apiVersion: opentelemetry.io/v1alpha1
    kind: Instrumentation
    metadata:
      name: instrumentation-sample
    spec:
      propagators:
        - tracecontext
        - baggage
        - b3
      sampler:
        type: parentbased_traceidratio
        argument: "1"
      env:
        - name: OTEL_EXPORTER_OTLP_ENDPOINT
          value: http://otel-collector.monitoring:4318
      java:    
        env:
          - name: OTEL_EXPORTER_OTLP_ENDPOINT
            value: http://otel-collector.monitoring:4317 
    				
    			

    Ex. DOTNET

    				
    					apiVersion: opentelemetry.io/v1alpha1
    kind: Instrumentation
    metadata:
      name: instrumentation-sample
    spec:
      propagators:
        - tracecontext
        - baggage
        - b3
      sampler:
        type: parentbased_traceidratio
        argument: "1"
      env:
        - name: OTEL_EXPORTER_OTLP_ENDPOINT
          value: http://otel-collector.default:4318
      dotnet:
        env:
          - name: OTEL_DOTNET_AUTO_METRICS_CONSOLE_EXPORTER_ENABLED
            value: "false"
          - name: OTEL_DOTNET_AUTO_TRACES_CONSOLE_EXPORTER_ENABLED
            value: "false"
          - name: OTEL_DOTNET_AUTO_LOGS_CONSOLE_EXPORTER_ENABLED
            value: "false"  
          - name: OTEL_EXPORTER_OTLP_ENDPOINT
            value: http://otel-collector.monitoring:4318
          - name: OTEL_TRACES_EXPORTER
            value: "otlp"
          - name: OTEL_METRICS_EXPORTER
            value: "otlp"
    				
    			

    Add annotations to existing deployments

    The final step is to opt-in your services for auto-instrumentation. This is done by updating your services spec.template.metadata.annotations to include a language-specific annotation:

    • .NET: instrumentation.opentelemetry.io/inject-dotnet: "true"
    • Go: instrumentation.opentelemetry.io/inject-go: "true"
    • Java: instrumentation.opentelemetry.io/inject-java: "true"
    • Node.js: instrumentation.opentelemetry.io/inject-nodejs: "true"
    • Python: instrumentation.opentelemetry.io/inject-python: "true"
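
For a deployment that already exists, the annotation can be added with a patch instead of editing the manifest by hand. The sketch below is only an illustration: `injection_patch` is a hypothetical helper and `java-sample` is an example deployment name. It builds the patch body for one of the languages listed above and prints a `kubectl patch` command you could run yourself.

```python
import json

# Annotation keys taken from the list above.
INJECT_ANNOTATIONS = {
    "dotnet": "instrumentation.opentelemetry.io/inject-dotnet",
    "go": "instrumentation.opentelemetry.io/inject-go",
    "java": "instrumentation.opentelemetry.io/inject-java",
    "nodejs": "instrumentation.opentelemetry.io/inject-nodejs",
    "python": "instrumentation.opentelemetry.io/inject-python",
}


def injection_patch(language):
    """Build a patch that adds the opt-in annotation to the pod template."""
    key = INJECT_ANNOTATIONS[language]
    # Annotation values are strings, so "true" is quoted rather than a boolean.
    return {"spec": {"template": {"metadata": {"annotations": {key: "true"}}}}}


# Hypothetical deployment name; the command is printed, not executed.
patch = json.dumps(injection_patch("java"))
print(f"kubectl patch deployment java-sample -p '{patch}'")
```

The annotation goes on the pod template (`spec.template.metadata.annotations`), not on the deployment's own metadata, because the operator acts on pods as they are created.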

    Test application using auto-instrumentation for JAVA
    https://opentelemetry.io/docs/kubernetes/operator/automatic/

    				
    					kubectl apply -f - <<EOF
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: java-sample
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: java-sample
      template:
        metadata:
          labels:
            app: java-sample
          annotations:
            instrumentation.opentelemetry.io/inject-java: "true"
        spec:
          containers:
          - name: java-sample
            image: emr001/java-app
            ports:
            - containerPort: 8080
            env:
              - name: OTEL_SERVICE_NAME
                value: "java-demo"
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: java-app
    spec:
      selector:
        app: java-sample
      ports:
        - port: 8080
          protocol: TCP
          targetPort: 8080
    ---
    apiVersion: opentelemetry.io/v1alpha1
    kind: Instrumentation
    metadata:
      name: instrumentation-sample
    spec:
      propagators:
        - tracecontext
        - baggage
        - b3
      sampler:
        type: parentbased_traceidratio
        argument: "1"
      env:
        - name: OTEL_EXPORTER_OTLP_ENDPOINT
          value: http://otel-collector.monitoring:4318
      java:    
        env:
          - name: OTEL_EXPORTER_OTLP_ENDPOINT
            value: http://otel-collector.monitoring:4317
      dotnet:
        env:
          - name: OTEL_DOTNET_AUTO_METRICS_CONSOLE_EXPORTER_ENABLED
            value: "false"
          - name: OTEL_DOTNET_AUTO_TRACES_CONSOLE_EXPORTER_ENABLED
            value: "false"
          - name: OTEL_DOTNET_AUTO_LOGS_CONSOLE_EXPORTER_ENABLED
            value: "false"  
          - name: OTEL_EXPORTER_OTLP_ENDPOINT
            value: http://otel-collector.monitoring:4318
          - name: OTEL_TRACES_EXPORTER
            value: "otlp"
          - name: OTEL_METRICS_EXPORTER
            value: "otlp"
    EOF
    				
    			

    Restart the deployment for the operator to inject the agent via an init-container

    				
    					kubectl rollout restart deployment java-sample 
    				
    			

    Perform a port-forward on port 8080 and use the loop below to send traces, metrics, and logs.

    				
    					kubectl port-forward svc/java-app 8080:8080
    				
    			

    We will create a loop to send several requests.

    				
    					while true; do curl http://localhost:8080/api/hello && echo "" && sleep 1; done
    				
    			

    Tracing Grafana

    • Test application using auto-instrumentation for .NET

    				
    					kubectl apply -f - <<EOF
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: dotnet-sample
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: dotnet-sample
      template:
        metadata:
          labels:
            app: dotnet-sample
          annotations:
            instrumentation.opentelemetry.io/inject-dotnet: "true"
        spec:
          containers:
          - name: dotnet-sample
            image: emr001/dotnet8-app:v1
            ports:
            - containerPort: 8080
            env:
              - name: OTEL_SERVICE_NAME
                value: "dotnet-demo"
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: dotnet-app
    spec:
      selector:
        app: dotnet-sample
      ports:
        - port: 80
          protocol: TCP
          targetPort: 8080
    EOF
    				
    			

    Restart the deployment for the operator to inject the agent via an init-container

    				
    					kubectl rollout restart deployment dotnet-sample 
    				
    			

    Perform a port-forward on port 80.

    				
    					kubectl port-forward svc/dotnet-app 80:80
    				
    			

    We will create a loop to send several requests.

    				
    					while true; do curl http://localhost/weatherforecast && echo "" && sleep 1; done
    				
    			
    				
    					while true; do curl http://localhost/error && echo "" && sleep 1; done
    				
    			

    Dashboard Grafana

    Elven Observability: Lambda Instrumentation

    Prerequisites
    • OTel Collector installed on an EC2 instance (see the documentation)
    • Organization registered in Elven Observability
    • Lambda Function
    • Allow traffic from Lambda to the EC2 running the Collector if it is not publicly accessible: documentation

    Instrumentation

    Without Serverless Framework
    Instrumentation is performed via the AWS console.

    Configuring Environment Variables

    1. Go to the Lambda function, scroll down, click on the Configuration tab, select the Environment variables section, and click Edit.

    2. Add the following environment variables by clicking on Add Environment Variables.

    				
    					  AWS_LAMBDA_EXEC_WRAPPER="/opt/otel-handler" 
      OTEL_TRACES_SAMPLER="always_on" 
      OTEL_TRACES_EXPORTER="otlp" 
      OTEL_METRICS_EXPORTER="otlp" 
      OTEL_LOG_LEVEL="DEBUG" 
      OTEL_PROPAGATORS="tracecontext,baggage,xray"
      OTEL_LAMBDA_TRACE_MODE="capture"
    
      OTEL_EXPORTER_OTLP_ENDPOINT="<EC2_URL_HERE>" 
      OTEL_SERVICE_NAME="<LAMBDA_NAME>"
      OTEL_RESOURCE_ATTRIBUTES="service.name=<LAMBDA_NAME>,environment=<ENVIRONMENT>" 
    				
    			
    3. Only the last three variables need to be modified:
    • EC2_URL_HERE: The EC2 IP address or DNS if it was configured
    • LAMBDA_NAME: The name of the service that will appear in traces and metrics
    • ENVIRONMENT: dev, hml, or prd, depending on the environment being instrumented
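
When several functions need the same setup, the variable set can be generated instead of typed into the console each time. The sketch below is an illustration under stated assumptions: `lambda_otel_environment` is a hypothetical helper, `orders-api` is an example function name, and the fixed values are copied from the list above (with the propagator list written without spaces).

```python
# Values copied from the variable list above; they are the same for every function.
FIXED_VARIABLES = {
    "AWS_LAMBDA_EXEC_WRAPPER": "/opt/otel-handler",
    "OTEL_TRACES_SAMPLER": "always_on",
    "OTEL_TRACES_EXPORTER": "otlp",
    "OTEL_METRICS_EXPORTER": "otlp",
    "OTEL_LOG_LEVEL": "DEBUG",
    "OTEL_PROPAGATORS": "tracecontext,baggage,xray",
    "OTEL_LAMBDA_TRACE_MODE": "capture",
}


def lambda_otel_environment(collector_url, lambda_name, environment):
    """Combine the fixed variables with the three values that change per function."""
    variables = dict(FIXED_VARIABLES)
    variables["OTEL_EXPORTER_OTLP_ENDPOINT"] = collector_url
    variables["OTEL_SERVICE_NAME"] = lambda_name
    variables["OTEL_RESOURCE_ATTRIBUTES"] = (
        f"service.name={lambda_name},environment={environment}"
    )
    return variables


# Example values only: keep your own collector address in place of the placeholder.
env = lambda_otel_environment("<EC2_URL_HERE>", "orders-api", "dev")
for name, value in env.items():
    print(f'{name}="{value}"')
```

If you script the change with boto3, a dictionary like this is the shape expected under `Environment={"Variables": ...}` in `update_function_configuration`.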

    Configuring the Instrumentation Layer

    1. In the Code tab, scroll down to the Layers section and click on Add a Layer.

    2. Choose the Specify an ARN option and paste this ARN:
    arn:aws:lambda:us-east-1:184161586896:layer:opentelemetry-nodejs-0_9_0:4

    3. Finally, click Verify, then Add.

    With Serverless Framework
    This is done in the serverless.yaml file where the Lambdas are configured, as shown in the example below:

    It’s important to note that this configuration must be done for each function!

    The configurations are the same as those performed in Configuring Environment Variables and Configuring the Instrumentation Layer.

    				
    					org: <ORG_NAME> 
    app: <APP_NAME> 
    service: <SERVICE_NAME> 
    
    provider: 
      name: aws 
      runtime: nodejs20.x 
    
    functions: 
      <LAMBDA_NAME_1>:
        handler: <HANDLER> 
        
        layers: 
          - arn:aws:lambda:us-east-1:184161586896:layer:opentelemetry-nodejs-0_9_0:4
    
        environment: 
          AWS_LAMBDA_EXEC_WRAPPER: "/opt/otel-handler" 
          OTEL_TRACES_SAMPLER: "always_on" 
          OTEL_TRACES_EXPORTER: "otlp" 
          OTEL_METRICS_EXPORTER: "otlp" 
          OTEL_LOG_LEVEL: "DEBUG" 
          OTEL_LAMBDA_TRACE_MODE: "capture" 
          OTEL_PROPAGATORS: "tracecontext,baggage,xray"
    
          OTEL_EXPORTER_OTLP_ENDPOINT: "<EC2_URL_HERE>" 
          OTEL_SERVICE_NAME: "<LAMBDA_NAME>" 
          OTEL_RESOURCE_ATTRIBUTES: "service.name=<LAMBDA_NAME>,environment=<ENVIRONMENT>" 
      <LAMBDA_NAME_2>:
        handler: <HANDLER> 
        
        layers: 
          - arn:aws:lambda:us-east-1:184161586896:layer:opentelemetry-nodejs-0_9_0:4
    
        environment: 
          AWS_LAMBDA_EXEC_WRAPPER: "/opt/otel-handler" 
          OTEL_TRACES_SAMPLER: "always_on" 
          OTEL_TRACES_EXPORTER: "otlp" 
          OTEL_METRICS_EXPORTER: "otlp" 
          OTEL_LOG_LEVEL: "DEBUG" 
          OTEL_LAMBDA_TRACE_MODE: "capture" 
          OTEL_PROPAGATORS: "tracecontext,baggage,xray"
    
          OTEL_EXPORTER_OTLP_ENDPOINT: "<EC2_URL_HERE>" 
          OTEL_SERVICE_NAME: "<LAMBDA_NAME>" 
          OTEL_RESOURCE_ATTRIBUTES: "service.name=<LAMBDA_NAME>,environment=<ENVIRONMENT>"
    				
    			

    Once done, simply deploy the changes.

    Elven Observability: Installing OpenTelemetry Collector on an EC2

     

    Prerequisites

    • AWS Account
    • Organization registered in Elven Observability

    Configure the EC2

    If you need to configure an EC2 instance, refer to the AWS documentation. However, before completing the creation process, ensure the following:

    • Set up User Data as described in the section Configuring EC2 User Data.
    • Enable inbound traffic for TCP 4318 and TCP 4317 in the security group.
    • Enable the public IP.

    Configuring EC2 User Data

     

    To automate the configuration of the OpenTelemetry Collector (i.e., avoiding the need to manually connect to the machine and perform each step), you can use the “User Data” field, which allows you to run a script when the machine starts.

    Create a copy of the user_data.sh script and edit it: replace <TENANT_ID> with your organization’s tenant ID (the value sent in the X-Scope-OrgID header) and <your-token> with your token (the value that follows Bearer in the Authorization header). Both were provided by the Elven Works team when your organization was registered.

    Paste the edited script into the User Data field in the Advanced Details section during EC2 creation.

    				
    					#!/bin/bash
    
    # Update packages and install Docker
    apt-get update -y
    apt-get install docker.io -y
    
    # Add the ubuntu user to the docker group
    usermod -aG docker ubuntu
    
    # Create the otel-config.yaml file
    cat <<EOF > /home/ubuntu/otel-config.yaml
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: "0.0.0.0:4317"
          http:
            endpoint: "0.0.0.0:4318"
    
    exporters:
      otlphttp:
        endpoint: https://tempo.elvenobservability.com/http
        headers:
          X-Scope-OrgID: "<TENANT_ID>"
          Authorization: "Bearer <your-token>"
    
      prometheusremotewrite:
        endpoint: https://mimir.elvenobservability.com/api/v1/push
        headers:
          X-Scope-OrgID: "<TENANT_ID>"
          Authorization: "Bearer <your-token>"
    
      loki:
        endpoint: "http://loki.elvenobservability.com/loki/api/v1/push"
        default_labels_enabled:
          exporter: false
          job: true
        headers:
          X-Scope-OrgID: "<TENANT_ID>"
          Authorization: "Bearer <your-token>"
    
    processors:
      batch: {}
      resource:
        attributes:
        - action: insert
          key: loki.tenant
          value: host.name
      filter:
        metrics:
          exclude:
            match_type: regexp
            metric_names:
              - "go_.*"
              - "scrape_.*"
              - "otlp_.*"
              - "promhttp_.*"
    
    service:
      pipelines:
        metrics:
          receivers: [otlp]
          processors: [batch, filter]
          exporters: [prometheusremotewrite]
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlphttp]
        logs:
          receivers: [otlp]
          processors: [resource, batch]
          exporters: [loki]
    EOF
    
    
    # Run Otel Collector Contrib using Docker (the restart policy brings it back after a reboot)
    docker run -d --name otel-collector-contrib \
      --restart unless-stopped \
      -p 4317:4317 -p 4318:4318 \
      -v /home/ubuntu/otel-config.yaml:/etc/otel-config.yaml:ro \
      otel/opentelemetry-collector-contrib:latest \
      --config /etc/otel-config.yaml
    
    # Configure Docker to automatically start on boot
    sudo systemctl enable docker 
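
Because the script is pasted into the User Data field by hand, a forgotten placeholder is an easy mistake to make. The sketch below is a hypothetical pre-flight helper, not an Elven tool: `render_user_data` fills in the two placeholders used in the script above and refuses to return a script that still contains an angle-bracket placeholder. The tenant and token values are examples only.

```python
import re


def render_user_data(template, tenant_id, token):
    """Replace the script placeholders and fail if any placeholder is left."""
    rendered = template.replace("<TENANT_ID>", tenant_id).replace("<your-token>", token)
    # Matches tokens such as <TENANT_ID>; the heredoc marker "<<EOF >" is not
    # matched because of the space before ">".
    leftover = re.findall(r"<[A-Za-z][A-Za-z_-]*>", rendered)
    if leftover:
        raise ValueError(f"unreplaced placeholders: {sorted(set(leftover))}")
    return rendered


# Short excerpt standing in for the full script; values are examples only.
sample = 'X-Scope-OrgID: "<TENANT_ID>"\nAuthorization: "Bearer <your-token>"\n'
print(render_user_data(sample, "acme", "example-token"))
```

To use it on the real script, read your copy of user_data.sh into a string, pass it as `template`, and paste the returned text into the User Data field.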
    				
    			

    Synthetic Monitoring uses scripted checks to continuously test the availability of an HTTP endpoint and alerts you as soon as a failure occurs. Following the rules you define, the One Platform detects problems in your digital product before they affect your customers, helping ensure its availability and reliability.

    Configuring Synthetic Monitoring

    To configure Synthetic Monitoring, click on Synthetic in the platform’s left side menu.

    The Synthetic Monitoring Center opens and lists all your synthetic monitoring. Click on “New Synthetic Monitoring” to add a new one.

    First, you need to:

    1) Choose the name of the monitoring and the Environment. To create an environment from this page, click on “+ Environment” and you will be directed to the creation screen. When you finish, click on the reload symbol so the new environment appears in the list.

    2) After configuring the environment, choose how often the check runs in Interval check.

    3) In Response Template, if you know what a step’s response will be, paste that response in the field to generate variables. When you click on “Generate Keys”, the platform generates variables that can be used in the requests of any step. Clicking on a variable copies it to the clipboard.

    Configuring the steps

    After this initial setup, configure the customizable steps that define how your page or Web APIs are monitored.
    To do this, in the step settings:
    1) Choose the name of each step, the Healthcheck URL and the Timeout.
    Note: For security reasons, entering an IP address directly in the healthcheck field is not permitted. To monitor an IP, store it in a secret and use that secret in the healthcheck field.
    2) Choose the method, GET or POST. Select Skip SSL Validation to skip validation of the SSL certificate, and select TLS Renegotiation if the security protocol requires it. For the POST method, select the Post-Body Type: Raw or “application/x-www-form-urlencoded”. For Raw, select the Raw Type (JSON, Text, Javascript, XML or HTML) and fill in the Post Body according to the chosen Raw Type. For “application/x-www-form-urlencoded”, fill in the key and value; to configure more than one key, click the plus (+) button next to it.

    3) Click on “Optional request” to configure a Header, where you can enter the Validation String, the Headers and the Values. To configure more than one Header, click the plus (+) button next to it.


    Assertions

    Here you define what the request is expected to return.

    1) In the Source field, choose between JSON Body, Status Code and Text Body (if you choose JSON Body, also fill in the Property field).

    2) Choose one of the options available in Comparison.

    3) Define the Target Value.
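
One way to picture how Source, Property, Comparison and Target Value fit together is the sketch below. It is not the platform's implementation: the comparison names `equals` and `contains` are hypothetical stand-ins for whatever the Comparison list offers, values are compared as strings, and a dotted path is assumed for the Property field.

```python
import json


def extract_source(source, status_code, body, prop=None):
    """Pick the value an assertion looks at, based on the Source field."""
    if source == "Status Code":
        return status_code
    if source == "Text Body":
        return body
    if source == "JSON Body":
        value = json.loads(body)
        # Assumption: Property is a dotted path such as "data.status".
        for part in prop.split("."):
            value = value[part]
        return value
    raise ValueError(f"unknown source: {source}")


# Hypothetical comparison names; both sides are compared as strings.
COMPARISONS = {
    "equals": lambda actual, target: str(actual) == str(target),
    "contains": lambda actual, target: str(target) in str(actual),
}


def check_assertion(source, comparison, target, status_code, body, prop=None):
    """Return True when the response satisfies the assertion."""
    actual = extract_source(source, status_code, body, prop)
    return COMPARISONS[comparison](actual, target)


response_body = '{"data": {"status": "ok"}}'
print(check_assertion("Status Code", "equals", "200", 200, response_body))
print(check_assertion("JSON Body", "equals", "ok", 200, response_body, prop="data.status"))
print(check_assertion("Text Body", "contains", "error", 200, response_body))
```

The first two checks pass and the third fails, which is the situation in which a step would be reported as failing.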

    Assertions configuration examples:





    The variables generated by the Response Template can be used in the URL, Headers or Post Body/application/x-www-form-urlencoded.

    1) To use a variable in a Value field, enter it in the field exactly as it was generated.


    2) To use a variable in the URL, you need to add it to the URL that will be checked.


    3) To use a variable in the Post Body, simply add it as a value.


    If you want more than two monitoring steps, click on the “Add Step” button at the end of the step configuration.


    After configuring the steps, you can optionally configure automatic incident opening. Select the incident severity, the check interval in the Check Interval in seconds (…) field used to open and close incidents, and the number of failures and hits required to open and close an incident.

    After configuring the incident, choose which team will be notified and turn on the “Enable to set up automatic incidents opening” switch so the team is notified when there is an incident.
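
The failure and hit thresholds can be read as a small state machine. The sketch below shows one plausible interpretation and is not the platform's actual logic: it assumes that failures and hits are counted consecutively, and `incident_transitions` is a hypothetical function name.

```python
def incident_transitions(results, failures_to_open, hits_to_close):
    """Return (check index, event) pairs for a sequence of check results.

    Each result is True for a hit (successful check) or False for a failure.
    Assumption: counters are consecutive, so a hit resets the failure count
    and a failure resets the hit count.
    """
    events = []
    incident_open = False
    failures = hits = 0
    for index, ok in enumerate(results):
        if ok:
            hits += 1
            failures = 0
        else:
            failures += 1
            hits = 0
        if not incident_open and failures >= failures_to_open:
            incident_open = True
            events.append((index, "open"))
        elif incident_open and hits >= hits_to_close:
            incident_open = False
            events.append((index, "close"))
    return events


checks = [True, False, False, False, True, True, False, True, True, True]
print(incident_transitions(checks, failures_to_open=3, hits_to_close=2))
# [(3, 'open'), (5, 'close')]
```

With three failures required to open and two hits required to close, the single failure at index 6 does not open a new incident, which is the kind of noise these thresholds are meant to filter out.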

    After all these configurations, click on “Create monitoring” to create the synthetic monitoring.


    To edit or delete a monitoring, click on the button with three dots and choose the action you want.


    With a custom external service, our API (Application Programming Interface) receives data from your application. We generate a cURL command that sends either the “alarmed” action (opens an incident on the platform) or the “resolved” action (closes the incident on the platform). This way, our platform can process this data and contact your team if your application has an error. To configure a custom integration, request the cURL command from our team.

    Creating an API Token

    To create an API Token on the platform:
    1 – Click on Organization Settings in the bottom left corner
    2 – In the API tab, click on the “+” button to create a new API Token

    3 – Select the Api Token type and fill in the Name field, then click Generate Integration Token

    Creating a Custom External Service

    1 – Enter the Service Hub, located on the left side menu

    2 – Select one of the options: Alert Custom to open an alert, or Incident Custom to open an incident

    3 – In the form, you must fill in the External service name and the Responders who will receive notifications from this service, then click on CREATE

    4 – Further below, your External Service information will appear with a prompt to select an API Token; select the one created previously

    5 – After you select the API Token, all the information required to configure the cURL command is available

    6 – Once created, your External Services will appear in the External services monitoring center, sorted by status (services in alarm before operational ones)

    Below is an example of a cURL command for custom integration:
    curl --request POST \
      --url '<Elven API URL>' \
      --header 'Content-Type: application/json' \
      --header 'User-Agent: 1PcustomAuth/1.0' \
      --data '{
      "title": "<incident title>",
      "description": "<incident description>",
      "external_aggregate_key": "001",
      "action": "alarmed",
      "organization": "<org_uid provided by Elven>",
      "severity": "critical"
    }'
     
    • “--url” = API_URL generated when creating the External Service;
    • “title” = In this field you define a title that will appear in the incident opened in 1P;
    • “description” = In this field you define a description for the incident, it will appear in “cause” in the incident opened in 1P;
    • “external_aggregate_key” = In this field you define an identifier to “open” and “close” the incident, that is, when closing the incident, it must have the same external_aggregate_key as the open incident;
    • “action” = In this field you define the action to be executed, which can be “alarmed” (opens the incident) or “resolved” (closes the incident);
    • “organization” = This field is provided by the Elven team at the time of the request;
    • “severity” = In this field you define the severity associated with the incident, which can be informational, low, moderate, high or critical.

    Note: don’t forget to add the headers:

     --header 'Content-Type: application/json' \

     --header 'User-Agent: 1PcustomAuth/1.0' \

    Running this cURL command opens or closes incidents on the platform, so you can manage them and be notified.
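
The same request can be sent from application code. The sketch below uses only the Python standard library and mirrors the fields and headers described above. The URL, organization ID and incident text are placeholder example values, `build_payload` and `build_request` are hypothetical helper names, and the actual send is left commented out.

```python
import json
import urllib.request

# Allowed values as listed in the field descriptions above.
ACTIONS = {"alarmed", "resolved"}
SEVERITIES = {"informational", "low", "moderate", "high", "critical"}


def build_payload(title, description, aggregate_key, action, organization, severity):
    """Build the JSON body, rejecting values the field descriptions do not allow."""
    if action not in ACTIONS:
        raise ValueError(f"action must be one of {sorted(ACTIONS)}")
    if severity not in SEVERITIES:
        raise ValueError(f"severity must be one of {sorted(SEVERITIES)}")
    return {
        "title": title,
        "description": description,
        "external_aggregate_key": aggregate_key,
        "action": action,
        "organization": organization,
        "severity": severity,
    }


def build_request(api_url, payload):
    """Wrap the payload in a POST request with the two required headers."""
    return urllib.request.Request(
        api_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "User-Agent": "1PcustomAuth/1.0"},
        method="POST",
    )


# Example values only: use the API_URL and org_uid from your External Service.
api_url = "https://api.example.invalid/custom"
opened = build_payload("Checkout down", "Health check failing", "001",
                       "alarmed", "example-org-uid", "critical")
# Closing reuses the same external_aggregate_key as the open incident.
closed = dict(opened, action="resolved")

request = build_request(api_url, opened)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request)  # uncomment to actually send the request
```

Keeping `external_aggregate_key` identical between the two payloads is what lets the “resolved” call close the incident that the “alarmed” call opened.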

    In the digital age we live in, companies are subject to a variety of incidents that can disrupt their operations and compromise their security. From cyberattacks to infrastructure failures, these incidents can have significant consequences if not managed properly. This is where incident management comes in – a set of practices and procedures designed to effectively detect, respond to and resolve incidents. In this article, we’ll explore what incident management is and why it’s so crucial to business continuity.

    What is Incident Management?

    Incident management is a structured process for dealing with events that may disrupt an organization’s normal operations. The goal of incident management is to minimize the impact of these events, restore normality as quickly as possible, and learn from experiences to avoid similar incidents in the future.

    Incident Management Components:

    • Detection and Reporting: The first step of incident management is to detect and report the incident. This can be done through proactive systems monitoring, employee reporting, or security alerts.
    • Analysis and Assessment: Once an incident is detected, it needs to be analyzed and evaluated to determine its severity and potential impact on company operations.
    • Response and Mitigation: Based on the analysis of the incident, an appropriate response is developed and implemented to mitigate its negative effects. This may include measures such as isolating compromised systems, patching security vulnerabilities, and communicating with relevant stakeholders.
    • Recovery and Resolution: After initial mitigation, the focus shifts to incident recovery and resolution. This involves restoring affected systems, reversing any damage caused, and returning to normal operations as quickly as possible.
    • Post-Incident Analysis and Learning: Once the incident has been resolved, it is essential to perform a post-incident analysis to understand the underlying causes and identify areas for future improvements. This allows the organization to learn from experience and strengthen its security posture.

    Why is Incident Management Important?

    Incident management plays a key role in the protection and resilience of organizations in the face of a wide range of threats and challenges. Here are some reasons why it’s so important:

    • Minimize Downtime: Unmanaged incidents can result in significant downtime, hampering productivity and causing financial losses. A quick and effective response can help minimize this downtime and reduce the impact on operations.
    • Protect Assets and Data: Incident management helps protect the organization’s critical assets and data against internal and external threats. This includes confidential customer information, intellectual property and critical IT systems.
    • Preserve Brand Reputation: Cybersecurity incidents and other adverse events can have a significant impact on a company’s brand reputation. An effective response can help mitigate reputational damage and maintain trust with customers and stakeholders.
    • Ensure Regulatory Compliance: In many industries, organizations are required by law to protect sensitive data and ensure business continuity. Incident management plays an essential role in ensuring regulatory compliance and mitigating legal risks.
    • Enhance Organizational Resilience: By learning from past incidents and implementing continuous improvements to processes and systems, organizations can strengthen their resilience and ability to deal with future challenges.

    In an increasingly complex and interconnected business environment, incident management is essential to protect assets, ensure business continuity and preserve brand reputation. By adopting effective incident management practices, organizations can face challenges with confidence and better prepare for the digital future.

    Main Incident Management features of the Elven Platform

    • Call Rotation
    • Centralization of alerts
    • Incident centralization
    • Manual incident opening
    • Incident update
    • Unlimited intelligent duty roster
    • Post-mortem by incident
    • Dashboard with key metrics
    • Notifications on communication channels (Slack, Discord, WhatsApp, among others)
    • War-room by incident (Slack)
    • Integration with ITSM tools (ServiceNow, Jira)

    Insights provides a comprehensive view of an organization’s historical data, allowing leadership to make informed decisions to enhance operational maturity. Found in the sidebar menu, Insights consists of the following tabs:

    General

    The General tab lets you track and understand the performance of the monitoring conducted by the One Platform over the last 30 days. It offers a view of performance-related data, allowing users to quickly identify trends, patterns, and anomalies.


    • Uptime: The time during which a monitored resource remains operational, without interruptions or failures. Downtime is the time during which it was failing.
    • MTTR (Mean Time to Resolve): The average time from the trigger of a failure to its resolution.
    • MTTA (Mean Time to Acknowledge): The average time from the trigger of a failure to its acknowledgment.
    • MTBF (Mean Time Between Failures): The average time between failures.
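
    As a rough illustration of how these four figures relate, the sketch below computes them from a list of failure records over a 30-day window. The `Failure` record, its field names and the `general_metrics` function are hypothetical and are not the platform's data model. It assumes that failures do not overlap and fall entirely inside the window, and it takes MTBF as operational time divided by the number of failures, which is one common convention; the platform may calculate it differently.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Failure:
    # Hypothetical record; the field names are assumptions for this example.
    triggered_at: datetime
    acknowledged_at: datetime
    resolved_at: datetime


def general_metrics(failures: list[Failure], window: timedelta) -> dict[str, float | None]:
    """Return uptime (%) and MTTA, MTTR, MTBF (in minutes) for one window."""
    if not failures:
        # With no failures the averages are undefined.
        return {"uptime_pct": 100.0, "mtta_min": None, "mttr_min": None, "mtbf_min": None}

    n = len(failures)
    minute = timedelta(minutes=1)
    # Downtime: trigger to resolution, summed over all failures.
    downtime = sum((f.resolved_at - f.triggered_at for f in failures), timedelta())
    # Time to acknowledge: trigger to acknowledgment, summed over all failures.
    time_to_ack = sum((f.acknowledged_at - f.triggered_at for f in failures), timedelta())
    uptime = window - downtime

    return {
        "uptime_pct": uptime / window * 100,
        "mtta_min": time_to_ack / n / minute,
        "mttr_min": downtime / n / minute,
        # Assumed convention: operational time divided by the number of failures.
        "mtbf_min": uptime / n / minute,
    }


failures = [
    Failure(datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 5), datetime(2024, 1, 5, 10, 45)),
    Failure(datetime(2024, 1, 20, 22, 0), datetime(2024, 1, 20, 22, 15), datetime(2024, 1, 20, 23, 15)),
]
metrics = general_metrics(failures, timedelta(days=30))
print(metrics)
```

    With these two sample failures (45 and 75 minutes of downtime), MTTA is 10 minutes, MTTR is 60 minutes and uptime is roughly 99.72% of the window.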

    Incidents

    The Incidents tab provides an overview of the response effort over time for each incident that occurred in the organization. You can filter incidents by date range, severity, and incident source.

    • Total Incidents: The total number of incidents the organization faced within a given period.
    • Total Response Effort: The total time spent on an incident, measured from acknowledgment to resolution.
    • MTTA (Mean Time to Acknowledge): The average time to acknowledge an incident.
    • MTTR (Mean Time to Resolve): The average time from the trigger of an incident to its resolution.
    • Time Cluster: The group (business hours, off hours, or sleep hours) corresponding to the period when the incident occurred.
    • Business Hour Interruptions: Interruptions that occurred on weekdays between 8 AM and 6 PM.
    • Off Hour Interruptions: Interruptions that occurred on weekdays between 6 PM and 10 PM, or during weekends between 6 PM and 10 PM.
    • Sleep Hour Interruptions: Interruptions that occurred any day of the week between 10 PM and 8 AM.
    • TTA (Time to Acknowledge): The amount of time between the incident trigger and its acknowledgment.
    • TTR (Time to Resolve): The time from the incident trigger to its resolution.
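
    The time clusters and the per-incident TTA and TTR above can be sketched as follows. This is only an illustration: the function names are hypothetical, the timestamps are assumed to already be in the organization's local time, and the treatment of exact boundaries (for example, 6 PM sharp counting as off hours) is an assumption rather than documented platform behavior.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def time_cluster(triggered_at: datetime) -> Optional[str]:
    """Classify an interruption by the local time at which it occurred."""
    hour = triggered_at.hour
    is_weekday = triggered_at.weekday() < 5  # Monday=0 ... Friday=4

    if hour >= 22 or hour < 8:
        return "sleep_hour"      # any day, 10 PM to 8 AM
    if is_weekday and 8 <= hour < 18:
        return "business_hour"   # weekdays, 8 AM to 6 PM
    if 18 <= hour < 22:
        return "off_hour"        # weekdays or weekends, 6 PM to 10 PM
    # Weekend daytime (8 AM to 6 PM) is not covered by the definitions above.
    return None


def tta_ttr(triggered_at: datetime, acknowledged_at: datetime,
            resolved_at: datetime) -> Tuple[timedelta, timedelta]:
    """Return (TTA, TTR); both are measured from the incident trigger."""
    return acknowledged_at - triggered_at, resolved_at - triggered_at


# 2024-01-03 is a Wednesday.
print(time_cluster(datetime(2024, 1, 3, 19, 30)))
print(tta_ttr(datetime(2024, 1, 3, 19, 30),
              datetime(2024, 1, 3, 19, 40),
              datetime(2024, 1, 3, 20, 15)))
```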

    Responders

    The Responders tab provides insights into the impact of incidents on responders and includes data on how the incidents were resolved. It also features an individual list of incidents for each responder. You can filter this dashboard by date range, severity, responders, time cluster, and MTTR.

    • Total Incidents: The total number of incidents the organization faced within a given period.
    • Total Response Effort: The sum of the time responders were involved with incidents, measured from the moment a responder acknowledges an incident until it is resolved.
    • MTTA (Mean Time to Acknowledge): The average time a responder took to acknowledge an incident.
    • MTTR (Mean Time to Resolve): The average time from the trigger of an incident to its resolution by a responder.
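
    To make the per-responder figures concrete, the sketch below groups incident records by responder and derives the four values listed above. The `ResponderIncident` record, its field names and the `responder_summary` function are hypothetical, and each incident is assumed to have a single responder; this is not how the platform itself stores or computes the data.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ResponderIncident:
    # Hypothetical record; the field names are assumptions for this example.
    responder: str
    triggered_at: datetime
    acknowledged_at: datetime
    resolved_at: datetime


def responder_summary(incidents):
    """Return total incidents, response effort, MTTA and MTTR per responder."""
    grouped = defaultdict(list)
    for inc in incidents:
        grouped[inc.responder].append(inc)

    summary = {}
    for responder, items in grouped.items():
        n = len(items)
        # Response effort runs from acknowledgment to resolution.
        effort = sum((i.resolved_at - i.acknowledged_at for i in items), timedelta())
        # MTTA and MTTR are both measured from the incident trigger.
        tta = sum((i.acknowledged_at - i.triggered_at for i in items), timedelta())
        ttr = sum((i.resolved_at - i.triggered_at for i in items), timedelta())
        summary[responder] = {
            "total_incidents": n,
            "total_response_effort": effort,
            "mtta": tta / n,
            "mttr": ttr / n,
        }
    return summary


incidents = [
    ResponderIncident("ana", datetime(2024, 1, 3, 10, 0), datetime(2024, 1, 3, 10, 10), datetime(2024, 1, 3, 11, 0)),
    ResponderIncident("ana", datetime(2024, 1, 4, 14, 0), datetime(2024, 1, 4, 14, 20), datetime(2024, 1, 4, 14, 50)),
    ResponderIncident("bob", datetime(2024, 1, 5, 9, 0), datetime(2024, 1, 5, 9, 5), datetime(2024, 1, 5, 9, 35)),
]
for name, stats in responder_summary(incidents).items():
    print(name, stats)
```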
