
Data Security

In the era of cloud computing, data security is paramount. Comprehensive data protection improves visibility and insight, warns of security risks automatically, and helps ensure data availability and security compliance.

When you use Guance, its built-in tools assess and handle risks in the data it receives.

How to reduce data risks?

Guance collects monitoring information from your infrastructure and services and manages it centrally so that you can analyze and process it at any time. In this process, your servers send various types of data to Guance. Most data collected through normal use of Guance products does not contain personal privacy information. For personal data that is non-essential but may still be included, we provide detailed explanations and recommendations to prevent confusion. Guance offers multiple ways to help you reduce data risks.

Data Security Considerations on the DataKit Side

HTTPS Data Upload

All data from DataKit is uploaded using the HTTPS protocol to ensure the security of data communication.

Limited Push Mechanism

The center cannot issue commands for DataKit to execute; all requests are initiated by DataKit. DataKit only periodically pulls certain configurations (such as Pipeline and blacklist configurations) from the center.

Field Value Desensitization During Tracing Collection

During Tracing collection, some SQL statements and their execution details may be collected. Field values in these statements are desensitized. For example:

SELECT name from class where name = 'zhangsan'

is desensitized to:

SELECT name from class where name = ?
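The following is a minimal sketch of this kind of literal replacement. The `obfuscateSql` helper is hypothetical and is not DataKit's actual implementation; it only illustrates the idea of replacing string and numeric literals with `?`.

```javascript
// Illustrative only: replaces literal values in a SQL string with "?".
// obfuscateSql is a hypothetical helper, not DataKit's actual implementation.
function obfuscateSql(sql) {
  return sql
    // Single-quoted string literals, including escaped quotes ('')
    .replace(/'(?:[^']|'')*'/g, '?')
    // Standalone numeric literals such as 42 or 3.14
    .replace(/\b\d+(?:\.\d+)?\b/g, '?')
}

console.log(obfuscateSql("SELECT name from class where name = 'zhangsan'"))
// SELECT name from class where name = ?
```

A real SQL obfuscator typically tokenizes the statement rather than relying on regular expressions, so treat this as a conceptual aid only.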

Pipeline and Blacklist Mechanism

If the data contains sensitive information that cannot be removed during collection, Pipeline functions can desensitize it. For example, the cover() function replaces part of a string with *, which is useful for values such as phone numbers.

Furthermore, by configuring blacklist rules, the upload of some sensitive data can also be prevented.
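To make the effect of range masking concrete, here is a JavaScript sketch in the spirit of cover(). It is not Pipeline syntax: `coverRange` is a hypothetical helper, and the 1-based, inclusive position convention is an assumption that should be checked against the Pipeline function reference.

```javascript
// Hypothetical JavaScript illustration of range masking, in the spirit of
// Pipeline's cover(); this is not Pipeline syntax. Positions are assumed to
// be 1-based and inclusive.
function coverRange(value, start, end) {
  const chars = Array.from(value)
  const from = Math.max(start, 1)
  const to = Math.min(end, chars.length)
  for (let i = from; i <= to; i++) {
    chars[i - 1] = '*'
  }
  return chars.join('')
}

console.log(coverRange('13812345678', 4, 7)) // 138****5678
```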

Sensitive Data Scanning

The sensitive data scanning feature identifies, marks, and edits data that carries risks, such as personal privacy information. As a line of defense, it helps prevent sensitive data from leaking.

For more details, please refer to Sensitive Data Scanning.

Logs

Using Guance's product services generates a large number of log records. Because log records are highly correlated with one another, specific rules must be applied during collection and analysis to filter the large volume of log data.

After you configure sensitive fields for log data, members with the corresponding permissions see only desensitized log data.

Data access permission control is another key way to reduce log data security risks. Configuring log data access and query scopes for different roles isolates the data, so that sensitive data can be managed and filtered comprehensively.

For more details, please refer to Multi-Role Data Access Control.

Snapshots

A Guance snapshot is an instant copy of data that contains the filter conditions for abnormal data together with the data records. When you need to share monitoring data, you can set data desensitization rules or choose the sharing method when sharing a snapshot, which generates an access link with the specified viewing permissions and protects the shared data.

For more details, please refer to Snapshots.

RUM

When collecting data about user visits, the RUM (Real User Monitoring) SDK lets you modify and intercept data with custom logic to keep sensitive data from being sent.

For more details, please refer to SDK Data Interception and Data Modification.

Web RUM SDK

The following is an explanation of Cookie usage and alternative mechanisms by the RUM SDK (compliance disclosure).

Under the default configuration, the Guance RUM SDK writes two types of Cookies into the monitored application to achieve session identification and user identification functions:

  1. Cookies prefixed with _gc_s_: Store information about the current access session, used to associate and aggregate user behavior within one access cycle.

  2. Cookies prefixed with _gc_usr_: Store user identification information, used to identify the same end user across different access sessions.

The aforementioned Cookies do not contain the user's real identity information and are only used for session management and anonymous user identification in monitoring data.

If writing Cookies is not allowed in specific business scenarios (such as privacy compliance requirements, user refusal to use Cookies, browser restrictions, etc.), the RUM SDK supports initialization configuration:

sessionPersistence: "local-storage"

This stores the aforementioned session information and user identification in LocalStorage, thus continuing to provide session and user association capabilities without writing any Cookies.
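One way to apply this option conditionally is sketched below. The `buildRumConfig` helper and its `cookiesAllowed` flag are hypothetical; only the `sessionPersistence` option and its `"local-storage"` value come from the text above.

```javascript
// buildRumConfig and cookiesAllowed are hypothetical names used for illustration.
function buildRumConfig(baseConfig, cookiesAllowed) {
  if (cookiesAllowed) {
    // Default behavior: the SDK writes the _gc_s_ and _gc_usr_ cookies
    return { ...baseConfig }
  }
  // Keep session and user identifiers in LocalStorage instead of cookies
  return { ...baseConfig, sessionPersistence: 'local-storage' }
}

const config = buildRumConfig(
  { applicationId: '<DATAFLUX_APPLICATION_ID>', datakitOrigin: '<DATAKIT ORIGIN>' },
  false
)
console.log(config.sessionPersistence) // local-storage
```

The resulting object would then be passed to datafluxRum.init() in place of a hard-coded configuration.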

Session Replay Privacy Settings

Session Replay provides privacy controls so that organizations do not expose sensitive or personal data, and the recorded data is stored encrypted. The default privacy options for Session Replay are designed to protect end-user privacy and prevent sensitive organizational information from being collected.

Global Configuration

When Session Replay is enabled, sensitive elements can be masked automatically so that the RUM SDK does not record them.

To choose a privacy level, set defaultPrivacyLevel to mask-user-input, mask, or allow in your SDK configuration.

import { datafluxRum } from '@cloudcare/browser-rum'

datafluxRum.init({
  applicationId: '<DATAFLUX_APPLICATION_ID>',
  datakitOrigin: '<DATAKIT ORIGIN>',
  service: 'browser',
  env: 'production',
  version: '1.0.0',
  sessionSampleRate: 100,
  sessionReplaySampleRate: 100,
  trackInteractions: true,
  defaultPrivacyLevel: 'mask-user-input', // or 'mask' or 'allow'
})

datafluxRum.startSessionReplayRecording()

After updating the configuration, you can override elements of the HTML document using the following privacy options:

Mask user input mode: Masks most form fields, such as inputs, text areas, and checkbox values, while recording all other text as is. Inputs are replaced with three asterisks (***), and text areas are obfuscated with x characters that preserve spacing.

Note

By default, mask-user-input is the privacy setting when session replay is enabled.

Mask mode: Masks all HTML text, user input, images, and links. Text on the application is replaced with Xs, rendering the page as a wireframe.

Allow mode: Records all data.
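The masking described for the mask-user-input and mask modes can be approximated as follows. Both helpers are hypothetical and are not the SDK's internal functions; they only show what the recorded output looks like.

```javascript
// Hypothetical helpers approximating the masking described above;
// they are not the SDK's internal functions.

// Input values are replaced with a fixed three-asterisk placeholder
function maskInputValue(_value) {
  return '***'
}

// Text is replaced with "x" characters, keeping whitespace intact
function maskText(text) {
  return text.replace(/\S/g, 'x')
}

console.log(maskInputValue('hunter2'))   // ***
console.log(maskText('Order 1024 paid')) // xxxxx xxxx xxxx
```

Because whitespace is preserved, the replayed page keeps its layout while the actual text stays unreadable.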

Some limitations:

For data security reasons, the following elements are always masked, regardless of the defaultPrivacyLevel mode you configure:

  • Input elements of type password, email, and tel;
  • Elements with the autocomplete attribute, such as credit card numbers, expiration dates, and security codes.

Custom Configuration

Session Replay supports masking of sensitive elements. You can choose which content to mask according to business requirements, for example sensitive information such as mobile phone numbers. The available methods are described below.

Configuring Masking via Element Attributes

You can add the data-gc-privacy attribute to elements that need masking, supporting the following four attribute values:

  • allow: Allows data collection, no masking processing.
  • mask: Masks the content, displaying the content in a masked form.
  • mask-user-input: Masks user input, preventing the recording of sensitive input data.
  • hidden: Completely hides the content.

Example code:

<!-- Allow data collection -->
<div class="mobile" data-gc-privacy="allow">13523xxxxx</div>

<!-- Mask content -->
<div class="mobile" data-gc-privacy="mask">13523xxxxx</div>

<!-- Mask user input -->
<input class="mobile" data-gc-privacy="mask-user-input" value="13523xxxxx" />

<!-- Hide content -->
<div class="mobile" data-gc-privacy="hidden">13523xxxxx</div>

Configuring Masking via Element Class Names

Masking can also be applied by adding specific class names to elements. The following class names are currently supported:

  • gc-privacy-allow: Allows data collection.
  • gc-privacy-mask: Masks content.
  • gc-privacy-mask-user-input: Masks user input.
  • gc-privacy-hidden: Completely hides content.

Example code:

<!-- Allow data collection -->
<div class="mobile gc-privacy-allow">13523xxxxx</div>

<!-- Mask content -->
<div class="mobile gc-privacy-mask">13523xxxxx</div>

<!-- Mask user input -->
<input class="mobile gc-privacy-mask-user-input" value="13523xxxxx" />

<!-- Hide content -->
<div class="mobile gc-privacy-hidden">13523xxxxx</div>

Using shouldMaskNode to Implement Custom Node Masking Strategies

In some scenarios you may need customized masking for specific DOM nodes. For example, a high-security application might want to mask all text content that contains numeric values. You can meet this requirement by configuring the shouldMaskNode callback, which allows more flexible privacy control.

import { datafluxRum } from '@cloudcare/browser-rum'

datafluxRum.init({
  applicationId: '<DATAFLUX_APPLICATION_ID>',
  datakitOrigin: '<DATAKIT ORIGIN>',
  service: 'browser',
  env: 'production',
  version: '1.0.0',
  sessionSampleRate: 100,
  sessionReplaySampleRate: 100,
  trackInteractions: true,
  defaultPrivacyLevel: 'mask-user-input', // or 'mask' or 'allow'
  shouldMaskNode: (node, privacyLevel) => {
    if (node.nodeType === Node.TEXT_NODE) {
      // If it's a text node, check if the content contains numbers
      const textContent = node.textContent || ''
      return /\d+/.test(textContent)
    }
    return false
  },
})

datafluxRum.startSessionReplayRecording()

In the above example, shouldMaskNode checks every text node. If the content contains digits (such as amounts or phone numbers), the node is masked, which strengthens privacy protection for user data.
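To check such a predicate outside the browser, the same logic can be extracted into a named function and exercised with plain objects. The function name `maskTextWithDigits` and the stand-in node objects are assumptions made for illustration.

```javascript
// TEXT_NODE is 3 and ELEMENT_NODE is 1 in the DOM specification; they are
// hard-coded so the sketch also runs outside a browser.
const TEXT_NODE = 3
const ELEMENT_NODE = 1

// Same logic as the shouldMaskNode callback above, extracted as a named function
function maskTextWithDigits(node) {
  if (node.nodeType === TEXT_NODE) {
    return /\d/.test(node.textContent || '')
  }
  return false
}

// Plain objects stand in for DOM nodes (an assumption for testing only)
console.log(maskTextWithDigits({ nodeType: TEXT_NODE, textContent: 'Balance: 1,250.00' })) // true
console.log(maskTextWithDigits({ nodeType: TEXT_NODE, textContent: 'Welcome back' }))      // false
console.log(maskTextWithDigits({ nodeType: ELEMENT_NODE, textContent: '42' }))             // false
```

Keeping the predicate as a standalone function makes it easy to unit test before passing it as shouldMaskNode.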

Some Recommendations

  • Priority rules:
    • If an element has both the data-gc-privacy attribute and a privacy class name, confirm which one takes precedence in the project documentation before relying on the result.
  • Applicable scenarios:
    • allow: Suitable for regular data that does not require masking.
    • mask: Suitable for sensitive data that needs to be displayed in a masked form, such as mobile phone numbers.
    • mask-user-input: Suitable for scenarios where input content needs protection, such as password fields.
    • hidden: Suitable for content that you do not want to display or record.
  • Best practices:
    • Prefer simple and clear methods (such as class names or attributes) to ensure accurate configuration.
    • In high-sensitivity data scenarios, such as user privacy forms, it is recommended to use mask-user-input or hidden.

Through the above methods, you can flexibly configure masking rules for sensitive elements, improve data security, and meet business compliance requirements.
