Data Security

In the era of cloud computing, data security is crucial. Comprehensive data protection improves visibility and insight, automatically warns of security risks, and helps ensure data availability and compliance.

When you use Guance, its built-in tools assess the received data for risk and process it accordingly.

How to Reduce Data Risks?

Guance collects monitoring information from your infrastructure and services and manages it centrally, so you can analyze and process it at any time. During this process, the server transmits various types of data. Data collected through normal use of Guance products mostly does not contain personal privacy information. For non-essential personal data that may be included, we provide detailed explanations and recommendations to avoid confusion. Guance offers several ways to help you reduce data risks.

Data Security Considerations on the DataKit Side

HTTPS Data Upload

All data from DataKit is uploaded using the HTTPS protocol to ensure the security of data communication.

Limited Download Mechanism

The center cannot issue commands to DataKit for execution. All requests are initiated by DataKit, which can only periodically pull certain configurations (such as Pipeline and blacklist configurations) from the center.

Field Value Desensitization During Tracing Collection

During Tracing collection, executed SQL statements may be collected. The field values in these statements are desensitized. For example:

SELECT name from class where name = 'zhangsan'

Will be desensitized to:

SELECT name from class where name = ?
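The effect shown above can be approximated with a short sketch. It is not DataKit's implementation: desensitizeSql is a hypothetical helper, and it assumes that only single-quoted strings and standalone numbers count as field values.

```javascript
// Hypothetical helper, not DataKit's implementation: replace string and
// numeric literals in a SQL statement with the placeholder "?".
function desensitizeSql(sql) {
  return sql
    // Single-quoted string literals, including '' escapes inside them.
    .replace(/'(?:[^']|'')*'/g, '?')
    // Standalone numeric literals (integers and decimals).
    .replace(/\b\d+(?:\.\d+)?\b/g, '?')
}

console.log(desensitizeSql("SELECT name from class where name = 'zhangsan'"))
// SELECT name from class where name = ?
```

A regular expression is enough to illustrate the idea, but it does not understand comments, double-quoted identifiers, or dialect-specific literals, so a real collector would need a proper SQL tokenizer.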

Pipeline and Blacklist Mechanism

If sensitive data cannot be removed during collection, you can use Pipeline functions (such as cover(), which replaces part of a string with *) to desensitize it, for example to mask phone numbers.
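To illustrate the kind of result such masking produces, here is a minimal JavaScript sketch. It is not the Pipeline cover() function and does not reproduce its signature: maskRange is a hypothetical helper, and the 1-based inclusive positions are an assumption made for this example.

```javascript
// Hypothetical helper that mimics the effect described above. Positions are
// 1-based and inclusive (an assumption made for this sketch).
function maskRange(value, start, end) {
  const chars = Array.from(String(value))
  const from = Math.max(start, 1)
  const to = Math.min(end, chars.length)
  for (let i = from; i <= to; i++) {
    chars[i - 1] = '*'
  }
  return chars.join('')
}

console.log(maskRange('13800001234', 4, 7)) // 138****1234
```

Keeping the first and last digits visible lets operators still tell records apart while the middle of the number stays hidden.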

In addition, by configuring blacklist rules, you can also prevent the upload of some sensitive data.

Sensitive Data Scanning

The sensitive data scanning feature identifies, marks, and edits data that contains personal privacy information and other risky data. As a line of defense, it helps prevent sensitive data from leaking.

For more details, refer to Sensitive Data Scanning.

Logs

Using Guance products and services generates a large number of log records. Because log data is highly correlated, specific rules must be applied during collection and analysis to filter this large volume of data.

After you configure sensitive fields for log data, members with the corresponding permissions see only desensitized log data.

Data access control is another key way to reduce log data security risks. By configuring a log query scope for each role, you isolate data between roles and can manage and filter sensitive data comprehensively.

For more details, refer to Multi-Role Data Access Control.

Snapshots

A Guance snapshot is a point-in-time copy of data that contains the filter conditions for abnormal data and the matching data records. When you need to share monitoring data, you can set data desensitization rules or choose the sharing method when sharing a snapshot. This generates an access link with specified viewing permissions, which protects the shared data.

For more details, refer to Snapshots.

RUM

When collecting user access data, the RUM (Real User Monitoring) SDK lets you intercept and modify the data so that sensitive data is not sent.

For more details, refer to SDK Data Interception and Data Modification.

Session Replay Privacy Settings

Session Replay provides privacy controls so that organizations do not expose sensitive or personal data, and the recorded data is stored encrypted. The default privacy options for Session Replay are designed to protect end-user privacy and prevent sensitive organizational information from being collected.

Global Configuration

When Session Replay is enabled, sensitive elements can be masked automatically so that the RUM SDK does not record them.

To enable your privacy settings, set defaultPrivacyLevel to mask-user-input, mask, or allow in your SDK configuration.

import { datafluxRum } from '@cloudcare/browser-rum'

datafluxRum.init({
  applicationId: '<DATAFLUX_APPLICATION_ID>',
  datakitOrigin: '<DATAKIT ORIGIN>',
  service: 'browser',
  env: 'production',
  version: '1.0.0',
  sessionSampleRate: 100,
  sessionReplaySampleRate: 100,
  trackInteractions: true,
  defaultPrivacyLevel: 'mask-user-input', // or 'mask' / 'allow'
})

datafluxRum.startSessionReplayRecording()

After updating the configuration, you can use the following privacy options to override elements in the HTML document:

Mask user input mode: Masks most form fields, such as inputs, text areas, and checkbox values, while recording all other text as is. Inputs are replaced with three asterisks (***), and text areas are obfuscated with x characters that preserve space.

Note

By default, mask-user-input is the privacy setting when session replay is enabled.

Mask mode: Masks all HTML text, user input, images, and links. The text on the application is replaced with X, rendering the page as a wireframe.

Allow mode: Records all data.
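The three modes can be compared side by side with a small sketch. It only models the visible result described above and is not SDK code: applyPrivacy and its kind values ('input', 'textarea', 'text') are hypothetical, and it ignores elements that are always masked regardless of the configured level.

```javascript
// Sketch only: models the visible effect of each privacy level as described
// above. applyPrivacy and its "kind" values are hypothetical, not SDK APIs.
function applyPrivacy(level, kind, value) {
  const isUserInput = kind === 'input' || kind === 'textarea'
  if (level === 'allow') return value
  if (level === 'mask-user-input' && !isUserInput) return value
  // Inputs collapse to three asterisks; other text keeps its whitespace.
  if (kind === 'input') return '***'
  return value.replace(/\S/g, 'x')
}

console.log(applyPrivacy('mask-user-input', 'input', 'alice'))       // ***
console.log(applyPrivacy('mask-user-input', 'textarea', 'hi there')) // xx xxxxx
console.log(applyPrivacy('mask-user-input', 'text', 'Order list'))   // Order list
console.log(applyPrivacy('mask', 'text', 'Order list'))              // xxxxx xxxx
```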

Some limitations:

For data security reasons, the following elements are always masked, regardless of the defaultPrivacyLevel you configure:

  • Input elements of type password, email, and tel;
  • Elements with the autocomplete attribute, such as credit card numbers, expiration dates, and security codes.

Custom Configuration

Session Replay supports masking of sensitive elements. You can choose what to mask according to business requirements, for example phone numbers. The available methods are as follows:

Configuring Masking via Element Attributes

You can add the data-gc-privacy attribute to elements that need to be masked, supporting the following four attribute values:

• allow: Allows data collection, no masking.
• mask: Masks content, displaying the content in a masked form.
• mask-user-input: Masks user input, preventing the recording of sensitive input data.
• hidden: Completely hides the content.

Example code:

<!-- Allow data collection -->
<div class="mobile" data-gc-privacy="allow">13523xxxxx</div>

<!-- Mask content -->
<div class="mobile" data-gc-privacy="mask">13523xxxxx</div>

<!-- Mask user input -->
<input class="mobile" data-gc-privacy="mask-user-input" value="13523xxxxx" />

<!-- Hide content -->
<div class="mobile" data-gc-privacy="hidden">13523xxxxx</div>

Configuring Masking via Element Class Names

You can also mask elements by adding specific class names to them. The following class names are currently supported:

• gc-privacy-allow: Allows data collection.
• gc-privacy-mask: Masks content.
• gc-privacy-mask-user-input: Masks user input.
• gc-privacy-hidden: Completely hides content.

Example code:

<!-- Allow data collection -->
<div class="mobile gc-privacy-allow">13523xxxxx</div>

<!-- Mask content -->
<div class="mobile gc-privacy-mask">13523xxxxx</div>

<!-- Mask user input -->
<input class="mobile gc-privacy-mask-user-input" value="13523xxxxx" />

<!-- Hide content -->
<div class="mobile gc-privacy-hidden">13523xxxxx</div>

Using shouldMaskNode to Implement Custom Node Masking Strategies

In some special scenarios, it may be necessary to perform customized masking on specific DOM nodes. For example, in applications with high security levels, it may be desirable to uniformly mask all text content containing numbers on the page. This requirement can be achieved by configuring the shouldMaskNode callback function to implement more flexible privacy control strategies.

import { datafluxRum } from '@cloudcare/browser-rum'

datafluxRum.init({
  applicationId: '<DATAFLUX_APPLICATION_ID>',
  datakitOrigin: '<DATAKIT ORIGIN>',
  service: 'browser',
  env: 'production',
  version: '1.0.0',
  sessionSampleRate: 100,
  sessionReplaySampleRate: 100,
  trackInteractions: true,
  defaultPrivacyLevel: 'mask-user-input', // or 'mask' / 'allow'
  shouldMaskNode: (node, privacyLevel) => {
    if (node.nodeType === Node.TEXT_NODE) {
      // If it is a text node, determine if the content contains numbers
      const textContent = node.textContent || ''
      return /\d+/.test(textContent)
    }
    return false
  },
})

datafluxRum.startSessionReplayRecording()

In the above example, the shouldMaskNode function evaluates every text node. If the content contains digits (such as amounts or phone numbers), the node is masked, which strengthens privacy protection for user data.
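Because the callback is a plain predicate, its logic can be checked outside the browser before it is wired into the SDK. The sketch below is one way to do that, under the assumption that plain objects with nodeType and textContent are an adequate stand-in for DOM nodes. The constant 3 is the DOM-defined value of Node.TEXT_NODE.

```javascript
// TEXT_NODE has the numeric value 3 in the DOM standard (Node.TEXT_NODE).
const TEXT_NODE = 3

// Same rule as the callback above: mask text nodes that contain a digit.
function shouldMaskNode(node, privacyLevel) {
  if (node.nodeType !== TEXT_NODE) return false
  return /\d/.test(node.textContent || '')
}

// Plain objects stand in for DOM nodes in this check (an assumption).
console.log(shouldMaskNode({ nodeType: 3, textContent: 'Balance: 1,250' })) // true
console.log(shouldMaskNode({ nodeType: 3, textContent: 'Welcome back' }))   // false
console.log(shouldMaskNode({ nodeType: 1, textContent: '42' }))             // false
```

Testing for a single digit with /\d/ gives the same result as /\d+/ here, since the callback only needs to know whether any digit is present.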

Some Recommendations

  • Priority rules:

    • If both the data-gc-privacy attribute and class name are set, it is recommended to determine the priority according to the project documentation.

  • Use Cases:

    • allow: Suitable for regular data that does not need to be masked.
    • mask: Suitable for sensitive data that needs to be displayed in a masked form, such as phone numbers.
    • mask-user-input: Suitable for scenarios where input content needs to be protected, such as password fields.
    • hidden: Suitable for content that you do not want to display or record.

  • Best Practices:

    • Prefer simple and clear methods (such as class names or attributes) to ensure accurate configuration.
    • In high-sensitivity data scenarios, such as user privacy forms, it is recommended to use mask-user-input or hidden.

Through the above methods, you can flexibly configure the masking rules for sensitive elements, improve data security, and meet business compliance requirements.
