Version: Next

Overview

InLong Audit is a subsystem that runs independently of the rest of InLong. It audits and reconciles, in real time, the incoming and outgoing traffic of the Agent, DataProxy, and Sort modules. Reconciliation is available at three granularities: minute, hour, and day.

Audit reconciliation is based on the log reporting time: every service that participates in the audit reconciles in real time against the same log time. Audit reconciliation shows the transmission status of each InLong module and whether any data in a stream has been lost or duplicated.
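To make reconciliation by log time concrete, here is a minimal sketch, not InLong code. It assumes each module reports (log time, record count) pairs; the function names and the sample figures are hypothetical. Counts are summed per minute, hour, or day bucket and compared between an upstream and a downstream module.

```python
from collections import Counter
from datetime import datetime

# Hypothetical truncation formats for the three reconciliation granularities.
GRANULARITY_FORMATS = {
    "minute": "%Y-%m-%d %H:%M",
    "hour": "%Y-%m-%d %H",
    "day": "%Y-%m-%d",
}


def bucket(log_time, granularity):
    """Truncate a log time to its minute, hour, or day bucket."""
    return log_time.strftime(GRANULARITY_FORMATS[granularity])


def aggregate(records, granularity):
    """Sum record counts per bucket; `records` holds (log_time, count) pairs."""
    totals = Counter()
    for log_time, count in records:
        totals[bucket(log_time, granularity)] += count
    return totals


def reconcile(upstream, downstream, granularity):
    """Return {bucket: upstream - downstream} for every bucket that differs.

    A positive difference suggests lost records, a negative one duplicates.
    """
    up = aggregate(upstream, granularity)
    down = aggregate(downstream, granularity)
    buckets = sorted(set(up) | set(down))
    return {b: up[b] - down[b] for b in buckets if up[b] != down[b]}


# Made-up sample data: two records go missing in the 10:01 minute.
agent_sent = [(datetime(2024, 1, 1, 10, 0, 5), 100),
              (datetime(2024, 1, 1, 10, 1, 30), 50)]
proxy_received = [(datetime(2024, 1, 1, 10, 0, 5), 100),
                  (datetime(2024, 1, 1, 10, 1, 30), 48)]
print(reconcile(agent_sent, proxy_received, "minute"))
# {'2024-01-01 10:01': 2}
```

Because both sides bucket by the log time of the data rather than by the time the audit record arrived, the comparison stays valid even when audit reports are delayed.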

Architecture

The audit SDK is embedded in each service that needs to be audited. It audits the service and sends the audit results to the audit access layer (audit-proxy).
The audit proxy writes the audit data to MQ (Pulsar, Kafka or TubeMQ).
The distribution service consumes the audit data from MQ and writes it to MySQL or StarRocks.
The interface layer wraps the data stored in MySQL or StarRocks.
Application scenarios mainly include report display and audit reconciliation.
Audit reconciliation is supported for data backfill (supplementary recording) scenarios.
Audit reconciliation is supported in Flink checkpoint scenarios.

Module

Module	Description
audit-sdk	Audit instrumentation and reporting. Each module uses the SDK to report audit data
audit-proxy	Audit proxy layer, receives data reported by the SDK and forwards it to MQ (Pulsar, Kafka or TubeMQ)
audit-store	Audit storage layer, supporting common JDBC protocol
audit-service	Audit service layer, providing aggregation, cache, OpenAPI and other capabilities

Audit Dimension


Machine IP	Container ID	Thread ID	Log time (minutes)	Audit ID	inlong_group_id	inlong_stream_id	Number of records	Size	Transmission delay (ms)
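The dimensions above can be read as a grouping key plus three metrics. The sketch below is an illustration, not the audit SDK's actual API: the class names, field names, and the choice to sum delays are assumptions. It shows how a reporter might accumulate the record count, size, and transmission delay per key before sending them on.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditKey:
    """Grouping dimensions from the table above (hypothetical field names)."""
    ip: str
    container_id: str
    thread_id: str
    log_minute_ms: int  # log time truncated to the minute, epoch milliseconds
    audit_id: int
    inlong_group_id: str
    inlong_stream_id: str


@dataclass
class AuditStat:
    """Metrics accumulated for one key."""
    count: int = 0
    size: int = 0
    delay_ms: int = 0  # assumed here to be the sum of per-record delays


def to_log_minute(epoch_ms):
    """Truncate an epoch-millisecond timestamp to the start of its minute."""
    return epoch_ms - epoch_ms % 60_000


class AuditAggregator:
    """In-memory accumulator; a real reporter would flush this periodically."""

    def __init__(self):
        self._stats = defaultdict(AuditStat)

    def add(self, key, size, delay_ms, count=1):
        stat = self._stats[key]
        stat.count += count
        stat.size += size
        stat.delay_ms += delay_ms

    def snapshot(self):
        """Return a plain dict copy of the current totals."""
        return dict(self._stats)


# Made-up sample: two records with the same dimensions merge into one entry.
key = AuditKey("10.0.0.1", "container-a", "thread-1",
               to_log_minute(125_000), 3, "group_a", "stream_a")
aggregator = AuditAggregator()
aggregator.add(key, size=100, delay_ms=5)
aggregator.add(key, size=200, delay_ms=7)
print(aggregator.snapshot()[key])
# AuditStat(count=2, size=300, delay_ms=12)
```

Making the key a frozen dataclass keeps it hashable, so records that share every dimension collapse into a single entry.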

Audit ID

Receiving and sending are audited separately: each module has one audit ID for the data it receives and another for the data it sends.

InLong Service Module	Audit ID
InLong API Received Successfully	1
InLong API Sent Successfully	2
InLong Agent Received Successfully	3
InLong Agent Sent Successfully	4
InLong DataProxy Received Successfully	5
InLong DataProxy Sent Successfully	6
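One way to work with these IDs in code is an enumeration plus a list of ID pairs to compare. The sketch below is illustrative only. The numeric values come from the table above; the enum member names, the `PAIRS` list (which assumes Agent output feeds DataProxy input), and the `find_gaps` helper are hypothetical.

```python
from enum import IntEnum


class AuditId(IntEnum):
    """Audit IDs from the table above (member names are made up here)."""
    API_RECEIVED = 1
    API_SENT = 2
    AGENT_RECEIVED = 3
    AGENT_SENT = 4
    DATAPROXY_RECEIVED = 5
    DATAPROXY_SENT = 6


# Assumed comparison pairs: (upstream audit ID, downstream audit ID).
PAIRS = [
    (AuditId.AGENT_RECEIVED, AuditId.AGENT_SENT),
    (AuditId.AGENT_SENT, AuditId.DATAPROXY_RECEIVED),
    (AuditId.DATAPROXY_RECEIVED, AuditId.DATAPROXY_SENT),
]


def find_gaps(counts):
    """Return (upstream, downstream, difference) for each pair that disagrees.

    `counts` maps an audit ID to the record count for one time bucket.
    """
    gaps = []
    for upstream, downstream in PAIRS:
        diff = counts.get(upstream, 0) - counts.get(downstream, 0)
        if diff != 0:
            gaps.append((upstream.name, downstream.name, diff))
    return gaps


# Made-up counts for one minute: 10 records vanish between Agent and DataProxy.
counts = {3: 1000, 4: 1000, 5: 990, 6: 990}
print(find_gaps(counts))
# [('AGENT_SENT', 'DATAPROXY_RECEIVED', 10)]
```

Because each direction of each module has its own ID, a mismatch points at the specific hop where records were lost or duplicated.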

Audit data storage

Audit Store can write to any storage component that is compatible with the JDBC protocol. The chosen component only needs to provide a table that matches the following schema:

CREATE TABLE IF NOT EXISTS `audit_data`
(
    `id`               int(32)      NOT NULL PRIMARY KEY AUTO_INCREMENT COMMENT 'Incremental primary key',
    `ip`               varchar(32)  NOT NULL DEFAULT '' COMMENT 'Client IP',
    `docker_id`        varchar(100) NOT NULL DEFAULT '' COMMENT 'Client docker id',
    `thread_id`        varchar(50)  NOT NULL DEFAULT '' COMMENT 'Client thread id',
    `sdk_ts`           TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'SDK timestamp',
    `packet_id`        BIGINT       NOT NULL DEFAULT '0' COMMENT 'Packet id',
    `log_ts`           TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Log timestamp',
    `inlong_group_id`  varchar(100) NOT NULL DEFAULT '' COMMENT 'The target inlong group id',
    `inlong_stream_id` varchar(100) NOT NULL DEFAULT '' COMMENT 'The target inlong stream id',
    `audit_id`         varchar(100) NOT NULL DEFAULT '' COMMENT 'Audit id',
    `audit_tag`        varchar(100)          DEFAULT '' COMMENT 'Audit tag',
    `audit_version`    BIGINT                DEFAULT -1 COMMENT 'Audit version',
    `count`            BIGINT       NOT NULL DEFAULT '0' COMMENT 'Message count',
    `size`             BIGINT       NOT NULL DEFAULT '0' COMMENT 'Message size',
    `delay`            BIGINT       NOT NULL DEFAULT '0' COMMENT 'Message delay count',
    `update_time`      timestamp    NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Update time',
    INDEX group_stream_audit_id (`inlong_group_id`, `inlong_stream_id`, `audit_id`, `log_ts`)
) ENGINE = InnoDB
  DEFAULT CHARSET = UTF8 COMMENT ='InLong audit data table';

id: Auto-incrementing integer that serves as the primary key;
ip: String of length 32 that represents the client's IP address;
docker_id: String of length 100 that represents the client's Docker ID;
thread_id: String of length 50 that represents the client's thread ID;
sdk_ts: TIMESTAMP type that represents the SDK timestamp, with a default value of the current timestamp;
packet_id: 64-bit integer that represents the ID of the data packet;
log_ts: TIMESTAMP type that represents the timestamp of the log, with a default value of the current timestamp;
inlong_group_id: String of length 100 that represents the ID of the target InLong group;
inlong_stream_id: String of length 100 that represents the ID of the target InLong stream;
audit_id: String of length 100 that represents the audit ID;
audit_tag: String of length 100 that represents the audit tag, with a default value of an empty string;
audit_version: 64-bit integer that represents the audit version, with a default value of -1;
count: 64-bit integer that represents the message count, with a default value of 0;
size: 64-bit integer that represents the message size, with a default value of 0;
delay: 64-bit integer that represents the message delay count, with a default value of 0;
update_time: TIMESTAMP type that represents the update time, with a default value of the current timestamp.
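As an illustration of how rows in this table could be reconciled, the sketch below sums `count` per audit ID for one group, stream, and log-time window. It is not part of InLong. It uses an in-memory SQLite table with a reduced column set as a stand-in for the MySQL schema, and the helper names and sample rows are made up.

```python
import sqlite3


def create_audit_table(conn):
    """Create a simplified SQLite stand-in for audit_data (subset of columns)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS audit_data (
            id               INTEGER PRIMARY KEY AUTOINCREMENT,
            ip               TEXT    NOT NULL DEFAULT '',
            log_ts           TEXT    NOT NULL,
            inlong_group_id  TEXT    NOT NULL DEFAULT '',
            inlong_stream_id TEXT    NOT NULL DEFAULT '',
            audit_id         TEXT    NOT NULL DEFAULT '',
            "count"          INTEGER NOT NULL DEFAULT 0,
            "size"           INTEGER NOT NULL DEFAULT 0,
            "delay"          INTEGER NOT NULL DEFAULT 0
        )
        """
    )


def totals_by_audit_id(conn, group_id, stream_id, start_ts, end_ts):
    """Sum record counts per audit ID for log_ts in [start_ts, end_ts)."""
    rows = conn.execute(
        """
        SELECT audit_id, SUM("count")
        FROM audit_data
        WHERE inlong_group_id = ? AND inlong_stream_id = ?
          AND log_ts >= ? AND log_ts < ?
        GROUP BY audit_id
        """,
        (group_id, stream_id, start_ts, end_ts),
    ).fetchall()
    return {audit_id: total for audit_id, total in rows}


# Made-up rows: two DataProxy hosts receive data (ID 5), one sends it on (ID 6).
conn = sqlite3.connect(":memory:")
create_audit_table(conn)
conn.executemany(
    """INSERT INTO audit_data
       (ip, log_ts, inlong_group_id, inlong_stream_id, audit_id, "count")
       VALUES (?, ?, ?, ?, ?, ?)""",
    [
        ("10.0.0.1", "2024-01-01 10:00:00", "group_a", "stream_a", "5", 100),
        ("10.0.0.2", "2024-01-01 10:00:00", "group_a", "stream_a", "5", 50),
        ("10.0.0.1", "2024-01-01 10:00:00", "group_a", "stream_a", "6", 150),
    ],
)
print(totals_by_audit_id(conn, "group_a", "stream_a",
                         "2024-01-01 10:00:00", "2024-01-01 10:01:00"))
```

The filter columns match the `group_stream_audit_id` index in the schema (`inlong_group_id`, `inlong_stream_id`, `audit_id`, `log_ts`), so a similar query against MySQL should be able to use that index, at least for its leading columns.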
