Apache InLong (Yinglong) recently released version 2.3.0, which resolved 59 issues, including 3 major features and 50+ optimizations. The release adds Transform support to the Sort Standalone module for the Kafka, HTTP, CLS, and ES subscription types, enables horizontal scaling for Audit Store, and optimizes the Transform SDK and DataProxy SDK. It also improves the operations and maintenance experience for Apache InLong and includes numerous other features.
About Apache InLong
As the industry's first one-stop, all-scenario massive data integration framework, Apache InLong (Yinglong) delivers automated, secure, reliable, and high-performance data transmission capabilities. It enables businesses to rapidly build stream-based data analysis, modeling, and applications. Currently, InLong is widely used across industries such as advertising, payment, social media, gaming, and artificial intelligence, serving thousands of business use cases. It handles over a million billion records/day in high-performance scenarios and over a hundred billion records/day in high-reliability scenarios.
The core positioning of InLong revolves around three keywords: "one-stop," "all-scenario," and "massive data." For "one-stop," InLong aims to shield technical complexities by providing complete data integration and supporting services for out-of-the-box usability. For "all-scenario," it offers comprehensive solutions covering common data integration scenarios in big data ecosystems. For "massive data," its architecture leverages data pipeline layering, fully extensible components, and built-in multi-cluster management to stably support data scales beyond millions of billions of records/day.
Overview of Version 2.3.0
Version 2.3.0 resolved 59 issues, including 3 major features and 50+ optimizations. Key enhancements include:
- Sort: Transform capabilities supported in Sort Standalone module for Kafka/HTTP/CLS/ES subscription types
- Audit: Audit Store supports horizontal scaling, improving system scalability and stability
- SDK: Performance and functional optimizations for Transform SDK and DataProxy SDK
This version also improves the operations and maintenance experience for Apache InLong. Other significant changes are listed below by module.
Dashboard Module
- Added grouping and stream switching features to the audit page, improving usability
- Fixed the issue of multiple API calls triggered by queries on the audit page, optimizing performance
Manager Module
- Added comprehensive audit alarm rule management API, supporting more granular alarm policy configuration
- Supported parsing transformation configurations into transformation SQL
- Enabled configuration support for source fields in receivers, enhancing flexibility
Agent Module
- The agent now supports parallel creation of sender connections to the DataProxy, significantly improving connection initialization efficiency
- Added loading functionality for the agent_ext.properties configuration file to prevent personalized configurations from being lost when agent.properties is overwritten during upgrades
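The idea behind an extension file can be sketched as a simple overlay: load the base settings, then load the extension settings on top. The example below is a minimal illustration, not the Agent's actual loading code. The override order (extension wins) and the keys `agent.example.port` and `agent.example.threads` are assumptions made for the example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class AgentConfigOverlay {

    // Loads the base settings first, then lets the extension settings
    // overwrite any duplicate keys (assumed precedence, for illustration).
    static Properties merge(Reader base, Reader ext) {
        Properties merged = new Properties();
        try {
            merged.load(base);
            if (ext != null) {
                merged.load(ext); // a later load replaces values of existing keys
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Hypothetical keys; in practice the readers would wrap
        // agent.properties and agent_ext.properties.
        String base = "agent.example.port=8008\nagent.example.threads=4\n";
        String ext = "agent.example.threads=16\n";
        Properties p = merge(new StringReader(base), new StringReader(ext));
        System.out.println(p.getProperty("agent.example.port"));
        System.out.println(p.getProperty("agent.example.threads"));
    }
}
```

Because the customized values live in a separate file, replacing agent.properties during an upgrade leaves them untouched.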
Sort Module
- Sort Standalone module supports Transform functionality for Kafka/HTTP/CLS/ES subscription types
- Deserialization process now supports returning the byte size of data in a single row, improving data processing visualization and accuracy
- Upgraded the Pulsar SDK to version 4.0.3, enhancing stability and functional compatibility
SDK Module
- TransformSDK now supports array index access; the WHERE clause supports the LIKE operator; the str_to_json function can convert KV format data to JSON format
- SortSDK defaults to retrieving GroupId and StreamId from unified metadata if these cannot be obtained from the InLongMsgV0 protocol, enhancing compatibility
- Optimized Golang SDK to fix potential data race issues, improving concurrency safety
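To give a feel for what a KV-to-JSON conversion involves, here is a small standalone sketch. It is not the TransformSDK implementation of str_to_json. The class name `KvToJsonSketch`, the separators passed in, and the choice to emit every value as a JSON string are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class KvToJsonSketch {

    // Splits "k1=v1&k2=v2" style text into an ordered map.
    // Both separators are supplied by the caller.
    static Map<String, String> parseKv(String text, String pairSep, String kvSep) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : text.split(Pattern.quote(pairSep))) {
            if (pair.isEmpty()) {
                continue; // skip empty segments, e.g. from empty input
            }
            int idx = pair.indexOf(kvSep);
            String key = idx < 0 ? pair : pair.substring(0, idx);
            String value = idx < 0 ? "" : pair.substring(idx + kvSep.length());
            fields.put(key, value);
        }
        return fields;
    }

    // Escapes backslashes and double quotes only; control characters
    // are not handled in this sketch.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Renders the parsed pairs as a flat JSON object with string values.
    static String kvToJson(String text, String pairSep, String kvSep) {
        StringBuilder json = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : parseKv(text, pairSep, kvSep).entrySet()) {
            if (!first) {
                json.append(',');
            }
            json.append('"').append(escape(e.getKey())).append("\":\"")
                .append(escape(e.getValue())).append('"');
            first = false;
        }
        return json.append('}').toString();
    }

    public static void main(String[] args) {
        System.out.println(kvToJson("name=alice&age=30", "&", "="));
    }
}
```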
Audit Module
- Audit service now supports enabling and disabling custom caching, meeting performance requirements for different scenarios
- Supports audit reconciliation by data stream group dimension, improving verification accuracy
- Added end-to-end reconciliation alarm capabilities, enabling timely detection of abnormal situations
- Added horizontal scaling for Audit Store, enhancing system load capacity
TubeMQ Module
- Fixed TubeMQ image build failure, ensuring stable image generation
Key Features of Version 2.3.0
Dashboard supports audit data reconciliation based on data stream groups
This feature enables querying audit data by data stream group, so that reconciliation can be performed at the group level.

Contributed by @wohainilaodou via INLONG-11894.
Audit Store supports horizontal scaling
When the scale of audit data reaches the hundreds of billions, a single Audit Store may come under performance pressure. This version introduces horizontal scaling to increase the system's capacity:

- A routing topology maps AuditId, GroupId, and StreamId to Audit Store clusters; routing rules can be configured flexibly with regular expressions.
- Each Audit Store writes audit data to the corresponding storage clusters (ClickHouse/StarRocks/MySQL) based on the routing topology.
- Audit Service queries the corresponding audit data based on the routing topology.
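The routing described above can be sketched as a list of regular-expression rules that map an (AuditId, GroupId, StreamId) triple to a storage cluster. This is an illustration only, not the Audit Store code. The class `AuditRouteSketch`, the first-match-wins policy, and the cluster names are assumptions, and the real configuration format is not shown in this post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

final class AuditRouteSketch {

    // One routing rule: three patterns plus the target cluster name.
    static final class Rule {
        final Pattern auditId;
        final Pattern groupId;
        final Pattern streamId;
        final String cluster;

        Rule(String auditId, String groupId, String streamId, String cluster) {
            this.auditId = Pattern.compile(auditId);
            this.groupId = Pattern.compile(groupId);
            this.streamId = Pattern.compile(streamId);
            this.cluster = cluster;
        }

        // A rule applies only if all three identifiers match in full.
        boolean matches(String a, String g, String s) {
            return auditId.matcher(a).matches()
                    && groupId.matcher(g).matches()
                    && streamId.matcher(s).matches();
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    AuditRouteSketch addRule(String auditId, String groupId, String streamId, String cluster) {
        rules.add(new Rule(auditId, groupId, streamId, cluster));
        return this;
    }

    // Returns the cluster of the first matching rule (assumed policy).
    Optional<String> route(String auditId, String groupId, String streamId) {
        for (Rule rule : rules) {
            if (rule.matches(auditId, groupId, streamId)) {
                return Optional.of(rule.cluster);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        AuditRouteSketch router = new AuditRouteSketch()
                .addRule("3|4", "ads_.*", ".*", "clickhouse-a")   // hypothetical cluster names
                .addRule(".*", ".*", ".*", "mysql-default");
        System.out.println(router.route("3", "ads_click", "stream1").orElse("none"));
        System.out.println(router.route("5", "pay_order", "stream1").orElse("none"));
    }
}
```

If the write path and the query path consult the same rule set, as the list above describes, the Audit Service reads each record from the cluster it was written to.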
Contributed by @doleyzi via INLONG-12009.
Sort supports dynamic Transform capabilities
InLong Transform enhances the distribution capabilities of InLong Sort. It adapts to complex and diverse data analysis scenarios on the distribution side, improves data quality and data collaboration, and simplifies the pre-processing needed before analysis, so that users can focus on the business value of their data. It parses multiple data formats (CSV, KV, Protobuf, Json/Bson, Avro, Yaml, XML, Parquet) and supports 180+ functions as well as arithmetic and logical operators. Transform logic can be changed dynamically while Sort is running, without interrupting or restarting the flow, and it can migrate seamlessly between Sort tasks, which enables elastic scaling and resource scheduling.
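One common way to change processing logic without stopping a flow is to keep the active transform behind an atomic reference and swap it in place. The sketch below illustrates that general pattern. It is not how Sort Standalone is implemented, and the class `HotSwapTransformSketch` and the string-to-string function standing in for a compiled Transform SQL are assumptions.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

final class HotSwapTransformSketch {

    // Holds the transform currently in effect; readers always see a complete function.
    private final AtomicReference<Function<String, String>> current;

    HotSwapTransformSketch(Function<String, String> initial) {
        this.current = new AtomicReference<>(initial);
    }

    // Called when new transform logic arrives; takes effect for subsequent records.
    void update(Function<String, String> next) {
        current.set(next);
    }

    // Called for every record; picks up whichever transform is active at that moment.
    String process(String record) {
        return current.get().apply(record);
    }

    public static void main(String[] args) {
        HotSwapTransformSketch pipeline = new HotSwapTransformSketch(s -> s.toUpperCase());
        System.out.println(pipeline.process("abc"));
        // Swap the logic while the pipeline object keeps running.
        pipeline.update(s -> new StringBuilder(s).reverse().toString());
        System.out.println(pipeline.process("abc"));
    }
}
```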

Contributed by @luchunliang via INLONG-11958, INLONG-11937, INLONG-11902, INLONG-11888.
Future Plans
In version 2.3.0, we enriched and improved InLong's operational capabilities, and we welcome everyone to try it. If you have additional scenarios or requirements, or run into problems during use, please feel free to open issues and pull requests. In future versions, the InLong community will continue to work on:
- Collection from more data sources
- Real-time synchronization for more data sources and data targets
- Reducing data duplication in weak network environments
We welcome contributions from developers interested in InLong!