Skip to main content

Version 2.2.0 Released

· 4 min read

Apache InLong (应龙) recently released version 2.2.0, which resolved 87 issues, including 3 major features and 80+ optimizations. Key enhancements focus on improving Agent stability and resource management in data collection, multi-language SDK support with performance optimizations, and enhanced traffic control capabilities for Sort. This version also optimizes the operational and maintenance experience for Apache InLong. Additionally, numerous other features were implemented in Apache InLong 2.2.0.

About Apache InLong

As the industry's first one-stop, all-scenario massive data integration framework, Apache InLong (Yinglong) delivers automated, secure, reliable, and high-performance data transmission capabilities. It enables businesses to rapidly build stream-based data analysis, modeling, and applications. Currently, InLong is widely used across industries such as advertising, payment, social media, gaming, and artificial intelligence, serving thousands of business use cases. It handles over a million billion records/day in high-performance scenarios and over a hundred billion records/day in high-reliability scenarios.

The core positioning of InLong revolves around three keywords: "one-stop," "all-scenario," and "massive data." For "one-stop," InLong aims to shield technical complexities by providing complete data integration and supporting services for out-of-the-box usability. For "all-scenario," it offers comprehensive solutions covering common data integration scenarios in big data ecosystems. For "massive data," its architecture leverages data pipeline layering, fully extensible components, and built-in multi-cluster management to stably support data scales beyond millions of billions of records/day.

Overview of Version 2.2.0

Apache InLong (应龙) recently released version 2.2.0, which resolved 87 issues, including 3 major features and 80+ optimizations. Key enhancements include:

  • Agent: Improved stability in data collection and resource management capabilities
  • SDK: Multi-language support and performance optimizations
  • Sort: Enhanced traffic control capabilities

This version also optimizes the operational and maintenance experience for Apache InLong. Other significant features are detailed below.

Dashboard Module

  • Fixed page refresh issues after switching
  • Optimized user login verification logic

Manager Module

  • Added support for SQL data sources
  • Enhanced JDBC validation
  • Enabled configuration of Inlong Properties metadata for Pulsar data sources

Agent Module

  • Optimized instance lifecycle control mechanism
  • Introduced global instance count limits
  • Fixed duplicate file collection issues
  • Enhanced recovery capabilities in exceptional scenarios
  • Improved accuracy in log collection

Sort Module

  • Introduced traffic control for SortStandalone
  • Optimized escaping processing for KV/CSV formats
  • Added Exactly Once audit reporting for MySQL CDC

SDK Module

  • Go SDK: Added connection pool lifetime limits to improve stability in high-concurrency scenarios
  • Optimized JAVA SDK implementations
  • SortSDK: Shared a single PulsarClient across SortTasks to avoid performance bottlenecks from excessive clients

Audit Module

  • Supported CDC auditing for MySQL Binlog

TubeMQ Module

  • Fixed intermittent consumption interruption issues
  • Saved consumption offsets to local files

Key Features of Version 2.2.0

DataProxy SDK Refactoring

The original SDK provided two separate Sender objects for TCP and HTTP reporting protocols, with independent logic and high maintenance costs. Error returns were coarse-grained without detailed error categorization.

This version reconstructs the DataProxy Java SDK to reduce user complexity:

  • Decoupled Sender construction and initialization: Heavy operations (e.g., thread pool initialization, metadata fetching) are moved from the constructor to a new start() method, allowing faster object creation and explicit startup control.

  • TCP reporting with QoS: Three modes are introduced:

    • QoS 0: SDK sends continuously without awaiting responses; DataProxy forwards requests silently.2.2.0-sdk-qos0
    • QoS 1: SDK awaits DataProxy acknowledgment before DataProxy forwards to MQ. 2.2.0-sdk-qos1
    • QoS 2: SDK awaits MQ-level confirmation from DataProxy 2.2.0-sdk-qos2

    Contributed by @gosonzhang via INLONG-11729 and INLONG-11720.

Enhanced Agent Stability and Resource Management

Agent now enforces a global instance limit (default: 100, configurable via agent.instance.limit). Requests to add tasks are rejected if they exceed this limit, preventing resource exhaustion.

2.2.0-agent-global-control

Contributed by @justinwwhuang via INLONG-11760 and INLONG-11776.

Sort Traffic Control Enhancements

Previously, SortStandalone allocated up to 1/3 of resources to new tasks, leading to idle resource waste. Version 2.2.0 introduces:

  • Lazy-loaded Flow Controller: Physical resources are allocated only when a task's data stream is consumed. 2.2.0-sort-resource-assign-1
  • Dynamic resource allocation: Resources scale based on actual traffic demand. 2.2.0-sort-resource-assign-2
  • Failure handling: Blocks data flow for high-failure-rate tasks to prevent systemic degradation. 2.2.0-sort-resource-assign-3

Contributed by @luchunliang via INLONG-11836 and INLONG-11841.

Future Plans

In version 2.1.0, we have enriched and improved our operational capabilities. Welcome everyone to use it. If you have more scenarios and requirements, or encounter any problems during use, please feel free to raise issues and PR. In future versions, the InLong community will continue to:

  • Supporting more data source connectors
  • Expanding real-time synchronization sources and targets
  • Optimizing SDK capabilities and usability
  • Improving Dashboard experience

We welcome contributions from developers interested in InLong!