Skip to main content

Release 1.3.0

· 4 min read

Apache InLong is a one-stop integration framework for massive data that provides automatic, secure and reliable data transmission capabilities. InLong supports both batch and stream data processing at the same time, which offers great power to build data analysis, modeling and other real-time applications based on streaming data.

1.3.0 Features Overview

The just-released 1.3.0 version closes about 410+ issues, contains 110+ features and 170+ optimizations. Mainly include the following:

Enhance management and control capabilities

  • Added permission authentication for Open Api
  • Added cluster heartbeat mechanism for Agent and DataProxy
  • Manager adapts to two roles such as user and approver
  • Abstract operations on Load nodes to support easy scaling of Load node resources
  • Supports creation of databases and tables for SQLServer, Oracle and MySQL
  • Enhanced functionality of the Manager client, including but not limited to user and data node management

Extended collection node

  • Support for collecting data in TubeMq
  • Support for collecting data in Redis
  • Support for collecting data in Doris
  • Support to collect data in Pulsar without AdminUrl

Optimize write node

  • Kafka Sink supports All Changelog Mode
  • JDBC Sink supports All Changelog Mode

Support data conversion

  • Support Union operator
  • Support encrypted Udf
  • Support Json Udf
  • Support Temporal Join
  • Support Lookup Join
  • Support Interval Join

Strengthen Agent function

  • Support regular expression custom line break: default "\n" line ending mark, custom regular matching line ending mark can realize multi-line merging and folding
  • Support K8s log collection and carry cluster information
  • Supports standard output, node log collection, and will carry container and cluster information for standard output
  • Support full and incremental collection of file content
  • Supports automatic heartbeat reporting and registration to Manager
  • Support custom IP and get IP automatically

Other optimizations

  • GitHub Action check, pipeline optimization
  • DataProxy improves monitoring capabilities such as auditing and indicator reporting
  • DataProxy adds c++ sdk data reporting capability
  • Sort Support metrics report and audit report

1.3.0 Features Details

Abstracting Load node operations

Manager abstracts Load nodes to support easy expansion of Load node resources and greatly reduce the development time of a Load node This part of the feature was contributed by @ciscozhou

Add permission authentication for Manager Open Api

In the old version, the Manager Open Api can be accessed anonymously, and in the new version, it is implemented using the Apache Shiro framework. Login authentication method based on Basic Access Authentication, this part of the function was contributed by @woofyzhao

Enhanced collection of file data and k8s logs

Version 1.3.0 enhances the collection of file data and k8s data, in which file collection supports regular expression custom line breaks, so that multiple lines can be merged and folded In addition, the new version of Agent supports full and incremental collection of file content. This part of the function was contributed by @ganfengtan

DataProxy adds c++ sdk capability

In addition to the current java client, DataProxy has added c++ client capabilities, which are provided by @pocozh

Supports multiple udf and join operators

The new version of Sort supports three kinds of Temporal Join\ Lookup Join \ Interval Join, this part of the function was contributed by @yunqingmoswu Most community users mentioned the need for encryption and decryption and Json Udf, this part of the function was contributed by @Emsnap and @Emhui

Sort connector supports indicator reporting function

The new version of Sort Connector supports Flink built-in indicator reporting of various Connectors. External indicator systems such as Prometheus can directly obtain the number and rate of task data read and write. In addition, the new version also supports InLong Audit Audit data reporting, which is contributed by @pacigong, @Emsnap, @thesumery @Oneal65 @yunqingmoswu

Manager supports the creation of resources in multiple flow directions

In version 1.3.0, Manager added the creation of some storage resources:

  • Create Topic for SQLServer
  • Create Oracle libraries and tables
  • Create MySQL namespaces and tables

The above are all contributed by community member @haibo-duan, thanks

Other features and bug fixes

For related content, please refer to the Release Notes, which details the features, enhancements and bug fixes of this release.

Apache InLong follow-up planning

In subsequent versions, we will expand more data sources and storages to cover more usage scenarios, and gradually improve the usability and robustness of the system, including:

  • Agent adds Redis, CloudEvents, MongoDB collection types
  • Unified DataProxy MQ framework
  • Full support for Apache Kafka