StarTree Looks To Scale Real-Time Analytics With New Management, Security Features
Legacy “batch” data processing systems have long included tools for managing performance and ensuring security. StarTree is bringing those same capabilities to real-time data analysis and AI operations.
StarTree is bolstering its real-time data analysis platform with a bundle of new capabilities, including performance management tools and data security features, that the company says are needed to accelerate the adoption of real-time analytics systems.
Many of the new features and practices being added to StarTree Cloud have long been available in legacy “batch data processing” systems. But with the growing use of real-time data systems for AI tasks and analytics, those capabilities are increasingly needed there as well, according to StarTree.
“Doing data management in real time presents a whole other set of challenges,” said Chad Meley, StarTree senior vice president of developer relations and marketing, in a briefing with CRN.
StarTree, founded in 2018 and based in Mountain View, Calif., markets its StarTree Cloud real-time analytics system for a range of customer-facing analytical applications including financial transaction analysis, social media engagement metrics, ad targeting, location-based services, dynamic/surge pricing, and video game leaderboards.
The company’s platform is based on Apache Pinot, an open-source, distributed, real-time analytics database that’s designed for highly concurrent, low-latency queries at large scale. (StarTree’s founders, CEO Kishore Gopalakrishna and founding engineer Xiang Fu, originally developed Pinot.)
Today the majority of data movement operations, such as pulling data from ERP and sales systems to load into a data warehouse for analysis, are done through “batch processing,” where data is collected and moved periodically: hourly, daily, weekly, and so on.
But analytical and AI systems today increasingly need data in real time or near-real time to be effective. “Everything is getting more and more real time. It used to be from days to hours to minutes, and now even minutes is not acceptable. Now we're going to seconds,” Chinmay Soman, head of product at StarTree, said in the CRN interview.
Soman also noted that “the scale of data has significantly changed.” Data volumes are bigger, data sets are larger, data consumption rates are faster, and more business users and applications are accessing and making use of that data, he said.
“What used to be done manually or in an inefficient manner is no longer acceptable, especially with real time systems,” Soman said. “We are seeing our customers push us in both the scale of data and then in how ‘realtime’ [analytics systems] can get.”
“I think we're early in the game in terms of real time,” Meley added, comparing the use of real-time data processing and analytics to traditional batch processing. “I think the big theme around what we're announcing here is to really get more mainstream adoption” of real-time analytics.
Meley said these new capabilities will help StarTree and its cloud service provider partners drive adoption of Pinot and StarTree for a broader range of applications. The expanded functionality is also expected to create opportunities for the systems integrators and development partners – he specifically mentioned EPAM – that StarTree is building partnerships with.
New capabilities in StarTree Cloud include:
-Pauseless Ingestion: Ensures data freshness by maintaining a continuous flow of data during data building and upload phases.
-Performance Manager: Using a machine learning interface, this automation feature simplifies the process of optimizing query performance.
-Schema Evolution: This capability allows the database system to accommodate new fields, indexes, altered data types and other structural modifications without disrupting operations.
-Data Backfill: This automated feature addresses data that is incorrect or missing because it failed to load or stream correctly. It makes it possible to reload data from past events, filling in data gaps and maintaining data integrity.
-Role-Based Access Control Management: RBAC allows organizations to assign and control user access to data based on their roles, ensuring security around sensitive data even when it’s being ingested and analyzed in sub-second windows.
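As a rough illustration of the role-based model described in the last item, the Python sketch below maps roles to the tables they may read and checks a user's request against that mapping. It is not StarTree's or Apache Pinot's API; the role names, table names, and the `can_read` helper are all hypothetical.

```python
# Hypothetical role-to-table grants; every name here is illustrative only
# and does not come from StarTree Cloud or Apache Pinot.
ROLE_GRANTS = {
    "analyst": {"ad_clicks", "leaderboard"},
    "finance": {"transactions"},
}

# Hypothetical assignment of users to one or more roles.
USER_ROLES = {
    "dana": {"analyst"},
    "lee": {"analyst", "finance"},
}


def can_read(user, table):
    """Return True if any role assigned to the user grants read access to the table."""
    roles = USER_ROLES.get(user, set())
    return any(table in ROLE_GRANTS.get(role, set()) for role in roles)
```

In this sketch access is decided by the roles a user holds rather than by per-user grants, so changing what a role may see updates every user who holds it. Unknown users and unknown roles fall through to "no access."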
The new capabilities in StarTree Cloud are currently in private preview and are expected to be generally available in the first quarter of 2025.