Data observability startup Acceldata Inc. today is adding artificial intelligence-powered data reconciliation capabilities to its platform that automate many manually intensive tasks such as column alignment across diverse datasets, rule creation, and handling nested arrays and complex structures.
Acceldata’s platform provides insights into data quality, pipeline performance and infrastructure efficiency to support more reliable and efficient data operations. Functions include data discovery, monitoring, processing optimization and quality management.
Reconciliation has become more challenging as organizations ingest larger volumes of data into their data lakes for analysis and AI model training. “A common theme is to reconcile data as you move it from left to right to make sure it meets certain quality standards,” said Ashwin Rajeeva, Acceldata’s co-founder and chief technology officer. “Doing this across millions of data assets at scale is incredibly hard. At any point in that pipeline, we can validate that data is reconciled between sources and targets.”
The company is applying a commercial large language model fine-tuned to act on metadata and tackle some of the most common reconciliation issues. AI is used to automate column alignment across diverse datasets to reduce manual intervention and enable more precise data comparisons.
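The article does not describe how Acceldata's model aligns columns. As a much simpler stand-in for illustration only, the sketch below matches column names by string similarity using Python's standard `difflib` module rather than a language model. The function names, the sample column names and the 0.6 similarity cutoff are all assumptions.

```python
import difflib
import re


def normalize(name):
    """Lowercase a column name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def align_columns(source_cols, target_cols, cutoff=0.6):
    """Map each source column to its most similar unused target column.

    Hypothetical helper: a name-similarity heuristic, not Acceldata's method.
    Columns with no match at or above the cutoff map to None.
    """
    remaining = {normalize(t): t for t in target_cols}
    mapping = {}
    for col in source_cols:
        matches = difflib.get_close_matches(
            normalize(col), list(remaining), n=1, cutoff=cutoff
        )
        # Remove a matched target so it cannot be assigned twice.
        mapping[col] = remaining.pop(matches[0]) if matches else None
    return mapping


# Illustrative column names, not taken from any real dataset.
source = ["CustomerID", "order_dt", "amt"]
target = ["customer_id", "order_date", "amount", "region"]
alignment = align_columns(source, target)
print(alignment)
```

A heuristic like this breaks down when names carry no shared characters, which is the kind of case where a model that reads metadata would be expected to do better.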
Data engineers “have to figure out what data they’re moving, what columns are important, what is relevant and what transformations are running,” Rajeeva said. “Most of the data enterprises use comes in all sorts of formats. It’s not as easy as moving table A to staging area B. There’s often some sort of nested proprietary format that requires marshalling.”
Marshalling refers to transforming an object, data structure or memory representation into a format suitable for transmission, storage or integration with other systems.
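A minimal sketch of marshalling, assuming JSON as the target format: an in-memory object is converted to a string that can be stored or transmitted, then rebuilt on the other side. The `Order` class and its fields are hypothetical and stand in for whatever nested format a pipeline actually carries.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Order:
    # Hypothetical record with a nested list of line items.
    order_id: int
    customer: str
    items: list


def marshal(order):
    """Turn the in-memory object into a JSON string."""
    return json.dumps(asdict(order), sort_keys=True)


def unmarshal(payload):
    """Rebuild the object from its JSON representation."""
    return Order(**json.loads(payload))


order = Order(order_id=1, customer="acme", items=[{"sku": "A1", "qty": 2}])
payload = marshal(order)
restored = unmarshal(payload)
print(payload)
print(restored == order)
```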
From hours to minutes
Manual reconciliation typically involves analyzing data types and writing transformation scripts. “AI lets us expand our capabilities to where we just feed in some of the data formats and it can figure out what is important and relevant without having someone to do all the mapping,” he said. “Customers don’t need additional workforce or partners. They can just do the reconciliation on the platform without much effort.”
Rajeeva said customers testing the new features have reported that reconciliation times have dropped from four hours to 20 minutes. Acceldata doesn’t support all data formats out of the box but “we work with customers to figure out which formats they have. With a little bit of fine-tuning you can get it right in most cases,” he said. The company has 35 connectors to object stores, filesystems, enterprise databases and other sources.
Bulk reconciliation policy creation speeds up large-scale reconciliation by automating rule creation to ensure more consistent data validation across multiple sources and formats. AI can also be used to reconcile nested arrays, which are multidimensional data structures that enable complex data organization within a single array.
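One common way to compare nested arrays, sketched here as an assumption rather than a description of Acceldata's implementation, is to flatten each structure into path and value pairs and then diff the two sets of paths. The function names, the `MISSING` marker and the sample records are hypothetical.

```python
MISSING = "<missing>"  # marker for a path present on only one side


def flatten(value, prefix=""):
    """Yield (path, leaf) pairs for a nested structure of dicts and lists.

    Empty dicts and lists yield nothing in this simplified sketch.
    """
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}.{key}" if prefix else str(key))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{prefix}[{index}]")
    else:
        yield prefix, value


def diff_nested(source, target):
    """Return {path: (source_value, target_value)} for every mismatched leaf."""
    src = dict(flatten(source))
    tgt = dict(flatten(target))
    mismatches = {}
    for path in sorted(src.keys() | tgt.keys()):
        left = src.get(path, MISSING)
        right = tgt.get(path, MISSING)
        if left != right:
            mismatches[path] = (left, right)
    return mismatches


source_record = {"id": 7, "readings": [[1, 2], [3, 4]]}
target_record = {"id": 7, "readings": [[1, 2], [3, 5]]}
print(diff_nested(source_record, target_record))
# {'readings[1][1]': (4, 5)}
```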
The platform also supports “upsert” processing, which allows data systems to either update an existing record or insert a new one depending on whether the record already exists in the database. That feature is useful in transactional scenarios where appending new data to existing records, a common practice in data lakes, creates consistency problems. Acceldata uses snapshots and hash-based equality, a way to determine whether objects are equal, to figure out automatically whether the contents of a column should be updated or appended.
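The general idea behind hash-based equality in an upsert can be sketched as follows: keep a snapshot that maps each record key to a hash of its contents, then hash each incoming row and compare. This is an illustration under stated assumptions, not Acceldata's code. The choice of SHA-256 over canonical JSON, the function names and the sample rows are all hypothetical.

```python
import hashlib
import json


def row_hash(row):
    """Hash a row's contents using a canonical JSON encoding."""
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def plan_upsert(snapshot, incoming, key):
    """Classify incoming rows against a snapshot of {key: content hash}."""
    plan = {"insert": [], "update": [], "unchanged": []}
    for row in incoming:
        previous = snapshot.get(row[key])
        if previous is None:
            plan["insert"].append(row[key])      # key not seen before
        elif previous != row_hash(row):
            plan["update"].append(row[key])      # key exists, contents changed
        else:
            plan["unchanged"].append(row[key])   # identical to the snapshot
    return plan


# Snapshot taken from the target before the new batch arrives.
snapshot = {
    1: row_hash({"id": 1, "status": "open"}),
    2: row_hash({"id": 2, "status": "open"}),
}
incoming = [
    {"id": 1, "status": "open"},
    {"id": 2, "status": "closed"},
    {"id": 3, "status": "open"},
]
plan = plan_upsert(snapshot, incoming, key="id")
print(plan)
# {'insert': [3], 'update': [2], 'unchanged': [1]}
```

Comparing hashes instead of full rows keeps the snapshot small, at the cost of not showing which column changed.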
Rajeeva said AI-assisted reconciliation is optional and comes with a learning curve. “With a little bit of training, we’ve observed that most customers are able to start moving some of the reconciliation workloads within six weeks to 12 weeks,” he said. “What used to take a lot of time, effort and coding can now be done very quickly so you can focus on business problems rather than process problems.”
Image: Unsplash