Snowflake launches Cortex Analyst, an agentic AI system for accurate data analytics

Snowflake is putting powerful language models to work on complex data tasks. Today, the company announced that Cortex Analyst, a new agentic AI system for self-service analytics, is moving into public preview.

First announced at the company’s Data Cloud Summit in June, Cortex Analyst is a fully managed service that gives businesses a conversational interface to their data. Users simply ask business questions in plain English, and the agentic AI system handles the rest: converting the prompt into SQL, querying the data, running checks and returning the required answer.

Snowflake’s head of AI, Baris Gultekin, tells VentureBeat that the offering uses multiple large language model (LLM) agents working in tandem to deliver insights with an accuracy of about 90%. He claims this is far better than the accuracy of existing LLM-powered text-to-SQL offerings, including that of Databricks, and can easily accelerate analytics workflows, giving business users instant access to the insights they need to make critical decisions.

Simplifying analytics with Cortex Analyst

Even as enterprises double down on AI-powered generation and forecasting, data analytics continues to play a transformative role in business success. Organizations extract valuable insights from historical structured data, organized in tables, to make decisions across domains such as marketing and sales.

Today, however, analytics is still largely driven by business intelligence (BI) dashboards that use charts, graphs and maps to visualize data. The approach works well but can be rigid: users struggle to drill deeper into specific metrics and depend on often-overwhelmed analysts for follow-up insights.

“When you have a dashboard and you see something wrong, you immediately follow with three different questions to understand what’s happening. When you ask these questions, an analyst will come in, do the analysis and deliver the answer within a week or so. But, then, you may have more follow-up questions, which may keep the analytics loop open and slow down the decision-making process,” Gultekin said.

To close this gap, many vendors began exploring large language models, which have proven adept at unlocking insights from unstructured data (think long PDFs). The idea was to pass the raw structured data schema to the models so they could power a text-to-SQL conversational experience, allowing users to instantly talk to their data and ask relevant business questions.

However, as these LLM-powered offerings appeared, Snowflake noted one major problem: low accuracy. According to the company’s internal benchmarks, which are representative of real-world use cases, using state-of-the-art models like GPT-4o directly yielded analytical insights with about 51% accuracy, while dedicated text-to-SQL solutions, including Databricks’ Genie, reached about 79%.

“When you’re asking business questions, accuracy is the most important thing. Fifty-one percent accuracy is not acceptable. We were able to almost double that to about 90% by tapping a series of large language models working closely together (for Cortex Analyst),” Gultekin noted.

When integrated into an enterprise application, Cortex Analyst takes in business queries in natural language and passes them through LLM agents operating at different stages to produce accurate, hallucination-free answers grounded in the enterprise’s data in the Snowflake Data Cloud. These agents handle distinct tasks: analyzing the intent of the question and determining whether it can be answered, generating and running the SQL query, and checking the correctness of the answer before it is returned to the user.

“We’ve built systems that understand if the question is something that can be answered or ambiguous and cannot be answered with accessible data. If the question is ambiguous, we ask the user to restate and provide suggestions. Only after we know the question can be answered by the large language model, we pass it ahead to a series of LLMs, agentic models that generate SQL, reason about whether that SQL is correct, fix the incorrect SQL and then run that SQL to deliver the answer,” Gultekin explains.
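To make the staged flow Gultekin describes more concrete, here is a minimal sketch of how such a pipeline of cooperating LLM agents might be wired together. The function names, prompts and the single call_llm placeholder are assumptions made for this sketch, not Snowflake’s actual implementation.

```python
# Illustrative sketch of a staged text-to-SQL pipeline. All names here
# (call_llm, classify_intent, generate_sql, validate_and_fix) are hypothetical
# placeholders, not Snowflake APIs.
from typing import Callable

def classify_intent(call_llm: Callable[[str], str], question: str, semantic_model: str) -> bool:
    """Stage 1: decide whether the question can be answered with the available data."""
    verdict = call_llm(
        f"Semantic model:\n{semantic_model}\n"
        f"Can this question be answered unambiguously from the data above? "
        f"Reply YES or NO.\nQuestion: {question}"
    )
    return verdict.strip().upper().startswith("YES")

def generate_sql(call_llm: Callable[[str], str], question: str, semantic_model: str) -> str:
    """Stage 2: draft a SQL query grounded in the semantic model."""
    return call_llm(f"Semantic model:\n{semantic_model}\nWrite SQL answering: {question}")

def validate_and_fix(call_llm: Callable[[str], str], sql: str, semantic_model: str) -> str:
    """Stage 3: a second pass reasons about correctness and repairs the query if needed."""
    return call_llm(f"Semantic model:\n{semantic_model}\nCheck and fix this SQL:\n{sql}")

def answer(call_llm: Callable[[str], str], run_query: Callable[[str], list],
           question: str, semantic_model: str):
    """Chain the stages: refuse ambiguous questions, otherwise generate, check, run."""
    if not classify_intent(call_llm, question, semantic_model):
        return "The question is ambiguous for the available data; please restate it."
    sql = generate_sql(call_llm, question, semantic_model)
    sql = validate_and_fix(call_llm, sql, semantic_model)
    return run_query(sql)  # Stage 4: execute against the warehouse and return rows
```

The key design point in this sketch is that no single model is trusted end to end: one stage decides whether the question is answerable, another drafts the SQL, and a third critiques and repairs it before anything runs against the warehouse.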

The AI head did not share the exact specifics of the models powering Cortex Analyst, but Snowflake has confirmed it uses a combination of its own Arctic model as well as models from Mistral and Meta.

How exactly does it work?

To ensure the LLM agents behind Cortex Analyst understand the complete schema of a user’s data and provide accurate, context-aware responses, the company requires customers to provide semantic descriptions of their data assets during setup. This addresses a major shortcoming of raw schemas and enables the models to capture the intent of the question, including the user’s vocabulary and specific jargon.

“In real-world applications, you have tens of thousands of tables and hundreds of thousands of columns with strange names. For example, ‘Rev 1 and Rev 2’ could be iterations of what might mean revenue. Our customers can specify these metrics and their meaning in the semantic descriptions, enabling the system to use them when providing answers,” Gultekin added.
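As a rough illustration of what such a semantic description could look like, the sketch below maps cryptic physical column names like REV_1 to the business vocabulary users actually ask about. The table name, fields and structure are invented for this example; Snowflake’s actual semantic model format may differ.

```python
# Hypothetical semantic description: physical columns are mapped to business
# terms, descriptions and synonyms so the models can resolve user vocabulary.
# Every name below is invented for illustration.
semantic_model = {
    "table": "FINANCE.REPORTING.QUARTERLY_RESULTS",
    "measures": [
        {"name": "revenue",
         "expr": "REV_1",
         "description": "Recognized revenue in USD for the reporting period",
         "synonyms": ["sales", "top line"]},
        {"name": "adjusted_revenue",
         "expr": "REV_2",
         "description": "Revenue after post-close adjustments",
         "synonyms": ["revised revenue"]},
    ],
    "dimensions": [
        {"name": "region", "expr": "RGN_CD",
         "description": "Sales region code, e.g. NA, EMEA, APAC"},
    ],
}
```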

As of now, the company is providing access to Cortex Analyst as a REST API that can be integrated into any application, giving developers the flexibility to tailor how and where their business users tap the service and interact with the results. There’s also the option of building dedicated Streamlit apps with Cortex Analyst as the central engine.
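As a sketch of how an application might call such a REST endpoint, the snippet below posts a natural-language question along with a reference to the semantic model and reads back the response. The account URL, endpoint path and payload fields are assumptions for illustration; the exact contract should be taken from Snowflake’s API documentation.

```python
import requests

# Illustrative client call to a Cortex Analyst-style REST endpoint. The URL,
# path and payload shape are assumptions; verify against Snowflake's docs.
ACCOUNT_URL = "https://<account_identifier>.snowflakecomputing.com"  # placeholder

def ask_analyst(question: str, semantic_model_file: str, token: str) -> dict:
    response = requests.post(
        f"{ACCOUNT_URL}/api/v2/cortex/analyst/message",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        json={
            "messages": [
                {"role": "user", "content": [{"type": "text", "text": question}]}
            ],
            "semantic_model_file": semantic_model_file,
        },
        timeout=60,
    )
    response.raise_for_status()
    return response.json()  # expected to include the generated SQL and the answer
```

An application, or a Streamlit app, can then render the returned SQL and result set however it likes, which is the flexibility the REST-first design is meant to provide.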

In the private preview, about 40-50 enterprises, including pharmaceutical giant Bayer, deployed Cortex Analyst to talk to their data and accelerate analytical workflows. The public preview is expected to increase that number, especially as enterprises look to adopt LLMs without breaking the bank. The service gives companies the power of LLMs for analytics without the implementation hassle and cost overhead.

Snowflake also confirmed Cortex Analyst will gain more features in the coming days, including support for multi-turn conversations for a more interactive experience and for more complex tables and schemas.

Database Admins See Brighter Job Prospects Amid IT Challenges

The field of database administration is far from glamorous. Even within the uncompromisingly geeky domain of information technology, database administration would be among the last fields picked for a game of proverbial kickball.

However, the future employment landscape for tech jobs remains uncertain in these changing times. Despite its somewhat unglamorous reputation, database management and administration is experiencing rapid growth, persistent talent shortages, and significant changes brought by AI. The field holds real potential for those willing to pursue it as a career path.

These watershed moments invariably present significant opportunities as well as challenges. Percona, an open-source database software, support, and services firm, optimizes how databases and applications run. Dave Stokes, a technology evangelist and database veteran at the company, is passionate about helping today’s aspiring database administrators (DBAs) find their way.

Stokes has decades of experience in the DBA field. He often speaks on a wide range of cutting-edge topics regarding database operations. With his finger on the IT pulse, he offers the knowledge and expertise needed to mentor effectively.

Reshaping the Business World’s Database Needs

Perusing any industry report on the role of DBA in 2024 confirms that emerging trends and data solutions make for an ever-changing IT landscape. Today’s data environment generates more than two quintillion bytes of data per day.

Demands for higher-quality data and real-time results push various database platforms to their limits. As a result, DBAs require increasingly sophisticated skill diversification.

Artificial intelligence is only one factor. Other challenges include managing on-premises operations and handling cloud migration and security. Evolution, adaptation, and innovation define the DBA role's shifting landscape.

Open-source databases are now more popular than their commercial counterparts. More organizations are dependent on PostgreSQL and MySQL, according to Stokes.

“This standardization on the choice of databases provides DBAs with more employment opportunities and gives employers a vast talent pool from which to fish,” he said.

Insider’s View of the World of Database Administration

We spoke with Stokes extensively about his view of the state of database management. He observed that the traditional DBA is virtually nonexistent “in the wild” today.

Dave Stokes, Percona Tech Evangelist

A significant portion of conventional work has been moved to site reliability engineers or cloud providers’ services. As databases have mushroomed in size and scope, some functions, such as query optimization, have been ignored.

“Much of what was handled by a DBA is now compensated for by purchasing ever larger cloud chunks,” Stokes told TechNewsWorld. “Institutional knowledge about an organization’s data was abandoned when the DBA role was supplanted.”

Until that impact escalates, trying to figure out how some data is structured and how it impacts ongoing operations is a tertiary consideration, he observed.

But there is good news about where DBA is headed. Some data workers are interested in the functions performed by a DBA, even if they do not have that title.

“Query tuning, defining data structures, server optimization, and administration of the instance itself have value,” he said.

Demystifying DBA Dilemmas: Q&A

Dave Stokes shared more insights on the latest trends, technologies, and challenges in the field of database administration. From the impact of cutting-edge technologies to the evolving role of DBAs, Stokes offered valuable perspectives on navigating the complex landscape of database management.

TechNewsWorld: What cutting-edge database technology is impacting this field?

Dave Stokes: Vector data for machine learning will consume unprecedented amounts of disk space, processor cycles, and administrative time. Moving a copy of a model to another location for training will incur expensive transfer fees, require monitoring, and take up even more disk space.

JSON is the data interchange format of choice for most. Storing data in JSON format is not as efficient as storing it in traditional data types. Extracting some JSON values and storing them as traditional data types can speed processing but adds complexity.
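As a small, self-contained illustration of that trade-off (shown here with SQLite's JSON functions rather than any particular production database), the raw JSON document is kept as-is while one frequently queried value is surfaced as a plain, indexable column:

```python
import sqlite3

# Keep the raw JSON payload, but materialize one hot value ("total") as a
# stored generated column so filters and indexes don't reparse JSON each time.
# Requires a SQLite build with generated-column and JSON support (3.31+).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id      INTEGER PRIMARY KEY,
        payload TEXT,   -- raw JSON document as received
        total   REAL GENERATED ALWAYS AS (json_extract(payload, '$.total')) STORED
    );
    CREATE INDEX idx_orders_total ON orders(total);
""")
conn.execute(
    "INSERT INTO orders (id, payload) VALUES (1, '{\"total\": 42.5, \"items\": 3}')"
)
print(conn.execute("SELECT total FROM orders WHERE total > 40").fetchall())  # [(42.5,)]
```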

Replicated data over several data centers is very common. Managing that data, spread out over a continent or globe, is tricky.

How are changing business and industry trends affecting database administration?

Stokes: The ability to add more processing power or disk space by clicking a checkbox on a webpage and paying for it with a credit card has revolutionized data administration. There is no more worrying about getting approval for a capital expenditure, capacity planning, or sweating the optimizations needed.

Lead time for expansion is now nonexistent. Where a company may have had a dozen databases at the beginning of this century, it can now have tens of thousands of them.

Need to expand into AI? Then load billions of records into a cloud account and worry about the quality and quantity later. And as data lakes creep into data oceans, the data still needs to be managed, backed up, and monitored.

How will automation and AI impact changes in DBA?

Stokes: AI is needed in the database itself. However, general AI adoption by an organization means more disk space, processor cores, data migrations, and backups.

An optimizer that can spot data usage patterns and recognize the need to cache specific data or autotune buffer usage would be a big plus. Smarter query optimizations and user usage patterns could shift server capacities to accommodate data needs.

How is the shift in DBA from on-site to the cloud impacting business, or is it vice-versa?

Stokes: Those who could move their data to the cloud have found a big benefit in many cases. Scaling is now a function of using a credit card. Backups, server failovers, and software upgrades are handled by the cloud vendor.

For many, the need for an in-house DBA has been replaced by a dependency on their cloud provider. Some have found the cloud too expensive and have returned to on-premises operations. In these cases, they need to have staff to handle the traditional work of a DBA.

What role is cloud migration now playing in the cost and efficiency of DB operations?

Stokes: Costs are steadily increasing. Businesses used to be reluctant to spend money on capital expenditures, and upgrading servers was a complicated process, often taking months.

In the cloud, upgrades are an operational budget expense done on a credit card almost instantaneously. Why optimize data or a server when itemizing a bill is easier and faster?

Why is DBA still a relevant job in 2024?

Stokes: Although the title may not say DBA, someone will always be needed to monitor, tune, optimize, and guide database instances. These may be seen as hygiene factors, but the reliability of the data requires them.

How can young professionals find success in this industry?

Stokes: Learn Structured Query Language (SQL). There is a reason why it is the only computer language that survived from the 1970s. It matches business logic exceptionally well and is designed to deliver the information requested in a way it can be used.

Data normalization is also critical. Poorly defined data structures are slow and become challenging to manage over time.
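As a toy example of that point (the schema below is invented for illustration), compare a flat table that repeats customer details on every order with a normalized design that stores each customer once:

```python
import sqlite3

# Hypothetical schemas: the denormalized table repeats customer details on
# every order row, while the normalized design stores each customer once and
# references it by key, which keeps the data consistent and easier to manage.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Denormalized: name/email repeated (and easily inconsistent) per order.
    CREATE TABLE orders_flat (
        order_id       INTEGER PRIMARY KEY,
        customer_name  TEXT,
        customer_email TEXT,
        amount         REAL
    );

    -- Normalized: customer facts live in one place; orders point to them.
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        email TEXT NOT NULL UNIQUE
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL NOT NULL
    );
""")
```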

Lastly, communication is key. The ability to express why a change to a table that seems simple to a requester can shut down a petabyte of information for hours can save an organization from disaster.

Where do you see DBA and its needs headed?

Stokes: Better data backup and faster data restoration are always needed. Much attention is paid to the time and financial costs of recovering data, and there will be a push to reduce these costs.

Security enhancements will be pursued. It is still too easy to have a minor slip-up, either in the cloud or on-premises, that leads to finding your data on the front page of a newspaper.

DBAs will need better tools to handle the explosive growth in the scope and scale of the instances they manage.

CIO 100 Award winners drive business results with IT

“This is a new way to interact with the web and search. It’s not just returning a list of webpages, but it’s [giving users] richer content, and the content that’s produced through the generative AI search results allows [users] to go to parts of the web property that is the genesis of that information so they can dig deeper if they want,” says Michael A. Pfeffer, SVP and chief information and digital officer, as well as associate dean and clinical professor of medicine. “It’s an exciting new way of finding information.”

Moderna’s own gen AI product democratizes use of AI across company

Organization: Moderna

Project: mChat (Moderna Chat)

IT leader: Brad Miller, CIO

Moderna quickly seized on the potential of generative AI with its creation of mChat. Shorthand for Moderna Chat, mChat is a home-built generative AI client for large language models (LLMs) such as GPT, Claude, and Gemini.

Moderna launched mChat to offer employees access to gen AI models in a highly secure, private, user-friendly interface.

From the start of this initiative, Moderna’s IT leadership recognized the privacy concerns associated with using an external platform such as ChatGPT. So CIO Brad Miller, along with head of AI engineering Andrew Giessel and others, opted to create an alpha application using OpenAI’s backend API, with a zero data retention architecture to protect the company’s data.

This enabled the team to expose the technology to a small group of senior leaders to test. Following a strict data control and security review and that successful test with senior leaders, the company decided to roll out mChat to all employees just six weeks later.

To help ensure adoption, the company used its AI Academy and created a transformation team to train employees on its use and potential. Such efforts paid off, as nearly half of Moderna employees were actively using mChat within two months after its launch, and nearly 65% just months later. Moreover, that 65% represented nearly all employees who had access to devices that could use mChat.

Moderna, which continues to evolve the application, considers mChat transformational.

“Embedding AI into your workforce by upskilling all employees can lead to a dramatic increase in the value each employee brings, while allowing people to focus on the work that really matters, massively improving productivity,” says Brice Challamel, VP of AI products and platforms at Moderna.

Data transformation gives Neighborly competitive advantage

Organization: Neighborly

Project: A New Analytics Era — a Transformative Journey for the Home Services Industry

IT leader: Amer Waheed, CTO

After using experience and intuition to make decisions for most of its first 40-plus years of existence, Neighborly is embracing a new data platform that enables data-driven decision-making.

The company started its New Analytics Era initiative by migrating its data from outdated SQL servers to a modern AWS data lake. It then built a cutting-edge cloud-based analytics platform, designed with an innovative data architecture. It also crafted multiple machine learning and AI models to tackle business challenges. And it created a new dashboard portal in QuickSight to provide a comprehensive view to track the results of each implemented action. This democratized data and disseminated crucial business insights across the entire organization.

“We created a platform to ingest, process, and get value from the data, so we could understand what the data is telling us,” explains Neighborly CTO Amer Waheed.

Waheed says creating a data science team, led by Karen Nogueira, VP of data and analytics, was instrumental to success. So was articulating the business value the data platform could deliver.

Fully deployed after several years of work, the platform allows Neighborly, a home services company, to detail and understand a customer’s journey and expectations, and thus to tailor services to better meet customer needs. Those benefits improve customer satisfaction, support franchise owners, and help Neighborly grow its business.

“The project is really about a whole new way of doing business,” Waheed says.

Nogueira says the new data platform gives Neighborly a competitive advantage, helping the company “increase efficiency, reduce costs, generate more revenue, and ultimately get results faster.”

Novva’s water-free cooling system improves data center sustainability, performance

Organization: Novva Data Centers

Project: Colorado Springs Data Center’s Innovative Water-Free Cooling System Saves Millions of Gallons of Water Annually

IT leader: Steve Boyce, Vice President of Mission Critical

Novva Data Centers scored a sustainability win at its Colorado Springs facility by implementing a proprietary water-free cooling system.

The company used elevated floors, surrounding air, and heat exchange coils to create a system that cools the facility’s servers without wasting water. The system recycles heated air through heat exchange coils or uses refrigerant in a closed loop to convert it back to cold air. It also takes advantage of cooler nighttime temperatures, utilizing ambient air cooling for 75% to 80% of the year; it uses a hybrid system that combines ambient air with the water-free cooling technology to optimize efficiency for the remaining 20% to 25% of the year.

Novva calculates that its system saves between 150 and 200 million gallons of water annually that would otherwise have been needed to cool its facility. The accomplishment is particularly significant given concerns about the rapid depletion of water in the Colorado River and proposed cuts in water usage in that region.

“Water is a huge resource, and we want to do what we can to conserve it,” says Jared Coleman, automation and controls manager for Novva. He notes that the system not only saves resources but also has improved data center performance. “This project really exemplifies how we do business. You don’t have to exploit the environment for business, and business doesn’t have to suffer because of environmental conscience.”

Novva’s approach goes against conventional data center designs, which have traditionally used, and usually still use, water as part of their primary cooling method, a practice that often stresses local resources and has raised concerns among stakeholders who want to, or are required to, track their environmental footprints.

Novva had first deployed its water-free cooling system at its Utah campus. Coleman says it plans to implement this system at all of its data centers.

OHLA taps AI for insurance compliance to reduce risks, yield savings

Organization: OHLA USA

Project: Leveraging AI & Automation to Achieve Subcontractor Insurance Compliance

IT leader: Srivatsan Raghavan, CIO

OHLA USA, a $1.2B company specializing in infrastructure projects, manages dozens of projects with hundreds of subcontractors performing about a third of the work. The company handles 700-plus claims annually, and it relies on insurance to mitigate financial risks.

Despite the criticality of insurance in the industry, OHLA found its longstanding insurance tracking system and the manual input work it required created both efficiency and risk concerns. Executives estimated those issues could result in millions of dollars in noncompliance costs.

So OHLA set out to replace the custom-built software it used to manage its insurance tracking with a modernized process. It sought to re-engineer the workflow and integrate process automation with artificial intelligence to transform how it handles insurance compliance.

The new system, deployed in 2023, eliminates the need for project managers to manually enter insurance certificate details. Process automation extracts email attachments, and a custom AI model extracts and saves policy details in a database.

Additionally, it offers a UI that the risk department can use for storing policy requirements and baseline limits as well as for comparing policies against those baselines. This capability allows the department to promptly identify deviations and quickly alert stakeholders.
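Conceptually, the comparison step described above might resemble the following sketch, in which extracted policy limits are checked against baseline requirements and shortfalls are flagged. The coverage names and figures are invented for illustration; OHLA’s actual data model is not public.

```python
# Hypothetical baseline check: flag any coverage whose limit falls short of the
# risk department's minimum. All field names and amounts are illustrative.
BASELINE_LIMITS = {
    "general_liability": 1_000_000,
    "auto_liability": 1_000_000,
    "workers_comp": 500_000,
}

def find_deviations(policy_limits: dict) -> list[str]:
    """Return a note for every coverage that is missing or below its baseline."""
    notes = []
    for coverage, required in BASELINE_LIMITS.items():
        actual = policy_limits.get(coverage, 0)
        if actual < required:
            notes.append(f"{coverage}: {actual:,} is below the required {required:,}")
    return notes

# Example: auto liability is missing and workers' comp is under-insured.
print(find_deviations({"general_liability": 2_000_000, "workers_comp": 250_000}))
```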

CIO Srivatsan Raghavan says his team leaned on its experience and learnings from prior AI-enabled projects to envision the significant improvements that AI could bring to the insurance compliance function.

“We are looking for good ROI use cases for AI, and we saw this as a worthy use case to chase,” he adds.

Indeed, the company has reaped big returns. The new system reduces administrative workload and minimizes the risk of errors and noncompliance, thereby enhancing operational efficiency and risk management. OHLA estimates the new system will reduce the subcontractor noncompliance rate from 10% to 2%, yielding a potential yearly savings of $4.4 million.

New Putnam platform accelerates application development

Organization: Putnam Investments, now Franklin Templeton

Project: Putnam Investments Cloud-First DevOps Test Data Management Platform to Deliver World-Class Digital Customer Experiences

IT leader: Sumedh Mehta, CIO at time of project

Digital leadership at Putnam Investments (which was acquired by Franklin Templeton in January 2024) recognized the need for development teams to deliver software at a speed that matched the changing needs of its customers.

So it tasked the engineering team with developing a new technology architecture, with tools and processes to enable innovation and change at speed. The company sought to use higher levels of automation to develop and release software in a continuous integration and continuous delivery (CI/CD) cycle.

The result: a cloud DevOps test data management platform that enables teams to rapidly stand up new data environments.

“We were after a quick and cost-effective way to provide quality testing data to investment users and technology associates,” says Joseph Gaffney, who sponsored and directed this project at Putnam Investments and is now VP of IT at Franklin Templeton.

“This project was transformative because we were able to give our business users and agile technology teams access to quality data in a matter of minutes versus days,” he adds. “The biggest benefit is the agility it provides our business users and technology teams to continually improve and move fast. There is no more waiting around for quality data. So, from a software development perspective, the ROI is huge. Developers get access to production quality data whenever they need it. This greatly increases our time to market. From a cloud cost perspective, the Delphix data virtualization has helped us realize significant cost savings in way of storage. We no longer need physical databases with storage attached in our test environment.”

Gaffney says Franklin Templeton now uses the platform to help with its integration projects.

Regeneron turns to data to accelerate drug discovery and development

Organization: Regeneron Pharmaceuticals

Project: Centralized Data Platform: Using Data to Uplift Science

IT leader: Bob McCowan, SVP and CIO

Data is critical for drug discovery, development, and commercialization. As such, Regeneron employees need to access and analyze data from multiple sources to help them reduce experiments, streamline workflows, and improve process understanding and control.

However, data silos, data integration, limitations in data findability, a lack of common data vocabulary, variation in data management practices, and other issues presented challenges for employees looking to access and use data to advance their work and the company’s objectives.

Consequently, Regeneron saw the need to overhaul its data management practices to better allow workers to derive actionable insights and make data-driven decisions more efficiently in near real-time.

“Our data was in the jail and we needed to liberate it,” says Hussain Tameem, associate director, solution partner, with Regeneron Research & Preclinical Development IT (RAPD-IT). “We wanted to bring in data from diverse sources and make it available consistently to all users so that it can support their work.”

To do that, Regeneron created the Centralized Data Platform (CDP), a cloud solution that leverages data lake and data catalog technologies as well as AI and machine learning to enable a unified approach to data access, governance, integration, and analytics across the company’s Preclinical Manufacturing & Process Development and Good Manufacturing Practicing teams.

The platform automates lengthy data processing steps, enables scientists to analyze data efficiently, and increases process insights. It also supports process development, technology transfer, and manufacturing process improvement — all of which supports the company’s mission of bringing new medicines to patients.

And it dramatically reduces the time and effort that data scientists spend requesting and organizing data, giving them more time to actually analyze it.

“This platform makes high-quality data available in a way that people could start interrogating it,” says SVP and CIO Bob McCowan. “Information that we weren’t able to see is now very visible, and now we can drive significant value from it.”

More US CIO 100 Award winners

The following articles provide an in-depth look at these and more of our 2024 US CIO 100 Award winning projects:

The AES Corp.: “AES enlists AI to boost its sustainable energy business” (Alejandro Reyes, AES Clean Energy Chief Digital Officer)
Chipotle: “Robots make a smash in Chipotle kitchens” (Curt Garner, Chief Technology & Consumer Officer)
Dow: “Data literacy, governance keys to transformation at Dow” (Melanie Kalmar, CIO)
Expion Health: “Expion Health revamps its RFP process with AI” (Suresh Kumar, Chief Transformation Officer, Mergers & Acquisitions)
King County: “King County enlists AI to reduce drug overdose deaths” (Megan Clarke, CIO)
Marine Depot Maintenance Command: “Marine Corps enlists RPA, 5G, and AR/VR to retool fighting force” (George Lamkin, Assistant Chief of Staff, G-6)
The MITRE Group: “Going ‘AI native’ with in-house ChatGPT the MITRE way” (Deborah Youmans, Vice President and CIO)
The Mosaic Company: “Mosaic builds a global IT foundation for growth” (Jeff Wysocki, CIO and Vice President)
TIAA: “TIAA modernizes the customer journey with AI” (Sastry Durvasula, Chief Information & Client Services Officer)
Tractor Supply Co.: “Tractor Supply enlists AI to deliver ‘legendary’ customer service” (Rob Mills, EVP, CTO, Digital Strategy)
UPS: “UPS delivers customer wins with generative AI” (Bala Subramanian, Chief Digital and Technology Officer)
US Med-Equip: “US Med-Equip eases hospital pain point with AI, RPA” (Antonio Marin, CIO)

Full list of 2024 CIO 100 Award Winners

Google adds Gemini to BigQuery, Looker to help with data engineering

For Steven Dickens, chief technology advisor at The Futurum Group, “The introduction of Gemini in BigQuery adds competitive pressure on rival data analytics platform providers as this is a dynamically competitive space where each provider continuously is seeking to outdo the other by offering advanced functionalities.”

Rivals such as Oracle, MongoDB, Databricks, and Snowflake offer similar capabilities, Dickens added.

However, Wurm pointed out that since all major data analytics providers are focusing on simplifying user experiences via generative AI, the competition is no longer about who offers generative AI but about which vendor can provide a pricing model that reduces friction to enterprise adoption and generates the greatest return on investment.