In the race to infuse artificial intelligence (AI) into every corner of the digital world, Microsoft has set its sights on conquering the final frontier of office productivity: the humble spreadsheet.
The tech giant’s latest creation, SpreadsheetLLM, aims to change how businesses crunch numbers and make decisions. By harnessing the power of large language models (LLMs), this AI tool, which is still in the testing stages, could transform Excel from a static grid into a dynamic, question-answering powerhouse — potentially reshaping workflows for millions of users worldwide.
“The infinite cell-like nature and references to cells in spreadsheets make it challenging for LLMs, which have been trained using standard linear tokenization techniques, to understand the spreadsheet data model,” Rogers Jeffrey Leo John, co-founder and CTO of DataChat, a no-code, generative AI platform, told PYMNTS.
At the heart of SpreadsheetLLM lies SheetCompressor, an encoding framework that effectively compresses spreadsheets for use by LLMs. This breakthrough, detailed in a study on the arXiv preprint server, tackles a longstanding challenge in applying AI to spreadsheets.
Microsoft’s SheetCompressor uses three techniques to shrink spreadsheets for AI use. First, it spots and compresses repetitive data. Next, it converts the remaining information to a compact format (JSON) without losing details. Finally, it groups together cells that share the same data format. The results are impressive. SheetCompressor cuts the number of tokens the AI must process by 96%. Since that leaves about 4% of the original input, businesses billed by the token could pay just 1/25th of what they would otherwise for AI to crunch their spreadsheet numbers.
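To make the second step more concrete, here is a minimal Python sketch of the general idea. It is not Microsoft’s code: the function names (`col_letter`, `inverted_index`) and the sample grid are invented for illustration. Instead of listing every cell with its value, it lists each distinct value once with the cell addresses that hold it, and skips empty cells.

```python
import json
from collections import defaultdict


def col_letter(index: int) -> str:
    """Convert a 0-based column index to a spreadsheet label (0 -> A, 26 -> AA)."""
    label = ""
    index += 1
    while index > 0:
        index, rem = divmod(index - 1, 26)
        label = chr(ord("A") + rem) + label
    return label


def inverted_index(grid):
    """Map each distinct non-empty value to the addresses of the cells holding it.

    Illustrative only: an assumption about how a value-to-cells encoding
    might look, not the encoding used in the SpreadsheetLLM paper.
    """
    index = defaultdict(list)
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value is None or value == "":
                continue  # empty cells are not emitted at all
            index[str(value)].append(f"{col_letter(c)}{r + 1}")
    return dict(index)


# Hypothetical sample sheet with repeated values and one empty cell.
grid = [
    ["Region", "Status", "Units"],
    ["North", "Open", 10],
    ["South", "Open", 10],
    ["East", "", 10],
]

# Baseline: one [address, value] pair per cell, including the empty one.
cell_by_cell = json.dumps(
    [[f"{col_letter(c)}{r + 1}", v] for r, row in enumerate(grid) for c, v in enumerate(row)]
)

# Compact form: each distinct value appears once.
compact = json.dumps(inverted_index(grid))

print(len(cell_by_cell), len(compact))
```

On a sheet this small the saving is modest, and character counts are only a rough stand-in for the tokens an LLM is billed for. The gain grows with the share of repeated or empty cells, which is common in real-world spreadsheets.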
Of course, AI is already useful for manipulating spreadsheets. Microsoft’s Excel Ideas feature uses AI to analyze data and suggest visualizations, charts and pivot tables. Users can simply select a range of data and ask Excel to recommend insights, streamlining the process of identifying trends and patterns.
Google Sheets has introduced Smart Fill, which uses AI to detect patterns in data entry and automatically suggests column completions. This feature saves time on repetitive data input tasks and helps maintain consistency across large datasets.
Startups like Rows and Causal are building AI-native spreadsheet alternatives. Rows, for instance, allows users to pull data from various sources using natural language queries, while Causal focuses on financial modeling with AI-assisted forecasting.
Tiller Money leverages AI to categorize financial transactions automatically in spreadsheets, helping users track expenses and budgets more effectively. The system learns from user corrections to improve accuracy over time.
Spreadsheet.com incorporates AI to suggest formulas and functions based on a sheet’s data and column headers. It can also generate charts and graphs automatically when users select data ranges.
Airtable’s AI assistant can help users create new tables, suggest field types, and even write formulas based on natural language descriptions of what the user wants to achieve.
Microsoft’s team tested SpreadsheetLLM with proprietary models like GPT-3.5 and GPT-4 and open-source offerings like Llama 2, Llama 3, Phi-3, and Mistral-v2. With GPT-4, the approach showed a 27% improvement in table detection compared to previous methods.
The researchers have also introduced a Chain of Spreadsheet (CoS) methodology, which further refines the AI’s ability to work with spreadsheet data by breaking tasks into manageable steps for the models.
The implications for businesses could be far-reaching. With its massive user base, Excel has long been a cornerstone of Microsoft’s Office suite and a crucial tool across industries. The company aims to enhance spreadsheet functionality by integrating AI capabilities, potentially automating complex tasks and offering new data interpretation methods.
“SpreadsheetLLM has the potential to transform data analysis in spreadsheets by enabling efficient user interactions and more accurate responses to plain English questions on spreadsheet data,” John said.
However, the technology is not without limitations. Due to token constraints, the framework doesn’t account for visual elements like background colors and borders. The researchers also acknowledge that more work is needed in semantic understanding of cell contents.
While direct integration of SpreadsheetLLM into Microsoft Excel isn’t imminent, the research signals a clear direction for future feature enhancements. The potential to dramatically improve data analysis and spreadsheet insight generation could lead to time savings and new data-driven discoveries for businesses.
The technology could have broader implications for how businesses leverage their data assets. “With technologies like SpreadsheetLLM that provide ways of encoding the knowledge present in spreadsheets to LLMs, business users will now be able to leverage GenAI technologies to combine information from their spreadsheets and data warehouses to make more efficient business decisions,” John predicted.