Large language models (LLMs) like ChatGPT are transforming the way we interact with technology, and their impact on spreadsheet software like Microsoft Excel is nothing short of revolutionary. As an AI for Marketing consultant, I've been deeply exploring the integration of LLMs and spreadsheets, and the potential benefits for businesses are immense.

In this post, I'll explain four key approaches to leveraging LLMs within Excel, highlighting their strengths, weaknesses, and the exciting possibilities they unlock:

1. LLM as a Spreadsheet Mentor:

Imagine having an on-demand Excel expert at your fingertips. LLMs can act as virtual mentors, providing guidance and support directly within the spreadsheet environment. Need help with a complex formula? Unsure how to leverage pivot tables effectively? Simply ask the LLM in plain English, and it will provide tailored solutions, suggest relevant functions, and even troubleshoot errors. This approach democratises access to advanced spreadsheet features, empowering users of all skill levels to become more proficient and productive.

Excel boasts over 500 functions, with more being added and updated regularly. Beyond these functions, there are additional features like data analysis, formatting, pivot tables, and VBA, all of which require a certain level of expertise to utilise fully. This is where the LLM, as a mentor, truly shines. While tools like Claude, ChatGPT, Gemini, or Copilot may not know every single function, they are familiar with most and can even write VBA code (or Google Apps Script in the case of Google Sheets) to help users master the spreadsheet environment.

However, to maximise the benefits of an LLM mentor, it's crucial to describe your dataset accurately. Without clear descriptions, the outputs provided might not hit the mark. For instance, if you don't specify column references, the formulas generated by the LLM might require manual adjustments to include the correct cell references. I've experienced cases where ChatGPT provided me with formulas that used table headers instead of specific cells; the logic was correct, but I had to tweak the formula to fit my data. Mistakes can occur even when trying to import a picture of the data that includes column headers. For example, after I uploaded a spreadsheet screenshot and explained my goal, ChatGPT gave me an incorrect formula because it misinterpreted the column reference, choosing G instead of H.

Despite these occasional hiccups, using an LLM as a spreadsheet mentor is a brilliant way to get more out of your data. Whether you're struggling with a complicated formula filled with nested IF statements or finding it challenging to get VLOOKUPs working as you want, an LLM can provide more elegant solutions, often introducing functions you didn't realise existed. For instance, ChatGPT might suggest using an INDEX-MATCH combination instead of a VLOOKUP, saving the day with a more efficient approach. This kind of guidance not only solves immediate problems but also enhances your overall spreadsheet proficiency, making the LLM an invaluable tool for anyone looking to elevate their data management and analysis skills.

However, some challenges remain:

  • Assumed Knowledge: LLMs may sometimes assume a higher level of user knowledge than is actually present, leading to incomplete or confusing instructions.

  • Hallucinations: LLMs can occasionally generate incorrect or fabricated information, requiring users to verify the accuracy of their responses.

  • Training Data Limitations: LLMs are constrained by the data they were trained on, which may not include the latest updates or features in spreadsheet software.

2. LLM as an Analyst with File Ingestion:

Advanced LLMs like GPT-4 can directly interact with spreadsheet data through file ingestion. This allows users to upload their spreadsheets and have the LLM perform data manipulation, analysis, and visualisation tasks within a secure sandbox environment. This approach eliminates the need for manual data manipulation, enabling users to focus on extracting insights and making data-driven decisions.

ChatGPT and Advanced Code Analysis

ChatGPT, for instance, can receive files in CSV, Google Sheets, and Excel formats. By leveraging advanced code analysis—formerly known as the code interpreter—ChatGPT can directly manipulate the data, providing users with a powerful tool for conducting in-depth analysis. This functionality allows users to request specific insights or analysis areas using natural language, and ChatGPT will outline the steps it will take before executing them. This process makes complex data tasks accessible to users without deep technical knowledge, empowering them to perform advanced data analysis without writing a single line of code.

The power of this approach lies in its remarkable efficiency in handling data manipulation, processing, and visualisation. Users can generate correlation heatmaps in mere seconds, identify trends, and create a range of visualisations that would otherwise take significant time to produce manually. For tasks such as statistical analysis, this capability is particularly valuable, allowing analysts to explore data from multiple angles quickly and effectively.

However, there is a critical caveat to this technique: the quality of the text-based insights provided by the LLM about the data. While the charts and graphs generated by ChatGPT are typically accurate, the accompanying descriptions and insights often leave much to be desired. In my experience, ChatGPT frequently misinterprets the data it visualises. For example, a simple bar chart might incorrectly state that one bar is higher than another, even when the opposite is plainly visible. Similarly, it may confuse different cohorts within the data, leading to incorrect conclusions.

In the example below, you can see ChatGPT incorrectly describes the satisfied versus dissatisfied bar chart for the 19-35 year old cohort. This is one of many examples I have encountered where the description is simply wrong.

chatgpt-incorrect-description-of-bar-chart
These inaccuracies highlight a significant limitation: the need for a human in the loop remains essential when using AI tools for data analysis. While the LLM can automate many aspects of data processing and visualisation, it is crucial for analysts to verify the insights generated manually. Failure to do so could result in presenting incorrect data interpretations to stakeholders, potentially undermining trust and credibility. Therefore, while LLMs like ChatGPT offer tremendous potential as analytical tools, they should be used cautiously, ensuring that human oversight is part of the process to validate the results.

One of the most significant advantages of using LLMs like ChatGPT for file ingestion and data analysis is their ability to handle multiple sheets simultaneously. In one instance, I had a series of related but disparate files: a customer list, sales data, product data, retail store data, and a sheet with currency conversions for five currencies covering every day over five years. Within minutes, ChatGPT seamlessly combined these sheets into a master sheet, normalising all global transactions into GBP over the entire five-year period. The speed and accuracy with which it completed this task were remarkable—something that would have taken me an immense amount of time and effort to do manually. In fact, I might not have even managed to complete the task successfully without this tool. Upon reviewing the data, I found no inaccuracies, showcasing the powerful capabilities of LLMs in managing and processing complex datasets efficiently.

But this approach also has its drawbacks:

  • Limited Functionality: Current implementations may not support all advanced spreadsheet features, such as creating interactive charts or complex pivot tables.

  • Hallucinations (Again): LLMs may misinterpret data or provide inaccurate analysis, highlighting the need for careful scrutiny of their outputs.

  • Error Handling: LLMs may struggle to identify and resolve errors within spreadsheets, particularly those related to logic or formula mistakes.

3. Integrated Tools within Office Productivity Suites:

The future of LLM spreadsheet integration lies in native integration within office productivity suites like Microsoft 365 and Google Workspace. Tools like Microsoft Copilot and Google Gemini for Workspace offer a glimpse into this future, enabling users to interact with their spreadsheets using natural language commands.

Gemini for Sheets

Google Gemini for Sheets is particularly promising in this regard. Acting as an in-situ spreadsheet mentor, Gemini for Sheets can understand your data to a certain extent and assist in creating formulas directly within your sheet. This makes it easier for users, especially those less familiar with advanced spreadsheet functions, to automate tasks and analyse data without needing to manually write complex formulas.

Google Gemini for Sheets Screenshot

However, while Gemini for Sheets is undoubtedly useful, it has limitations. For instance, the tool might generate more complex formulas than necessary, such as opting for =TO_DOLLARS(SUM(ARRAYFORMULA(VALUE(D2:D30)))) instead of the simpler =SUM(D2:D30). This can be problematic if the data range specified in the formula doesn't match the actual data in your sheet which I have found to be the case often.

Additionally, while Gemini for Sheets excels at creating and inserting formulas, it currently falls short in other areas, such as conditional formatting. Instead of applying the formatting directly, it provides step-by-step instructions, which, while helpful, might not be as efficient as direct application. Nevertheless, the rapid pace of development suggests that these features will continue to evolve, likely becoming more refined and capable in the coming months.

Microsoft Copilot for Excel

Microsoft Copilot for Excel allows users to interact directly with their data and draft formulas that are then inserted seamlessly into the spreadsheet. Unlike Google Gemini for Sheets, Copilot stands out by offering more advanced functionality, including the ability to directly apply conditional formatting to data. This feature enables users to not only automate calculations but also enhance data visualisation and easily highlight key trends or anomalies.

However, while Copilot is highly capable, it does have its limitations. For instance, the size of the data being analysed can significantly impact its performance. Large datasets, particularly those exceeding 2 million cells, can pose challenges, preventing Copilot from performing comprehensive analysis. Despite this, with smaller datasets, Copilot shines, offering the ability to conduct more advanced analyses that might otherwise require extensive manual effort.

However, current implementations face limitations:

  • Functionality Gaps: Native integrations may not yet support the full range of spreadsheet features, particularly advanced functionalities like VBA macros.

  • Execution Challenges: LLMs may occasionally fail to execute tasks as intended, requiring users to troubleshoot or provide alternative instructions.

  • Learning Curve: Users may need time to adapt to the new interaction paradigms and understand the capabilities and limitations of these integrated tools.

4. Extending Spreadsheet Functionality with LLM Add-ons:

Add-ons and extensions offer another exciting avenue for LLM integration, enabling users to create custom functions and formulas directly interacting with LLM APIs. This approach extends the functionality of spreadsheets beyond their native capabilities, opening up a world of possibilities for automation, data transformation, and content generation.

For instance, the Claude for Sheets add-on by Anthropic integrates Claude directly into Google Sheets, enabling interactions with the model from within spreadsheet cells. Claude for Sheets provides users with a simple =CLAUDE() function for executing straightforward prompts, making generating quick responses directly in a cell easy. For users who require more control over the output, the function also supports additional parameters, allowing you to specify the model, control the number of tokens, adjust the temperature for creativity, and even insert a system prompt to guide Claude’s behaviour.

For more advanced use cases, the =CLAUDEMESSAGES() function offers even greater flexibility by enabling multi-turn conversations within a single cell. This is particularly useful for scenarios where you need to simulate a dialogue or prefill responses based on previous exchanges. By combining these functions with optional parameters like max_tokens, temperature, and stop_sequences, users can tailor their interactions with Claude to fit a wide range of needs, from generating creative ideas to performing precise analytical tasks.

claude-for-sheets-translation-screenshot

Another example is DocGPT’s GPT for Sheets, which allows users to leverage a range of AI models, including GPT-4o, GPT-3 Turbo, Claude, Mistral, and Gemini, directly within Google Sheets. This extension provides a seamless experience, eliminating the need to handle API keys unless desired. The setup is straightforward, enabling users to incorporate advanced AI capabilities into their workflows quickly. Use cases include generating multiple product descriptions at once, lead enrichment, extracting emails from social media, and programmatic SEO. These tools transform Google Sheets from a simple data tool into a powerful platform for AI-driven content creation and automation, enabling users to streamline complex tasks and enhance productivity with ease.

However, this approach comes with its own set of considerations:

  • Technical Expertise: Users must be intermediate skill-level spreadsheet users to effectively leverage LLM add-ons.

  • API Reliability: The performance and reliability of LLM add-ons are dependent on the underlying API, which may be subject to limitations or outages.

  • Security and Privacy: Users must be mindful of security and privacy implications when working with sensitive data through third-party add-ons.

The Future of LLMs and Spreadsheets:

The integration of LLMs and spreadsheets is poised to revolutionise the way we work with data. As LLM technology continues to advance, we can expect more sophisticated integrations, enhanced functionalities, and a more seamless user experience. The potential benefits for businesses are significant, including increased productivity, streamlined workflows, and deeper data insights.

While challenges remain, the future of LLM-spreadsheet integration is incredibly promising. By embracing these new tools and approaches, businesses can unlock the full potential of their data and gain a competitive edge in today's data-driven world.

Stay tuned for future posts where I'll explore specific use cases and provide practical guidance on leveraging LLMs for Excel in your business.

Comments