Reading Data

The main difference between a knowledge base (PDFs, webpages) and a database (spreadsheets) is in how information is retrieved.

Unstructured Data

Webpages and PDF documents are common examples of unstructured data. This data is stored in a vector database, which makes it easy for a chatbot to search for similar chunks of data and generate a response. While this method of information storage and retrieval is fast, it often leads to hallucination, where the AI generates false responses.
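To make "search for similar chunks" concrete, here is a minimal sketch of similarity search over text chunks. It is not how Botsheets is implemented: the `embed` function is a toy word-count stand-in for a real embedding model, and all names (`embed`, `cosine`, `top_chunks`) are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for an embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_chunks(question, chunks, k=1):
    # Rank stored chunks by similarity to the question and keep the best k.
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

chunks = [
    "Our store opens at 9 am on weekdays",
    "Refunds are processed within 5 business days",
    "Shipping is free for orders over 50 dollars",
]
best = top_chunks("when are refunds processed", chunks)
print(best)
```

The search always returns the closest chunk, even when nothing in the documents really answers the question, which is one reason this style of retrieval can produce confident but wrong responses.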

Structured Data

Structured data is easier for AI to understand and lives in databases, spreadsheets, or CSV files.

Searching For Structured Data

By default, Botsheets generates SQL query statements to search through the columns and rows of structured data. This leads to more precise responses, but if the AI model is not confident in its initial response, Botsheets can fall back to searching your spreadsheet data the same way it searches web and other document data. This semantic search can lead to AI hallucination, but it also increases the chances of delivering a relevant response. You can enable or disable semantic search under Bot Settings > Data > Documents > Sheet Settings.
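The query-first, fallback-second flow can be sketched roughly as follows. This is an illustration under stated assumptions, not Botsheets code: the `products` table, the `answer` and `keyword_fallback` functions, and the `semantic_enabled` flag are hypothetical, an empty SQL result stands in for "not confident", and a simple word-overlap ranking stands in for real semantic search.

```python
import sqlite3

def build_table(rows):
    # Load sheet-like rows into an in-memory SQLite table (hypothetical schema).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, price REAL, notes TEXT)")
    conn.executemany("INSERT INTO products VALUES (?, ?, ?)", rows)
    return conn

def keyword_fallback(conn, question):
    # Stand-in for semantic search: pick the row sharing the most words
    # with the question. A real system would compare embeddings instead.
    words = set(question.lower().split())
    best_row, best_score = None, 0
    for row in conn.execute("SELECT name, price, notes FROM products"):
        text = " ".join(str(v) for v in row).lower()
        score = len(words & set(text.split()))
        if score > best_score:
            best_row, best_score = row, score
    return best_row

def answer(conn, sql, params, question, semantic_enabled=True):
    # Try the precise SQL query first; fall back only if it finds nothing
    # and the fallback is enabled.
    rows = conn.execute(sql, params).fetchall()
    if rows:
        return "sql", rows[0]
    if semantic_enabled:
        row = keyword_fallback(conn, question)
        if row is not None:
            return "semantic", row
    return "none", None

conn = build_table([
    ("Desk Lamp", 24.5, "warm light for reading"),
    ("Office Chair", 129.0, "ergonomic with lumbar support"),
])
by_name = "SELECT name, price, notes FROM products WHERE name = ?"
exact = answer(conn, by_name, ("Desk Lamp",), "price of Desk Lamp")
fuzzy = answer(conn, by_name, ("reading light",), "something for reading at night")
disabled = answer(conn, by_name, ("reading light",), "something for reading at night",
                  semantic_enabled=False)
print(exact[0], fuzzy[0], disabled[0])
```

With the fallback disabled, a query that matches no rows yields no answer at all, which is the trade-off the setting controls: fewer loosely matched responses in exchange for more "not found" cases.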


Debugging Responses

You may want to understand how Botsheets generates a response from your Google Sheet data. Did it find the data using a standard query statement, or did it generate the response using a semantic search method? When you enable "Debug Mode" in our bot testing interface, each response comes with an explanation of how Botsheets generated it.

You can use these explanations to gain insight and make adjustments to your data in your Google Sheet, or determine if you want to enable or disable semantic search for a sheet.

Multiple Data Sources

You may have multiple web pages, PDFs, and Google Sheets all connected to the same chatbot. Because our first attempt at searching your sheet is to generate SQL queries rather than run a general semantic search on your sheet data, you'll want to describe each data source so that the AI understands its purpose.

Be descriptive of your data, especially if you have multiple data sources connected. When your chatbot receives a message, it will automatically determine the best data source from which to generate a response.
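To see why descriptions matter, consider this rough sketch of choosing a data source by matching the message against each description. The actual selection is done by the AI and is not documented here; `pick_source`, the source names, and the word-overlap scoring are hypothetical simplifications.

```python
def pick_source(message, descriptions):
    # Score each source by how many words its description shares with the message.
    words = set(message.lower().split())
    scores = {
        name: len(words & set(desc.lower().split()))
        for name, desc in descriptions.items()
    }
    best = max(scores, key=scores.get)
    # With no overlap at all there is nothing to base a choice on.
    return best if scores[best] > 0 else None

vague = {"Sheet1": "data", "Handbook.pdf": "document"}
descriptive = {
    "Sheet1": "product inventory with price and stock count for each item",
    "Handbook.pdf": "employee handbook covering vacation policy and benefits",
}
message = "what is the price of the blue mug"

print(pick_source(message, vague))        # no usable signal
print(pick_source(message, descriptive))  # matches the inventory sheet
```

The vague descriptions give the selector nothing to work with, while the descriptive ones point the price question at the inventory sheet.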

Note that although we provide access to the Base Prompt, describing your data sources there or guiding your bot to search specific data sources will have no impact on your AI.

Best Practices

  • Avoid too many data sources to maximize speed.
  • Bots that use only PDF data sources are fastest.
  • Keep the number of columns in a Google Sheet to 20 or fewer.
  • Minimize the amount of data per cell in a Google Sheet.
  • Use documents for knowledge. Use Google Sheets for data.
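If you want to check a sheet against the column and cell-size guidelines before connecting it, a small script over a CSV export can help. This is a sketch only: `check_sheet` is a hypothetical helper, and the 500-character cell threshold is an assumption chosen for illustration, since no specific per-cell limit is stated here.

```python
import csv
import io

MAX_COLUMNS = 20        # from the column guideline above
MAX_CELL_CHARS = 500    # hypothetical threshold; no official number is given

def check_sheet(csv_text, max_columns=MAX_COLUMNS, max_cell_chars=MAX_CELL_CHARS):
    # Return a list of human-readable warnings for a CSV export of a sheet.
    warnings = []
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return warnings
    header = rows[0]
    if len(header) > max_columns:
        warnings.append(
            f"{len(header)} columns exceeds the {max_columns}-column guideline"
        )
    # Row numbers start at 2 so they match the spreadsheet (row 1 is the header).
    for row_number, row in enumerate(rows[1:], start=2):
        for col_index, cell in enumerate(row):
            if len(cell) > max_cell_chars:
                if col_index < len(header):
                    col_name = header[col_index]
                else:
                    col_name = f"column {col_index + 1}"
                warnings.append(
                    f"row {row_number}, {col_name}: {len(cell)} characters"
                )
    return warnings

sample = "name,notes\nDesk Lamp,warm light\nChair," + "x" * 600 + "\n"
report = check_sheet(sample)
print(report)
```

Long cells flagged this way are often better moved into a document data source, in line with using documents for knowledge and Google Sheets for data.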