Data Sources
In this section, we'll explain how you can upload, import, and connect data sources to train AI to respond using your business data. We'll explore Google Sheets first, which you can use as a database for frequently changing content such as listings or a directory. If you just want to answer questions, you can upload PDF documents or connect web pages as a knowledge base.
Google Sheets
Give your chatbot a database by connecting a dataset stored in Google Sheets. You only need to connect a Google Sheet once, and we'll keep AI in sync with changes to the data in the worksheet.
Document Character Limit | Not applicable |
File Size | 30 MB max |
Sheets Per Bot | Unlimited |
Data Sync | Once every 24 hr, 6 hr, or 1 hr depending on your Botsheets plan. |
Data Retrieval | SQL Queries + Semantic Search |
Chat Credits | Each message processed with AI that triggers this data source as a response consumes 10 chat credits |
You can create a dataset manually, export data from your apps in CSV format and import it to Google Sheets, or use a tool like Zapier to move data from hundreds of business apps into Google Sheets. You can connect multiple Google Sheets to a single chatbot, and when you make changes to your data, we'll auto-sync those changes with your chatbot. In short, you can train your chatbot on business data working directly from Google Drive.
Plan | File Size | Sync Frequency | Doc Character Limit |
Lite Plan | 30MB | 24 hrs | N/A |
Pro Plan | 30MB | 6 hrs | N/A |
Unlimited Plan | 30MB | 1 hr | N/A |
Preparing Your Data
A dataset is a collection of data that is organized into columns and rows. Column headers are data points and any rows of data below the headers are corresponding values. It provides structure to data that is easy for AI to understand, but also easy for you to manage.
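To make the header-and-row structure concrete, here is a minimal sketch that reads a tiny dataset the way a spreadsheet export would look as CSV. The listings, column names, and example.com links are invented for illustration only and are not taken from a Botsheets sample sheet.

```python
import csv
import io

# Hypothetical listings data, invented for this illustration.
# The first line holds the column headers (the data points);
# every line after it is one row of corresponding values.
SAMPLE = """address,bedrooms,price_in_USD,listing_url
12 Oak Street,3,450000,https://example.com/listings/12-oak
98 Pine Avenue,2,325000,https://example.com/listings/98-pine
"""

reader = csv.DictReader(io.StringIO(SAMPLE))
headers = reader.fieldnames   # the data points
rows = list(reader)           # one dict per row, keyed by header

if __name__ == "__main__":
    print(headers)
    for row in rows:
        print(row["address"], row["bedrooms"], row["price_in_USD"])
```

Each row becomes a mapping from a data point to its value, which is the shape a dataset in a Google Sheet has as well.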

You could have a few rows of data, or several thousand rows of data in a single Google Sheet. There are a number of use cases for datasets and where you would want to use a Google Sheet as a data source:
- Lists of frequently changing data (e.g. real estate listings, restaurant menus, directories of people, products, services, events, etc.)
- Anything with criteria for more personalized responses
- Anything that could be quantified
- Anything that can be analyzed to generate insights
Sample Datasets
Here are sample datasets you can test with in a chatbot. You just need to click to view it, copy the URL in your browser and paste it into a Botsheets chatbot when you're prompted to paste in a link to a Google Sheet. You can also create a copy of the Google Sheet and add it to your own Google Drive so you can use it as a template for structuring your own datasets.
Dataset | Category | View | Make a Copy |
Real Estate Listings | Lead Generation | View Sheet | Copy Sheet |
Amazon Reviews | Feedback | View Sheet | Copy Sheet |
News Aggregation | News | View Sheet | Copy Sheet |
College Data | Education | View Sheet | Copy Sheet |
Supermarket Sales | Sales Report | View Sheet | Copy Sheet |
Titanic Passengers | Historical | View Sheet | Copy Sheet |
Apparel Sales Report | Sales Report | View Sheet | Copy Sheet |
Amazon Sales Report | Sales Report | View Sheet | Copy Sheet |
Preparing Your Data
You'll need to ensure that your data is prepared properly.
At a minimum, a Google Sheet dataset connected to Botsheets requires at least one top row of column headers representing data points and at least one row of data.
There isn't a limit to the number of rows of data, but you should limit the number of columns to around 20, because too many columns will seriously degrade performance.
The labels you use for your column headers should be concise for the best performance. Do not make column headers overly descriptive or use them as space to engineer prompts. The data point used in the column header is the prompt. You can move columns around, but ensure that each data point in a column header matches a data point specified in the Botsheets dashboard.
Here are some additional recommendations to ensure best practice for datasets with Botsheets:
Column Headers | Use alphanumeric characters and symbols such as $ and %. Do not use semicolons, periods, or commas. |
Column Headers | For column header names with separate words we recommend using underscore. For example: email_address |
Column Headers | For pricing data, put the currency symbol in the header rather than in each row of data. For example, your column header may look like this: "$_in_USD". Use just the number in the data. |
Column Headers | Limit the number of columns and the amount of text in a row to something reasonable. A maximum of 20 columns is suggested for optimum performance. |
Column Headers | Do not leave a column header empty when the column has data in its rows. Always have column headers. |
Column Headers | Do not use duplicate column names. Each column header name should be unique. |
Row Data | Be consistent about your data (if a column holds numbers, every row should hold numbers, etc.) |
Row Data | Avoid using commas in data if possible. Data with commas will still work, just not reliably. For example, for numbers use 100000 instead of 100,000. |
Row Data | Your data should only be text, but you can include links and the link will be included in the response. You might have a column header named "URL" and your data could be a link like https://www.botsheets.com. |
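If you prepare data outside of Google Sheets first, a short script can catch most of the issues in the recommendations above before you import. The following is a minimal sketch, not a Botsheets tool: the function names `check_headers` and `normalize_number` are hypothetical, and the checks cover only the rules about forbidden characters, empty or duplicate headers, the suggested 20-column maximum, and commas or currency symbols in numbers.

```python
MAX_COLUMNS = 20  # suggested maximum from the recommendations above
FORBIDDEN_HEADER_CHARS = set(";.,")  # semicolons, periods, commas


def check_headers(headers):
    """Return a list of problems found in a header row (empty list = OK)."""
    problems = []
    if len(headers) > MAX_COLUMNS:
        problems.append(
            f"{len(headers)} columns; about {MAX_COLUMNS} is the suggested maximum"
        )
    seen = set()
    for position, header in enumerate(headers, start=1):
        name = header.strip()
        if not name:
            problems.append(f"column {position} has an empty header")
            continue
        if FORBIDDEN_HEADER_CHARS & set(name):
            problems.append(f"header {name!r} contains a semicolon, period, or comma")
        if name in seen:
            problems.append(f"header {name!r} is duplicated")
        seen.add(name)
    return problems


def normalize_number(value):
    """Drop a leading '$' and thousands commas if the result is numeric."""
    cleaned = value.strip().lstrip("$").replace(",", "")
    try:
        float(cleaned)
    except ValueError:
        return value  # not a number, so leave the cell untouched
    return cleaned


if __name__ == "__main__":
    for problem in check_headers(["email_address", "Price, USD", "", "email_address"]):
        print(problem)
    print(normalize_number("$100,000"))
```

Cells that are not numeric, such as a street address, pass through `normalize_number` unchanged, so you can apply it to a whole column and then move the currency symbol into the header by hand.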
Botsheets reads only the first worksheet (the default tab) in a Google Sheet, so if you have multiple datasets for a single chatbot, put them into individual Google Sheets and connect each one to your chatbot.
Just copy and paste a Google Sheet URL to connect it to your chatbot. If you want to connect a Google Sheet stored in a Google Drive that is not signed in to Botsheets, then you'll need the owner of that Google Sheet to change the share settings so that "Anyone with the link" can view it.
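If you collect sheet links from clients or teammates, you may want to confirm that a pasted link is actually a Google Sheets URL before using it. This is a rough sketch under the assumption that the link uses the standard `https://docs.google.com/spreadsheets/d/<ID>/...` form; the helper name `sheet_id_from_url` is hypothetical, and this is not how Botsheets itself validates links.

```python
import re

# Matches the standard Google Sheets URL form and captures the spreadsheet ID.
SHEET_URL = re.compile(r"^https://docs\.google\.com/spreadsheets/d/([a-zA-Z0-9_-]+)")


def sheet_id_from_url(url):
    """Return the spreadsheet ID, or None if this is not a Google Sheets URL."""
    match = SHEET_URL.match(url.strip())
    return match.group(1) if match else None


if __name__ == "__main__":
    # The ID below is made up for illustration.
    print(sheet_id_from_url("https://docs.google.com/spreadsheets/d/abc123_XYZ-9/edit#gid=0"))
    print(sheet_id_from_url("https://www.botsheets.com"))
```

A valid-looking URL does not tell you whether the share settings allow viewing, so the owner still needs to check those separately.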

If you're an agency, this means your clients can train their chatbot on their business data working from their own Google Drive and without ever accessing the Botsheets dashboard.
PDF Documents
PDFs are an ideal data source to give your chatbot a knowledge base to answer questions:
Document Character Limit | 500,000 to Unlimited |
File Size | Max 30 MB |
Files Per Bot | Unlimited |
Data Sync | None. Upload a PDF. Delete the file and upload another. |
Data Retrieval | Semantic Search |
Chat Credits | Each message processed with AI that triggers this data source as a response consumes 1 chat credit |
Web Documents
We scrape web pages, which makes them an ideal data source for an AI knowledge base. Unlike PDF documents, you connect a web page once and we'll keep AI in sync with changes to your website content.
Document Character Limit | 500,000 to Unlimited |
File Size | Not applicable |
Pages Per Bot | Unlimited |
Data Sync | Once every 24 hr, 6 hr, or 1 hr depending on your Botsheets plan. |
Data Retrieval | Semantic Search |
Chat Credits | Each message processed with AI that triggers this data source as a response consumes 1 chat credit |
Connect one or multiple web pages. Botsheets will monitor for changes and auto-train AI for you.
Plan | File Size | Sync Frequency | Doc Character Limit |
Lite Plan | 30MB | 24 hrs | 500,000 |
Pro Plan | 30MB | 6 hrs | 10 million |
Unlimited Plan | 30MB | 1 hr | Unlimited |
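The plan table above can be read as a simple lookup. The sketch below transcribes its values into a dictionary and picks the smallest plan whose character limit fits a document. It assumes the limit applies per document, which the table does not state explicitly, and the names `PLAN_LIMITS` and `smallest_plan_for` are hypothetical.

```python
# Values transcribed from the plan table above; None stands for "Unlimited".
# Plans are listed from smallest to largest.
PLAN_LIMITS = {
    "Lite": {"sync_hours": 24, "max_chars": 500_000},
    "Pro": {"sync_hours": 6, "max_chars": 10_000_000},
    "Unlimited": {"sync_hours": 1, "max_chars": None},
}


def smallest_plan_for(char_count):
    """Return the first plan whose character limit covers char_count."""
    for plan, limits in PLAN_LIMITS.items():
        if limits["max_chars"] is None or char_count <= limits["max_chars"]:
            return plan
    return None  # unreachable while an unlimited plan exists


if __name__ == "__main__":
    plan = smallest_plan_for(750_000)
    print(plan, "syncs every", PLAN_LIMITS[plan]["sync_hours"], "hours")
```

For example, a 750,000-character document exceeds the Lite limit of 500,000, so the lookup moves on to the Pro plan with its 6-hour sync.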
Manual Data Input
While spreadsheets provide a more scalable approach to structured data, we provide form fields to capture some basic structured data in the form of name-value (or key-value) pairs. This data source is ideal for highlighting key data points about your business without requiring an extensive structured dataset. A business name or hours of operation are examples of data you might provide. This structure is much easier for AI to understand than PDFs and Web Documents because of its simplicity.
Document Character Limit | Not applicable |
File Size | Not applicable |
Name-Value Pairs Per Bot | 20 |
Data Sync | Not applicable. Add, edit, or delete up to the maximum. |
Data Retrieval | Semantic Search |
Chat Credits | Each message processed with AI that triggers this data source as a response consumes 1 chat credit |
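Because Google Sheets responses consume 10 chat credits per message while the other three data sources consume 1, it can help to estimate usage before choosing how to store your data. Here is a minimal sketch of that arithmetic; the per-message rates come from the tables in this section, while the dictionary keys, the function name `estimate_credits`, and the message counts in the example are made up.

```python
# Chat credits consumed per AI-processed message, by triggering data source.
CREDITS_PER_MESSAGE = {
    "google_sheets": 10,
    "pdf": 1,
    "web": 1,
    "manual": 1,
}


def estimate_credits(message_counts):
    """Sum credits for a mapping of data source -> number of messages."""
    return sum(
        CREDITS_PER_MESSAGE[source] * count
        for source, count in message_counts.items()
    )


if __name__ == "__main__":
    # Hypothetical month: 50 answers from a sheet, 200 from PDFs.
    print(estimate_credits({"google_sheets": 50, "pdf": 200}))
```

In this hypothetical month, 50 sheet-backed answers cost 500 credits and 200 PDF-backed answers cost 200, for 700 in total.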