> ## Documentation Index
> Fetch the complete documentation index at: https://docs.pixeltable.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Import data from Excel files

> Load XLSX and Excel spreadsheets into Pixeltable tables with sheet selection, header handling, and type inference for analysis pipelines.

<a href="https://kaggle.com/kernels/welcome?src=https://github.com/pixeltable/pixeltable/blob/release/docs/release/howto/cookbooks/data/data-import-excel.ipynb" id="openKaggle" target="_blank" rel="noopener noreferrer"><img src="https://kaggle.com/static/images/open-in-kaggle.svg" alt="Open in Kaggle" style={{ display: 'inline', margin: '0px' }} noZoom /></a>  <a href="https://colab.research.google.com/github/pixeltable/pixeltable/blob/release/docs/release/howto/cookbooks/data/data-import-excel.ipynb" id="openColab" target="_blank" rel="noopener noreferrer"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab" style={{ display: 'inline', margin: '0px' }} noZoom /></a>  <a href="https://raw.githubusercontent.com/pixeltable/pixeltable/refs/tags/release/docs/release/howto/cookbooks/data/data-import-excel.ipynb" id="downloadNotebook" target="_blank" rel="noopener noreferrer"><img src="https://img.shields.io/badge/%E2%AC%87-Download%20Notebook-blue" alt="Download Notebook" style={{ display: 'inline', margin: '0px' }} noZoom /></a>

<Tip>This documentation page is also available as an interactive notebook. You can launch the notebook in
Kaggle or Colab, or download it for use with an IDE or local Jupyter installation, by clicking one of the
above links.</Tip>

export const quartoRawHtml = [`
<table>
<thead>
<tr>
<th>Source</th>
<th>Rows</th>
<th>Use case</th>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: middle;">Sales report.xlsx</td>
<td style="vertical-align: middle;">10K</td>
<td style="vertical-align: middle;">Analyze with AI</td>
</tr>
<tr>
<td style="vertical-align: middle;">Inventory.xlsx</td>
<td style="vertical-align: middle;">5K</td>
<td style="vertical-align: middle;">Enrich with descriptions</td>
</tr>
<tr>
<td style="vertical-align: middle;">Survey results.xlsx</td>
<td style="vertical-align: middle;">1K</td>
<td style="vertical-align: middle;">Sentiment analysis</td>
</tr>
</tbody>
</table>
`, `<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }
    .dataframe tbody tr th {
        vertical-align: top;
    }
    .dataframe thead th {
        text-align: right;
    }
</style>
`, `
<table class="dataframe" data-quarto-postprocess="true" data-border="1">
<thead>
<tr style="text-align: right;">
<th data-quarto-table-cell-role="th"></th>
<th data-quarto-table-cell-role="th">order_id</th>
<th data-quarto-table-cell-role="th">customer</th>
<th data-quarto-table-cell-role="th">product</th>
<th data-quarto-table-cell-role="th">quantity</th>
<th data-quarto-table-cell-role="th">price</th>
<th data-quarto-table-cell-role="th">date</th>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: middle;">0</td>
<td style="vertical-align: middle;">1001</td>
<td style="vertical-align: middle;">Alice</td>
<td style="vertical-align: middle;">Widget A</td>
<td style="vertical-align: middle;">2</td>
<td style="vertical-align: middle;">29.99</td>
<td style="vertical-align: middle;">2024-01-15</td>
</tr>
<tr>
<td style="vertical-align: middle;">1</td>
<td style="vertical-align: middle;">1002</td>
<td style="vertical-align: middle;">Bob</td>
<td style="vertical-align: middle;">Gadget B</td>
<td style="vertical-align: middle;">1</td>
<td style="vertical-align: middle;">149.99</td>
<td style="vertical-align: middle;">2024-01-16</td>
</tr>
<tr>
<td style="vertical-align: middle;">2</td>
<td style="vertical-align: middle;">1003</td>
<td style="vertical-align: middle;">Carol</td>
<td style="vertical-align: middle;">Widget A</td>
<td style="vertical-align: middle;">5</td>
<td style="vertical-align: middle;">29.99</td>
<td style="vertical-align: middle;">2024-01-16</td>
</tr>
<tr>
<td style="vertical-align: middle;">3</td>
<td style="vertical-align: middle;">1004</td>
<td style="vertical-align: middle;">Dave</td>
<td style="vertical-align: middle;">Tool C</td>
<td style="vertical-align: middle;">3</td>
<td style="vertical-align: middle;">79.99</td>
<td style="vertical-align: middle;">2024-01-17</td>
</tr>
<tr>
<td style="vertical-align: middle;">4</td>
<td style="vertical-align: middle;">1005</td>
<td style="vertical-align: middle;">Eve</td>
<td style="vertical-align: middle;">Gadget B</td>
<td style="vertical-align: middle;">2</td>
<td style="vertical-align: middle;">149.99</td>
<td style="vertical-align: middle;">2024-01-18</td>
</tr>
</tbody>
</table>
`, `
</div>`, `
<table class="dataframe" data-quarto-postprocess="true" data-border="1">
<thead>
<tr style="text-align: right;">
<th data-quarto-table-cell-role="th">order_id</th>
<th data-quarto-table-cell-role="th">customer</th>
<th data-quarto-table-cell-role="th">product</th>
<th data-quarto-table-cell-role="th">quantity</th>
<th data-quarto-table-cell-role="th">price</th>
<th data-quarto-table-cell-role="th">date</th>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: middle;">1001</td>
<td style="vertical-align: middle;">Alice</td>
<td style="vertical-align: middle;">Widget A</td>
<td style="vertical-align: middle;">2</td>
<td style="vertical-align: middle;">29.99</td>
<td style="vertical-align: middle;">2024-01-15</td>
</tr>
<tr>
<td style="vertical-align: middle;">1002</td>
<td style="vertical-align: middle;">Bob</td>
<td style="vertical-align: middle;">Gadget B</td>
<td style="vertical-align: middle;">1</td>
<td style="vertical-align: middle;">149.99</td>
<td style="vertical-align: middle;">2024-01-16</td>
</tr>
<tr>
<td style="vertical-align: middle;">1003</td>
<td style="vertical-align: middle;">Carol</td>
<td style="vertical-align: middle;">Widget A</td>
<td style="vertical-align: middle;">5</td>
<td style="vertical-align: middle;">29.99</td>
<td style="vertical-align: middle;">2024-01-16</td>
</tr>
<tr>
<td style="vertical-align: middle;">1004</td>
<td style="vertical-align: middle;">Dave</td>
<td style="vertical-align: middle;">Tool C</td>
<td style="vertical-align: middle;">3</td>
<td style="vertical-align: middle;">79.99</td>
<td style="vertical-align: middle;">2024-01-17</td>
</tr>
<tr>
<td style="vertical-align: middle;">1005</td>
<td style="vertical-align: middle;">Eve</td>
<td style="vertical-align: middle;">Gadget B</td>
<td style="vertical-align: middle;">2</td>
<td style="vertical-align: middle;">149.99</td>
<td style="vertical-align: middle;">2024-01-18</td>
</tr>
</tbody>
</table>
`, `
<table class="dataframe" data-quarto-postprocess="true" data-border="1">
<thead>
<tr style="text-align: right;">
<th data-quarto-table-cell-role="th">order_id</th>
<th data-quarto-table-cell-role="th">customer</th>
<th data-quarto-table-cell-role="th">product</th>
<th data-quarto-table-cell-role="th">quantity</th>
<th data-quarto-table-cell-role="th">price</th>
<th data-quarto-table-cell-role="th">total</th>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: middle;">1001</td>
<td style="vertical-align: middle;">Alice</td>
<td style="vertical-align: middle;">Widget A</td>
<td style="vertical-align: middle;">2</td>
<td style="vertical-align: middle;">29.99</td>
<td style="vertical-align: middle;">59.98</td>
</tr>
<tr>
<td style="vertical-align: middle;">1002</td>
<td style="vertical-align: middle;">Bob</td>
<td style="vertical-align: middle;">Gadget B</td>
<td style="vertical-align: middle;">1</td>
<td style="vertical-align: middle;">149.99</td>
<td style="vertical-align: middle;">149.99</td>
</tr>
<tr>
<td style="vertical-align: middle;">1003</td>
<td style="vertical-align: middle;">Carol</td>
<td style="vertical-align: middle;">Widget A</td>
<td style="vertical-align: middle;">5</td>
<td style="vertical-align: middle;">29.99</td>
<td style="vertical-align: middle;">149.95</td>
</tr>
<tr>
<td style="vertical-align: middle;">1004</td>
<td style="vertical-align: middle;">Dave</td>
<td style="vertical-align: middle;">Tool C</td>
<td style="vertical-align: middle;">3</td>
<td style="vertical-align: middle;">79.99</td>
<td style="vertical-align: middle;">239.97</td>
</tr>
<tr>
<td style="vertical-align: middle;">1005</td>
<td style="vertical-align: middle;">Eve</td>
<td style="vertical-align: middle;">Gadget B</td>
<td style="vertical-align: middle;">2</td>
<td style="vertical-align: middle;">149.99</td>
<td style="vertical-align: middle;">299.98</td>
</tr>
</tbody>
</table>
`, `
<table>
<colgroup>
<col style="width: 47%" />
<col style="width: 52%" />
</colgroup>
<thead>
<tr>
<th>Method</th>
<th>Example</th>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: middle;">With source_format hint</td>
<td style="vertical-align: middle;"><code>pxt.create_table('t', source=path, source_format='excel')</code></td>
</tr>
<tr>
<td style="vertical-align: middle;">Auto-detect from .xlsx</td>
<td style="vertical-align: middle;"><code>pxt.create_table('t', source='data/xlsx')</code></td>
</tr>
</tbody>
</table>
`, `
<table>
<thead>
<tr>
<th>Option</th>
<th>Purpose</th>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: middle;"><code>sheet_name</code></td>
<td style="vertical-align: middle;">Select specific sheet</td>
</tr>
<tr>
<td style="vertical-align: middle;"><code>skiprows</code></td>
<td style="vertical-align: middle;">Skip header rows</td>
</tr>
<tr>
<td style="vertical-align: middle;"><code>usecols</code></td>
<td style="vertical-align: middle;">Select specific columns</td>
</tr>
<tr>
<td style="vertical-align: middle;"><code>dtype</code></td>
<td style="vertical-align: middle;">Force column types</td>
</tr>
</tbody>
</table>
`];

Load data from Excel spreadsheets (.xlsx) into Pixeltable tables.

## Problem

You have data in Excel format that needs to be loaded for AI
processing—reports, inventory lists, or business data exported from
other systems.

<div style={{ 'margin': '0px 20px 0px 20px' }} dangerouslySetInnerHTML={{ __html: quartoRawHtml[0] }} />

## Solution

**What’s in this recipe:**

* Import Excel files directly into tables
* Handle multiple sheets
* Override column types when needed

You use `pxt.create_table()` with an Excel file path as the `source`
parameter. Pixeltable infers column types automatically.

### Setup

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
%pip install -qU pixeltable openpyxl pandas
```

### Create sample Excel file

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
import pandas as pd
import pixeltable as pxt
import tempfile
from pathlib import Path

# Create sample Excel file for demo
sample_data = pd.DataFrame(
    {
        'order_id': [1001, 1002, 1003, 1004, 1005],
        'customer': ['Alice', 'Bob', 'Carol', 'Dave', 'Eve'],
        'product': [
            'Widget A',
            'Gadget B',
            'Widget A',
            'Tool C',
            'Gadget B',
        ],
        'quantity': [2, 1, 5, 3, 2],
        'price': [29.99, 149.99, 29.99, 79.99, 149.99],
        'date': [
            '2024-01-15',
            '2024-01-16',
            '2024-01-16',
            '2024-01-17',
            '2024-01-18',
        ],
    }
)

# Save to temp Excel file
temp_dir = tempfile.mkdtemp()
excel_path = Path(temp_dir) / 'orders.xlsx'
sample_data.to_excel(excel_path, index=False)
sample_data
```

<div style={{ 'margin': '0px 20px 0px 20px' }} dangerouslySetInnerHTML={{ __html: quartoRawHtml[1] }} />

<div style={{ 'margin': '0px 20px 0px 20px' }} dangerouslySetInnerHTML={{ __html: quartoRawHtml[2] }} />

<div style={{ 'margin': '0px 20px 0px 20px' }} dangerouslySetInnerHTML={{ __html: quartoRawHtml[3] }} />

### Import Excel file

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
# Create a fresh directory
pxt.drop_dir('excel_demo', force=True)
pxt.create_dir('excel_demo')
```

<pre style={{ 'margin': '-20px 20px 0px 20px', 'padding': '0px', 'background-color': 'transparent', 'color': 'black' }}>
  Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/pjlb/.pixeltable/pgdata
  Created directory 'excel\_demo'.
  \<pixeltable.catalog.dir.Dir at 0x30b6cb730>
</pre>

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
# Import Excel file directly
orders = pxt.create_table(
    'excel_demo/orders',
    source=str(excel_path),
    source_format='excel',  # Hint for Excel format
)
```

<pre style={{ 'margin': '-20px 20px 0px 20px', 'padding': '0px', 'background-color': 'transparent', 'color': 'black' }}>
  Created table 'orders'.

  Inserting rows into \`orders\`: 0 rows \[00:00, ? rows/s]
  Inserting rows into \`orders\`: 5 rows \[00:00, 501.21 rows/s]
  Inserted 5 rows with 0 errors.
</pre>

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
# View imported data
orders.collect()
```

<div style={{ 'margin': '0px 20px 0px 20px' }} dangerouslySetInnerHTML={{ __html: quartoRawHtml[4] }} />

### Add computed columns

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
# Add computed column for order total
orders.add_computed_column(total=orders.quantity * orders.price)
```

<pre style={{ 'margin': '-20px 20px 0px 20px', 'padding': '0px', 'background-color': 'transparent', 'color': 'black' }}>
  Added 5 column values with 0 errors.
  5 rows updated, 10 values computed.
</pre>

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
# View with computed total
orders.select(
    orders.order_id,
    orders.customer,
    orders.product,
    orders.quantity,
    orders.price,
    orders.total,
).collect()
```

<div style={{ 'margin': '0px 20px 0px 20px' }} dangerouslySetInnerHTML={{ __html: quartoRawHtml[5] }} />

## Explanation

**Import methods:**

<div style={{ 'margin': '0px 20px 0px 20px' }} dangerouslySetInnerHTML={{ __html: quartoRawHtml[6] }} />

**Excel-specific options:**

Pass Pandas `read_excel` arguments via `extra_args`:

```python theme={"theme":{"light":"light-plus","dark":"dark-plus"}}
pxt.create_table(
    'table_name',
    source='data.xlsx',
    source_format='excel',
    extra_args={'sheet_name': 'Sheet2', 'skiprows': 1}
)
```

**Common extra\_args:**

<div style={{ 'margin': '0px 20px 0px 20px' }} dangerouslySetInnerHTML={{ __html: quartoRawHtml[7] }} />

## See also

* [Import CSV
  files](/howto/cookbooks/data/data-import-csv) -
  CSV and tabular data
* [Import Parquet
  files](/howto/cookbooks/data/data-import-parquet) -
  Columnar data
