Time Zones
Time Zones in Pixeltable
Because typical use cases involve datasets that span multiple time zones, Pixeltable strives to be precise in how it handles time zone arithmetic for datetimes.
Timestamps are always stored in the Pixeltable database in UTC, to ensure consistency across datasets and deployments. Time zone considerations therefore apply during insertion and retrieval of timestamp data.
%pip install -qU pixeltable
The Default Time Zone
Every Pixeltable deployment has a default time zone. The default time zone can be configured either by setting the PIXELTABLE_TIME_ZONE
environment variable, or by adding a time-zone
entry to the [pixeltable]
section in $PIXELTABLE_HOME/config.toml
. It must be a valid IANA Time Zone.
(See the Pixeltable Configuration guide for more details on configuration options.)
import os
os.environ['PIXELTABLE_TIME_ZONE'] = 'America/Los_Angeles'
If no time zone is configured, then Pixeltable will fall back on the system time zone of the host on which it is running. Because system time zone is deployment-dependent, it is recommended that production deployments configure a default time zone explicitly.
As outlined in the Python datetime documentation, a Python datetime
object may be either naive (no time zone) or aware (equipped with an explicit time zone). Pixeltable will always interpret naive datetime
objects as belonging to the configured default time zone.
Insertion and Retrieval
When a datetime
is inserted into the database, it will be converted to UTC and stored as an absolute timestamp. If the datetime
has an explicit time zone, Pixeltable will use that time zone for the conversion; otherwise, Pixeltable will use the default time zone.
When a datetime
is retrieved, it will always be retrieved in the default time zone. To query in a different time zone, it is necessary to do an explicit conversion; we'll give an example of this in a moment. Let's first walk through a few examples that illustrate the default behavior.
import pixeltable as pxt
pxt.drop_dir('tz_demo', force=True)
pxt.create_dir('tz_demo')
t = pxt.create_table('tz_demo.example', {'dt': pxt.Timestamp, 'note': pxt.String})
Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/asiegel/.pixeltable/pgdata
Created directory `tz_demo`.
Created table `example`.
from datetime import datetime, timezone
from zoneinfo import ZoneInfo
naive_dt = datetime(2024, 8, 9, 23, 0, 0)
explicit_dt = datetime(2024, 8, 9, 23, 0, 0, tzinfo=ZoneInfo('America/Los_Angeles'))
other_dt = datetime(2024, 8, 9, 23, 0, 0, tzinfo=ZoneInfo('America/New_York'))
t.insert([
{'dt': naive_dt, 'note': 'No time zone specified (uses default)'},
{'dt': explicit_dt, 'note': 'Time zone America/Los_Angeles was specified explicitly'},
{'dt': other_dt, 'note': 'Time zone America/New_York was specified explicitly'}
])
Inserting rows into `example`: 3 rows [00:00, 842.91 rows/s]
Inserted 3 rows with 0 errors.
UpdateStatus(num_rows=3, num_computed_values=3, num_excs=0, updated_cols=[], cols_with_excs=[])
On retrieval, all timestamps are normalized to the default time zone, regardless of how they were specified during insertion.
t.collect()
dt | note |
---|---|
2024-08-09 23:00:00-07:00 | No time zone specified (uses default) |
2024-08-09 23:00:00-07:00 | Time zone America/Los_Angeles was specified explicitly |
2024-08-09 20:00:00-07:00 | Time zone America/New_York was specified explicitly |
To represent timestamps in a different time zone, use the astimezone
method.
t.select(t.dt, dt_new_york=t.dt.astimezone('America/New_York'), note=t.note).collect()
dt | dt_new_york | note |
---|---|---|
2024-08-09 23:00:00-07:00 | 2024-08-10 02:00:00-04:00 | No time zone specified (uses default) |
2024-08-09 23:00:00-07:00 | 2024-08-10 02:00:00-04:00 | Time zone America/Los_Angeles was specified explicitly |
2024-08-09 20:00:00-07:00 | 2024-08-09 23:00:00-04:00 | Time zone America/New_York was specified explicitly |
Timestamp Methods and Properties
The Pixeltable API exposes all the standard datetime
methods and properties from the Python library. Because retrieval uses the default time zone, they are all relative to the default time zone unless astimezone
is used.
t.select(
t.dt,
day_default=t.dt.day,
day_eastern=t.dt.astimezone('America/New_York').day
).collect()
dt | day_default | day_eastern |
---|---|---|
2024-08-09 23:00:00-07:00 | 9 | 10 |
2024-08-09 23:00:00-07:00 | 9 | 10 |
2024-08-09 20:00:00-07:00 | 9 | 9 |
Observe that the first two timestamps map to different dates depending on the time zone, as expected.
Updated about 2 months ago