ibis-project.org
Open in
urlscan Pro
75.2.60.5
Public Scan
Submitted URL: http://ibis-project.org/
Effective URL: https://ibis-project.org/
Submission: On January 29 via api from US — Scanned from DE
Effective URL: https://ibis-project.org/
Submission: On January 29 via api from US — Scanned from DE
Form analysis
1 forms found in the DOM<form class="vega-bindings"></form>
Text Content
Ibis Search * Getting started * Installation * Tutorial: getting started * Tutorial: Ibis for dplyr users * Tutorial: Ibis for pandas users * Tutorial: Ibis for SQL users * -------------------------------------------------------------------------------- * Data Platforms * Starburst Galaxy * Concepts * Why Ibis? * Composable data ecosystem * Internals * User testimonials * Versioning policy * Who supports Ibis? * Backends * BigQuery * ClickHouse * Dask * DataFusion * Druid * DuckDB * Exasol * Flink * Impala * MSSQL * MySQL * Oracle * pandas * Polars * PostgreSQL * PySpark * Snowflake * SQLite * Trino * Operation support matrix * How-to * Configure * Basic configuration * -------------------------------------------------------------------------------- * Input Output * Basic input/output * Work with multiple backends * -------------------------------------------------------------------------------- * Analytics * Basic analytics * Chaining expressions * -------------------------------------------------------------------------------- * Visualization * Altair + Ibis * matplotlib + Ibis * Plotly + Ibis * plotnine + Ibis * seaborn + Ibis * Streamlit + Ibis * -------------------------------------------------------------------------------- * Extending * Reference built-in functions * Add an elementwise operation * Add a reduction operation * Using SQL strings with Ibis * Reference * Expression API * Table expressions * Column selectors * Generic expressions * Numeric and Boolean expressions * String expressions * Temporal expressions * Collection expressions * Geospatial expressions * -------------------------------------------------------------------------------- * Type system * Data types * Schemas * -------------------------------------------------------------------------------- * UDFs * Scalar UDFs * -------------------------------------------------------------------------------- * Connection APIs * Top-level connection APIs * -------------------------------------------------------------------------------- * Configuration * ContextAdjustment * Interactive * Options * Repr * SQL * Posts * Release notes * Contribute * Contribute * Setting up a development environment * Contribute to the Ibis codebase * Style and formatting * Maintaining the codebase * Test Class Reference * Source code * Report a bug * Report a documentation issue * Submit a feature request * Ask the community for help IBIS the portable Python dataframe library -------------------------------------------------------------------------------- Why Ibis? Installation Tutorial: getting started GitHub Chat RSS INSTALL We recommend starting with the default backend (DuckDB). 1pip install 'ibis-framework[duckdb,examples]' 1 Install Ibis with the DuckDB backend along with examples. Show supported backends BACKENDS Need to use Ibis with a backend that isn’t currently supported? Let us know! You can install Ibis and a supported backend with pip, conda, mamba, or pixi. * pip * conda * mamba * pixi * BigQuery * ClickHouse * Dask * DataFusion * Druid * DuckDB * Exasol * Flink * Impala * MSSQL * MySQL * Oracle * pandas * Polars * PostgreSQL * PySpark * Snowflake * SQLite * Trino Install with the bigquery extra: pip install 'ibis-framework[bigquery]' Connect using ibis.bigquery.connect. Install with the clickhouse extra: pip install 'ibis-framework[clickhouse]' Connect using ibis.clickhouse.connect. Install with the dask extra: pip install 'ibis-framework[dask]' Connect using ibis.dask.connect. Install with the datafusion extra: pip install 'ibis-framework[datafusion]' Connect using ibis.datafusion.connect. Install with the druid extra: pip install 'ibis-framework[druid]' Connect using ibis.druid.connect. Install with the duckdb extra: pip install 'ibis-framework[duckdb]' Connect using ibis.duckdb.connect. Install with the exasol extra: pip install 'ibis-framework[exasol]' Connect using ibis.exasol.connect. Install alongside the apache-flink package: pip install ibis-framework apache-flink Connect using ibis.flink.connect. Install with the impala extra: pip install 'ibis-framework[impala]' Connect using ibis.impala.connect. Install with the mssql extra: pip install 'ibis-framework[mssql]' Connect using ibis.mssql.connect. Install with the mysql extra: pip install 'ibis-framework[mysql]' Connect using ibis.mysql.connect. Install with the oracle extra: pip install 'ibis-framework[oracle]' Connect using ibis.oracle.connect. Install with the pandas extra: pip install 'ibis-framework[pandas]' Connect using ibis.pandas.connect. Install with the polars extra: pip install 'ibis-framework[polars]' Connect using ibis.polars.connect. Install with the postgres extra: pip install 'ibis-framework[postgres]' Connect using ibis.postgres.connect. Install with the pyspark extra: pip install 'ibis-framework[pyspark]' Connect using ibis.pyspark.connect. Install with the snowflake extra: pip install 'ibis-framework[snowflake]' Connect using ibis.snowflake.connect. Install with the sqlite extra: pip install 'ibis-framework[sqlite]' Connect using ibis.sqlite.connect. Install with the trino extra: pip install 'ibis-framework[trino]' Connect using ibis.trino.connect. * BigQuery * ClickHouse * Dask * DataFusion * Druid * DuckDB * Exasol * Flink * Impala * MSSQL * MySQL * Oracle * pandas * Polars * PostgreSQL * PySpark * Snowflake * SQLite * Trino Install the ibis-bigquery package: conda install -c conda-forge ibis-bigquery Connect using ibis.bigquery.connect. Install the ibis-clickhouse package: conda install -c conda-forge ibis-clickhouse Connect using ibis.clickhouse.connect. Install the ibis-dask package: conda install -c conda-forge ibis-dask Connect using ibis.dask.connect. Install the ibis-datafusion package: conda install -c conda-forge ibis-datafusion Connect using ibis.datafusion.connect. Install the ibis-druid package: conda install -c conda-forge ibis-druid Connect using ibis.druid.connect. Install the ibis-duckdb package: conda install -c conda-forge ibis-duckdb Connect using ibis.duckdb.connect. Install the ibis-exasol package: conda install -c conda-forge ibis-exasol Connect using ibis.exasol.connect. PyFlink is not available on conda-forge; please use pip to install the PyFlink backend instead. Install the ibis-impala package: conda install -c conda-forge ibis-impala Connect using ibis.impala.connect. Install the ibis-mssql package: conda install -c conda-forge ibis-mssql Connect using ibis.mssql.connect. Install the ibis-mysql package: conda install -c conda-forge ibis-mysql Connect using ibis.mysql.connect. Install the ibis-oracle package: conda install -c conda-forge ibis-oracle Connect using ibis.oracle.connect. Install the ibis-pandas package: conda install -c conda-forge ibis-pandas Connect using ibis.pandas.connect. Install the ibis-polars package: conda install -c conda-forge ibis-polars Connect using ibis.polars.connect. Install the ibis-postgres package: conda install -c conda-forge ibis-postgres Connect using ibis.postgres.connect. Install the ibis-pyspark package: conda install -c conda-forge ibis-pyspark Connect using ibis.pyspark.connect. Install the ibis-snowflake package: conda install -c conda-forge ibis-snowflake Connect using ibis.snowflake.connect. Install the ibis-sqlite package: conda install -c conda-forge ibis-sqlite Connect using ibis.sqlite.connect. Install the ibis-trino package: conda install -c conda-forge ibis-trino Connect using ibis.trino.connect. * BigQuery * ClickHouse * Dask * DataFusion * Druid * DuckDB * Exasol * Flink * Impala * MSSQL * MySQL * Oracle * pandas * Polars * PostgreSQL * PySpark * Snowflake * SQLite * Trino Install the ibis-bigquery package: mamba install -c conda-forge ibis-bigquery Connect using ibis.bigquery.connect. Install the ibis-clickhouse package: mamba install -c conda-forge ibis-clickhouse Connect using ibis.clickhouse.connect. Install the ibis-dask package: mamba install -c conda-forge ibis-dask Connect using ibis.dask.connect. Install the ibis-datafusion package: mamba install -c conda-forge ibis-datafusion Connect using ibis.datafusion.connect. Install the ibis-druid package: mamba install -c conda-forge ibis-druid Connect using ibis.druid.connect. Install the ibis-duckdb package: mamba install -c conda-forge ibis-duckdb Connect using ibis.duckdb.connect. Install the ibis-exasol package: mamba install -c conda-forge ibis-exasol Connect using ibis.exasol.connect. PyFlink is not available on conda-forge; please use pip to install the PyFlink backend instead. Install the ibis-impala package: mamba install -c conda-forge ibis-impala Connect using ibis.impala.connect. Install the ibis-mssql package: mamba install -c conda-forge ibis-mssql Connect using ibis.mssql.connect. Install the ibis-mysql package: mamba install -c conda-forge ibis-mysql Connect using ibis.mysql.connect. Install the ibis-oracle package: mamba install -c conda-forge ibis-oracle Connect using ibis.oracle.connect. Install the ibis-pandas package: mamba install -c conda-forge ibis-pandas Connect using ibis.pandas.connect. Install the ibis-polars package: mamba install -c conda-forge ibis-polars Connect using ibis.polars.connect. Install the ibis-postgres package: mamba install -c conda-forge ibis-postgres Connect using ibis.postgres.connect. Install the ibis-pyspark package: mamba install -c conda-forge ibis-pyspark Connect using ibis.pyspark.connect. Install the ibis-snowflake package: mamba install -c conda-forge ibis-snowflake Connect using ibis.snowflake.connect. Install the ibis-sqlite package: mamba install -c conda-forge ibis-sqlite Connect using ibis.sqlite.connect. Install the ibis-trino package: mamba install -c conda-forge ibis-trino Connect using ibis.trino.connect. * BigQuery * ClickHouse * Dask * DataFusion * Druid * DuckDB * Exasol * Flink * Impala * MSSQL * MySQL * Oracle * pandas * Polars * PostgreSQL * PySpark * Snowflake * SQLite * Trino Add the ibis-bigquery package: pixi add ibis-bigquery Connect using ibis.bigquery.connect. Add the ibis-clickhouse package: pixi add ibis-clickhouse Connect using ibis.clickhouse.connect. Add the ibis-dask package: pixi add ibis-dask Connect using ibis.dask.connect. Add the ibis-datafusion package: pixi add ibis-datafusion Connect using ibis.datafusion.connect. Add the ibis-druid package: pixi add ibis-druid Connect using ibis.druid.connect. Add the ibis-duckdb package: pixi add ibis-duckdb Connect using ibis.duckdb.connect. Add the ibis-exasol package: pixi add ibis-exasol Connect using ibis.exasol.connect. PyFlink is not available on conda-forge; please use pip to install the PyFlink backend instead. Add the ibis-impala package: pixi add ibis-impala Connect using ibis.impala.connect. Add the ibis-mssql package: pixi add ibis-mssql Connect using ibis.mssql.connect. Add the ibis-mysql package: pixi add ibis-mysql Connect using ibis.mysql.connect. Add the ibis-oracle package: pixi add ibis-oracle Connect using ibis.oracle.connect. Add the ibis-pandas package: pixi add ibis-pandas Connect using ibis.pandas.connect. Add the ibis-polars package: pixi add ibis-polars Connect using ibis.polars.connect. Add the ibis-postgres package: pixi add ibis-postgres Connect using ibis.postgres.connect. Add the ibis-pyspark package: pixi add ibis-pyspark Connect using ibis.pyspark.connect. Add the ibis-snowflake package: pixi add ibis-snowflake Connect using ibis.snowflake.connect. Add the ibis-sqlite package: pixi add ibis-sqlite Connect using ibis.sqlite.connect. Add the ibis-trino package: pixi add ibis-trino Connect using ibis.trino.connect. See the backend support matrix for details on operations supported. Open a feature request if you’d like to see support for an operation in a given backend. If the backend supports it, we’ll do our best to add it quickly! Show quickstart QUICKSTART See the getting started tutorial for a more in-depth introduction to Ibis. Below is a quick overview. 1import ibis import ibis.selectors as s 2ibis.options.interactive = True 3t = ibis.examples.penguins.fetch() 4t.head(3) 1 Ensure you install Ibis first. 2 Use interactive mode for exploratory data analysis (EDA) or demos. 3 Load a dataset from the built-in examples. 4 Display the table. ┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓ ┃ species ┃ island ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex ┃ year ┃ ┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩ │ string │ string │ float64 │ float64 │ int64 │ int64 │ string │ int64 │ ├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤ │ Adelie │ Torgersen │ 39.1 │ 18.7 │ 181 │ 3750 │ male │ 2007 │ │ Adelie │ Torgersen │ 39.5 │ 17.4 │ 186 │ 3800 │ female │ 2007 │ │ Adelie │ Torgersen │ 40.3 │ 18.0 │ 195 │ 3250 │ female │ 2007 │ └─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘ Ibis is a dataframe library with familiar syntax. 1t[10:15] 1 Display a slice of the table. ┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓ ┃ species ┃ island ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex ┃ year ┃ ┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩ │ string │ string │ float64 │ float64 │ int64 │ int64 │ string │ int64 │ ├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤ │ Adelie │ Torgersen │ 37.8 │ 17.1 │ 186 │ 3300 │ NULL │ 2007 │ │ Adelie │ Torgersen │ 37.8 │ 17.3 │ 180 │ 3700 │ NULL │ 2007 │ │ Adelie │ Torgersen │ 41.1 │ 17.6 │ 182 │ 3200 │ female │ 2007 │ │ Adelie │ Torgersen │ 38.6 │ 21.2 │ 191 │ 3800 │ male │ 2007 │ │ Adelie │ Torgersen │ 34.6 │ 21.1 │ 198 │ 4400 │ male │ 2007 │ └─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘ Show analytics ANALYTICS Ibis is built for easy analytics at scale in Python. 1( t.filter(ibis._["body_mass_g"] != None) .group_by(["species", "island"]) .aggregate(count=ibis._.count()) .order_by(ibis.desc("count")) ) 1 Group by species and island, and compute the number of rows in each group. ┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓ ┃ species ┃ island ┃ count ┃ ┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩ │ string │ string │ int64 │ ├───────────┼───────────┼───────┤ │ Gentoo │ Biscoe │ 123 │ │ Chinstrap │ Dream │ 68 │ │ Adelie │ Dream │ 56 │ │ Adelie │ Torgersen │ 51 │ │ Adelie │ Biscoe │ 44 │ └───────────┴───────────┴───────┘ Show EDA + visualization EXPLORATORY DATA ANALYSIS (EDA) AND VISUALIZATION EXPLORATORY DATA ANALYSIS Ibis has built-in methods for exploration and visualization. 1num_species = int(t.select("species").nunique().to_pandas()) 2t["species"].topk(num_species) 1 Compute the number of species in the dataset. 2 Display the top species by count. ┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓ ┃ species ┃ Count(species) ┃ ┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩ │ string │ int64 │ ├───────────┼────────────────┤ │ Adelie │ 152 │ │ Gentoo │ 124 │ │ Chinstrap │ 68 │ └───────────┴────────────────┘ VISUALIZATION Ibis works with any Python plotting library that supports the dataframe interchange protocol. Code 1width = 640 height = 480 1 Set the width and height of the plots. 1grouped = ( t.group_by("species") .aggregate(count=ibis._.count()) .order_by(ibis.desc("count")) ) 2grouped 1 Setup data to plot. 2 Display the table. ┏━━━━━━━━━━━┳━━━━━━━┓ ┃ species ┃ count ┃ ┡━━━━━━━━━━━╇━━━━━━━┩ │ string │ int64 │ ├───────────┼───────┤ │ Adelie │ 152 │ │ Gentoo │ 124 │ │ Chinstrap │ 68 │ └───────────┴───────┘ * Altair * matplotlib * Plotly * plotnine * seaborn pip install altair import altair as alt chart = ( alt.Chart(grouped.to_pandas()) .mark_bar() .encode( x="species", y="count", tooltip=["species", "count"], ) .properties(width=width, height=height) .interactive() ) chart Save as SVGSave as PNGView SourceView Compiled VegaOpen in Vega Editor pip install matplotlib import matplotlib.pyplot as plt chart = grouped.to_pandas().plot.bar( x="species", y="count", figsize=(width / 100, height / 100), ) plt.show() pip install plotly import plotly.express as px chart = px.bar( grouped.to_pandas(), x="species", y="count", width=width, height=height, ) chart AdelieGentooChinstrap020406080100120140160 speciescount plotly-logomark pip install plotnine from plotnine import ggplot, aes, geom_bar, theme chart = ( ggplot( grouped, aes(x="species", y="count"), ) + geom_bar(stat="identity") + theme(figure_size=(width / 100, height / 100)) ) chart <Figure Size: (640 x 480)> pip install seaborn import seaborn as sns chart = sns.barplot( data=grouped.to_pandas(), x="species", y="count", ) chart.figure.set_size_inches(width / 100, height / 100) Show data science DATA SCIENCE Use Ibis with your favorite data science libraries for concise and efficient workflows. 1import ibis.selectors as s 2def transform(t): t = t.mutate( s.across(s.numeric(), {"zscore": lambda x: (x - x.mean()) / x.std()}) ).dropna() return t 3f = transform(t.drop("year")) 4f.select("species", "island", s.contains("zscore")) 1 Import the selectors module. 2 Define a function to transform the table for code reuse (compute z-scores on numeric columns). 3 Apply the function to the table and assign it to a new variable. 4 Display the transformed table. ┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓ ┃ species ┃ island ┃ bill_length_mm_zscore ┃ bill_depth_mm_zscore ┃ flipper_length_mm_zscore ┃ body_mass_g_zscore ┃ ┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩ │ string │ string │ float64 │ float64 │ float64 │ float64 │ ├─────────┼───────────┼───────────────────────┼──────────────────────┼──────────────────────────┼────────────────────┤ │ Adelie │ Torgersen │ -0.883205 │ 0.784300 │ -1.416272 │ -0.563317 │ │ Adelie │ Torgersen │ -0.809939 │ 0.126003 │ -1.060696 │ -0.500969 │ │ Adelie │ Torgersen │ -0.663408 │ 0.429833 │ -0.420660 │ -1.186793 │ │ Adelie │ Torgersen │ -1.322799 │ 1.088129 │ -0.562890 │ -0.937403 │ │ Adelie │ Torgersen │ -0.846572 │ 1.746426 │ -0.776236 │ -0.688012 │ │ Adelie │ Torgersen │ -0.919837 │ 0.328556 │ -1.416272 │ -0.719186 │ │ Adelie │ Torgersen │ -0.864888 │ 1.240044 │ -0.420660 │ 0.590115 │ │ Adelie │ Torgersen │ -0.516876 │ 0.227280 │ -1.345156 │ -1.249141 │ │ Adelie │ Torgersen │ -0.974787 │ 2.050255 │ -0.705121 │ -0.500969 │ │ Adelie │ Torgersen │ -1.707443 │ 1.999617 │ -0.207315 │ 0.247203 │ │ … │ … │ … │ … │ … │ … │ └─────────┴───────────┴───────────────────────┴──────────────────────┴──────────────────────────┴────────────────────┘ pip install scikit-learn 1import plotly.express as px from sklearn.decomposition import PCA 2X = f.select(s.contains("zscore")) 3n_components = 3 pca = PCA(n_components=n_components).fit(X) 4t_pca = ibis.memtable(pca.transform(X)).relabel( {"col0": "pc1", "col1": "pc2", "col2": "pc3"} ) 5f = f.mutate(row_number=ibis.row_number().over()).join( t_pca.mutate(row_number=ibis.row_number().over()), "row_number" ) 6px.scatter_3d( f.to_pandas(), x="pc1", y="pc2", z="pc3", color="species", symbol="island", ) 1 Import data science libraries 2 Select “features” (numeric columns) as X 3 Compute PCA 4 Create a table from the PCA results 5 Join the PCA results to the original table 6 Plot the results species, islandAdelie, TorgersenAdelie, BiscoeAdelie, DreamGentoo, BiscoeChinstrap, Dream plotly-logomark Show input and output INPUT AND OUTPUT Ibis supports a variety of input and output options. DATA PLATFORMS You can connect Ibis to any supported backend to read and write data in backend-native tables. Code con = ibis.duckdb.connect("penguins.ddb") t = con.create_table("penguins", t.to_pyarrow(), overwrite=True) 1con = ibis.duckdb.connect("penguins.ddb") 2t = con.table("penguins") 3t.head(3) 1 Connect to a backend. 2 Load a table. 3 Display the table. ┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓ ┃ species ┃ island ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex ┃ year ┃ ┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩ │ string │ string │ float64 │ float64 │ int64 │ int64 │ string │ int64 │ ├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤ │ Adelie │ Torgersen │ 39.1 │ 18.7 │ 181 │ 3750 │ male │ 2007 │ │ Adelie │ Torgersen │ 39.5 │ 17.4 │ 186 │ 3800 │ female │ 2007 │ │ Adelie │ Torgersen │ 40.3 │ 18.0 │ 195 │ 3250 │ female │ 2007 │ └─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘ 1grouped = ( t.group_by(["species", "island"]) .aggregate(count=ibis._.count()) .order_by(ibis.desc("count")) ) 2con.create_table("penguins_grouped", grouped.to_pyarrow(), overwrite=True) 1 Create a lazily evaluated Ibis expression. 2 Write to a table. ┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓ ┃ species ┃ island ┃ count ┃ ┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩ │ string │ string │ int64 │ ├───────────┼───────────┼───────┤ │ Gentoo │ Biscoe │ 124 │ │ Chinstrap │ Dream │ 68 │ │ Adelie │ Dream │ 56 │ │ Adelie │ Torgersen │ 52 │ │ Adelie │ Biscoe │ 44 │ └───────────┴───────────┴───────┘ FILE FORMATS Depending on the backend, you can read and write data in several file formats. * CSV * Delta Lake * Parquet pip install 'ibis-framework[duckdb]' 1t.to_csv("penguins.csv") 2ibis.read_csv("penguins.csv").head(3) 1 Write the table to a CSV file. Dependent on backend. 2 Read the CSV file into a table. Dependent on backend. ┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓ ┃ species ┃ island ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex ┃ year ┃ ┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩ │ string │ string │ float64 │ float64 │ int64 │ int64 │ string │ int64 │ ├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤ │ Adelie │ Torgersen │ 39.1 │ 18.7 │ 181 │ 3750 │ male │ 2007 │ │ Adelie │ Torgersen │ 39.5 │ 17.4 │ 186 │ 3800 │ female │ 2007 │ │ Adelie │ Torgersen │ 40.3 │ 18.0 │ 195 │ 3250 │ female │ 2007 │ └─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘ pip install 'ibis-framework[duckdb,deltalake]' 1t.to_delta("penguins.delta", mode="overwrite") 2ibis.read_delta("penguins.delta").head(3) 1 Write the table to a Delta Lake table. Dependent on backend. 2 Read the Delta Lake table into a table. Dependent on backend. ┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓ ┃ species ┃ island ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex ┃ year ┃ ┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩ │ string │ string │ float64 │ float64 │ int64 │ int64 │ string │ int64 │ ├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤ │ Adelie │ Torgersen │ 39.1 │ 18.7 │ 181 │ 3750 │ male │ 2007 │ │ Adelie │ Torgersen │ 39.5 │ 17.4 │ 186 │ 3800 │ female │ 2007 │ │ Adelie │ Torgersen │ 40.3 │ 18.0 │ 195 │ 3250 │ female │ 2007 │ └─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘ pip install 'ibis-framework[duckdb]' 1t.to_parquet("penguins.parquet") 2ibis.read_parquet("penguins.parquet").head(3) 1 Write the table to a Parquet file. Dependent on backend. 2 Read the Parquet file into a table. Dependent on backend. ┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓ ┃ species ┃ island ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex ┃ year ┃ ┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩ │ string │ string │ float64 │ float64 │ int64 │ int64 │ string │ int64 │ ├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤ │ Adelie │ Torgersen │ 39.1 │ 18.7 │ 181 │ 3750 │ male │ 2007 │ │ Adelie │ Torgersen │ 39.5 │ 17.4 │ 186 │ 3800 │ female │ 2007 │ │ Adelie │ Torgersen │ 40.3 │ 18.0 │ 195 │ 3250 │ female │ 2007 │ └─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘ WITH OTHER PYTHON LIBRARIES Ibis uses Apache Arrow for efficient data transfer to and from other libraries. Ibis tables implement the __dataframe__ and __array__ protocols, so you can pass them to any library that supports these protocols. * pandas * polars * pyarrow * torch * __dataframe__ * __array__ You can convert Ibis tables to pandas dataframes. pip install pandas 1df = t.to_pandas() df.head(3) 1 Returns a pandas dataframe. species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year 0 Adelie Torgersen 39.1 18.7 181.0 3750.0 male 2007 1 Adelie Torgersen 39.5 17.4 186.0 3800.0 female 2007 2 Adelie Torgersen 40.3 18.0 195.0 3250.0 female 2007 Or you can convert pandas dataframes to Ibis tables. 1t = ibis.memtable(df) t.head(3) 1 Returns an Ibis table. ┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓ ┃ species ┃ island ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex ┃ year ┃ ┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩ │ string │ string │ float64 │ float64 │ float64 │ float64 │ string │ int64 │ ├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤ │ Adelie │ Torgersen │ 39.1 │ 18.7 │ 181.0 │ 3750.0 │ male │ 2007 │ │ Adelie │ Torgersen │ 39.5 │ 17.4 │ 186.0 │ 3800.0 │ female │ 2007 │ │ Adelie │ Torgersen │ 40.3 │ 18.0 │ 195.0 │ 3250.0 │ female │ 2007 │ └─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘ You can convert Ibis tables to Polars dataframes. pip install polars import polars as pl df = pl.from_arrow(t.to_pyarrow()) df.head(3) shape: (3, 8) species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year str str f64 f64 f64 f64 str i64 "Adelie" "Torgersen" 39.1 18.7 181.0 3750.0 "male" 2007 "Adelie" "Torgersen" 39.5 17.4 186.0 3800.0 "female" 2007 "Adelie" "Torgersen" 40.3 18.0 195.0 3250.0 "female" 2007 Or Polars dataframes to Ibis tables. t = ibis.memtable(df) t.head(3) ┏━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓ ┃ col0 ┃ col1 ┃ col2 ┃ col3 ┃ col4 ┃ col5 ┃ col6 ┃ col7 ┃ ┡━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩ │ string │ string │ float64 │ float64 │ float64 │ float64 │ string │ int64 │ ├────────┼───────────┼─────────┼─────────┼─────────┼─────────┼────────┼───────┤ │ Adelie │ Torgersen │ 39.1 │ 18.7 │ 181.0 │ 3750.0 │ male │ 2007 │ │ Adelie │ Torgersen │ 39.5 │ 17.4 │ 186.0 │ 3800.0 │ female │ 2007 │ │ Adelie │ Torgersen │ 40.3 │ 18.0 │ 195.0 │ 3250.0 │ female │ 2007 │ └────────┴───────────┴─────────┴─────────┴─────────┴─────────┴────────┴───────┘ You can convert Ibis tables to PyArrow tables. pip install pyarrow t.to_pyarrow() pyarrow.Table col0: string col1: string col2: double col3: double col4: double col5: double col6: string col7: int64 ---- col0: [["Adelie","Adelie","Adelie","Adelie","Adelie",...,"Chinstrap","Chinstrap","Chinstrap","Chinstrap","Chinstrap"]] col1: [["Torgersen","Torgersen","Torgersen","Torgersen","Torgersen",...,"Dream","Dream","Dream","Dream","Dream"]] col2: [[39.1,39.5,40.3,null,36.7,...,55.8,43.5,49.6,50.8,50.2]] col3: [[18.7,17.4,18,null,19.3,...,19.8,18.1,18.2,19,18.7]] col4: [[181,186,195,null,193,...,207,202,193,210,198]] col5: [[3750,3800,3250,null,3450,...,4000,3400,3775,4100,3775]] col6: [["male","female","female",null,"female",...,"male","female","male","male","female"]] col7: [[2007,2007,2007,2007,2007,...,2009,2009,2009,2009,2009]] Or PyArrow batches: t.to_pyarrow_batches() <pyarrow.lib.RecordBatchReader at 0x7fff72d43cf0> And you can convert PyArrow tables to Ibis tables. ibis.memtable(t.to_pyarrow()).head(3) ┏━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓ ┃ col0 ┃ col1 ┃ col2 ┃ col3 ┃ col4 ┃ col5 ┃ col6 ┃ col7 ┃ ┡━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩ │ string │ string │ float64 │ float64 │ float64 │ float64 │ string │ int64 │ ├────────┼───────────┼─────────┼─────────┼─────────┼─────────┼────────┼───────┤ │ Adelie │ Torgersen │ 39.1 │ 18.7 │ 181.0 │ 3750.0 │ male │ 2007 │ │ Adelie │ Torgersen │ 39.5 │ 17.4 │ 186.0 │ 3800.0 │ female │ 2007 │ │ Adelie │ Torgersen │ 40.3 │ 18.0 │ 195.0 │ 3250.0 │ female │ 2007 │ └────────┴───────────┴─────────┴─────────┴─────────┴─────────┴────────┴───────┘ You can convert Ibis tables to torch tensors. pip install torch t.select(s.numeric()).limit(3).to_torch() {'col2': tensor([39.1000, 39.5000, 40.3000], dtype=torch.float64), 'col3': tensor([18.7000, 17.4000, 18.0000], dtype=torch.float64), 'col4': tensor([181., 186., 195.], dtype=torch.float64), 'col5': tensor([3750., 3800., 3250.], dtype=torch.float64), 'col7': tensor([2007, 2007, 2007], dtype=torch.int16)} You can directly call the __dataframe__ protocol on Ibis tables, though this is typically handled by the library you’re using. t.__dataframe__() <ibis.expr.types.dataframe_interchange.IbisDataFrame at 0x7fff48d5ada0> You can directly call the __array__ protocol on Ibis tables, though this is typically handled by the library you’re using. t.__array__() array([['Adelie', 'Torgersen', 39.1, ..., 3750.0, 'male', 2007], ['Adelie', 'Torgersen', 39.5, ..., 3800.0, 'female', 2007], ['Adelie', 'Torgersen', 40.3, ..., 3250.0, 'female', 2007], ..., ['Chinstrap', 'Dream', 49.6, ..., 3775.0, 'male', 2009], ['Chinstrap', 'Dream', 50.8, ..., 4100.0, 'male', 2009], ['Chinstrap', 'Dream', 50.2, ..., 3775.0, 'female', 2009]], dtype=object) Show SQL + Python SQL + PYTHON Ibis has the ibis.to_sql to generate SQL strings. In a Jupyter notebook or IPython shell session, the output of ibis.to_sql will be syntax highlighted. In a plain Python REPL use print(ibis.to_sql(...)) to pretty print SQL. Ibis uses SQLGlot under the hood to allow passing a dialect parameter to SQL methods. * BigQuery * Snowflake * Oracle * MySQL * MSSQL * PostgreSQL * SQLite * Trino 1dialect = "bigquery" 2sql = ibis.to_sql( grouped, dialect=dialect, ) 3sql 1 Set the dialect. 2 Convert the table to a SQL string. 3 Display the SQL string. SELECT t0.* FROM ( SELECT t1.`species`, t1.`island`, count(1) AS `count` FROM penguins AS t1 GROUP BY 1, 2 ) AS t0 ORDER BY t0.`count` DESC You can chain Ibis expressions and .sql together. 1con.sql(sql, dialect=dialect).filter(ibis._["species"] == "Adelie") 1 Chain .sql calls and Ibis expressions together. ┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓ ┃ species ┃ island ┃ count ┃ ┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩ │ string │ string │ int64 │ ├─────────┼───────────┼───────┤ │ Adelie │ Dream │ 56 │ │ Adelie │ Torgersen │ 52 │ │ Adelie │ Biscoe │ 44 │ └─────────┴───────────┴───────┘ 1dialect = "snowflake" 2sql = ibis.to_sql( grouped, dialect=dialect, ) 3sql 1 Set the dialect. 2 Convert the table to a SQL string. 3 Display the SQL string. SELECT t0.species, t0.island, t0."count" FROM ( SELECT t1.species AS species, t1.island AS island, COUNT(*) AS "count" FROM "main"."penguins" AS t1 GROUP BY 1, 2 ) AS t0 ORDER BY t0."count" DESC You can chain Ibis expressions and .sql together. 1con.sql(sql, dialect=dialect).filter(ibis._["species"] == "Adelie") 1 Chain .sql calls and Ibis expressions together. ┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓ ┃ species ┃ island ┃ count ┃ ┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩ │ string │ string │ int64 │ ├─────────┼───────────┼───────┤ │ Adelie │ Dream │ 56 │ │ Adelie │ Torgersen │ 52 │ │ Adelie │ Biscoe │ 44 │ └─────────┴───────────┴───────┘ 1dialect = "oracle" 2sql = ibis.to_sql( grouped, dialect=dialect, ) 3sql 1 Set the dialect. 2 Convert the table to a SQL string. 3 Display the SQL string. SELECT t0.species, t0.island, t0."count" FROM ( SELECT t1.species AS species, t1.island AS island, COUNT(*) AS "count" FROM penguins t1 GROUP BY t1.species, t1.island ) t0 ORDER BY t0."count" DESC You can chain Ibis expressions and .sql together. 1con.sql(sql, dialect=dialect).filter(ibis._["species"] == "Adelie") 1 Chain .sql calls and Ibis expressions together. ┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓ ┃ species ┃ island ┃ count ┃ ┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩ │ string │ string │ int64 │ ├─────────┼───────────┼───────┤ │ Adelie │ Dream │ 56 │ │ Adelie │ Torgersen │ 52 │ │ Adelie │ Biscoe │ 44 │ └─────────┴───────────┴───────┘ 1dialect = "mysql" 2sql = ibis.to_sql( grouped, dialect=dialect, ) 3sql 1 Set the dialect. 2 Convert the table to a SQL string. 3 Display the SQL string. SELECT t0.species, t0.island, t0.count FROM ( SELECT t1.species AS species, t1.island AS island, COUNT(*) AS count FROM penguins AS t1 GROUP BY 1, 2 ) AS t0 ORDER BY t0.count DESC You can chain Ibis expressions and .sql together. 1con.sql(sql, dialect=dialect).filter(ibis._["species"] == "Adelie") 1 Chain .sql calls and Ibis expressions together. ┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓ ┃ species ┃ island ┃ count ┃ ┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩ │ string │ string │ int64 │ ├─────────┼───────────┼───────┤ │ Adelie │ Dream │ 56 │ │ Adelie │ Torgersen │ 52 │ │ Adelie │ Biscoe │ 44 │ └─────────┴───────────┴───────┘ 1dialect = "mssql" 2sql = ibis.to_sql( grouped, dialect=dialect, ) 3sql 1 Set the dialect. 2 Convert the table to a SQL string. 3 Display the SQL string. SELECT t0.species, t0.island, t0.count FROM ( SELECT t1.species AS species, t1.island AS island, COUNT(*) AS count FROM penguins AS t1 GROUP BY t1.species, t1.island ) AS t0 ORDER BY t0.count DESC You can chain Ibis expressions and .sql together. 1con.sql(sql, dialect=dialect).filter(ibis._["species"] == "Adelie") 1 Chain .sql calls and Ibis expressions together. ┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓ ┃ species ┃ island ┃ count ┃ ┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩ │ string │ string │ int64 │ ├─────────┼───────────┼───────┤ │ Adelie │ Dream │ 56 │ │ Adelie │ Torgersen │ 52 │ │ Adelie │ Biscoe │ 44 │ └─────────┴───────────┴───────┘ 1dialect = "postgres" 2sql = ibis.to_sql( grouped, dialect=dialect, ) 3sql 1 Set the dialect. 2 Convert the table to a SQL string. 3 Display the SQL string. SELECT t0.species, t0.island, t0.count FROM ( SELECT t1.species AS species, t1.island AS island, COUNT(*) AS count FROM penguins AS t1 GROUP BY 1, 2 ) AS t0 ORDER BY t0.count DESC You can chain Ibis expressions and .sql together. 1con.sql(sql, dialect=dialect).filter(ibis._["species"] == "Adelie") 1 Chain .sql calls and Ibis expressions together. ┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓ ┃ species ┃ island ┃ count ┃ ┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩ │ string │ string │ int64 │ ├─────────┼───────────┼───────┤ │ Adelie │ Dream │ 56 │ │ Adelie │ Torgersen │ 52 │ │ Adelie │ Biscoe │ 44 │ └─────────┴───────────┴───────┘ 1dialect = "sqlite" 2sql = ibis.to_sql( grouped, dialect=dialect, ) 3sql 1 Set the dialect. 2 Convert the table to a SQL string. 3 Display the SQL string. SELECT t0.species, t0.island, t0.count FROM ( SELECT t1.species AS species, t1.island AS island, COUNT(*) AS count FROM penguins AS t1 GROUP BY 1, 2 ) AS t0 ORDER BY t0.count DESC You can chain Ibis expressions and .sql together. 1con.sql(sql, dialect=dialect).filter(ibis._["species"] == "Adelie") 1 Chain .sql calls and Ibis expressions together. ┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓ ┃ species ┃ island ┃ count ┃ ┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩ │ string │ string │ int64 │ ├─────────┼───────────┼───────┤ │ Adelie │ Dream │ 56 │ │ Adelie │ Torgersen │ 52 │ │ Adelie │ Biscoe │ 44 │ └─────────┴───────────┴───────┘ 1dialect = "trino" 2sql = ibis.to_sql( grouped, dialect=dialect, ) 3sql 1 Set the dialect. 2 Convert the table to a SQL string. 3 Display the SQL string. SELECT t0.species, t0.island, t0.count FROM ( SELECT t1.species AS species, t1.island AS island, COUNT(*) AS count FROM main.penguins AS t1 GROUP BY 1, 2 ) AS t0 ORDER BY t0.count DESC You can chain Ibis expressions and .sql together. 1con.sql(sql, dialect=dialect).filter(ibis._["species"] == "Adelie") 1 Chain .sql calls and Ibis expressions together. ┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓ ┃ species ┃ island ┃ count ┃ ┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩ │ string │ string │ int64 │ ├─────────┼───────────┼───────┤ │ Adelie │ Dream │ 56 │ │ Adelie │ Torgersen │ 52 │ │ Adelie │ Biscoe │ 44 │ └─────────┴───────────┴───────┘ Back to top * * *