#pydata

Beautiful Leaflet Markers with Folium and Font Awesome

Posted on December 4, 2022 in python · pydata · visualization · til

TIL how to use Font Awesome markers with Folium.

Scale-Aware Rating of Count Forecasts

Posted on December 1, 2022 in pydata · python · meetup

Forecasts crave a rating that reflects the forecast's quality in the context of what is possible in theory and what is reasonable to expect in practice.

Python Support in Snowflake

Posted on November 16, 2022 in python · sql · snowflake · pydata

Snowflake offers different ways to access and call Python from within its compute infrastructure. This post shows how to access Python in user-defined functions, via stored procedures, and in Snowpark.

Azure Synapse SQL On-Demand OPENROWSET Common Table Expression with SQLAlchemy

Posted on September 27, 2020 in python · sql · pydata · azure

Using SQLAlchemy to create OPENROWSET common table expressions for Azure Synapse SQL-on-Demand.

DuckDB vs. Azure Synapse SQL on-demand with Parquet

Posted on May 25, 2020 in python · parquet · pydata · pandas · azure

Inspired by Uwe Korn's post on DuckDB, this post shows how to use Azure Synapse SQL-on-Demand to query Parquet files with T-SQL on a serverless cloud infrastructure.

Using turbodbc to access Azure Synapse SQL-on-demand endpoints

Posted on May 25, 2020 in python · sql · pydata · azure

Azure Synapse SQL-on-Demand offers a web client, the desktop application Azure Data Studio, and ODBC access with turbodbc to query Parquet files in the Azure Data Lake.

Azure Data Explorer and Parquet Files in Azure Blob Storage

Posted on February 1, 2020 in python · pydata · azure · parquet

Last summer Microsoft rebranded the Azure Kusto query engine as Azure Data Explorer. While it does not support fully elastic scaling, it at least lets you scale a cluster up and out via an API or the Azure portal to adapt to different workloads. It also offers Parquet support out of the box, which made me spend some time looking into it.

Understanding Predicate Pushdown at the Row-Group Level in Parquet with PyArrow and Python

Posted on January 19, 2020 in python · pydata · parquet · arrow · pandas

Apache Parquet is a columnar file format for working with gigabytes of data. Reading and writing Parquet files is efficiently exposed to Python with pyarrow. Additional statistics allow clients to use predicate pushdown to read only subsets of the data and reduce I/O. Organizing data by column allows for better compression, as the data is more homogeneous. Better compression also reduces the bandwidth required to read the input.
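To make the idea of predicate pushdown on row-group statistics a bit more concrete, here is a minimal sketch in plain Python. It is a toy model, not pyarrow's API: the `RowGroup` class and the `read_greater_than` helper are hypothetical names invented for illustration, and real Parquet files keep the min/max statistics in the file footer rather than in Python objects.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class RowGroup:
    """Toy stand-in for a Parquet row group (hypothetical, for illustration)."""
    values: List[int]
    # Statistics are computed once at "write" time, as a Parquet writer would.
    min_value: int = field(init=False)
    max_value: int = field(init=False)

    def __post_init__(self) -> None:
        self.min_value = min(self.values)
        self.max_value = max(self.values)


def read_greater_than(row_groups: List[RowGroup], threshold: int) -> Tuple[List[int], int]:
    """Return the values > threshold and the number of row groups scanned."""
    matches: List[int] = []
    scanned = 0
    for group in row_groups:
        # Pushdown step: if the group's maximum cannot satisfy the predicate,
        # skip the whole group without looking at its values.
        if group.max_value <= threshold:
            continue
        scanned += 1
        matches.extend(v for v in group.values if v > threshold)
    return matches, scanned


row_groups = [RowGroup([1, 2, 3]), RowGroup([4, 5, 6]), RowGroup([7, 8, 9])]
matches, scanned = read_greater_than(row_groups, 6)
print(matches, scanned)
```

In this sketch only one of the three row groups is scanned, because the statistics of the other two already rule them out. The same reasoning is what lets a Parquet reader avoid I/O for row groups that cannot match a filter.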

Azure Data Lake Storage Gen2 with Python

Exasol User Group Karlsruhe

Getting Started with the Cloudera Kudu Storage Engine in Python

EuroPython 2015 PySpark - Data Processing in Python on top of Apache Spark

PyData 2015 Berlin - Introduction to the PySpark DataFrame API