Pinned | Alireza Sadeghi | DuckDB Beyond the Hype | A Powerful Addition to the Data Scientist’s and Data Engineer’s Toolbox | Sep 18
Pinned | Alireza Sadeghi | Open Source Data Engineering Landscape 2024 | An exploration of open source software in the data engineering ecosystem | Feb 4
Alireza Sadeghi | Building a High-Performance Data Pipeline Using DuckDB | Using DuckDB to Serialise, Transform, and Aggregate Data in Data Lakes | Oct 20
Alireza Sadeghi | The History and Evolution of Open Table Formats | From Hive to High Performance: A Journey Through the Evolution of Data Management on Data Lakes | Aug 23
Alireza Sadeghi | How to build a dual Incremental + snapshot data ingestion pipeline | A useful batch data ingestion pattern that maximises data correctness and reliability while providing low-latency access | Oct 1, 2023
Alireza Sadeghi | Techniques For Periodically Extracting Data From Relational Databases | Presenting techniques for extracting data from relational databases when building ETL pipelines for a data lake, DWH or data lakehouse | Sep 19, 2023
Alireza Sadeghi | Techniques for Managing Dependency Between Data Pipelines | Managing dependencies between data pipelines is a common challenge on data-driven systems and analytical platforms that have data… | Aug 29, 2023
Alireza Sadeghi | Internal Storage Design of Modern Key-value Database Engines [Part 1] | A deep dive into the physical storage design implemented by many popular modern key-value stores such as Amazon DynamoDB, Apache Cassandra, and Riak | Aug 14, 2023
Alireza Sadeghi | Airflow callbacks to Slack notifications for DAG monitoring and alerting | In this post I’ll give a step-by-step guide to integrating Airflow workflows with Slack for notification and monitoring purposes. The… | Jul 23, 2023
Alireza Sadeghi in Towards Dev | Adding Custom Country Map to Apache Superset | In this post I demonstrate the steps to add a custom country map to the Superset repository and rebuild the app. | Jul 12, 2023