Tools

Software and applications I use daily

Visual Studio Code

Primary code editor with Python and SQL extensions.

iTerm2

Terminal emulator with split panes and search.

Git

Version control and collaboration.

Docker

Containerizing data applications and local development environments.

GitHub Actions

CI/CD for data pipelines and automated testing.

Tech Stack

Technologies and frameworks I work with

Python

Primary language for data pipelines and scripting.

PySpark

Distributed data processing on large datasets.

SQL

Querying, transforming, and modeling data.

Apache Airflow

Orchestrating and scheduling data pipelines.

dbt

Transforming and modeling data in the warehouse.

AWS Glue

Serverless ETL and data cataloging.

Amazon Athena

Interactive SQL queries on S3 data.

Amazon Redshift

Data warehousing and analytics.

Bash

Automation and infrastructure scripting.

Environment

Operating system and cloud workspace setup

macOS

Primary development environment.

AWS Console

Managing cloud resources, Glue jobs, and Redshift clusters.