HELLO!

I'm Okpe Chiamaka

A Data Engineer

About


I'm a data professional passionate about turning raw data into actionable insights and automated workflows. I specialize in end-to-end data engineering, building cloud-based pipelines with AWS and Prefect, and developing ETL and machine learning solutions in Python.

Technical Skills: AWS, Prefect, Python, ETL, Data Wrangling, Machine Learning, Cloud Integration, Pipeline Orchestration, Network Analysis, Pandas, NumPy

View Resume

What I Do

Data Engineering

Building pipelines, ETL systems, and cloud solutions with AWS, Python, and more.

Pipeline Orchestration

Orchestrating ETL workflows using Airflow and Prefect.

Data Analysis

Providing actionable insights from complex datasets using SQL, Python, and dashboards.

My Projects

Data Engineering
Building a Modern Real Estate Analytics Pipeline: From Redfin to Power BI

In this project, I built an end-to-end data pipeline that starts with Apache Airflow running on an AWS EC2 instance. Using Python with pandas, I extracted the raw TSV files in chunks to avoid memory overload and cleaned up messy fields (a minimal sketch of this step follows below). The transformed data then flows into AWS S3, where I set up Snowpipe so that Snowflake automatically ingests new files the moment they land, with no manual triggers needed. On the warehouse side, I designed the Snowflake tables to match the cleaned data precisely. All my development happened in VS Code connected directly to the EC2 instance, which let me iterate quickly.

Stacks Used: Apache Airflow, AWS EC2, Python, SQL, Snowflake, Snowpipe, AWS S3, Power BI, Jupyter Notebook
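As an illustration of the extraction step, here is a minimal sketch of the chunked TSV read and S3 upload. The file paths, bucket name, and column names are hypothetical stand-ins for the real Redfin fields and Airflow task code:

```python
import pandas as pd
import boto3

# Hypothetical paths and bucket; the real pipeline ran inside an Airflow DAG on EC2.
SOURCE_TSV = "redfin_market_tracker.tsv"
CLEAN_FILE = "redfin_clean.csv"
BUCKET = "my-redfin-raw-bucket"

def extract_and_clean(source: str, dest: str) -> None:
    """Read the raw TSV in chunks to avoid memory overload, clean, and write out."""
    chunks = []
    for chunk in pd.read_csv(source, sep="\t", chunksize=100_000):
        chunk = chunk.dropna(subset=["median_sale_price"])  # hypothetical messy field
        chunk["period_end"] = pd.to_datetime(chunk["period_end"], errors="coerce")
        chunks.append(chunk)
    pd.concat(chunks, ignore_index=True).to_csv(dest, index=False)

def upload_to_s3(path: str, bucket: str, key: str) -> None:
    """Land the cleaned file in S3, where Snowpipe picks it up automatically."""
    boto3.client("s3").upload_file(path, bucket, key)

if __name__ == "__main__":
    extract_and_clean(SOURCE_TSV, CLEAN_FILE)
    upload_to_s3(CLEAN_FILE, BUCKET, "raw/redfin_clean.csv")
```

Chunked reads keep memory flat regardless of file size, and landing files in a fixed S3 prefix is what lets Snowpipe's event-based ingestion take over from there.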

Data Engineering
End-to-End Data Engineering with AWS

This project involved creating a scalable AWS-based data pipeline to facilitate data-driven decisions, using S3, Glue, Athena, and Jupyter for data processing and modeling. The cleaned data was structured into a star schema, loaded into Redshift, and visualized in Power BI to extract key insights. The work highlights my expertise in developing cloud-based pipelines that empower analytics teams to derive business value from complex datasets.

Stacks Used: AWS S3, AWS Athena, AWS Glue, Python, SQL, Jupyter Notebook, Redshift, Power BI
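For a flavor of how the Glue-catalogued data can be queried from code, here is a minimal boto3 sketch that runs an Athena query and polls for completion. The database, table, and output location are hypothetical placeholders, not the project's actual names:

```python
import time
import boto3

# Hypothetical names standing in for the Glue Data Catalog objects.
DATABASE = "sales_dw"
OUTPUT = "s3://my-athena-results/queries/"
QUERY = "SELECT product_key, SUM(amount) AS revenue FROM fact_sales GROUP BY product_key"

athena = boto3.client("athena")

def run_query(sql: str) -> str:
    """Start an Athena query against the Glue Data Catalog and wait for it to finish."""
    qid = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": DATABASE},
        ResultConfiguration={"OutputLocation": OUTPUT},
    )["QueryExecutionId"]
    while True:
        state = athena.get_query_execution(QueryExecutionId=qid)["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(2)  # poll until the query reaches a terminal state
    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")
    return qid

if __name__ == "__main__":
    print("Results written for query:", run_query(QUERY))
```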

Data Engineering
Orchestrating Data Pipelines with Prefect

This pipeline automates data movement from a local system to the cloud, using Prefect to orchestrate each step efficiently. It starts by uploading a CSV file to S3, then ingests the data into Snowflake with a SQL COPY command, and finally triggers a dbt Cloud job for transformations (a condensed sketch follows below). Each step is managed as a Prefect task with built-in retry logic and error handling for robustness. The pipeline uses Prefect's configuration blocks for secure, reusable connections, integrating S3, Snowflake, and dbt Cloud into a seamless, scalable workflow.

Stacks Used: AWS S3, Prefect, Snowflake, dbt, AWS IAM, Python, SQL, Git
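A condensed sketch of the flow, assuming Prefect 2.x; the credentials, account IDs, stage name, and table names are placeholders, and the real pipeline loads them from Prefect blocks and secrets rather than hard-coding them:

```python
import boto3
import requests
import snowflake.connector
from prefect import flow, task

# Placeholder identifiers; real values come from Prefect blocks/secrets.
BUCKET, KEY = "my-staging-bucket", "raw/bank_marketing.csv"
DBT_ACCOUNT, DBT_JOB, DBT_TOKEN = "12345", "67890", "dbt-cloud-token"

@task(retries=3, retry_delay_seconds=30)
def upload_csv(local_path: str) -> None:
    """Stage the local CSV file in S3."""
    boto3.client("s3").upload_file(local_path, BUCKET, KEY)

@task(retries=3, retry_delay_seconds=30)
def copy_into_snowflake() -> None:
    """Ingest the staged file with a SQL COPY command via an external stage."""
    conn = snowflake.connector.connect(
        account="my_account", user="loader", password="***",
        warehouse="LOAD_WH", database="RAW", schema="BANK",
    )
    conn.cursor().execute(
        "COPY INTO bank_marketing FROM @s3_stage/raw/bank_marketing.csv "
        "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)"
    )
    conn.close()

@task(retries=2, retry_delay_seconds=60)
def trigger_dbt_job() -> None:
    """Kick off the dbt Cloud job that runs the transformations."""
    url = f"https://cloud.getdbt.com/api/v2/accounts/{DBT_ACCOUNT}/jobs/{DBT_JOB}/run/"
    resp = requests.post(url, headers={"Authorization": f"Token {DBT_TOKEN}"},
                         json={"cause": "Triggered by Prefect flow"})
    resp.raise_for_status()

@flow
def local_to_cloud_pipeline(local_path: str = "bank_marketing.csv") -> None:
    upload_csv(local_path)
    copy_into_snowflake()
    trigger_dbt_job()

if __name__ == "__main__":
    local_to_cloud_pipeline()
```

Wrapping each stage as a task means a transient S3 or Snowflake failure retries on its own schedule instead of failing the whole run.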

Data Analysis
SellCheapy Retail Analysis

This project investigated declining sales at SellCheapy Retail by analyzing a complex SQL Server database with over 90 tables, focusing on customer behavior and transaction trends. Key steps included parsing XML data for demographics, merging transaction records, and assessing delivery delays, culminating in Tableau visualizations that pinpointed a sales decline starting in May 2013. The analysis not only improved my SQL and data wrangling skills but also highlighted how structured data exploration can reveal critical business issues, such as potential mismatches in customer preferences or product strategy.

Stacks Used: Microsoft SQL Server, Tableau, SQL
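The XML parsing in this project happened inside SQL Server, but the equivalent flattening step looks like this in Python; the sample fragment and field names are invented for illustration:

```python
import xml.etree.ElementTree as ET
import pandas as pd

# A made-up demographics fragment in the spirit of the XML columns in the database.
SAMPLE_XML = """
<IndividualSurvey>
  <TotalPurchaseYTD>8248.99</TotalPurchaseYTD>
  <BirthDate>1986-04-08</BirthDate>
  <Occupation>Professional</Occupation>
  <NumberChildrenAtHome>2</NumberChildrenAtHome>
</IndividualSurvey>
"""

def parse_demographics(xml_text: str) -> dict:
    """Flatten one XML survey record into a plain dict of fields."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}

# Parsing a column of XML strings into a tidy DataFrame for analysis.
records = [parse_demographics(SAMPLE_XML)]
df = pd.DataFrame(records)
df["TotalPurchaseYTD"] = df["TotalPurchaseYTD"].astype(float)
print(df.head())
```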

Data Engineering
Zentel Network Performance

At the 2022 DataFest Hackathon, my team analyzed Zentel's complaint response data, uncovering significant delays: 22-second response times and 2-hour resolutions, particularly during peak evening hours. We recommended optimizing operations between 6 and 9 PM, restructuring teams, and hiring and training staff to boost efficiency. Our data-driven dashboard and actionable insights earned us 1st Runner-Up while helping Zentel improve its service performance.

Stack Used: Power BI

Data Engineering
Data Wrangling with Udacity

This project involved wrangling and cleaning diverse dog-rating data from three sources (a CSV, a TSV, and Twitter's API via Tweepy), focused on the viral '@WeRateDogs' account. Using Python (pandas, Tweepy) and Excel, I standardized timestamps, merged dog categories, and filtered noise to reveal insights like top dog breeds and audience engagement trends (see the Tweepy sketch below). The Udacity-approved outcome honed my skills in API integration, data cleaning, and exploratory analysis, demonstrating end-to-end data handling from raw sources to actionable insights.

Stacks Used: Twitter's API, Python, Microsoft Excel, Jupyter Notebook
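A minimal sketch of the Tweepy retrieval step, written against the tweepy 3.x API; the keys and tweet IDs below are placeholders:

```python
import pandas as pd
import tweepy

# Placeholder credentials; real keys came from a Twitter developer account.
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
api = tweepy.API(auth, wait_on_rate_limit=True)

tweet_ids = [892420643555336193, 892177421306343426]  # sample IDs for illustration
rows = []
for tweet_id in tweet_ids:
    try:
        status = api.get_status(tweet_id, tweet_mode="extended")
        rows.append({
            "tweet_id": tweet_id,
            "retweets": status.retweet_count,
            "favorites": status.favorite_count,
        })
    except tweepy.TweepError:  # deleted or protected tweets are skipped
        continue

engagement = pd.DataFrame(rows)
print(engagement.head())
```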

Data Analysis
Adventure Works Production Analysis

This Power BI project analyzed Adventure Works’ production data to identify inefficiencies in manufacturing operations, including scrap patterns, delays, and output trends. After cleaning and transforming multi-table datasets using Power Query and DAX, I built a dynamic data model with custom measures and time intelligence for granular analysis. The interactive dashboard featured an executive KPI summary, Key Influencers visuals pinpointing scrap drivers, and drill-down capabilities by product and location.

Stack Used: Power BI

Contact

Get In Touch!