Currently accepting engagements – Data Engineering · AI Systems · Analytics Infrastructure

I engineer data systems that work under pressure – production ETL pipelines, governance‑aware analytics, and AI applications built for industries where accuracy isn’t optional.

Python · dbt · Airflow · LangChain · RAG · FastAPI – available for remote and hybrid engagements globally.

Rare combination: Data engineering depth + criminology/investigative thinking. I build systems that are accurate, auditable, and built to withstand scrutiny – ideal for fintech, govtech, compliance, and regulated industries.

8+
Production Systems
95%
Compliance Time Saved
6
Industries Served
Python · SQL · PySpark · dbt · Airflow · LangChain · FastAPI · PostgreSQL · Kafka
Technical Skills

What I Bring to the Table

Production‑ready skills across the modern data stack.

Core Data Engineering

Python · SQL · PySpark · dbt · Apache Airflow · Kafka

AI & LLMs

LangChain · RAG Systems · Weaviate · OpenAI API · Hugging Face

Databases & Warehouses

PostgreSQL · PostGIS · ClickHouse · MongoDB

Backend & APIs

FastAPI · REST · GraphQL

Monitoring & Governance

Grafana · Prometheus · Data Quality Frameworks · Audit Logging
Why I'm Different

What You Get Beyond the Tech Stack

Three edges that make me a better hire for data‑intensive roles.

Governance‑First Engineering

Criminology background means I build data systems that understand compliance, audit trails, and regulatory risk from day one. No “retrofit governance” later.

Africa‑Context Systems

4 years building for Kenya and East Africa means I understand infrastructure constraints, local regulatory frameworks, and real‑world business needs.

Ship, Don't Just Code

8 production systems deployed, not 8 GitHub repos. I build systems that go live, stay live, and deliver measurable business impact.

Production Systems

Flagship Systems

End-to-end intelligent systems demonstrating architecture decisions, pipeline design, and real-world impact.

Bank Compliance Analytics Dashboard
GOVERNANCE · ANALYTICS

Analytics Engineer

Problem: A Kenyan financial institution's compliance team spent 40+ hours each month manually producing regulatory reports from fragmented data sources, with no automated alerting when KPIs approached breach thresholds.

Solution: Power BI dashboards with dbt transformations. Real-time compliance monitoring for Kenyan banking regulations. Automated alerting for regulatory breaches.

IMPACT

95% reduction in manual compliance reporting

Architecture: Raw Transactional Data → dbt Models (Staging → Intermediate → Mart) → PostgreSQL DWH → Power BI Semantic Layer → Compliance Dashboard + Alerts
Power BI · dbt · PostgreSQL · Python
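
The alerting step in the compliance pipeline above can be illustrated with a minimal sketch. It assumes a KPI with a regulatory floor and a warning margin that flags readings approaching the floor. The `KpiRule` class, the `classify` function, and the numeric values are hypothetical illustrations, not the production rules or actual regulatory limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KpiRule:
    """Hypothetical rule: a floor plus a margin that triggers early warnings."""
    name: str
    minimum: float         # illustrative floor, not a real regulatory value
    warning_margin: float  # how close to the floor counts as "approaching"


def classify(rule: KpiRule, value: float) -> str:
    """Return 'breach', 'warning', or 'ok' for a single KPI reading."""
    if value < rule.minimum:
        return "breach"
    if value < rule.minimum + rule.warning_margin:
        return "warning"
    return "ok"


# Example readings for an assumed liquidity-style ratio (in percent).
rule = KpiRule("liquidity_ratio", minimum=20.0, warning_margin=2.0)
readings = [25.0, 21.5, 19.0]
statuses = [classify(rule, r) for r in readings]
print(statuses)
```

Separating "warning" from "breach" is what lets a dashboard alert before a threshold is crossed rather than after.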
JobScout KE — Automated Job Hunting System
AUTOMATION · STREAMLIT

Full Stack Engineer

Problem: Job seekers manually checking 7+ platforms daily missed time-sensitive postings — the fragmented job market meant relevant roles expired before candidates found them.

Solution: Automated job aggregation system using Streamlit, Gmail API, Jooble API, Groq LLaMA, and Playwright. Multi-source scraping with Telegram notifications.

IMPACT

Automated job search across 7 platforms. Real-time notifications.

Architecture: Playwright Scraper + Jooble API → SQLite Deduplication Layer → Groq LLaMA Relevance Scorer → Telegram Bot Notification → Gmail Digest
Streamlit · Gmail API · Playwright · SQLite · Telegram Bot
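
One way a SQLite deduplication layer like the one in this architecture could work is sketched below. It assumes each posting is keyed by a hash of its normalized title, company, and URL; the `jobs` table, its columns, and both helper functions are hypothetical, not the actual JobScout KE schema.

```python
import hashlib
import sqlite3


def job_key(title: str, company: str, url: str) -> str:
    """Build a stable key so the same posting from two runs collapses to one row."""
    normalized = "|".join([title.strip().lower(), company.strip().lower(), url.strip()])
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def insert_if_new(conn: sqlite3.Connection, title: str, company: str, url: str) -> bool:
    """Insert the posting and return True only if it was not already stored."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO jobs (key, title, company, url) VALUES (?, ?, ?, ?)",
        (job_key(title, company, url), title, company, url),
    )
    conn.commit()
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (key TEXT PRIMARY KEY, title TEXT, company TEXT, url TEXT)"
)
first = insert_if_new(conn, "Data Engineer", "Acme", "https://example.com/jobs/1")
second = insert_if_new(conn, " data engineer ", "ACME", "https://example.com/jobs/1")
print(first, second)
```

Only postings for which `insert_if_new` returns `True` would need to be passed on to scoring and notification.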
Kijani Care 360 — Environmental Monitoring Platform
GEOSPATIAL · ETL

Data Engineer

Problem: Environmental NGOs and field teams tracking tree planting initiatives across Kenya had no centralized, real-time system to monitor survival rates, regional suitability, or impact metrics — leading to delayed reporting and duplicate field work.

Solution: Geospatial ETL pipeline + PostgreSQL/PostGIS. Real-time dashboards tracking 10,000+ tree planting records across Kenya. FastAPI backend, React frontend.

IMPACT

Monitoring 10K+ environmental records in real-time

Architecture: Field Input / CSV Upload → ETL Ingest → PostGIS PostgreSQL → FastAPI Analytics API → React Dashboard
FastAPI · PostgreSQL · PostGIS · React · Docker
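
As an illustration of the CSV ingest step in this pipeline, the sketch below validates coordinates and converts each row to a WKT point (longitude first, then latitude) of the kind PostGIS accepts. The column names `species`, `lat`, and `lon` and both functions are assumptions made for the example, not the platform's actual schema.

```python
import csv
import io


def row_to_wkt(row: dict) -> str:
    """Validate WGS84 ranges and return a WKT point in (lon lat) order."""
    lat = float(row["lat"])
    lon = float(row["lon"])
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f"coordinates out of range: lat={lat}, lon={lon}")
    return f"POINT({lon} {lat})"


def parse_upload(text: str) -> tuple[list[dict], list[tuple[dict, str]]]:
    """Split an uploaded CSV into loadable rows and rejected rows with reasons."""
    valid, rejected = [], []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            valid.append({"species": row["species"], "geom_wkt": row_to_wkt(row)})
        except (KeyError, TypeError, ValueError) as exc:
            rejected.append((row, str(exc)))
    return valid, rejected


# Hypothetical upload: one good record and one with an impossible latitude.
sample = "species,lat,lon\nGrevillea,-1.2921,36.8219\nAcacia,95.0,36.8\n"
valid, rejected = parse_upload(sample)
print(valid)
print(len(rejected))
```

Keeping rejected rows alongside the reason they failed gives field teams something to correct instead of silently dropping records.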
Kipaji Chetu — AI Adaptive Learning Platform
AI · EDTECH · DATA SYSTEMS

AI Engineer & Backend Developer

Problem: Teachers in under-resourced schools had no scalable tool to deliver personalized assessments: every student received the same quiz regardless of performance level, and feedback was slow and manual.

Solution: AI-powered personalized learning system with adaptive quizzes, real-time feedback, and teacher analytics. Built with FastAPI, PostgreSQL, and LLM integration.

IMPACT

Automated personalized learning and feedback system with adaptive difficulty and real-time analytics for students and teachers.

Architecture: Student Attempt → Performance Log → Groq Adaptive Engine → Next Question + LLM Feedback → Teacher Analytics Dashboard
FastAPI · PostgreSQL · SQLAlchemy · Groq · Edge-TTS · JavaScript · Docker
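
The adaptive-difficulty idea in this flow can be shown with a small rule-based sketch. Note that this is a deliberately simple stand-in: the platform's engine is LLM-driven, whereas the function below only uses recent accuracy. The function name, the 1 to 5 difficulty scale, and the 0.8 / 0.4 cut-offs are all assumptions for illustration.

```python
def next_difficulty(current: int, recent_results: list[bool],
                    lowest: int = 1, highest: int = 5) -> int:
    """Step difficulty up or down based on accuracy over recent attempts."""
    if not recent_results:
        return current  # no evidence yet, keep the current level
    accuracy = sum(recent_results) / len(recent_results)
    if accuracy >= 0.8:
        step = 1    # student is comfortable, move up
    elif accuracy <= 0.4:
        step = -1   # student is struggling, move down
    else:
        step = 0    # stay at this level
    # Clamp so the level never leaves the allowed range.
    return max(lowest, min(highest, current + step))


print(next_difficulty(3, [True, True, True, True, True]))
print(next_difficulty(3, [True, False, False, False, False]))
```

In a fuller design, the performance log would supply `recent_results`, and the chosen level would be passed to the question generator.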
Safari Scout — AI Travel Assistant
RAG · AI SYSTEMS

AI Engineer & Data Scientist

Problem: Travelers planning Kenya safaris spend 10+ hours manually researching across dozens of fragmented sources (booking sites, travel blogs, forums), with no single reliable, conversational interface.

Solution: LangChain + Weaviate vector database RAG pipeline. Semantic search across 2,000+ curated Kenya travel experiences. Deployed FastAPI backend + Next.js frontend.

IMPACT

Reduced travel research time by 70%. Processed 2,000+ curated experiences with 94% retrieval accuracy.

Architecture: User Query → Query Enhancer → Hybrid Search (Vector + BM25) → Weaviate → LangChain LLM Chain → Streamed Response
LangChain · Weaviate · FastAPI · Next.js · RAG · OpenAI
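
Hybrid search combines a vector ranking with a keyword (BM25) ranking. One common way to merge the two is reciprocal rank fusion, sketched below in plain Python. This is a generic illustration of the fusion idea, not Weaviate's internal implementation or the project's actual configuration; the document IDs and the constant `k=60` are assumptions.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists; documents ranked high in any list score well."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a vector search and a BM25 search.
vector_hits = ["mara", "amboseli", "tsavo"]
bm25_hits = ["amboseli", "diani", "mara"]
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Documents that appear in both lists rise to the top, which is the behaviour a hybrid retriever relies on when semantic and keyword matches disagree.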
Work History

Experience Snapshot

4+ years building production data systems across fintech, civic tech, and environmental sectors.

Data and Analytics Engineer

Data Science East Africa

January 2025 – Present · Nairobi, Kenya

Leading data engineering and analytics delivery for enterprise and AI clients across East Africa. Designing production ETL/ELT pipelines, compliance-aware analytics infrastructure, and translating complex stakeholder requirements into scalable, well-documented data systems.

Python · SQL · PySpark · Airflow

Founder & Lead Data Engineer

Data Scout KE

January 2025 – Present · Nairobi, Kenya

Independent data engineering consultancy building governance-aware analytics systems and AI applications for Kenyan financial institutions and SMEs.

LangChain · FastAPI · PostgreSQL · dbt

Freelance AI Trainer, Data Annotator & Analyst

Remote - Global AI Clients

2021 – Present · Remote

Annotating and quality-controlling training data for production ML models deployed by global AI organisations. Applying structured analytical reasoning to improve model fairness, reduce labelling inconsistencies, and maintain output quality across large-scale annotation projects.

Python · SQL · Excel · NLP
Education & Certifications

Academic Background

B.A. Security Studies & Criminology · Mount Kenya University (2025)

Data Engineering, Data Science, Analytics and AI · Software Development · Cybersecurity Fundamentals
Open to Work

I’m available for Data Engineering roles

Full‑time · Remote · Hybrid (Nairobi) · Contract

Thought Leadership

Writing

Technical articles and insights on building production systems.

Let’s Work Together

Let’s build something impactful

Whether you're hiring, building, or scaling — I design systems that move data into decisions.

Open to roles & high-impact projects
Start a conversation

Typically respond within 24 hours

Or email me directly at rosewabere1@gmail.com