Available for senior architecture engagements — Remote / Casablanca

Tarik Boulaajoul.
Data Architect.

I design systems that turn raw data into organizational memory — governed lakehouses, distributed pipelines, and decision-grade BI for telco, BPO, insurance and SAP-driven enterprises.

6+ years across 5 enterprises
ENSIAS State Engineer · e-Mgmt & BI
FR · EN fluent
Casablanca, Morocco

Architecture showcase

Three systems I designed end to end.

Real platforms, real stakeholders, real numbers. Each case below is a problem statement, the architectural decision I made, and what it cost — or saved.

Case 01 · Lakehouse modernization

Killing the Excel pipeline at a 1,000-seat BPO.

Assist Digital · 2025–present · Data Platform Lead

Problem

A call-center BPO ran daily reporting on Python scripts that emailed Excel files. Numbers diverged across teams, refreshes broke silently, and a new client onboarding could take weeks of manual ETL.

Decision

Replace the script-and-email loop with a governed medallion lakehouse: Airflow as the only scheduler, dbt as the only transformer, Power BI as the only consumption layer. Bronze captures raw vendor exports, Silver enforces conformed dimensions, Gold serves business marts.
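The three layers can be sketched in plain Python. This is an illustration only, not the dbt models used on the platform: the field names, the sample rows and the "average handle time" mart are all hypothetical.

```python
from collections import defaultdict

# Hypothetical raw vendor export rows: Bronze keeps them exactly as received.
bronze = [
    {"agent": " A17 ", "call_date": "2025-03-01", "handle_secs": "310"},
    {"agent": "a17", "call_date": "2025-03-01", "handle_secs": "290"},
    {"agent": "B02", "call_date": "2025-03-01", "handle_secs": "n/a"},
]


def to_silver(rows):
    """Conform keys and types; skip rows whose measure cannot be typed."""
    clean = []
    for row in rows:
        try:
            secs = int(row["handle_secs"])
        except ValueError:
            continue  # a real pipeline would quarantine the row, not drop it
        clean.append({
            "agent_id": row["agent"].strip().upper(),  # one spelling per agent
            "call_date": row["call_date"],
            "handle_secs": secs,
        })
    return clean


def to_gold(rows):
    """Business mart: average handle time per agent per day."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum of seconds, row count]
    for row in rows:
        key = (row["agent_id"], row["call_date"])
        totals[key][0] += row["handle_secs"]
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


silver = to_silver(bronze)
gold = to_gold(silver)
```

The point of the split is that only Silver knows how to clean a vendor's spelling of an agent ID, so every Gold mart built on it inherits the same answer.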

Stack

Apache Airflow · dbt · Power BI · DAX · SQL · Python · Medallion (Bronze/Silver/Gold) · Git

↓ BI cost

Tableau Server retired after audit-driven catalog rationalization. Daily-refreshed dashboards replaced ad-hoc Excel emails for ops & execs.

Case 02 · Distributed compute · predictive ops

Industrializing Big Data for an international telecom operator.

Orange Business Services · 2022–2025 · Big Data Engineer

Problem

A strategic data transformation needed scalable Big Data flows feeding both real-time KPIs and predictive models that flag system incidents before they page someone. Existing batch jobs were brittle and ran on an unhealthy mix of cron and tribal knowledge.

Decision

Standardize on Spark over HDFS/Hive for distributed processing, with a hardened pipeline pattern: idempotent writes, partitioned storage, and resilience baked in. Predictive models live alongside the pipelines so feature freshness is the platform's responsibility, not a notebook's.
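What "idempotent writes" over "partitioned storage" means can be shown on a local filesystem. The sketch below is not the Spark job itself: the function name, the `event_date=` directory layout and the JSON-lines file format are assumptions chosen for illustration.

```python
import json
import os
import shutil
import tempfile
from pathlib import Path


def write_partition(root: Path, event_date: str, rows: list[dict]) -> Path:
    """Replace the whole partition for event_date, so a rerun leaves the same files."""
    final_dir = root / f"event_date={event_date}"
    root.mkdir(parents=True, exist_ok=True)
    # Stage next to the target so the final rename stays on one filesystem.
    staging = Path(tempfile.mkdtemp(prefix=".staging-", dir=root))
    with open(staging / "part-00000.json", "w", encoding="utf-8") as fh:
        for row in rows:
            fh.write(json.dumps(row, sort_keys=True) + "\n")
    if final_dir.exists():
        shutil.rmtree(final_dir)  # overwrite the partition, never append to it
    os.replace(staging, final_dir)
    return final_dir
```

Running the same load twice for the same date leaves one partition with one file, which is what makes a retry after a failed batch safe.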

Stack

Apache Spark · Hive · HDFS · Python · Power BI · SQL · Bash · Unix · Git

−60%

processing time on critical batch flows after partition + shuffle tuning. Predictive layer surfaced incident precursors hours earlier.

Case 03 · Cloud-native BI

From shared drives to a governed AWS warehouse for an insurance SME.

NGIS · 2020–2021 · BI Analyst & Architect

Problem

An insurance file-management SME had years of historical data sitting in spreadsheets and operational databases, with no central place to ask "how are we doing this quarter?" — and no automation behind the reports they did produce.

Decision

Stand up a cloud-native BI environment on AWS: S3 as raw landing, RDS as analytical store, Airflow + Pandas to automate the prep, and persona-tailored Power BI dashboards so claims handlers, ops managers and execs each see what they need — not what someone else needs.

Stack

AWS S3 · AWS RDS · Apache Airflow · Pandas · Power BI · Python · SQL

3 personas

handlers, ops, execs — each on a tailored dashboard, all sourced from one governed warehouse. Manual report production replaced by scheduled refreshes.

Skills map

The stack, organized the way I think about it.

Five concerns every data platform has to solve.

Storage · Orchestration · Modeling · Governance · Visualization

Architecture decision records

Decisions worth writing down.

The interesting part of architecture isn't the diagram — it's the "and the alternative was…". Three calls I've defended in production.

ADR-001 · Accepted

Why we replaced Tableau Server with Power BI.

Context

An existing Tableau Server install with hundreds of reports — many unused, several duplicated, a few load-bearing. License cost was material; usage telemetry was thin.

Decision

Audit the catalog (who opens what, when, and why), keep the load-bearing reports, retire the rest, and rebuild the keepers in Power BI on top of the new dbt-curated marts. Single semantic layer, one BI license to renew.
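The "who opens what, and when" half of the audit reduces to counting distinct recent viewers per report. A minimal sketch, assuming view events are available as `(report, user, date)` tuples; the 90-day window, the threshold and the report names are hypothetical, not the values used in the actual audit.

```python
from datetime import date, timedelta


def classify_reports(catalog, views, today, window_days=90, keep_threshold=5):
    """Split the catalog into keep and retire candidates by recent distinct viewers.

    views is a list of (report_name, user, view_date) tuples.
    """
    cutoff = today - timedelta(days=window_days)
    recent_users = {}
    for report, user, viewed_on in views:
        if viewed_on >= cutoff:
            recent_users.setdefault(report, set()).add(user)
    keep, retire = [], []
    for report in catalog:
        distinct = len(recent_users.get(report, ()))
        (keep if distinct >= keep_threshold else retire).append(report)
    return keep, retire


today = date(2025, 6, 1)
catalog = ["daily_ops", "sla_breaches", "legacy_export"]
views = [("daily_ops", f"user{i}", date(2025, 5, 20)) for i in range(8)]
views += [("sla_breaches", "lead1", date(2025, 5, 30))]
views += [("legacy_export", "user1", date(2024, 1, 15))]

keep, retire = classify_reports(catalog, views, today, keep_threshold=2)
```

Counts only answer who and when. A report with a single viewer can still be load-bearing, so the retire list is a set of candidates to confirm with report owners, not a deletion script.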

Consequences

Lower BI spend, stronger consistency (one model = one truth), and a forcing function to actually understand which reports drive decisions. Cost: a quarter of careful migration, not a weekend script.

Status: accepted · Domain: BI consolidation

ADR-002 · Accepted

Star Schema vs Data Vault: a real tradeoff.

Context

Bronze landing was easy. The fight was at Silver: do we model conformed dimensions (Kimball star) or capture every source change in hubs/links/satellites (Data Vault)?

Decision

Star schema at Silver, exposed to BI through Gold marts. Data Vault is correct when source systems shift constantly and full historical replay is non-negotiable; for a BPO with stable vendor schemas and a strong "query me right now" need from the business, the dimensional model wins on time-to-insight per analyst-hour.
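To make the dimensional side of the tradeoff concrete, here is a toy star schema in SQLite: two conformed dimensions and one fact table. Table names, columns and figures are invented for illustration and are not the production model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_client (client_key INTEGER PRIMARY KEY, client_name TEXT);
CREATE TABLE dim_date   (date_key INTEGER PRIMARY KEY, iso_date TEXT, month TEXT);
CREATE TABLE fact_calls (
    client_key    INTEGER REFERENCES dim_client(client_key),
    date_key      INTEGER REFERENCES dim_date(date_key),
    calls_handled INTEGER,
    handle_secs   INTEGER
);
INSERT INTO dim_client VALUES (1, 'Client A'), (2, 'Client B');
INSERT INTO dim_date VALUES (20250301, '2025-03-01', '2025-03'),
                            (20250302, '2025-03-02', '2025-03');
INSERT INTO fact_calls VALUES (1, 20250301, 120, 36000),
                              (1, 20250302, 100, 28000),
                              (2, 20250301,  50, 20000);
""")

# One fact table, one join per dimension: the query an analyst writes on day one.
rows = conn.execute("""
    SELECT c.client_name, d.month,
           SUM(f.handle_secs) * 1.0 / SUM(f.calls_handled) AS avg_handle_secs
    FROM fact_calls f
    JOIN dim_client c ON c.client_key = f.client_key
    JOIN dim_date d   ON d.date_key = f.date_key
    GROUP BY c.client_name, d.month
    ORDER BY c.client_name
""").fetchall()
```

A Data Vault version of the same question would first have to reassemble the client and its attributes from a hub and its satellites, which is the extra analyst time the decision trades away.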

Consequences

Faster delivery, simpler DAX, easier Power BI semantic layer. We accept that schema-changing source systems will force ETL rework — and we monitor for that explicitly.

Status: accepted · Domain: warehouse modeling

ADR-003 · Accepted

dbt for transforms. Airflow only for orchestration.

Context

We could write transformation logic inside Airflow PythonOperators and skip a tool. We could also stuff orchestration logic inside dbt and skip Airflow. Both work. Both rot.

Decision

One tool per concern. Airflow schedules, retries, captures SLAs and emits state. dbt models the data, runs tests, and produces lineage. They communicate through a thin contract: Airflow calls dbt run --select, dbt fails loudly if a contract test breaks.
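The thin contract can be as small as one function that a scheduler task calls. A sketch under assumptions: `run_dbt` and the `tag:gold` selector are hypothetical names, and the `runner` parameter exists only so the function can be exercised without dbt installed.

```python
import subprocess


def run_dbt(selector, runner=subprocess.run):
    """Run `dbt run --select <selector>` and raise if dbt exits non-zero."""
    cmd = ["dbt", "run", "--select", selector]
    result = runner(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Raising is what lets the scheduler mark the task failed and retry it.
        raise RuntimeError(f"dbt failed for {selector!r}: {result.stdout[-500:]}")
    return cmd
```

The scheduler side knows nothing about models or SQL. It passes a selector and reacts to success or failure, which keeps transformation logic out of the orchestration code.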

Consequences

Each tool stays good at its one job. Onboarding new analysts takes hours instead of weeks because the mental model is small. The cost is two systems to operate — worth it at any non-trivial scale.

Status: accepted · Domain: pipeline architecture

Tech stack timeline

Six years, five companies, one trajectory.

Each stop sharpened a different concern, from raw ETL to distributed compute to governed lakehouses.

2019

TraInvestment

Data Science Intern

End-to-end fraud detection system — first taste of designing modular pipelines instead of one-off scripts.

Lang: Python · Flask
Pipe: Airflow
Infra: Unix · Git

2020 — 2021

NGIS

BI Analyst

Cloud-native BI for an insurance SME. First production AWS deployment; first time owning a semantic model end to end.

Cloud: AWS S3 / RDS
Pipe: Airflow · Pandas
BI: Power BI · DAX

2021 — 2022

Grupo Avalon

Data Engineer

Heterogeneous SAP + SQL Server integration. Where I learned that data quality is a system property, not a script.

Sources: SAP HANA · SQL Server
ETL: SAP Data Services · Python
Quality: validation pipelines

2022 — 2025

Orange Business Services

Big Data Analyst

Distributed Spark pipelines + predictive analytics for an international telecom operator. −60% processing time on critical batches.

Compute: Spark
Storage: HDFS · Hive
BI: Power BI

2025 — present

Assist Digital

Data & Platform Lead

Modernization of a BPO reporting stack into an Airflow + dbt + Power BI medallion lakehouse. Tableau Server retired. Self-serve CSV intake for business users.

Pipe: Airflow · dbt
Model: Bronze · Silver · Gold
BI: Power BI · DAX governance

Writing

Opinions I'll defend in a code review.

Short, opinionated essays from the field. No vendor takes, no "modern data stack" bingo.

modeling

The dashboard isn't the product — the model is.

Stakeholders point at the dashboard, but every interesting question they'll ask next year depends on the semantic layer underneath. Build the model, and the dashboards become cheap.

6 min read

platform

Stop calling it a data lake if you can't query it.

A bucket of parquet files isn't a lake — it's a graveyard. The line between "lake" and "lakehouse" is whether the next analyst can answer a question without summoning you.

4 min read

medallion

Why your medallion architecture leaks: the silver layer trap.

Most teams do bronze well and gold reasonably. Silver is where conformed dimensions live or die — and where well-meaning teams ship marts disguised as Silver tables.

7 min read

Contact

Designing your next data platform?

I take on a small number of senior architecture engagements per year — lakehouse builds, BI modernization, predictive ops platforms. Send a note with a one-paragraph problem statement and I'll reply within two business days.

Email: boulaajoul.tarik@gmail.com
LinkedIn: linkedin.com/in/tarik-boulaajoul
GitHub: github.com/BoulaajoulTarik
Phone: (+212) 6 72 67 00 48
Location: Casablanca, Morocco · remote-friendly
