Francisco Sánchez Noguera

Senior Data Platform Engineer

📍 Spain (Remote)

Software Engineer with 6+ years of experience in data engineering, platform engineering, and distributed systems. Expert in Python, Spark, Terraform, and Azure — from optimizing jobs that saved clients 100k+ EUR/year to provisioning 300+ data platforms at scale.

Work History

Senior Software Engineer

BASF

2025-01 – Present

Internal hire leading platform engineering for a new data platform combining Azure, AKS, Databricks, and infrastructure as code with CDKTF and Terraform.

  • Leading Databricks and Unity Catalog deployment on a new CDKTF-based platform, contributing the largest share of infrastructure constructs
  • Designed zero-downtime state migration process enabling resource imports across 300+ platforms without user impact
  • Resolved a critical production incident within 1 hour by auditing Unity Catalog tables and reverting configuration changes, preventing data processing paralysis across all platforms
  • Defined role-based access control architecture through ADRs and implemented permissions across portal and infrastructure layers
  • Collaborated with architects on network topology design for the new platform
  • Implementing logs and metrics ingestion across multiple clusters using Grafana OSS, configuring Alloy collectors, receivers, and scrape pipelines
Terraform, CDKTF, TypeScript, Python, Kubernetes, Azure, Databricks, Unity Catalog, GitHub Actions

Senior Data Engineer

Capitole Consulting (Client: BASF)

2023-03 – 2024-12

Consultant leading two major engagements at BASF: delivering a cosmetics analytics data platform, then becoming lead developer for Azure data platform provisioning serving 300+ workspaces.

  • Delivered a 12-month cosmetics analytics release in under 6 months using Medallion architecture with Delta Lake processing
  • Led refactoring of the platform provisioning codebase (Terraform + Python), improving pipeline success rate from 20% to 90% and reducing execution time from 1.5 hours to 30 minutes
  • Reduced platform provisioning from 4 hours to 20 minutes by orchestrating GitHub Actions workflows with Terraform for infrastructure and Python for configuration
  • Built a REST API with FastAPI and Container App Jobs for federated Databricks Account operations
  • Orchestrated Unity Catalog rollout across 300+ Databricks workspaces with only 1% incident rate
  • Led a team of 5 engineers, increasing test coverage from 10% to 85% through code reviews, pair programming, and Python best practices
  • Conducted ~30 technical interviews for Senior Data Engineer positions, resulting in 6 hires
  • Managed the Data Community at Capitole: biweekly newsletter, meetups, and tech talks reaching 100+ data engineers
Python, Terraform, Azure DevOps, Azure Databricks, Delta Lake, Unity Catalog, FastAPI, Docker, GitHub Actions

Big Data Developer

Kenmei Technologies

2022-01 – 2023-03

Promoted to the Big Data department, processing massive telecom datasets with Scala and PySpark on Google Dataproc and Azure Databricks.

  • Optimized a PySpark data pipeline from 20 hours to 4 hours (-80%), saving the client 100k+ EUR annually in cluster costs
  • Productionized the Uplink Interference detection algorithm using distributed PySpark, reducing execution from 60 minutes (single node) to 6 minutes (-90%)
  • Built a geospatial classification product with Scala and Sedona, categorizing tiles as indoor/outdoor across entire countries
  • One of only two engineers capable of contributing to the CallTraces product — a highly complex Scala/Spark real-time trace processing system
Scala, PySpark, Google Dataproc, Azure Databricks, Delta Lake, Azure Blob Storage, BigQuery, Sedona

Junior Innovation Engineer

Kenmei Technologies

2019-06 – 2022-01

R&D engineer working directly with the CTO on individual projects for telecom clients, building high-performance Python services and signal processing algorithms.

  • Built a real-time drone geolocation service achieving 100+ geolocations/second (16× the client target) using NumPy vectorization and async MQTT
  • Developed a high-throughput TCP trace server processing ~550 Mb/s with Cython and Numba — bottlenecked by network interface, not CPU
  • Created distributed coverage maps with Dask spatial partitioning, reducing country-level processing from 4 hours to 35 minutes (-85%)
  • Designed an Uplink Interference detection algorithm using FFT and RSRQ/SINR metrics for PIM pattern identification — later productionized at scale in Big Data
  • Received recognition from a major telecom client's CTO for project delivery quality
Python, NumPy, Numba, Cython, Dask, PySpark, MQTT, Asyncio, PostgreSQL

Skills

Languages

Python · expert
Scala · advanced
TypeScript · advanced
Go · advanced
Bash · advanced

Data Engineering

Apache Spark / PySpark · expert
Delta Lake · expert
Databricks · expert
Unity Catalog · expert
Dask · advanced
BigQuery · advanced

Cloud & Infrastructure

Azure · expert
Terraform / CDKTF · expert
Kubernetes · advanced
Docker · advanced
GitHub Actions · advanced
Azure DevOps Pipelines · advanced

Backend & APIs

FastAPI · advanced
PostgreSQL · advanced
REST API Design · advanced
Asyncio · advanced

Observability & Quality

Prometheus / Grafana · advanced
Pytest / Unit Testing · expert
Clean Code / Architecture · advanced
CI/CD Pipelines · advanced

Certifications

Databricks Certified Data Engineer Professional

Databricks

Issued: 2026-01 · Expires: 2028-01


Certified Kubernetes Application Developer (CKAD)

The Linux Foundation

Issued: 2025-06 · Expires: 2027-06

Databricks Certified Associate Developer for Apache Spark 3.0

Databricks

Issued: 2024-09

AWS Certified Cloud Practitioner

Amazon Web Services

Issued: 2024-06 · Expires: 2027-06

Microsoft Certified: Azure Fundamentals (AZ-900)

Microsoft

Issued: 2024-03

Education

Master's Degree in Telecommunications Engineering (professionally qualifying)

Universitat Politècnica de València

Graduated: 2019-06-15

Bachelor's Degree in Telecommunications Engineering

Universitat Politècnica de València

Graduated: 2018-06-15

CubeSats Concurrent Engineering Workshop in Satellite Systems Design

European Space Agency (ESA)

Completed: 2019-06-15
