● Design of on-demand training for SETRA Conseil on Apache Cassandra & Apache Spark
● Training of 4 engineers on Apache Cassandra and Apache Spark (ops-oriented)
● Apache Cassandra & Power BI consulting
● Audit of the AS-IS architecture and its issues; design of the TO-BE architecture to fix performance problems
● Consulting on the Hortonworks HDP & Cloudera CDP stacks (Apache Atlas, metadata & lineage, Apache Spark optimizations)
● Project management
● Development of data pipelines with Apache Spark and an internal Java framework (Spark SQL, S3, Parquet files, Apache Kafka)
● Consulting on key success factors for a data science approach
● Consulting on MLOps: from proof of concept to go-live
● Consulting & training on Apache Spark
● Project management (Agile)
● Design and implementation of SAP HANA & AWS S3 batch pipelines, as standalone Spark/Scala applications, feeding the Redshift data lake with all Zalando payments data
● Design and refactoring of a large Apache Spark batch job processing payment logs into a streaming application, using Databricks Auto Loader and Apache Flink
● Design and implementation of a CI/CD process using the Zalando Continuous Delivery Platform & Apache Airflow
● Expert support to internal teams on Apache Spark and Apache Cassandra
● Expert support to Data Science projects
● Development in Kotlin / Java / ksh of software tools for Cloudera HDP & CDP
HDP, CDP, Apache Atlas, Apache Hive, Apache Phoenix, Hadoop, Spark, Quarkus, Kubernetes, Kotlin, Java
Courses: Big Data introduction, hands-on production deployment of Machine Learning algorithms, Apache Flink, Apache Spark, Scala
Confidential web marketing group, Netherlands (remote) - Data Engineer, full-stack developer
● Design and setup of CI/CD with AWS CodePipeline
● Data pipelines with Apache Flink & Apache Kafka (Java & Scala)
● Design and deployment of Big Data architecture (AWS, Apache Kafka, PySpark)
● Staffing of Big Data & Data Science teams
● Development of algorithms for parsing bank statements and identifying the company behind each line; development of a recommender system and clustering (Python, PySpark)