• On-prem PostgreSQL data warehouse: Develop comprehensive specifications for the IT department to build a PostgreSQL data warehouse that consolidates data from local servers across the agencies.
• Data marts for delivery activity reporting: Build data marts with advanced SQL queries and dbt to monitor and analyze delivery activity within the organization.
• Data migration to GCP: Create a secure, GDPR-compliant Google Cloud infrastructure (using Terraform and CI/CD). Transfer data to Cloud Storage in Parquet format. Transform and ingest data into BigQuery using dbt for transformation and Airflow/Cloud Composer for orchestration. Ensure GDPR compliance by implementing purge and anonymization procedures.
• Categorical analysis (Carrefour retail data): Automate categorical
analysis to provide insights on a brand or a product category from
Carrefour France transactional data (big data), using Python and
BigQuery (advanced SQL queries).
• Data migration (on-prem to GCP): Migrate data from PostgreSQL
to Google Cloud Storage, then ingest it into BigQuery in batches
at a defined frequency, using Python and FastAPI, deployed
on Cloud Run and scheduled with Cloud Scheduler (all compute
resources are provisioned with Terraform).
• Data retrieval and structure: Automate Mixpanel data retrieval
via the Mixpanel API to Google Cloud Storage, then ingest the data
into BigQuery using Python and FastAPI, deployed on Cloud Run and
scheduled with Cloud Scheduler.
• Data quality/analysis: Data analysis and data quality checks using
Python (pandas, data preparation, FuzzyWuzzy string matching), BigQuery
and Dataiku, using Python and FastAPI, deployed on Cloud Run and managed
with Cloud Scheduler (all compute resources are provisioned with Terraform).
• GCP: Implement and manage GCP, create an organisation
within it using Cloud Identity, and configure Azure Active Directory single sign-on (SSO) with the Google Cloud connector.
• Cloud provider: Benchmark multiple cloud providers using
TPC-DS to find the most suitable solution for the company (all compute resources are provisioned with Terraform).
• Scoring: Predict the gender and age of visitors to Prisma Media’s websites based on their browsing behaviour and the CRM
database. Developed with Python and a scikit-learn decision tree
model.
• Segment manager: An interactive segment catalogue. Developed
with R Shiny and Python, deployed on Google Compute Engine
(GCE) with Cloud Build for CI/CD.
• Revenue dashboard: Detailed reporting of revenue generated from advertising and segment (cookie) data. Developed
with R Shiny, Python and the Google Ad Manager API, deployed on
GCE with Cloud Build for CI/CD.