CV

Education

M.S. in Mathematical Modeling, Research, Statistics, and Computation, University of the Basque Country
M.S. in Applied Mathematics, Santo Domingo Institute of Technology
B.S. in International Business & Finance, Utah State University

Work experience

2026-present: AI Engineer
- Pulsecity
- Key contributions:
  - Designed a 35+ table Postgres schema centered on a Canonical Event Schema, with temporal snapshot tables (pricing, engagement), a hierarchical experience taxonomy (categories → subcategories → activities → emotional outputs), and junction tables for many-to-many mappings across events, artists, tags, and attendance, among others.
  - Engineered an enrichment pipeline where LLM agents (OpenAI, Anthropic, Llama) operate as data workers, reading and writing through a Model Context Protocol (MCP) layer with schema-constrained validation, enum enforcement, confidence scoring, and full audit logging. Five agents run in an ordered chain with versioned prompts and swappable providers.
  - Built a configuration-driven ingestion framework that onboards heterogeneous event sources (REST APIs, GraphQL, web scrapers) through a unified adapter interface, with automated deduplication, geo-resolution, and source-level monitoring across 25+ sources.
2025: Senior ML/Software Engineer, Pricing
- HP Inc (via Mphasis)
- Key contributions:
  - Worked within the Pricing Data Science team at HP Inc (Sant Cugat) on a data science software refactoring project spanning several markets, aimed at developing a common architecture for feature engineering, modeling, and deployment of price elasticity and causal inference models.
  - Architected 3 modular Delta Live Tables pipelines in Python to support retail and commercial demand models across North America and EMEA, reducing code duplication by ~11% and enabling region-specific customizations.
  - Built a counterfactual price model (LightGBM) to simulate competitor brand prices as if they were HP brands, for use as confounders in a demand elasticity model.
2024-2025: LLM Engineer, Post-training
- Invisible Technologies
- Key contributions:
  - Executed advanced post-training workflows, including Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF), to optimize model reasoning and alignment.
  - Developed complex evaluation datasets including Chain-of-Thought (CoT) mathematical reasoning prompts and benchmarks to measure model performance in specialized domains, ensuring high-fidelity outputs for frontier LLM providers.
  - Collaborated on the implementation of reward modeling and preference-based optimization (DPO/PPO) to refine model behavior and safety guardrails, and to surface model failure modes in complex tasks.
2023-2024: Data Scientist, Promotion Optimization (BEES)
- AB InBev
- Key contributions:
  - Collaborated on a promotion (pricing) optimization algorithm, improving key metrics like ROI, investment, and coverage across various promotional strategies (combos, stepped, among others).
    - This involved modeling demand elasticity with a log-log model (XGBoost), optimizing for a chosen metric with cubic splines, and ranking suggested order arrangements in combos, among other dynamic phases.
  - Improved ROI by 48% in A/B-tested promotional experiments over a period of 8 months.
  - Developed an algorithm targeting first-time purchasers, optimizing discount allocation and saving on budget.
  - Created an RCT module that automated user allocation into control and treatment groups and logged experiment metadata (promotion ID, allocation, blocking factors, timestamps) to a historical registry, reducing experiment design time by ~80%.
  - Implemented uplift and causal ML models (X-Learner, DR-Learner, Synthetic DiD) to measure market innovation effectiveness (ATT, CATE).
2022-2023: Data Scientist, Product Recommendation
- Santa Cruz Bank
- Key contributions:
  - Reworked the product recommendation (NBA) model architecture into a batch two-tower modeling approach, enhancing feature engineering and embedding calculations.
  - Enhanced customer segmentation by incorporating digital behavior clustering and psychographic profiling based on commerce data.
  - Automated weekly client account reporting via a data pipeline job, boosting productivity 2x and eliminating manual data wrangling.
  - Applied NLP techniques (sentiment analysis, n-grams, entity classification) to extract insights from Salesforce CRM interaction data.
2019-2022: Data Scientist, Risk Analytics
- DGII
- Key contributions:
  - Collaborated on developing the analytics & machine learning feedback system to label taxpayers as risky or not risky.
  - Applied methods such as linear regression, logistic regression, decision trees, PCA, MCMC, and clustering.
  - Helped define sector-specific, multidimensional risk metrics spanning income underreporting, cost inflation, shareholder benefit abuse, and transfer pricing anomalies.
  - Mapped shareholding structures with graph theory (NetworkX, Bokeh), leveraging centrality insights to reveal influential entities and hidden transactional links, improving tax evasion case prioritization by 2x.
  - Created the Data Warehouse inventory eCatalog, a Shiny (R) solution mapping all data infrastructure used for risk estimations (metadata, data dictionaries, data lineage). This streamlined the process of identifying data sources by 2x.
2017-2019: Junior Data Analyst, Tax
- Deloitte
- Key contributions:
  - Supported tax compliance analytics and client engagements through requirements gathering, data modeling, and reporting.
  - Built workflows to consolidate financial and transactional datasets for audit and tax risk reporting.
  - Collaborated with cross-functional consulting teams (Finance, Transfer Pricing, Audit, Legal) in financial projects, including M&A and tax risk analyses, to deliver data-driven insights to clients.

Skills

Python
R
SQL
Git
Spark
Data Science & AI Techniques: Propensity score matching (PSM), Double machine learning (DML), Doubly robust machine learning (DRML), A/B/n testing, RCTs, CATE calibration methods, RAG, among others.
Azure Cloud: Azure Databricks, DL/Hive, Azure Machine Learning Studio, Azure DevOps, Unity Catalog, Lakeflow (data orchestration), Delta Live Tables (DLT), and Azure Synapse Analytics.
AWS: Amazon SageMaker, Amazon Redshift, Athena, AWS Glue.
Additional Tools: MATLAB, Java, Gurobi, PuLP, Docker, MLflow, Airflow, Pytest, LangChain, MCP, CI/CD.

Leadership

AB InBev Analytics Workshop Professor (2024)
AB InBev Global Hackathon (2023)
R Programming Mentor (2022-2023)

Last updated: 2026-03-30

Jose Garcia Ventura
