Academic Research

Distributed ML pipeline for real-time stock prediction using big data infrastructure.

Data Engineer & ML Developer | Academic | 2022

End-to-end streaming architecture comparing online vs. batch learning across global equities.

Stack

Kafka, Hadoop, Spark, Cassandra, River, Python

Context

Real-time stock market prediction system combining big data infrastructure with online learning methods, applied to global equities (Google, Apple, BNP Paribas, Alibaba, Novartis).

Highlights

  • Full Kafka → Hadoop → Spark → Cassandra pipeline
  • Online learning (River) vs. batch (LSTM, autoregressive)
  • Real-time inference on streaming market data
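
To make the online-learning side of the comparison concrete, here is a minimal sketch of the predict-then-learn loop that streaming models follow. It is not the project's code: it uses only the Python standard library, with a small hand-written SGD regressor standing in for a River model, and a synthetic autoregressive series standing in for market data. The names OnlineLinearRegressor, synthetic_returns, and prequential_mae are hypothetical, as are the lag count and learning rate.

```python
import random


class OnlineLinearRegressor:
    """Tiny linear model trained by SGD, one observation at a time.

    Hypothetical stand-in for an online learner; not River's API.
    """

    def __init__(self, n_features, lr=0.05):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_one(self, x):
        return self.b + sum(wi * xi for wi, xi in zip(self.w, x))

    def learn_one(self, x, y):
        # Single gradient step on the squared error for this observation.
        err = self.predict_one(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


def synthetic_returns(n, seed=0):
    """Synthetic AR(1) series (assumed data, not real market prices)."""
    rng = random.Random(seed)
    r, out = 0.0, []
    for _ in range(n):
        r = 0.6 * r + rng.gauss(0, 0.1)
        out.append(r)
    return out


def prequential_mae(series, n_lags=3, lr=0.05):
    """Test-then-train evaluation: predict each point before learning from it."""
    model = OnlineLinearRegressor(n_lags, lr)
    abs_err, n = 0.0, 0
    for t in range(n_lags, len(series)):
        x, y = series[t - n_lags:t], series[t]
        y_pred = model.predict_one(x)  # inference on the incoming observation
        abs_err += abs(y - y_pred)
        n += 1
        model.learn_one(x, y)          # update once the true value is known
    return abs_err / n, model


if __name__ == "__main__":
    mae, _ = prequential_mae(synthetic_returns(2000))
    print(f"prequential MAE: {mae:.4f}")
```

A batch model (such as the LSTM or autoregressive baselines) would instead be fit on a historical window and retrained periodically. The loop above never stores the stream, which is the property that makes online learners a natural fit for inference on a Kafka-style feed.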
