Job description
The Streaming Data Platform team is responsible for building and managing complex stream processing topologies using the latest open-source tech stack, building metrics and visualizations on the generated streams, and creating varied data sets for different forms of consumption and access patterns. We're looking for a seasoned Staff Software Engineer to help us build and scale the next generation of streaming platforms and infrastructure at Fanatics Commerce.
Responsibilities
- Design and build real-time streaming data platforms that enable sub-second data availability to MPP databases (StarRocks, Apache Pinot, Apache Druid)
- Architect and implement data pipelines that handle complex data skews and leverage data colocation strategies for optimal query performance
- Fine-tune Apache Iceberg table parameters including compaction policies, partition evolution, file sizing, and snapshot management for streaming workloads (see the illustrative sketch after this list)
- Provide technical leadership on streaming architectures, guiding teams on optimal patterns for real-time data ingestion, processing, and materialization into MPP systems
- Contribute to open-source MPP database projects (StarRocks, Apache Pinot, Apache Druid) with performance improvements, bug fixes, and feature enhancements
- Design data distribution strategies and bucketing schemes to minimize shuffle operations and maximize colocation benefits in distributed queries
- Optimize existing streaming infrastructure through profiling, identifying bottlenecks in data skew handling, and implementing dynamic rebalancing strategies
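To give a concrete flavor of the Iceberg tuning work described above, here is a minimal sketch using the Iceberg Java API. The table name, target file size, and retention values are illustrative assumptions for a streaming workload, not settings prescribed by this role; the retention properties take effect when snapshot-expiration maintenance runs.

```java
import org.apache.iceberg.Table;
import org.apache.iceberg.catalog.Catalog;
import org.apache.iceberg.catalog.TableIdentifier;

public class StreamingTableTuning {

    // Tune an Iceberg table for a high-frequency streaming writer:
    // keep data files near a target size and bound snapshot history
    // so metadata and query-planning costs stay stable over time.
    public static void applyStreamingDefaults(Catalog catalog) {
        // Hypothetical table used only for illustration.
        Table table = catalog.loadTable(TableIdentifier.of("events", "clickstream"));

        table.updateProperties()
             // Aim for ~128 MB data files; small streaming commits are
             // compacted toward this size by table maintenance.
             .set("write.target-file-size-bytes", String.valueOf(128L * 1024 * 1024))
             // Expire snapshots older than one hour but always keep the last 20,
             // so frequent commits don't bloat table metadata.
             .set("history.expire.max-snapshot-age-ms", String.valueOf(60L * 60 * 1000))
             .set("history.expire.min-snapshots-to-keep", "20")
             .commit();
    }
}
```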
Qualifications
- 8+ years of software development experience
- Proven experience building production-grade streaming pipelines to MPP databases (StarRocks/Pinot/Druid) with consistent sub-second latency
- Strong understanding of data skew patterns and mitigation techniques including salting, bucketing, adaptive partitioning, and custom key distribution (a salting sketch follows this list)
- Hands-on experience with data colocation strategies in distributed systems to optimize for local joins and reduce network shuffles
- Expert-level knowledge of Apache Iceberg for streaming workloads: snapshot isolation, file format tuning, compaction strategies, partition evolution, and metadata management
- Demonstrated open-source contributions to MPP databases or adjacent projects (commits, PRs, design proposals, community engagement)
- Proficiency in Java and/or C++
- Deep expertise in SQL optimization, distributed query planning, and physical execution plans in MPP systems
- Experience with optimizations such as tablet distribution, bucketing, colocation groups, materialized views, and primary key models
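As a framework-agnostic illustration of the key-salting technique named above, the sketch below spreads a hot key across N sub-keys so no single partition of a keyed stream or bucketed table absorbs all of its traffic. The helper class and the "#" delimiter are assumptions made for this example only.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper illustrating key salting for skew mitigation.
public final class KeySalting {

    private KeySalting() {}

    // Append a random salt in [0, saltCount) to the original key so that
    // records for a hot key are distributed across saltCount partitions.
    public static String salt(String key, int saltCount) {
        int salt = ThreadLocalRandom.current().nextInt(saltCount);
        return key + "#" + salt;
    }

    // Recover the original key when merging per-salt partial results
    // in a second aggregation stage.
    public static String unsalt(String saltedKey) {
        int idx = saltedKey.lastIndexOf('#');
        return idx < 0 ? saltedKey : saltedKey.substring(0, idx);
    }
}
```

In a real pipeline the salted key would drive the stream's keyed partitioning step, with a follow-up stage stripping the salt and reducing per-salt partial aggregates back onto the original key.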