We’re hiring a hands-on Data Scientist/Engineer to build production-ready data services and ML solutions. Your primary goal is to turn raw, messy, and distributed data into clean, reliable, well-documented datasets and insights that power product and decision-making. You’ll design robust pipelines, perform deep exploratory analysis, define data transformations and aggregations, and expose results via stable back-end interfaces (datasets, batch jobs, and APIs).
Key Responsibilities:
- Conduct data discovery and profiling across sources; map schemas, assess quality, identify gaps/anomalies, and document data contracts.
- Build and maintain ETL/ELT pipelines for ingestion, cleaning, validation, normalization, and aggregation (daily/weekly/monthly rollups, feature tables).
- Perform exploratory data analysis (EDA) and statistical analysis to uncover patterns, trends, outliers, and data issues; produce analysis-ready tables.
- Define and implement business logic and transformations (feature engineering, window functions, joins, dedupe, imputation, enrichment).
- Analyze data, test hypotheses, and develop predictive models using appropriate analysis tools and techniques.
- Ensure data quality and governance: validations, unit/expectation tests, lineage, documentation, PII handling, and access controls.
- Collaborate with research and engineering teams to translate questions into datasets, metrics, and clear definitions.
- Optimize processing performance and cost (indexing, partitioning, clustering, caching, vectorized ops, parallelization).
- Keep up with the latest data analysis techniques and industry trends to strengthen the company’s data analysis capabilities.
- Uphold engineering best practices: version control, code reviews, testing, CI/CD, reproducible notebooks/experiments, and documentation.
Required Qualifications:
- Bachelor’s degree or higher in Data Science, AI, Statistics, Computer Science, or related field.
- 3+ years in data analysis, ML model development, or data engineering.
- Proficiency in Python (pandas, NumPy, scikit-learn, XGBoost) and/or R, plus strong SQL and familiarity with NoSQL databases.
- Experience developing ML/DL models (TensorFlow, PyTorch, scikit-learn).
- Knowledge of database design and optimization (SQL/NoSQL) and solid MS Excel skills.
- Experience deploying data/ML workloads in cloud environments (GCP, AWS, or Azure).
- Basic understanding of web systems and how back-end services integrate with client apps.
- Strong logical thinking, problem-solving, and clean, maintainable coding habits.
- Strong communication skills, including the ability to present analysis results to non-expert audiences.
Preferred Qualifications:
- Prompt engineering and agent development experience.
- Built services with modern LLM frameworks (e.g., LangChain, LangGraph, DeepAgents).
- Data visualization/reporting (Tableau, Power BI) and Python viz (Matplotlib/Seaborn/Plotly) for stakeholder communication.
- Experience with Linux operating systems and shell scripting.
- Domain expertise: NLP, time-series, recommender systems, anomaly detection.
- Large-scale processing (Hadoop, Spark, Dask).
- MLOps & governance (MLflow, Kubeflow, feature stores, model/version management).
- Experience on global projects and strong English communication skills.
What You’ll Work On:
- High-quality ingestion and transformation jobs that convert raw feeds into trustworthy, analysis-ready datasets.
- Metrics, dashboards, and quality monitors that demonstrate business impact.
- Analytical studies (EDA, cohort analyses, attribution, baseline metrics) that inform product decisions.
- Productionized data products/APIs with monitoring for data freshness, completeness, and accuracy.