Data Scientist

We’re hiring a hands-on Data Scientist/Engineer to build production-ready data services and ML solutions. Your primary goal is to turn raw, messy, and distributed data into clean, reliable, well-documented datasets and insights that power product and decision-making. You’ll design robust pipelines, perform deep exploratory analysis, define transformations and aggregations for data, and expose results via stable back-end interfaces (datasets, batch jobs, and APIs).

Key Responsibilities:

  • Conduct data discovery and profiling across sources; map schemas, assess quality, identify gaps/anomalies, and document data contracts.
  • Build and maintain ETL/ELT pipelines for ingestion, cleaning, validation, normalization, and aggregation (daily/weekly/monthly rollups, feature tables).
  • Perform exploratory data analysis (EDA) and statistical analysis to uncover patterns, trends, outliers, and data issues; produce analysis-ready tables.
  • Define and implement business logic and transformations (feature engineering, window functions, joins, dedupe, imputation, enrichment).
  • Analyze data, test hypotheses, and develop predictive models using appropriate statistical and ML techniques.
  • Ensure data quality and governance: validations, unit/expectation tests, lineage, documentation, PII handling, and access controls.
  • Collaborate with research and engineering teams to translate questions into datasets, metrics, and clear definitions.
  • Optimize processing performance and cost (indexing, partitioning, clustering, caching, vectorized ops, parallelization).
  • Stay current with emerging data analysis techniques and industry trends to strengthen the company’s data capabilities.
  • Uphold engineering best practices: version control, code reviews, testing, CI/CD, reproducible notebooks/experiments, and documentation.

Required Qualifications:

  • Bachelor’s degree or higher in Data Science, AI, Statistics, Computer Science, or related field.
  • 3+ years in data analysis, ML model development, or data engineering.
  • Proficiency in Python (Pandas, NumPy, scikit-learn, XGBoost) and/or R, plus strong SQL and familiarity with NoSQL databases.
  • Experience developing ML/DL models (TensorFlow, PyTorch, scikit-learn).
  • Knowledge of database design and optimization (SQL/NoSQL) and solid working knowledge of MS Excel.
  • Experience deploying data/ML workloads in cloud environments (GCP, AWS, or Azure).
  • Basic understanding of web systems and how back-end services integrate with client apps.
  • Strong logical thinking, problem-solving, and clean, maintainable coding habits.
  • Strong communication skills, including the ability to explain analysis results to non-technical audiences.

Preferred Qualifications:

  • Prompt engineering and agent development experience.
  • Built services with modern LLM frameworks (e.g., LangChain, LangGraph, DeepAgents).
  • Data visualization/reporting (Tableau, Power BI) and Python viz (Matplotlib/Seaborn/Plotly) for stakeholder communication.
  • Experience with Linux operating systems and shell scripting.
  • Domain expertise: NLP, time-series, recommender systems, anomaly detection.
  • Large-scale processing (Hadoop, Spark, Dask).
  • MLOps & governance (MLflow, Kubeflow, feature stores, model/version management).
  • Experience on global projects and strong English communication skills.

What You’ll Work On:

  • High-quality ingestion and transformation jobs that convert raw feeds into trustworthy, analysis-ready datasets.
  • Metrics, dashboards, and quality monitors that demonstrate business impact.
  • Analytical studies (EDA, cohort analyses, attribution, baseline metrics) that inform product decisions.
  • Productionized data products/APIs with monitoring for data freshness, completeness, and accuracy.

Apply for this position

Allowed Type(s): .pdf, .doc, .docx
