Key Responsibilities:
- Designing and implementing data pipelines
- Ensuring data quality and compliance with security policies
- Data collection and preprocessing: Collect and preprocess the data required for analysis.
- Data analysis and modeling: Analyze data, test hypotheses, and build predictive models using data analysis tools and techniques.
- Insight extraction and reporting: Extract insights from analysis results and write visually clear reports for decision-makers and stakeholders.
- Communicating and collaborating with researchers and data developers
- Data visualization: Present analysis results visually so they can directly inform business strategy.
- Keeping up with technology and industry trends: Track the latest data analysis techniques and industry developments to strengthen the company’s data analysis capabilities.
Key Skills:
- Understanding of and experience in data analysis and modeling
- Understanding of and experience in security and data quality management for large-scale data systems
- Proficiency in programming languages such as Python and R
- Expertise in data analysis tools and techniques: Professional knowledge of and hands-on experience with standard analysis tools
- Mathematical knowledge: Understanding of mathematical modeling and statistical techniques
- Problem-solving skills: Ability to solve complex problems
- Communication skills: Ability to explain analysis results to non-experts
Preferred Qualifications:
- Experience with CI/CD pipelines for automating and managing the data engineering process
- Experience with cloud platforms such as AWS, GCP, and Azure
- Experience with NoSQL databases
- Experience with Linux operating systems and shell scripting
- Understanding of various database and data warehousing technologies
- NLP project experience
- Experience with large-scale data processing and storage frameworks such as Spark, Hadoop, Hive, and Presto
- 2+ years of experience in designing and implementing large-scale data processing and storage systems
- 2+ years of experience in designing and implementing data pipelines