Data Scientist

About the role

As a data scientist, you must have a strong background in data mining, machine learning, recommendation systems and statistics. You should have the strengths of a qualified mathematician and be able to apply concepts of mathematics and applied statistics, with a specialization in one or more of NLP, computer vision, speech and data mining, to develop models that provide effective solutions. A strong data engineering background with hands-on coding skills is needed to own and deliver outcomes.

The ideal candidate has a Master’s or Ph.D. degree in a highly quantitative field (Computer Science, Machine Learning, Operations Research, Statistics, Mathematics, etc.) or equivalent experience, and 0-7 years of industry experience in predictive modelling, data science and analysis, with prior experience in an ML or data scientist role and a track record of building ML or DL models.

Responsibilities and skills

  • Work with our customers to deliver an ML/DL project from beginning to end, including understanding the business need, aggregating and exploring data, building and validating predictive models, and deploying completed models to deliver business impact to organizations.
  • Select features, and build and optimize classifiers using ML techniques.
  • Mine data using state-of-the-art methods, and create text mining pipelines that clean and process large unstructured datasets to reveal high-quality information and hidden insights using machine learning techniques.
  • Should be able to appreciate and work on Computer Vision problems, for example, extracting rich information from images to categorize and process visual data, and developing machine learning algorithms for object and image classification, with experience in using DBSCAN, PCA, Random Forests and Multinomial Logistic Regression to select the best features to classify objects. OR
  • Deep understanding of NLP, including the fundamentals of information retrieval, deep learning approaches, transformers, attention models, text summarisation, attribute extraction, etc. Experience in one or more of the following areas is preferable: recommender systems, moderation of user-generated content, sentiment analysis, etc. OR
  • Experience in these areas: speech recognition, speech-to-text and text-to-speech, understanding of NLP and IR, text summarisation, and statistical and deep learning approaches to text processing.
  • Excellent understanding of machine learning techniques and algorithms, such as k-NN, Naive Bayes, SVM, Decision Forests, etc. Appreciation for deep learning frameworks like MXNet, Caffe2, Keras and TensorFlow.
  • Experience in working with GPUs to develop models and in handling terabyte-size datasets.
  • Experience with common data science toolkits such as R, Weka, NumPy, MATLAB, mlr, MLlib, scikit-learn, caret, etc. Excellence in at least one of these is highly desirable.
  • Should be able to work hands-on in Python, R, etc. Should collaborate closely with engineering teams to iteratively analyse data using Scala, Spark, Hadoop, Kafka, Storm, etc.
  • Experience with NoSQL databases and familiarity with data visualization tools will be a great advantage.

What will you experience in terms of culture at Sahaj?

  • A culture of trust, respect and transparency
  • Opportunity to collaborate with some of the finest minds in the industry
  • Work across multiple domains

What are the benefits of being at Sahaj?

  • Unlimited leave
  • Life insurance & private health insurance
  • Stock options
  • No hierarchy
  • Open salaries

