Data Engineer
Bucureşti, Romania
6 days ago

General Role:

Ensure the quality of data transformations connecting to the cluster data lake solution (on-premises) or to cloud architectures, in order to design and implement end-to-end solutions: reliable data processing, data ingestion, exposure of data through APIs, and data visualization, all within a DevOps culture.

All applications are new developments/products for different LOBs (lines of business) inside our Group.

General Skills: Experience designing and implementing end-to-end data streams in Big Data architectures (Hadoop clusters, NoSQL databases, Elasticsearch), as well as in massive distributed data processing environments using frameworks such as Spark with Scala.
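The end-to-end pattern this role centers on (ingest → prepare → expose) can be sketched in miniature. This is a hypothetical illustration in plain Python (one of the languages the posting names), not the actual cluster stack; the functions `ingest`, `prepare`, and `expose` and the sample records are invented for the example, standing in for NiFi sources, Spark jobs, and an API layer respectively.

```python
import json

# Ingest: parse raw records (stand-in for a NiFi/Kafka source)
def ingest(raw_lines):
    return [json.loads(line) for line in raw_lines]

# Prepare: drop invalid rows and normalize fields (stand-in for a Spark job)
def prepare(records):
    return [
        {"user": r["user"].strip().lower(), "amount": float(r["amount"])}
        for r in records
        if r.get("user") and r.get("amount") is not None
    ]

# Expose: aggregate into a data product that an API would serve
def expose(records):
    totals = {}
    for r in records:
        totals[r["user"]] = totals.get(r["user"], 0.0) + r["amount"]
    return totals

raw = ['{"user": " Ana ", "amount": "10.5"}',
       '{"user": "ana", "amount": "4.5"}',
       '{"user": "", "amount": "3"}']
print(expose(prepare(ingest(raw))))  # {'ana': 15.0}
```

In a real deployment each stage would be a separate, independently scalable component; the point here is only the shape of the chain the role is responsible for.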

Mandatory:

  • Languages and frameworks: Spark, Scala, SQL; DevOps: Git, Jenkins, GitLab
  • Data lake: NiFi in a distributed architecture; Hortonworks with Hive and Oozie
  • Database: NoSQL and SQL
  • Cloud: GCP (Google) nice to have, AWS nice to have
  • The role is not centered on data science / Machine Learning / Artificial Intelligence, so deep specialization there is not expected.

    The project methodology is exclusively Agile/Scrum, within a DevOps culture.

    Main Responsibilities in different stages:

    During project definition:

  • Design of data ingestion chains
  • Design of data preparation chains
  • Basic ML algorithm design
  • Data product design
  • Design of NoSQL data models
  • Design of data visualizations
  • Participation in the selection of services/solutions according to use cases
  • Participation in the development of a data toolbox
    During the iterative implementation phase:

  • Implementation of data ingestion chains
  • Implementation of data preparation chains
  • Implementation of basic ML algorithms
  • Implementing data visualizations
  • Using ML framework
  • Implementation of data products
  • Exposure of data products
  • NoSQL database configuration/parametrization
  • Use of functional languages
  • Debugging of distributed processes and algorithms
  • Identification and cataloging of reusable entities
  • Contribution to development working standards
  • Contributions and solution proposals on data processing issues
    During the integration and deployment phase:

  • Participation in problem solving
    Technical Requirements:

  • Expertise in the implementation of end-to-end data processing chains
  • Experience in distributed architecture
  • Basic knowledge and interest in the development of ML algorithms
  • Knowledge of different ingestion mechanisms/frameworks
  • Knowledge of Spark and its different modules
  • Proficiency in Scala and/or Python
  • Knowledge of the AWS or GCP environment
  • Knowledge of the NoSQL database environment
  • Knowledge of building APIs for data products
  • Knowledge of Dataviz tools and libraries
  • Experience in Spark debugging and distributed systems
  • Experience extending complex systems
  • Proficiency with data notebooks
  • Experience in data testing strategies
  • Strong problem-solving skills, initiative, and the ability to work under pressure
  • Strong interpersonal and communication skills (able to go into detail)