Big Data Engineer
Job ID : 28784
Job Title : Big Data Engineer
Location : Washington, DC
Company Name : Atechstar
Job Type : Full-Time, Part-Time, Contract, Training
Industry : Information Technology
Salary : $258900 - $499000 per year
Work Authorization : OPT, CPT, F1, H4, L1, H1 Visa, TN Permit Holder, Green Card Holder, Canadian Citizen, US Citizen
No. of Positions : Ongoing need to fill this role
Posted on : 06-24-2022
Required Skills : Hadoop, Big Data Engineering, Hive, Scala, Spark, Java, C++, Apache Hadoop, Shell Scripting, Cassandra, Flume, JIRA, Avro, Linux, AWS-CWI, Sqoop, Apache Spark, Oozie, Python
Benefits : Medical Insurance, Dental Insurance, Vision Insurance, 401K, Life Insurance
Job Description :
Roles & Responsibilities:
- You would be responsible for evaluating, developing, maintaining and testing big data solutions for advanced analytics projects
- The role would involve big data pre-processing & reporting workflows including collecting, parsing, managing, analyzing and visualizing large sets of data to turn information into business insights
- The role would also involve testing various machine learning models on Big Data, and deploying learned models for ongoing scoring and prediction. An appreciation of the mechanics of complex machine learning algorithms would be a strong advantage.
Qualification & Experience:
0-1 years of demonstrable experience designing technological solutions to complex data problems, developing & testing modular, reusable, efficient and scalable code to implement those solutions.
Ideally, this would include work on the following technologies:
- Expert-level proficiency in at least one of Java, C++ or Python (preferred). Scala knowledge is a strong advantage.
- Strong understanding of, and experience with, distributed computing frameworks, particularly Apache Hadoop 2.0 (YARN, MapReduce and HDFS) and associated technologies, including one or more of Hive, Sqoop, Avro, Flume, Oozie, ZooKeeper, etc.
- Hands-on experience with Apache Spark and its components (Streaming, SQL, MLlib) is a strong advantage.
- Operating knowledge of cloud computing platforms (AWS, especially EMR, EC2, S3, SWF services and the AWS CLI)
- Experience working within a Linux computing environment, and use of command line tools including knowledge of shell/Python scripting for automating common tasks
- Ability to work in a team in an agile setting, familiarity with JIRA and a clear understanding of how Git works
- In addition, the ideal candidate would have great problem-solving skills, and the ability & confidence to hack their way out of tight corners.
Must Have (hands-on) experience:
- Java or Python or C++ expertise
- Linux environment and shell scripting
- Distributed computing frameworks (Hadoop or Spark)
- Cloud computing platforms (AWS).
Desirable (would be a plus):
- Statistical or machine learning DSL like R
- Distributed and low latency (streaming) application architecture
- Row store distributed DBMSs such as Cassandra
- Familiarity with API design
Role : Big Data Engineer
Industry Type : Emerging Technologies
Functional Area : Engineering - Software & QA
Employment Type : Full Time, Permanent
Role Category : Software Development
Education UG : Any Graduate
Key Skills : Hadoop, Big Data Engineering, Hive, Scala, Spark, Java, C++, Apache Hadoop, Shell Scripting, Cassandra, Flume, JIRA, Avro, Linux, AWS-CWI, Sqoop, Apache Spark, Oozie, Python