Job ID : 27313
Location : Ohio City, OH
Company Name : Infoprompt Inc
Job Type : Full-Time, Part-Time
Industry : Computer/Software
Salary : $50,000 - $100,000 per year
No. of Positions : 1
Required Skills : Big Data Developer
Benefits : No benefits are available
Job Description :
- Participate in team activities, design discussions, stand-up meetings, and planning reviews with the team.
- Perform data analysis, data profiling, data quality and data ingestion in various layers using big data/Hadoop/Hive/Impala queries, PySpark programs and UNIX shell scripts.
- Follow the organization's coding standards document, and create mappings, sessions, and workflows per the mapping specification document.
- Perform gap and impact analysis of ETL and IOP jobs for new requirements and enhancements.
- Create jobs in Hadoop using Sqoop, PySpark, and StreamSets to meet business user needs.
- Create mock data, perform unit testing, and capture result sets for the jobs developed in lower environments.
- Update the production support runbook and the Control-M schedule document for each production release.
- Create and update design documents, and provide detailed descriptions of workflows after every production release.
- Continuously monitor production data loads, fix issues, record them in the tracker document, and identify performance issues.
- Tune the performance of long-running ETL/ELT jobs by creating partitions, enabling full load, and applying other standard approaches.
- Perform quality assurance checks and reconciliation after data loads, and communicate with the vendor to receive corrected data.
- Participate in ETL/ELT code reviews and design reusable frameworks.
- Create Remedy/ServiceNow tickets to fix production issues, and create support requests to deploy database, Hadoop, Hive, Impala, UNIX, ETL/ELT, and SAS code to the UAT environment.
- Create Remedy/ServiceNow tickets and/or incidents to trigger Control-M jobs for FTP and ETL/ELT jobs on an ad hoc, daily, weekly, monthly, or quarterly basis as needed.
- Model and create stage, ODS, and data warehouse tables in Hive and Impala as needed.
- Create change requests, work plans, test results, and BCAB checklist documents for code deployments to the production environment, and validate the code after deployment.
- Work with the Hadoop, ETL, and SAS admin teams on code deployments and health checks.
- Create reusable UNIX shell scripts for file archival, file validation, and Hadoop workflow looping.
- Create a reusable Audit Balance Control framework that captures reconciliation results, mapping parameters, and variables, and serves as a single point of reference for workflows.
- Create PySpark programs to ingest historical and incremental data.
- Create Sqoop scripts to ingest historical data from the EDW Oracle database into Hadoop IOP, and create Hive table and Impala view creation scripts for dimension tables.
- Participate in meetings to continuously upgrade functional and technical expertise.