From: route@monster.com

Sent: Monday, September 28, 2015 1:01 PM

To: hg@apeironinc.com

Subject: Please review this candidate for: Talend

 

This resume has been forwarded to you at the request of Monster User xapeix03

SHAWN DING 

Last updated:  08/31/15

Job Title: not specified

Company: not specified

Rating: Not Rated

Screening score: not specified

Status:  Resume Received


Manhattan, NY  10003
US


 

 

RESUME

  

Resume Headline: SHAWN DING - Senior Big Data Consultant/Hadoop developer

Resume Value: adx8sjutuiapp67r   

  

 

 

SHAWN DING

 

shawn.ding55@gmx.com

Summary:

 

·  6+ years of experience in the IT industry, including 4 years as a Hadoop Developer and 2 years as a Back-end Java Developer

·  Experience in setting up and maintaining Hadoop clusters running HDFS and MapReduce on YARN

·  Good command of two data formats: Avro for data serialization and Parquet for nested data

·  Experience in using Kafka and Flume to ingest real-time data streams

·  Expertise in using the data ingestion tool Sqoop for bulk data transfer

·  Experience in using high-level data processing tools such as Pig, Hive, Crunch, Storm, and Spark with Hadoop

·  Handled and further processed both schema-oriented and non-schema-oriented data using Pig

·  Good at extending Pig and Hive core functionality by writing custom UDFs

·  Deep experience writing Pig Evaluation and Filter functions in Java (see the sketch after this list)

·  Exposure to the HBase distributed database and the ZooKeeper distributed coordination service

·  Good knowledge of SQL, RDBMSs (Oracle, MySQL, PostgreSQL), and NoSQL databases (MongoDB, HBase, Oracle Berkeley DB)

·  Knowledge of workflow automation using shell scripts and Oozie

·  Experience with commercial Hadoop distributions, including Hortonworks Data Platform (HDP) and Cloudera CDH

·  Core Java expertise, including class design, multithreading, I/O and JDBC, collections, and localization, with the ability to develop new APIs for different projects

·  Extensive experience in developing web applications such as Java Applets

·  Knowledge of data mining and analysis, including regression models, decision trees, association rule mining, customer segmentation, and hypothesis testing; proficient in R (including R packages) and SAS

·  Quick learner and good team player with strong communication and presentation skills
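
A minimal sketch of the kind of custom Pig Filter function mentioned above, written in Java against the org.apache.pig UDF API; the class name and the field it checks are illustrative, not taken from an actual project.

import java.io.IOException;
import org.apache.pig.FilterFunc;
import org.apache.pig.data.Tuple;

// Hypothetical filter UDF: keeps only tuples whose first field is non-empty.
// Used from Pig as: clean = FILTER logs BY NonEmpty(line);
public class NonEmpty extends FilterFunc {
    @Override
    public Boolean exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0) {
            return false;                                 // drop empty tuples
        }
        Object field = input.get(0);
        return field != null && !field.toString().isEmpty();
    }
}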

 

 

CERTIFICATIONS:

 

·  Cloudera Certified Developer for Apache Hadoop (CCDH 4.1)

·  Oracle Certified Associate, Java SE 7 Programmer

·  Oracle Certified Professional, Java SE 6 Programmer

·  SAS Certified Base Programmer for SAS 9

 

 

TECHNICAL SKILLS:

 

Apache Hadoop Eco-system: HDFS, MapReduce V1, YARN, Flume, Kafka, Sqoop, Hive, Pig, Spark, Storm, Oozie, Crunch, ZooKeeper, Parquet, Avro, AWS (EC2, EMR, S3)

Relational Databases: Oracle 11g/10g/9i, MySQL 5.0, Microsoft SQL Server 9.0, PostgreSQL 8.0

Hadoop Distributions: Cloudera Distribution (CDH4, CM), Hortonworks Distribution (HDP)

Java Technologies: JSP, Servlet, EJB, JDBC, Applet, Spring, Hibernate, Struts

NoSQL Databases: HBase, MongoDB, Oracle Berkeley DB

Scripting: UNIX Shell Scripting, XML, HTML, JavaScript

Tools: MRUnit, Maven, Git, SVN, Talend Open Studio

Operating Systems: Linux (CentOS, Ubuntu), UNIX, Microsoft Windows, Mac OS

Data Analysis Skills: Matlab, R language, SAS, VBA, SQL

 

 

Professional Experience:

 

Bloomberg L.P. - Enterprise Content & Delivery Group, Manhattan, NY     10/2014 - Present

Senior Big Data Consultant                                                          

 

Bloomberg L.P. is a financial data and media company that focuses on delivering data, news, and analytics to its clients.

 

High-Performing Distributed System

This high-performing distributed system was built to manage huge volumes of company data. Based on the Hadoop platform, it gathers billions of data points, moves data easily between different databases, and stores the data efficiently for trend analysis, billing, and business intelligence.

 

Responsibilities:

·  Captured business analysis requirements and translated them into technical designs in the Hadoop ecosystem

·  Worked on deploying and tuning live Hortonworks Data Platform (HDP) production clusters

·  Implemented Flume multiplexing to stream log data from web servers into HDFS

·  Used Apache Kafka to ingest real-time data streams, then pushed the data to HBase and HDFS clusters using Apache Sqoop and Apache Storm

·  Used Storm to analyze, clean, normalize, and deduplicate large amounts of non-unique log data

·  Queried and analyzed data in Oracle Berkeley DB (NoSQL) to verify data schema accuracy

·  Developed and implemented a data migration strategy from Oracle Berkeley DB to HBase

·  Used Sqoop for data ingestion from RDBMSs into Hive, then stored the data in HBase

·  Used MapReduce, Spark, Pig, and Hive for data cleansing and processing

·  Adopted Spark and Spark SQL to build processing applications and improved their performance using Scala UDFs

·  Wrote Apache Pig scripts to process the HDP data

·  Created Hive tables to store the processed results in tabular format

·  Achieved high-performing data migration by modifying the Java API

·  Developed Java Applets that let data scientists query, process, and analyze data in HBase directly (see the sketch after this list)

·  Developed Oozie workflows for scheduling and orchestrating the ETL process

·  Used Maven to manage dependencies when developing Java applications

·  Practiced Agile Development and Test-Driven Development
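
A minimal sketch of a direct HBase read of the kind such an Applet might perform, using the era-appropriate HTable client API; the table name, column family, and qualifier are hypothetical.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class MetricsLookup {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();  // reads hbase-site.xml
        HTable table = new HTable(conf, "metrics");        // hypothetical table name
        try {
            Get get = new Get(Bytes.toBytes(args[0]));     // row key from the command line
            Result result = table.get(get);
            // "d:count" is an illustrative column family:qualifier
            byte[] value = result.getValue(Bytes.toBytes("d"), Bytes.toBytes("count"));
            System.out.println(value == null ? "(no value)" : Bytes.toString(value));
        } finally {
            table.close();
        }
    }
}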

 

Environment:

Linux (CentOS, Ubuntu), UNIX Shell, Oracle Berkeley DB, HBase, Pig, Hive, Java Applets, Eclipse, Core Java, JDK 1.7, Oozie workflows, Agile

 

 

Huron Consulting Group, Manhattan, New York     01/2014-09/2014

Big Data Developer                                                                  

 

The Huron Consulting Group is a global management consulting company offering services in the Healthcare, Education, Life Sciences, Law, and Finance industries.

 

Enterprise Big Data System

This project focused on deploying the Cloudera Hadoop distribution, setting up an initial data warehouse for the enterprise's large volumes of document data, and analyzing those files across projects to derive insights and metrics for the company.

 

Responsibilities:

·  Involved in setting up and monitoring a distributed analytics cluster using Cloudera CDH4 and CM

·  Worked on a live Hadoop production CDH cluster with 107 nodes

·  Created Hive tables to store processed results in tabular format and developed Hive scripts to denormalize and aggregate the disparate data

·  Exported data from an Oracle database to HDFS and Hive using Sqoop, Storm, and an NFS-mount approach

·  Pushed data collected from health exchange information and claims feeds to HDFS clusters using Apache Storm

·  Exported the analyzed data from Hive to relational databases using Sqoop for visualization and to generate reports for the BI team

·  Designed and developed Pig data transformation scripts and UDFs to process semi-structured data

·  Worked on data cleansing using Apache Avro schemas and implemented it in Pig (see the sketch after this list)

·  Involved in developing and customizing MapReduce programs in Java

·  Loaded the created HFiles into HBase for faster access to a large customer base without sacrificing performance

·  Automated workflows using shell scripts and Oozie jobs to pull data from various databases into Hadoop

·  Actively participated in the software development lifecycle (scope, design, implement, deploy, test), including design and code reviews

·  Followed agile development methodology and actively participated in daily scrum meetings
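
A minimal sketch of schema-driven cleansing with Avro of the sort mentioned above: parse a schema, then reject records that do not conform. The schema and field names are hypothetical, not from an actual project.

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

public class AvroValidate {
    // Hypothetical schema for an incoming claims record
    private static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Claim\",\"fields\":["
        + "{\"name\":\"id\",\"type\":\"string\"},"
        + "{\"name\":\"amount\",\"type\":\"double\"}]}");

    public static void main(String[] args) {
        GenericRecord record = new GenericData.Record(SCHEMA);
        record.put("id", "c-1001");
        record.put("amount", 250.0);
        // GenericData.validate checks the record against the schema
        boolean ok = GenericData.get().validate(SCHEMA, record);
        System.out.println(ok ? "record conforms" : "record rejected");
    }
}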

 

Environment:

Red Hat Linux, CentOS, Hive, Pig, Sqoop, Oozie, Oracle Database, HBase, Maven, Agile

 

 

Baidu Inc., Beijing, China     09/2012-08/2013

Hadoop Developer

 

Baidu is a Chinese web services company (NASDAQ: BIDU) that mainly offers a Chinese-language search engine for websites, audio files, and images.

 

Big Data Search Engine:

This project focused on a next-generation internet search engine, which the company claims is twice as fast as its predecessor. The new system is based on a fully managed, scalable NoSQL database service offered through the open-source Apache HBase application programming interface (API).

 

Responsibilities:

·  Involved in architecture design, development, and implementation of Hadoop deployment, backup, and recovery systems

·  Worked on the proof of concept for the Apache Hadoop framework initiative

·  Reviewed HDFS usage and system design for future scalability and fault tolerance

·  Loaded large volumes of application server logs using Flume and migrated data into HBase using Sqoop and Storm

·  Computed statistics over the logs and extracted useful information from them in real time using Storm

·  Developed MapReduce jobs for log analysis, recommendation, and analytics

·  Optimized MapReduce jobs using combiners and partitioners to deliver the best results, and worked on application performance optimization for an HDFS cluster (see the sketch after this list)

·  Wrote MapReduce jobs to generate reports on the number of activities created on a particular day; input was dumped from multiple sources and the output was written back to HDFS

·  Wrote Pig scripts to generate MapReduce jobs and performed ETL procedures on the data in HDFS

·  Processed HDFS data and created external tables using Hive

·  Involved in implementing a data migration strategy from MongoDB to HBase; created data schemas suitable for MongoDB data ingestion

·  Used Talend Open Studio to move data from MongoDB to PostgreSQL

·  Exported analyzed data to an Oracle database using Sqoop for generating reports

·  Involved in setting up cluster coordination services through ZooKeeper

·  Involved in agile development methodology
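
A minimal sketch of the combiner and custom-partitioner wiring described above, using the standard org.apache.hadoop.mapreduce driver API; the job name, class names, and log format are illustrative, not from an actual project.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Partitioner;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class LogCountDriver {
    // Hypothetical log format: the first whitespace-separated token is the event type
    public static class LogCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String[] parts = value.toString().split("\\s+", 2);
            if (parts.length > 0 && !parts[0].isEmpty()) {
                ctx.write(new Text(parts[0]), ONE);
            }
        }
    }

    // Sums counts per event type; safe to reuse as a combiner because sum is associative
    public static class LogCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            ctx.write(key, new IntWritable(sum));
        }
    }

    // Hypothetical partitioner: routes keys by first character so related keys
    // land on the same reducer
    public static class FirstCharPartitioner extends Partitioner<Text, IntWritable> {
        @Override
        public int getPartition(Text key, IntWritable value, int numPartitions) {
            if (key.getLength() == 0) return 0;
            return (key.charAt(0) & Integer.MAX_VALUE) % numPartitions;
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "log count");
        job.setJarByClass(LogCountDriver.class);
        job.setMapperClass(LogCountMapper.class);
        job.setCombinerClass(LogCountReducer.class);   // combiner cuts shuffle volume
        job.setReducerClass(LogCountReducer.class);
        job.setPartitionerClass(FirstCharPartitioner.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}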

 

Environment:

CentOS, Oracle Database, PostgreSQL, Hive, Pig, Sqoop, Flume, HBase, ZooKeeper, Maven, Talend Open Studio, Agile

 

 

Alibaba Group, Beijing, China    09/2011-08/2012

Java Developer               

 

Alibaba Group Holding Limited (NYSE: BABA) is a Chinese e-commerce company that provides consumer-to-consumer, business-to-consumer, and business-to-business sales services via web portals.

 

Online Order System:

This application enables users at different levels to select products and place orders. Functions such as maintaining a wish list and expediting or canceling an order can be performed on the web. It also provides product recommendations to customers.

 

Responsibilities:

·  Developed Java classes used in JSPs and Servlets

·  Improved coding standards and code reuse; increased performance of the extended applications by making effective use of design patterns (MVC, DAO)

·  Implemented messaging using JMS to track the status of the services

·  Used JDBC to retrieve data from the Oracle database (see the sketch after this list)

·  Developed procedures, functions, and triggers in Oracle PL/SQL

·  Developed the UI using JSP, HTML/CSS, and JavaScript

·  Analyzed customer data using cloud technologies such as Hadoop 1.0 and HDFS
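
A minimal sketch of a JDBC lookup of the kind described above; the connection URL, credentials, table, and column names are placeholders, not real project values.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class OrderLookup {
    public static void main(String[] args) throws Exception {
        // Placeholder Oracle thin-driver URL and credentials; the Oracle JDBC
        // driver jar must be on the classpath
        String url = "jdbc:oracle:thin:@//db-host:1521/ORCL";
        try (Connection conn = DriverManager.getConnection(url, "app_user", "secret");
             PreparedStatement ps = conn.prepareStatement(
                 "SELECT status FROM orders WHERE order_id = ?")) {  // hypothetical table
            ps.setLong(1, Long.parseLong(args[0]));
            try (ResultSet rs = ps.executeQuery()) {
                System.out.println(rs.next() ? rs.getString("status") : "order not found");
            }
        }
    }
}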

 

Environment:

J2EE, Java, Servlets, JDK, JSTL, Oracle, Eclipse, Windows, Linux, Maven, HDFS

 

 

Renren Network, Beijing, China     01/2009-08/2011

Java Developer                                                                     

 

The Renren Network (NYSE: RENN), sometimes referred to as “the Facebook of China”, is a Chinese social networking service popular among college students.

 

XMPP Instant Messaging (IM) application:

This application (an Applet) runs in both C/S and B/S modes and enables users at multiple levels to log in, add and group friends, chat with each other, review chat history, and transfer files to and download files from other users. The system guarantees safe access to user accounts.

 

Responsibilities:

·  Focused on web Applet development

·  Worked in an agile team using JDK 6.0, Tomcat, and PostgreSQL on Windows and Linux (Ubuntu)

·  Built the application using the MVC (Model-View-Controller) pattern and the Struts2 framework

·  Responsible for C/S features using Java threads, Swing, and I/O

·  Worked with Spring as the web-container framework

·  Modified POJO classes for new features; implemented DAO interfaces and wrote the business logic using Servlets (see the sketch after this list)
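
A minimal sketch of the DAO-plus-Servlet shape described in the last bullet; the interface, class, and parameter names are illustrative, not from an actual project.

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical DAO contract for chat history
interface ChatHistoryDao {
    String findLastMessage(String userId);
}

// Trivial in-memory implementation so the sketch is self-contained
class InMemoryChatHistoryDao implements ChatHistoryDao {
    public String findLastMessage(String userId) {
        return "demo".equals(userId) ? "hello again" : null;
    }
}

// Servlet that delegates data access to the DAO and keeps business logic thin
public class ChatHistoryServlet extends HttpServlet {
    // In a real deployment the DAO would be injected or looked up via Spring
    private final ChatHistoryDao dao = new InMemoryChatHistoryDao();

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        String userId = req.getParameter("user");
        String message = (userId == null) ? null : dao.findLastMessage(userId);
        if (message == null) {
            resp.sendError(HttpServletResponse.SC_NOT_FOUND, "no history");
        } else {
            resp.setContentType("text/plain");
            resp.getWriter().println(message);
        }
    }
}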

 

Environment:

Java, J2EE, Servlets, Spring MVC, Oracle, MySQL, Eclipse, JSP, Hibernate

 

 

EDUCATION      

 

·  Master's in Mathematics

·  Bachelor's in Information and Computational Science

 



Experience


 

Job Title: Senior Big Data Consultant

Company: Bloomberg

Experience: - Present

 

Additional Info


 

Current Career Level:

Experienced (Non-Manager)

Date of Availability:

Immediately

Work Status:

US - I am authorized to work in this country for any employer.

Active Security Clearance:

None

US Military Service:

Citizenship:

None

 

 

Target Job:

Target Job Title:

Senior Big Data Consultant/Hadoop developer

Desired Job Type:

Intern
Temporary/Contract/Project
Seasonal

Desired Status:

Part-Time

 

Target Company:

Company Size:

Occupation:

IT/Software Development

·  Software/Web Development

 

Target Locations:

Selected Locations:

US-NY-New York City

Relocate:

Yes

Willingness to travel:

Up to 100%

 

Languages:

English: Fluent