From: route@monster.com
Sent: Monday, September 28, 2015 1:01 PM
To: hg@apeironinc.com
Subject: Please review this candidate for: Talend
This resume has been forwarded to you at the request of Monster User xapeix03
SHAWN DING
shawn.ding55@gmx.com

Summary:
· 6+ years of experience in the IT industry, including 4 years as a Hadoop Developer and 2 years as a back-end Java Developer
· Experience setting up and maintaining Hadoop clusters running HDFS and MapReduce on YARN
· Proficient with two data formats: Avro for data serialization and Parquet for nested data
· Experience using Kafka and Flume to ingest real-time data streams
· Expertise with the data ingestion tool Sqoop for bulk data transfer
· Experience using high-level data processing tools such as Pig, Hive, Crunch, Storm and Spark with Hadoop
· Handled and processed both schema-oriented and non-schema-oriented data using Pig
· Skilled at extending Pig and Hive core functionality by writing custom UDFs
· In-depth experience writing Pig Eval and Filter functions in Java (a minimal sketch follows the Certifications section)
· Exposure to the HBase distributed database and the ZooKeeper distributed configuration service
· Good knowledge of SQL, relational databases (Oracle, MySQL, PostgreSQL) and NoSQL databases (MongoDB, HBase, Oracle Berkeley DB)
· Knowledge of automated workflow control using shell scripts and Oozie
· Experience with commercial Hadoop distributions, including the Hortonworks Data Platform (HDP) and Cloudera CDH
· Core Java technology, including class design, multithreading, I/O, JDBC, collections and localization, with the ability to develop new APIs for different projects
· Extensive experience developing web applications such as Java applets
· Knowledge of data mining and analysis, including regression models, decision trees, association rule mining, customer segmentation and hypothesis testing; proficient in R (including R packages) and SAS
· Quick learner and good team player with strong communication and presentation skills

CERTIFICATIONS:
· Cloudera Certified Developer for Apache Hadoop (CCDH 4.1)
· Oracle Certified Associate, Java SE 7 Programmer
· Oracle Certified Professional, Java SE 6 Programmer
· SAS Certified Base Programmer for SAS 9
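Illustrative sketch (not project code): a minimal Pig Eval function in Java, of the kind referenced in the summary above. The class name and the normalization rule are hypothetical; a real UDF would carry the project's own field logic.

    import java.io.IOException;

    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Hypothetical example: trims and upper-cases a raw string field.
    public class NormalizeField extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            // Guard against empty tuples and null fields, which Pig passes through.
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            return input.get(0).toString().trim().toUpperCase();
        }
    }

In a Pig script this would be registered from its jar (REGISTER) and invoked per record in a FOREACH ... GENERATE clause.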
TECHNICAL SKILLS:
Apache Hadoop Eco-system: HDFS, MapReduce v1, YARN, Flume, Kafka, Sqoop, Hive, Pig, Spark, Storm, Oozie, Crunch, ZooKeeper, Parquet, Avro, AWS (EC2, EMR, S3)
Hadoop Distributions: Cloudera Distribution (CDH4, CM), Hortonworks Distribution (HDP)
Relational Databases: Oracle 11g/10g/9i, MySQL 5.0, Microsoft SQL Server 9.0, PostgreSQL 8.0
NoSQL Databases: HBase, MongoDB, Oracle Berkeley DB
Java Technologies: JSP, Servlet, EJB, JDBC, Applet, Spring, Hibernate, Struts
Scripting: UNIX shell scripting, XML, HTML, JavaScript
Tools: MRUnit, Maven, Git, SVN, Talend Open Studio
Operating Systems: Linux (CentOS, Ubuntu), UNIX, Microsoft Windows, Mac OS
Data Analysis Skills: Matlab, R, SAS, VBA, SQL

Professional Experience:

Bloomberg L.P. - Enterprise Content & Delivery group, Manhattan, NY
10/2014 - Present
Senior Big Data Consultant
Bloomberg L.P. is a financial data and media company focused on delivering data, news and analytics to its clients.

High-Performing Distributed System
A high-performing distributed system built to manage huge volumes of company data. Based on the Hadoop platform, the system gathers billions of data points, moves data easily between different databases, and stores it efficiently for trend analysis, billing and business intelligence.

Responsibilities:
· Captured business analysis requirements and translated them into technical designs in the Hadoop ecosystem
· Worked on deploying and tuning live Hortonworks Data Platform (HDP) production clusters
· Implemented Flume (multiplexing) to stream log data from web servers into HDFS
· Used Apache Kafka to ingest real-time data streams, then pushed the data to the HBase and HDFS clusters using Apache Sqoop and Apache Storm (a minimal producer sketch follows this section)
· Used Storm to analyze, clean, normalize and resolve large amounts of non-unique log data
· Queried and analyzed data from Oracle Berkeley DB (NoSQL) to verify data schema accuracy
· Developed and implemented a data migration strategy from Oracle Berkeley DB to HBase
· Used Sqoop for data ingestion from RDBMSs into Hive, then stored the data in HBase
· Used MapReduce, Spark, Pig and Hive for data cleansing and processing
· Adopted Spark and Spark SQL to build processing applications and improved their performance using Scala UDFs
· Wrote Apache Pig scripts to process data on HDP
· Created Hive tables to store processed results in a tabular format
· Achieved high-performance data migration based on a modified Java API
· Developed Java applets that let data scientists query, process and analyze data in HBase directly
· Developed Oozie workflows for scheduling and orchestrating the ETL process
· Used Maven to manage dependencies when developing Java applications
· Practiced Agile development and test-driven development

Environment: Linux (CentOS, Ubuntu), UNIX shell, Oracle Berkeley DB, HBase, Pig, Hive, Java applets, Eclipse, Core Java, JDK 1.7, Oozie workflows, Agile
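To illustrate the Kafka ingestion described above, a minimal Java producer sketch. The broker address, topic name and payload are hypothetical placeholders, not details from the actual project.

    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class LogEventProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092"); // assumed broker address
            props.put("key.serializer",
                      "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                      "org.apache.kafka.common.serialization.StringSerializer");
            props.put("acks", "all"); // wait for the full commit before acknowledging

            // KafkaProducer is Closeable, so try-with-resources flushes and closes it.
            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                // "web-logs" is a hypothetical topic; key and value stand in for real log fields.
                producer.send(new ProducerRecord<>("web-logs", "host-01", "GET /index 200"));
            }
        }
    }

Downstream consumers (here, Storm topologies) would read from the same topic and write to HBase/HDFS.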
Huron Consulting Group, Manhattan, NY
01/2014 - 09/2014
Big Data Developer

Huron Consulting Group is a global management consulting company offering services in the healthcare, education, life sciences, law and finance industries.

Enterprise Big Data System
This project focused on deploying the Cloudera Hadoop distribution, setting up an initial data warehouse for the enterprise's huge volume of document data, and analyzing those files across projects to produce insights and metrics for the company.

Responsibilities:
· Involved in setting up and monitoring a distributed analytics cluster using Cloudera CDH4 and Cloudera Manager
· Worked on a live Hadoop production CDH cluster with 107 nodes
· Created Hive tables to store processed results in a tabular format and developed Hive scripts to denormalize and aggregate the disparate data
· Exported data from Oracle Database to HDFS and Hive using Sqoop, Storm and an NFS-mount approach
· Pushed data collected from health information exchanges, along with claims data, to HDFS clusters using Apache Storm
· Exported the analyzed data from Hive to relational databases using Sqoop, for visualization and for generating reports for the BI team
· Designed and developed Pig data transformation scripts and UDFs to process semi-structured data
· Worked on data cleansing using Apache Avro schemas and implemented it in Pig
· Involved in developing and customizing MapReduce programs in Java
· Loaded generated HFiles into HBase for fast access to a large customer base without degrading performance (a minimal write-path sketch follows this section)
· Automated workflows using shell scripts and Oozie jobs to pull data from various databases into Hadoop
· Actively participated in the software development lifecycle (scope, design, implement, deploy, test), including design and code reviews
· Involved in Agile development methodology and actively participated in daily scrum meetings

Environment: Red Hat Linux, CentOS, Hive, Pig, Sqoop, Oozie, Oracle Database, HBase, Maven, Agile
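To illustrate the HBase loading mentioned above: a minimal single-Put sketch, written against the newer Connection/Table client API for clarity (the project's bulk HFile load path is more involved). The table name, column family, qualifier and row key are all hypothetical.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ClaimWriter {
        public static void main(String[] args) throws Exception {
            // Reads hbase-site.xml from the classpath for ZooKeeper quorum etc.
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("claims"))) { // hypothetical table
                Put put = new Put(Bytes.toBytes("claim#0001"));              // hypothetical row key
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("status"), Bytes.toBytes("OPEN"));
                table.put(put);
            }
        }
    }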
Baidu Inc., Beijing, China
09/2012 - 08/2013
Hadoop Developer

Big Data Search Engine
This project focused on a next-generation internet search engine, which the company claims is twice as fast as its predecessor. The new system is based on a fully managed, scalable NoSQL database service offered through the open-source Apache HBase application programming interface (API).

Responsibilities:
· Involved in architecture design, development and implementation of Hadoop deployment, backup and recovery systems
· Worked on the proof of concept for the Apache Hadoop framework initiative
· Reviewed HDFS usage and system design for future scalability and fault tolerance
· Loaded large amounts of application server logs using Flume and migrated data into HBase using Sqoop and Storm
· Generated statistics from logs and extracted useful information from them in real time using Storm
· Developed MapReduce jobs for log analysis, recommendation and analytics
· Optimized MapReduce jobs using combiners and partitioners to deliver the best results, and worked on application performance optimization for an HDFS cluster (a minimal combiner sketch follows this section)
· Wrote MapReduce jobs to generate reports on the number of activities created on a particular day from data dumped from multiple sources, writing the output back to HDFS
· Wrote Pig scripts to generate MapReduce jobs and performed ETL procedures on the data in HDFS
· Processed HDFS data and created external tables using Hive
· Involved in implementing a data migration strategy from MongoDB to HBase, and created a data schema suitable for MongoDB data ingestion
· Used Talend Open Studio to move data from MongoDB to PostgreSQL
· Exported analyzed data to an Oracle database using Sqoop for generating reports
· Involved in setting up cluster coordination services through ZooKeeper
· Involved in Agile development methodology

Environment: CentOS, Oracle Database, PostgreSQL, Hive, Pig, Sqoop, Flume, HBase, ZooKeeper, Maven, Talend Open Studio, Agile
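A self-contained sketch of the combiner optimization named above, assuming a word-count-style log aggregation: the reducer doubles as the combiner so map output is pre-aggregated before the shuffle. Class names and path arguments are illustrative.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class LogCount {

        public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                for (String token : value.toString().split("\\s+")) {
                    word.set(token);
                    ctx.write(word, ONE);
                }
            }
        }

        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) sum += v.get();
                ctx.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "log count");
            job.setJarByClass(LogCount.class);
            job.setMapperClass(TokenMapper.class);
            // Counting is associative and commutative, so the reducer is a safe combiner;
            // it pre-aggregates map output and cuts shuffle traffic.
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }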
Alibaba Group, Beijing, China
09/2011 - 08/2012
Java Developer

Alibaba Group Holding Limited (NYSE: BABA) is a Chinese e-commerce company that provides consumer-to-consumer, business-to-consumer and business-to-business sales services via web portals.

Online Order System
This application enables users at different levels to select products and place orders. Various functions, such as wish lists and expediting or canceling orders, can be performed on the web. It also provides product recommendations to customers.

Responsibilities:
· Developed Java classes for use in JSPs and Servlets
· Improved coding standards and code reuse, and increased the performance of the extended applications by making effective use of design patterns (MVC, DAO)
· Implemented messaging using JMS to get the status of the services
· Used JDBC to retrieve data from the Oracle database (a minimal query sketch follows this section)
· Developed procedures, functions and triggers in Oracle PL/SQL
· Developed the UI using JSP, HTML/CSS and JavaScript
· Analyzed customer data using cloud technologies such as Hadoop 1.0 and HDFS

Environment: J2EE, Java, Servlets, JDK, JSTL, Oracle, Eclipse, Windows, Linux, Maven, HDFS
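A minimal sketch of the JDBC access pattern mentioned above, assuming a hypothetical orders table and connection settings; a production system would use a pooled DataSource rather than DriverManager.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class OrderDao {
        // Hypothetical connection settings; the Oracle JDBC driver must be on the classpath.
        private static final String URL  = "jdbc:oracle:thin:@dbhost:1521:orcl";
        private static final String USER = "app";
        private static final String PASS = "secret";

        public String findStatus(long orderId) throws Exception {
            String sql = "SELECT status FROM orders WHERE order_id = ?"; // hypothetical table
            try (Connection conn = DriverManager.getConnection(URL, USER, PASS);
                 PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setLong(1, orderId); // bind parameter, avoiding SQL injection
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getString("status") : null;
                }
            }
        }
    }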
Renren Network, Beijing, China
01/2009 - 08/2011
Java Developer

The Renren Network (NYSE: RENN), sometimes referred to as "the Facebook of China", is a Chinese social networking service popular among college students.

XMPP Instant Messaging (IM) Application
This applet-based application runs in both C/S and B/S modes and enables users at multiple levels to log in, add and group friends, chat with each other, check chat history, and transfer files to and download files from other users. The system guarantees safe access to user accounts.

Responsibilities:
· Focused on web applet development
· Worked in an agile team using JDK 6.0, Tomcat and PostgreSQL on Windows and Linux (Ubuntu)
· Built the application using the MVC (Model-View-Controller) pattern and the Struts2 framework
· Responsible for C/S features using Java threads, Swing and I/O
· Worked with Spring as the web-container framework
· Modified POJO classes for new features, implemented DAO interfaces and wrote the business logic in Servlets (a minimal DAO/Servlet sketch follows this section)

Environment: Java, J2EE, Servlets, Spring MVC, Oracle, MySQL, Eclipse, JSP, Hibernate
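A minimal sketch of the DAO-interface-plus-Servlet pattern described in the last bullet, written against a modern JDK for brevity (the project itself used JDK 6). The interface, class and parameter names are hypothetical, and the in-place stub stands in for a real implementation that Spring would normally inject.

    import java.io.IOException;
    import java.util.Arrays;
    import java.util.List;

    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Hypothetical DAO contract behind the chat-history feature; a concrete
    // class would run the actual queries via JDBC or Hibernate.
    interface ChatHistoryDao {
        List<String> findMessages(String userId, String friendId);
    }

    public class HistoryServlet extends HttpServlet {
        // Stub implementation for illustration only.
        private final ChatHistoryDao dao =
                (userId, friendId) -> Arrays.asList("hi", "hello");

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            resp.setContentType("text/plain");
            for (String msg : dao.findMessages(req.getParameter("user"), req.getParameter("friend"))) {
                resp.getWriter().println(msg);
            }
        }
    }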
EDUCATION
· Master's in Mathematics
· Bachelor's in Information and Computational Science
Languages:
Languages | Proficiency Level
English   | Fluent