From: route@monster.com
Sent: Monday, September 28, 2015 12:59 PM
To: hg@apeironinc.com
Subject: Please review this candidate for: Talend
This resume has been forwarded to you at the request of Monster User xapeix03
Rajesh Samuel
508-745-5698 / 617-848-8196
datastage_etl@yahoo.com

PROFESSIONAL SUMMARY

General:
● Certified in DataStage 8.0 and Oracle 9i Administration; certified in multiple financial domains.
● 10+ years of IT experience, including 5 years in DataStage ETL.

Role/Performance:
● Held the roles of Sr. ETL Developer, Sr. Technical Consultant, Tech Lead, Senior Developer and Developer on ETL development projects. Completed a Proof of Concept (POC) on the SAS Enterprise BI Suite for an ETL-framework reporting architecture.
● Recognized and appreciated for several data warehouse performance tuning initiatives in DataStage (multiple ETL processes), Informatica (tuning of high-volume ETL for Consumer Banking), Base SAS (high-volume extraction) and SQL (high-volume purge process, tuning of complex data extraction logic, etc.).

Skill/Expertise (ETL):
● Experience integrating various data sources, including Oracle, DB2, Teradata, Mainframes and flat files.
● Worked on multiple DataStage (primary skill), Informatica and SAS initiatives on a large data warehouse that processes more than 10 TB/year and grows at nearly 1 TB/year.
● Good experience with UNIX shell scripting.
● Worked extensively on SQL for data extraction as well as tuning.
● Worked extensively with stage types such as Join, Lookup, Aggregator, Transformer, Merge, Sort, CDC stages (e.g., the Difference stage), Surrogate Key Generator, XML Output, Filter, Modify, Copy, Funnel, Remove Duplicates, Pivot, Switch, MQ Connector, Enterprise/Connector stages for databases (Teradata, DB2, Oracle), ODBC Connector, Sequential File, Data Set, Shared Containers and the Generic stage.

Skill/Expertise (Scheduling/Versioning/Documentation):
● Good experience with the AutoSys and Tivoli Maestro schedulers.
● Experienced with Borland StarTeam, ClearCase and SCCS for versioning, and with Quality Center and ClearQuest for defect tracking.
● Well versed in documenting Design Documents, Traceability Matrices, Unit Test and SIT scripts, Implementation Plans, Production Runbooks, Production Schedule Checklists and Warranty Documents. Thorough experience in unit testing, system integration testing, UAT, implementation, maintenance and performance tuning.

Other:
● Able to understand long-term project development issues from a budgeting/management perspective and to work through constraining situations. Strives for the best results through meticulous analysis and review. Good presentation skills.
● Well versed with multi-vendor, multi-sourcing and onsite-offshore-nearshore models. Instrumental in guiding, educating and motivating teams across geographies.

TECHNICAL SKILLS

Domain          : Data Integration in Banking/Cards/Retirement Funds
Environments    : Linux, AIX
ETL Tools       : DataStage 8.x, Informatica 8.x, Talend 3.2
BI              : SAS EBI Suite (3NF)
Scripting       : Shell scripting, Base SAS
Databases       : Oracle 9i, DB2 UDB
Versioning      : StarTeam, ClearCase, SCCS
Scheduling      : AutoSys, Tivoli Maestro
Defect Tracking : Quality Center, ClearQuest
CERTIFICATION & TRAINING

IBM Certified Solution Developer – DataStage v8.0
Oracle Certified Associate – Oracle 9i Admin
Certified by the TCS Financial Technology Centre in General Banking
Certified by the TCS Financial Technology Centre in Financial Risk Management
Certified by the National Stock Exchange in Basic Financial Management
IBM Virtual Classroom Training for DataStage 8.x
Oracle University training for Oracle Certified Professional (3 months, part time)

EDUCATION

Master of Computer Applications, Mahatma Gandhi University, India
Bachelor of Science (Computer Science), Kerala University, India

PROFESSIONAL EXPERIENCE

ODS (Operational Data Store)                                Apr 2011 – Present
TIAA-CREF, Dallas, TX
Vendor: Adroit Software Inc.

Description: Data integration project for the Operational Data Store (Oracle) of TIAA-CREF, fed from a multitude of sources such as MDM (Oracle), the Brokerage Database (SQL Server), Mainframe (flat files), EDW (Teradata) and CRM (Siebel). ODS is a purely relational database on Oracle 11g.

■ Develop DataStage jobs that streamline data into the Operational Data Store, where it is picked up by web-based client-facing applications and reporting tools.
■ Performed multiple projects involving a vast number of stages. Challenging/interesting work done in DataStage at ODS includes the following (not a complete list; illustrative sketches for two of these items follow the list):
○ Custom aggregation/summary (nested if-else logic) using the KeyChange variable and Transformer stage variable manipulation; summarizes ~10 million records down to a few hundred thousand and loads them into a summary table.
○ Performance-tuned several jobs that overloaded the 4-node Oracle DB by redesigning them to take advantage of the 32-node ETL server.
○ Developed a job for file validation – counting the number of records against the count in the trailer record. This requires careful handling of DataStage parallelism versus record sequencing, plus use of the Transformer for link sorting, stage variables and constraints, and the Otherwise/Log and AbortAfterRows properties.
○ Debugged and resolved a bug in the XML Output stage involving handling of APT_STRING_PADCHAR, using the Convert function.
○ Used an awk script in a pre-job routine to handle multiple files of a pattern with interleaved multiple headers/trailers in each file.
○ Implemented "if-else" logic requirements in tabular form so the operation can be performed quickly using a Lookup stage, which also eases maintenance and future changes.
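
As illustration only (the production implementations were DataStage jobs, not scripts): a minimal awk equivalent of the key-change summary pattern, assuming a hypothetical pre-sorted "key|amount" record layout and input file name.

  #!/bin/sh
  # keychange_summary.sh - one summary row per key, mirroring the
  # KeyChange/stage-variable pattern: accumulate until the key changes,
  # then flush the finished group.
  sort -t'|' -k1,1 input.dat | awk -F'|' -v OFS='|' '
      NR > 1 && $1 != prev { print prev, sum, cnt; sum = 0; cnt = 0 }
      { sum += $2; cnt++; prev = $1 }
      END { if (NR > 0) print prev, sum, cnt }'

And a hedged sketch of the trailer-count validation idea, assuming a hypothetical file pattern and layout in which data records start with "D" and a trailer record "T|<count>" declares the expected data-record count; the real job also had to pin down DataStage partitioning and link sorting so counts were not split across nodes, and the real awk pre-job routine additionally split files with multiple interleaved header/trailer groups.

  #!/bin/sh
  # validate_trailer.sh - fail the run when a file's data-record count
  # does not match the count declared in its trailer record.
  for f in /data/inbound/feed_*.dat; do
      awk -F'|' '
          /^D/ { n++ }          # count data records
          /^T/ { want = $2 }    # trailer-declared count
          END  {
              if (n != want) {
                  printf "%s: FAIL (%d found, %d declared)\n", FILENAME, n, want
                  exit 1        # non-zero status aborts the job sequence
              }
              printf "%s: OK (%d records)\n", FILENAME, n
          }' "$f" || exit 1
  done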

Responsibilities: (Sr. DataStage ETL Developer)
● Data integration into ODS from multiple sources.
● Lead DataStage developer for critical projects.
● Tune existing slow-running DataStage jobs; analyze performance issues, identify bottlenecks and suggest resolutions.
● Review DataStage jobs developed by the team.

Technology: DataStage, GNU/Linux, Oracle 10g, AutoSys, StarTeam, Quality Center

SMART ABC                                                   Feb 2011 – Mar 2011
Vendor: Larsen & Toubro Infotech

Description: Data integration project for Activity Based Costing (ABC) of Citi's trade activities on behalf of its institutional clients.

■ Formulate ETL standards and guide the ETL development of the offshore team based on the open-source tool Talend.
■ Develop Talend jobs to stage, aggregate and load data into fact tables, conforming to the dimensions.

Responsibilities: (Sr. Technical Consultant)
● Extract data from Oracle data sources and flat files.
● Develop Talend ETL jobs for data staging, daily aggregation, monthly roll-up and population of dimensions and facts (a sketch of the roll-up pattern follows this section).
● Formulate ETL standards and best practices for the offshore team.
● Guide the team on general ETL practices.

Technology: Linux, Talend, Oracle 10g, Tortoise SVN
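
As illustration of the staging / daily-aggregation / monthly roll-up flow described above: a minimal SQL sketch driven from the shell. All table and column names are hypothetical, and the production implementation was Talend jobs, not this script; each sqlplus session commits on clean exit.

  #!/bin/sh
  # abc_rollup.sh - daily aggregate, then monthly roll-up of the daily fact.
  run_sql () { printf '%s\n' "$1" | sqlplus -s "$ABC_CONNECT"; }

  # Daily: aggregate staged trade activity into the daily fact
  run_sql "INSERT INTO fact_abc_daily (activity_dt, client_id, activity_cnt, cost_amt)
           SELECT TRUNC(activity_ts), client_id, COUNT(*), SUM(cost_amt)
           FROM stg_trade_activity
           GROUP BY TRUNC(activity_ts), client_id;"

  # Monthly: roll the daily fact up into the monthly fact
  run_sql "INSERT INTO fact_abc_monthly (activity_mth, client_id, activity_cnt, cost_amt)
           SELECT TRUNC(activity_dt, 'MM'), client_id, SUM(activity_cnt), SUM(cost_amt)
           FROM fact_abc_daily
           GROUP BY TRUNC(activity_dt, 'MM'), client_id;"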

Bank of America Card Information (BACARDI)                  May 2010 – Jan 2011
Vendor: Tata Consultancy Services

Description: Led the development of, and was actively involved in developing and performance-tuning, 3 key projects:
■ Creation of a new ETL process in DataStage to load multi-segment customer data from an Oracle DB.
■ Migration from SAS to DataStage of a US Consumer Originations ETL process, with the addition of 45 new fields to the process/tables.
■ Addition of 8 new indicators and ETL logic to the critical and complex Small & Medium Business (SMB) Privacy process (Do Not Solicit/Do Not Call), drawing on a very large and sensitive "Consumer Choice Database" built on the Teradata platform.

Responsibilities: (ETL Technical Lead for a Globally Networked Delivery Model)
● Extract data from the Oracle warehouse using the Oracle Connector stage and perl scripts invoked by DataStage.
● Aggregation and transformation stages were involved before loading into the final aggregated tables.
● Work involved multiple job sequences plus exec command, Transformer, Lookup, Filter and Join stages.
● Analyzed existing SAS programs to develop detailed job flow diagrams and the Design Document, and to guide DataStage job development.
● Designed ETL jobs using IBM WebSphere DataStage Server Edition 8.1 to extract, transform and load data into staging and then into the Oracle database.
● Designed and developed Extract, Transform and Load processes for extracting data from various systems (Oracle, Mainframe, Teradata, flat files, etc.) and loading it into DB2 UDB tables, Oracle tables, flat files, etc.
● Worked with the Oracle Enterprise stage, DB2 Connector, ODBC Connector, Transformer stage, Sequential File stage, Aggregator stage, Change Capture & Change Apply stages (for CDC implementation) and Filter stage.
● Used the Lookup stage against the code table to verify the codes in the incoming file.
● Extensively defined and used stage variables and constraints in the Transformer stage for many ETL jobs.
● Designed sequencers to synchronize the control flow of multiple activities in a job sequence.
● Involved in performance tuning of all the ETL jobs in the production environment.
● Used DataStage Director and its runtime engine to schedule runs of the solution, test and debug its components, and monitor the resulting executables (on an ad hoc or scheduled basis).
● Improved the performance of ETL jobs handling files of around 5 GB, guided by Performance Statistics.
● Developed the source-to-target mapping document for all the ETL jobs, including the test cases.
● Worked closely with the ETL Architect in finalizing the requirements and preparing the ETL technical design document.
● Worked with TOAD to interact with Oracle and DB2.
● Tuned DataStage jobs through appropriate use of the environment variables APT_NO_SORT_INSERTION and APT_SORT_INSERTION_CHECK_ONLY, plus analysis of the OSH score and environment variables from RunDirector (a sketch follows this list).
● Explored Teradata target load stages using the MultiLoad/TPump/FastLoad options.
● Tuned a Teradata Sparse Lookup into an ordinary Lookup operator, reducing database hits per row and making the process run faster.
● Worked out a contingency solution after the end of UAT for a Teradata parallel bulk load that experienced issues due to non-printable special characters in the source data: the load, which used the MultiLoad utility, was switched to TPump to keep the target Teradata table from becoming inaccessible.
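
As illustration only: comparing runs of a job with the two sort-related variables via the dsjob CLI, assuming (as is common practice) that the APT_* variables are exposed as job parameters; the project and job names are hypothetical.

  #!/bin/sh
  # compare_sort_tuning.sh - run the same job with sort insertion disabled,
  # then in check-only mode, and review timings from the job log.
  PROJ=BACARDI
  JOB=AggLoadJob
  # run 1: suppress automatically inserted tsort operators
  dsjob -run -wait -param '$APT_NO_SORT_INSERTION=True' $PROJ $JOB
  # run 2: keep the sorts but only verify sortedness at run time
  dsjob -run -wait -param '$APT_SORT_INSERTION_CHECK_ONLY=True' $PROJ $JOB
  # inspect the two runs' log entries for elapsed times
  dsjob -logsum $PROJ $JOB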

Technology: AIX, DataStage, SQL, Base SAS, DB2 UDB, Teradata, Oracle 10g, TOAD 8.6, shell scripting

Bank of America Card Information (BACARDI)                  Jan 2010 – Apr 2010
Vendor: Tata Consultancy Services

Description: Key developer for 2 projects:

■ Data integration project for US and Canada Consumer Cards using DataStage, a project of strategic importance and high visibility at the bank.
■ Performance tuning of 8 ETL processes for the data-intensive US Consumer IVR portfolio using Informatica.

Responsibilities: (Senior ETL Developer)
● Developed and implemented a new end-to-end DataStage ETL process to integrate data that was spread across legacy Mainframe systems and the Unix warehouse.
● Extensively used DataStage Designer to design and develop ETL jobs for extracting, transforming and loading the data.
● Iteratively tested various performance tuning options to determine that one particular job ran faster with APT_DISABLE_COMBINATION set to True, and thus achieved optimum performance.
● Work involved multiple job sequences plus CFF, Transformer, Lookup, Filter, Join and CDC stages.
● Developed various jobs to read Complex Flat Files from the source system, plus flat files and databases, then transform the data and load it to the target databases.
● Used stages such as the MQ Connector stage to load audit tables.
● Worked with the QC and Prod Support teams on bug fixes and followed up on QC tickets.
● Extracted data from various source systems, including Oracle, SQL Server and flat files.
● Created sequencers to execute the designed jobs sequentially.
● Wrote various Unix shell scripts for scheduling and formatting the files.
● Implemented the data integration project in a phased manner to integrate the data with zero user impact.
● Automated the removal of hardcoded values from Tivoli Maestro, which was a value-add.
● Design, testing/documentation and test plan review.
● The performance tuning initiative in Informatica involved a complex workaround to run the ETL of about 30 million records faster.
● Implemented a fool-proof, multi-partitioned tuning of the ETL process, overcoming challenges such as the non-availability of a DB2 coordinator node to Informatica (a sketch of the partitioning idea follows this list):
○ Modified the Source Qualifier into an 8-way partitioned reader based on the mod value of an evenly distributed integer field.
○ Split the mapping into two parts to work around the non-availability of a DB2 coordinator node on partitioned loads.
○ Enabled bulk load for loading the DB2 table.
○ Made the Lookup Transformation in the mapping a Persistent Lookup and redirected the Lookup Cache to a Unix mount point with large free space.
○ The tuned process saved 5.5 hours of running time, reducing the average run time to about 20 minutes.
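
As illustration only: a sketch generating the WHERE predicate for each of the 8 readers in the mod-based split. In the real mapping these predicates lived in the partitioned Source Qualifier's SQL override, and the column name here is hypothetical.

  #!/bin/sh
  # gen_partition_filters.sh - print one source-filter predicate per reader.
  PARTS=8
  i=0
  while [ "$i" -lt "$PARTS" ]; do
      # an evenly distributed integer key keeps the 8 readers balanced
      echo "partition $i: WHERE MOD(acct_nbr, $PARTS) = $i"
      i=`expr $i + 1`
  done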

● The tuning changed the run time of the 8 IVR ETL processes from an average of 3.45 hours to an average of 22 minutes (6 hours being the run time of the longest process before this tuning).

Technology: AIX, DataStage, Informatica, SQL, DB2 UDB

Bank of America Card Information (BACARDI)                  Aug 2009 – Dec 2009
Vendor: Tata Consultancy Services

Description:

■ Create a new ETL process for data profiling of online customer activity using DataStage.
■ Migration of an Ab Initio process to Informatica.
■ Tuning of a SAS extract process for SMB data.

Responsibilities: (Senior ETL Developer)
● Extract, transform and load data on customers' online activity from the upstream DB2 warehouse, making it available to BI users to formulate online marketing strategies.
● Designed jobs in DataStage Designer to extract data from the OLTP system to the staging area and the target.
● Extracted data from flat files, transformed it (implementing the required business logic) and loaded it into the target data warehouse.
● Migrated the Canada Consumer Solicitation ETL process from Ab Initio to Informatica.
● Performed end-to-end testing to ensure the existing process's behavior was retained in the new DataStage process.
● Design, testing/documentation and test plan review.
● Tuned a slow-running SAS extraction process for SMB:
○ The data fetch in PROC SQL was split into multiple partitions.
○ Removed redundant sorts and combined redundant PROC SQL steps.
○ Eliminated a recursive update.

Technology: AIX, DataStage, Ab Initio, Informatica, Base SAS, SQL, DB2 UDB

Bank of America Card Information (BACARDI)                  Apr 2009 – Jul 2009
Vendor: Tata Consultancy Services

Description:

● Migration of Unix/SQL-scripting-based ETL into DataStage.
● Tuning of a Unix/SQL infrastructure script for purging warehouse tables.
● Creation of a new extract process for a downstream system using DataStage.

Responsibilities: (Senior ETL Developer)
● Understand the business requirements by analyzing convoluted Unix and SQL scripts.
● Develop equivalent ETL processes in DataStage while seeking performance improvements at the same time.
● Extensively used the Sequential File, Data Set, Join, Aggregator, Funnel, Filter and DB2 Connector stages, among others, to develop multiple DataStage jobs and job sequences.
● A Unix/SQL infrastructure script that had purged the warehouse tables for years started running slow on a set of huge consumer tables (transaction sizes of 500M to 2.5G rows) and hence required tuning. The purge process used to slog the production server for hours and then abend at times.
● The script looped to delete records iteratively on different DB2 nodes, but was found to leave certain database nodes underutilized, increasing the transaction volume between commit points and consuming database logs heavily.
● To resolve this, a solution was designed to loop through the records and delete them in batches based on record count instead of node number. This pumped small, manageable counts of records to all nodes equally (a sketch of the pattern follows this list).
● This effort was widely applauded by the warehouse managers and production DBAs. The resulting process ran in 30% of the original run time and never failed on huge purges.
● The third project was to create an extract for SMB Overdraft Protection and send it down to TSYS (Total Systems), the SOR (System of Record) for all SMB account transactions and authorizations.
● The DataStage job involved stages such as DB2 Connector, ODBC Connector, Join, Sort, CDC and Filter.
● All projects involved end-to-end implementation, from gathering requirements from user groups through implementation and a 30-day warranty. Single-handedly documented the Design Document, Traceability Matrix, Unit Test and SIT scripts, Implementation Plan, Production Runbooks, Production Schedule Checklists and Warranty Document.
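
As illustration of the count-based purge loop: a hedged shell sketch assuming an already-connected DB2 CLP session, hypothetical table and column names, and a DB2 LUW version that supports the deletable-fullselect idiom; CLP autocommit makes each batch its own small transaction.

  #!/bin/sh
  # purge_by_count.sh - delete in fixed-size batches so the commit scope and
  # db log consumption stay small and the work spreads evenly across nodes.
  BATCH=50000
  rc=0
  while [ "$rc" -eq 0 ]; do
      # rc 0 = a batch was deleted; rc 1 = no rows left (done); >1 = error
      db2 -x "DELETE FROM (SELECT 1 FROM cons_txn_hist
                           WHERE txn_dt < CURRENT DATE - 7 YEARS
                           FETCH FIRST $BATCH ROWS ONLY) AS b"
      rc=$?
  done
  [ "$rc" -le 1 ] || { echo "purge failed, CLP rc=$rc" >&2; exit 1; }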

Technology: AIX, DataStage, Informatica, SQL, DB2 UDB

Bacardi DWH                                                 Dec 2008 – Mar 2009
Vendor: Tata Consultancy Services

Description: The project migrated a set of SAS programs used by the analytics team to the DataStage ETL environment. The SAS programs either extracted data down to other warehouses or loaded incoming data into the SAS layer (data sets) of the Bacardi warehouse. The processes belonged to different Lines of Business (LOBs) and were migrated to the DataStage/DB2 environment.

Responsibilities: (ETL Developer for the Bacardi Data Warehouse)
● End-to-end ETL implementation using DataStage and Unix.
● Developed many DataStage application jobs for data processing and loading; the jobs are scheduled using Tivoli Maestro.
● Led a team of 2 developers who worked in parallel from India.
● Analysis of SAS code and Unix wrappers.
● Requirements collection from the LOBs, design, coding and implementation.
● Unit testing and review of test plans.
● Production job monitoring using Tivoli logs and DataStage RunDirector.

Technology: AIX, DataStage, shell scripting, Base SAS, DB2 UDB

Bacardi DWH                                                 Mar 2008 – Nov 2008
Bank of America, Dallas, TX
Vendor: Tata Consultancy Services

Description: The Basel project required extraction of data from the source and Bacardi data warehouses. The data extraction requirement for Basel was complex and the timelines were stringent, this being a compliance project. The project extracted data from both the Consumer Warehouse (the Mainframe warehouse) and the Commercial Warehouse (the Unix warehouse). The SMB data extraction for Basel was especially complex because of the multitude of logic applied to each SMB customer group (Corporate, Super Corporate and Individual).

Responsibilities: (ETL Developer for the Bacardi Data Warehouse)
● Developer (end-to-end implementation) for the Basel II project using Unix/SQL for SMB.
● Worked extensively on complex SQL, understanding complex joins and the data extraction process.
● The extract had over 100 fields, each with its own derivation logic.
● Development of the LLD, Traceability Matrix, code and test scripts.
● Test plan reviews, UTRs and data file preparation for SIT.
● Production implementation and job monitoring.

Technology: AIX, DB2 UDB, shell scripting, SQL

Know The Customer (KTC) DWH                                 Jan 2006 – Feb 2008
Vendor: Tata Consultancy Services

Description: As part of the KTC (Know the Customer) initiative, Citi maintains the KTC data warehouse on Oracle, which acts as the SOR for personal and transactional information about its customers. Hundreds of feeds come into the system from various customer-facing applications and credit decisioning systems, and feeds go out to downstream decision-making and reporting systems. Tickets raised by users of the KTC system need to be resolved promptly per SLA.

Responsibilities: (ETL Developer and Database Support for the KTC Data Warehouse)
● Develop load programs in Unix/SQL.
● Involved in DB support activities, working with DBAs and developers in the Dev database regions.
● Maintenance of Unix scripts that supported migration and clean-up during production data migration as well as Dev region cycling.
● Maintenance of ClearQuest reports on ClearQuest's Unix host.

Technology: AIX, SQL, Oracle, ClearQuest, ClearCase

OneView Collection System                                   Sep 2002 – Dec 2005
Vendor: Ability Computers

Description: Popular Finance is engaged in multiple financial services, such as auto finance, chit funds and private banking. The firm decided to integrate its collection systems across different portfolios into one warehouse for a single view of the customer. The new system extensively used Unix scripts to load data from various customer-facing systems and to send out extracts to credit risk systems and other warehouses. This also required changes to the front-end UI.

Responsibilities: (Development and L3 Support for the OCS Data Warehouse)
● Develop UI elements for the application using VB 6.
● Debug shell scripts that invoke the data loads.
● Modified existing components for fixes and changes in business logic.
● Owned production and supported it for any code changes required (no on-call support).
● Introduced new documentation standards wherever applicable.
● Assembled and repaired PC hardware and installed systems.

Technology: AIX, DB2 UDB, shell scripting, SQL, Windows NT, VB 6