From: route@monster.com

Sent: Monday, September 28, 2015 12:58 PM

To: hg@apeironinc.com

Subject: Please review this candidate for: Talend

 

This resume has been forwarded to you at the request of Monster User xapeix03

Shagufta Begum 

Last updated:  12/27/12

Job Title:  not specified

Company:  not specified

Rating:  Not Rated

Screening score:  not specified

Status:  Resume Received


Stoneridge, VA  20105
US


 

 

RESUME

  

Resume Headline: Shagufta_ETL

Resume Value: 2qhs9ckjbsd7wb93   

  

 

Shagufta Begum
Phone: 571-224-4550 | Email: shagufta.begum1@gmail.com
STRENGTHS
· IT Professional with over 6 years of experience in all phases of the SDLC, with expert-level experience in Data Warehousing, ETL, and Business Intelligence projects using Informatica, Talend, OWB, DataStage, and Pentaho ETL with Oracle, SQL Server, MySQL, and Netezza RDBMS.
· Specialty expertise in the Education, Healthcare, and Advertising domains.
· Data Warehousing experience using Star and Snowflake schemas with facts, measures, and Slowly Changing Dimensions (SCD) Type 1, Type 2, and Type 3.
· Extensive experience developing ETL mappings for XML, .csv, and .txt sources, and loading data from these sources into relational tables with ETL tool loaders such as the Talend, Informatica, and DataStage loaders and SQL*Loader.
· Extensive experience creating fact, lookup, dimension, and staging tables and other database objects such as views, stored procedures, functions, indexes, and constraints.
· Well versed in the Ralph Kimball (bottom-up) and Bill Inmon (top-down) methodologies, and in Star and Snowflake schemas.
· Well versed in facts and dimensions, with extensive experience developing mappings to load data into dimension and fact tables.
· Extensive experience analyzing and tuning ETL flows for better performance, and scheduling ETL mappings on a daily, monthly, quarterly, and yearly basis.
· Guides, trains, instructs, and assists team members on complex technical issues.
· Extensive experience in requirement gathering, data quality, and developing physical and logical design documents.
· Well versed in Agile methodology and the Scrum and Sprint process.
· Excellent communication, interpersonal, and team lead skills.
· Highly motivated, results-oriented team player with excellent analytical and problem-solving skills, commitment, and a zeal to learn new technologies and undertake challenging tasks.
EDUCATIONAL QUALIFICATIONS
· Pursuing an MBA-IT at Goldey-Beacom College, USA, since 2011.
· Master of Information Systems (MIS) from the University of Northern Virginia, USA, in 2011.
· Two-year Advanced Diploma in Software Development, with specialization in data warehousing tools, Oracle, and SQL Server, from the National Institute of Information Technology (NIIT) in 2002.
· Bachelor of Science in Business Administration from Rohilkhand University, India, in 2000.
TECHNICAL SKILL SET
Skills Profile
ETL Tools: Informatica PowerCenter (8.x/7.x/6.x), Talend Integration Suite 3/4/5, DataStage (7.5/8.0/8.5), OWB (11g/10g/9i)
Databases: Oracle 8i/9i/10g, MS SQL Server 2005/2008, MySQL, Netezza, Hadoop
Other Skills: OBIEE 10g, Salesforce
Dimensional Modelling Tools: Erwin, Visio
Languages: SQL, PL/SQL, T-SQL, XML, Java
Methodologies: Ralph Kimball Methodology, Bill Inmon Methodology
Operating Systems: Windows 2000/XP, UNIX, MS-DOS
EMPLOYMENT HISTORY
· Working for Infomatics Corp as Sr. ETL Developer from Apr 2012 to present.
· Worked for Meta Dimensions as Sr. ETL Developer from July 2011 to Apr 2012.
· Worked for Koofers, Inc. as Sr. ETL Developer from Aug 2010 to June 2011.
· Worked for Infomatics Corp as ETL Developer from Mar 2008 to July 2010.
· Worked for Emirates Neon as IT Analyst from Oct 2003 to Sept 2005.
· Worked for Jamia University as Computer Analyst from Jan 2001 to Sept 2003.
AWARDS/ACHIEVEMENTS
· Best Performer award at Koofers.
· Created the first Fun@Work team at Koofers, aimed at fostering the best possible interaction among employees across the whole IT department.
· Streamlined the data integration process at Koofers by implementing a cost-effective Talend ETL solution, resulting in higher performance and cost savings.
· Assisted the MIS director at Jamia University in preparing Informatica standards and guidelines, which are now followed by the university's MIS division.
Projects History

Project Number: 1
Project Name: Noah Data Integration
Client Name: Apollo Global
Date: Feb 2012 – present
Position: Sr. ETL Developer
Technology Used: Informatica 8.6, Talend 4.2, Oracle, SQL Server, Hadoop, MySQL, OBIEE, Salesforce
Client Info: Apollo Global is a $1 billion joint venture between Apollo Group and private equity firm The Carlyle Group, formed in September 2007. Apollo Global intends to make a range of investments in the international education services sector, targeting investments and partnerships primarily in countries outside the U.S. with attractive demographic and economic growth characteristics.
Project Info:
The purpose of this project is to build an Enterprise Data Warehouse to integrate a suite of best-in-class core applications shared across multiple universities. The project requires an extensive data integration process to load data into the Enterprise Data Warehouse from 1500+ multi-format data sources, and the development of complex transformation routines for data profiling, data cleaning, and data transformation to streamline the data integration process. Eleven enterprise system applications and multiple work streams will be integrated into a new technology infrastructure to support the core business functions of the organization. Informatica ETL modules such as Designer, Workflow Manager, Metadata Manager, and Workflow Statistics are used extensively to develop new complex ETL jobs for the different universities and to support existing jobs and fix issues. Talend Integration Suite 4.2 is used extensively to migrate jobs from Informatica, to integrate data from multiple sources into one standardized format, and to develop ETL mappings for fact, dimension, staging, ODS, data warehouse, and data mart loads. Hadoop has been used to archive heavy data files from the different universities and to perform extensive MapReduce operations.
Role & Responsibilities
· Document, design, and construct the Extract, Transform, and Load jobs that move data into the Operational Data Store using the Informatica and Talend platforms.
· Use the Informatica and Talend admin consoles to grant privileges and create users, folders, repositories, etc.
· Use the MD5 and CRC32 hashing algorithms to mask the data (see the sketch after this list).
· Develop complex ETL jobs for the source-to-prep, prep-to-stage, stage-to-datamart, and datamart-to-data-warehouse layers.
· Develop complex data quality rules to clean the data, such as a first/last name swap rule, a phone number cleaning rule, cleaning special characters like "@,*/&$,;" from addresses, a structure standardization rule, and deduplication rules to filter out duplicate and unknown records.
· Implement an auditing and batch process to capture data load information such as rows loaded, rows rejected, file location, success, failure, load start date, and load end date.
· Implement an error handling strategy to capture statistics, rejects, and data flow with components like tStatCatcher, tLogCatcher, and tFlowMeter.
· Work on Informatica PowerCenter 8.6, using most of the Informatica functions extensively to build business rules to load data. Transformations used include Source Qualifier, Aggregator, Lookup, Filter, Sorter, SQL, Joiner, Expression, Router, Update Strategy, and Sequence Generator, along with a reject strategy.
· Use parameters to handle database connection values, file paths, and data load history dynamically.
· Develop Informatica sessions, tasks, and workflows to execute the ETL jobs and load data from heterogeneous sources such as the Salesforce system, .csv, .xml, and Excel files, Oracle, MySQL, and SQL Server.
· Develop Informatica jobs to FTP files and to send success and failure notifications.
· Use most of the Talend components, such as tMap, tHashInput, tDie, tFilterRow, tJoin, tSetGlobalVar, contexts, tWarn, tAggregateRow, tFileList, and tRunJob.
· Develop Talend jobs to FTP files and send notifications with components like tFTPGet, tFTPPut, and tSendMail.
· Perform requirement gathering, data profiling, and unit testing of ETL jobs.
· Develop ETL mappings with Talend to load data from multiple sources in .csv, .xml, and .txt formats.
· Develop complex ETL jobs such as SCD Type 1 and Type 2 loads, used extensively to capture changes from the source data into the data marts and data warehouse.
· Apply best practices and naming conventions to standardize the data integration process, including source-to-target mapping documents, source file definition documents, job turnover documents, and job review documents.
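
A minimal sketch of the hashing-based masking described above, written in plain Java (the language Talend routines are built on); the class, method, and sample values are illustrative assumptions, not artifacts of the project:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.CRC32;

// Hypothetical helper for masking sensitive column values with MD5 or CRC32,
// of the kind a Talend routine or tJavaRow component might call.
public class MaskingRoutines {

    // Returns the MD5 digest of the input as a hex string (one-way mask).
    public static String md5Mask(String value) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Returns the CRC32 checksum of the input; cheaper but collision-prone,
    // so better suited to change detection than to irreversible masking.
    public static long crc32Mask(String value) {
        CRC32 crc = new CRC32();
        crc.update(value.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        System.out.println(md5Mask("555-000-1234"));   // masked phone number
        System.out.println(crc32Mask("555-000-1234"));
    }
}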
Project Number: 2
Project Name: University Data Integration
Client Name: State of Georgia
Date: July 2011 – Jan 2012
Position: Sr. ETL Developer
Technology Used: Informatica 8.6, Talend 4.2, Oracle, SQL Server, XML, Netezza, OBIEE, Salesforce
Worked on a State of Georgia data integration project to streamline and standardize the Department of Education's data. The project was in its development stage, building a new ETL process for the State of Georgia's data integration. Informatica ETL modules such as Designer, Repository Manager, Workflow Manager, Metadata Manager, and Workflow Statistics were used extensively to develop new complex ETL jobs for the different universities and to support existing jobs and fix issues. Informatica 8.6 was used extensively to develop and maintain the ETL and data cleaning jobs, to integrate data from multiple sources into one standardized format, and to develop new ETL mappings for fact, dimension, staging, ODS, data warehouse, and data mart loads. Talend Integration Suite 4.2 was introduced as the new open source ETL solution and used to migrate the existing ETL jobs from Informatica.
Role & Responsibilities
· Developed complex ETL jobs for the source-to-prep, prep-to-stage, stage-to-datamart, and datamart-to-data-warehouse layers.
· Used the Informatica and Talend admin consoles to grant privileges and create users, folders, repositories, etc.
· Used the MD5 and CRC32 hashing algorithms to mask the data.
· Implemented an auditing and batch process to capture data load information such as rows loaded, rows rejected, file location, success, failure, load start date, and load end date.
· Implemented an error handling strategy to capture statistics, rejects, and data flow with components like tStatCatcher, tLogCatcher, and tFlowMeter.
· Worked on Informatica PowerCenter 8.6, using most of the Informatica functions extensively to build business rules to load data. Transformations used include Source Qualifier, Aggregator, Lookup, Filter, Sorter, SQL, Joiner, Expression, Router, Update Strategy, and Sequence Generator, along with a reject strategy.
· Used parameters to handle database connection values, file paths, and data load history dynamically.
· Developed Informatica sessions, tasks, and workflows to execute the ETL jobs and load data from heterogeneous sources such as the Salesforce system, .csv, .xml, and Excel files, Oracle, MySQL, and SQL Server.
· Developed Informatica jobs to FTP files and to send success and failure notifications.
· Developed complex data quality rules to clean the data, such as a first/last name swap rule, a phone number cleaning rule, cleaning special characters like "@,*/&$,;" from addresses, a structure standardization rule, and deduplication rules to filter out duplicate and unknown records (see the sketch after this list).
· Used most of the Talend components, such as tMap, tHashInput, tDie, tFilterRow, tJoin, tSetGlobalVar, contexts, tWarn, tAggregateRow, tFileList, and tRunJob.
· Developed Talend jobs to FTP files and send notifications with components like tFTPGet, tFTPPut, and tSendMail.
· Performed requirement gathering, data profiling, and unit testing of ETL jobs.
· Developed ETL mappings with Talend to load data from multiple sources in .csv, .xml, and .txt formats.
· Developed complex ETL jobs such as SCD Type 1 and Type 2 loads, used extensively to capture changes from the source data into the data marts and data warehouse.
· Applied best practices and naming conventions to standardize the data integration process, including source-to-target mapping documents, source file definition documents, job turnover documents, and job review documents.
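
A rough sketch of two of the cleaning rules named above (special-character stripping and a first/last name swap check), as plain Java; the heuristics, names, and reference list are assumptions for illustration, not the project's actual rules:

import java.util.Set;

// Hypothetical data-quality helpers of the kind described in the bullets.
public class DataQualityRules {

    // Remove special characters such as @ * / & $ ; from an address field.
    public static String cleanAddress(String address) {
        return address.replaceAll("[@*/&$;]", "").trim();
    }

    // Swap first/last name when the last-name field matches a known given
    // name and the first-name field does not (a simple illustrative heuristic).
    public static String[] fixNameSwap(String first, String last,
                                       Set<String> knownFirstNames) {
        if (!knownFirstNames.contains(first.toLowerCase())
                && knownFirstNames.contains(last.toLowerCase())) {
            return new String[] { last, first };
        }
        return new String[] { first, last };
    }

    public static void main(String[] args) {
        System.out.println(cleanAddress("12 Main St;@ Apt */4$")); // "12 Main St Apt 4"
        String[] fixed = fixNameSwap("Begum", "Shagufta", Set.of("shagufta"));
        System.out.println(fixed[0] + " " + fixed[1]);             // "Shagufta Begum"
    }
}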
Company Name: Koofers.com
Project Title: Koofers Data Integration
Date: Aug 2010 – June 2011
Position: Sr. ETL Developer
Technology Used: Informatica 8.1.1, Talend 3.2/4.1, MySQL, SQL Server, Oracle, XML, .csv, .txt, Smerfer, Unix, Java
The main objective of this project was to develop ETL jobs with Informatica PowerCenter to integrate data from multiple sources into one standardized format and to streamline the organization's data integration process.
Phases of the Project: This project involved five phases of ETL development:
· Raw Files, Database Tables (SQL Server, Oracle) → Data Profiling → ETL Prep Layer (Informatica, MySQL)
· ETL Prep Layer → Data Quality → Standardization ETL Layer (Talend, MySQL)
· Standardization ETL Layer → Taxonomy Data Marts (Informatica, Oracle) → BI Dashboard
· Standardization ETL Layer → Data Warehouse ETL → BI Dashboard
· Informatica ETL → Migration → Talend ETL
Role & Responsibilities
· Developed ETL mappings with Informatica and Talend to load data from multiple sources in .csv, .xml, and .txt formats.
· Developed complex ETL jobs for the source-to-prep, prep-to-stage, stage-to-datamart, and datamart-to-data-warehouse layers.
· Used the Informatica and Talend admin consoles to grant privileges and create users, folders, repositories, etc.
· Used the MD5 and CRC32 hashing algorithms to mask the data.
· Developed complex data quality rules to clean the data, such as a first/last name swap rule, a phone number cleaning rule, cleaning special characters like "@,*/&$,;" from addresses, a structure standardization rule, and deduplication rules to filter out duplicate and unknown records.
· Established standardized conventions at Koofers for the data integration process, including source-to-target mapping documents, source file definition documents, job turnover documents, and job review documents.
· Worked on Informatica PowerCenter 8.1.1, using most of the Informatica functions extensively to build business rules to load data. Transformations used include Source Qualifier, Aggregator, Lookup, Filter, Sorter, SQL, Joiner, Expression, Router, Update Strategy, and Sequence Generator, along with a reject strategy.
· Used parameters to handle database connection values, file paths, and data load history dynamically.
· Developed Informatica sessions, tasks, and workflows to execute the ETL jobs and load data from heterogeneous sources such as the Salesforce system, .csv, .xml, and Excel files, Oracle, MySQL, and SQL Server.
· Developed Informatica jobs to FTP files and to send success and failure notifications.
· Performed requirement gathering, data profiling, and unit testing of ETL jobs.
· Implemented an error handling strategy to capture statistics, rejects, and data flow with components like tStatCatcher, tLogCatcher, and tFlowMeter (see the sketch after this list).
· Used most of the Talend components, such as tMap, tHashInput, tDie, tFilterRow, tJoin, tSetGlobalVar, contexts, tWarn, tAggregateRow, tFileList, and tRunJob.
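
To illustrate what the statistics and error-handling capture above amounts to, here is a hedged sketch in plain Java of an audit record tracking rows loaded, rows rejected, status, and start/end times, roughly the information tStatCatcher-style components log; the class, fields, and reject condition are all invented for illustration:

import java.time.Instant;

// Hypothetical per-load audit record of the kind described in the bullets.
public class LoadAudit {
    String fileLocation;
    long rowsLoaded;
    long rowsRejected;
    String status;       // e.g. "SUCCESS" or "PARTIAL"
    Instant loadStart;
    Instant loadEnd;

    // Simulated load loop: count good and rejected rows, then close the audit.
    public static LoadAudit runLoad(String file, String[] rows) {
        LoadAudit audit = new LoadAudit();
        audit.fileLocation = file;
        audit.loadStart = Instant.now();
        for (String row : rows) {
            if (row == null || row.isBlank()) {
                audit.rowsRejected++; // in practice, route bad rows to a reject file
            } else {
                audit.rowsLoaded++;   // in practice, insert into the target table
            }
        }
        audit.loadEnd = Instant.now();
        audit.status = audit.rowsRejected == 0 ? "SUCCESS" : "PARTIAL";
        return audit;
    }

    public static void main(String[] args) {
        LoadAudit a = runLoad("/data/in/students.csv", new String[] {"r1", "", "r2"});
        System.out.printf("%s: %d loaded, %d rejected%n",
                a.fileLocation, a.rowsLoaded, a.rowsRejected);
    }
}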
Project Number: 4
Project Name: Healthcare Analytics
Date: Mar 2008 – July 2010
Position: ETL Developer
Technology Used: Informatica 8.1.1, DataStage 8.0/8.5, Talend 3.2, OWB 10g, Oracle, SQL Server, XML, Netezza, MicroStrategy
Involved in ETL development projects with a healthcare client to integrate data from multiple sources into one standardized format and to develop ETL mappings for fact, dimension, staging, and ODS loads. Informatica was used initially to develop the ETL jobs; DataStage 8.0/8.5 was then introduced for new ETL development from source to staging, staging to ODS, ODS to data warehouse, and data warehouse to data marts. The project included fact tables such as Total Number of Claims, Patient Treatment, Patient Enrollment, Patient Hospitalization, Providers, and In-Network Doctors. Developed complex loads for SCD Type 1 and Type 2, and applied a complex data cleaning process to standardize the data into the client's required format.
Role & Responsibilities
· Implemented a DataStage error handling strategy to capture statistics, rejects, and data flow.
· Used the Informatica, Talend, and DataStage admin consoles to grant privileges and create users, folders, repositories, etc.
· Used most of the DataStage components, including Administrator, Designer, Director, and Manager.
· Used most of the DataStage stages, such as Filter, Sort, Transformer, Join, Aggregator, Peek, and Copy.
· Developed DataStage jobs to FTP files and send notifications with components like the Email activity and FTP stage.
· Performed requirement gathering, data profiling, and unit testing of ETL jobs.
· Developed ETL mappings with DataStage and Talend to load data from multiple sources in .csv, .xml, and .txt formats.
· Worked on Informatica PowerCenter 8.1.1, using most of the Informatica functions extensively to build business rules to load data. Transformations used include Source Qualifier, Aggregator, Lookup, Filter, Sorter, SQL, Joiner, Expression, Router, Update Strategy, and Sequence Generator, along with a reject strategy.
· Used parameters to handle database connection values, file paths, and data load history dynamically.
· Developed Informatica sessions, tasks, and workflows to execute the ETL jobs and load data from heterogeneous sources such as the Salesforce system, .csv, .xml, and Excel files, Oracle, MySQL, and SQL Server.
· Developed Informatica jobs to FTP files and to send success and failure notifications.
· Developed complex ETL jobs for the source-to-prep, prep-to-stage, stage-to-datamart, and datamart-to-data-warehouse layers.
· Developed complex data quality rules to clean the data, such as a first/last name swap rule, a phone number cleaning rule, cleaning special characters like "@,*/&$,;" from addresses, a structure standardization rule, and deduplication rules to filter out duplicate and unknown records.
· Developed complex ETL jobs such as SCD Type 1 and Type 2 loads, used extensively to capture changes from the source data into the data marts and data warehouse (a sketch follows this list).
· Applied best practices and naming conventions to standardize the data integration process, including source-to-target mapping documents, source file definition documents, job turnover documents, and job review documents.
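
A simplified sketch of the SCD Type 2 pattern mentioned above, assuming plain Java over an in-memory dimension; in the actual projects this logic lived in Informatica/DataStage mappings, and all names here are illustrative:

import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Minimal SCD Type 2 illustration: on a changed attribute, expire the
// current dimension row and insert a new current version, preserving history.
public class ScdType2Demo {

    static class PatientDim {
        String patientId;      // natural key
        String address;        // tracked attribute
        LocalDate effectiveFrom;
        LocalDate effectiveTo; // null while the row is current
    }

    static List<PatientDim> dim = new ArrayList<>();

    static void apply(String patientId, String address, LocalDate loadDate) {
        PatientDim current = dim.stream()
                .filter(r -> r.patientId.equals(patientId) && r.effectiveTo == null)
                .findFirst().orElse(null);
        if (current != null && current.address.equals(address)) {
            return; // no change, nothing to do
        }
        if (current != null) {
            current.effectiveTo = loadDate; // expire the old version
        }
        PatientDim fresh = new PatientDim(); // insert the new current version
        fresh.patientId = patientId;
        fresh.address = address;
        fresh.effectiveFrom = loadDate;
        dim.add(fresh);
    }

    public static void main(String[] args) {
        apply("P001", "Stoneridge, VA", LocalDate.of(2009, 1, 1));
        apply("P001", "Ashburn, VA", LocalDate.of(2010, 6, 1)); // change -> 2 rows
        System.out.println(dim.size() + " versions kept");      // prints "2 versions kept"
    }
}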
Project Number: 5
Project Title: Neon Advertising Data Warehouse
Date: Oct 2003 – Sept 2005
Position: Analyst
Technology Used: Informatica 6.5, DataStage 7.5, XML, Oracle, SQL Server, Business Objects 6.5
The objective of this project was to develop data marts for Emirates Neon's advertising division. Data was extracted from third-party online ad campaign databases and multi-format source files into one standardized format to build data marts for various departments of the advertising organization, which used the integrated information to implement new business strategies based on online campaign customer behavior, customer interest, customer geography, etc. This information helped senior executives and top management make decisions about targeting new customers. The data arrived in .csv, .xml, and .txt formats. Informatica ETL was used extensively to develop the mappings from source to staging, staging to ODS, ODS to data warehouse, and data warehouse to data marts. Complex data quality rules were applied to clean the data, such as a first/last name swap rule, a phone number cleaning rule, cleaning special characters like "@,*/&$,;" from addresses, a structure standardization rule, and deduplication rules to filter out duplicate and unknown records. Business Objects reporting was used to pull data from the data marts and present it in the front-end layer. Also involved in an ETL migration project from Informatica to DataStage.
Project Number: 6
Project Title: University Taxonomy Warehouse
Date: Jan 2001 – Sept 2003
Position: Computer Analyst
Technology Used: Informatica 6.0, XML, Oracle, SQL Server, Linux
The university MIS department was struggling to streamline the integration of multi-format source files and databases. The main objective of this project was to develop a data warehouse to clean the data, create a university courses taxonomy for each quarter and a university centers taxonomy, and then display the results in a dashboard solution. The project was critical to the university's growth: at one point it was difficult for the university to give students correct course schedule and faculty availability information, as the source data was corrupt and no rules were in place to satisfy these needs. The data arrived in multiple formats, such as .csv, .txt, .xml, and Oracle and SQL Server databases, and contained many junk and dirty records. Informatica ETL was used extensively to develop the mappings from source to staging, staging to ODS, ODS to data warehouse, and data warehouse to data marts. Complex data quality rules were applied to clean the data, such as a first/last name swap rule, a phone number cleaning rule, cleaning special characters like "@,*/&$,;" from addresses, a structure standardization rule, and deduplication rules to filter out duplicate and unknown records. Region and Courses dimensions were developed to incorporate the university taxonomy relationships.



Additional Info


 

Current Career Level:

Experienced (Non-Manager)

Work Status:

US - I am authorized to work in this country for any employer.

 

 

Target Company:

Company Size:

 

Target Locations:

Selected Locations:

US-VA