Key Concepts
- ETL Process: Extract, Transform, Load - the core process of moving data from source to target.
- Informatica PowerCenter: A platform for building enterprise-class data integration solutions.
- Transformations: Operations performed on data to clean, modify, or enrich it.
- Mappings: Visual representations of the data flow and transformations within Informatica.
- Workflows: Automated sequences of tasks that execute mappings and other processes.
- Sessions: Specific instances of a mapping execution within a workflow.
- Source Qualifier: A transformation that represents the rows read from a relational or flat-file source in a mapping.
- Target Table: The destination database table where transformed data is loaded.
- Repository Manager: Informatica tool for managing repository folders, metadata, and user permissions.
- Workflow Manager: Informatica tool for designing workflows and sessions, configuring connections, and starting workflow runs.
- Workflow Monitor: Informatica tool for tracking the status and performance of workflows.
- ODBC Connection: A standard way to connect to databases from various applications.
- Data Cleaning: The process of identifying and correcting errors or inconsistencies in data.
- Data Aggregation: The process of summarizing data by grouping it based on certain criteria.
- Data Filtering: The process of selecting specific data based on certain criteria.
- Data Sorting: The process of arranging data in a specific order.
- Data Concatenation: The process of combining data from multiple fields into a single field.
Informatica T13 Fresco Play Hands-on Walkthrough
1. Setting up the Environment
- Course Access: Access the Informatica mini project (Course ID 7937) via the IU platform, which redirects to Fresco Play.
- Virtual Environment: The course launches a virtual machine environment with pre-installed software and project files.
- Project Folder: Contains the problem statement, sample test file, and the Super Store data file (CSV).
- Software: Includes SQL Developer (for database operations) and Informatica PowerCenter tools (Repository Manager, Workflow Manager, Designer).
- Credentials: Database connection credentials (username, password, host, port, SID) are provided in the problem statement file.
2. Database Setup (SQL Developer)
- Connecting to Admin: Connect to the 'admin' user in SQL Developer using the provided credentials (username: system, password: admin).
- Creating Tables: Create six tables based on the problem statement (the raw data table plus one target table per task):
  - Super Store (raw data)
  - Super Store Clean Data (target table for task 1)
  - Sales Summary (target table for task 2)
  - Order Analysis (target table for task 3)
  - Geography Analysis (target table for task 4)
  - Order Processing (target table for task 5)
- Table Creation Script: The video demonstrates copy-pasting the table creation scripts (column names and data types) directly from the problem statement.
- Commit Changes: After creating the tables, execute a `COMMIT` statement to save the changes.
- Importing Data: Import the `Super Store data.CSV` file into the `Super Store` table using SQL Developer's import data wizard.
- Date Format: During the import, set the date format to `DD/MM/YYYY` to match the data in the CSV file.
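To illustrate why the import format matters, the sketch below parses the same text under two format strings with Python's standard library. The sample value is made up, and `%d/%m/%Y` is used here as Python's counterpart to the `DD/MM/YYYY` mask; it is not part of the assessment itself.

```python
from datetime import datetime

# Hypothetical sample value in the style of a day-first CSV date column.
raw = "03/11/2016"

# Day-first parsing, the Python counterpart of the DD/MM/YYYY mask.
day_first = datetime.strptime(raw, "%d/%m/%Y").date()

# Month-first parsing accepts the same text but yields a different date.
month_first = datetime.strptime(raw, "%m/%d/%Y").date()

print(day_first)    # 2016-11-03
print(month_first)  # 2016-03-11
```

A wrong mask therefore does not always fail loudly: values whose day is 12 or lower load as a different date, which is why the format is worth checking before the import runs.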
3. Informatica PowerCenter - Task 1: Data Cleaning
- Connecting to Repository: Open the PowerCenter Repository Manager and connect using the administrator credentials (username: administrator, password: administrator).
- Creating a Folder: Create a new folder named "Super Store" in the Repository Manager to organize the Informatica assets.
- Launching Designer: Open the Designer tool to create mappings and transformations.
- Creating a Mapping: Create a new mapping named "map_of_clean_data" in the Mapping Designer.
- Source Analyzer:
  - Import the `Super Store` and `Super Store Clean Data` tables from the database using an ODBC connection.
  - Create a new ODBC connection with the following details:
    - Data Source Name: Oracle
    - Host: localhost
    - Port: 1521
    - SID: XC
    - Username: system
    - Password: admin
- Target Designer: Import the `Super Store Clean Data` table as the target.
- Transformations:
  - Sorter Transformation:
    - Add a Sorter transformation to remove duplicate records.
    - Connect all columns from the `Super Store` source qualifier to the Sorter transformation.
    - Enable the "Distinct" option in the Sorter transformation properties.
  - Filter Transformation:
    - Add a Filter transformation to keep only records where the country is "United States".
    - Connect all columns from the Sorter transformation to the Filter transformation.
    - Set the filter condition to `COUNTRY = 'United States'`.
  - Expression Transformation:
    - Add an Expression transformation to concatenate customer ID and customer name.
    - Create two new ports:
      - `v_num` (Variable Port, String, Precision 50): Extracts the numerical part of the customer ID using the expression `REG_REPLACE(CUSTOMER_ID, '[^0-9]', '')`.
      - `cust_ID_name` (Output Port, String, Precision 50): Concatenates the numerical part of the customer ID, a hyphen, and the customer name using the expression `v_num || '-' || CUSTOMER_NAME`.
    - Connect all necessary columns from the Filter transformation to the Expression transformation.
  - Target Table:
    - Connect the `cust_ID_name` port from the Expression transformation to the `CUSTOMER_ID_NAME` column in the `Super Store Clean Data` target table.
    - Connect the remaining columns from the Expression transformation to the corresponding columns in the target table.
- Saving the Mapping: Save the mapping and ensure it is valid.
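The Task 1 logic can be sanity-checked outside Informatica. The following is a minimal Python sketch of the same three steps (distinct rows, country filter, ID and name concatenation), assuming rows are plain dictionaries keyed by column name. The function name `clean_rows` and the sample rows are hypothetical, and unlike a real Sorter the sketch keeps input order instead of sorting.

```python
import re

def clean_rows(rows):
    """Apply distinct -> country filter -> ID/name concatenation to dict rows."""
    seen = set()
    cleaned = []
    for row in rows:
        # Sorter with "Distinct": skip rows identical to one already seen.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        # Filter: COUNTRY = 'United States'
        if row["COUNTRY"] != "United States":
            continue
        # Expression: strip non-digits (like REG_REPLACE), then concatenate.
        v_num = re.sub(r"[^0-9]", "", row["CUSTOMER_ID"])
        cleaned.append({**row, "CUSTOMER_ID_NAME": f"{v_num}-{row['CUSTOMER_NAME']}"})
    return cleaned

# Made-up sample rows for illustration only.
sample = [
    {"CUSTOMER_ID": "AB-10015", "CUSTOMER_NAME": "Ann Blake", "COUNTRY": "United States"},
    {"CUSTOMER_ID": "AB-10015", "CUSTOMER_NAME": "Ann Blake", "COUNTRY": "United States"},
    {"CUSTOMER_ID": "CD-20030", "CUSTOMER_NAME": "Carl Dent", "COUNTRY": "Canada"},
]
print(clean_rows(sample)[0]["CUSTOMER_ID_NAME"])  # 10015-Ann Blake
```

Comparing a handful of rows produced this way against the loaded target table is one quick way to spot a wrong filter condition or regular expression.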
4. Informatica PowerCenter - Workflow Creation and Execution (Task 1)
- Workflow Manager: Open the Workflow Manager tool.
- Connecting to Server: Connect to the Informatica server using the provided credentials.
- Creating a Workflow: Create a new workflow named "workflow_clean_data".
- Task Developer:
  - Create a new session named "session_clean_data".
  - Link the session to the "map_of_clean_data" mapping.
- Configuring the Session:
  - Double-click the session to open the Edit Tasks window.
  - Go to the "Mapping" tab and open its "Connections" view.
  - Configure the source and target connections to use the Oracle ODBC connection.
  - Enable the "Truncate target table" option in the target properties.
- Workflow Designer:
  - Drag the "session_clean_data" task into the workflow designer.
  - Link the start task to the session task.
  - Validate the workflow.
  - Save the workflow.
- Executing the Workflow:
  - Right-click the workflow and select "Start Workflow".
  - Monitor the workflow execution in the Workflow Monitor.
  - Verify that the session completes successfully.
- Verifying the Data:
  - In SQL Developer, query the `Super Store Clean Data` table to verify that the data has been loaded and transformed correctly.
5. Informatica PowerCenter - Task 2: Sales Summary
- Mapping: Create a new mapping named "map_of_sales_summary".
- Source: Use `Super Store Clean Data` as the source.
- Transformations:
  - Aggregator Transformation:
    - Connect the `CUSTOMER_ID_NAME` and `SALES` columns from the source to the Aggregator transformation.
    - Create two new output ports:
      - `tot_sales` (Output Port, Number): Calculate the sum of sales using the expression `SUM(SALES)`.
      - `AVG_sales` (Output Port, Number): Calculate the average of sales using the expression `AVG(SALES)`.
    - Group by `CUSTOMER_ID_NAME`.
  - Sorter Transformation:
    - Connect all columns from the Aggregator transformation to the Sorter transformation.
    - Set `tot_sales` as the sort key and select "Descending" order.
  - Filter Transformation:
    - Connect all columns from the Sorter transformation to the Filter transformation.
    - Set the filter condition to `tot_sales > 3000 AND AVG_sales > 300`.
- Target: Use `Sales Summary` as the target table.
- Workflow: Create a new workflow named "workflow_sales_summary" and a session named "session_sales_summary". Configure the session with the appropriate connections and enable the "Truncate target table" option.
- Execution: Execute the workflow and verify the data in the `Sales Summary` table.
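As a cross-check on the Task 2 result, the aggregate, sort, and filter steps can be reproduced in a few lines of Python. This is only a sketch under assumptions: rows are dictionaries, `sales_summary` and its threshold parameters are hypothetical names, and the thresholds mirror the filter condition above.

```python
from collections import defaultdict

def sales_summary(rows, min_total=3000, min_avg=300):
    """Group sales by customer, sort by total descending, then filter."""
    sales_by_customer = defaultdict(list)
    for row in rows:
        sales_by_customer[row["CUSTOMER_ID_NAME"]].append(row["SALES"])

    # Aggregator: SUM(SALES) and AVG(SALES) per CUSTOMER_ID_NAME.
    summary = [
        {
            "CUSTOMER_ID_NAME": name,
            "tot_sales": sum(values),
            "AVG_sales": sum(values) / len(values),
        }
        for name, values in sales_by_customer.items()
    ]

    # Sorter: tot_sales in descending order.
    summary.sort(key=lambda r: r["tot_sales"], reverse=True)

    # Filter: tot_sales > 3000 AND AVG_sales > 300.
    return [
        r for r in summary
        if r["tot_sales"] > min_total and r["AVG_sales"] > min_avg
    ]
```

Because the filter only drops rows, the descending order set by the sort step is preserved in the output.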
6. Informatica PowerCenter - Task 3: Order Analysis
- Mapping: Create a new mapping named "map_of_order_analysis".
- Source: Use `Super Store Clean Data` as the source.
- Transformations:
  - Filter Transformation:
    - Connect the `CUSTOMER_ID`, `CATEGORY`, `CITY`, and `ORDER_DATE` columns from the source to the Filter transformation.
    - Set the filter condition to `CATEGORY = 'Office Supplies' AND CITY = 'San Francisco'`.
  - Aggregator Transformation:
    - Connect all columns from the Filter transformation to the Aggregator transformation.
    - Create a new output port:
      - `order_count` (Output Port, Integer): Calculate the count of orders using the expression `COUNT(*)`.
    - Group by `CUSTOMER_ID`.
  - Rank Transformation:
    - Connect the `CUSTOMER_ID` and `order_count` columns from the Aggregator transformation to the Rank transformation.
    - Set `order_count` as the rank key.
    - Select "Top" ranking and set the number of ranks to 10.
- Target: Use `Order Analysis` as the target table.
- Workflow: Create a new workflow named "workflow_order_analysis" and a session named "session_order_analysis". Configure the session with the appropriate connections and enable the "Truncate target table" option.
- Execution: Execute the workflow and verify the data in the `Order Analysis` table.
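The filter, count, and top-10 steps of Task 3 map naturally onto `collections.Counter`. The sketch below assumes dictionary rows; `top_customers` and its parameters are hypothetical names. Tie handling is an assumption too: `most_common` keeps first-seen order among equal counts, which may differ from how the Rank transformation treats ties.

```python
from collections import Counter

def top_customers(rows, category="Office Supplies", city="San Francisco", top_n=10):
    """Count matching orders per customer and return the top_n (id, count) pairs."""
    counts = Counter(
        row["CUSTOMER_ID"]
        for row in rows
        # Filter: CATEGORY = 'Office Supplies' AND CITY = 'San Francisco'
        if row["CATEGORY"] == category and row["CITY"] == city
    )
    # Aggregator + Rank: order_count per customer, highest counts first.
    return counts.most_common(top_n)
```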
7. Informatica PowerCenter - Task 4: Geography Analysis
- Mapping: Create a new mapping named "map_of_geography_analysis".
- Source: Use `Super Store Clean Data` as the source.
- Transformations:
  - Filter Transformation:
    - Connect the `CUSTOMER_ID`, `STATE`, and `REGION` columns from the source to the Filter transformation.
    - Set the filter condition to `STATE = 'California'`.
  - Aggregator Transformation:
    - Connect all columns from the Filter transformation to the Aggregator transformation.
    - Create a new output port:
      - `region_ord_count` (Output Port, Integer): Calculate the count of customers using the expression `COUNT(CUSTOMER_ID)`.
    - Group by `REGION`.
- Target: Use `Geography Analysis` as the target table.
- Workflow: Create a new workflow named "workflow_geography_analysis" and a session named "session_geography_analysis". Configure the session with the appropriate connections and enable the "Truncate target table" option.
- Execution: Execute the workflow and verify the data in the `Geography Analysis` table.
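One detail worth noting in Task 4 is that counting a column, as opposed to counting rows, typically skips null values. The sketch below mirrors that behavior in Python, assuming dictionary rows with `None` standing in for NULL; `region_counts` is a hypothetical name.

```python
from collections import Counter

def region_counts(rows, state="California"):
    """Count non-null CUSTOMER_ID values per REGION for one state."""
    counts = Counter()
    for row in rows:
        # Filter: STATE = 'California'
        if row["STATE"] != state:
            continue
        # Adding 0 still registers the region, so a group whose IDs are
        # all null shows up with a count of 0 instead of disappearing.
        counts[row["REGION"]] += 0 if row["CUSTOMER_ID"] is None else 1
    return dict(counts)
```

If the cleaned data has no null customer IDs, this gives the same numbers as a plain row count per region.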
8. Informatica PowerCenter - Task 5: Order Processing
- Mapping: Create a new mapping named "map_of_order_processing".
- Source: Use `Super Store Clean Data` as the source.
- Transformations:
  - Expression Transformation:
    - Connect the `ORDER_DATE` and `SHIP_DATE` columns from the source to the Expression transformation.
    - Create two new ports:
      - `prodas` (Variable Port, Integer): Calculate the processing days using the expression `DATE_DIFF(SHIP_DATE, ORDER_DATE, 'DD')`.
      - `c_pro_days` (Output Port, String, Precision 50): Categorize the processing days using the expression `IIF(prodas < 1, 'Immediate Delivery', IIF(prodas >= 1 AND prodas <= 3, 'Moderate Delivery', 'Long-term Delivery'))`.
  - Aggregator Transformation:
    - Connect the `c_pro_days` column from the Expression transformation to the Aggregator transformation.
    - Create a new output port:
      - `orders_count` (Output Port, Integer): Calculate the count of orders using the expression `COUNT(*)`.
    - Group by `c_pro_days`.
- Target: Use `Order Processing` as the target table.
- Workflow: Create a new workflow named "workflow_order_processing" and a session named "session_order_processing". Configure the session with the appropriate connections and enable the "Truncate target table" option.
- Execution: Execute the workflow and verify the data in the `Order Processing` table.
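The nested `IIF` in Task 5 is easier to check when written as ordinary branches. Here is a small Python sketch of the same categorization and count, assuming both columns are already whole dates with no time component. The function names are hypothetical.

```python
from collections import Counter
from datetime import date

def delivery_category(order_date, ship_date):
    """Bucket the number of days between order and shipment."""
    days = (ship_date - order_date).days
    if days < 1:
        return "Immediate Delivery"
    if days <= 3:  # reached only when days >= 1
        return "Moderate Delivery"
    return "Long-term Delivery"

def order_processing(rows):
    """Count orders per delivery category, like grouping by c_pro_days."""
    return Counter(
        delivery_category(row["ORDER_DATE"], row["SHIP_DATE"]) for row in rows
    )

print(delivery_category(date(2016, 1, 1), date(2016, 1, 3)))  # Moderate Delivery
```

The second branch does not need to restate `days >= 1`, because the first branch has already handled every smaller value; the same simplification applies to the inner `IIF`.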
9. Sample Test and Submission
- Running the Sample Test: Execute the `run.ps1` PowerShell script located in the sample test folder to validate the data transformations.
- Verifying the Sample Score: Ensure the sample score is 100% to confirm that all transformations have been implemented correctly.
- Submitting the Assessment: Close all applications, refresh the Fresco Play environment, and submit the assessment.
10. Conclusion
The video provides a detailed walkthrough of the Informatica T13 Fresco Play hands-on assessment. It covers the entire ETL process, from setting up the database and importing data to creating mappings, transformations, workflows, and executing them. The video emphasizes the importance of understanding the problem statement, using the correct transformations, configuring the sessions properly, and validating the results. By following the steps outlined in the video, users can successfully complete the assessment and achieve a 100% sample score.