Key Concepts

  • ETL Process: Extract, Transform, Load - the core process of moving data from source to target.
  • Informatica PowerCenter: A platform for building enterprise-class data integration solutions.
  • Transformations: Operations performed on data to clean, modify, or enrich it.
  • Mappings: Visual representations of the data flow and transformations within Informatica.
  • Workflows: Automated sequences of tasks that execute mappings and other processes.
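  • Sessions: Tasks that tell the Integration Service how to run a specific mapping within a workflow, including which source and target connections to use.
  • Source Qualifier: A transformation that represents the rows read from a relational or flat-file source in a mapping.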
  • Target Table: The destination database table where transformed data is loaded.
  • Repository Manager: Informatica tool for managing repository folders, users, and metadata.
  • Workflow Manager: Informatica tool for designing workflows and sessions, defining connections, and starting workflow runs.
  • Workflow Monitor: Informatica tool for tracking the status and performance of workflows.
  • ODBC Connection: A standard way to connect to databases from various applications.
  • Data Cleaning: The process of identifying and correcting errors or inconsistencies in data.
  • Data Aggregation: The process of summarizing data by grouping it based on certain criteria.
  • Data Filtering: The process of selecting specific data based on certain criteria.
  • Data Sorting: The process of arranging data in a specific order.
  • Data Concatenation: The process of combining data from multiple fields into a single field.

Informatica T13 Fresco Play Hands-on Walkthrough

1. Setting up the Environment

  • Course Access: Access the Informatica mini project (Course ID 7937) via the IU platform, which redirects to Fresco Play.
  • Virtual Environment: The course launches a virtual machine environment with pre-installed software and project files.
  • Project Folder: Contains the problem statement, sample test file, and the Super Store data file (CSV).
  • Software: Includes SQL Developer (for database operations) and Informatica PowerCenter tools (Repository Manager, Workflow Manager, Designer).
  • Credentials: Database connection credentials (username, password, host, port, SID) are provided in the problem statement file.

2. Database Setup (SQL Developer)

  • Connecting as Admin: Connect to the database in SQL Developer as the administrative user, using the provided credentials (username: system, password: admin).
  • Creating Tables: Create six tables based on the problem statement (one raw-data table and five target tables):
    • Super Store (raw data)
    • Super Store Clean Data (target table for task 1)
    • Sales Summary (target table for task 2)
    • Order Analysis (target table for task 3)
    • Geography Analysis (target table for task 4)
    • Order Processing (target table for task 5)
  • Table Creation Script: The video demonstrates copy-pasting the table creation scripts (column names and data types) directly from the problem statement.
  • Commit Changes: After creating the tables, execute a COMMIT statement as shown in the video. In Oracle, CREATE TABLE commits implicitly, so this step is a safeguard rather than a requirement.
  • Importing Data: Import the Super Store data.CSV file into the Super Store table using SQL Developer's Import Data wizard.
  • Date Format: Correct the date format during the import process to DD/MM/YYYY to match the data in the CSV file.
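
The date-format step matters because the same text can parse to two different dates. The following is a minimal Python sketch of that ambiguity, not part of the video; the sample value is hypothetical, since the walkthrough does not show the actual CSV contents.

```python
from datetime import datetime

# Hypothetical sample value; the real CSV values are not shown in the walkthrough.
raw = "03/11/2016"

as_dd_mm = datetime.strptime(raw, "%d/%m/%Y")  # day first: 3 November 2016
as_mm_dd = datetime.strptime(raw, "%m/%d/%Y")  # month first: 11 March 2016

print(as_dd_mm.date())
print(as_mm_dd.date())

# A day value above 12 cannot be read as a month, so the wrong mask fails outright.
try:
    datetime.strptime("25/11/2016", "%m/%d/%Y")
except ValueError as exc:
    print("rejected:", exc)
```

With the wrong mask, some rows load silently with swapped day and month while others are rejected, which is why the format is set before the import runs.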

3. Informatica PowerCenter - Task 1: Data Cleaning

  • Connecting to Repository: Open the PowerCenter Repository Manager and connect using the administrator credentials (username: administrator, password: administrator).
  • Creating a Folder: Create a new folder named "Super Store" in the Repository Manager to organize the Informatica assets.
  • Launching Designer: Open the Designer tool to create mappings and transformations.
  • Creating a Mapping: Create a new mapping named "map_of_clean_data" in the Mapping Designer.
  • Source Analyzer:
    • Import the Super Store and Super Store Clean Data tables from the database using an ODBC connection.
    • Create a new ODBC connection with the following details:
      • Data Source Name: Oracle
      • Host: localhost
      • Port: 1521
      • SID: XE (the Oracle Express Edition default; confirm against the problem statement)
      • Username: system
      • Password: admin
  • Target Designer: Import the Super Store Clean Data table as the target.
  • Transformations:
    • Sorter Transformation:
      • Add a Sorter transformation to remove duplicate records.
      • Connect all columns from the Super Store source qualifier to the Sorter transformation.
      • Enable the "Distinct" option in the Sorter transformation properties.
    • Filter Transformation:
      • Add a Filter transformation to filter records where the country is "United States".
      • Connect all columns from the Sorter transformation to the Filter transformation.
      • Set the filter condition to COUNTRY = 'United States'.
    • Expression Transformation:
      • Add an Expression transformation to concatenate customer ID and customer name.
      • Create two new ports:
        • v_num (Variable Port, String, Precision 50): Extracts the numerical part of the customer ID using the expression: REG_REPLACE(CUSTOMER_ID, '[^0-9]', '').
        • cust_ID_name (Output Port, String, Precision 50): Concatenates the numerical part of the customer ID, a hyphen, and the customer name using the expression: v_num || '-' || CUSTOMER_NAME.
      • Connect all necessary columns from the Filter transformation to the Expression transformation.
    • Target Table:
      • Connect the cust_ID_name port from the Expression transformation to the CUSTOMER_ID_NAME column in the Super Store Clean Data target table.
      • Connect the remaining columns from the Expression transformation to the corresponding columns in the target table.
  • Saving the Mapping: Save the mapping and ensure it is valid.
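
The Task 1 data flow (Sorter with Distinct, then Filter, then Expression) can be sketched outside Informatica to check the expected output. This is a rough Python model under stated assumptions: the sample rows and the helper names distinct_sorted and clean are hypothetical, and only three of the table's columns are shown.

```python
import re

# Hypothetical sample rows; column names follow the walkthrough, values are invented.
rows = [
    {"CUSTOMER_ID": "AB-10001", "CUSTOMER_NAME": "Ann Baker", "COUNTRY": "United States"},
    {"CUSTOMER_ID": "AB-10001", "CUSTOMER_NAME": "Ann Baker", "COUNTRY": "United States"},
    {"CUSTOMER_ID": "CD-10002", "CUSTOMER_NAME": "Carl Dunn", "COUNTRY": "Canada"},
]

def distinct_sorted(records):
    """Model of a Sorter with Distinct enabled: sort rows and drop exact duplicates."""
    unique = {tuple(sorted(r.items())) for r in records}
    return [dict(t) for t in sorted(unique)]

def clean(records):
    out = []
    for r in distinct_sorted(records):
        # Filter condition: COUNTRY = 'United States'
        if r["COUNTRY"] != "United States":
            continue
        # v_num: same idea as REG_REPLACE(CUSTOMER_ID, '[^0-9]', '')
        v_num = re.sub(r"[^0-9]", "", r["CUSTOMER_ID"])
        # cust_ID_name: v_num || '-' || CUSTOMER_NAME
        out.append({**r, "CUSTOMER_ID_NAME": f"{v_num}-{r['CUSTOMER_NAME']}"})
    return out

cleaned = clean(rows)
print(cleaned)
```

In this sample, the duplicate row and the non-US row are dropped, leaving a single record whose CUSTOMER_ID_NAME combines the digits of the ID with the customer name.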

4. Informatica PowerCenter - Workflow Creation and Execution (Task 1)

  • Workflow Manager: Open the Workflow Manager tool.
  • Connecting to Server: Connect to the Informatica server using the provided credentials.
  • Creating a Workflow: Create a new workflow named "workflow_clean_data".
  • Task Developer:
    • Create a new session named "session_clean_data".
    • Link the session to the "map_of_clean_data" mapping.
  • Configuring the Session:
    • Double-click the session to open the Edit Tasks window.
    • Go to the "Mapping" tab and select the "Connections" node.
    • Configure the source and target connections to use the Oracle ODBC connection.
    • Enable the "Truncate target table" option in the target properties.
  • Workflow Designer:
    • Drag the "session_clean_data" task into the workflow designer.
    • Link the start task to the session task.
    • Validate the workflow.
    • Save the workflow.
  • Executing the Workflow:
    • Right-click the workflow and select "Start Workflow".
    • Monitor the workflow execution in the Workflow Monitor.
    • Verify that the session completes successfully.
  • Verifying the Data:
    • In SQL Developer, query the Super Store Clean Data table to verify that the data has been loaded and transformed correctly.
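
The verification step can also be expressed as a few explicit checks. The sketch below assumes the target rows have already been fetched into Python dictionaries; the function name check_clean_rows and the sample data are hypothetical and not taken from the video.

```python
from collections import Counter

def check_clean_rows(rows):
    """Return a list of problems found in rows read back from the target table."""
    problems = []
    if not rows:
        problems.append("target table is empty")
    # Exact duplicates should have been removed by the Distinct sorter.
    seen = Counter(tuple(sorted(r.items())) for r in rows)
    dupes = [key for key, n in seen.items() if n > 1]
    if dupes:
        problems.append(f"{len(dupes)} duplicated row(s)")
    # Only United States rows should pass the filter.
    if any(r["COUNTRY"] != "United States" for r in rows):
        problems.append("row(s) outside United States")
    return problems

# Hypothetical rows standing in for a query result.
good = [{"CUSTOMER_ID_NAME": "10001-Ann Baker", "COUNTRY": "United States"}]
bad = good + good + [{"CUSTOMER_ID_NAME": "10002-Carl Dunn", "COUNTRY": "Canada"}]

print(check_clean_rows(good))
print(check_clean_rows(bad))
```

An empty result means the loaded data passes these particular checks; it does not replace the sample test script supplied with the assessment.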

5. Informatica PowerCenter - Task 2: Sales Summary

  • Mapping: Create a new mapping named "map_of_sales_summary".
  • Source: Use Super Store Clean Data as the source.
  • Transformations:
    • Aggregator Transformation:
      • Connect CUSTOMER_ID_NAME and SALES columns from the source to the Aggregator transformation.
      • Create two new output ports:
        • tot_sales (Output Port, Number): Calculate the sum of sales using the expression: SUM(SALES).
        • AVG_sales (Output Port, Number): Calculate the average of sales using the expression: AVG(SALES).
      • Group by CUSTOMER_ID_NAME.
    • Sorter Transformation:
      • Connect all columns from the Aggregator transformation to the Sorter transformation.
      • Set tot_sales as the sort key and select "Descending" order.
    • Filter Transformation:
      • Connect all columns from the Sorter transformation to the Filter transformation.
      • Set the filter condition to tot_sales > 3000 AND AVG_sales > 300.
  • Target: Use Sales Summary as the target table.
  • Workflow: Create a new workflow named "workflow_sales_summary" and a session named "session_sales_summary". Configure the session with the appropriate connections and enable the "Truncate target table" option.
  • Execution: Execute the workflow and verify the data in the Sales Summary table.
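
To sanity-check the Sales Summary logic, the Aggregator, Sorter, and Filter steps can be modeled in a few lines of Python. This is only an illustrative sketch: the input rows are invented and the function name sales_summary is hypothetical.

```python
from collections import defaultdict

# Hypothetical (CUSTOMER_ID_NAME, SALES) rows; values are invented.
rows = [
    ("10001-Ann Baker", 2500.0),
    ("10001-Ann Baker", 1500.0),
    ("10002-Bo Chen", 400.0),
    ("10003-Cy Diaz", 3200.0),
    ("10003-Cy Diaz", 3000.0),
    ("10003-Cy Diaz", 100.0),
]

def sales_summary(records):
    groups = defaultdict(list)
    for name, sales in records:
        groups[name].append(sales)
    # Aggregator: SUM(SALES) and AVG(SALES), grouped by CUSTOMER_ID_NAME
    agg = [(name, sum(v), sum(v) / len(v)) for name, v in groups.items()]
    # Sorter: tot_sales in descending order
    agg.sort(key=lambda r: r[1], reverse=True)
    # Filter: tot_sales > 3000 AND AVG_sales > 300
    return [r for r in agg if r[1] > 3000 and r[2] > 300]

summary = sales_summary(rows)
for name, tot_sales, avg_sales in summary:
    print(name, tot_sales, avg_sales)
```

Filtering after sorting keeps the descending order intact, so the rows arrive at the target already ordered by total sales.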

6. Informatica PowerCenter - Task 3: Order Analysis

  • Mapping: Create a new mapping named "map_of_order_analysis".
  • Source: Use Super Store Clean Data as the source.
  • Transformations:
    • Filter Transformation:
      • Connect CUSTOMER_ID, CATEGORY, CITY, and ORDER_DATE columns from the source to the Filter transformation.
      • Set the filter condition to CATEGORY = 'Office Supplies' AND CITY = 'San Francisco'.
    • Aggregator Transformation:
      • Connect all columns from the Filter transformation to the Aggregator transformation.
      • Create a new output port:
        • order_count (Output Port, Integer): Calculate the count of orders using the expression: COUNT(*).
      • Group by CUSTOMER_ID.
    • Rank Transformation:
      • Connect CUSTOMER_ID and order_count columns from the Aggregator transformation to the Rank transformation.
      • Set order_count as the rank key.
      • Set the "Top" property to 10.
  • Target: Use Order Analysis as the target table.
  • Workflow: Create a new workflow named "workflow_order_analysis" and a session named "session_order_analysis". Configure the session with the appropriate connections and enable the "Truncate target table" option.
  • Execution: Execute the workflow and verify the data in the Order Analysis table.
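
The Order Analysis flow (filter, count per customer, keep the top ranks) can be approximated in Python as follows. The sample rows and the function name top_customers are hypothetical; tie handling in Informatica's Rank transformation may differ from this sketch.

```python
from collections import Counter

# Hypothetical sample rows; values are invented.
rows = [
    {"CUSTOMER_ID": "AB-10001", "CATEGORY": "Office Supplies", "CITY": "San Francisco"},
    {"CUSTOMER_ID": "AB-10001", "CATEGORY": "Office Supplies", "CITY": "San Francisco"},
    {"CUSTOMER_ID": "AB-10001", "CATEGORY": "Office Supplies", "CITY": "San Francisco"},
    {"CUSTOMER_ID": "CD-10002", "CATEGORY": "Office Supplies", "CITY": "San Francisco"},
    {"CUSTOMER_ID": "EF-10003", "CATEGORY": "Furniture", "CITY": "San Francisco"},
    {"CUSTOMER_ID": "GH-10004", "CATEGORY": "Office Supplies", "CITY": "Los Angeles"},
]

def top_customers(records, top_n=10):
    # Filter: CATEGORY = 'Office Supplies' AND CITY = 'San Francisco'
    kept = [
        r for r in records
        if r["CATEGORY"] == "Office Supplies" and r["CITY"] == "San Francisco"
    ]
    # Aggregator: COUNT(*) grouped by CUSTOMER_ID
    counts = Counter(r["CUSTOMER_ID"] for r in kept)
    # Rank: keep the top_n customers by order_count, highest first
    return counts.most_common(top_n)

ranked = top_customers(rows)
print(ranked)
```

Applying the filter before the aggregation means only matching orders are counted, which is the same ordering the mapping uses.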

7. Informatica PowerCenter - Task 4: Geography Analysis

  • Mapping: Create a new mapping named "map_of_geography_analysis".
  • Source: Use Super Store Clean Data as the source.
  • Transformations:
    • Filter Transformation:
      • Connect CUSTOMER_ID, STATE, and REGION columns from the source to the Filter transformation.
      • Set the filter condition to STATE = 'California'.
    • Aggregator Transformation:
      • Connect all columns from the Filter transformation to the Aggregator transformation.
      • Create a new output port:
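        • region_ord_count (Output Port, Integer): Count the rows with a non-null customer ID in each region using the expression: COUNT(CUSTOMER_ID). A customer with several rows is counted once per row, not once per customer.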
      • Group by REGION.
  • Target: Use Geography Analysis as the target table.
  • Workflow: Create a new workflow named "workflow_geography_analysis" and a session named "session_geography_analysis". Configure the session with the appropriate connections and enable the "Truncate target table" option.
  • Execution: Execute the workflow and verify the data in the Geography Analysis table.
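
Because COUNT(CUSTOMER_ID) counts non-null values rather than distinct customers, the two numbers can differ. The Python sketch below shows both side by side for the California filter; the sample rows and the function name region_counts are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows; values are invented.
rows = [
    {"CUSTOMER_ID": "AB-10001", "STATE": "California", "REGION": "West"},
    {"CUSTOMER_ID": "AB-10001", "STATE": "California", "REGION": "West"},
    {"CUSTOMER_ID": "CD-10002", "STATE": "California", "REGION": "West"},
    {"CUSTOMER_ID": None, "STATE": "California", "REGION": "West"},
    {"CUSTOMER_ID": "EF-10003", "STATE": "Texas", "REGION": "Central"},
]

def region_counts(records):
    """Map each region to (non-null ID count, distinct customer count)."""
    ids_by_region = defaultdict(list)
    for r in records:
        # Filter: STATE = 'California'; null IDs are skipped, as COUNT(port) does
        if r["STATE"] == "California" and r["CUSTOMER_ID"] is not None:
            ids_by_region[r["REGION"]].append(r["CUSTOMER_ID"])
    return {region: (len(ids), len(set(ids))) for region, ids in ids_by_region.items()}

result = region_counts(rows)
print(result)
```

The first number in each pair corresponds to what the mapping loads; the second is shown only for comparison.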

8. Informatica PowerCenter - Task 5: Order Processing

  • Mapping: Create a new mapping named "map_of_order_processing".
  • Source: Use Super Store Clean Data as the source.
  • Transformations:
    • Expression Transformation:
      • Connect ORDER_DATE and SHIP_DATE columns from the source to the Expression transformation.
      • Create two new ports:
        • prodas (Variable Port, Integer): Calculate the processing days using the expression: DATE_DIFF(SHIP_DATE, ORDER_DATE, 'DD').
        • c_pro_days (Output Port, String, Precision 50): Categorize the processing days using the expression:
          IIF(prodas < 1, 'Immediate Delivery',
              IIF(prodas >= 1 AND prodas <= 3, 'Moderate Delivery',
                  'Long-term Delivery'))
          
    • Aggregator Transformation:
      • Connect c_pro_days column from the Expression transformation to the Aggregator transformation.
      • Create a new output port:
        • orders_count (Output Port, Integer): Calculate the count of orders using the expression: COUNT(*).
      • Group by c_pro_days.
  • Target: Use Order Processing as the target table.
  • Workflow: Create a new workflow named "workflow_order_processing" and a session named "session_order_processing". Configure the session with the appropriate connections and enable the "Truncate target table" option.
  • Execution: Execute the workflow and verify the data in the Order Processing table.
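
The categorization rule for processing days can be checked with a small Python model. This sketch assumes whole-day dates with no time component; the sample dates and the function name delivery_category are hypothetical.

```python
from collections import Counter
from datetime import date

def delivery_category(order_date, ship_date):
    # prodas: days between shipping and ordering
    prodas = (ship_date - order_date).days
    if prodas < 1:
        return "Immediate Delivery"
    if prodas <= 3:
        return "Moderate Delivery"
    return "Long-term Delivery"

# Hypothetical (ORDER_DATE, SHIP_DATE) pairs; values are invented.
orders = [
    (date(2016, 11, 3), date(2016, 11, 3)),   # 0 days
    (date(2016, 11, 3), date(2016, 11, 5)),   # 2 days
    (date(2016, 11, 3), date(2016, 11, 6)),   # 3 days
    (date(2016, 11, 3), date(2016, 11, 10)),  # 7 days
]

# Aggregator: COUNT(*) grouped by c_pro_days
orders_count = Counter(delivery_category(o, s) for o, s in orders)
print(dict(orders_count))
```

The boundary cases are worth checking by hand: 0 days falls under "Immediate Delivery", and exactly 3 days still counts as "Moderate Delivery".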

9. Sample Test and Submission

  • Running the Sample Test: Execute the run.ps1 PowerShell script located in the sample test folder to validate the data transformations.
  • Verifying the Sample Score: Ensure the sample score is 100% to confirm that all transformations have been implemented correctly.
  • Submitting the Assessment: Close all applications, refresh the Fresco Play environment, and submit the assessment.

10. Conclusion

The video walks through the Informatica T13 Fresco Play hands-on assessment end to end: setting up the database, importing the data, building mappings and transformations, and creating and running workflows. It stresses reading the problem statement carefully, choosing the correct transformations, configuring each session properly, and validating the results. Following these steps should allow users to complete the assessment and reach a 100% sample score.
