Azure Data Box Next Gen Walkthrough

By John Savill's Technical Training


Azure Data Box Next Generation: Offline Data Migration Explained

Key Concepts:

  • Offline Data Migration
  • Azure Data Box (120 TB, 525 TB)
  • Data Ingestion/Egress
  • Storage Account (Blob, Page Blob, Files, ADLS Gen2)
  • Managed Disks
  • Cross-Region Data Transfer
  • Data Security (Encryption, Passwords)
  • Azure Storage Mover

1. Introduction to Azure Data Box

  • Azure Data Box is an offline data movement solution: physical devices are shipped to the customer's data center, filled with data, and then shipped back to Azure for ingestion.
  • The video focuses on importing data into Azure, but the same process can be used for exporting data from Azure.
  • Offline data migration is used when network transfer is not feasible due to bandwidth limitations, large data volumes, or time constraints.

2. When to Use Offline Data Migration

  • Bandwidth: Consider the available bandwidth between the data source and Azure, including internet egress points and ExpressRoute connections.
  • Rate of Change: If data is constantly being rewritten, offline migration may not be suitable. A delta copy strategy might be needed.
  • Volume of Data: Evaluate the amount of data to be moved in relation to available bandwidth and time constraints.
  • Example: If a large amount of data needs to be moved quickly and bandwidth is limited, offline data migration is a good option.
  • Azure Storage Mover: Move the bulk of the data offline with Data Box, then use Azure Storage Mover to catch up with the delta over the network.
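
As a hedged sketch of the bulk-plus-delta pattern: once the Data Box import has landed the bulk of the data, the remaining delta can be caught up over the network. The example below uses AzCopy as a stand-in for Storage Mover; the local path, account, container, and SAS token are all placeholders.

    # Sync only what changed since the offline bulk copy (placeholders throughout).
    azcopy sync "D:\data" "https://mystorageacct.blob.core.windows.net/mycontainer?<SAS>" --recursive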

3. Target Services and Data Types

  • Data Box supports importing data into:
    • Blob storage (block, page) with tiering (hot, cool, cold, archive)
    • Azure Files (standard, premium)
    • Azure Data Lake Storage Gen2 (hierarchical namespace)
    • Managed Disks
  • It does not directly support Azure NetApp Files or databases. Workarounds involve copying data to Azure Files or storage accounts and then restoring it to the target service.

4. Next-Generation Data Box Devices

  • The next-generation Data Box replaces the previous 80 TB Data Box and 800 TB Data Box Heavy offerings.
  • It comes in a single form factor with two size options (see the capacity note after this list):
    • 120 TB: 150 TB raw, 120 TB usable (after RAID 5 overhead)
    • 525 TB: 600 TB raw, 525 TB usable (after RAID 5 overhead)
  • Both sizes offer overnight shipping.
  • Network Connectivity:
    • Two 100 Gbps Ethernet ports (Data 1, Data 2) using QSFP pluggable interfaces (adapters not included).
    • One 10 Gbps Ethernet port (Data 3) and a management port.
    • Up to 100 Gbps aggregate throughput.
  • Uses a standard power cable.
  • Ruggedized design with handles, tamper-proofing, secure boot, and TPM 2.0.
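
The usable capacities are consistent with RAID 5 parity overhead of one disk per group; the group sizes below are inferred from the stated ratios, not published figures.

    usable ≈ raw × (N − 1) / N    where N = disks per RAID 5 group
    150 TB × 4/5 = 120 TB         (consistent with N = 5)
    600 TB × 7/8 = 525 TB         (consistent with N = 8)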

5. Azure Data Box Disk

  • 7 TB usable (8 TB raw) capacity.
  • Up to five disks per order (35 TB total).
  • Connects via USB.
  • Use case: Collecting data from remote systems (e.g., in-vehicle logging) and shipping the disks back for ingestion.

6. Order Limits

  • Up to ten Data Box Disk orders (up to five disks each).
  • Up to five 120 TB Data Box orders.
  • Up to four 525 TB Data Box orders.

7. Ordering Process

  • Create a new Azure Data Box order in the Azure portal.
  • Specify the source country/region (data center location) and the destination Azure region (storage account location).
  • Cross-Region Data Transfer: Data Box can now ship from a local data center and restore data to a different Azure region over the Microsoft backbone network. This eliminates the need to ship devices across commerce boundaries.
  • Select the Data Box size (120 TB or 525 TB).
  • Choose the data destination: storage accounts and/or managed disks.
  • For storage accounts, select the target storage accounts.
  • Enable the option to copy to archive tier if desired.
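
Orders can also be scripted. The sketch below uses the Azure CLI databox extension from PowerShell; the resource names and address fields are placeholders, and the next-generation SKU value shown is an assumption, so verify it with az databox job create --help before relying on it.

    # Hedged sketch: 'DataBox120' as the next-gen SKU name is an assumption.
    az databox job create --resource-group myRG --name myDataBoxOrder `
      --location eastus --transfer-type ImportToAzure --sku DataBox120 `
      --contact-name "Jane Doe" --phone 5555555555 --email-list jane@contoso.com `
      --street-address1 "1 Main St" --city Redmond --state-or-province WA `
      --postal-code 98052 --country US --storage-account mystorageacct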

8. Share Creation and Folder Structure

  • For each storage account selected, Data Box creates a set of shares on the device:
    • storageaccountname_blockblob
    • storageaccountname_pageblob
    • storageaccountname_file
  • Under the blockblob share, subfolders are created for each tier: cold, cool, hot, and archive (if enabled).
  • Copy data to the appropriate tier folder.
  • Under each share, create a subfolder; its name becomes the blob container name (for the blob shares) or the Azure file share name (for the file share).
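
To make the mapping concrete, here is how a file copied to the device lands in Azure after import (all names are illustrative):

    # On the device (SMB path):
    \\<device-ip>\mystorageacct_blockblob\hot\mycontainer\logs\app1.log
    # After import (block blob in the Hot tier):
    https://mystorageacct.blob.core.windows.net/mycontainer/logs/app1.log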

9. Security Configuration

  • Encryption:
    • Hardware-based RAID encryption (AES-256).
    • Option for platform-managed key or customer-managed key (CMK).
    • For CMK, specify an Azure Key Vault, key, and a user-assigned managed identity with permissions to access the key (see the sketch after this list).
    • Option for double encryption (BitLocker software encryption on top of hardware encryption).
  • Passwords:
    • Device password (generated by the system or set by the user).
    • Share passwords (generated by the system or set by the user).
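
As a hedged sketch of the CMK prerequisites from PowerShell (resource names are placeholders; confirm the exact key permissions Data Box requires against the current documentation):

    # Create a user-assigned managed identity for the order (names are placeholders).
    az identity create --resource-group myRG --name databox-cmk-identity
    # Look up its principal ID and grant it access to the Key Vault key.
    # Get/WrapKey/UnwrapKey is the usual CMK permission set; verify for Data Box.
    $principalId = az identity show --resource-group myRG --name databox-cmk-identity --query principalId -o tsv
    az keyvault set-policy --name myKeyVault --object-id $principalId --key-permissions get wrapKey unwrapKey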

10. Shipping and Delivery

  • The Data Box is shipped overnight in a box with protective foam.
  • The usage period is 10 days for the 120 TB device and 20 days for the 525 TB device.
  • Daily overage charges apply beyond the included period.

11. Physical Setup

  • Open the front and back flaps for airflow.
  • Connect the power cable and network cables.
  • The management port has a fixed IP address (192.168.0.10).
  • Data ports are set to DHCP by default but can be configured with static IP addresses.
  • The MAC addresses of the data ports are available in the Azure portal for pre-configuration of DHCP.
  • Power on the device.

12. Unlocking the Device

  • Access the Data Box dashboard via its IP address in a web browser.
  • Ignore the certificate error (the certificate is issued for the device's internal name, not its IP address).
  • Unlock the device using the device password from the Azure portal.

13. Network Interface Configuration

  • The dashboard shows the status of network connections.
  • Network interfaces can be configured (static or DHCP).
  • The management port IP address cannot be changed.

14. Data Copying

  • Two methods for copying data:
    • Client-Initiated Copy: Connect to the Data Box shares from a client machine and copy data using standard file transfer tools (e.g., the net use command; see the sketch after this list).
    • Data Box-Initiated Copy: Use the "Copy Data" feature on the Data Box dashboard to create copy jobs; the Data Box pulls data from the source.
  • The "Connect and Copy" section of the dashboard shows the share names, connection details, and credentials.
  • Performance varies depending on the protocol (SMB, NFS, REST) and file size.
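
A minimal client-initiated copy sketch (run from a Windows client; the device IP, share user, and paths are placeholders, and the credentials come from the "Connect and Copy" page):

    # Map the block blob share using the credentials shown on the device dashboard
    # (enter the share password when prompted).
    net use \\<device-ip>\mystorageacct_blockblob /u:<share-user>
    # Copy into the hot tier folder; 'mycontainer' becomes the container name.
    robocopy C:\data \\<device-ip>\mystorageacct_blockblob\hot\mycontainer /E /MT:32 /R:3 /W:60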

15. Preparing for Return

  • Once data copying is complete, use the "Prepare to Ship" option on the dashboard.
  • The device locks, generates a file list, and compresses the data.
  • Download the file list for verification (a spot-check sketch follows this list).
  • Shut down the device.
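
One hedged way to spot-check the downloaded file list against the local source tree from PowerShell (the list's file name and format are assumptions here; adjust the path handling to match the actual format):

    # Compare device-reported relative paths against a listing of the local source.
    $device = Get-Content .\databox-file-list.txt
    $local  = Get-ChildItem C:\data -Recurse -File |
        ForEach-Object { $_.FullName.Substring("C:\data\".Length) }
    Compare-Object -ReferenceObject $device -DifferenceObject $local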

16. Shipping Back and Data Ingestion

  • Disconnect the cables, pack the Data Box in its original packaging, and arrange for pickup.
  • The data is imported into the target storage account in Azure.
  • There is no specific SLA for the import time.
  • If there is hardware damage during shipping, the disks can be moved to a donor device to recover the data.

17. Data Sanitization

  • After the import is complete, the data on the Data Box is scrubbed to NIST SP 800-88 Revision 1 standards.

18. Pricing

  • Pricing comprises a service fee, an extra-day fee (if the included usage period is exceeded), and a shipping fee.
  • Fees vary depending on the Data Box size.

19. Conclusion

  • Azure Data Box is a useful solution for offline data migration when network transfer is not feasible.
  • It offers various security features, including encryption and access control.
  • Azure Storage Mover can be used in conjunction with Data Box for bulk data transfer and delta synchronization.
