Restoring Large Files in Windows with BlackPearl – Sparse Files

Spectra Logic provides Software Development Kits (SDKs) for BlackPearl in several programming languages to make it easier for our partners and customers to create integrations to BlackPearl. Some of the BlackPearl SDKs are used in Windows environments, particularly the Java and .NET SDKs. An issue was recently discovered in the way that these SDKs write large files (64 GB or larger) restored from BlackPearl onto disk.

In Windows, by default, when a file is created, Windows will initially reserve the space for the file on disk by filling the file with zeros. It then later replaces those zeros with the actual file byte content as it is received. When the files are 64GB or larger, it can take a very long time to write all of the zeros. In that time the file transfer from BlackPearl can time out, causing the SDK to have to resubmit its request to BlackPearl to restore the file. This resubmission causes the overall transfer process to take longer than it would if the resubmission did not occur. And in extreme cases with extremely large files (1TB+), the transfer can permanently fail.

To avoid this timeout issue, the zeros should not be written to the file up front. This can be done by specifying that Windows sparse files be used. With Windows sparse files, per Microsoft, “the system does not allocate hard disk drive space to a file except in regions where it contains nonzero data”. Sparse files are therefore initially written much more quickly than non-sparse files.

We have updated our Java and .NET SDKs to use sparse files on restores. Partners and customers using these SDKs should update to the latest versions to ensure that they have this sparse file fix. However, the fixes in these SDKs only help if the “FileHelpers” class is being used, in which case the SDKs write the restored files. Organizations that use these SDKs but have written their own code to write the restored files must implement sparse files themselves.
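For teams that write restored files with their own code, the idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the SDKs' implementation: the function names are hypothetical, and the sparse flag is set through the Windows DeviceIoControl call with the FSCTL_SET_SPARSE control code. On other platforms the flagging step is skipped.

```python
import os

FSCTL_SET_SPARSE = 0x000900C4  # Windows control code that flags a file as sparse


def mark_sparse(fileobj):
    """Flag an open file as sparse on Windows; return False elsewhere."""
    if os.name != "nt":
        return False
    import ctypes
    import msvcrt
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.DeviceIoControl.argtypes = [
        wintypes.HANDLE, wintypes.DWORD,
        wintypes.LPVOID, wintypes.DWORD,
        wintypes.LPVOID, wintypes.DWORD,
        ctypes.POINTER(wintypes.DWORD), wintypes.LPVOID,
    ]
    kernel32.DeviceIoControl.restype = wintypes.BOOL

    handle = msvcrt.get_osfhandle(fileobj.fileno())
    returned = wintypes.DWORD(0)
    # With no input buffer, FSCTL_SET_SPARSE turns the sparse flag on.
    ok = kernel32.DeviceIoControl(
        handle, FSCTL_SET_SPARSE, None, 0, None, 0,
        ctypes.byref(returned), None,
    )
    if not ok:
        raise ctypes.WinError(ctypes.get_last_error())
    return True


def create_sparse_file(path):
    """Create an empty restore target and flag it as sparse (hypothetical helper)."""
    with open(path, "wb") as f:
        return mark_sparse(f)


def write_chunk(path, offset, data):
    """Write one restored chunk at its byte offset (hypothetical helper)."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)
```

The flag is set once, right after the file is created and before any chunk is written, so a chunk that arrives out of order can be written at a high offset without the gap before it being filled with zeros first.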

You can see examples of how Spectra Logic implemented sparse files in the SDKs:

Please contact the Spectra Developer Program Team if you need assistance with implementing sparse files with BlackPearl.


A New Method of BlackPearl Integration – Spectra RioBroker

To date, hundreds of Spectra Logic customers have implemented our BlackPearl® Converged Storage System to affordably store data long-term. Spectra’s BlackPearl is a purpose-built storage platform that integrates directly with “data mover” software applications to simplify workflows and seamlessly manage large volumes of data to multiple storage targets. BlackPearl includes an S3-like interface to move data into its object storage on the cloud, disk, and tape. In order to help even more customers meet their long-term storage needs, Spectra has recently introduced another, easier system for transferring files into BlackPearl. This new system is called Spectra RioBroker. The announcement, made yesterday, can be read here.

As a software front-end to BlackPearl, Spectra RioBroker acts as a data mover for applications that wish to move data into the BlackPearl object storage gateway. The typical system architecture is shown in the diagram below.

There are several advantages to using Spectra RioBroker instead of the direct BlackPearl interface:

  • Enables easier client development for partners with a simple abstraction layer over the BlackPearl interface
  • Allows more clients and applications to share BlackPearl object storage resources in parallel at ever higher performance
  • Provides remote input/output capabilities to multiple Spectra BlackPearl Converged Storage servers
  • Brokers data stream input and output between multiple sources and destinations
  • Provides for enterprise-level high availability*
  • Delivers ultra-high performance with clustering capabilities*
  • Facilitates seamless content migration from legacy storage software to a modern solution

A perfect use case example for Spectra RioBroker is for Media Asset Management (MAM) software typically used in the Media and Entertainment industry. Spectra’s MAM partners have built integrations to Spectra RioBroker so that their customers can archive and restore files directly from the MAM interface without the files passing through the MAM server. Now that the software is released, we expect many more partners and customers to build integrations.

The Spectra RioBroker software application is provided to Spectra customers at no cost. The software is currently available for Windows servers, and a Linux version will be released in the future. Both partners and customers can utilize the free tools on Spectra’s Developer Program Website to build their own integrations. The website includes all the tools needed to build an integration, including the Spectra RioBroker installer, a BlackPearl simulator, an SDK, code examples, documentation and more. Spectra RioBroker, like BlackPearl, uses a RESTful API with very simple commands. A basic integration to RioBroker requires only three API commands – Archive, Restore, and Check Job Status.

While Spectra RioBroker will become a popular method to integrate to BlackPearl, direct integration to BlackPearl will continue to be logical for certain use cases. Spectra Logic will continue to support this direct integration method.

*Available in a future release of Spectra RioBroker


Using the New BlackPearl “Staging Objects” Feature

Included in Spectra Logic’s BlackPearl 5.0 API is a new Stage Objects command. Most BlackPearl customers utilize tape storage, and many use tape as the final place where data will live. Over time, the data ages out of both the cache and the temporary disk tier, leaving the only copy or copies of the data on tape. Because tape is stored securely offline, restore jobs that mount a tape cartridge will always have some latency, as it takes time to retrieve a cartridge from a shelf inside the tape library. This latency can compound into a rather long wait if one job, with a large set of objects and partial objects, spans many tape cartridges.

This type of latency happens frequently in many environments, since their data sets can be quite large, potentially spanning tens or hundreds of tapes. For instance, in weather forecasting, weather data is gathered over time and archived daily to a new tape. When a new climate model is set to be run on the supercomputer, the job requires a subset of data from each day in the archived time period (potentially over the last 10 years), which is stored across hundreds of tapes.

If a large restore job requires mounting many tapes, this job can take a very long time to restore the data back onto the processing server or supercomputer, especially if there is a limit on drive availability. This waiting can waste time and money because data is sitting idle on expensive compute storage waiting for the rest of the data set to arrive before performing the analysis.

With the Stage Objects command, this same job can be given to BlackPearl, which will instead internally copy the data to a disk tier if one is available, or to cache otherwise. Later, when the window of processing time is available, a normal bulk get job can restore the data from BlackPearl disk more quickly and efficiently, since the data is pulled in parallel without the additional latency of mounting tapes.

Once the data is staged to disk, the staged objects will stay on the disk as dictated by the disk policy. For example, if the data is staged to ArcticBlue disk that has a 3-month retention policy, the data will stay on the disk for at least 3 months. If it’s staged to disk cache, it will stay in cache until the cache has to make room for other files.

Creating a Stage Objects job works just like creating a bulk get job, except that the operation is “START_BULK_STAGE” instead of “START_BULK_GET”.

Here is the HTTP request syntax:

PUT http[s]://{datapathDNSname}/_rest_/bucket/{bucket UUID or name}?operation=START_BULK_STAGE[&name={string}][&priority=URGENT|HIGH|NORMAL|LOW]

The payload will include some number of objects:

<Objects>
<Object Name="{string}" Length="{64-bit integer}"
Offset="{64-bit integer}" Version_Id="{string}"/>

</Objects>
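As a rough illustration of the syntax above, the following Python sketch assembles the request URL and the Objects payload. The function name, the example host and the dictionary keys are hypothetical, Length and Offset are assumed to be optional (needed only for partial objects), and the authentication headers that a real request requires are left out.

```python
from urllib.parse import quote, urlencode
import xml.etree.ElementTree as ET


def build_stage_request(endpoint, bucket, objects, name=None, priority=None):
    """Return (method, url, body) for a START_BULK_STAGE request.

    `objects` is a list of dicts with a "name" key and optional
    "offset" and "length" keys (hypothetical input shape).
    """
    params = {"operation": "START_BULK_STAGE"}
    if name is not None:
        params["name"] = name
    if priority is not None:
        params["priority"] = priority
    url = f"{endpoint}/_rest_/bucket/{quote(bucket, safe='')}?{urlencode(params)}"

    root = ET.Element("Objects")
    for obj in objects:
        attrs = {"Name": obj["name"]}
        if "length" in obj:
            attrs["Length"] = str(obj["length"])
        if "offset" in obj:
            attrs["Offset"] = str(obj["offset"])
        ET.SubElement(root, "Object", attrs)
    return "PUT", url, ET.tostring(root, encoding="unicode")


method, url, body = build_stage_request(
    "https://blackpearl.example.com",  # placeholder data path address
    "weather",
    [{"name": "2016/01/01.nc"}, {"name": "big.bin", "offset": 0, "length": 1024}],
    priority="LOW",
)
```

Building the payload with an XML library rather than string concatenation ensures that object names containing characters such as & or quotes are escaped correctly.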

The response is the same as for other bulk jobs:

<MasterObjectList
Aggregating="TRUE|FALSE"
BucketName="{string}"
CachedSizeInBytes="{64-bit integer}"
ChunkClientProcessingOrderGuarantee="IN_ORDER|NONE"
CompletedSizeInBytes="{64-bit integer}"
EntirelyInCache="TRUE|FALSE"
JobId="{string}"
Naked="TRUE|FALSE"
Name="{string}"
OriginalSizeInBytes="{64-bit integer}"
Priority="CRITICAL|URGENT|HIGH|NORMAL|LOW|BACKGROUND"
RequestType="GET"
StartDate="YYYY-MM-DDThh:mm:ss.xxxZ"
Status="IN_PROGRESS|COMPLETED|CANCELED"
UserId="{string}"
UserName="{string}">
<Nodes>
<Node EndPoint="{string}" Id="{string}"/>
</Nodes>
<Objects
ChunkId="{string}"
ChunkNumber="{32-bit integer}">
<Object Id="{string}" InCache="TRUE|FALSE"
Latest="TRUE|FALSE" Length="{64-bit integer}"
Name="{string}" Offset="{64-bit integer}"
VersionId="{string}"/>

</Objects>

</MasterObjectList>
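A client usually needs only a few fields from this response. The sketch below, which assumes the element and attribute names shown above, pulls out the job ID, the status, and the objects in each chunk. The function name and the sample values are made up for illustration.

```python
import xml.etree.ElementTree as ET


def summarize_job(xml_text):
    """Extract the job ID, status and per-chunk object list from a MasterObjectList."""
    root = ET.fromstring(xml_text)
    chunks = []
    for chunk in root.findall("Objects"):
        objects = [
            {
                "name": obj.get("Name"),
                "offset": int(obj.get("Offset", "0")),
                "length": int(obj.get("Length", "0")),
                # Compare case-insensitively in case the server sends "true".
                "in_cache": obj.get("InCache", "").upper() == "TRUE",
            }
            for obj in chunk.findall("Object")
        ]
        chunks.append({"chunk_id": chunk.get("ChunkId"), "objects": objects})
    return {
        "job_id": root.get("JobId"),
        "status": root.get("Status"),
        "chunks": chunks,
    }


# Made-up sample response for illustration only.
SAMPLE = """
<MasterObjectList BucketName="weather" JobId="job-1" Status="IN_PROGRESS">
  <Nodes><Node EndPoint="10.0.0.5" Id="node-1"/></Nodes>
  <Objects ChunkId="chunk-1" ChunkNumber="0">
    <Object Name="2016/01/01.nc" InCache="FALSE" Length="2048" Offset="0"/>
    <Object Name="2016/01/02.nc" InCache="TRUE" Length="4096" Offset="0"/>
  </Objects>
</MasterObjectList>
"""

summary = summarize_job(SAMPLE)
```

The JobId is the value to keep, since it identifies the stage job in later status checks.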

The Stage Objects operation is available in all 5.0+ SDKs for integration into your application. We encourage developers to use this new feature to pre-stage data for users of their applications. Contact the Developer Program Team if you have any questions or need assistance.


BlackPearl 5 Simulator Now Available

We have released a new BlackPearl simulator which is running the latest version of our BlackPearl 5.0 code, scheduled for release in late June. You can get the simulator on our Downloads page. SDKs for BlackPearl 5 are also available on this page. Make sure to carefully follow the simulator setup instructions. Contact the Developer Program Team if you need assistance.


BlackPearl SDK for Go is Now Available

Spectra Logic provides software development kits (SDKs) to make it easier to create applications that integrate with BlackPearl. We are pleased to announce that we now have an SDK for the Go programming language to simplify and accelerate BlackPearl client application development. In addition to Go, we currently support SDKs for Java, C#/.NET, C and Python. You can download the Go SDK and view the Go SDK documentation. This is a new language release, so please send us any feedback you might have on this SDK.


Getting Started with BlackPearl Partial File Restore Integration

In the Media & Entertainment world, data files have reached very large sizes, particularly in cases of high resolution video that can exceed one terabyte in size. In order to work efficiently with very large files, media file processing is done in sections, with the end-user requesting content “snippets” based on timecodes. Object storage devices that are used to store very large files are typically not aware of the timecode-to-byte relationship, and lack the content awareness that is necessary to extract and create partial media files. In order to bridge the gap between time and bytes, BlackPearl has added a Partial File Restore (PFR) feature to enable the media processing application to efficiently retrieve a complete media file based on timecode offsets.

In a typical Media Asset Management (MAM)/BlackPearl environment, the architecture would involve the following elements:

--        MAM – provides end user interface

--        BlackPearl MAM Plugin – handles all interactions between BlackPearl and MAM

--        BlackPearl – object storage

--        PFR Server – provides PFR processing services

--        NAS – network storage – permanent storage location for index files, temporary storage location for file processing

In the BlackPearl environment, there are two parts to the PFR processing of media files. The first part involves creating the media-aware index file for timecode/byte-offset information during the file storage (writing) process. The second part involves recalling a range of bytes based on a specified timecode, and creating a complete, self-contained media file that contains only the requested frames. In order to make this integration efficient, the PFR Server has a defined set of services that can be called by the media application using a REST interface (in this architecture, via the BlackPearl plugin).

The interface supports three basic request calls and two status calls to monitor the progress of the requested execution:

--        Request File Indexing
--        Request File Indexing Status
--        Request File Byte Offsets
--        Request Partial File Creation
--        Request Partial File Creation Status

For storing the file, the typical flow of execution would be:

  1. Copy media file to NAS storage
  2. Request creation of media index file
  3. Monitor status for completion
  4. Write file to BlackPearl Storage

For retrieving the partial file, the flow of execution would be:

  1. Request File Byte Offsets based on timecode
  2. Read specified byte range from BlackPearl to NAS
  3. Request partial file creation
  4. Monitor file creation status
  5. Supply partial file to Media Application
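The retrieval flow above can be sketched as a small orchestration function. Everything about the interfaces here is an assumption made for illustration: `pfr` and `storage` are hypothetical client objects, their method names and the "done" and "failed" status strings are invented, and the real PFR REST API and SDK wrappers may differ.

```python
import time


def restore_partial_file(pfr, storage, media_file, first_timecode, last_timecode,
                         output_name, poll_seconds=5.0, max_polls=720,
                         sleep=time.sleep):
    """Run the five retrieval steps against hypothetical PFR and storage clients."""
    # 1. Request file byte offsets based on timecode.
    start, end = pfr.get_byte_offsets(media_file, first_timecode, last_timecode)

    # 2. Read the specified byte range from BlackPearl to the NAS.
    storage.read_range_to_nas(media_file, start, end)

    # 3. Request partial file creation.
    pfr.request_partial_file(media_file, first_timecode, last_timecode, output_name)

    # 4. Monitor file creation status until it finishes or fails.
    for _ in range(max_polls):
        status = pfr.partial_file_status(output_name)
        if status == "done":
            # 5. Hand the finished partial file back to the media application.
            return output_name
        if status == "failed":
            raise RuntimeError(f"partial file creation failed for {output_name}")
        sleep(poll_seconds)
    raise TimeoutError(f"partial file {output_name} was not ready in time")
```

Passing `sleep` in as a parameter keeps the polling loop easy to test without real delays, and `max_polls` bounds the wait so that a stalled job does not block the caller forever.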

Spectra’s development team has created C# and Java SDKs to support the rapid development and integration of PFR with BlackPearl storage into existing M&E environments. The SDKs provide “wrapper” calls for the REST API calls. The C# and Java client documentation and code samples can be found at:

https://github.com/SpectraLogic/tpfr_client

https://github.com/SpectraLogic/tpfr_java_client

The PFR Users Guide provides the list of supported video file formats.

The Quick Start Guide is required reading for understanding the services and locations that have to be set up before starting PFR development.

The PFR REST API definition is available for direct HTTP development.

If you are considering creating a PFR integration with a BlackPearl environment, please contact the Spectra Logic Developer Program Team.


Importing Foreign LTFS Tapes into BlackPearl

When BlackPearl writes data to tape, it uses the open Linear Tape File System (LTFS) file format. Because of this LTFS support, Spectra Logic was able to add the ability to import non-BlackPearl or “foreign” LTFS tapes to BlackPearl. This is useful for any customer that receives LTFS-formatted tapes from another source and wishes to read those same tapes in the BlackPearl environment. This workflow is particularly common in the Media and Entertainment industry as a way to transfer video files from one group to another. Since every application may utilize the open LTFS format in a different manner when writing data to tapes, it is important that the user verify that the various LTFS tape formats will be properly imported for read-only use in the BlackPearl environment.

BlackPearl will support the import of foreign LTFS tapes in version 3.5, which is due out in Q1 of 2017. Importing can be done manually via the BlackPearl web management interface or via an external application that calls the BlackPearl API. The “import” process we discuss herein assumes that the foreign LTFS tapes have already been physically imported into the tape library partition to which BlackPearl is connected. Additionally, when a foreign LTFS tape is added to a BlackPearl tape partition, the write-protect switch on the tape cartridge must be set to “read only” before BlackPearl will allow the tape to be imported.  We strongly recommend that the tape be kept in read-only mode so that BlackPearl will not be able to modify the tape in any way. Also note that the API import commands must be initiated by the administrator or “spectra” user.

To manually import foreign LTFS tapes via the web management interface of BlackPearl, users will go to the Tape Management page (Status > Tape Management), click on the tape(s) to be imported, and then go to the Action menu and select Import Foreign Tape.

The tapes can also be imported via an external application that calls the BlackPearl API. While there are several possible workflows that can be taken to import the tapes into BlackPearl by an application, this is one we are currently recommending to our partners:

  1. Application calls API command Get Tapes to get a list of all tapes and their state. Tapes with a state of LTFS_WITH_FOREIGN_DATA should be recorded along with their barcodes.
  2. Application calls API command Raw Import All Tapes to start the process of importing the foreign LTFS tapes. Make sure to make this call using the admin user “spectra”. The application must provide a bucket name in which to import the files on the foreign LTFS tapes. Alternatively, the application can call Raw Import Tape to import individual foreign LTFS tapes based on the barcodes recorded in Step 1 above. A bucket name must be provided each time this command is called, so this method allows the application to import different tapes into different buckets if desired.
  3. Application should periodically call API command Get Tapes again to check the state of all tapes. Once there are no more tapes with a state of LTFS_WITH_FOREIGN_DATA, all foreign LTFS tapes have been successfully imported.
  4. Application calls API command Get Physical Placement for Object Parts on Tape to get a list of all files that are on each LTFS tape, using the barcodes recorded in Step 1 above as the input parameter for this call. The application can then read each of these files if needed, such as to create video proxies or to copy them to another location in BlackPearl. If the application doesn’t need to read the files, it can simply record the list of file names.
  5. If the application no longer needs the files on the foreign LTFS tapes, for example because it has read the files and copied them to another bucket in BlackPearl, it can then issue an Eject Tape command for each tape.
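Steps 1 through 3 of this workflow can be sketched as follows. This is only an outline under stated assumptions: `get_tapes` and `raw_import_tape` are hypothetical callables standing in for the Get Tapes and Raw Import Tape API commands, and each tape is assumed to be represented as a dictionary with "barcode" and "state" keys. The SDKs expose these operations through their own types.

```python
import time

FOREIGN_STATE = "LTFS_WITH_FOREIGN_DATA"


def foreign_barcodes(tapes):
    """Return the barcodes of all tapes still reported as foreign LTFS tapes."""
    return [tape["barcode"] for tape in tapes if tape["state"] == FOREIGN_STATE]


def import_foreign_tapes(get_tapes, raw_import_tape, bucket,
                         poll_seconds=30.0, max_polls=120, sleep=time.sleep):
    """Import every foreign LTFS tape into `bucket` and wait until none remain."""
    # Step 1: record the barcodes of tapes that hold foreign data.
    barcodes = foreign_barcodes(get_tapes())

    # Step 2: request the import of each tape into the chosen bucket.
    for barcode in barcodes:
        raw_import_tape(barcode, bucket)

    # Step 3: poll until no tape is in the foreign state any more.
    for _ in range(max_polls):
        if not foreign_barcodes(get_tapes()):
            return barcodes
        sleep(poll_seconds)
    raise TimeoutError("foreign LTFS tapes were still being imported")
```

The returned barcodes are the input for step 4, where the application lists the files on each imported tape.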

As you can see, it is quite easy for applications to work with foreign LTFS tapes in BlackPearl. It is important to remember that the user and/or developer need to confirm that the user’s LTFS tapes can be read by BlackPearl by testing them out. Also, note that BlackPearl will not generate a checksum for the files on foreign LTFS tapes as it does for other files that are imported into BlackPearl. We also don’t currently support importing foreign LTFS tapes that use “tape spanning”, or spanning one file across multiple tapes.

Foreign LTFS tape import is supported in the released Java SDK 3.4.0-RC1, and it will be supported by any BlackPearl hardware with code 3.5 and higher. We will soon (in the next two months) also have the foreign LTFS tape import operations available in our other SDKs (C#/.NET, Python, C). We will also soon update our BlackPearl API documentation to include the new foreign LTFS tape import operations.

If you want help working with foreign LTFS tapes in your BlackPearl integration, please contact the Developer Program.


Developer Summit 2016 Recording and Slides

On November 10, 2016, we held our second annual BlackPearl Developer Summit. We have provided a recording, slides, and agenda from the Summit below.

Summit Recording

Slides

Agenda

  • Corporate updates from Spectra CEO Nathan Thompson
  • BlackPearl product overview and enhancements
  • Learn what tools are available to help develop a BlackPearl client
  • Partner presentation detailing client development
  • Learn how easy it is to integrate an existing BlackPearl client into your workflow
  • Demonstration of Avid PAM and BlackPearl integration
  • Demonstration of CatDV BlackPearl integration
  • Question and answer with our BlackPearl Developer Program Team

Second Annual BlackPearl Developer Summit – Nov 10 2016

Thursday, November 10, 2016, 9:00 a.m. MST (UTC -7), WebEx

Join us for Spectra Logic’s second annual BlackPearl Developer Summit, a virtual conference for current and potential developers of Spectra® BlackPearl® Deep Storage Gateway.

Event Agenda:

  • Corporate updates from Spectra CEO Nathan Thompson
  • BlackPearl product overview and enhancements
  • Learn what tools are available to help develop a BlackPearl client
  • Partner presentation detailing client development
  • Learn how easy it is to integrate an existing BlackPearl client into your workflow
  • Demonstration of Avid PAM and BlackPearl integration
  • Question and answer with our BlackPearl Developer Program Team



Python SDK 3.0 Released

We have released version 3.0.0 of our BlackPearl Software Development Kit (SDK) for the Python programming language. It is now available for download on GitHub. You can also view code documentation, code examples, and installation instructions on our Documentation page.

This new SDK is compatible with our current BlackPearl 3.0.1 release and allows access to all BlackPearl 3.0 API commands. Note that the previous version of the Python SDK (1.2.0) is also compatible with the current BlackPearl 3.0.1 release but does not give access to all API commands.

Significant code structure changes were made between the 1.2.0 and 3.0.0 releases of the Python SDK. If you will be upgrading an existing client from 1.2.0 to 3.0.0, you should read our Migration Guide, which describes the changes in more detail. We do not anticipate major code structure changes in the future.

The previous 1.x versions of the Python SDK required that the C SDK also be installed. The new 3.0.0 Python SDK no longer depends on the C SDK and therefore no longer requires that the C SDK be installed.

We call this 3.0.0 release a “Pre-Release” because, while it has been tested internally, it is not yet in use by any of our partners or customers.

With this SDK release, all four of our SDKs -- Python, C, Java, and .NET/C# -- are now up to date with BlackPearl 3.x code and support all BlackPearl API commands.

If you have any questions or run into any problems, please post your questions to our Forum.