BlackPearl and Checksums

Spectra Logic’s highest priority is protecting our customers’ data. BlackPearl and the tape libraries that sit behind it have a number of features to ensure that your data stays protected. Data integrity checking starts as soon as data is ingested by BlackPearl. Client applications sending data to BlackPearl can pass in a checksum to ensure that the data arrives safely. A checksum acts as a fingerprint for the file and can be used to make sure that the file received by BlackPearl is the same file that the client sent. BlackPearl accepts MD-5, CRC-32, CRC-32C, SHA-256, or SHA-512 checksums. The client provides the checksum type and value in a header of the API PUT operation, for example:

Content-MD5: 37r//gvw/aB3GmilbcUJpg==

Checksums can also be calculated and supplied using our C, C#/.NET, and Java Software Development Kits.
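The header value shown above has the shape of a Base64-encoded MD5 digest, which is the usual encoding for a Content-MD5 header. The following is a minimal sketch of how a client could produce such a value; the payload and the `content_md5` helper name are made up for illustration and are not part of any Spectra SDK.

```python
import base64
import hashlib


def content_md5(payload: bytes) -> str:
    """Return the Base64-encoded MD5 digest of payload (Content-MD5 style)."""
    digest = hashlib.md5(payload).digest()  # 16 raw bytes, not the hex string
    return base64.b64encode(digest).decode("ascii")


# Hypothetical payload; a real client would hash the bytes it is about to PUT.
payload = b"example object data"
headers = {"Content-MD5": content_md5(payload)}
```

Note that the digest is encoded from its raw bytes, not from its hexadecimal text form; encoding the hex string would produce a different, longer value.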

BlackPearl uses the checksum provided for each file to ensure that the file it received is the same as the file the client sent. If the checksum does not match the file that was received, BlackPearl will return an error (400 – bad digest) to the client. If the client does not pass in a checksum with the file, BlackPearl will automatically create a checksum for the file. By default this checksum type will be MD-5, though the checksum type can be changed in the data policy settings of the bucket (a bucket is a top-level container for objects/files in BlackPearl). BlackPearl then stores the checksum, whether generated by the client or by BlackPearl itself, both in its object database and with the file on tape.

Once stored by BlackPearl, the checksum can be used internally to make sure the file is still valid when it is recalled from tape. BlackPearl will automatically verify the checksum if the file is requested by a client (GET) and the file must come from tape. The checksum is verified both as the file is coming from tape and after it has landed on the BlackPearl cache. BlackPearl will also provide the checksum value to the client so that the client can verify that it successfully received the file as well.

In some cases, when using the Bulk PUT operations, the client may be required to break the object into multiple “blobs” to upload it to BlackPearl. In this case, a checksum will be used for each blob uploaded to BlackPearl. The same is true when using Multi-Part Upload – each part of the file will have its own checksum. BlackPearl also supports Partial-File Restore, which is the ability to restore (GET) part of a file. With Partial-File Restore, a client can specify, for example, that it wants to retrieve the first 2GB of a 10GB file. In order to perform a checksum in this case, BlackPearl must first retrieve the entire file (or chunk) from tape. Once BlackPearl has completed the checksum on the file or chunk, it can send the partial file to the client.
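To illustrate the per-blob idea, here is a small sketch that splits an object into fixed-size blobs and computes a checksum for each one. The blob size, the helper names, and the use of MD5 hex digests are assumptions made for the example; in practice the blob boundaries and checksum type depend on the BlackPearl job and data policy.

```python
import hashlib
from typing import Iterator, List, Tuple


def split_into_blobs(data: bytes, blob_size: int) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, blob) pairs covering data in blob_size pieces."""
    if blob_size <= 0:
        raise ValueError("blob_size must be positive")
    for offset in range(0, len(data), blob_size):
        yield offset, data[offset:offset + blob_size]


def blob_checksums(data: bytes, blob_size: int) -> List[Tuple[int, int, str]]:
    """Return (offset, length, md5 hex digest) for every blob."""
    return [
        (offset, len(blob), hashlib.md5(blob).hexdigest())
        for offset, blob in split_into_blobs(data, blob_size)
    ]


# Tiny hypothetical object split into 4-byte blobs: "abcd", "efgh", "ij".
manifest = blob_checksums(b"abcdefghij", blob_size=4)
```

Each blob's checksum covers only that blob's bytes, so none of the values in the manifest equals the checksum of the whole object.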

Because BlackPearl may not always calculate the checksum for an entire file (the file may be broken up into multiple pieces), developers may want to have their clients calculate the checksum of the entire file themselves. This value could then be stored in the file’s metadata when it is uploaded to BlackPearl. When the file is later retrieved, the client could calculate the checksum again and compare the values.
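A client-side whole-file checksum could look something like the sketch below. It reads the file in fixed-size blocks so that large files do not need to fit in memory. The choice of SHA-256, the function names, and the `whole-file-sha256` metadata key are assumptions for illustration only; how metadata is attached to an object depends on the SDK or client you use.

```python
import hashlib
import io
from typing import BinaryIO


def whole_file_sha256(stream: BinaryIO, read_size: int = 1024 * 1024) -> str:
    """Hash everything readable from stream, one block at a time."""
    digest = hashlib.sha256()
    for block in iter(lambda: stream.read(read_size), b""):
        digest.update(block)
    return digest.hexdigest()


def matches_stored_checksum(stream: BinaryIO, stored_hex: str) -> bool:
    """Recompute the checksum after retrieval and compare it with the stored one."""
    return whole_file_sha256(stream) == stored_hex.strip().lower()


# Before upload: compute the value and keep it under a hypothetical metadata key.
metadata = {"whole-file-sha256": whole_file_sha256(io.BytesIO(b"abc"))}

# After retrieval: hash the downloaded bytes again and compare.
intact = matches_stored_checksum(io.BytesIO(b"abc"), metadata["whole-file-sha256"])
```

For a file on disk, pass a handle opened in binary mode, for example `open(path, "rb")`, in place of the `io.BytesIO` objects used here.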


Inaugural BlackPearl Developer Summit

Tuesday, October 20, 2015 9:00 AM Mountain Time (15:00 UTC), WebEx
Join us for Spectra Logic’s inaugural BlackPearl Developer Summit, a virtual conference for current and potential Spectra Logic developers. You’ll get product updates from our CEO and BlackPearl product manager, and you will learn how these new features will help customers and developers. You will learn how to build a Spectra S3 client for BlackPearl, our private cloud gateway to our tape and disk storage systems. You will see how one of our partners developed a client and watch it in action. And you will get to put your questions to our BlackPearl Engineering team. Don’t miss it! Learn more and register at spectralogic.com/developerconference


BlackPearl 1.2 to Be Released This Week

BlackPearl software version 1.2 will be released later this week and should start showing up as an update option in the BlackPearl management web interface next week. The 1.2 update includes enhancements to the BlackPearl management web interface, support for new features in our Deep Storage Browser (formerly DS3 Browser) release version 1.2.1, and a number of bug fixes.

To prepare for this release, we have updated the BlackPearl Simulator to Version 1.2 so you can test out this latest code. The Deep Storage Browser version 1.2.1 is now also available for download.

The Deep Storage Browser is our simple drag-and-drop, FTP-like client for BlackPearl. The new 1.2.1 version of this free, open-source Spectra S3 client has a number of improvements, including:

  • Search for objects on BlackPearl, including with the wildcards percent (%) and underscore (_), as in SQL
  • Upload/download with arrow/click icons
  • Dragging a directory from BlackPearl to the local machine no longer brings the parent directory over as well
  • Logging
  • Folder delete on BlackPearl
  • Multi-file delete on BlackPearl

These are the features that were most requested by our users. If you have any other ideas to improve the Deep Storage Browser, please Contact Us or use our Google Group.

We hope you like the new versions of BlackPearl and the Deep Storage Browser.


Demo: Building a Spectra S3 BlackPearl Client Application

I have created a new video that shows how to create a demo Windows desktop BlackPearl client application using our .NET/C# Software Development Kit (SDK). Anybody can create this client in less than 15 minutes. You do not need an actual BlackPearl gateway; you can use our BlackPearl simulator. All the other tools you need are free to download. Give it a try.

Here are the final Visual Studio project files for the demo client that we build.

 


BlackPearl Spectra S3 Job Priority

BlackPearl acts as a caching gateway in front of Spectra Logic’s tape libraries. Typical client applications will both send groups of files to (PUT) and request groups of files from (GET) BlackPearl. BlackPearl uses “Jobs” to contain and keep track of these individual input/output operations.

BlackPearl is capable of managing many jobs at once. Jobs have a selectable “Priority” for processing the job so that client applications can have some control over the resources assigned to each job. The job priority settings are only relevant when the cache or tape drive resources are constrained, in which case the BlackPearl is said to be “throttled”. If there are no resource constraints, then all jobs will be processed equally.

Files moved by a job are broken into one or more “chunks” for processing. Cache space is required to store each chunk. The job priority determines how many chunk cache spaces a job can use at any one time. The more chunk cache spaces that are available to a job, the more chunks the job can upload or download at any one time. Job priority can also affect how many tape drives a job can utilize at any one time. It takes at least two chunks to efficiently and continuously feed a tape drive.

Job Priority Values & Chunk Allocation

Client applications can set a job to one of the following priorities. For chunk allocation, these priorities are only applicable if the BlackPearl is throttled.

  • Low – Low priority jobs get a maximum of four chunk cache spaces at any one time.
  • Normal (default for PUT) – Normal priority jobs get a maximum number of chunk cache spaces at any one time equal to the greater of (a) eight or (b) two times the number of tape drives.
  • High (default for GET) – High priority jobs get a maximum number of chunk cache spaces at any one time equal to the greater of (a) sixteen or (b) three times the number of tape drives.
  • Urgent – Urgent priority jobs get special prioritization. An Urgent job is exempt from any maximum limitations. It can use all available chunk cache space that it requests.

When jobs are requesting cache space for their chunks, and there are only a limited number of spaces available, the job that asks for the spaces first will get them, but no more than the maximums described above.
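The allocation limits described above can be written out as a short sketch. This is only a model of the rules as stated in this post, not BlackPearl's actual code; the function names and the `held`, `requested`, and `free` parameters are hypothetical.

```python
from typing import Optional


def max_chunk_spaces(priority: str, tape_drives: int) -> Optional[int]:
    """Return the chunk cache space cap for a client job priority.

    None means "no cap" (Urgent jobs are exempt from the maximums).
    """
    priority = priority.lower()
    if priority == "low":
        return 4
    if priority == "normal":
        return max(8, 2 * tape_drives)
    if priority == "high":
        return max(16, 3 * tape_drives)
    if priority == "urgent":
        return None
    raise ValueError(f"unknown client job priority: {priority!r}")


def spaces_granted(priority: str, tape_drives: int,
                   held: int, requested: int, free: int) -> int:
    """How many of the requested spaces a throttled system could hand out."""
    cap = max_chunk_spaces(priority, tape_drives)
    headroom = requested if cap is None else max(0, cap - held)
    return min(requested, headroom, free)
```

For example, with six tape drives a Normal job is capped at twelve spaces (two times six is greater than eight), while with two drives the cap stays at eight.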

BlackPearl can also create its own system jobs with Critical and Background priority. These priority values are not available for jobs created by client applications, but you may see them on the Jobs Status Screen on the BlackPearl web interface.

  • Critical – The job must be executed immediately and cannot wait. This is typically used for tape drive cleaning operations.
  • Background – The job can be done when resources are available. This is typically used for tape inspection and reclamation.

Prioritization for Tape Read/Write Operations

The prioritization for tape drive read and write operations for chunks works in this order of preference:

  1. Is the priority of the chunk “Urgent”? If yes, it goes before all others (this prioritization preference was introduced in BlackPearl 3.3).
  2. Can the chunk use a tape that’s already in a drive? If yes, it goes before a chunk that can’t.
  3. Is the chunk’s priority higher (excluding Urgent priority)? If yes, it goes before lower priority chunks.
  4. Can the chunk proceed without allocating another tape to itself? If yes, it goes before chunks requiring another tape.
  5. Has the chunk been waiting longer in queue? If yes, it goes before newer chunks.

Note that no job, not even an Urgent priority job, can stop or kill other active tape read or write operations.
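One way to picture the order of preference above is as a sort key in which each rule only breaks ties left by the rules before it. The sketch below takes that interpretation as an assumption; the `Chunk` fields and names are hypothetical, and the system-only Critical and Background priorities are left out because their place in this ordering is not described here.

```python
from dataclasses import dataclass
from typing import Tuple

# Rank used by rule 3; Urgent is handled separately by rule 1.
PRIORITY_RANK = {"low": 0, "normal": 1, "high": 2}


@dataclass
class Chunk:
    name: str
    priority: str          # "low", "normal", "high", or "urgent"
    tape_in_drive: bool    # rule 2: can use a tape that is already in a drive
    needs_new_tape: bool   # rule 4: must allocate another tape to proceed
    queued_at: float       # rule 5: smaller means it has waited longer


def tape_order_key(chunk: Chunk) -> Tuple[bool, bool, int, bool, float]:
    """Smaller keys sort first; False sorts before True."""
    urgent = chunk.priority == "urgent"
    rank = 0 if urgent else PRIORITY_RANK[chunk.priority]
    return (not urgent, not chunk.tape_in_drive, -rank,
            chunk.needs_new_tape, chunk.queued_at)


chunks = [
    Chunk("a", "normal", tape_in_drive=False, needs_new_tape=True, queued_at=1.0),
    Chunk("b", "high", tape_in_drive=False, needs_new_tape=False, queued_at=2.0),
    Chunk("c", "low", tape_in_drive=True, needs_new_tape=False, queued_at=3.0),
    Chunk("d", "urgent", tape_in_drive=False, needs_new_tape=True, queued_at=4.0),
    Chunk("e", "high", tape_in_drive=False, needs_new_tape=False, queued_at=1.5),
]
order = [c.name for c in sorted(chunks, key=tape_order_key)]
```

In this example the Urgent chunk sorts first, followed by the Low chunk whose tape is already in a drive, which shows that rule 2 outranks ordinary priority.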

Other Ways to Improve Job Performance

If the goal is to improve performance on a PUT operation, then the job’s Write Optimization can be set to “Performance” instead of the default “Capacity”. Using “Performance” means that the job data may be written to more tapes simultaneously, which allows for faster writes.

The BlackPearl cache can also be set to its “Performance” configuration rather than its default “Capacity” configuration. This will significantly improve the performance for all jobs. See the “Configure the Cache” section of the BlackPearl User Guide.


Using the Java Command-Line Interface

The Java Command Line Interface (CLI) is a simple but powerful Spectra S3 client for BlackPearl. As the name implies, it allows users to manage files in BlackPearl using a simple command-line interface. The Java CLI, which is both free and open source, was created by Spectra Logic using our Java Software Development Kit and works on Windows, Mac, and Linux.

We have created a Java CLI Reference and Examples page to assist you with using the Java CLI. We find that one of the most common uses of the Java CLI by our customers is moving files to BlackPearl on a periodic basis using a scheduled task or cron job. Our examples therefore show how to use the Java CLI and 7zip (7zip.org) to zip and move files to BlackPearl. The files can then be retrieved from BlackPearl if needed using the Java CLI or the Deep Storage Browser (formerly DS3 Browser), another free and open-source Spectra S3 client created by Spectra Logic.

The Java CLI works like many other command-line interfaces. There is a base call (ds3_java_cli) with options to control behavior. There are a total of 13 different commands currently available in the CLI. Here is an example of a Java CLI command to move a folder from a server to BlackPearl:

ds3_java_cli -e bp1.acmecorp.com -a amVmZmJy -k iYkqUwcd -c put_bulk -b bucket1 -d "c:\Temp\subdir1"

In this example, the command put_bulk is used to move server directory subdir1 to a bucket called bucket1 on the BlackPearl located at bp1.acmecorp.com. The -a is the BlackPearl user’s S3 Access ID and the -k is the user’s S3 Secret Key.
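If the command is launched from a script, for instance as part of a scheduled task, it can help to assemble the arguments as a list so that paths containing spaces or backslashes do not need shell quoting. The sketch below only builds the argument list from the options described above; the `put_bulk_command` helper is hypothetical, and the name of the CLI launcher may differ depending on how the Java CLI is installed on your platform.

```python
from typing import List


def put_bulk_command(endpoint: str, access_id: str, secret_key: str,
                     bucket: str, directory: str) -> List[str]:
    """Build the put_bulk argument list using the options from the example."""
    return [
        "ds3_java_cli",
        "-e", endpoint,     # BlackPearl endpoint
        "-a", access_id,    # S3 Access ID
        "-k", secret_key,   # S3 Secret Key
        "-c", "put_bulk",   # command to run
        "-b", bucket,       # destination bucket
        "-d", directory,    # local directory to move
    ]


command = put_bulk_command("bp1.acmecorp.com", "amVmZmJy", "iYkqUwcd",
                           "bucket1", r"c:\Temp\subdir1")

# To actually run it, something like the following could be used:
#   import subprocess
#   subprocess.run(command, check=True)
```

Passing the list directly to `subprocess.run` avoids going through a shell, so the directory path is handed to the CLI exactly as written.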

You can optionally set system environment variables for the S3 Access ID (-a), S3 Secret Key (-k), endpoint (-e) and HTTP proxy (-x, not shown).

There are more examples and information located on the Java CLI Reference and Examples page. As always, if you have any questions or need assistance, please use our Forum or Contact Us.


New Developer Website and New Vision

This week Spectra Logic released a new version of the Spectra Logic Developer Program website.

One reason for launching the new site was to update the look and feel to match our recently updated corporate website. Besides looking great, the new sites work on all devices and have some nice interactive features.

But the main reason for the new version of the Developer site was to make our DS3 client development tools easier to download and access. DS3 is our deep storage protocol, an extension of S3, which allows users to move data to deep storage using simple HTTP commands. This protocol is made possible by our BlackPearl Deep Storage Gateway. Now anyone can access all the tools needed to create a DS3 application, including a BlackPearl Simulator, Software Development Kits, sample clients, and documentation.

Additionally, Spectra Logic has decided to make its BlackPearl DS3 software development kits (SDKs) and clients both freely downloadable and open source. This has been done for several reasons:

  • To make it as easy as possible for developers, a group which consists of both customers and partners, to develop DS3 clients.
  • To allow developers to modify the software for their particular needs, and to contribute those modifications back to the community.
  • Developers in the web service and cloud storage communities have an expectation that software tools will be freely available and open source. We want to be a trusted partner in that community.
  • Spectra Logic is in the business of selling deep storage hardware. More DS3 clients means that we can provide deep storage hardware to more customers.

We hope that you now find it easier than ever to develop DS3 clients. As always, we look forward to your feedback.

Follow me on Twitter @jeffbr


NAB: See BlackPearl, Learn About DS3 Development

Spectra Logic will be at NAB 2015 in Las Vegas, Nevada, USA from April 11-16 in Booth SL 11816 displaying our products and meeting with customers and partners. The Developer Program team will be there in full force to help out developers and spread our message. We’ll be doing demos at our booth and showing how easy it is to get started with DS3 client development. We’ll also show some of the great client applications we have built using our own software development kits (SDKs) that allow users to easily move data to tape. Please stop by and visit us if you are in town.