The assignee for this patent application is
Reporters obtained the following quote from the background information supplied by the inventors: "Conventional storage management and file relocation solutions use multi-tier storage systems to balance performance and costs. At higher tiers, performance is better but the cost is higher, while at lower tiers the cost is reduced but so is performance. Accordingly, policies can be created that will move files to higher-performance storage devices based on, for example, input/output (I/O) 'temperature' (e.g., frequency of access) or service requirements (e.g., service level agreements), while unimportant or out-of-date files can be moved to less expensive storage devices without changing the way users or applications access those files.
"With dynamic storage tiering (DST), files can be dynamically moved between tiers without having to take the application offline and without changing the way an application or database accesses the information. Consequently, the move is usually transparent to the users and applications that own the files. Furthermore, as data is moved between the different storage tiers, policies can be centrally managed and dynamic, and can support a heterogeneous server and storage infrastructure that does not require modifications to application, database, or backup/recovery policies.
"Solid state devices or drives (SSDs) are used as one tier in a DST system as well as a cache for underlying hard disk device/drive (HDD) storage. As blocks of an SSD are used and reused, free blocks, block allocation and de-allocation (freeing of allocated blocks), and usage of allocated blocks are tracked. Traditional methods use free block lists or free block bitmaps to allocate free blocks and to free allocated blocks, but special care needs to be taken because SSDs have particular characteristics. For example, SSDs perform well for random/sequential reads and sequential writes, but do not perform as well for random writes because, before a region of SSD storage can be overwritten, it must first be erased. Erasing cannot be done for small-sized regions; instead, it must be done at the level of block size, which may be greater than or equal to 128 KB (kilobytes).
"To mitigate the erasure penalty, SSDs now support a TRIM command or utility. The TRIM command can be issued to the SSD to inform the SSD that a certain region is to be marked 'free.' As a result, the SSD will not try to preserve old data during its internal block remapping and garbage collection activities. The TRIM command is also able to erase the designated region as a whole, provided the region is appropriately aligned with the erasure block boundaries.
"Although mitigated by the above, some amount of penalty is still associated with the use of the TRIM command. For example, a random write to an SSD can be delayed while a decision is made whether or not a region of the SSD needs to be erased in order to provide space for the data to be written, and then further delayed while the TRIM command is executed."
In addition to obtaining background information on this patent application, VerticalNews editors also obtained the inventors' summary information for this patent application: "In an embodiment according to the present disclosure, a region of memory is logically divided into a number of segments, each of which is logically divided into a number of blocks. The memory may include only one such region, or it may include multiple such regions. In one embodiment, the memory is implemented in (using) an SSD. In one such embodiment, each segment is aligned with a respective SSD erasure block boundary. Alternatively, the memory is implemented in/using a device other than an SSD, such as but not limited to a thin provisioning array.
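The layout described in the summary, with each segment aligned on an SSD erasure block boundary, reduces to simple index arithmetic. A minimal sketch, assuming a 128 KB erasure block, a 4 KB file-system block (so 32 blocks per segment), and an illustrative region size; none of these values come from the patent text:

```python
# Hypothetical geometry: values chosen for illustration only.
ERASE_BLOCK_BYTES = 128 * 1024   # SSD erasure granularity (>= 128 KB per the text)
FS_BLOCK_BYTES = 4 * 1024        # assumed file-system block size
BLOCKS_PER_SEGMENT = ERASE_BLOCK_BYTES // FS_BLOCK_BYTES   # 32 blocks
SEGMENTS_PER_REGION = 256        # assumed region size

def block_to_segment(block_index):
    """Segment containing a given block; each segment starts on an erasure block boundary."""
    return block_index // BLOCKS_PER_SEGMENT

def segment_start_byte(segment_index):
    """Byte offset of a segment; always a multiple of the erasure block size."""
    return segment_index * ERASE_BLOCK_BYTES

# Because segments are erasure-block aligned, a whole segment can be
# erased (e.g., via TRIM) in one operation without straddling boundaries.
assert segment_start_byte(3) % ERASE_BLOCK_BYTES == 0
```

The alignment is what lets a later TRIM target one segment as a unit.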
"In embodiments according to the present disclosure, blocks are allocated sequentially (contiguously). In one embodiment, a first (head) pointer and a second (tail) pointer demarcate the section of the blocks that have been allocated. As additional blocks are allocated as a result of writes to the designated region, the additional blocks are sequentially allocated beginning at the tail pointer. If the section of allocated blocks reaches the end of the designated region, then blocks are subsequently allocated beginning at the start of the region. In other words, the allocation of blocks continues sequentially to the end of the designated region, then wraps around to the beginning of the region. Thus, according to embodiments of the present disclosure, the memory device (e.g., SSD, thin provisioning array, etc.) can be used as a circular buffer or log-structured device where writes are performed sequentially. In other words, random writes are not necessary and can be avoided.
"In the present embodiment, as additional blocks are allocated, the tail pointer is moved so that it remains at the end of the section of allocated blocks and the position of the head pointer remains fixed, at least temporarily. At some point, as blocks continue to be allocated, the location of the tail pointer may approach the location of the head pointer. In the present embodiment, if the tail pointer is within a threshold distance of the head pointer, then the head pointer is moved from its current position (e.g., from a first segment or block address) to a new position (e.g., to a second segment or block address), and the allocated blocks that are between the current position and the new position are freed (erased and/or marked free).
"In one embodiment, the distance the head pointer is moved is equivalent to the length of a segment (e.g., the head pointer is moved the number of blocks that are in a segment, from one erasure block boundary to the next erasure block boundary). In one such embodiment, allocated blocks are freed using the TRIM command.
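The head/tail scheme of the preceding paragraphs can be sketched in a few lines. This is an illustrative model only, with hypothetical sizes (four segments of eight blocks, a four-block threshold); the `freed_segments` list stands in for issuing a TRIM on an erasure-aligned segment, and a real system would first migrate the segment's data to another tier:

```python
class CircularRegion:
    """Sketch of the circular (log-structured) allocation scheme described above."""

    def __init__(self, num_segments=4, blocks_per_segment=8, threshold=4):
        self.size = num_segments * blocks_per_segment
        self.blocks_per_segment = blocks_per_segment
        self.threshold = threshold
        self.head = 0             # start of the allocated section
        self.tail = 0             # one past the end of the allocated section
        self.freed_segments = []  # segments "TRIMmed" so far (for illustration)

    def _gap_to_head(self):
        # Free blocks remaining before the tail would collide with the head;
        # head == tail means the region is empty (fully free) in this sketch.
        return (self.head - self.tail) % self.size or self.size

    def allocate(self, n):
        """Allocate n blocks sequentially, wrapping at the end of the region.
        Assumes n is small relative to the region size."""
        while self._gap_to_head() <= n + self.threshold:
            self._free_head_segment()
        start = self.tail
        self.tail = (self.tail + n) % self.size
        return [(start + i) % self.size for i in range(n)]

    def _free_head_segment(self):
        # Tail is within the threshold of the head: advance the head one full
        # segment (one erasure block) and free the blocks behind it in advance
        # of the write, which is what reduces write latency.
        self.freed_segments.append(self.head // self.blocks_per_segment)
        self.head = (self.head + self.blocks_per_segment) % self.size
```

For example, after allocating 10 blocks, a further 20-block allocation brings the tail within the threshold of the head, so segment 0 is freed and the head advances to block 8 before the write proceeds.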
"Advantageously, blocks can be freed (e.g., using the TRIM command) in advance of when they are actually needed. In other words, the TRIM command, for example, can be executed when the tail pointer is within a threshold distance of the head pointer, as mentioned above. Thus, blocks can be freed in anticipation of a write operation. As a result, write latency is reduced.
"In one embodiment, the memory device (e.g., SSD, thin provisioning array, etc.) is used as one of the tiers in a multi-tier storage system such as a DST system. In such an embodiment, before the data in the blocks about to be freed is overwritten or erased, it is first copied to another tier of the multi-tier storage system. For example, if an SSD device is used as a cache for an HDD, then the data can be copied from the SSD to the HDD, in which case the file system directs subsequent accesses to that data to the HDD.
"As described above, a block might be allocated to a first file, then subsequently freed and overwritten with data for a second file. Therefore, before a block is read to obtain data for a file, the validity of that data for the file is checked. In one embodiment, this is accomplished by maintaining a generation count for each block or segment of blocks. The value of the generation count when a block is allocated to a file is stored in a data structure (e.g., an inode) that contains information about that file, and that value is also stored in a structure maintained by the file system (e.g., a bmap). If the block is subsequently allocated to and overwritten with data for a second file, then the generation count value for the block is incremented, the new generation count value is stored in the data structure (e.g., inode) for the second file, and the file system structure (e.g., the bmap) is updated accordingly. Before a read operation is performed for the first file, the value of the generation count in the data structure (e.g., inode) for the first file is compared to the value of the generation count in the file system structure (e.g., the bmap). If the two values are different, then the data is invalid for the first file; if the two values are the same, then the data is valid for the first file. As noted above, the data for the first file can be migrated to another device before it is erased/overwritten, in which case the file system directs the read operation to the other device.
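The generation-count comparison can be sketched as follows. The `bmap` dictionary stands in for the file system's per-block structure and the inode dictionaries for the value recorded at allocation time; all names are illustrative, not taken from the patent text:

```python
# block number -> current generation count (stand-in for the file system's bmap)
bmap = {}

def allocate_block(block, inode):
    """Record the block's current generation in the allocating file's inode."""
    inode[block] = bmap.setdefault(block, 0)

def reallocate_block(block, new_inode):
    """Block is reused for another file: bump the generation count and
    record the new value in the new file's inode."""
    bmap[block] = bmap.get(block, 0) + 1
    new_inode[block] = bmap[block]

def is_valid(block, inode):
    """A read is valid only if the inode's generation matches the bmap's."""
    return inode.get(block) == bmap.get(block)

file1, file2 = {}, {}
allocate_block(7, file1)       # block 7 holds data for file 1
reallocate_block(7, file2)     # later freed and overwritten for file 2
assert is_valid(7, file2)      # valid for file 2
assert not is_valid(7, file1)  # stale for file 1; the read is redirected
```

The stale reference in `file1` is detected at read time, without ever walking all inodes to invalidate it.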
"A generation count, used as just described, is beneficial because it allows the affected portion (e.g., block or segment) of the memory (e.g., SSD, thin provisioning array, etc.) to be invalidated without having to locate and then invalidate all references to that portion. For example, in a sequence (e.g., a segment) of blocks, some of the blocks can be allocated to a first file, others to a second file, still others to a third file, and so on. If that segment is overwritten, it is not necessary to locate all files that previously made reference to that sequence of blocks and then invalidate all of those files' references to those blocks. Instead, using the generation count as described above, a determination about the validity of the data is built into the read process. Consequently, it is not necessary to seek out invalid references because an invalid reference can be automatically detected as part of the read operation. Essentially, files are notified that the data is now invalid on a need-to-know basis. Thus, computational resources are conserved.
"In one embodiment, the generation count is assigned a maximum value; if the maximum value is reached, then the generation count is reset to its initial value (e.g., zero) and then incremented upward from the initial value. Data may be stored in the memory (e.g., SSD, thin provisioning array, etc.) for a long period of time and so it is possible that the generation count may be reset and then incremented to a value that matches one of the values previously associated with an allocated block. In other words, if the designated region of the memory is wrapped around repeatedly, then the generation count may be incremented enough times that the count rolls around to a previous value. Thus, for example, generation count g1 may be associated with block 1 initially allocated to file 1, the generation count may eventually roll back around to g1, and then the same generation count g1 may be associated with block 1 when it is allocated to file 2. Consequently, although the data in block 1 is no longer valid for file 1, that might not be detected. Different mechanisms can be used to prevent that from happening. For example, in one embodiment, if the generation count reaches a first threshold value (e.g., that is less than the maximum value), then the file-related data structures (e.g., the inodes) can be accessed, and any references to generation counts that are less than a second threshold value (which may be equal to or less than the first threshold value) can be proactively removed from those data structures.
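The rollover and proactive-cleanup mechanism just described can be illustrated with deliberately small values. This is a sketch under assumptions: the maximum count, both thresholds, and the inode representation are chosen for demonstration only:

```python
# Illustrative values only; real counters would be much larger.
GEN_MAX = 15       # assumed maximum generation count before reset
CLEANUP_AT = 12    # assumed first threshold: triggers proactive cleanup
STALE_BELOW = 12   # assumed second threshold: references below it are purged

def next_gen(gen):
    """Increment the generation count, rolling over to the initial value (0)
    once the maximum is reached."""
    return 0 if gen >= GEN_MAX else gen + 1

def maybe_cleanup(current_gen, inodes):
    """As the count nears its maximum, purge old generation references from
    the file data structures so a rolled-over count cannot collide with a
    stale reference that happens to carry the same value."""
    if current_gen >= CLEANUP_AT:
        for inode in inodes:
            for block, gen in list(inode.items()):
                if gen < STALE_BELOW:
                    del inode[block]   # stale reference removed proactively
```

After cleanup, no inode holds a generation value old enough to collide with a post-rollover count, so the read-time comparison stays sound.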
"In summary, the memory device (e.g., SSD, thin provisioning array, etc.) can be used as a circular buffer or log-structured device where writes are performed sequentially. Advantageously, blocks in the memory can be freed (e.g., using the TRIM command) in advance of when they are actually needed, reducing write latency. Use of a generation count allows a portion of the memory to be invalidated without having to locate and then invalidate all references to that portion, conserving computational resources. These and other objects and advantages of the various embodiments of the present disclosure will be recognized by those of ordinary skill in the art after reading the following detailed description of the embodiments that are illustrated in the various drawing figures.
BRIEF DESCRIPTION OF THE DRAWINGS
"The accompanying drawings, which are incorporated in and form a part of this specification and in which like numerals depict like elements, illustrate embodiments of the present disclosure and, together with the description, serve to explain the principles of the disclosure.
"FIG. 1 is a block diagram of an example of a computer system upon which embodiments according to the present disclosure can be implemented.
"FIG. 2 is a block diagram of an example of a network architecture capable of implementing embodiments according to the present disclosure.
"FIG. 3 is a block diagram of an example of a multi-tier storage system upon which embodiments according to the present disclosure may be implemented.
"FIGS. 4 and 5 are block diagrams illustrating elements of a file system in embodiments according to the present disclosure.
"FIG. 6 illustrates a file stored in a tier of a multi-tier storage system in an embodiment according to the present disclosure.
"FIGS. 7A, 7B, 7C, 7D, and 7E illustrate allocating and freeing blocks in a device such as an SSD or a thin provisioning array in a multi-tier storage system such as a DST system in an embodiment according to the present disclosure.
"FIG. 8 is a flowchart of an example of a computer-implemented method for allocating and freeing blocks in a device such as an SSD or a thin provisioning array in a multi-tier storage system such as a DST system in an embodiment according to the present disclosure."
For more information, see this patent application: Ranade,
Keywords for this news article include:
Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC