From: MERC::"uunet!CRVAX.SRI.COM!RELAY-INFO-VAX" 11-OCT-1992 17:40:00.00
To: INFO-VAX@KL.SRI.COM
CC:
Subj: RE: Reclaiming deleted space within a file

I am writing a program which I use to store data in a file in variable-length records. Before anyone asks me why I don't use a product already available, trust me when I say I can't -- for whatever reason(s)... The program (or should I say file manager) works great -- I can add, delete, etc., no problem. It uses a hashing algorithm to turn record numbers into file addresses for near-constant access time. It has multiple record access tables so that I can have multiple files 'within a file'. Wow. Yippee.

My problem (besides all of the above) is that I need to do dynamic data area reclamation -- you know, when I delete a record, that record's space is 'returned to the file' so that it may be re-used for a new record. The way I have been doing it up to this point is by using two modified B-tree indexes to track the free space information.

One tree is an ascending index by file address (of the deleted record). This tree exists so that adjacent free areas can be merged. Suppose there are 10 bytes of free space at address 10 in the file and another 10 bytes of free space at address 30, and then the 10-byte record between them, at address 20, is deleted. The insertion function looks for an entry lower than 20 and finds 10; it checks whether 10 plus that entry's length (10 bytes) equals 20, and here it does, so the deleted record's 10 bytes are added to the entry at 10. It then checks whether the new value of 10 plus its length equals the next key greater than 10 (here, 30); it does, so that key is deleted and its length added as well. Instead of the keys 10 and 30, the address index now holds just the key 10, with a length of 30 bytes.

The other tree is an ascending index by length. This tree is used so that the function allocating space from the file can find a 'best fit' for the new record being written. The allocation function attempts to find a length in the index that is >= the size of the record being written. If one is found, the length of the new record is subtracted from the length in the index, the old key is deleted, and a new key is inserted (according to the new length). To make it easier: if I need to write a record that is 1024 bytes long and the first free area >= the size of the record is 2048 bytes long, the function subtracts 1024 from 2048, deletes the key for 2048, and inserts a new key, 1024.

By now, you may be asking yourself 'why am I reading this??'. Yes, this works, but I can't help feeling that there might be a better approach, and I have been unable to find one. That's why I'm asking you. Is there??
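
For concreteness, here is a minimal sketch of the coalescing step just described, using a singly linked list kept sorted by file address in place of the B-tree index (the types and names are illustrative, not from the actual program):

    #include <stdlib.h>

    struct extent {
        unsigned long addr;      /* file address of the free area        */
        unsigned long len;       /* its length in bytes                  */
        struct extent *next;     /* next free area, by ascending address */
    };

    /* Return a freed record's space to the list, merging with the
     * predecessor and/or successor when they are exactly adjacent. */
    void free_extent(struct extent **head, unsigned long addr, unsigned long len)
    {
        struct extent *prev = NULL, *cur = *head, *e;

        while (cur && cur->addr < addr) {        /* find insertion point */
            prev = cur;
            cur = cur->next;
        }

        if (prev && prev->addr + prev->len == addr) {
            prev->len += len;                    /* merge with predecessor */
            if (cur && prev->addr + prev->len == cur->addr) {
                prev->len += cur->len;           /* ...and with successor  */
                prev->next = cur->next;
                free(cur);
            }
            return;
        }

        if (cur && addr + len == cur->addr) {    /* merge with successor only */
            cur->addr = addr;
            cur->len += len;
            return;
        }

        e = malloc(sizeof *e);                   /* no adjacent neighbor */
        e->addr = addr;
        e->len = len;
        e->next = cur;
        if (prev)
            prev->next = e;
        else
            *head = e;
    }

Freeing the 10-byte record at address 20 in the example above collapses all three areas into a single entry at address 10 with length 30.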
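
And a sketch of the corresponding best-fit allocation, reusing the 'struct extent' list from the previous sketch. With a flat list the search is linear; the length-ordered tree described above is what makes the lookup logarithmic:

    /* Find the smallest free extent that still holds 'len' bytes, carve
     * the request off its front, and shrink (or remove) the extent.
     * Returns the file address, or (unsigned long)-1 if nothing fits. */
    unsigned long alloc_best_fit(struct extent **head, unsigned long len)
    {
        struct extent **best = NULL, **pp;
        unsigned long addr;

        for (pp = head; *pp; pp = &(*pp)->next)
            if ((*pp)->len >= len && (!best || (*pp)->len < (*best)->len))
                best = pp;

        if (!best)
            return (unsigned long)-1;            /* caller must extend the file */

        addr = (*best)->addr;
        (*best)->addr += len;                    /* carve from the front...  */
        (*best)->len  -= len;
        if ((*best)->len == 0) {                 /* ...or consume it exactly */
            struct extent *dead = *best;
            *best = dead->next;
            free(dead);
        }
        return addr;
    }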
There are papers, perhaps even books, devoted to the problem you describe, which is really the same problem file systems solve: allocating, freeing, and reusing space on a relatively slow disk. There are TONS of different solutions. Without knowing the details of what you are doing, it's impossible to choose among them. For example: it sounds as if you never extend existing records, and that records are contiguous in logical block space. In file system terms, that means (a) you do contiguous allocation only, and (b) file size is specified at creation time. That makes the problem MUCH simpler.

Are the files shared? How: single writer and multiple readers, or multiple writers and readers? Are the files essentially always being accessed? Do programs retain the actual file address of a record, or do they always present a record number? (If your interface routines never return the actual file address, you can be sure that no one retains it; otherwise, how much do you know about the users of your files?)

If it's possible to schedule times when the files are inactive (or at least are not being written), you can use a very simple strategy: allocate space at the end, and rebuild periodically by copying to a new file. This is by far the simplest approach all around, but may or may not be practical.

Using two indices is a clever use of a very high-powered mechanism. File systems tend to use other, much simpler, techniques. Bit maps of free blocks are a common representation that's good for finding large contiguous regions quickly (a rough sketch appears after this message). In your case, you'd have to decide on an appropriate "block" size that (a) keeps the bit map small and (b) doesn't waste too much space by rounding requests up to the next full block. Whether this is practical depends on the pattern of requests.

Offhand, I know of no system that uses best-fit allocation for disk memory; it's too expensive to implement, and it's questionable whether it really helps. Often, first fit does at least as well, and is faster. (If the free areas are on a linked list, first fit tends to put small blocks and large blocks at opposite ends of the list. In effect, you get multiple pools for free.)

You are using the index structure to get the effect of an ordered list of memory blocks. Usually, this is done by reserving a few bytes before and after each user region to let you find where in the list a block that is being freed belongs (also sketched below). This is certainly cheaper than an index -- but the code you have is probably simpler and, since the control data is kept far from the user data, perhaps more robust.

In many applications, a few request sizes are very common, and other sizes are relatively rare. In that case, a few look-aside lists of fixed-length blocks (the last sketch below) can improve performance immensely at very low cost.

Really, this problem could make an interesting centerpiece for a seminar!

-- Jerry
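
A rough sketch of the bit map Jerry mentions, with one bit per fixed-size block; the block size, map size, and names are all assumptions made for illustration:

    #include <limits.h>

    #define BLOCK_SIZE 512                  /* bytes per block (tunable)  */
    #define NBLOCKS    8192                 /* blocks covered by the map  */

    /* One bit per block, set while the block is free. At start-up the
     * map would be rebuilt by scanning the file. */
    static unsigned char freemap[NBLOCKS / CHAR_BIT];

    #define IS_FREE(b)  (freemap[(b) / CHAR_BIT] &   (1 << ((b) % CHAR_BIT)))
    #define SET_FREE(b) (freemap[(b) / CHAR_BIT] |=  (1 << ((b) % CHAR_BIT)))
    #define SET_USED(b) (freemap[(b) / CHAR_BIT] &= ~(1 << ((b) % CHAR_BIT)))

    /* Round the request up to whole blocks, then scan for a run of
     * that many contiguous free blocks. Returns the first block of
     * the run, or -1 if no run is long enough. */
    long map_alloc(unsigned long bytes)
    {
        unsigned long want = (bytes + BLOCK_SIZE - 1) / BLOCK_SIZE;
        unsigned long b, run = 0;

        for (b = 0; b < NBLOCKS; b++) {
            run = IS_FREE(b) ? run + 1 : 0;
            if (run == want) {
                unsigned long first = b - want + 1, i;
                for (i = first; i <= b; i++)
                    SET_USED(i);
                return (long)first;
            }
        }
        return -1;
    }

    void map_free(unsigned long first, unsigned long bytes)
    {
        unsigned long n = (bytes + BLOCK_SIZE - 1) / BLOCK_SIZE, i;
        for (i = first; i < first + n; i++)
            SET_FREE(i);
    }

Note that coalescing comes for free here: clearing bits on deletion automatically joins adjacent free runs, with no equivalent of the address-ordered index at all. The cost is the internal waste from rounding every request up to a whole number of blocks.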
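
A sketch of the "few bytes before and after each user region" idea (boundary tags), shown here against an in-memory stand-in for the file; the tag layout and names are assumptions:

    #include <string.h>

    #define FILE_BYTES (1L << 20)

    struct tag {                   /* written before AND after each region */
        unsigned long len;         /* user bytes between the two tags      */
        int is_free;               /* nonzero while the region is free     */
    };

    static unsigned char filebuf[FILE_BYTES];   /* stand-in for the file */

    static void read_tag(unsigned long off, struct tag *t)
    {
        memcpy(t, filebuf + off, sizeof *t);
    }

    static void write_tag(unsigned long off, const struct tag *t)
    {
        memcpy(filebuf + off, t, sizeof *t);
    }

    /* Free the region whose header tag is at 'hdr'. Both neighbors are
     * found by arithmetic on the tags alone -- no index is consulted.
     * (Real code would also relink the free list here.) */
    void free_region(unsigned long hdr)
    {
        struct tag self, other;
        unsigned long next_hdr;

        read_tag(hdr, &self);
        self.is_free = 1;

        /* Successor: its header starts right after this region's footer. */
        next_hdr = hdr + 2 * sizeof(struct tag) + self.len;
        if (next_hdr < FILE_BYTES) {
            read_tag(next_hdr, &other);
            if (other.is_free)
                self.len += 2 * sizeof(struct tag) + other.len;
        }

        /* Predecessor: its footer ends where this region's header begins. */
        if (hdr >= 2 * sizeof(struct tag)) {
            read_tag(hdr - sizeof(struct tag), &other);
            if (other.is_free) {
                hdr      -= 2 * sizeof(struct tag) + other.len;
                self.len += 2 * sizeof(struct tag) + other.len;
            }
        }

        write_tag(hdr, &self);                                    /* header */
        write_tag(hdr + sizeof(struct tag) + self.len, &self);    /* footer */
    }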
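
Finally, a sketch of the look-aside lists: a small stack of recently freed addresses for each very common record size, consulted before the general allocator. The sizes and depth are made up for illustration:

    #define NSIZES 3
    #define DEPTH  64

    static const unsigned long quick_size[NSIZES] = { 128, 512, 1024 };
    static unsigned long quick_addr[NSIZES][DEPTH];  /* parked addresses */
    static int quick_top[NSIZES];                    /* stack depths     */

    static int size_class(unsigned long len)
    {
        int i;
        for (i = 0; i < NSIZES; i++)
            if (quick_size[i] == len)
                return i;
        return -1;                      /* not a common size */
    }

    /* Try the look-aside list first; -1 means fall back to the
     * general allocator (tree, bit map, free list, ...). */
    long quick_alloc(unsigned long len)
    {
        int c = size_class(len);
        if (c >= 0 && quick_top[c] > 0)
            return (long)quick_addr[c][--quick_top[c]];
        return -1;
    }

    /* On deletion, park exact-size blocks for instant reuse. Returns 1
     * if absorbed, 0 if the general free path should run instead. */
    int quick_free(unsigned long addr, unsigned long len)
    {
        int c = size_class(len);
        if (c >= 0 && quick_top[c] < DEPTH) {
            quick_addr[c][quick_top[c]++] = addr;
            return 1;
        }
        return 0;
    }

For the common sizes, both allocation and freeing reduce to a comparison and an array access, which is where the large win Jerry describes comes from.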