Skip to content

Garbage Collector Interface

John Skaller edited this page Nov 28, 2018 · 2 revisions

The Felix GC is a complex piece of software although the actual collection algorithm is a fairly simple naive mark/sweep.

A synopsis of the public interface follows, it is not complete.

struct GC_EXTERN collector_t
{
  collector_t();
  virtual ~collector_t();

  // find the associated thread control object
  virtual ::flx::pthread::thread_control_base_t *get_thread_control()const =0;

  // allocation hook
  void *allocate(gc_shape_t *shape, size_t x);

  // The mark and sweep collector algorithm.
  size_t collect();

  // Routines to add and remove roots.
  void add_root(void *memory);
  void remove_root(void *memory);

 // array management
  virtual void set_used(void *memory, size_t)=0;
  virtual void incr_used(void *memory, ptrdiff_t)=0;
  virtual size_t get_used(void *memory)=0;
  virtual size_t get_count(void *memory)=0;
  virtual void *create_empty_array( gc_shape_t *shape, size_t count)=0;

  virtual pointer_data_t get_pointer_data(void *)=0;
};

GC_EXTERN void *operator new
(
  ::std::size_t,
  flx::gc::generic::gc_profile_t &,
  flx::gc::generic::gc_shape_t &,
  bool
);

World Stop

The collector is associated with a thread control object. When a collection is required, the thread controller is responsible for stopping all threads, and finding the extent of their machine stacks so the collector can scan them as roots.

Registered Roots

The add_root() and remove_root() methods take pointer to the base of a managed object and increment or decrement a reference count. If the count is greater than 0, the object is considered a root for the mark phase of the collection algorithm and cannot be reaped.

All objects are arrays

In Felix all objects are variable length arrays for which there is a bound set at construction time. The bound can be retrieved by get_count(), the argument must point to the whole object. The get_used(), set_used() and incr_used() methods modify the number of the available slots that are actually used. The collector only scans the used slots. The used slots must start at the low address and be assigned contiguously.

Collection

The collector tracks all allocations. When the collect() method is called the collector first calls the thread control objects world_stop() method to stop all the other threads and gather the extents of their machine stacks.

It then scans the machine stacks looking for roots. The set of roots consists of the objects found in the machine stack scans, together with the registered roots. Felix then finds the shape (RTTI) object associated with the object, which contains an array of offsets into the object that contain pointers. Felix checks if these point into an managed object and if so marks the object as reachable, and then recursively scans that object if it has not already been scanned.

The recursive scanning is precise because Felix knows where the pointers are located in managed objects. The machine stack scanning is conservative, because it does not.

Interior pointers

Pointers on the machine stack and in objects can be interior pointers. An interior pointer is a pointer to any byte of the object. Past the end location is NOT considered interior. If the only pointer into an object is incremented until it goes off the end in a loop, the pointer is invalid, the object becomes unreachable, and it is not safe to decrement the pointer to get it to point into the object again, because the collector may have reaped it, unless it can be assured the the collector could not have run between the time the pointer overran the object and the time it is adjusted back. It is best to retain a pointer to the head of the object as well to ensure the object remains reachable.

Foreign pointers

Because it can detect interior pointers, Felix can also detect foreign pointers, that is, those that do not point at or into a Felix managed object. It is safe to use foreign pointers, including nullptr, in any object where a Felix pointer is expected. Foreign and interior pointers can NOT be used when adjusting the use count of varrays or registering roots.

Managed Object Construction

The operator new() function is used to create an object and initialise it. The first argument is the size of the object to allocate, it is provided automatically when a new expression is used.

The second argument requires a reference to a garbage collector profile object. This object contains a pointer to the garbage collector.

The third argument is the shape of the object to be allocated. This also contains the size of the object and must agree with the first argument.

The final argument is a boolean. If it is true, which is the default, then the allocation may trigger a collection. If it is false, the allocation is not permitted to trigger a collection. Fine grained operations in the Felix RTL typically use false to prevent the collector interfering with organisation of data structures at a time when the invariants the collector requires are not satisfied.

Bypassing the collector

The allocate() method can be used to allocate store directly with the system allocator, which in turn typically uses malloc()/free().

Felix itself actually mandates the use of malloc()/free(). This allows Felix to acquire objects allocated by C system functions. However objects allocated by C++ standard new cannot be acquired.