Hi,

Thank you for the excellent implementation. I have a quick question about sampling from the replay buffer. In the following code from the `LazyMemory` class, a bias term is added to each randomly generated index when the replay buffer is full (`bias = -self._p if self._n == self.capacity else 0`). Is there a particular reason for this? My understanding is that, since the indices are drawn uniformly from the whole buffer, shifting them by a constant modulo the capacity should give the same sampling distribution as not shifting them at all. Am I right, or is there something I missed?
    def sample(self, batch_size):
        indices = np.random.randint(low=0, high=len(self), size=batch_size)
        return self._sample(indices, batch_size)

    def _sample(self, indices, batch_size):
        bias = -self._p if self._n == self.capacity else 0

        states = np.empty(
            (batch_size, *self.state_shape), dtype=np.uint8)
        next_states = np.empty(
            (batch_size, *self.state_shape), dtype=np.uint8)

        for i, index in enumerate(indices):
            _index = np.mod(index+bias, self.capacity)
            states[i, ...] = self['state'][_index]
            next_states[i, ...] = self['next_state'][_index]
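
To make my reasoning concrete, here is a minimal sketch of why I think the shift should not matter for uniform sampling. The values of `capacity` and `p` are made up for illustration (`p` stands in for `self._p`); this is not code from the repository. It only checks that subtracting a constant modulo the capacity maps the full index range onto itself, so a uniform draw over the full buffer stays uniform after the shift.

```python
import numpy as np

capacity = 8  # hypothetical buffer size, for illustration only
p = 3         # hypothetical write pointer, stands in for self._p

# Every index the sampler can draw when the buffer is full.
indices = np.arange(capacity)

# The same shift that _sample applies when the buffer is full.
shifted = np.mod(indices - p, capacity)

# The shift only reorders the indices: each slot is still hit exactly once.
is_permutation = sorted(shifted.tolist()) == list(range(capacity))
```

If that is right, the bias could only change the result when the indices passed to `_sample` are not uniform over the whole buffer, for example if they were chosen relative to the write position.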