iget for fetching indexes from a non-sequence iterable #517

groutr · 2021-06-23T01:27:35Z

It would be useful, I think, to have a version of itertoolz.get that works on iterables that don't support indexing (e.g. sets, iterators, etc.).

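from toolz import drop

def iget(ind, seq):
    seq = iter(seq)
    j = 0  # position of the next item seq will yield
    for i in sorted(ind):
        if j < i:
            # skip ahead to index i
            seq = drop(i - j, seq)
            j = i
        if j == i:
            # a repeated index leaves j > i, so it is skipped
            yield next(seq)
            j += 1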

With dicts now preserving insertion order, it makes sense to pull out the nth item in order of insertion, or the nth line of a file:

with open('file1.txt') as fin:
    lines = tuple(iget([2, 5, 8, 9], fin))
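
For anyone who wants to try the idea without toolz, here is a minimal sketch of the same approach built on itertools.islice. The name `iget_sketch` and the `io.StringIO` stand-in for a file are illustrative assumptions, not part of toolz. In this sketch duplicate indices are yielded once and indices past the end of the input are silently dropped.

```python
import io
from itertools import islice


def iget_sketch(indices, iterable):
    """Yield the items at the given absolute positions, in increasing order.

    Hypothetical helper, not a toolz function. Duplicate indices are
    collapsed, and the input is consumed no further than the largest index.
    """
    it = iter(iterable)
    position = 0  # index of the next item `it` will produce
    for index in sorted(set(indices)):
        # Skip ahead to `index`, then take exactly one item (if any is left).
        for item in islice(it, index - position, index - position + 1):
            yield item
        position = index + 1


# A StringIO object iterates line by line, like an open text file.
fake_file = io.StringIO("".join(f"line {i}\n" for i in range(12)))
lines = tuple(iget_sketch([2, 5, 8, 9], fake_file))
```

Here `lines` holds the third, sixth, ninth and tenth lines, and the rest of `fake_file` is left unread.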


eriknw · 2021-06-25T20:20:04Z

Thanks @groutr. This has come up before: #97

I'm up for adding this functionality in some way.

groutr · 2021-06-29T06:01:51Z

After giving this some more thought, I find extending nth more appealing. It does require a mental shift from providing absolute indices (counted from the start of the iterator) to providing relative indices (counted from the current position of the iterator).

My take on extending nth:

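from collections.abc import Sequence
from toolz import drop

def nth(n, seq):
    """ The nth element in a sequence, counted from the current position

    >>> next(nth(1, 'ABC'))
    'B'
    """
    seq = iter(seq)
    if not isinstance(n, Sequence):
        n = (n,)
    for i in n:
        # each index is relative to where the previous one left off
        seq = drop(i, seq)
        yield next(seq)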

If I want indices 1 and 2 from 'ABC', that would be "return the 1st element, then return the 0th element of what remains".

>>> tuple(nth([1, 0], 'ABC'))
('B', 'C')
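
To make the relative-index semantics concrete, here is a small toolz-free sketch. `nth_relative` is a hypothetical name used only for illustration, and unlike the version above it stops quietly when the input runs out instead of letting next() fail inside the generator.

```python
from itertools import islice


def nth_relative(offsets, iterable):
    """Yield one item per offset, each counted from the current position.

    Hypothetical stand-in for the extended nth. An offset of 0 means
    "the very next item"; an offset of 1 means "skip one, take the next".
    """
    it = iter(iterable)
    for offset in offsets:
        # Skip `offset` items, then take at most one.
        chunk = list(islice(it, offset, offset + 1))
        if not chunk:
            return  # input exhausted, stop instead of raising
        yield chunk[0]


absolute_1_and_2 = tuple(nth_relative([1, 0], "ABC"))
every_other = tuple(nth_relative([1, 1, 1], "ABCDEF"))
```

`absolute_1_and_2` is `('B', 'C')`, matching the example above, and `every_other` is `('B', 'D', 'F')` because each offset of 1 skips exactly one item.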

Hugovdberg · 2021-07-12T10:49:34Z

Wouldn't it be nicer to first calculate the differences between consecutive indices and use those to determine how many items to drop?

from collections.abc import Sequence
from operator import sub
from toolz.curried import compose, concat, curry, drop, first, map, sliding_window

def lazy(f):
    yield f

@curry
def unpack_args(f, args):
    return f(*args)


def nth(n, seq):
    """ The nth element in a sequence

    >>> next(nth(1, 'ABC'))
    'B'
    """
    seq = iter(seq)
    if not isinstance(n, Sequence):
        n = (n,)
    else:
        sub1 = lambda x: x-1
        # (a, b) -> b - a - 1: number of items to skip between consecutive indices
        skip_n = compose(map(compose(sub1, unpack_args(sub), reversed)), sliding_window(2))
        n = concat((lazy(first(n)), skip_n(n)))
    for i in n:
        seq = drop(i, seq)
        yield next(seq)

This avoids the mental workout of getting the differences right by hand (especially the zero offset needed for consecutive items). It is also a lot less error-prone when one wants to change an index.

>>> list(nth([1, 2, 5, 8], range(100)))
[1, 2, 5, 8]
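
The index-to-offset conversion can also be written as a plain loop, which may be easier to follow than the composed pipeline. This is only a sketch of the arithmetic; `relative_offsets` is a hypothetical name, not a toolz function.

```python
def relative_offsets(indices):
    """Convert absolute indices into per-step skip counts.

    Hypothetical helper: the first offset is the first index itself, and
    each later offset is the gap to the previous index minus one.
    """
    offsets = []
    previous = -1  # pretend the item at position -1 was just consumed
    for index in indices:
        offsets.append(index - previous - 1)
        previous = index
    return offsets


increasing = relative_offsets([1, 2, 5, 8])
not_increasing = relative_offsets([1, 2, 1])
```

`increasing` is `[1, 0, 2, 2]`, while `not_increasing` is `[1, 0, -2]`: any index that does not exceed its predecessor yields a negative offset, which is what a drop-based loop cannot handle.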

A major restriction of both methods is that they can only take indices in increasing order:

>>> list(nth([1, 2, 1], "abcdefghijklmnopqrstuvwxyz"))
[...]
ValueError: Indices for islice() must be None or an integer: 0 <= x <= sys.maxsize.

Perhaps iterating over seq with the indices in sorted order and then re-emitting the results in the requested order would be more stable.

Hugovdberg · 2021-07-12T11:14:51Z

A more stable approach would look like this, although it is a lot uglier and iterates over n multiple times:

from operator import sub

# reuses lazy and unpack_args from the previous comment
def nth(n, seq):
    """ The nth element in a sequence

    >>> next(nth(1, 'ABC'))
    'B'
    """
    seq = iter(seq)
    if not isinstance(n, Sequence):
        n = (n,)
        orig_order = (0,)
    else:
        sub1 = lambda x: x-1
        skip_n = compose(map(compose(sub1, unpack_args(sub), reversed)), sliding_window(2))
        # sort the indices, remembering each one's position in the request
        orig_order, n = zip(*sorted(enumerate(n), key=lambda x:x[1]))
        n = concat((lazy(first(n)), skip_n(n)))
    output = []
    for o, i in zip(orig_order, n):
        if i == -1:
            # repeated index: reuse the value just fetched
            output.append((o, value))
            continue
        seq = drop(i, seq)
        value = next(seq)
        output.append((o, value))
    # restore the order in which the indices were requested
    for _, value in sorted(output, key=lambda x: x[0]):
        yield value

Duplicated, non-monotonically increasing indices now work as expected:

>>> list(nth([1, 2, 1], "abcdefghijklmnopqrstuvwxyz"))
['b', 'c', 'b']
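
For comparison, the same "collect in one pass, then emit in requested order" idea can be sketched without the offset arithmetic. This is a different technique from the code above (a dictionary lookup rather than drop-based skipping), and `pick` is a hypothetical name. It is eager rather than lazy, and it assumes raising IndexError is acceptable when an index is out of range.

```python
def pick(indices, iterable):
    """Return the items at `indices`, in the order the indices were given.

    Hypothetical helper, not a toolz function. Consumes the input only up
    to the largest requested index; raises IndexError if it ends too early.
    """
    indices = list(indices)
    if not indices:
        return []
    wanted = set(indices)
    last = max(wanted)
    found = {}
    for position, item in enumerate(iterable):
        if position in wanted:
            found[position] = item
        if position == last:
            break  # nothing further is needed, so stop consuming
    missing = wanted.difference(found)
    if missing:
        raise IndexError(f"indices out of range: {sorted(missing)}")
    return [found[i] for i in indices]


letters = pick([1, 2, 1], "abcdefghijklmnopqrstuvwxyz")
```

`letters` is `['b', 'c', 'b']`, the same result as the sorted-offset version.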
