refactor: np.float_ -> np.float64 #50

Merged
merged 1 commit into from
Aug 29, 2024
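
For context, `np.float_` was an alias of `np.float64` and was removed in NumPy 2.0, so the rename keeps the same default dtype while avoiding an `AttributeError` on newer NumPy. A minimal sketch of that equivalence (the variable names are illustrative and not part of stempeg):

```python
import numpy as np

# np.float64 is available in every NumPy release, so it is a safe default.
default_dtype = np.float64

# np.float_ only exists on NumPy < 2.0; on 2.x attribute access raises
# AttributeError, which makes hasattr() return False.
has_legacy_alias = hasattr(np, "float_")
if has_legacy_alias:
    # On older NumPy both names refer to the same 64-bit float type.
    assert np.float_ is np.float64

samples = np.zeros((4, 2), dtype=default_dtype)
print(samples.dtype, samples.dtype.itemsize)  # float64 8
```

Because the two names referred to the same type, callers that relied on the old default should see no change in the returned arrays.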
12 changes: 6 additions & 6 deletions README.md
@@ -6,15 +6,15 @@
[![Supported Python versions](https://img.shields.io/pypi/pyversions/stempeg.svg)](https://pypi.python.org/pypi/stempeg)

Python package to read and write [STEM](https://www.native-instruments.com/en/specials/stems/) audio files.
Technically, stems are audio containers that combine multiple audio streams and metadata in a single audio file. This makes them ideal for playing back multitrack audio, where users can select the audio sub-stream during playback (e.g. supported by VLC).

Under the hood, _stempeg_ uses [ffmpeg](https://www.ffmpeg.org/) for reading and writing multistream audio. Optionally, [MP4Box](https://github.com/gpac/gpac) is used to create STEM files that are compatible with Native Instruments hardware and software.

#### Features

- robust and fast interface for ffmpeg to read and write any supported format from/to numpy.
- reading supports seeking and duration.
- control container and codec as well as bitrate when compressed audio is written.
- store multi-track audio within audio formats by aggregating streams into channels (concatenation of pairs of
stereo channels).
- support for internal ffmpeg resampling during read and write.
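
The channel aggregation mentioned in the feature list can be pictured as packing each stereo stem into its own pair of channels. The following is a minimal sketch of that idea, not stempeg's implementation: `stems_to_channels` and `channels_to_stems` are hypothetical helpers, and the `(stems, samples, channels)` array layout is an assumption.

```python
import numpy as np


def stems_to_channels(stems):
    """Pack (n_stems, n_samples, n_channels) into (n_samples, n_stems * n_channels)."""
    n_stems, n_samples, n_channels = stems.shape
    # Put the stem axis next to the channel axis, then merge both axes so
    # that stem k occupies columns [k * n_channels, (k + 1) * n_channels).
    return stems.transpose(1, 0, 2).reshape(n_samples, n_stems * n_channels)


def channels_to_stems(packed, n_channels=2):
    """Invert stems_to_channels for stems that all have n_channels channels."""
    n_samples, total_channels = packed.shape
    if total_channels % n_channels != 0:
        raise ValueError("channel count is not a multiple of n_channels")
    n_stems = total_channels // n_channels
    return packed.reshape(n_samples, n_stems, n_channels).transpose(1, 0, 2)


# Two stereo stems with three samples each (hypothetical data).
stems = np.arange(12, dtype=np.float64).reshape(2, 3, 2)
packed = stems_to_channels(stems)
restored = channels_to_stems(packed)
```

Here `packed` has four channels, where the first pair holds the first stem and the second pair holds the second stem, so a container without multi-stream support (e.g. WAV) can still carry all stems.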
@@ -70,7 +70,7 @@ conda install -c conda-forge stempeg

Stempeg can read multi-stream and single-stream audio files, so it can replace your normal audio loaders for 1d or 2d (mono/stereo) arrays.

By default, [`read_stems`](https://faroit.com/stempeg/read.html#stempeg.read.read_stems) assumes that multiple substreams can exist (default `reader=stempeg.StreamsReader()`).
To support multi-stream audio even when the audio container doesn't support multiple streams
(e.g. WAV), streams can be mapped to multiple pairs of channels. In that
case, `reader=stempeg.ChannelsReader()` can be passed. Also see:
@@ -121,7 +121,7 @@ Writing stem files from a numpy tensor can be done with.
stempeg.write_stems(path="output.stem.mp4", data=S, sample_rate=44100, writer=stempeg.StreamsWriter())
```

As seen in the flow chart above, stempeg supports multiple ways to write multi-stream audio.
Each method takes a different set of parameters. To select a method, one of the following settings can be passed:

* `stempeg.FilesWriter`
@@ -136,8 +136,8 @@ Each of the method has different number of parameters. To select a method one of
Stem will be saved into a single multistream audio file.
Additionally, Native Instruments Stems compatible
metadata is added. This requires the installation of
`MP4Box`.

> :warning: __Warning__: Muxing stems using _ffmpeg_ leads to multi-stream files that are not compatible with Native Instruments hardware or software. Please use [MP4Box](https://github.com/gpac/gpac) if you use the `stempeg.NISTemsWriter()`.

For more information on writing stems, see [`stempeg.write_stems`](https://faroit.com/stempeg/write.html#stempeg.write.write_stems).
36 changes: 18 additions & 18 deletions docs/read.html
@@ -101,7 +101,7 @@ <h1 class="title">Module <code>stempeg.read</code></h1>
duration (float): duration in seconds
dtype (numpy.dtype): Type of audio array to be cast into
stem_idx (int): stream id
ffmpeg_format (str): ffmpeg intermediate format encoding.
Choose &#34;f32le&#34; for best compatibility

Returns:
@@ -123,10 +123,10 @@ <h1 class="title">Module <code>stempeg.read</code></h1>

# decode to raw pcm format
if ffmpeg_format == &#34;f64le&#34;:
# PCM 64 bit float
numpy_dtype = &#39;&lt;f8&#39;
elif ffmpeg_format == &#34;f32le&#34;:
# PCM 32 bit float
numpy_dtype = &#39;&lt;f4&#39;
elif ffmpeg_format == &#34;s16le&#34;:
# PCM 16 bit signed int
@@ -150,7 +150,7 @@ <h1 class="title">Module <code>stempeg.read</code></h1>
duration=None,
stem_id=None,
always_3d=False,
- dtype=np.float_,
+ dtype=np.float64,
ffmpeg_format=&#34;f32le&#34;,
info=None,
sample_rate=None,
@@ -181,28 +181,28 @@ <h1 class="title">Module <code>stempeg.read</code></h1>
duration (float): Duration to load in seconds.
stem_id (int, optional): substream id,
defaults to `None` (all substreams are loaded).
always_3d (bool, optional): By default, reading a
single-stream audio file will return a
two-dimensional array. With ``always_3d=True``, audio data is
always returned as a three-dimensional array, even if the audio
file has only one stream.
dtype (np.dtype, optional): Numpy data type to use, defaults to `np.float64`.
info (Info, Optional): Pass ffmpeg `Info` object to reduce number
of os calls on file.
This can be used when e.g. the sample rate and length of a track are
already known in advance. Useful for ML training where the
info objects can be pre-processed, thus audio loading can
be sped up.
sample_rate (float, optional): Sample rate of returned audio.
Defaults to `None` which results in
the sample rate returned from the mixture.
reader (Reader): Holds parameters for the reading method.
One of the following:
`StreamsReader(...)`
Read from a single multistream audio (default).
`ChannelsReader(...)`
Read/demultiplexed from multiple channels.
multiprocess (bool): Applies multi-processing for reading
substreams in parallel to speed up reading. Defaults to `True`

Returns:
@@ -280,7 +280,7 @@ <h1 class="title">Module <code>stempeg.read</code></h1>
channels = min(_chans)
else:
raise RuntimeError(&#34;Stems do not have the same number of channels per substream&#34;)

# set channels to minimum channel per stream
stems = []

@@ -511,7 +511,7 @@ <h2 id="shape">Shape</h2>
duration=None,
stem_id=None,
always_3d=False,
- dtype=np.float_,
+ dtype=np.float64,
ffmpeg_format=&#34;f32le&#34;,
info=None,
sample_rate=None,
@@ -542,28 +542,28 @@ <h2 id="shape">Shape</h2>
duration (float): Duration to load in seconds.
stem_id (int, optional): substream id,
defaults to `None` (all substreams are loaded).
always_3d (bool, optional): By default, reading a
single-stream audio file will return a
two-dimensional array. With ``always_3d=True``, audio data is
always returned as a three-dimensional array, even if the audio
file has only one stream.
dtype (np.dtype, optional): Numpy data type to use, defaults to `np.float64`.
info (Info, Optional): Pass ffmpeg `Info` object to reduce number
of os calls on file.
This can be used when e.g. the sample rate and length of a track are
already known in advance. Useful for ML training where the
info objects can be pre-processed, thus audio loading can
be sped up.
sample_rate (float, optional): Sample rate of returned audio.
Defaults to `None` which results in
the sample rate returned from the mixture.
reader (Reader): Holds parameters for the reading method.
One of the following:
`StreamsReader(...)`
Read from a single multistream audio (default).
`ChannelsReader(...)`
Read/demultiplexed from multiple channels.
multiprocess (bool): Applies multi-processing for reading
substreams in parallel to speed up reading. Defaults to `True`

Returns:
@@ -641,7 +641,7 @@ <h2 id="shape">Shape</h2>
channels = min(_chans)
else:
raise RuntimeError(&#34;Stems do not have the same number of channels per substream&#34;)

# set channels to minimum channel per stream
stems = []

@@ -1130,4 +1130,4 @@ <h4><code><a title="stempeg.read.StreamsReader" href="#stempeg.read.StreamsReade
<p>Generated by <a href="https://pdoc3.github.io/pdoc"><cite>pdoc</cite> 0.9.1</a>.</p>
</footer>
</body>
</html>
</html>
20 changes: 10 additions & 10 deletions stempeg/read.py
@@ -71,7 +71,7 @@ def _read_ffmpeg(
duration (float): duration in seconds
dtype (numpy.dtype): Type of audio array to be cast into
stem_idx (int): stream id
ffmpeg_format (str): ffmpeg intermediate format encoding.
Choose "f32le" for best compatibility

Returns:
@@ -93,10 +93,10 @@ def _read_ffmpeg(

# decode to raw pcm format
if ffmpeg_format == "f64le":
# PCM 64 bit float
numpy_dtype = '<f8'
elif ffmpeg_format == "f32le":
# PCM 32 bit float
numpy_dtype = '<f4'
elif ffmpeg_format == "s16le":
# PCM 16 bit signed int
@@ -120,7 +120,7 @@ def read_stems(
duration=None,
stem_id=None,
always_3d=False,
- dtype=np.float_,
+ dtype=np.float64,
ffmpeg_format="f32le",
info=None,
sample_rate=None,
@@ -151,28 +151,28 @@ def read_stems(
duration (float): Duration to load in seconds.
stem_id (int, optional): substream id,
defaults to `None` (all substreams are loaded).
always_3d (bool, optional): By default, reading a
single-stream audio file will return a
two-dimensional array. With ``always_3d=True``, audio data is
always returned as a three-dimensional array, even if the audio
file has only one stream.
dtype (np.dtype, optional): Numpy data type to use, defaults to `np.float64`.
info (Info, Optional): Pass ffmpeg `Info` object to reduce number
of os calls on file.
This can be used when e.g. the sample rate and length of a track are
already known in advance. Useful for ML training where the
info objects can be pre-processed, thus audio loading can
be sped up.
sample_rate (float, optional): Sample rate of returned audio.
Defaults to `None` which results in
the sample rate returned from the mixture.
reader (Reader): Holds parameters for the reading method.
One of the following:
`StreamsReader(...)`
Read from a single multistream audio (default).
`ChannelsReader(...)`
Read/demultiplexed from multiple channels.
multiprocess (bool): Applies multi-processing for reading
substreams in parallel to speed up reading. Defaults to `True`

Returns:
@@ -250,7 +250,7 @@ def read_stems(
channels = min(_chans)
else:
raise RuntimeError("Stems do not have the same number of channels per substream")

# set channels to minimum channel per stream
stems = []
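
The `ffmpeg_format` branch in `_read_ffmpeg` picks a little-endian NumPy dtype for the raw PCM bytes that ffmpeg emits, and the result is then cast to the requested `dtype`. A minimal sketch of that idea, not the library's actual code: `decode_pcm` and `PCM_DTYPES` are hypothetical names, and the `"<i2"` entry for `s16le` is an assumption based on the format name, since the diff does not show that line.

```python
import numpy as np

# ffmpeg raw PCM format name -> NumPy dtype string.
# The "<i2" entry is an assumption; the diff only shows the float formats.
PCM_DTYPES = {"f64le": "<f8", "f32le": "<f4", "s16le": "<i2"}


def decode_pcm(raw, ffmpeg_format="f32le", channels=2, dtype=np.float64):
    """Turn interleaved raw PCM bytes into a (samples, channels) array."""
    try:
        numpy_dtype = PCM_DTYPES[ffmpeg_format]
    except KeyError:
        raise ValueError(f"unsupported ffmpeg_format: {ffmpeg_format!r}") from None
    # Interpret the bytes, split interleaved samples into channels, then cast.
    audio = np.frombuffer(raw, dtype=numpy_dtype).reshape(-1, channels)
    return audio.astype(dtype)


# Two stereo frames encoded as 32 bit little-endian floats (hypothetical data).
raw_bytes = np.array([[0.5, -0.5], [0.25, 1.0]], dtype="<f4").tobytes()
decoded = decode_pcm(raw_bytes)
```

With the default `dtype=np.float64`, the decoded array is 64 bit even though the intermediate format is 32 bit. This sketch only casts; it does not rescale integer formats such as `s16le`.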
