diff --git a/brro-compressor/TODO.md b/brro-compressor/TODO.md
deleted file mode 100644
index 58c6012..0000000
--- a/brro-compressor/TODO.md
+++ /dev/null
@@ -1,28 +0,0 @@
-# Compressor TODO and Discussions
-
-## How the compressor selects the compressor?
-
-- Do we give the user the change of selecting it?
-- Configuration needs to be minimal, the end user is probably a metric server and/or a database
-- Pass Frame settings to the compressors (Min, max, size)
-
-## How do we slice the input data?
-
-- [Streaming Reader and Writer] Size? Samples?
-- Use io_uring (https://github.com/tokio-rs/tokio-uring/tree/master)
-
-## Should we allow a plugable architecture?
-
-- Allows to use to get external ideas
-- Brings another extra layer of complexity of the code
-- ~~[Carlos] Check with the shotover team how complex is to have a plugable compressor thingy~~ - It is complex, future developments.
-
-## What to do with VSRI?
-
-- Should we push those into the compressor or leave it outside?
-
-## Optimization
-
-- Take optimization out of the compressors, move it before the compressors are used (Currently it is done by each compressor).
-- Optimization should `hint` the best compressor for the job
-- Best idea I have is to do an histogram and select/`hint` based on that
diff --git a/docs/Usage.md b/docs/Usage.md
deleted file mode 100644
index 36909fc..0000000
--- a/docs/Usage.md
+++ /dev/null
@@ -1,51 +0,0 @@
-# Usage
-
-Quick objectives, get the repo to compile, understand the optimizer (just run it against some files), FLAC some files, compare results.
-
-## Code
-
-Code was merged. No more branches and all you need is in the master branch.
-
-### flac-server
-
-Needs a prometheus server. We need it to get our samples out. Supports read and write from prometheus.
-
-### brro_optimizer
-
-Maybe the most important tool at this point, it picks a WAV file from the datasets described below and optimizes it into a way that we might see a meaning full compression into FLAC.
-The tool also has options to dump the output of the file as a single sample per period, instead of the 4 channels. This is good to obtain the data as it was feed into the flac-server.
-The documentation also states that the dump flags (`dump-raw` and `dump-optimized`) write content to a file, *THAT IS A LIE!*. It writes to `stdout` so, if you're using the July data, *REDIRECT IT INTO A FILE!*.
-The code performs optimizations based on file name, so renaming might cause issues.
-
-Usage (Getting raw samples): `./brro_optimizer infile.wav --dump-raw > file.raw`
-Usage (Getting optimized samples): `./brro_optimizer infile.wav --dump-optimized > file.raw`
-Usage (Generate a optimized file): `./brro_optimizer -w infile.wav`
-
-If you set the ENV Variable for Debug it will output what it is doing.
-
-### Matlab folder
-
-This code is *NOT* structured, I put it there to just be safe in case my laptop exploded. It is very random, very buggy, but feel 100% free in exploring and looking into it.
-All of the code there, is not converted to Rust, this was done in a posterior phase to speed up prototyping. It will eventually need to be converted if we find anything usefull out of it.
-
-## Data files
-
-PS: Contact me for the data files.
-
-Data is generated by the flac-server. As you explore the codebase you will see that the data are flat WAV files (and a specific Index files, look in the code VSRI files to learn more about that).
-The WAV files contain 4 channels with 16bit samples each. This is because prometheus always generate float 64bit samples. Since WAV doesn't support neither 64bit samples, neither float the idea to store RAW data before processing was to split the 64bit into 4x16bit. And send each part to a different channel.
-
-There is 0 processing on any of these WAV files. You can compress them into different formats (At this stage, lossess only, otherwise you will get trash when decompression) and check if you get any decent compression. In some you will do, other not so much!
-
-Recommended tool for compression: SoX (Old but good!) https://sox.sourceforge.net/soxformat.html
-
-How to use it: `sox file.wav file.flac` (yes, is that simple!)
-
-### flac-server-data-july
-
-This is a LOT of signals collected over 7 days. Sampling rate 20s per sample
-
-### flac-server-data-september
-
-Smaller subset, doesn't include prometheus self-monitoring, less than 1-day.
-Sampling Rate 10s per sample.
diff --git a/matlab/error_calc.m b/matlab/error_calc.m
deleted file mode 100644
index 55e5caf..0000000
--- a/matlab/error_calc.m
+++ /dev/null
@@ -1,26 +0,0 @@
-function median_percentage_error = calculate_error(input_signal, output_signal, sampling)
-    % Calculate the median percentage error between two time series signals.
-    % Input:
-    % input_signal: Input time series signal.
-    % output_signal: Output time series signal.
-    % sampling: Sampling factor for trimming input_signal.
-
-    % Input validation
-    if ~isnumeric(input_signal) || ~isnumeric(output_signal)
-        error('Input and output signals must be numeric arrays.');
-    end
-
-    if ~isscalar(sampling) || sampling <= 0
-        error('Sampling must be a positive scalar.');
-    end
-
-    % Trim input_signal to match the size of output_signal
-    input_trimmed = input_signal(1:sampling:end-1);
-
-    % Calculate the percentage error
-    scaling_factor = 100; % You can adjust this as needed
-    percentage_error = abs(output_signal - input_trimmed) / scaling_factor;
-
-    % Calculate and return the median percentage error
-    median_percentage_error = median(percentage_error);
-end
diff --git a/matlab/process_ts.m b/matlab/process_ts.m
deleted file mode 100644
index 81e51a6..0000000
--- a/matlab/process_ts.m
+++ /dev/null
@@ -1,140 +0,0 @@
-function [dc, ac, composed, fft_data] = process_ts(ts, w, freq_n, n_hold)
-% ts: timeseries
-% w: window size
-% freq_n: Number of frequencies to find
-tic
-if nargin<4
-    n_hold = 0;
-endif
-
-nanIDX = find(isnan(ts));
-while(~isempty(nanIDX))
-    ts(nanIDX) = ts(nanIDX+1);
-    nanIDX = find(isnan(ts));
-end
-
-% Split the signal in DC and AC parts
-%dc = movmean(ts, w);
-ac = center(ts);
-%ac = ts-dc;
-dc = ts-ac;
-
-%hold on;plot(dc);plot(ac);
-
-window_n = ceil(length(ts)/w);
-data_rebuild = [];
-fft_store = [];
-fft_data = [];
-window_err = [];
-% Process the whole signal
-for i=1:window_n
-    window_s = (i-1)*w + 1;
-    window_e = i*w;
-    if i == window_n
-        data_window = ts(window_s:end);
-        data_dc = dc(window_s:end);
-        data_ac = ac(window_s:end);
-    else
-        data_window = ts(window_s:window_e);
-        %data_dc = movmean(data_window, w/10);
-        data_dc = dc(window_s:window_e);
-        %data_ac = data_window - data_dc;
-        data_ac = ac(window_s:window_e);
-    endif
-    window_size = length(data_dc);
-
-    % Process AC data
-    if isempty(fft_store)
-        f = fft(data_ac);
-        tmp_f = f;
-        out_fft = zeros(1, window_size);
-        window_freqs = [];
-        if freq_n > window_size/2
-            freq_n = floor(window_size/2);
-        endif
-        for i=1:freq_n*2
-            [mx,ix] = max(tmp_f);
-            window_freqs(i,:) = [real(ix) mx];
-            tmp_f(ix) = 0;
-            out_fft(ix) = mx;
-        end
-        fft_data = [fft_data out_fft];
-        %disp("Window Frequencies: ")
-        %disp(sort(window_freqs))
-        out_ift = ifft(out_fft);
-        if n_hold ~= 0
-            fft_store = out_ift;
-        endif
-    elseif n_hold ~= 0
-        out_ift = fft_store;
-        fft_store = [];
-    endif
-
-    % Process DC data
-    yi = polyfit(1:window_size,data_dc,1);
-    %disp("DC points: ")
-    %disp(yi)
-    % Rebuild the sinal for the window
-    yii = polyval(yi,1:window_size);
-    % Build the dataset for the window
-    window_rebuild = real(out_ift)+yii;
-
-    % Calculate the error
-    pererr = abs(data_window-window_rebuild)./data_window*100;
-    mean(pererr)
-    window_err = [window_err pererr];
-    data_rebuild = [data_rebuild window_rebuild];
-
-
-    %plot(abs(out_fft))
-end
-toc
-composed = data_rebuild;
-nnz(fft_data)
-figure;
-plot(window_err);
-
-figure;
-subplot(2,2,1);
-plot(data_rebuild);
-title('Rebuild');
-subplot(2,2,2);
-plot(ts, 'r');
-title('Original');
-subplot(2,2,[3,4]);
-plot(ts,'r',data_rebuild,'b');
-title('Both');
-
-
-%{
-wdw = ac(1:w);
-f = fft(wdw);
-
-% Create a output array
-tmp_f = f;
-out_fft = zeros(1, w);
-
-% Zero out the frequency just found and around it
-for i=1:freq_n*2
-    [mx,ix] = max(tmp_f);
-    tmp_f(ix) = 0;
-    out_fft(ix) = mx;
-end
-
-out_ift = ifft(out_fft);
-ift = ifft(f);
-
-% DC component approximation
-yi = polyfit(1:w,dc(1:w),1);
-% Lets see the aproximattion
-yii = polyval(yi,1:w);
-rebuilt = real(out_ift)+yii;
-x = rebuilt;
-
-hold on;
-%plot(abs(out_fft))
-%plot(wdw)
-%plot(real(ift))
-plot(rebuilt)
-plot(ts(1:w))
-%}
\ No newline at end of file
diff --git a/matlab/process_ts2.m b/matlab/process_ts2.m
deleted file mode 100644
index a365ebe..0000000
--- a/matlab/process_ts2.m
+++ /dev/null
@@ -1,112 +0,0 @@
-function [dc, ac, composed, fft_data] = process_ts2(timeseries, w, freq_n, ss)
-% ts: timeseries
-% w: window size
-% freq_n: Number of frequencies to find
-
-% Clear errors from data
-nanIDX = find(isnan(timeseries));
-while(~isempty(nanIDX))
-    timeseries(nanIDX) = timeseries(nanIDX+1);
-    nanIDX = find(isnan(timeseries));
-end
-
-% Supersample the timeseries
-[x, ts] = supersample_signal(timeseries, ss, 3);
-tic
-% Split the signal in DC and AC parts
-%dc = movmean(ts, w);
-ac = center(ts);
-%ac = ts-dc;
-dc = ts-ac;
-
-% Window (Probably not needed)
-% hn = hann (w)';
-
-%hold on;plot(dc);plot(ac);
-
-window_n = ceil(length(ts)/w);
-data_rebuild = [];
-fft_store = [];
-fft_data = [];
-window_err = [];
-dc_store = [];
-% Process the whole signal
-for i=1:window_n
-    window_s = (i-1)*w + 1;
-    window_e = i*w;
-    if i == window_n
-        data_window = ts(window_s:end);
-        data_dc = dc(window_s:end);
-        data_ac = ac(window_s:end);
-    else
-        data_window = ts(window_s:window_e);
-        %data_dc = movmean(data_window, w/10);
-        data_dc = dc(window_s:window_e);
-        %data_ac = data_window - data_dc;
-        data_ac = ac(window_s:window_e);%.*hn;
-    endif
-    window_size = length(data_dc);
-
-% Process AC data
-    f = fft(data_ac);
-    tmp_f = f;
-    out_fft = zeros(1, window_size);
-    window_freqs = [];
-    if freq_n > window_size/2
-        freq_n = floor(window_size/2);
-    endif
-    for i=1:freq_n*2
-        [mx,ix] = max(tmp_f);
-        window_freqs(i,:) = [real(ix) mx];
-        tmp_f(ix) = 0;
-        out_fft(ix) = mx;
-    end
-    % Process DC data
-    yi = polyfit(1:window_size,data_dc,1);
-    dc_store = [dc_store yi];
-    fft_data = [fft_data out_fft];
-end
-toc
-% Decompressing
-tic
-for j=1:window_n
-    window_s = (j-1)*w + 1;
-    window_e = j*w;
-    if j == window_n
-        data_fft = fft_data(window_s:end);
-        %original_data = timeseries(window_s:end);
-    else
-        data_fft = fft_data(window_s:window_e);
-        %original_data = timeseries(window_s:window_e);
-    endif
-    window_size = length(data_fft);
-    out_ift = ifft(data_fft);
-    % Process DC data
-    %yi = polyfit(1:window_size,center(out_ift),1);
-    % Rebuild the sinal for the window
-    yii = polyval(dc_store((2*j)-1:2*j),1:window_size);
-    % Build the dataset for the window
-    window_rebuild = real(out_ift)+yii;
-    data_rebuild = [data_rebuild window_rebuild];
-end
-toc
-
-% Calculate the error
-pererr = abs(timeseries-data_rebuild(1:ss:end-1))./timeseries*100;
-mean(pererr)
-figure;
-plot(pererr, '+');
-
-composed = data_rebuild;
-
-
-figure;
-subplot(2,2,1);
-plot(data_rebuild);
-title('Rebuild');
-subplot(2,2,2);
-plot(ts, 'r');
-title('Original');
-subplot(2,2,[3,4]);
-plot(x,timeseries,'r',data_rebuild,'b');
-title('Both');
\ No newline at end of file
diff --git a/matlab/remove_nan.m b/matlab/remove_nan.m
deleted file mode 100644
index 627797f..0000000
--- a/matlab/remove_nan.m
+++ /dev/null
@@ -1,11 +0,0 @@
-function ts = remove_nan(timeseries)
-% Since octave doesn't have any function that removes NaN from data,
-% this function does just that.
-
-nanIDX = find(isnan(timeseries));
-while(~isempty(nanIDX))
-    timeseries(nanIDX) = timeseries(nanIDX+1);
-    nanIDX = find(isnan(timeseries));
-end
-
-ts=timeseries;
\ No newline at end of file
diff --git a/matlab/supersample_signal.m b/matlab/supersample_signal.m
deleted file mode 100644
index f9bc161..0000000
--- a/matlab/supersample_signal.m
+++ /dev/null
@@ -1,26 +0,0 @@
-function [x, y] = supersample_signal(ts, ts_sampling_rate, polinomial)
-% This function adds data samples interpolation the original data samples
-% Why? FFTs like more data, so this is a way to add more data to smooth
-% out the FFTs
-    warning ('off', 'all') ;
-    tic
-    x = (1:ts_sampling_rate:ts_sampling_rate*length(ts));
-    window = polinomial * ts_sampling_rate;
-    y = [];
-    window_count = 0;
-    for i=1:polinomial:length(ts)
-        top = i+polinomial;
-
-        if top > length(x)
-            top = length(x);
-        endif
-        if i == top
-            break;
-        endif
-        yi = polyfit(x(i:top),ts(i:top),3);
-        yii = polyval(yi,window*window_count+1:window*(window_count+1)+1);
-        y = [y(1:end-1) yii];
-        window_count = window_count + 1;
-    end
-    toc
-    plot(y, 'r', x,ts, '+');
\ No newline at end of file