This algorithm is designed for the voice database copyright encryption project of Florida's signal and image processing lab.
Our database holds tens of thousands of voice records of teachers and students, taken from their lectures, daily conversations, and celebrations. We would like to embed copyright information in these records to protect the privacy of our teachers and students.
Note that all the voice records in the database are recorded in mono. We have several requirements:
- Since some voice records are quite long, they may be used after being sliced into frames. Our encryption algorithm should protect as many pieces as possible.
- The encryption algorithm should be robust to noise, as voice data might change after copying or encoding/decoding.
- We do not strictly require zero distortion, but human ears should not be able to distinguish the original voice record from the encrypted voice record.
- The decryption procedure should not depend on the original voice record.

The encryption procedure works as follows:
- Voice data is read at 44100 Hz (we assume that all the voice records satisfy the VCD standard).
- Voice data is cut into frames of 20 ms each (882 samples per frame at 44100 Hz).
- For each frame, compute the DWT first and then the DCT.
- In the DCT domain, find the largest amplitude "a" (the indicator) and the second- to fourth-largest amplitudes "b" through "d" (the encoders).
- Divide the indicator a into an array of pieces with step length alpha=0.2 and find the piece closest to encoder b. We call this piece "p".
- Encrypt the encoders by changing their signs according to the index of p and the encoding information provided.
- Convert the copyright information to binary and repeat it circularly so that the frames and the bits of the copyright information are in one-to-one correspondence.
- Encode each frame with its corresponding bit, following the indicator and encoder procedure described above.
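
The transform and selection steps above can be sketched in a few lines. The Python below is an illustration only, not the project's code: the source does not name the wavelet, the decomposition level, or the subband passed to the DCT, so a one-level Haar DWT and its approximation band are assumed here, and `haar_dwt`, `dct2`, and `top_amplitudes` are hypothetical helper names.

```python
import math

def haar_dwt(frame):
    """One-level Haar DWT (assumed wavelet); returns (approximation, detail)."""
    s = math.sqrt(2.0)
    pairs = range(0, len(frame) - 1, 2)
    approx = [(frame[i] + frame[i + 1]) / s for i in pairs]
    detail = [(frame[i] - frame[i + 1]) / s for i in pairs]
    return approx, detail

def dct2(x):
    """Orthonormal DCT-II computed directly from its definition (O(N^2))."""
    n = len(x)
    out = []
    for k in range(n):
        total = sum(x[j] * math.cos(math.pi * (j + 0.5) * k / n) for j in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out

def top_amplitudes(coeffs, count=4):
    """Indices of the `count` coefficients with the largest absolute value."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    return order[:count]

# One 20 ms frame (882 samples at 44100 Hz) of a 440 Hz test tone.
frame = [math.sin(2 * math.pi * 440 * t / 44100) for t in range(882)]
approx, _ = haar_dwt(frame)          # 441 approximation coefficients
coeffs = dct2(approx)                # DCT of the assumed subband
a_idx, b_idx, c_idx, d_idx = top_amplitudes(coeffs)  # indicator a, encoders b-d
```

The direct DCT is quadratic in the frame length, which is acceptable for 441 coefficients per frame; a library FFT-based DCT would be the usual choice in practice.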
The main functions are:
motified_amplitude=quan(max_amplitude,encoder,step_length_alpha,encoding_information)
% return the modified encoder
frames=enframe(target,sample_number_each_frame)
% return a list of frames, each containing sample_number_each_frame samples
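
As a rough illustration of the framing step, the Python sketch below mirrors the `enframe` signature. It assumes non-overlapping frames and drops any trailing partial frame; the project's own `enframe` may overlap or pad frames instead. `SAMPLE_RATE`, `FRAME_MS`, and `FRAME_LEN` are names introduced here for the example.

```python
SAMPLE_RATE = 44100                           # Hz
FRAME_MS = 20                                 # frame duration in milliseconds
FRAME_LEN = SAMPLE_RATE * FRAME_MS // 1000    # samples per frame

def enframe(target, sample_number_each_frame):
    """Split a sample list into consecutive, non-overlapping frames.

    Assumption: a trailing partial frame is discarded.
    """
    n_frames = len(target) // sample_number_each_frame
    return [
        target[i * sample_number_each_frame:(i + 1) * sample_number_each_frame]
        for i in range(n_frames)
    ]

# 2000 dummy samples give two full frames; the remaining 236 samples are dropped.
frames = enframe(list(range(2000)), FRAME_LEN)
```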
watermark_info_circle=wm(copyright_string,total_frame_num)
% return circularly expressed binary copyright information
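
A minimal Python sketch of the circular binary expression follows. It assumes the string is encoded as UTF-8 with 8 bits per byte, most significant bit first; the source does not state the character encoding or bit order, so both are assumptions.

```python
def wm(copyright_string, total_frame_num):
    """Return one watermark bit per frame by cycling through the string's bits.

    Assumptions: UTF-8 bytes, most significant bit first.
    """
    data = copyright_string.encode("utf-8")
    if not data:
        raise ValueError("copyright_string must not be empty")
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    return [bits[i % len(bits)] for i in range(total_frame_num)]

# "A" is 0x41 = 01000001, so ten frames carry the eight bits and then wrap around.
watermark_info_circle = wm("A", 10)
```

Because the bit sequence repeats with a fixed period, any sufficiently long run of consecutive frames contains every bit of the message at least once.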
watermark_info_circle_recovered=dwm(watermark_info_circle)
% decode the circularly expressed binary information into a circularly expressed string
watermark_info=findwm(watermark_info_circle_recovered)
% return the exact watermark information from the circular expression, confirming it by majority vote
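
The majority principle can be illustrated with the Python sketch below. It is not the project's `dwm`/`findwm`: for simplicity it assumes the message length in bits (`period`) is known and that the recovered stream starts at the first bit of the message. `majority_bits` and `bits_to_text` are hypothetical names.

```python
def majority_bits(bits, period):
    """Vote across all repetitions of a circular bit stream, per bit position."""
    votes = [0] * period                      # ones minus zeros for each position
    for i, bit in enumerate(bits):
        votes[i % period] += 1 if bit else -1
    return [1 if v > 0 else 0 for v in votes]  # ties resolve to 0

def bits_to_text(bits):
    """Pack bits (MSB first) into bytes and decode them as UTF-8."""
    usable = len(bits) - len(bits) % 8
    data = bytes(
        int("".join(str(b) for b in bits[i:i + 8]), 2)
        for i in range(0, usable, 8)
    )
    return data.decode("utf-8", errors="replace")

# "Hi" repeated five times, with three bits corrupted in different repetitions.
clean = [0, 1, 0, 0, 1, 0, 0, 0,  0, 1, 1, 0, 1, 0, 0, 1]
stream = clean * 5
for pos in (3, 20, 41):
    stream[pos] ^= 1
recovered = bits_to_text(majority_bits(stream, len(clean)))
```

Each corrupted position is outvoted by the four intact repetitions, which is why repeating the copyright string many times across the record makes recovery tolerant of isolated bit errors.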
amplitude=dequan(max_amplitude,motified_encoder,step_length_alpha)
% decode the binary information in each frame
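
The exact sign rule used by `quan` and `dequan` is not spelled out here, so the Python sketch below shows only one possible reading. Assumptions: piece p is the multiple of alpha*|a| nearest to |b|; the encoders b, c, and d all receive a positive sign when the parity of p equals the bit and a negative sign otherwise; decoding takes the majority sign of the three encoders. Unlike the signatures above, the hypothetical `embed_bit` and `extract_bit` take the whole DCT coefficient list of a frame (at least four values).

```python
def ranked(coeffs):
    """Coefficient indices sorted by decreasing absolute value."""
    return sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)

def nearest_piece(a, b, alpha=0.2):
    """Index of the multiple of alpha*|a| that is closest to |b|."""
    step = alpha * abs(a)
    return int(round(abs(b) / step)) if step > 0 else 0

def embed_bit(coeffs, bit, alpha=0.2):
    """Return a copy of coeffs whose encoders b, c, d carry `bit` in their signs."""
    order = ranked(coeffs)
    p = nearest_piece(coeffs[order[0]], coeffs[order[1]], alpha)
    sign = 1.0 if p % 2 == bit else -1.0      # assumed parity rule
    out = list(coeffs)
    for i in order[1:4]:                      # encoders b, c, d
        out[i] = sign * abs(coeffs[i])
    return out

def extract_bit(coeffs, alpha=0.2):
    """Recover the bit from the encoder signs, without the original signal."""
    order = ranked(coeffs)
    p = nearest_piece(coeffs[order[0]], coeffs[order[1]], alpha)
    positives = sum(1 for i in order[1:4] if coeffs[i] >= 0)
    return p % 2 if positives >= 2 else 1 - p % 2

frame_coeffs = [0.1, -0.9, 0.5, 0.3, -0.05]
marked = embed_bit(frame_coeffs, 1)
decoded = extract_bit(marked)
```

Only signs change in this sketch, so the magnitudes, the ranking of the coefficients, and therefore p are the same at the decoder as at the encoder. That is what lets decoding proceed without the original voice record.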
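Full code is in the example folder. Because the copyright information is embedded circularly, it can be recovered no matter how the voice record is cut: at most 882 iterations are needed to work out the real copyright information, one for each possible frame start offset within a 20 ms frame (882 samples at 44100 Hz).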