Version 1.27 #98

Merged
cdgriffith merged 11 commits into master from develop
Aug 8, 2024

Conversation

cdgriffith
Owner

@cdgriffith cdgriffith commented Aug 8, 2024

@coveralls

coveralls commented Aug 8, 2024

Coverage: 100.0% (remained the same) when pulling 5f3b625 on develop into 72ee164 on master.

@cdgriffith
Owner Author

@NebularNerd can you test this develop branch and make sure the confidence sort works for you this way?

@NebularNerd
Contributor

NebularNerd commented Aug 8, 2024

All works fine. Using my test MP3, we get the correct confidence order.

With my script:

(01) Adamski - Killer.mp3
Most likely match:
Format:        MPEG-1 Audio Layer 3 (MP3) ID3v2.3.0 audio file
Confidence:    80.0%
Extension:     .mp3
MIME:          audio/mpeg
Offset:        10
Bytes Matched: b'ID3\x03\x00\x00\x00\x01 \x1eMCDI'
Hex:           4944 3303 0000 0001 201e 4d43 4449
String:        ID3 MCDI

Alternate match #1
Format:        MPEG-1 Audio Layer 3 (MP3) ID3v2.3.0 audio file
Confidence:    80.0%
Extension:     .mp3
MIME:          audio/mpeg
Offset:        -128
Bytes Matched: b'ID3\x03\x00TAG'
Hex:           4944 3303 0054 4147
String:        ID3TAG

Alternate match #2
Format:        Sprint Music Store audio
Confidence:    70.0%
Extension:     .koz
MIME:
Offset:        0
Bytes Matched: b'ID3\x03\x00\x00\x00'
Hex:           4944 3303 0000 00
String:        ID3

Alternate match #3
Format:        MPEG-1 Audio Layer 3 (MP3) audio file
Confidence:    60.0%
Extension:     .mp3
MIME:          audio/mpeg
Offset:        -128
Bytes Matched: b'ID3TAG'
Hex:           4944 3354 4147
String:        ID3TAG

Alternate match #4
Format:        MPEG-1 Audio Layer 3 ID3v2.3.0 (MP3) audio file
Confidence:    50.0%
Extension:     .mp3
MIME:          audio/mpeg
Offset:        0
Bytes Matched: b'ID3\x03\x00'
Hex:           4944 3303 00
String:        ID3

Alternate match #5
Format:        MPEG-1 Audio Layer 3 (MP3) audio file
Confidence:    30.0%
Extension:     .mp3
MIME:          audio/mpeg
Offset:        0
Bytes Matched: b'ID3'
Hex:           4944 33
String:        ID3

And with your new verbose output:


Total Possible Matches: 6

        Best Match
        name: MPEG-1 Audio Layer 3 (MP3) ID3v2.3.0 audio file
        confidence: 0.9
        extension: .mp3
        mime_type: audio/mpeg
        byte_match: b'ID3\x03\x00\x00\x00\x01 \x1eMCDI'
        offset: 10

        Alertnative Match 1
        name: MPEG-1 Audio Layer 3 (MP3) ID3v2.3.0 audio file
        confidence: 0.9
        extension: .mp3
        mime_type: audio/mpeg
        byte_match: b'ID3\x03\x00TAG'
        offset: -128

        Alertnative Match 2
        name: MPEG-1 Audio Layer 3 (MP3) audio file
        confidence: 0.9
        extension: .mp3
        mime_type: audio/mpeg
        byte_match: b'ID3TAG'
        offset: -128

        Alertnative Match 3
        name: MPEG-1 Audio Layer 3 ID3v2.3.0 (MP3) audio file
        confidence: 0.9
        extension: .mp3
        mime_type: audio/mpeg
        byte_match: b'ID3\x03\x00'
        offset: 0

        Alertnative Match 4
        name: MPEG-1 Audio Layer 3 (MP3) audio file
        confidence: 0.9
        extension: .mp3
        mime_type: audio/mpeg
        byte_match: b'ID3'
        offset: 0

        Alertnative Match 5
        name: Sprint Music Store audio
        confidence: 0.7
        extension: .koz
        mime_type:
        byte_match: b'ID3\x03\x00\x00\x00'
        offset: 0

@cdgriffith
Owner Author

@NebularNerd is your script using an old version of puremagic or doing some other confidence mixes? I see that from 2 on they are different.

@NebularNerd
Contributor

> @NebularNerd is your script using an old version of puremagic or doing some other confidence mixes? I see that from 2 on they are different.

I noticed that as well. I'm running 1.27 in both examples, but I'm a little confused as to why we're getting slightly different results. My script takes the results fed from puremagic and prints them out. What's weird is that the confidences in mine are lower as well; I had always assumed (perhaps incorrectly) that 0.8 was the highest possible match. I would say your verbose output is correct, and I'm going to re-examine my script to see where it's going wrong.
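
For readers who want to reproduce a print-out like the one from the script, here is a minimal sketch of a formatter. It is not NebularNerd's actual script, which is not shown in this thread; the function name `format_match` and the layout rules (hex digits grouped two bytes at a time, only printable ASCII kept in the string line) are assumptions chosen to match the sample output above.

```python
def format_match(byte_match: bytes, confidence: float) -> str:
    """Hypothetical pretty-printer for one match (not part of puremagic)."""
    hex_digits = byte_match.hex()
    # Group the hex digits four at a time, i.e. two bytes per group.
    hex_groups = " ".join(hex_digits[i:i + 4] for i in range(0, len(hex_digits), 4))
    # Keep only printable ASCII so control bytes do not garble the terminal.
    text = "".join(chr(b) for b in byte_match if 32 <= b < 127)
    return "\n".join([
        f"Confidence:    {confidence:.1%}",
        f"Bytes Matched: {byte_match!r}",
        f"Hex:           {hex_groups}",
        f"String:        {text}",
    ])


print(format_match(b"ID3\x03\x00\x00\x00\x01 \x1eMCDI", 0.8))
```

A formatter like this only displays whatever confidence it is handed, so lower numbers in the print-out point to the call that produced the results rather than to the printing step.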

@cdgriffith
Owner Author

@NebularNerd You can only get 0.9 if you also get file extension matching, line 123
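
As an illustration of that point, the sketch below reproduces the scores seen in this thread. The rule is inferred from the numbers posted here (roughly 0.1 per matched byte, capped at 0.8, and 0.9 when the file extension matches); it is not copied from puremagic's source, and the names `Match` and `rough_confidence` are hypothetical.

```python
from typing import NamedTuple, Optional


class Match(NamedTuple):
    """Hypothetical stand-in for a puremagic match before scoring."""
    byte_match: bytes
    extension: str


def rough_confidence(match: Match, file_ext: Optional[str] = None) -> float:
    # Assumed rule: 0.1 per matched byte, capped at 0.8.
    score = min(len(match.byte_match), 8) / 10
    # Assumed rule: a matching file extension lifts the score to 0.9.
    if file_ext is not None and file_ext == match.extension:
        score = 0.9
    return score


# Byte matches and extensions copied from the output in this thread.
matches = [
    Match(b"ID3\x03\x00\x00\x00\x01 \x1eMCDI", ".mp3"),
    Match(b"ID3\x03\x00TAG", ".mp3"),
    Match(b"ID3\x03\x00\x00\x00", ".koz"),
    Match(b"ID3TAG", ".mp3"),
    Match(b"ID3\x03\x00", ".mp3"),
    Match(b"ID3", ".mp3"),
]

# No extension known, as when only a raw stream is inspected.
stream_scores = [rough_confidence(m) for m in matches]
# Extension known, as when a file path ending in .mp3 is inspected.
file_scores = [rough_confidence(m, ".mp3") for m in matches]

print(stream_scores)
print(file_scores)
```

Under these assumptions, every `.mp3` entry rises to 0.9 once the extension is known, while the `.koz` entry stays at 0.7.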

@NebularNerd
Contributor

NebularNerd commented Aug 8, 2024

> @NebularNerd You can only get 0.9 if you also get file extension matching, line 123

I think this may be the issue. Testing with the front-page example:

import puremagic
filename = r"M:\Music-Store\Artist\Adamski\[1999] Killer - The Best Of Adamski\(01) Adamski - Killer.mp3"
print(puremagic.magic_file(filename))

Gives:

[PureMagicWithConfidence(byte_match=b'ID3\x03\x00\x00\x00\x01 \x1eMCDI', offset=10, extension='.mp3', mime_type='audio/mpeg', name='MPEG-1 Audio Layer 3 (MP3) ID3v2.3.0 audio file', confidence=0.9),
 PureMagicWithConfidence(byte_match=b'ID3\x03\x00TAG', offset=-128, extension='.mp3', mime_type='audio/mpeg', name='MPEG-1 Audio Layer 3 (MP3) ID3v2.3.0 audio file', confidence=0.9),
 PureMagicWithConfidence(byte_match=b'ID3TAG', offset=-128, extension='.mp3', mime_type='audio/mpeg', name='MPEG-1 Audio Layer 3 (MP3) audio file', confidence=0.9),
 PureMagicWithConfidence(byte_match=b'ID3\x03\x00', offset=0, extension='.mp3', mime_type='audio/mpeg', name='MPEG-1 Audio Layer 3 ID3v2.3.0 (MP3) audio file', confidence=0.9),
 PureMagicWithConfidence(byte_match=b'ID3', offset=0, extension='.mp3', mime_type='audio/mpeg', name='MPEG-1 Audio Layer 3 (MP3) audio file', confidence=0.9),
 PureMagicWithConfidence(byte_match=b'ID3\x03\x00\x00\x00', offset=0, extension='.koz', mime_type='', name='Sprint Music Store audio', confidence=0.7)]

Which matches yours. My script was using:

with open(filename, "rb") as file:
    y = puremagic.magic_stream(file)
    print(y)
    # Rest of script to give pretty print out

Which gives:

[PureMagicWithConfidence(byte_match=b'ID3\x03\x00\x00\x00\x01 \x1eMCDI', offset=10, extension='.mp3', mime_type='audio/mpeg', name='MPEG-1 Audio Layer 3 (MP3) ID3v2.3.0 audio file', confidence=0.8),
 PureMagicWithConfidence(byte_match=b'ID3\x03\x00TAG', offset=-128, extension='.mp3', mime_type='audio/mpeg', name='MPEG-1 Audio Layer 3 (MP3) ID3v2.3.0 audio file', confidence=0.8),
 PureMagicWithConfidence(byte_match=b'ID3\x03\x00\x00\x00', offset=0, extension='.koz', mime_type='', name='Sprint Music Store audio', confidence=0.7),
 PureMagicWithConfidence(byte_match=b'ID3TAG', offset=-128, extension='.mp3', mime_type='audio/mpeg', name='MPEG-1 Audio Layer 3 (MP3) audio file', confidence=0.6),
 PureMagicWithConfidence(byte_match=b'ID3\x03\x00', offset=0, extension='.mp3', mime_type='audio/mpeg', name='MPEG-1 Audio Layer 3 ID3v2.3.0 (MP3) audio file', confidence=0.5),
 PureMagicWithConfidence(byte_match=b'ID3', offset=0, extension='.mp3', mime_type='audio/mpeg', name='MPEG-1 Audio Layer 3 (MP3) audio file', confidence=0.3)]

If it's just the addition of the extension that is changing the scores, then we have nothing to worry about. After adjusting my test script to use the filename method, I now see the same results. I wrote it very early on and can't think why I went with stream over filename.

EDIT
I sort of remember why I went with stream over filename: it was to ensure that all I was testing was the raw data, without worrying about extensions that may or may not be valid. It really makes no odds now that I think about it.
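
To make the "confidence sort" discussed in this thread concrete, here is a small self-contained sketch. It assumes results are ordered by confidence first, with the longer byte match breaking ties; that ordering is consistent with both listings above but is an inference, and puremagic's actual sort may differ. `Scored` and `sort_matches` are hypothetical names, not library API.

```python
from typing import List, NamedTuple


class Scored(NamedTuple):
    """Hypothetical, trimmed-down stand-in for PureMagicWithConfidence."""
    byte_match: bytes
    extension: str
    confidence: float


def sort_matches(results: List[Scored]) -> List[Scored]:
    # Assumed ordering: highest confidence first, longer byte match breaks ties.
    return sorted(results, key=lambda r: (r.confidence, len(r.byte_match)), reverse=True)


# Scores copied from the magic_file output above, deliberately shuffled.
with_filename = [
    Scored(b"ID3", ".mp3", 0.9),
    Scored(b"ID3\x03\x00\x00\x00", ".koz", 0.7),
    Scored(b"ID3\x03\x00TAG", ".mp3", 0.9),
    Scored(b"ID3\x03\x00\x00\x00\x01 \x1eMCDI", ".mp3", 0.9),
    Scored(b"ID3TAG", ".mp3", 0.9),
    Scored(b"ID3\x03\x00", ".mp3", 0.9),
]

for result in sort_matches(with_filename):
    print(result.confidence, result.extension, result.byte_match)
```

With the extension counted, the five `.mp3` candidates tie at 0.9 and fall back to byte-match length, and the `.koz` candidate drops to last place, which is the order shown in the verbose output.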

@cdgriffith cdgriffith merged commit fecf1f7 into master Aug 8, 2024
20 checks passed
@cdgriffith cdgriffith deleted the develop branch August 8, 2024 19:07