Fixing character encoding conversion using Encode and other small fixes #4

Open: wants to merge 2 commits into base: master
21 changes: 14 additions & 7 deletions README
@@ -1,13 +1,14 @@
-sub2srt - Convert subtitles from ".sub" to subviewer ".srt" format
+sub2srt - Convert subtitles from ".sub" to SubRip ".srt" format


(c) 2003-2005 Roland "Robelix" Obermayer <roland\@robelix.com>
Project Homepage: http://www.robelix.com/sub2srt/


-sub2srt is a simple tool to convert 2 common subtitle formats (microdvd
-and subrip - both are known as ".sub") to subviewer ".srt" format.
+sub2srt is a simple tool to convert 2 common subtitle formats (MicroDVD
+and SubViewer - both are known as ".sub") to SubRip ".srt" format.
This is the format ogmmerge accepts for multiplexing into ogm streams.
+This format is also used by mkvmerge for multiplexing into mkv streams.


This is Beta Software!
@@ -20,14 +21,14 @@ Please report bugs, problems, patches... to [email protected]

What it does not:
-----------------
-It does not - and will never - convert DVD-subtitles or vobsub.
+It does not - and will never - convert DVD-subtitles or VobSub.
Google on if you need this ;)


Installation:
-------------
Not necessary ;) - it's just a single perl-script.
-But you may want to copy sub2srt to a directory that's in your $PATH -
+But you may want to copy sub2srt to a directory that's in your $PATH -
like /usr/local/bin or /usr/bin.


@@ -49,14 +50,20 @@ Have a look to the file COPYING included in the archive.

Changelog:
----------
+01 Aug 2015: PhobosK
+* sub2srt-0.5.6
+- Fixed character encoding conversion using Encode instead of piconv
+- Fixed subtitle naming mix-up
+- Fixed some spelling mistakes
+
10 Dec 2014: Toni Ahola
* sub2srt-0.5.5
- Added support to encode output with piconv

02 Nov 2013: Krzysztof Trybowski
* sub2srt-0.5.4
- Added support for 2 input formats "mpl2" and "tmp"

15 Jan 2005: Roland Obermayer <[email protected]>
* sub2srt-0.5.3
- Added support for a third input format "txtsub"
@@ -74,7 +81,7 @@ Changelog:

31 Aug 2003: Roland Obermayer <[email protected]>
* sub2srt-0.5.1
-- Bugfix in the microdvd conversion routine
+- Bugfix in the MicroDVD conversion routine

26 Aug 2003: Roland Obermayer <[email protected]>
* sub2srt-0.5
78 changes: 52 additions & 26 deletions sub2srt
@@ -1,6 +1,6 @@
#!/usr/bin/perl -w

-# sub2srt - Convert subtitles from microdvd or subrip ".sub" to subviewer ".srt" format
+# sub2srt - Convert subtitles from MicroDVD or SubViewer ".sub" to SubRip ".srt" format
# (c) 2003-2005 Roland "Robelix" Obermayer <[email protected]>
#
# This program is free software; you can redistribute it and/or modify
@@ -19,10 +19,13 @@

use strict;
use warnings;
-my $version = "0.5.5";
+my $version = "0.5.6";

use Getopt::Long;
Getopt::Long::Configure("pass_through","no_ignore_case");
+use Encode("from_to","find_encoding");
+use File::Copy("mv");
+use File::Temp;
my $help = 0;
my $fps = 25;
my $showvers = 0;
@@ -66,7 +69,7 @@ my $infile = shift || '';
if (!$infile) { help(); }

my $outfile = shift || '';
-if (!$outfile) {
+if (!$outfile) {
$outfile = $infile;
$outfile =~ s/(\.sub|\.txt)$//i;
$outfile .= ".srt";
@@ -107,8 +110,8 @@ print "Converting from $format to srt\n" if ($format ne "srt" && !$quiet);
open INFILE, "$infile" or die "Unable to open $infile for reading\n";
open OUTFILE, ">$outfile" or die "Unable to open $outfile for writing\n";

-if ($format eq "subrip") {
-conv_subrip();
+if ($format eq "subviewer") {
+conv_subviewer();
}
elsif ($format eq "microdvd") {
conv_microdvd();
@@ -123,7 +126,7 @@ elsif ($format eq "tmp") {
conv_tmp();
}
elsif ($format eq "srt") {
-print "Input file is already subviewer srt format.\n";
+print "Input file is already SubRip srt format.\n";
}


@@ -132,14 +135,37 @@ close OUTFILE;


if($convert) {
-my $tmpfile = tmpnam();
-system("mv $outfile $tmpfile");
-system("piconv -f $fenc -t $tenc < $tmpfile > $outfile");
-system("rm $tmpfile");
+# Check if $fenc and $tenc are valid
+if (!find_encoding($fenc)) {
+print "--> $fenc <-- is not a valid From encoding. Encoding conversion skipped.\n";
+exit 0;
+}
+if (!find_encoding($tenc)) {
+print "--> $tenc <-- is not a valid To encoding. Encoding conversion skipped.\n";
+exit 0;
+}
+
+my $tmpfile = tmpnam();
+
+open(INPUT, "< :raw", $outfile)
+or die "Unable to open < $outfile for reading: $!\n. Encoding conversion skipped.\n";
+open(OUTPUT, "> :raw", $tmpfile)
+or die "Unable to open > $tmpfile for writing: $!\n. Encoding conversion skipped.\n";
+while (<INPUT>) {
+from_to($_, $fenc, $tenc, Encode::FB_CROAK);
+print OUTPUT;
+}
+
+close INPUT or die "Unable to close $outfile: $!\n";
+close OUTPUT or die "Unable to close $tmpfile: $!\n";
+
+mv($tmpfile, $outfile);
+
+print "Encoding conversion from \"$fenc\" to \"$tenc\" done.\n" if (!$quiet);
}


-sub conv_subrip {
+sub conv_subviewer {
my $converted = 0;
my $failed = 0;
while (my $line1 = <INFILE>) {
@@ -348,7 +374,7 @@ sub write_srt {

sub frames_2_time {
# convert frames to time
-# used for microdvd format
+# used for MicroDVD format
my $frames = shift;
my $seconds = $frames / $fps;
my $ms = ($seconds - int($seconds)) * 1000;
@@ -395,17 +421,17 @@ sub detect_format {
$line =~ s/[\n\r]*$//;
print " Trying line $i: $line \n" if $debug;

-# microdvd format
+# MicroDVD format
# looks like:
# {startframe}{endframe}Text

if ( $line =~ m/^\{\d+\}\{\d+\}.+$/ ) {
-print " seems to be microdvd format\n" if ($debug);
+print " seems to be MicroDVD format\n" if ($debug);
my $line2 = <INFILE>;
$line2 =~ s/[\n\r]*$//;
print " checking next line: $line2\n" if ($debug);
if ($line2 =~ m/^\{\d+\}\{\d+\}.+$/) {
-print "microdvd format detected!\n" if ($debug);
+print "MicroDVD format detected!\n" if ($debug);
$detected = "microdvd";
}
}
@@ -442,14 +468,14 @@ sub detect_format {
}
}

-# trying subrip format
+# trying SubViewer format
# 3 lines:
# hh:mm:ss.ms,hh:mm:ss.ms
# text
# (empty line)

if ($line =~ m/^\d\d:\d\d:\d\d\.\d\d,\d\d:\d\d:\d\d\.\d\d$/) {
-print " seems to be subrip format\n" if ($debug);
+print " seems to be SubViewer format\n" if ($debug);
my $line2 = <INFILE>;
$line2 =~ s/[\n\r]*$//;
my $line3 = <INFILE>;
@@ -458,15 +484,15 @@ sub detect_format {
$line4 =~ s/[\n\r]*$//;
print " checking the next lines:\n $line2\n $line3\n $line4\n" if ($debug);
if ($line2 =~ m/^.+$/ && $line3 =~ m/^$/ && $line4 =~ m/^\d\d:\d\d:\d\d\.\d\d,\d\d:\d\d:\d\d\.\d\d$/) {
-print "subrip format detected!\n" if ($debug);
-$detected = "subrip";
+print "SubViewer format detected!\n" if ($debug);
+$detected = "subviewer";
}
}

-# trying subviewer .srt format
+# trying SubRip .srt format

if ($line =~ m/^\d\d:\d\d:\d\d\,\d\d\d\s-->\s\d\d:\d\d:\d\d\,\d\d\d$/) {
-print "subviewer .srt format detected!\n" if ($debug);
+print "SubRip .srt format detected!\n" if ($debug);
$detected = "srt";
}

@@ -477,7 +503,7 @@ sub detect_format {
# subtitle-text
# [endtime]
# (the endtime can be the starttime of the next sub)
-# I've seen two variants with slightly diffrent time-formats
+# I've seen two variants with slightly different time-formats
# a) [00:02:05.000]
# b) [00:02:05]
# Both are supported
@@ -507,7 +533,7 @@ print <<__HELP__;

sub2srt [options] inputfile.sub [outputfile.srt]

-Convert subrip and microdvd ".sub" subtitle files to subviewer ".srt" format
+Convert SubViewer and MicroDVD ".sub" subtitle files to SubRip ".srt" format
(the format accepted by ogmmerge for multiplexing into ogm files)


@@ -516,7 +542,7 @@ Options:
-v --version Display Program version.
-l --license Display License information.

--f=n --fps=n Fps to be used if input file is frame-based microdvd-format
+-f=n --fps=n Fps to be used if input file is frame-based MicroDVD-format
Default: 25 fps. Ignored if input format is time-based.

-n --ntsc Sets the framerate to 29.97 fps. Overrides --fps.
@@ -528,7 +554,7 @@ Options:
-c --convert To convert character encoding.

--fenc=s From encoding. Overrides ISO-8859-1
---tenc=s To enconding. Overrides UTF-8
+--tenc=s To encoding. Overrides UTF-8

--force Overwrite existing files without prompt

@@ -538,7 +564,7 @@ Options:

inputfile.sub
Input file
-Both types usally have the ending .sub, the format is autodetected.
+Both types usually have the ending .sub, the format is auto-detected.


[outputfile.srt]
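
For anyone reviewing the encoding change in isolation, here is a minimal standalone sketch of the Encode-based conversion the patch switches to. It is an illustration only, not code from this PR: `convert_bytes` is a hypothetical helper name, and unlike the patch it returns a status instead of printing and exiting. It uses the same `find_encoding` check and the same `from_to(..., Encode::FB_CROAK)` call as the diff.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(from_to find_encoding);

# Hypothetical helper (not part of the patch): convert a byte string in
# place from one encoding to another. Returns 1 on success, 0 on failure.
sub convert_bytes {
    my ($bytes_ref, $from, $to) = @_;

    # find_encoding returns undef for names Encode does not recognise,
    # which is how the patch validates --fenc and --tenc.
    return 0 unless find_encoding($from) && find_encoding($to);

    # With Encode::FB_CROAK a failed conversion dies, so trap it with eval.
    my $ok = eval { from_to($$bytes_ref, $from, $to, Encode::FB_CROAK); 1 };
    return $ok ? 1 : 0;
}

# Example: "café" plus newline as ISO-8859-1 bytes, converted to UTF-8.
my $line = "caf\xE9\n";
if (convert_bytes(\$line, "ISO-8859-1", "UTF-8")) {
    printf "converted to %d bytes\n", length($line);
} else {
    print "conversion failed\n";
}
```

Because `from_to` rewrites its first argument in place, the helper takes a reference; the loop in the patch gets the same effect by operating directly on `$_`.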