Fix encoding bug in tests on Ruby 3
We need to guess the HTML encoding here; otherwise, some tests fail.

```
Failures:

  1) Readability images should show one image, but outside of the best candidate
     Failure/Error: @input = @input.gsub(REGEXES[:replaceBrsRe], '</p><p>')

     ArgumentError:
       invalid byte sequence in UTF-8
     # ./lib/readability.rb:51:in `gsub'
     # ./lib/readability.rb:51:in `initialize'
     # ./spec/readability_spec.rb:80:in `new'
     # ./spec/readability_spec.rb:80:in `block (3 levels) in <top (required)>'

  2) Readability the cant_read.html fixture should work on the cant_read.html fixture with some allowed tags
     Failure/Error: @input = @input.gsub(REGEXES[:replaceBrsRe], '</p><p>')

     ArgumentError:
       invalid byte sequence in UTF-8
     # ./lib/readability.rb:51:in `gsub'
     # ./lib/readability.rb:51:in `initialize'
     # ./spec/readability_spec.rb:555:in `new'
     # ./spec/readability_spec.rb:555:in `block (3 levels) in <top (required)>'
```

Fixes #87

It also adds the latest Ruby 3 version to CI to test for these sorts of bugs regularly.
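
The failure in the log above can be reproduced in isolation. The following is a minimal sketch, not the gem's code: `DOUBLE_BR` is a hypothetical stand-in for `REGEXES[:replaceBrsRe]`, the sample string is invented, and the manual re-encoding assumes the bytes are ISO-8859-1, whereas the library delegates that guess to `GuessHtmlEncoding.encode`.

```ruby
# Hypothetical stand-in for the gem's REGEXES[:replaceBrsRe] (assumption).
DOUBLE_BR = /<br\s*\/?>\s*<br\s*\/?>/i

# "\xE9" is "é" in ISO-8859-1 but is not a valid byte sequence in UTF-8.
raw = "caf\xE9<br><br>menu".dup.force_encoding(Encoding::UTF_8)

# Matching a regex against a string with a broken encoding raises ArgumentError.
error_message =
  begin
    raw.gsub(DOUBLE_BR, "</p><p>")
    nil
  rescue ArgumentError => e
    e.message
  end

# Assumption: the real source encoding is ISO-8859-1. Relabel the bytes,
# then transcode to UTF-8 so that gsub sees a valid string.
fixed = raw.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
result = fixed.gsub(DOUBLE_BR, "</p><p>")
```

Once the input carries a valid encoding, the same `gsub` call succeeds, which is why skipping the encoding guess on Ruby 3 caused the spec failures.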
apainintheneck committed Oct 21, 2024
1 parent 8abe36e commit 51d8118
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ruby.yml
@@ -12,7 +12,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
-ruby-version: ['2.7']
+ruby-version: ['2.7', '3.3']

steps:
- uses: actions/checkout@v2
2 changes: 1 addition & 1 deletion lib/readability.rb
@@ -43,7 +43,7 @@ def initialize(input, options = {})
@options = DEFAULT_OPTIONS.merge(options)
@input = input

-if RUBY_VERSION =~ /^(1\.9|2)/ && !@options[:encoding]
+if RUBY_VERSION =~ /^(1\.9|2|3)/ && !@options[:encoding]
@input = GuessHtmlEncoding.encode(@input, @options[:html_headers]) unless @options[:do_not_guess_encoding]
@options[:encoding] = @input.encoding.to_s
end
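
To see which interpreter versions the updated guard covers, the pattern can be checked against a few version strings. This is only an illustrative sketch: the sample list (including `4.0.0`) is invented, and `guess_by_default?` is a hypothetical open-ended alternative built on `Gem::Version`, not something this commit introduces.

```ruby
require "rubygems"

# The guard pattern from the updated line in lib/readability.rb.
VERSION_GUARD = /^(1\.9|2|3)/

# Invented sample version strings, for illustration only.
samples = %w[1.8.7 1.9.3 2.7.8 3.3.5 4.0.0]
guarded = samples.select { |v| v.match?(VERSION_GUARD) }

# Hypothetical alternative (not part of this commit): compare versions
# numerically so future major versions need no further edits.
def guess_by_default?(version_string)
  Gem::Version.new(version_string) >= Gem::Version.new("1.9")
end

open_ended = samples.select { |v| guess_by_default?(v) }
```

With these samples, the regex selects the 1.9, 2.x, and 3.x entries, while the comparison-based check also accepts the hypothetical 4.0.0.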