Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Sanitize with Loofah. Resolves #320 #351

Merged
merged 2 commits into from
Mar 13, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion gemspec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ def specification(version, default_adapter, platform = nil)
s.add_dependency 'rouge', '~> 3.1'
s.add_dependency 'nokogiri', '~> 1.8'
s.add_dependency 'stringex', '~> 2.6'
s.add_dependency 'sanitize', '~> 2.1.1', '>= 2.1.1'
s.add_dependency 'loofah', '~> 2.3'
s.add_dependency 'github-markup', '~> 3.0'
s.add_dependency 'gemojione', '~> 3.2'
s.add_dependency 'octicons', '~> 8.5'
Expand Down
2 changes: 1 addition & 1 deletion lib/gollum-lib.rb
Original file line number Diff line number Diff line change
Expand Up @@ -13,8 +13,8 @@
# external
require 'github/markup'
require 'erb'
require 'sanitize'
require 'gemojione'
require 'loofah'

# internal
require File.expand_path('../gollum-lib/git_access', __FILE__)
Expand Down
6 changes: 6 additions & 0 deletions lib/gollum-lib/filter.rb
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,7 @@ class Filter
# Setup the object. Sets `@markup` to be the instance of Gollum::Markup that
# is running this filter chain, and sets `@map` to be an empty hash (for use
# in your extract/process operations).

def initialize(markup)
@markup = markup
@map = {}
Expand All @@ -72,6 +73,11 @@ def process(data)
end

protected

def sanitize(data)
@markup.wiki.sanitizer.clean(data, @markup.historical)
end

# Render a (presumably) non-fatal error as HTML
#
def html_error(message)
Expand Down
2 changes: 1 addition & 1 deletion lib/gollum-lib/filter/render.rb
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ def add_editable_header_class(data)
header['class'] = 'editable'
end
end
doc.to_xml(@markup.to_xml_opts)
doc.to_xml(@markup.class.to_xml_opts)
end

end
9 changes: 1 addition & 8 deletions lib/gollum-lib/filter/sanitize.rb
Original file line number Diff line number Diff line change
Expand Up @@ -6,13 +6,6 @@ def extract(data)
end

def process(data)
if @markup.sanitize
doc = Nokogiri::HTML::DocumentFragment.parse(data)
doc = @markup.sanitize.clean_node!(doc)

doc.to_xml(@markup.to_xml_opts).gsub(/<p><\/p>/, '')
else
data
end
sanitize(data)
end
end
10 changes: 2 additions & 8 deletions lib/gollum-lib/filter/tags.rb
Original file line number Diff line number Diff line change
Expand Up @@ -173,14 +173,8 @@ def parse_image_tag_options(options)
# Return the String HTML if the tag is a valid external link tag or
# nil if it is not.
def process_external_link_tag(url, pretty_name = nil)
accepted_protocols = @markup.wiki.sanitization.protocols['a']['href'].dup
if accepted_protocols.include?(:relative)
accepted_protocols.select!{|protocol| protocol != :relative}
regexp = %r{^((#{accepted_protocols.join("|")}):)?(//)}
else
regexp = %r{^((#{accepted_protocols.join("|")}):)}
end
if url =~ regexp
@accepted_protocols_regex ||= %r{^((#{::Gollum::Sanitization.accepted_protocols.join('|')}):)?(//)}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this stuff about :relative no longer relevant?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see Sanitize had a separate flag for allowing :relative, loofah just accepts it.

What I don't really understand is the difference between the two regexes. Previously, if :relative was not allowed, the regex would match URI scheme's of the form http:, and not require http://. I don't know why the double slashes were not required in that case. Any thoughts?

if url =~ @accepted_protocols_regex
generate_link(url, pretty_name, nil, :external)
else
nil
Expand Down
6 changes: 3 additions & 3 deletions lib/gollum-lib/filter/toc.rb
Original file line number Diff line number Diff line change
Expand Up @@ -26,10 +26,10 @@ def process(data)
add_entry_to_toc header, anchor_name
end
if not @toc_doc.nil?
toc_str = @toc_doc.to_xml(@markup.to_xml_opts)
toc_str = @toc_doc.to_xml(@markup.class.to_xml_opts)
end

data = @doc.to_xml(@markup.to_xml_opts)
data = @doc.to_xml(@markup.class.to_xml_opts)
end

@markup.toc = toc_str
Expand All @@ -51,7 +51,7 @@ def process(data)
e.remove
end
end
toc_clone.to_xml(@markup.to_xml_opts)
toc_clone.to_xml(@markup.class.to_xml_opts)
end
end

Expand Down
2 changes: 1 addition & 1 deletion lib/gollum-lib/filter/yaml.rb
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ def extract(data)
data.gsub!(YAML_FRONT_MATTER_REGEXP) do
@markup.metadata ||= {}
begin
frontmatter = ::YAML.safe_load(@markup.sanitize.clean(Regexp.last_match[1]))
frontmatter = ::YAML.safe_load(sanitize(Regexp.last_match[1]))
@markup.metadata.merge!(frontmatter) if frontmatter.respond_to?(:keys) && frontmatter.respond_to?(:values)
rescue ::Psych::SyntaxError, ::Psych::DisallowedClass, ::Psych::BadAlias => error
@markup.metadata['errors'] ||= []
Expand Down
13 changes: 7 additions & 6 deletions lib/gollum-lib/markup.rb
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,10 @@ class Markup

class << self

def to_xml_opts
{ :save_with => Nokogiri::XML::Node::SaveOptions::DEFAULT_XHTML ^ 1, :indent => 0, :encoding => 'UTF-8' }
end

# Only use the formats that are specified in config.rb
def formats
if defined? Gollum::Page::FORMAT_NAMES
Expand Down Expand Up @@ -65,16 +69,15 @@ def register(ext, name, options = {}, &block)
attr_accessor :toc
attr_accessor :metadata
attr_reader :encoding
attr_reader :sanitize
attr_reader :format
attr_reader :wiki
attr_reader :page
attr_reader :parent_page
attr_reader :sub_page
attr_reader :name
attr_reader :include_levels
attr_reader :to_xml_opts
attr_reader :dir
attr_reader :historical

# Initialize a new Markup object.
#
Expand All @@ -92,7 +95,6 @@ def initialize(page)
@page = page
@dir = ::File.dirname(page.path)
@metadata = nil
@to_xml_opts = { :save_with => Nokogiri::XML::Node::SaveOptions::DEFAULT_XHTML ^ 1, :indent => 0, :encoding => 'UTF-8' }
end

# Whether or not this markup's format uses reversed-order links ([description | url] rather than [url | description]). Defaults to false.
Expand Down Expand Up @@ -146,12 +148,11 @@ def process_chain(data, filter_chain)
#
# Returns the formatted String content.
def render(no_follow = false, encoding = nil, include_levels = 10, &block)
@sanitize = no_follow ? @wiki.history_sanitizer : @wiki.sanitizer

@historical = no_follow
@encoding = encoding
@include_levels = include_levels

data = @data.dup
data = @data.dup

filter_chain = @wiki.filter_chain.reject {|filter| skip_filter?(filter)}
filter_chain.map! do |filter_sym|
Expand Down
2 changes: 1 addition & 1 deletion lib/gollum-lib/page.rb
Original file line number Diff line number Diff line change
Expand Up @@ -101,7 +101,7 @@ def name
#
# Returns the fully sanitized String title.
def title
Sanitize.clean(name).strip
@wiki.sanitizer.clean(name).strip
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively we could make a direct Loofah :strip or :prune call here (no need for the postprocessing in the Sanitizer class here).

end

# Public: Determines if this is a sub-page
Expand Down
Loading