Skip to content

Latest commit

 

History

History
723 lines (601 loc) · 29.5 KB

CHANGELOG.rdoc

File metadata and controls

723 lines (601 loc) · 29.5 KB

Mechanize CHANGELOG

2.1.1 / ??

  • Bug fixes

    • Meta refresh URIs are now escaped (excluding %). Issue #177

2.1 / 2011-12-20

  • Deprecations

    • Mechanize#get no longer accepts an options hash.

    • Mechanize::Util::to_native_charset has been removed.

  • Minor enhancements

    • Mechanize now depends on net-http-persistent 2.3+. This new version brings idle timeouts to help with the dreaded “too many connection resets” issue when POSTing to a closed connection. Issue #123

    • SSL connections will be verified against the system certificate store by default.

    • Added Mechanize#retry_change_requests to allow mechanize to retry POST and other non-idempotent requests when you know it is safe to do so. Issue #123

    • Mechanize can now stream files directly to disk without loading them into memory first through Mechanize::Download, a pluggable parser for downloading files.

      All responses larger than Mechanize#max_file_buffer are downloaded to a Tempfile. For backwards compatibility Mechanize::File subclasses still load the response body into memory.

      To force all unknown content types to download to disk instead of memory set:

      agent.pluggable_parser.default = Mechanize::Download
      
    • Added Mechanize#content_encoding_hooks which allow handling of non-standard content encodings like “agzip”. Patch #125 by kitamomonga

    • Added dom_class to elements and the element matcher like dom_id. Patch #156 by Dan Hansen.

    • Added support for the HTML5 keygen form element. See dev.w3.org/html5/spec/Overview.html#the-keygen-element Patch #157 by Victor Costan.

    • Mechanize no longer follows meta refreshes that have no “url=” in the content attribute to avoid infinite loops. To follow a meta refresh to the same page set Mechanize#follow_meta_refresh_self to true. Issue #134 by Jo Hund.

    • Updated ‘Mac Safari’ User-Agent alias to Safari 5.1.1. ‘Mac Safari 4’ can be used for the old ‘Mac Safari’ alias.

    • When given multiple HTTP authentication options mechanize now picks the strongest method.

    • Improvements to HTTP authorization:

      • mechanize raises Mechanize::UnathorizedError for 401 responses which is a sublcass of Mechanize::ResponseCodeError.

      • Added support for NTLM authentication, but this has not been tested.

    • Mechanize::Cookie.new accepts attributes in a hash.

    • Mechanize::CookieJar#<<(cookie) (alias: add!) is added. Issue #139

    • Different mechanize instances may now have different loggers. Issue #122

    • Mechanize now accepts a proxy port as a service name or number string. Issue #167

  • Bug fixes

    • Mechanize now handles cookies just as most modern browsers do, roughly based on RFC 6265.

      • domain=.example.com (which is invalid) is considered identical to domain=example.com.

      • A cookie with domain=example.com is sent to host.sub.example.com as well as host.example.com and example.com.

      • A cookie with domain=TLD (no dots) is accepted and sent if the host name is TLD, and rejected otherwise. To retain compatibility and convention, host/domain names starting with “local” are exempt from this rule.

      • A cookie with no domain attribute is only sent to the original host.

      • A cookie with an Effective TLD is rejected based on the public suffix list. (cf. publicsuffix.org/)

      • “Secure” cookies are not sent via non-https connection.

      • Subdomain match is not performed against an IP address.

      • It is recommended that you clear out existing cookie jars for regeneration because previously saved cookies may not have been parsed correctly.

    • Mechanize takes more care to avoid saving files with certain unsafe names. You should still take care not to use mechanize to save files directly into your home directory ($HOME). Issue #163.

    • Mechanize#cookie_jar= works again. Issue #126

    • The original Referer value persists on redirection. Issue #150

    • Do not send a referer on a Refresh header based redirection.

    • Fixed encoding error in tests when LANG=C. Patch #142 by jinschoi.

    • The order of items in a form submission now match the DOM order. Patch #129 by kitamomonga

    • Fixed proxy example in EXAMPLE. Issue #146 by NielsKSchjoedt

2.0.1 / 2011-06-28

Mechanize now uses minitest to avoid 1.9 vs 1.8 assertion availability in test/unit

  • Bug Fixes

    • Restored Mechanize#set_proxy. Issue #117, #118, #119

    • Mechanize::CookieJar#load now lazy-loads YAML. Issue #118

    • Mechanize#keep_alive_time no longer crashes but does nothing as net-http-persistent does not support HTTP/1.0 keep-alive extensions.

2.0 / 2011-06-27

Mechanize is now under the MIT license

  • API changes

    • WWW::Mechanize has been removed. Use Mechanize.

    • Pre connect hooks are now called with the agent and the request. See Mechanize#pre_connect_hooks.

    • Post connect hooks are now called with the agent and the response. See Mechanize#post_connect_hooks.

    • Mechanize::Chain is gone, as an internal API this should cause no problems.

    • Mechanize#fetch_page no longer accepts an options Hash.

    • Mechanize#put now accepts headers instead of an options Hash as the last argument

    • Mechanize#delete now accepts headers instead of an options Hash as the last argument

    • Mechanize#request_with_entity now accepts headers instead of an options Hash as the last argument

    • Mechanize no longer raises RuntimeError directly, Mechanize::Error or ArgumentError are raised instead.

    • The User-Agent header has changed. It no longer includes the WWW- prefix and now includes the ruby version. The URL has been updated as well.

    • Mechanize now requires ruby 1.8.7 or newer.

    • Hpricot support has been removed as webrobots requires nokogiri.

    • Mechanize#get no longer accepts the referer as the second argument.

    • Mechanize#get no longer allows the HTTP method to be changed (:verb option).

    • Mechanize::Page::Meta is now Mechanize::Page::MetaRefresh to accurately depict its responsibilities.

    • Mechanize::Page#meta is now Mechanize::Page#meta_refresh as it only contains meta elements with http-equiv of “refresh”

    • Mechanize::Page#charset is now Mechanize::Page::charset. GH #112, patch by Godfrey Chan.

  • Deprecations

    • Mechanize#get with an options hash is deprecated and will be removed after October, 2011.

    • Mechanize::Util::to_native_charset is deprecated as it is no longer used by Mechanize.

  • New Features

    • Add header reference methods to Mechanize::File so that a reponse object gets compatible with Net::HTTPResponse.

    • Mechanize#click accepts a regexp or string to click a button/link in the current page. It works as expected when not passed a string or regexp.

    • Provide a way to only follow permanent redirects (301) automatically: agent.redirect_ok = :permanent GH #73

    • Mechanize now supports HTML5 meta charset. GH #113

    • Documented various Mechanize accessors. GH #66

    • Mechanize now uses net-http-digest_auth. GH #31

    • Mechanize now implements session cookies. GH #78

    • Mechanize now implements deflate decoding. GH #40

    • Mechanize now allows a certificate and key to be passed directly. GH #71

    • Mechanize::Form::MultiSelectList now implements #option_with and #options_with. GH #42

    • Add Mechanize::Page::Link#rel and #rel?(kind) to read and test the rel attribute.

    • Add Mechanize::Page#canonical_uri to read a </tt><link rel=“canonical”></tt> tag.

    • Add support for Robots Exclusion Protocol (i.e. robots.txt) and nofollow/noindex in meta tags and the rel attribute. Automatic exclusion can be turned on by setting:

      agent.robots = true
      
    • Manual robots.txt test can be performed with Mechanize#robots_allowed? and #robots_disallowed?.

    • Mechanize::Form now supports the accept-charset attribute. GH #96

    • Mechanize::ResponseReadError is raised if there is an exception while reading the response body. This allows recovery from broken HTTP servers (or connections). GH #90

    • Mechanize#follow_meta_refresh set to :anywhere will follow meta refresh found outside of a document’s head. GH #99

    • Add support for HTML5’s rel=“noreferrer” attribute which indicates no “Referer” information should be sent when following the link.

    • A frame will now load its content when #content is called. GH #111

    • Added Mechanize#default_encoding to provide a default for pages with no encoding specified. GH #104

    • Added Mechanize#force_default_encoding which only uses Mechanize#default_encoding for parsing HTML. GH #104

  • Bug Fixes:

    • Fixed a bug where Referer is not sent when accessing a relative URI starting with “http”.

    • Fix handling of Meta Refresh with relative paths. GH #39

    • Mechanize::CookieJar now supports RFC 2109 correctly. GH #85

    • Fixed typo in EXAMPLES.rdoc. GH #74

    • The base element is now handled correctly for images. GH #72

    • Image buttons with no name attribute are now included in the form’s button list. GH#56

    • Improved handling of non ASCII-7bit compatible characters in links (only an issue on ruby 1.8). GH #36, GH #75

    • Loading cookies.txt is faster. GH #38

    • Mechanize no longer sends cookies for a.b.example to axb.example. GH #41

    • Mechanize no longer sends the button name as a form field for image buttons. GH #45

    • Blank cookie values are now skipped. GH #80

    • Mechanize now adds a ‘.’ to cookie domains if no ‘.’ was sent. This is not allowed by RFC 2109 but does appear in RFC 2965. GH #86

    • file URIs are now read in binary mode. GH #83

    • Content-Encoding: x-gzip is now treated like gzip per RFC 2616.

    • Mechanize now unescapes URIs for meta refresh. GH #68

    • Mechanize now has more robust HTML charset detection. GH #43

    • Mechanize::Form::Textarea is now created from a textarea element. GH #94

    • A meta content-type now overrides the HTTP content type. GH #114

    • Mechanize::Page::Link#uri now handles both escaped and unescaped hrefs. GH #107

1.0.0

  • New Features:

    • An optional verb may be passed to Mechanize#get GH #26

    • The WWW constant is deprecated. Switch to the top level constant Mechanize

    • SelectList#option_with and options_with for finding options

  • Bug Fixes:

    • Rescue errors from bogus encodings

    • 7bit content-encoding support. Thanks sporkmonger! GH #2

    • Fixed a bug with iconv conversion. Thanks awesomeman! GH #9

    • meta redirects outside the head are not followed. GH #13

    • Form submissions work with nil page encodings. GH #25

    • Fixing default values with serialized cookies. GH #3

    • Checkboxes and fields are sorted by page appearance before submitting. #11

0.9.3

  • Bug Fixes:

    • Do not apply encoding if encoding equals ‘none’ Thanks Akinori MUSHA!

    • Fixed Page#encoding= when changing the value from or to nil. Made it return the assigned value while at it. (Akinori MUSHA)

    • Custom request headers may be supplied WWW::Mechanize#request_headers RF #24516

    • HTML Parser may be set on a per instance level WWW::Mechanize#html_parser RF #24693

    • Fixed string encoding in ruby 1.9. RF #2433

    • Rescuing Zlib::DataErrors (Thanks Kelley Reynolds)

    • Fixing a problem with frozen SSL objects. RF #24950

    • Do not send a referer on meta refresh. RF #24945

    • Fixed a bug with double semi-colons in Content-Disposition headers

    • Properly handling cookies that specify a path. RF #25259

0.9.2 / 2009/03/05

  • New Features:

    • Mechanize#submit and Form#submit take arbitrary headers(thanks penguincoder)

  • Bug Fixes:

    • Fixed a bug with bad cookie parsing

    • Form::RadioButton#click unchecks other buttons (RF #24159)

    • Fixed problems with Iconv (RF #24190, RF #24192, RF #24043)

    • POST parameters should be CGI escaped

    • Made Content-Type match case insensitive (Thanks Kelly Reynolds)

    • Non-string form parameters work

0.9.1 2009/02/23

  • New Features:

    • Encoding may be specified for a page: Page#encoding=

  • Bug Fixes:

    • m17n fixes. ありがとう konn!

    • Fixed a problem with base tags. ありがとう Keisuke

    • HEAD requests do not record in the history

    • Default encoding to ISO-8859-1 instead of ASCII

    • Requests with URI instances should not be polluted RF #23472

    • Nonce count fixed for digest auth requests. Thanks Adrian Slapa!

    • Fixed a referer issue with requests using a uri. RF #23472

    • WAP content types will now be parsed

    • Rescued poorly formatted cookies. Thanks Kelley Reynolds!

0.9.0

  • Deprecations

    • WWW::Mechanize::List is gone!

    • Mechanize uses Nokogiri as it’s HTML parser but you may switch to Hpricot by using WWW::Mechanize.html_parser = Hpricot

  • Bug Fixes:

    • Nil check on page when base tag is used #23021

0.8.5

  • Deprecations

    • WWW::Mechanize::List will be deprecated in 0.9.0, and warnings have been added to help you upgrade.

  • Bug Fixes:

    • Stopped raising EOF exceptions on HEAD requests. ありがとう:HIRAKU Kuroda

    • Fixed exceptions when a logger is set and file:// requests are made.

    • Made Mechanize 1.9 compatible

    • Not setting the port in the host header for SSL sites.

    • Following refresh headers. Thanks Tim Connor!

    • Cookie Jar handles cookie domains containing ports, like ‘mydomain.com:443’ (Thanks Michal Ochman!)

    • Fixing strange uri escaping problems [#22604]

    • Making content-type determintation more robust. (thanks Han Holl!)

    • Dealing with links that are query string only. [#22402]

    • Nokogiri may be dropped in as a replacement.

      WWW::Mechanize.html_parser = Nokogiri::HTML
      
    • Making sure the correct page is added to the history on meta refresh.

      #22708
    • Mechanize#get requests no longer send a referer unless they are relative requests.

0.8.4

  • Bug Fixes:

    • Setting the port number on the host header.

    • Fixing Authorization headers for picky servers

0.8.3

  • Bug Fixes:

    • Making sure logger is set during SSL connections.

0.8.2

  • Bug Fixes:

    • Doh! I was accidentally setting headers twice.

0.8.1

  • Bug Fixes:

    • Fixed problem with nil pointer when logger is set

0.8.0

  • New Features:

    • Lifecycle hooks. Mechanize#pre_connect_hooks, Mechanize#post_connect_hooks

    • file:/// urls are now supported

    • Added Mechanize::Page#link_with, frame_with for searching for links using criteria.

    • Implementing PUT, DELETE, and HEAD requests

  • Bug Fixes:

    • Fixed an infinite loop when content-length and body length don’t match.

    • Only setting headers once

    • Adding IIS authentication support

0.7.8

  • Bug Fixes:

    • Fixed bug when receiving a 304 response (HTTPNotModified) on a page not cached in history.

    • #21428 Default to HTML parser for ‘application/xhtml+xml’ content-type.

    • Fixed an issue where redirects were resending posted data

0.7.7

  • New Features:

    • Page#form_with takes a criteria hash.

    • Page#form is changed to Page#form_with

    • Mechanize#get takes custom http headers. Thanks Mike Dalessio!

    • Form#click_button submits a form defaulting to the current button.

    • Form#set_fields now takes a hash. Thanks Tobi!

    • Mechanize#redirection_limit= for setting the max number of redirects.

  • Bug Fixes:

    • Added more examples. Thanks Robert Jackson.

    • #20480 Making sure the Host header is set.

    • #20672 Making sure cookies with weird semicolons work.

    • Fixed bug with percent signs in urls. d.hatena.ne.jp/kitamomonga/20080410/ruby_mechanize_percent_url_bug

    • #21132 Not checking for EOF errors on redirect

    • Fixed a weird gzipping error.

    • #21233 Smarter multipart boundry. Thanks Todd Willey!

    • #20097 Supporting meta tag cookies.

0.7.6

0.7.5

  • Fixed a bug when fetching files and not pages. Thanks Mat Schaffer!

0.7.4

  • doh!

0.7.3

  • Pages are now yielded to a blocks given to WWW::Mechanize#get

  • WWW::Mechanize#get now takes hash arguments for uri parameters.

  • WWW::Mechanize#post takes an IO object as a parameter and posts correctly.

  • Fixing a strange zlib inflate problem on windows

0.7.2

  • Handling gzipped responses with no Content-Length header

0.7.1

  • Added iPhone to the user agent aliases. [#17572]

  • Fixed a bug with EOF errors in net/http. [#17570]

  • Handling 0 length gzipped responses. [#17471]

0.7.0

  • Removed Ruby 1.8.2 support

  • Changed parser to lazily parse links

  • Lazily parsing document

  • Adding verify_callback for SSL requests. Thanks Mike Dalessio!

  • Fixed a bug with Accept-Language header. Thanks Bill Siggelkow.

0.6.11

0.6.10

0.6.9

0.6.8

  • Keep alive can be shut off now with WWW::Mechanize#keep_alive

  • Conditional requests can be shut off with WWW::Mechanize#conditional_requests

  • Monkey patched Net::HTTP#keep_alive?

  • #9877

    Moved last request time. Thanks Max Stepanov

  • Added WWW::Mechanize::File#save

  • Defaulting file name to URI or Content-Disposition

  • Updating compatability with hpricot

  • Added more unit tests

0.6.7

  • Fixed a bug with keep-alive requests

  • #9549

    fixed problem with cookie paths

0.6.6

  • Removing hpricot overrides

  • Fixed a bug where alt text can be nil. Thanks Yannick!

  • Unparseable expiration dates in cookies are now treated as session cookies

  • Caching connections

  • Requests now default to keep alive

  • #9434

    Fixed bug where html entities weren’t decoded

  • #9150

    Updated mechanize history to deal with redirects

0.6.5

0.6.4

  • Adding the “redirect_ok” method to Mechanize to stop mechanize from following redirects.

rubyforge.org/tracker/index.php?func=detail&aid=6571&group_id=1453&atid=5712

0.6.3

0.6.2

0.6.1

0.6.0

  • Changed main parser to use hpricot

  • Made WWW::Mechanize::Page class searchable like hpricot

  • Updated WWW::Mechanize#click to support hpricot links like this: @agent.click (page/“a”).first

  • Clicking a Frame is now possible: @agent.click (page/“frame”).first

  • Removed deprecated attr_finder

  • Removed REXML helper methods since the main parser is now hpricot

  • Overhauled cookie parser to use WEBrick::Cookie

0.5.4

  • Added WWW::Mechanize#trasact for saving history state between in a transaction. See the EXAMPLES file. Thanks Johan Kiviniemi.

  • Added support for gzip compressed pages

  • Forms can now be accessed like a hash. For example, to set the value of an input field named ‘name’ to “Aaron”, you can do this:

    form['name'] = "Aaron"
    

    Or to get the value of a field named ‘name’, do this:

    puts form['name']
    
  • File uploads will now read the file specified in FileUpload#file_name

  • FileUpload can use an IO object in FileUpload#file_data

  • Fixed a bug with saving files on windows

  • Fixed a bug with the filename being set in forms

0.5.3

  • Mechanize#click will now act on the first element of an array. So if an array of links is passed to WWW::Mechanize#click, the first link is clicked. That means the syntax for clicking links is shortened and still supports selecting a link. The following are equivalent:

    agent.click page.links.first
    agent.click page.links
    
  • Fixed a bug with spaces in href’s and get’s

  • Added a tick, untick, and click method to radio buttons so that radiobuttons can be “clicked”

  • Added a tick, untick, and click method to check boxes so that checkboxes can be “clicked”

  • Options on Select lists can now be “tick”ed, and “untick”ed.

  • Fixed a potential bug conflicting with rails. Thanks Eric Kolve

  • Updated log4r support for a speed increase. Thanks Yinon Bentor

  • Added inspect methods and pretty printing

0.5.2

  • Fixed a bug with input names that are nil

  • Added a warning when using attr_finder because attr_finder will be deprecated in 0.6.0 in favor of method calls. So this syntax:

    @agent.links(:text => 'foo')
    

    should be changed to this:

    @agent.links.text('foo')
    
  • Added support for selecting multiple options in select tags that support multiple options. See WWW::Mechanize::MultiSelectList.

  • New select list methods have been added, select_all, select_none.

  • Options for select lists can now be “clicked” which toggles their selection, they can be “selected” and “unselected”. See WWW::Mechanize::Option

  • Added a method to set multiple fields at the same time, WWW::Mechanize::Form#set_fields. Which can be used like so:

    form.set_fields( :foo => 'bar', :name => 'Aaron' )
    

0.5.1

  • Fixed bug with file uploads

  • Added performance tweaks to the cookie class

0.5.0

  • Added pluggable parsers. (Thanks to Eric Kolve for the idea)

  • Changed namespace so all classes are under WWW::Mechanize.

  • Updating Forms so that fields can be used as accessors (Thanks Gregory Brown)

  • Added WWW::Mechanize::File as default object used for unknown content types.

  • Added ‘save_as’ method to Mechanize::File, so any page can be saved.

  • Adding ‘save_as’ and ‘load’ to CookieJar so that cookies can be saved between sessions.

  • Added WWW::Mechanize::FileSaver pluggable parser to automatically save files.

  • Added WWW::Mechanize::Page#title for page titles

  • Added OpenSSL certificate support (Thanks Mike Dalessio)

  • Removed support for body filters in favor of pluggable parsers.

  • Fixed cookie bug adding a ‘/’ when the url is missing one (Thanks Nick Dainty)

0.4.7

  • Fixed bug with no action in forms. Thanks to Adam Wiggins

  • Setting a default user-agent string

  • Added house cleaning to the cookie jar so expired cookies don’t stick around.

  • Added new method WWW::Form#field to find the first field with a given name. (thanks to Gregory Brown)

  • Added WWW::Mechanize#get_file for fetching non text/html files

0.4.6

  • Added support for proxies

  • Added a uri field to WWW::Link

  • Added a error class WWW::Mechanize::ContentTypeError

  • Added image alt text to link text

  • Added an visited? method to WWW::Mechanize

  • Added Array#value= which will set the first value to the argument. That allows syntax as such: form.fields.name(‘q’).value = ‘xyz’ Before it was like this: form.fields.name(‘q’).first.value = ‘xyz’

0.4.5

  • Added support for multiple values of the same name

  • Updated build_query_string to take an array of arrays (Thanks Michal Janeczek)

  • Added WWW::Mechanize#body_filter= so that response bodies can be preprocessed

  • Added WWW::Page#body_filter= so that response bodies can be preprocessed

  • Added support for more date formats in the cookie parser

  • Fixed a bug with empty select lists

  • Fixing a problem with cookies not handling no spaces after semicolons

0.4.4

  • Fixed error in method signature, basic_authetication is now basic_auth

  • Fixed bug with encoding names in file uploads (Big thanks to Alex Young)

  • Added options to the select list

0.4.3

  • Added syntactic sugar for finding things

  • Fixed bug with HttpOnly option in cookies

  • Fixed a bug with cookie date parsing

  • Defaulted dropdown lists to the first element

  • Added unit tests

0.4.2

  • Added support for iframes

  • Made mechanize dependant on ruby-web rather than narf

  • Added unit tests

  • Fixed a bunch of warnings

0.4.1

  • Added support for file uploading

  • Added support for frames (Thanks Gabriel)

  • Added more unit tests

  • Fixed some bugs

0.4.0

  • Added more unit tests

  • Added a cookie jar with better cookie support, included expiration of cookies and general cookie security.

  • Updated mechanize to use built in net/http if ruby version is new enough.

  • Added support for meta refresh tags

  • Defaulted form actions to ‘GET’

  • Fixed various bugs

  • Added more unit tests

  • Added a response code exception

  • Thanks to Brian Ellin ([email protected]) for: Added support for CA files, and support for 301 response codes