Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mute in ~/.tidyrc runs fine but triggers exit(1) #752

Closed
dse opened this issue Aug 31, 2018 · 21 comments
Closed

mute in ~/.tidyrc runs fine but triggers exit(1) #752

dse opened this issue Aug 31, 2018 · 21 comments

Comments

@dse
Copy link

dse commented Aug 31, 2018

If I specify a value for the mute option in ~/.tidyrc, for example:

mute: MISSING_DOCTYPE

This causes tidy to work normally, observing the mute option I specified, but it also causes tidy to output the following error message and exit with a return code of 1:

Loading config file "~/.tidyrc" failed, err = 1

Specifying --mute MISSING_DOCTYPE on the command line instead of specifying it in ~/.tidyrc does not trigger this problem.

Reproducing is simple enough. Run the following command with mute: MISSING_DOCTYPE in ~/.tidyrc:

echo '<html><head><title>foo</title></head><body></body></html>' | tidy

Then take that line out of ~/.tidyrc then run the following:

echo '<html><head><title>foo</title></head><body></body></html>' | tidy --mute MISSING_DOCTYPE
@geoffmcl
Copy link
Contributor

@dse had to build a special Windows tidy, add -DENABLE_CONFIG_FILES:BOOL=YES, and then enhance the ExpandTilde(...) service to see this...

That problem is that the service used to read the config file, in TY_(DefineMutedMessage(...) calls Report

    /* Must come *after* adding to the list, in case it's muted, too. */
    TY_(Report)( doc, NULL, NULL, STRING_MUTING_TYPE, name );

This will bump the case TidyConfig: doc->optionErrors++; break; and will output the Loading config file "~/.tidyrc" failed, err = 1...

This it the patch to begin the fix in Windows -

diff --git a/src/config.c b/src/config.c
index f5e8379..085f4b6 100644
--- a/src/config.c
+++ b/src/config.c
@@ -877,8 +877,26 @@ static ctmbstr ExpandTilde( TidyDocImpl* doc, ctmbstr filename )
     if (filename[1] == '/')
     {
         home_dir = getenv("HOME");
-        if ( home_dir )
+        if (home_dir) {
             ++filename;
+        }
+#ifdef _MSC_VER 
+        else if (strlen(filename) >= 3) {   /* at least '~/+1' */
+            /* no HOME env in Windows - got for HOMEDRIVE=C: HOMEPATH=\Users\user */
+            char * hd = getenv("HOMEDRIVE");
+            char * hp = getenv("HOMEPATH");
+            if (hd && hp) {
+                ctmbstr s = TidyDocAlloc(doc, _MAX_PATH);
+                strcpy(s, hd);
+                strcat(s, hp);
+                strcat(s, "\\");
+                strcat(s, &filename[2]);
+                return s;
+            }
+
+        }
+#endif /* _MSC_VER */
+
     }
 #ifdef SUPPORT_GETPWNAM
     else

But that does nothing about the error output Loading config file "~/.tidyrc" failed, err = 1... just lets me see it, debug it, in Windows...

Need to understand more... this happens in TY_(DeclareListItem)( doc, option, buf );... how is this by passed if the parameter is on the command line... anyway, beginning to look... any help, pointers, feedback would be great...

@ler762
Copy link
Contributor

ler762 commented Sep 2, 2018

@geoffmcl I'm having a hard time following the code. But while I'm working on that could you look into using something other than #ifdef _MSC_VER for building a windows executable? The mingw gcc doesn't define that:

$ cat ifdef.c
/*
 * see: https://sourceforge.net/p/predef/wiki/OperatingSystems/
 *      https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/64-bit-compiler
 *        _WIN32 also defined by the 64-bit compiler for backward compatibility.
 */
#include <stdio.h>
int main(int argc, char *argv[]) {
   printf("begin\n");
#ifdef __MINGW32__
    printf("__MINGW32__ defined\n");
#endif
#ifdef _MSC_VER
    printf("_MSC_VER defined\n");
#endif
#ifdef _WIN32
    printf("_WIN32 defined\n");
#endif
#ifdef _WIN64
    printf("_WIN64 defined\n");
#endif
#ifdef __WINNT__
    printf("__WINNT__ defined\n");
#endif
   printf("end\n");
}

  ------- cross-compiling under cygwin to create a windows executable
$ i686-w64-mingw32-gcc  ifdef.c -o ifdef.exe
$ ./ifdef
begin
__MINGW32__ defined
_WIN32 defined
__WINNT__ defined
end

  ------- compiling in cygwin; executable requires cygwin1.dll
$ gcc  ifdef.c -o ifdef.exe
$ ./ifdef
begin
end
$

geoffmcl added a commit that referenced this issue Sep 2, 2018
@geoffmcl
Copy link
Contributor

geoffmcl commented Sep 2, 2018

@ler762 thanks for looking at the patch... in this case that could probably be _WIN32 or WIN32, prefer the latter, since expanding a tilde might be needed... but if you in a bash type cygwin/msys2 prompt, this may aready being done... but that is way off this topic...

Ok, this is NOT just the .tidyrc parsing problem... but is there also in a few other config file loads... which monitor the return code...

    /*
     * Look for default configuration files using any of
     * the following possibilities:
     *  - TIDY_CONFIG_FILE - from tidyplatform.h, typically /etc/tidy.conf
     *  - HTML_TIDY        - environment variable
     *  - TIDY_USER_CONFIG_FILE - from tidyplatform.h, typically ~/tidy.conf
     */

#ifdef TIDY_CONFIG_FILE
    if ( tidyFileExists( tdoc, TIDY_CONFIG_FILE) )
    {
        status = tidyLoadConfig( tdoc, TIDY_CONFIG_FILE );
        if ( status != 0 ) {
            fprintf(errout, tidyLocalizedString( TC_MAIN_ERROR_LOAD_CONFIG ), TIDY_CONFIG_FILE, status);
            fprintf(errout, "\n");
        }
    }
#endif /* TIDY_CONFIG_FILE */

    if ( (cfgfil = getenv("HTML_TIDY")) != NULL )
    {
        status = tidyLoadConfig( tdoc, cfgfil );
        if ( status != 0 ) {
            fprintf(errout, tidyLocalizedString( TC_MAIN_ERROR_LOAD_CONFIG ), cfgfil, status);
            fprintf(errout, "\n");
        }
    }
#ifdef TIDY_USER_CONFIG_FILE
    else if ( tidyFileExists( tdoc, TIDY_USER_CONFIG_FILE) )
    {
        status = tidyLoadConfig( tdoc, TIDY_USER_CONFIG_FILE );
        if ( status != 0 ) {
            fprintf(errout, tidyLocalizedString( TC_MAIN_ERROR_LOAD_CONFIG ), TIDY_USER_CONFIG_FILE, status);
            fprintf(errout, "\n");
        }
    }
#endif /* TIDY_USER_CONFIG_FILE */


    /*
     * Read command line
     */

    while ( argc > 0 )

If using int TY_(ParseConfigFileEnc)( TidyDocImpl* doc, ctmbstr file, ctmbstr charenc ), it exits with a compare of the uint opterrs = doc->optionErrors; at the start to the end of the parse ... oops, triggers bad config file parse, because of a Config: msg output, in case TidyConfig: bumps doc->optionErrors++; -

 /* any new config errors? If so, return warning status. */
    return (doc->optionErrors > opterrs ? 1 : 0);

Command line option uses tidyOptParseValue(tdoc, argv[1]+2, argv[2]), to return TY_(ParseConfigOption)( impl, optnam, val );, status = TY_(ParseConfigValue)( doc, option->id, optval );, and command line -config tidy.conf uses the same parser but ignores the return from tidyLoadConfig( tdoc, argv[2] );, so avoid the problem...

Still digging... any pointers very welcome...

@ler762
Copy link
Contributor

ler762 commented Sep 2, 2018

@geoffmcl the config file checking leaves much to be desired. I'm still trying to follow the code to see how hard it would be to fix, but check this out - I added a printf right after the doc->optionErrors++; in message.c:

        case TidyConfig:
            doc->optionErrors++;
printf("debug: got a TidyConfig option error\n");
            break;

I don't get an error for a totally bogus line in ~/.tidyrc:

$ echo $TIDY
/git/mytidy/tidy-html5/build/cmake/tidy.exe

$ cat ~/.tidyrc
malformered input
mute: MISSING_DOCTYPE
probablyAnInvalidKeyword: xporto

$ echo '<html><head><title>foo</title></head><body></body></html>' | $TIDY
debug: got a TidyConfig option error
Config: messages of type "MISSING_DOCTYPE" will not be output
debug: got a TidyConfig option error
Config: unknown option: probablyAnInvalidKeyword
Loading config file "~/.tidyrc" failed, err = 1
Info: Document content looks like HTML5
Tidy found 1 warning and 0 errors!

<!DOCTYPE html>
   ...

It doesn't even mention the existence of malformered input in ~/.tidyrc !? And loading of the config file did not fail since I don't get the
line 1 column 1 - Warning: missing <!DOCTYPE> declaration
message.

I'm not sure where to go with this - just have TY_(ParseConfigFileEnc) always return 0? Turn this into a design bug and figure out how to properly check a config file, or ???

wrt the off-topic ifdef comment It'd be nice if a windows build of tidy could be easily done using gcc.
https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/64-bit-compiler
says their 64 bit compiler defines _WIN64 and _WIN32 so it seems like
#ifdef _WIN32
is the way to go, but however you want to doit.

@geoffmcl
Copy link
Contributor

geoffmcl commented Sep 3, 2018

@ler762 thanks for the feedback...

I do not understand the config file checking leaves much to be desired! You need to express more on what you desire it to do differently ;=)) ... that is, point out the code error you see, if any... it's quite complicated to follow in parts, but does its job, generally...

Yes, we are talking about an API service, tidyLoadConfig, which advertises Returns 0 upon success, or any other value if there was an error., but no it does not, if the config file contains mute: MISSING_DOCTYPE, ...

So agree there maybe a logic question here...

Chain: tidyLoadConfig - ParseConfigFile - ParseConfigFileEnc)( doc, file, "ascii" ) - uint opterrs = doc->optionErrors;, and then returns any new optionErrors.

Why does output of a Config: information message... bump this optionErrors? It does not seem like an error... just advice got your message successfully...

But as you point out, if the config file contains a bad option, like badopt: val, again optionErrors gets bumped... as it should... I must check lines that do not have a colon... maybe they are silently skipped...

So this points to at least, an adjustment/enhancement in the { TC_MAIN_ERROR_LOAD_CONFIG, 0, "Loading config file \"%s\" failed, err = %d" },... and maybe the API docs...

Maybe something like "Loading config file \"%s\" failed, or had option parsing errors as indicated. err = %d", or the like...

In the 3 cases shown in the above code, the failed to open the file, would already be reported, so could be just "Config: file \"%s\" had option parsing errors as indicated, if not muted. err = %d"...

But the problem I see is that the mute is more an informative message, and maybe should be excluded from bumping optionErrors...

Alternatively, these 3 additional error message outputs could be removed from tidy.c... we have already checked the file exists, and the return of the tidyLoadConfig be ignored, ie not checked... like it is in another case... -config tidyrc command line, uses just tidyLoadConfig( tdoc, argv[2] );, no return check...

For the API docs, suggest at least -

diff --git a/include/tidy.h b/include/tidy.h
index c701890..c1bbd41 100644
--- a/include/tidy.h
+++ b/include/tidy.h
@@ -481,16 +481,16 @@ TIDY_EXPORT void TIDY_CALL        tidyGeneralInfo( TidyDoc tdoc );
     
     
 /** Load an ASCII Tidy configuration file and set the configuration per its
- ** contents.
- ** @result Returns 0 upon success, or any other value if there was an error.
+ ** contents. Reports config option errors, which can be filtered.
+ ** @result Returns 0 upon success, or any other value if there was an option error.
  */
 TIDY_EXPORT int TIDY_CALL         tidyLoadConfig(TidyDoc tdoc,      /**< The TidyDoc to which to apply the configuration. */
                                                  ctmbstr configFile /**< The complete path to the file to load. */
                                                  );
 
 /** Load a Tidy configuration file with the specified character encoding, and
- ** set the configuration per its contents. 
- ** @result Returns 0 upon success, or any other value if there was an error.
+ ** set the configuration per its contents.  Reports config option errors, which can be filtered.
+ ** @result Returns 0 upon success, or any other value if there was an option error.
  */
 TIDY_EXPORT int TIDY_CALL         tidyLoadConfigEnc(TidyDoc tdoc,       /**< The TidyDoc to which to apply the configuration. */
                                                     ctmbstr configFile, /**< The complete path to the file to load. */

Ideas, feedback welcome...

OT: Just checked in my msys2 bash prompt and 1. HOME is defined, so the WIN32 code would not be required, and 2. the tilde has already been expanded, so the WIN32 code would not be required. But maybe there is a case for a native mingw-x64 build, like __MINGW32__... Will try to check that... Meantime, is WIN32 also defined?

@geoffmcl
Copy link
Contributor

geoffmcl commented Sep 3, 2018

PS: Actually the 3 cases in tidy.c already checks for the files existance, and ignores an access failure, so we are not dealing with a missing file case... I am starting to feel that ignoring the return code in this sample code is the best option... but feedback welcome...

@ler762
Copy link
Contributor

ler762 commented Sep 3, 2018

@geoffmcl after looking at the code much more, maybe ignoring the return code or having TY_(ParseConfigFileEnc) always return 0 isn't the way to go.

consider:

$ cat ~/.tidyrc
malformered input
mute: MISSING_DOCTYPE
probablyAnInvalidKeyword: xporto
indent: yes
tidy-mark: no

$ echo '<html><head><title>foo</title></head><body></body></html>' | $TIDY -q
debug: got a TidyConfig option error
Config: messages of type "MISSING_DOCTYPE" will not be output
debug: got a TidyConfig option error
Config: unknown option: probablyAnInvalidKeyword
Loading config file "~/.tidyrc" failed, err = 1
<!DOCTYPE html>
<html>
  <head>
    <title>
      foo
    </title>
  </head>
  <body>
  </body>
</html>

$

indent: yes and tidy-mark: no aren't printed out, so maybe printing
Config: messages of type "MISSING_DOCTYPE" will not be output
is the error? Seems right - take a look at typdef TidyReportLevel in include/tidyenum.h

/** Message severity level, used throughout LibTidy to indicate the severity
 ** or status of a message
 **
 ** @remark These enum members all have associated localized strings available
 **         via their enum values. These strings are suitable for use as labels.
 */
typedef enum
{
    TidyInfo = 350,         /**< Report: Information about markup usage */
    TidyWarning,            /**< Report: Warning message */
    TidyConfig,             /**< Report: Configuration error */
    TidyAccess,             /**< Report: Accessibility message */
    TidyError,              /**< Report: Error message - output suppressed */
    TidyBadDocument,        /**< Report: I/O or file system error */
    TidyFatal,              /**< Report: Crash! */
    TidyDialogueSummary,    /**< Dialogue: Summary-related information */
    TidyDialogueInfo,       /**< Dialogue: Non-document related information */
    TidyDialogueFootnote,   /**< Dialogue: Footnote */
    TidyDialogueDoc = TidyDialogueFootnote, /**< Dialogue: Deprecated (renamed) */
} TidyReportLevel;

Considering the comment that TidyConfig is supposed to be set for a config error, the code that wants to print the Config: messages of type "MISSING_DOCTYPE" will not be output message should have set message->level to TidyDialogueInfo

Which would leave only the issue of malformered input being silently ignored?

ifdef OT: compiler defines:

$ touch foo.c; i686-w64-mingw32-gcc -dM -E foo.c | grep -i win
#define _WIN32 1
#define __WINT_MAX__ 0xffff
#define __WINT_MIN__ 0
#define __WIN32 1
#define __WINNT 1
#define __WINNT__ 1
#define __WIN32__ 1
#define __SIZEOF_WINT_T__ 2
#define WIN32 1
#define __WINT_TYPE__ short unsigned int
#define WINNT 1

$

@geoffmcl
Copy link
Contributor

geoffmcl commented Sep 4, 2018

@ler762 - Thanks for looking deeper into the code... the best clue was to look at message->level...

Agree 100% with having TY_(ParseConfigFileEnc) always return 0 isn't the way to go. That is, no change in the API, except perhaps some improvements in the documentation... maybe should say clearly valid config lines, except mute, are silently marked in the active config related to the document created... no success message output... or something... but have pushed a small change...

Using a html input, echo "hello world" > hw.html... then tidy hw.html... and a valid config file contents of -

C:\Users\user>type .tidyrc
mute: MISSING_DOCTYPE
mute: TAG_NOT_ALLOWED_IN
mute: PREVIOUS_LOCATION
mute: INSERTING_TAG
mute: MISSING_TITLE_ELEMENT
mute: STRING_CONTENT_LOOKS
show-info: no

Now mute: <valid id> is presently reported as TY_(Report)( doc, NULL, NULL, STRING_MUTING_TYPE, name );... that is controlled in the static struct _dispatchTable[]... presently set at TidyConfig level... consider making that TidyDialogueInfo, or even TidyInfo...

F:\Projects\tidy-html5\build\temp-752>Release\tidy.exe C:\Users\user\hw.html
messages of type "MISSING_DOCTYPE" will not be output
messages of type "TAG_NOT_ALLOWED_IN" will not be output
messages of type "PREVIOUS_LOCATION" will not be output
messages of type "INSERTING_TAG" will not be output
messages of type "MISSING_TITLE_ELEMENT" will not be output
messages of type "STRING_CONTENT_LOOKS" will not be output
Tidy found 4 warnings and 0 errors!

<!DOCTYPE html>
<html>
<head>
<meta name="generator" content=
"HTML Tidy for HTML5 for Windows version 5.7.16.I752">
<title></title>
</head>
<body>
hello world
</body>
</html>

Maybe that is ok... then trying TidyInfo gets Info: messages of type "MISSING_DOCTYPE" will not be output... so this is ok too...

Is this message a Dialogue: Non-document related information, or Report: Information about markup usage... seems in this case prefer the latter, but not sure why...

But this does not address the config error cases... like noopt: val, but have slightly improved removing failed, replacing it with problems... but still see ignoring this return, like we presently do more 1 in 4 cases in tidy.c... I do not see this being extended to the potential 3 config file reads done before the command line as a problem... but there is an argument for keeping it, but better documenting that potentially many options could have been silently successfully parsed from the config file...

I can now see if the line does not contain a colon, :, it is quietly discarded... should such a line be flagged as a TidyConfig error? half anf half - do not like silently accepting, since we do have a comment line indicators, # or //... user should be warned that a certain line failed, so they can fix it...

Have pushed some of these fixes to the issue-752 branch, so others can try, and advise preference...

Still exploring more here... feedback very welcome...

OT: hope closed ;=))

@ler762
Copy link
Contributor

ler762 commented Sep 5, 2018

@geoffmcl I'm glad you figured it out - I was still struggling.

I did a build from your issue-752 branch - question: how to supress the Info: message?

$ cat ~/.tidyrc
malformered input
mute: MISSING_DOCTYPE
# probablyAnInvalidKeyword: xporto
tidy-mark: no

$ echo '<html><head><title>foo</title></head><body></body></html>' | $TIDY -q
Info: messages of type "MISSING_DOCTYPE" will not be output
<!DOCTYPE html>
<html>
   -- snip --

OT: hope closed ;=))

Yes, looks good. Thank you :)

@geoffmcl
Copy link
Contributor

geoffmcl commented Sep 5, 2018

@ler762 again thanks for checking out issue752 branch, building and testing...

question: how to supress the Info: message?

A quick answer is, put as the first line, in the config file, ~/.tidyrc, the option show-info: no. In fact I have moved it to the top in the above sample config...

Remember these, potentially 3 config files, are read BEFORE the command line is parsed, so nothing on it, like -q, --show-info no, ... will have any effect on these already issued messages, like Info: blah blah ..., for mute, and errors like rubber: duck shows Config:unknown option: rubber, ...

It seems an error like tidy-mark: duck will show Config: missing or malformed argument for option: tidy-mark, which seems repeated twice???

At present, an option error will cause Loading config file "~/.tidyrc" problems, err = 1 to also be emitted. I am suggesting this particular message be deleted/removed/suppressed in tidy.c... messages about problems in the config file parsing have already be reported by the library, through the proper channels...

And if you are directly using libtidy then you can setup a callback, and choose to output some other message, or suppress these, as desired...

So I am suggesting tidy.c, the console app issue none, or the minimum of direct message...

And no one reponded on whether the line like malformered input, like any line lacking a colon, should be silently accepted...

Looks like we already have a typo correction, PR #753... thanks...

Waiting for further feedback...

@ler762
Copy link
Contributor

ler762 commented Sep 5, 2018

Remember these, potentially 3 config files, are read BEFORE the command line is parsed

To remember, one has to have know at some time in the past :)

Wow! This bug report has opened a can of worms:

$ cat ~/.tidyrc
show-info: no
tidy-mark: no
mute: MISSING_DOCTYPE
invalidKeyWord: mu

$ cat ~/.nullrc
nullrc: option


$ echo $TIDY
/git/tidy/tidy-html5/build/cmake/tidy.exe

$ echo '<html><head><title>foo</title></head><body></body></html>' | $TIDY -config ~/.nullrc
Config: unknown option: invalidKeyWord
Loading config file "~/.tidyrc" problems, err = 1
Config: unknown option: nullrc
Tidy found 1 warning and 0 errors!

<!DOCTYPE html>
<html>
<head>
<title>foo</title>
</head>
<body>
</body>
</html>


$

So the man page needs clarification that

       -config <file>
              set configuration options from the specified <file>

adds the specified config file for options processing and does not replace or prevent processing of ~/.tidyrc

I haven't checked the regression test docs, but it probably needs something about having a ~/.tidyrc invalidates the testing
and the example of running the regression test should check for the existence of a ~/.tidyrc, throw an error and quit if it exists.

And no one reponded on whether the line like malformered input, like any line lacking a colon, should be silently accepted...

No, it should not be silently ignored; it should generate a warning.

So I am suggesting tidy.c, the console app issue none, or the minimum of direct message...

Sounds good, but it kind of makes it harder to debug if there's more than one options config file.
see above

Config: unknown option: nullrc

It'd be nice if tidy told you which config file had the problem.

I don't have time to work on this today - hopefully tomorrow I'll be able to work on it & submit a matching change to the regression test.

$ diff -U 3  ../cases/testbase-expects/case-629.txt ../cases/testbase-results/case-629.txt
--- ../cases/testbase-expects/case-629.txt      2018-08-29 15:42:23.173861400 -0400
+++ ../cases/testbase-results/case-629.txt      2018-09-05 16:37:25.934450700 -0400
@@ -1,6 +1,6 @@
 Config: option "mute" given bad argument "FAKE_TAG" (STRING_ARGUMENT_BAD)
-Config: messages of type "MISSING_ENDTAG_OPTIONAL" will not be output (STRING_MUTING_TYPE)
-Config: messages of type "MISSING_ENDTAG_FOR" will not be output (STRING_MUTING_TYPE)
+Info: messages of type "MISSING_ENDTAG_OPTIONAL" will not be output (STRING_MUTING_TYPE)
+Info: messages of type "MISSING_ENDTAG_FOR" will not be output (STRING_MUTING_TYPE)
 line 18 column 1 - Warning: <p> proprietary attribute "download" (PROPRIETARY_ATTRIBUTE)
 Info: Document content looks like HTML5 (STRING_CONTENT_LOOKS)
 Tidy found 2 warnings and 0 errors!

geoffmcl added a commit that referenced this issue Sep 5, 2018
@ler762
Copy link
Contributor

ler762 commented Sep 6, 2018

@geoffmcl The more I think about it the less I like tidy not processing command line options first.

What do you/anyone think about having tidy process the command line options first and allowing them to over-ride default behaviors?

If nothing else, it'd be nice to be able to run the regression tests without having to make sure ~/.tidyrc doesn't exist

@geoffmcl
Copy link
Contributor

geoffmcl commented Sep 7, 2018

@ler762 sometimes, perhaps, you need to think a little more before you leap... but thank you for your feedback ... here and there ;=))

The more I think about it the less I like tidy not processing command line options first.

I am mainly a Windows person, but in the unix app world that seems one thing clear... if there exists a ~/.<proj>rc, or the like, then this is processed before the command line... sets up a default environment... a good thing... I think... you oppose this? wow, ok... tell me more... where did you get this command line first idea? ... Would you describe yourself as a Unix, or Windows person?... just interested...

In running the regression tests, at least in windows, there is a test for ENV HTML_TIDY to make sure you are not overriding the tests with a config file, like ~/tidyrc, or any other config file... there should also be a test for ~/tidyrc, and abort if found...

So agree 100%, before running the regression tests, check for such a config files existance, otherwise all bets are off on the tests succeeding... many of which depend on a very specific config environment of their own being established... simply, must be no pre-command line configuration file loaded, silently, mostly... unless there are errors, some of which are also silent...

This suggests a PR in tests... in tools-sh, and tools-cmd... look forward to suggestions, patch, PR...

Just saying, maybe not time to adjust the tests just yet... but thanks for testing and running... and the results...

So working on regression test changes before this bug is solved seems time not well spent... but agree it is important... to see, to know, the impact on this or that change... as it evolves... so, thanks...

We need to finalize this issue... get it accepted... discussed/agreed.. merged into next, then, ...

If there are some tests update required, then that is the time for a PR's on that...

Need more feedback on this can of worms issue...

@ler762
Copy link
Contributor

ler762 commented Sep 7, 2018

@geoffmcl Yes, I oppose having a ${HOME}/.XXXrc file that can't be over-ridden by a command line option. For tidy, adding an option like -config-file <filename> that would tell tidy to read filename instead of .tidyrc and another -no-config-file option that would tell tidy to not process any initialization file would be my preference. (or pick some other name for the option. I just want to be able to over-ride reading .tidyrc)
BUT I think that's too big a change too late in the game. Something like that should go into the next release.

where did you get this command line first idea?

It seems like most unix utilities allow you to over-ride processing the .XXXrc file.
eg. bash --norc or curl -q. dig is the only one I've found that doesn't let me over-ride processing the .digrc file

Would you describe yourself as a Unix, or Windows person?

Neither. Windows used to win on ease of use + software availability & cygwin made it a pretty nice platform, but I just can't accept Windows 10. The last version of Unix I liked was SunOS, so that should date me pretty well :) Linux seems to like chasing the latest shiny whatever.. altho Debian stable seems pretty good at long-term support, so that's what I'm trying now

In running the regression tests, at least in windows, there is a test for ENV HTML_TIDY

envars always seemed like a better-than-nothing workaround. I don't much like 'em, but I've used them when there wasn't some other way.

This suggests a PR in tests... in tools-sh, and tools-cmd... look forward to suggestions, patch, PR...

I really liked Nancy Reagans' "Just Say NO" suggestion. So I said no to building & testing if ~/.tidyrc existed:

$ cat build-and-test.sh
#!/bin/sh

# make sure regression tests aren't borked
if [ -f "${HOME}/.tidyrc" ] ; then
   echo "Having a ~/.tidyrc tends to break regression testing."
   echo "Please try again after renaming ~/.tidyrc"
   exit 1
fi

# starting from my local tidy git repository
cd tidy-html5/build/cmake
    ... etc ...

At this stage of the game, I'd suggest documenting the issue & leaving a better fix for later.

Need more feedback on this can of worms issue...

I liked your issue-752 branch; I think that should go into the next release.
With the exception of removing the Loading config file "~/.tidyrc" problems, err = 1 message, I'm not sure if anything else should go in for the next release except some doc updates.

  • man page needs clarification
  • regression test docs need something about ~/.tidyrc possibly breaking the tests
  • I like the idea of adding an example of building tidy & running the regression test, but that's a "nice to have" not a "should have"

I'd leave off fixing the parsing of .tidyrc & changing/adding command line options processing for later.

Anything I left off?

@ler762
Copy link
Contributor

ler762 commented Sep 10, 2018

@geoffmcl take a look at #754. I think this removes the need for most of the documentation changes.

So the man page needs clarification that ...

no longer needed. Tidy processes the -config file command line option first and, if present, doesn't read any other initialization file.

I haven't checked the regression test docs, but it probably needs something about
having a ~/.tidyrc invalidates the testing ...
...
the example of running the regression test should check for the existence of a ~/.tidyrc ...

no longer needed. The regression tests specify a -config file on the command line, so ~/.tidyrc isn't read. One can run the regression tests with a ~/.tidyrc and/or have HTML_TIDY set and the regression tests still work properly.

See why I don't like having a ${HOME}/.XXXrc file that can't be over-ridden? :)

@geoffmcl
Copy link
Contributor

@ler762 well this issue seems to have morphed into a Feature Request to alter the pre-command line config file reading, present since the birth of tidy, or at least the 2000 source I have, seemingly just because of your don't like ... file that can't be over-ridden ;=))

While some unix apps do have an over-ride option, many do not. I know at least one app that will create such a file, even if one does not exist... so it does seem a choice, which tidy has made... a long time ago...

But for starters, with tidy, you can over-ride it now, several ways... one of which is to set the environment HTML_TIDY=.somenulrc... then tidy will not read the default ~/.tidyrc... What is wrong with this? Except I seem to remembering reading something like you also don't like using the environment? ... oh, well... can not please everyone I suppose...

Also, except for some new options, like mute, you can also over-ride any options given in such a pre-command config file by the contents, and setting of a -config tidy.conf, read later... this opens the idea that all options should be reversable, like say a --nomute XXXX, but that is getting way off topic...

And I have no problem informing a tidy regression tester that they should carefully check before running the tests... they are not strictly casual user tests... such testers are already building tidy from source, preaumably for testing a fix, feature, etc, so would be familiar to ensuring the right test environment exists, or all bets are off...

So I for one, am against PR #754 altering this part of tidy history... love how sort of modular the source is, in that it can be done with one variable and bit of cut and paste... but as always may be swayed by feedback...

Look forward to further comments on the issue-752 branch... thanks...

This was referenced Sep 16, 2018
@ler762
Copy link
Contributor

ler762 commented Sep 17, 2018

@geoffmcl Isn't that the way open-source works -- don't like ... make better :-)

Considering your earlier comment about having to build a special windows tidy & enhance the code, I can sort of see why you wouldn't want to change anything; this doesn't affect you. But the standard tidy build on *nix defines TIDY_CONFIG_FILE and I don't see a way to prevent tidy from reading /etc/tidy.conf. Along with no way to tell tidy to be quiet while processing the file other than adding a show-info: no as the first line of the file.

That tidy has been doing it this way since forever carries about as much weight with me as mute in ~/.tidyrc runs fine but triggers exit(1) has been the way tidy works for however long. Tidy can be made better.

I have no problem informing a tidy regression tester that they should carefully check ...

I have a problem with that, because if the order of processing is changed to cli options first, it's not necessary. No work-arounds needed, no code changes to add a no-$WHATEVER option to undo an option set in a config file you can't prevent tidy from processing, no need for hidden variables (ie. envars), no work-arounds necessary.

If tidy is changed to do cli options first & read one init file if the user doesn't give a -config *filename* option seems like a clear improvement.

@geoffmcl
Copy link
Contributor

@ler762 sorry if I found the reasoning of don't like just a little amusing... while don't like is probably the first driver that causes me too to look for change, I will try to back it up with valid facts, reasoning, ... and present those, leaving out the purely emotive don't like argument as implicit, implied... but maybe just me...

I am all for change if it improves tidy... that is makes if better, in every OS, or at least in the OS where the change applies, if specific... I try my best to have tidy's well being at heart...

And in open source, where there are choices, sometimes what one person likes can be disliked by others... even used as a loving, expected, features of the app by some, or ignored, or worked around by others... sometimes generating noise...

So it seems here the question is something like What is the correct order for reading, dealing with config files?, or change the processing of init files, or... and more... A quite difficult and involved choice...

The creators, and all maintainers since then, chose to read certain config files before processing the command line, as the norm... as [m]any good apps do... And in all that I can read around, that seems a valid choice still for today, for an 18+ year old, The granddaddy of HTML tools, with support for modern standards app... seek references that show this or otherwise...

Now @ler762 suggests we change that... maybe he makes some points... but none that sway me that it is a clear improvement to tidy... maybe... but...

I could argue that if I take the steps to create a ~/.tidyrc, add env HTML_TIDY=something or an /etc/tidy.conf file, an active, positive step - no such thing is created by a tidy install - then I expect that/those to be read before the command line...

But that I can also add one, or more, -config xxxx.conf in the command line that does not effect the above reading, but does modify, over-rides, the final config used for the following docs... repeated...

That is what tidy presently does, as documented in say man tidy...

Look forward to some comments from others... time to change, or not?

@ler762
Copy link
Contributor

ler762 commented Sep 18, 2018

@geoffmcl no problem; don't like was what got me here in the first place asking for a change :) In any case, I've been thinking more about the reasons for why I don't like..

Now @ler762 suggests we change that... maybe he makes some points... but none that sway me

OK - take a look at CMakeLists.txt

if ( UNIX AND SUPPORT_CONSOLE_APP )
  option ( ENABLE_CONFIG_FILES "Set to OFF to disable Tidy runtime configuration file support" ON )
  ...
else ()
  option ( ENABLE_CONFIG_FILES "Set to ON to enable Tidy runtime configuration file support" OFF )

*nix users have automatically processed init files by default. Windows users don't see the problem because there are no automatically processed init files on windows unless you go out of your way to enable them.

I'm in the *nix user category, so I do have a problem. For example, the tidy regression tests fail because I have a ~/.tidyrc (I was both surprised & pleased to discover that #754 fixed that :)

A bit far-fetched, but consider what happens if someone decides to go along with
https://wiki.debian.org/XDGBaseDirectorySpecification
and builds tidy with -DTIDY_CONFIG_FILE=~/.config/tidy (or whatever the correct syntax is)
Whatever work-arounds are added to the regression tests to check for a tidy init file & exit/warn if one exists fail because tidy has been configured to use something other than ~/.tidyrc as an init file.

Or what happens to an end-user whose sysadmin decides to create an /etc/tidy.conf

I help out on a project that uses tidy to clean up docbook generated html. I can't even remember to do a git checkout next before doing a git checkout -b aslr, so there's no way I'm going to always remember to move my ~/.tidyrc aside before doing a make doc .. which means I'll end up not having a ~/.tidyrc to make sure that I don't screw up a doc build.

In short, #754 enables *nix users to have their kate and edith too :)
(oops... my wife just read that line)

echoing Geoff's request:
Looking forward to comments from others.

@geoffmcl geoffmcl mentioned this issue Oct 23, 2018
@geoffmcl
Copy link
Contributor

geoffmcl commented Nov 7, 2018

Still waiting for feedback on my PR #764 which addresses several issues brought up... namely add windows tilde expansion 94e62b2, plus 8c3ef4b, 4ba6106, ... etc...

@ler762, these seem valid regardless of the of the Feature Request to change the load order, and add a new option, no-config, is accepted or not...

I think for the regression tests case, being able to disable such default configs is important... but that is a discussion to be had in its own issue... will try to get around to adding that, if someone does not beat me to it...

Hope to get around to merging PR #764, and some others, soon... still marching towards maybe a 6.0.0 release, maybe for new year...

@geoffmcl
Copy link
Contributor

@ler762, I have lost or forgotten the main thread here... a lot has been said... have now merged #764... and some others...

Am closing this... if there are outstanding issue, then feel free to re-open...

Or better still open a new, clean issue... thanks...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants