-
Notifications
You must be signed in to change notification settings - Fork 67
/
TODO.txt
74 lines (54 loc) · 3.05 KB
/
TODO.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
* Make sure that if there is a default value for a class that it is used
when none is provided (when reading).
* Update all project READMEs.
* When serializing, check if source property declaration is an interface
and whether the written type is declared as a subtype already. If not,
write "class" with fully qualified name. Make sure FQN are supported
on reading as well.
* Do a swipe of all class/method names to apply case-sensitivity in
a consistent way when using acronyms (e.g., XmlBlah vs XMLBlah).
* Redo all metadata constants to be more consistent in naming and syntax,
plus remove references to "collector.". Make them all prefixed the same,
including Importer (e.g., "doc.*")
Configuration:
---------------
* Maybe, have a Configurable interface with set/getConfiguration method
which will get autowired with the matching class.
or look at autowiring framework as a global addition.
Or, have annotation @config that will take the type of the property
it decorates, or accepts an optional configuration object as a value.
Dagger 2 seems best: well-maintained, lightweight and compile-time
* Consider replacing XSD with OpenAPI JSON Schema. That way, using the crawler
as a service could automatically build a swagger api and clients, in
addition that users can use a bunch of existing json-schema-based editors.
We could also integrate https://github.com/rjsf-team/react-jsonschema-form
in our SaaS app.
MUI5 example: https://rjsf-team.github.io/react-jsonschema-form/
* Have a Server mode, that would allow crawling of passed URL mixed
with config snippets, based on JSON schema.
* Remove all "restrictTo" across the board in favor of XML flow (if/else, etc).
At a minimum, if we keep it, wrap in "restrictions" to improve null handling.
(e.g., blanking a default list)
* Add this https://www.mojohaus.org/versions/versions-maven-plugin/examples/display-plugin-updates.html
* Remove all @since prior V4 since the package change
makes all classes "new" anyway?
* Remove all @author tags in favor of Git commit author?
* V3 to V4 config converter? or add a flag to support V3
config as argument?
- Consider renaming (and deprecating) classes/methods with
acronyms/abbreviation to use camel-case.
E.g. XmlSomething as opposed to XMLSomething.
That includes in Commons Lang
* For javadoc custom taglets, add a placeholder equivalent to <THIS_CLASS>
that is replaced by the fully qualified name of the current class.
* Maybe have a data store just for general session/crawl state
persistence, not just documents. This would allow us to store things
such as :
* Was queue initialization complete (useful to know when resuming).
* What else?
* On online documentation, add a flag that indicates whether a provider
of something is "nullable" as a way to disable it. Else, an exception
is thrown (nullpointer or whatever).
* Add extra validation for invalid config state. For instance,
specifying a sitemap start URL and disabling sitemap support should
prevent the crawler from doing anything.