I want to go on a bit of a digression here to try to justify my feeling that a "code" editor is the best starting place. This seems counter to many people's instincts. TEI is complicated, so why can't we use technology to make editing it simpler? My argument against this position goes like this:
1. Technology usually can't make anything simpler, although it can often make tasks easier. You can hide complexity or move it somewhere else, but it doesn't go away.
2. Texts are fiendishly complex, multifaceted, multivalent things. Why would you expect something that can model texts to be simple?
3. We can build simpler, "tagless" editors, but only in cases where we know up front what we want the outcome to look like. That is, we can do this for specific cases, but not for the general case, at least not without broad acceptance of a general model for text.
#1 will seem obviously, intuitively wrong to most people. Technology has made our lives simpler in so many ways, after all, right? But what computers actually do for us is take routinized tasks and package them up into executable functions: Find me the nearest coffee shop and tell me how to get there. Show me a list of books in the library in this subject area published after 2009. A computer can't tell me whether I will like that coffee shop, nor which, if any, of the books in that list will be crucial to my research. At some point, actual engagement with the material in question is needed. We should understand a thing before making intellectual judgements about it. Of course, very often we don't do this, and very often it's possible to produce a result without really understanding what we're doing, and in general we suffer no repercussions for this. But it seems likely that you will get better results with such understanding, and there are domains (I'd argue that scholarship is one) where doing the work to achieve understanding is important. Can we create useful models of text without understanding what we're doing when we model texts? We can get into arguments here about whether text encoding is a scholarly activity. I can only say that my position is that it is, and that concealing how it works therefore risks undermining the basis of any argument you make with it. Any application of technology to a problem involves making decisions about what to represent and what to discard, how to quantize continuously varying data, how to represent uncertain information, and so on. You might be fine if you outsource those decisions, but you might not. You cannot avoid making these sorts of compromises, but you will be better off if you have a basis for making them.
#2 is a related question. From the perspective of brackets, we don't know what sort of text modeling users might want to do, and therefore we can't know how to simplify it. Furthermore, making decisions about what our users are able to model means removing some of their agency. It's certain that some compromises will be forced upon us: maybe we'll have to pick a single schema type to work with, for example, or a limited set of predetermined schemas. Making that choice wouldn't mean the end product wasn't useful, but it would be less useful than an editor that could work with any schema. We have to realize that when we say "simple" here, we actually mean "easy," and the plain fact is that complex tasks can only be made easy by taking decision-making power away from the user and giving it to the developer. The upshot is that until I know what task you want to accomplish, I shouldn't try to automate it, and "I want to mark up a text" isn't a comprehensive enough description of a task to permit sensible decisions about simplifying it.
#3 follows on from the above, but has some very practical ramifications. If we're going to build some sort of WYSIWYG (or WYSIWYM: what you see is what you mean) editor, then we have to figure out just what it is you're going to see, as well as what it means. In other words, we have to come up with a visual language flexible enough to represent TEI markup sensibly. For some things this is straightforward: a <head> is some sort of title, so we can make the text bigger and display it as a block. But what's the appropriate visual representation for <placeName ref="https://pleiades.stoa.org/places/727070">Alexandria</placeName>? Or for <choice><orig>Thys</orig><reg>This</reg></choice>? We can develop a set of conventions for all the TEI things that lack a "natural" visual representation, but our users will have to learn those conventions. Might that mental energy not be better spent learning TEI? This kind of thing can work in an environment where there already exists a set of rules for representing meaning in texts, but that sort of environment is necessarily a specialized one rather than a generalized one. There isn't a single, universal set of visual conventions for representing the features of a text.
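To make the point about visual conventions a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the NATURAL_STYLES table, the sample fragment, and the function name are hypothetical and do not come from any real editor. The sample also leaves out the TEI namespace to keep things short. The sketch simply walks a small fragment and lists the elements for which the style table has no obvious rendering.

```python
import xml.etree.ElementTree as ET

# Hypothetical style table: only elements with an obvious "natural" rendering.
NATURAL_STYLES = {
    "div": "block",
    "head": "larger text, displayed as a block",
    "p": "paragraph block",
}

# Illustrative fragment (TEI namespace omitted for brevity).
SAMPLE = """<div>
  <head>A letter</head>
  <p><choice><orig>Thys</orig><reg>This</reg></choice> was sent from
  <placeName ref="https://pleiades.stoa.org/places/727070">Alexandria</placeName>.</p>
</div>"""


def elements_without_convention(xml_text, styles):
    """Return element names with no entry in the style table,
    in document order and without duplicates."""
    root = ET.fromstring(xml_text)
    missing = []
    for el in root.iter():  # iter() walks the tree in document order
        if el.tag not in styles and el.tag not in missing:
            missing.append(el.tag)
    return missing


print(elements_without_convention(SAMPLE, NATURAL_STYLES))
```

Every name this reports is an element for which someone would have to invent a display convention, and which users would then have to learn. A real TEI schema would make that list far longer than this toy example suggests.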
So to sum up, while I'm not at all against the idea of building visual editors for specific markup tasks, I don't believe it's reasonable to build a "one size fits all" version. Actually, I think such a thing might turn out to be actively harmful!