Here comes the machine-mangled prose

Last month, at its annual TechFest in Redmond, Microsoft demonstrated, or at least talked about, some of the technologies that it has been working on.

Amongst all the in-car entertainment systems and gesture recognition was a project to provide "Next Generation Writing Assistance".

The idea is for a word processor to incorporate a massive thesaurus of words and phrases and then use "very large language models" to suggest more interesting or expressive words than the ones you have chosen.

This is an entirely horrible idea. Let's try to pretend that somehow the database is large enough, and the context-checking engine sufficiently powerful, that it always picks an appropriate word or phrase, and that we don't end up with the kind of nonsensical gaffes that plague those who rely on automatic translators.

Even then, the very best that such a system could aim for is to produce a text that means the same as the original but at a slightly lower semantic resolution. There is no such thing as a perfect synonym; all words convey a nuance that goes beyond their basic meaning.

Ancient is not the same as prehistoric. Elephant is not the same as pachyderm. Happy is not synonymous with glad, whatever Wiktionary might think.

If you perform automatic synonym replacements according to rules coded into an algorithm, you remove some of the information from the text. That information is called style.
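
To make that loss concrete, here is a toy sketch in Python. The word list and the `embellish` function are invented for illustration and have nothing to do with whatever Microsoft has actually built, which would presumably be far more sophisticated. The point is only that any rule which maps several words onto one makes previously distinct sentences indistinguishable.

```python
# Hypothetical "more expressive" replacements, made up for this example.
UPGRADES = {
    "happy": "delighted",
    "glad": "delighted",
    "old": "ancient",
    "prehistoric": "ancient",
}


def embellish(text: str) -> str:
    """Swap every whitespace-separated word that has an entry in UPGRADES."""
    return " ".join(UPGRADES.get(word, word) for word in text.split())


a = embellish("happy to see the old fort")
b = embellish("glad to see the prehistoric fort")

print(a)       # delighted to see the ancient fort
print(b)       # delighted to see the ancient fort
print(a == b)  # True
```

Two writers made two different sets of choices, and the output gives no way of telling which was which. The replacement cannot be reversed, so whatever the original wording carried is gone.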

Of course, this is just modern-day Luddism. I don't want everyone else to be able to write lucid, expressive text because then I'll be out of a job. But I read a lot more words than I write, and what I really don't want is to have to read homogeneous, machine-mangled word-slurry just because everyone is too lazy to compose a coherent sentence for themselves any more.