I try to make the most of the dark evenings by evaluating tools and researching topics that I don’t get time for during a normal working day.
First up for the New Year is LanguageTool. Automated quality assurance is an important part of many production processes, and at VistaTEC we have deployed various commercial and in-house tools. My motivation for looking at LanguageTool was its use of part-of-speech tagging and user-definable rules, which can be combined with regular expressions to encode sophisticated linguistic checks.
My test domain was marketing translations: content full of emotive, symbolic and suggestive language.
The custom rule encoding is necessarily verbose, given that the serialisation format is XML. Our internal tool, Cerberus, shares the same drawbacks: element markup and the escaping of regular expression metacharacters. Some of the advanced rule constructs are difficult to grasp at first. We ran many small tests to understand how features such as skip scope operate. Hopefully the evolving Rule Editor will soon support more advanced rule constructs, which would aid learning and perhaps speed up rule writing.
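To give a flavour of that verbosity, here is a minimal sketch in Python that assembles a LanguageTool-style rule and lets the XML library do the escaping. The rule id, name, message and the two token expressions are invented for illustration, and the sketch leaves out the example sentences a real rule file would also carry. The `skip="-1"` attribute is, as far as I understand the rule syntax, what allows any number of tokens between the first match and the second.

```python
import xml.etree.ElementTree as ET


def build_rule(rule_id, name, first_regex, second_regex, message, skip=None):
    """Build a two-token, LanguageTool-style <rule> element (illustrative only)."""
    rule = ET.Element("rule", {"id": rule_id, "name": name})
    pattern = ET.SubElement(rule, "pattern")

    # First token: a regular expression, optionally allowed to skip ahead.
    first = ET.SubElement(pattern, "token", {"regexp": "yes"})
    if skip is not None:
        first.set("skip", str(skip))
    first.text = first_regex

    # Second token: must match somewhere after the first one.
    second = ET.SubElement(pattern, "token", {"regexp": "yes"})
    second.text = second_regex

    ET.SubElement(rule, "message").text = message
    return rule


# Hypothetical rule: flag "R&D" or "research" followed later by "best" or "greatest".
rule = build_rule(
    rule_id="HYPOTHETICAL_RND_SUPERLATIVE",
    name="R&D followed by a superlative",
    first_regex=r"R&D|research",
    second_regex=r"best|greatest",
    message="Check that this superlative claim is acceptable in the target market.",
    skip=-1,
)

# ElementTree escapes '&' for us, so the regex is written only once, unescaped.
rule_xml = ET.tostring(rule, encoding="unicode")
print(rule_xml)
```

Generating the XML this way means the ampersand in the expression only has to be written once in its natural form; written by hand, it would have to appear as `&amp;` in both the rule name and the token.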
We were not able to build rules for all of the constructs that we wanted. Extending the tool via Java to cover these is a possibility. That will be another day’s work.
Next up has been Neural Networks. I’ve been watching Andrew Ng’s Machine Learning lectures on Coursera. Learning is always helped by consulting several references, and James McCaffrey’s articles have been straightforward to understand. I really like that James includes worked examples in his articles; it’s a great way of checking your own understanding (or lack of it). Finally, it’s nice that the code examples are in C# rather than the ubiquitous Python (I can read it, but let’s say it’s not my native tongue).
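As an illustration of the kind of worked example that makes checking easy, here is a minimal sketch of a single forward pass through a tiny network. It is not taken from the Coursera lectures or from James's articles: the network shape, the weights and the biases are all made up, chosen so that every weighted sum comes to zero and the result can be verified by hand.

```python
import math


def sigmoid(z):
    """Logistic activation: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass: inputs -> sigmoid hidden layer -> single sigmoid output."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    output = sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)
    return hidden, output


# Made-up values: 2 inputs, 2 hidden nodes, 1 output node.
inputs = [1.0, 0.0]
hidden_weights = [[0.5, 0.3], [-0.5, 0.8]]  # one row of weights per hidden node
hidden_biases = [-0.5, 0.5]
output_weights = [2.0, -2.0]
output_bias = 0.0

hidden, output = forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias)
print(hidden, output)
```

By hand: the first hidden node sums 0.5 × 1 + 0.3 × 0 − 0.5 = 0 and the second sums −0.5 × 1 + 0.8 × 0 + 0.5 = 0, so both hidden values are sigmoid(0) = 0.5. The output node then sums 2 × 0.5 − 2 × 0.5 + 0 = 0, giving a final output of 0.5.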
I’d prefer to be resident in warmer climes between November and March, so that I could be out more without the need for layers of thermal underwear, but I do love the different seasons.