For the last few weeks I have been designing a new, intuitive way to visualise Review Sentinel conformance data at the document level.
Those ideas have culminated in what I’m calling a Document Profile. This is essentially a scatter chart of the individual segment conformance scores, sorted in ascending numeric order (that is, descending conformance). The chart is labelled with a single overall numeric indicator for the document.
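As a rough sketch of how the points for such a chart might be prepared, the scores can be sorted and paired with their rank. The function name and the sample scores below are invented for illustration; they are not taken from Review Sentinel.

```python
def profile_points(scores):
    """Return (rank, score) pairs ordered by ascending score.

    Rank 1 is the lowest score, which on this scale is the
    best-conforming segment.
    """
    ordered = sorted(scores)
    return [(rank, score) for rank, score in enumerate(ordered, start=1)]


# Hypothetical segment scores, in document order.
sample_scores = [2.5, 0.5, 1.5]
points = profile_points(sample_scores)
print(points)
```

Each pair would then become one dot on the scatter chart, with rank on the x axis and score on the y axis.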
Conformance scores cannot be naively aggregated (summed or averaged), because a document with a large number of good scores and a few very poor ones could conceivably give much the same overall result as a document whose scores are all of medium severity.
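A small illustration of the problem, using made-up scores (the real score scale is not assumed here, only that a higher number means worse conformance):

```python
from statistics import mean

# Hypothetical document A: nine well-conforming segments and one very poor one.
mostly_good = [1.0] * 9 + [11.0]

# Hypothetical document B: ten segments of uniformly medium severity.
all_medium = [2.0] * 10

mean_a = mean(mostly_good)
mean_b = mean(all_medium)
print(mean_a, mean_b)  # the two averages are identical
```

The two documents look the same on average, yet only the first contains a segment that badly needs attention.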
Instead, we have identified a threshold score below which, ideally, the vast majority of segment scores would fall. We can then express the number of segments below this threshold as a percentage of all the segments in the document.
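A minimal sketch of that calculation follows. The function name, the threshold value of 3.0 and the sample scores are all assumptions made for the example; the actual threshold is not given here.

```python
def percent_below_threshold(scores, threshold):
    """Percentage of segment scores strictly below the threshold."""
    if not scores:
        raise ValueError("document has no segment scores")
    below = sum(1 for score in scores if score < threshold)
    return 100.0 * below / len(scores)


THRESHOLD = 3.0  # hypothetical value, chosen only for this example

# Hypothetical documents: one with a single very poor segment,
# one with uniformly medium scores.
mostly_good = [1.0] * 9 + [11.0]
all_medium = [2.0] * 10

indicator_a = percent_below_threshold(mostly_good, THRESHOLD)
indicator_b = percent_below_threshold(all_medium, THRESHOLD)
print(indicator_a, indicator_b)
```

Unlike an average, this indicator separates the two documents: the one with a very poor segment scores lower than the one whose segments all sit under the threshold.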
D3 was easy to get up to speed with. There are masses of sample charts on the D3 web site and, as always, there is a good Pluralsight course. I was able to prototype quickly and easily using Plunker, JSFiddle, etc.
It’s a testament to Angular’s clear, concise and modular architecture that I was able to learn everything I needed practically without writing a line of code. I have few slots of contiguous focus time these days, but I was able to pick an Angular concept (scopes, binding, directives) and study it for an hour here and there (on planes, buses and walks). I then did practically all of the coding in a single sitting.
The finished article is small, modular, elegant and very easily enhanced.
I’ll leave a deployment here for a short while for people to play with. The data is for one US English source document machine-translated into three languages: Spanish, French and Brazilian Portuguese. Three of the charts show the conformance of the raw MT output against a human-translated reference corpus, and the fourth shows the conformance of the Brazilian Portuguese document after post-editing (using directed post-editing effort in Ocelot).