Monthly Archives: November 2013

Disruptive Behaviour

What a great morning! Just finished a presentation and demonstration of the power of combining two projects that are close to my heart: Review Sentinel and Ocelot.
I walked through the scenario of a post-editing workflow and editing session using Review Sentinel, some configured modules of the Okapi Framework and Ocelot. I will post more details later but the customer agreed that the value proposition was totally compelling and proposed a live pilot on the spot.
This is the enjoyable aspect of my job – shaking up existing processes with technology!
And this is just the start. We have at least three other business scenarios where the technology/process combination can pull the rug from under people’s feet.

Text Munging with PowerGREP

I work with text A LOT. File contents are in different languages, encodings and formats, and what I need to extract from them varies. Often I write small executables which shallow-parse files using regular expressions. I’m pretty fast at this, but even quick cycles of changing code, recompiling and running can become tedious. (I know, scripting languages like Perl and Python eliminate the compile step, and I have used them, but I’m trying to consolidate my tool sets these days.)

As a long-time RegexBuddy customer I recently invested in Just Great Software’s PowerGREP. It was somewhat of a “hunch” purchase – the documentation gave the impression it was pretty powerful and flexible – so, knowing the next text extraction task would inevitably be just around the corner, I paid the modest license fee.

The task that I’ve been cutting my teeth on with PowerGREP is this: Extract French sentences from an English-French bilingual translation memory (aligned string pairs basically) and clean them up (remove formatting codes, etc.).

What total fun it’s been!

  1. Identify the French sentences. PowerGREP allows you to section the file, that is, define portions of the file within which you want to search and ignore the rest.
  2. Pick out the part of the French sentence I’m interested in. Simple: define a capture regex to isolate the sentence from its enclosing boundary tags.
  3. But what about cleaning what I’ve captured? I guess I’ll have to save those captured strings to a file and then process that. Not at all. Enable the Extra Processing checkbox and specify a list of regex replacement patterns.
  4. Save the contents to a file. As straightforward as Save As.
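
For anyone who wants to see the same four steps in script form, here is a rough Python sketch. It is not what PowerGREP does internally, just the same idea expressed with the `re` module. The TMX-style markup in `SAMPLE`, the regexes, and the names `extract_french` and `save` are all my assumptions for illustration; a real translation memory may well look different.

```python
import re

# Assumed TMX-like input; the actual translation memory format is not specified.
SAMPLE = """\
<tu>
  <tuv xml:lang="en"><seg>Click <bpt i="1">&lt;b&gt;</bpt>Save<ept i="1">&lt;/b&gt;</ept>.</seg></tuv>
  <tuv xml:lang="fr"><seg>Cliquez sur <bpt i="1">&lt;b&gt;</bpt>Enregistrer<ept i="1">&lt;/b&gt;</ept>.</seg></tuv>
</tu>
<tu>
  <tuv xml:lang="en"><seg>File   not found</seg></tuv>
  <tuv xml:lang="fr"><seg>Fichier   introuvable</seg></tuv>
</tu>
"""

# Step 1: section the file, keeping only the French units.
SECTION = re.compile(r'<tuv xml:lang="fr">.*?</tuv>', re.DOTALL)

# Step 2: capture the sentence between its boundary tags.
CAPTURE = re.compile(r'<seg>(.*?)</seg>', re.DOTALL)

# Step 3: replacement patterns applied, in order, to each captured string.
CLEANUP = [
    # Drop inline formatting elements together with their content.
    (re.compile(r'<(bpt|ept|ph)\b[^>]*>.*?</\1>', re.DOTALL), ''),
    # Collapse runs of whitespace into a single space.
    (re.compile(r'\s+'), ' '),
]


def extract_french(text):
    """Return the cleaned French sentences found in text."""
    sentences = []
    for section in SECTION.findall(text):
        for captured in CAPTURE.findall(section):
            for pattern, replacement in CLEANUP:
                captured = pattern.sub(replacement, captured)
            sentences.append(captured.strip())
    return sentences


def save(sentences, path):
    """Step 4: write one sentence per line, UTF-8 encoded."""
    with open(path, 'w', encoding='utf-8') as handle:
        handle.write('\n'.join(sentences) + '\n')
```

Calling `save(extract_french(SAMPLE), 'french.txt')` would write the cleaned sentences to a file (the file name is hypothetical). The ordered `CLEANUP` list plays the role of the replacement patterns in step 3: each pattern runs on the output of the previous one.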

Task accomplished with no coding and within a single, iterative environment. A cool product with well-thought-out features! Nice job, Just Great Software!