I recently added a section on counting volumes to my Translation Project Management courses. During the two-hour session, we review the countable production unit types that can be taken into consideration for linguistic tasks (characters, words, lines, pages) and for technical tasks (pages, illustrations, animations). We also discuss the challenge of estimating hours, especially for some specific production steps. I feel future professionals should master this subject so they can analyse their own projects properly and work on a good basis for budgeting and scheduling. Although counting volumes does not generally pose many issues, in some cases it can turn into a finicky task that needs to be examined carefully.
Very common projects, such as documentation localisation, sometimes include technical tasks, for instance desktop publishing and illustration localisation. All the unit sub-types should be counted meticulously, since productivity usually differs from one program to another. For example, quantify the number of slides to reformat in Microsoft PowerPoint on the one hand and the number of pages in the Adobe InDesign files on the other. As the production effort will probably vary between these two tasks, unit rates and metrics must be adapted to arrive at a correct budget and schedule. Apart from this, and although some discussion might arise on whether to include blank pages in the count, counting pages is not a big deal most of the time. As far as illustrations are concerned, the first step is to identify those that need to be changed, since some might not require any translation or adaptation. We then divide images containing text into two groups: those whose text can be extracted or overwritten, and non-editable illustrations, which are more time-consuming. Screenshots are counted separately, as the task involved is not the same as illustration translation.
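To illustrate why unit sub-types should be kept apart, here is a minimal sketch that budgets two desktop publishing tasks separately. All the figures (volumes, productivity and hourly rate) are hypothetical placeholders, not values from a real project; the point is only that each sub-type carries its own metric.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    units: int              # counted volume (slides, pages, ...)
    units_per_hour: float   # assumed productivity for this program
    hourly_rate: float      # assumed rate, in any currency

    @property
    def hours(self) -> float:
        return self.units / self.units_per_hour

    @property
    def cost(self) -> float:
        return self.hours * self.hourly_rate


# Hypothetical volumes and metrics, for illustration only.
tasks = [
    Task("PowerPoint slides", units=40, units_per_hour=10, hourly_rate=30.0),
    Task("InDesign pages", units=60, units_per_hour=6, hourly_rate=30.0),
]

total_hours = sum(t.hours for t in tasks)
total_cost = sum(t.cost for t in tasks)

for t in tasks:
    print(f"{t.name}: {t.hours:.1f} h, {t.cost:.2f}")
print(f"Total: {total_hours:.1f} h, {total_cost:.2f}")
```

Merging the 100 units into a single "pages" count with one average productivity would hide the fact that, under these assumptions, the InDesign pages account for most of the effort.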
Technical tasks that cannot easily be associated with countable source units, such as software testing and debugging, multilingual website creation or animation rebuilding, can become problematic, as time estimates vary based on many factors (source material, clients' requirements, guidelines, context, resources involved, etc.). This can sometimes lead to endless discussions with clients or subcontractors as everyone tries to justify the number of hours or the budget arrived at. Unfortunately, no single process can calculate the number of working hours needed for those specific tasks. While underestimating will result in profitability issues, overestimating might drive clients away in search of proposals with lower costs and shorter timeframes. Only in-depth analyses, assistance from senior staff and experience can help paint a realistic picture, and even then it is hard to prevent misestimates on technical tasks. If you have established a trusted relationship with your clients, you can make an approximation, talk openly about it with your requestors and propose fine-tuning the planned working time after performing a certain percentage of the task.
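The fine-tuning idea can be sketched as a simple extrapolation. This assumes, purely for illustration, that the remaining work progresses at the same pace as the part already done, which is rarely exactly true; the function name and the figures are hypothetical.

```python
def revised_estimate(hours_spent: float, fraction_done: float) -> float:
    """Project the total hours from the effort spent so far.

    Assumes the rest of the task proceeds at the same pace (linear effort).
    """
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be greater than 0 and at most 1")
    return hours_spent / fraction_done


# Hypothetical case: 40 hours were planned, and 12 hours have been
# spent on the first 25% of the task.
planned_hours = 40.0
projected_hours = revised_estimate(hours_spent=12.0, fraction_done=0.25)
difference = projected_hours - planned_hours

print(f"Projected total: {projected_hours:.1f} h ({difference:+.1f} h vs plan)")
```

In this made-up case, the projection gives a concrete figure to discuss with the client before most of the budget has been used up.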
When it comes to text to be translated or revised, however you quote, at some stage you need to check the volume you have to deal with. You might use this information to prepare your quote, plan the time you’ll need and even assess your profitability. Or you might have to share this data with your clients, employees and sub-contractors. Even though counting characters or words is fairly easy in most cases, in some projects this task can become quite complex. If you receive the source text on paper or in a scanned format, some pre-processing might be needed to determine the volume. Rough estimates could sometimes be enough, for you or the other stakeholders, but in many cases an accurate count is preferred. On some occasions, source programs don’t offer any statistical features displaying the number of words or characters to process. Some translation requests might also involve audio or video files, for which the amount of text is not easy to count. Some text files might contain content that is not to be translated or not directly accessible, like scanned sections or embedded documents. Finally, when using the analysis features in Translation Memory (TM) tools to count words or characters, you might face problems such as document corruption, lack of support for specific file formats, or content that is poorly processed or tagged. All this could cause confusion and cost you time or money.
During the course on volumes, I also explain to my students that people using different tools or methods, or even working on other computers, can get inconsistent results. To illustrate the problem, I created a Microsoft PowerPoint presentation, adding lots of shapes, frames, effects and animations, and used various methods to count the source words. I launched an analysis on my own machine with a TM tool and asked some colleagues to do the same, using either other TM tools or the same one as mine. One of them even used the same version of the tool as I did. Not surprisingly, the results varied considerably. The list below shows the figures we obtained, considering only final word counts:
- TM tool 1: 537 words
- TM tool 2: 473 words
- TM tool 3, version 2011: 648 words
- TM tool 3, version 2015: 619 words
- TM tool 3, version 2014:
  - on machine 1: 648 words
  - on machine 2: 621 words
- MS PowerPoint statistics: 553 words
- Manual word count: 524 words
Due to the variation in the tools’ word counts, I decided to count the words manually, slide by slide, since, in my opinion, a manual word count could represent reality better. It was rather intriguing to see that one tool, whatever the version, was far above my own word count (from 18.1% to 23.7% more). I also found it interesting that the results of the MS PowerPoint statistical feature were fairly close to the manual figure. In fact, I remember cases in which the TM tool analysis was much higher than the statistics shown in the source program, which caused some conflicts with clients referring to the MS Word feature.
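The comparison against the manual count can be reproduced with a few lines of Python. This is only a sketch: the counts are those listed above, and taking the manual count as the baseline is a choice made for this comparison, not an absolute reference.

```python
# Word counts reported above for the same PowerPoint presentation.
counts = {
    "TM tool 1": 537,
    "TM tool 2": 473,
    "TM tool 3, version 2011": 648,
    "TM tool 3, version 2015": 619,
    "TM tool 3, version 2014, machine 1": 648,
    "TM tool 3, version 2014, machine 2": 621,
    "MS PowerPoint statistics": 553,
}
MANUAL_COUNT = 524  # manual slide-by-slide count, used as the baseline


def deviation_percent(count: int, baseline: int = MANUAL_COUNT) -> float:
    """Relative difference from the baseline, in percent."""
    return (count - baseline) / baseline * 100


deviations = {name: deviation_percent(c) for name, c in counts.items()}

for name, dev in deviations.items():
    print(f"{name}: {counts[name]} words ({dev:+.1f}%)")
```

Run on these figures, every result for TM tool 3 comes out between about 18% and 24% above the manual count, while TM tool 2 is the only method below it.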
When I tried to understand the reasons for these differences, I found the following (the list is not exhaustive):
- The Master slide in my .PPT file contained 10 words to be translated, which had been extracted 12 times by TM tool 3.
- The translatable content of 2 frames had not been extracted by TM tool 2.
We know that tools use different word counting schemes. Nonetheless, when faced with a client asking us to justify why we have quoted 648 words when they counted 553, explaining that this is due to the tools we have chosen to use is tricky. Especially if we previously convinced them that those tools increase productivity and reduce quotes ;-). Obviously, this mainly occurs for files with heavy formatting, but it could still prove annoying.
You could overcome this problem by removing volume details from your quote, quoting per hour or indicating a lump sum. Nevertheless, you should be aware of potential issues that might, at times, create uncomfortable situations or erroneous estimates. Similarly, when using TM tools, making sure that all the translatable content has been properly identified is critical. You can double-check the target file to make sure nothing has been missed, but it is far preferable to spot this before launching the translation process. Some file preparation might consequently be needed and, in some cases, I even recommend comparing the source text appearing in the TM tool with the content displayed in the source format to make sure everything has been properly extracted. One last tip: where available, cross-check the statistics in the source program against the final word count displayed in the TM tool.
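The cross-check suggested above can be sketched as a simple tolerance test. The 5% threshold and the function name are assumptions made for this example; an acceptable gap depends on the file type and on what has been agreed with the client.

```python
def counts_agree(source_count: int, tm_count: int, tolerance: float = 0.05) -> bool:
    """Return True if the TM tool count is within `tolerance` of the
    count shown in the source program (0.05 means 5%, an assumed threshold)."""
    if source_count <= 0:
        raise ValueError("source_count must be positive")
    return abs(tm_count - source_count) / source_count <= tolerance


# Figures from the PowerPoint experiment: 553 words according to the
# source program, compared with two of the TM tool analyses.
for tm_count in (537, 648):
    status = "OK" if counts_agree(553, tm_count) else "investigate before quoting"
    print(f"Source 553 vs TM {tm_count}: {status}")
```

A count flagged this way is not necessarily wrong, but it is worth examining before it ends up in a quote.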
Regardless of our role in a project, counting or checking volumes is essential in our daily management tasks. If you are the only person responsible for this task, you will want to be considered reliable, so you should ensure your counts are fair and the methods used are easy to explain. Being aware of potential issues is equally important. If you receive count data from end clients or translation agencies, be cautious and double-check them all before starting any work. Not everyone is trying to fool you, but they might have left out some important aspects of the project, failed to spot some file corruption or simply been distracted. Whatever your case, knowing how to estimate volumes for your own work, and where the possible pitfalls lie, should help you deliver as promised and, hopefully, remain profitable.
Nancy Matis, March 2017