162 Chapter 12 / Language Translation

The translation service is dedicated software that runs apart from client applications. The translation database stores corresponding Phrases for various Languages and AbbreviationTypes. (A person must populate the translation database.) Upon request, the service finds the translation given a source Phrase, target Language, and target AbbreviationType.

Figure 12.4 shows a sample application table that could be subject to the translation mechanism. The phrase–to–phrase approach has a language bias. For example, the source data may be stored in English and converted to another language only upon translation mapping. Architecturally, a language bias is undesirable because users may detect the favored language.

The pseudocode in Figure 12.5 illustrates the logic for finding a translation. (The pseudocode is written using the UML’s Object Constraint Language [Warmer-1999].) The basic logic is to first look for an exact match to the target language. Otherwise, if a Dialect is specified, look for the corresponding MajorLanguage. If that fails, then make one more try to look for the AllLanguage record.

Figure 12.3 Phrase–to–phrase translation: IDEF1X model.
[Figure: entities Language (languageID, languageName (AK1.1), languageDiscrim); MajorLanguage (majorLanguageID (FK), allLanguageID (FK)); Dialect (dialectID (FK), majorLanguageID (FK)); AllLanguage (allLanguageID (FK)); Phrase (phraseID, string, phraseEquivalenceID (FK) (AK1.1), languageID (FK) (AK1.2), abbreviationTypeID (FK) (AK1.3)); AbbreviationType (abbreviationTypeID, abbreviationTypeName (AK1.1)); PhraseEquivalence (phraseEquivalenceID).]

Figure 12.4 Phrase–to–phrase translation: Person model.
[Figure: Person (personalName, birthdate, birthPlace_English, familyName, hairColor_English, eyeColor_English, height, weight).]

12.4 Language–Neutral Translation

Figure 12.6 and Figure 12.7 show a model for a language–neutral translation service.
This approach separates the multiple meanings of words and phrases for a clean translation. However, you replace translatable strings with concept IDs, limiting this approach to new applications.

A Phrase is a string with a specific Language and AbbreviationType. The Language for a string can be a Dialect, a MajorLanguage, or AllLanguage. A MajorLanguage is a natural language, such as French, English, or Japanese. A Dialect is a variation of a MajorLanguage, such as UK English, US English, or Australian English. AllLanguage has a single record for strings that do not vary across languages.

Each Phrase has an AbbreviationType, which is the maximum length for a string. For example, a name may be short (5 characters), medium (10 characters), long (20 characters), or extra long (80 characters). Abbreviations are especially handy for reports and forms.

Figure 12.5 Phrase–to–phrase translation: Pseudocode for finding a phrase.

   Phrase::getTargetPhrase (aTargetAbbrevType, aTargetLanguage) RETURNS aPhrase
      /* First, start with all possible phrases for a source Phrase */
      aSetOfPhrases := self.PhraseEquivalence.Phrase
      /* Second, restrict phrases to matching the target abbreviation type */
      aSetOfPhrases := aSetOfPhrases->SELECT (abbreviationType == aTargetAbbrevType)
      /* Third, look for an exact match to the target language */
      card := Cardinality (aSetOfPhrases->SELECT (language == aTargetLanguage))
      IF card == 1 THEN
         RETURN aSetOfPhrases->SELECT (language == aTargetLanguage)
      ELSEIF card > 1 THEN
         RETURN "Error: Ambiguous"
      ENDIF
      /* Otherwise for a dialect look for a major language */
      IF TypeOf (aTargetLanguage) == "Dialect" THEN
         RETURN self.getTargetPhrase (aTargetAbbrevType, aTargetLanguage.MajorLanguage)
      ENDIF
      /* Otherwise for a major language look for a default */
      IF TypeOf (aTargetLanguage) == "MajorLanguage" THEN
         RETURN self.getTargetPhrase (aTargetAbbrevType, aTargetLanguage.AllLanguage)
      ENDIF
      /* Else failure. Could not find a translation. */
      RETURN "Error: No translation found"
   END getTargetPhrase

Figure 12.6 Language–neutral translation: UML model. Consider for new applications that require a robust translation approach.
[Figure: UML class diagram with classes Language (name {unique}), Dialect, MajorLanguage, AllLanguage, Phrase (string), TranslationConcept, AbbreviationType (name {unique}), and ConceptEquivalence, which has a preferredConcept association to TranslationConcept. Constraint: {TranslationConcept + AbbreviationType + Language is unique.}]

Figure 12.7 Language–neutral translation: IDEF1X model.
[Figure: entities Language (languageID, languageName (AK1.1), languageDiscrim); MajorLanguage (majorLanguageID (FK), allLanguageID (FK)); Dialect (dialectID (FK), majorLanguageID (FK)); AllLanguage (allLanguageID (FK)); Phrase (phraseID, string, translationConceptID (FK) (AK1.1), languageID (FK) (AK1.2), abbreviationTypeID (FK) (AK1.3)); AbbreviationType (abbreviationTypeID, abbreviationTypeName (AK1.1)); TranslationConcept (translationConceptID, conceptEquivalenceID (FK)); ConceptEquivalence (conceptEquivalenceID, preferredConceptID (FK) (AK1.1)).]

A TranslationConcept is the idea in a person’s mind that underlies a group of related Phrases. The premise of language–neutral translation is that an idea can be precisely expressed in any Language. Of course, this assumption is not exactly true, as each language has its nuances. However, it is a good approximation for translating short phrases such as those that occur in user interface screens and reports. The translation service is not intended for long passages such as those in documents and books.

Table 12.2 shows a simple example. A person has the concept “truck” in mind with a translationConceptID of 2054.
• A MajorLanguage of English and long AbbreviationType yields a Phrase of “truck.”
• A MajorLanguage of French and long AbbreviationType yields a Phrase of “camion.”
• A MajorLanguage of English and short AbbreviationType yields a Phrase of “trk.”
• A Dialect of British English and long AbbreviationType yields a Phrase of “lorry.”

In practice, many persons could populate data and define redundant TranslationConcepts. Multiple definitions are undesirable but difficult to avoid. These multiple definitions ripple throughout application databases and are difficult to consolidate.

ConceptEquivalence provides a cross reference for synonymous TranslationConcepts and effects a logical merge. (See Chapter 11.) The application tables store translationConceptIDs. ConceptEquivalence serves only as a cross-reference and is not referenced by application tables. (See the Symmetric relationship antipattern in Chapter 8.) Each occurrence of ConceptEquivalence has a preferred TranslationConcept.

The translation service is dedicated software that runs apart from client applications. To use the service, an application database substitutes a translationConceptID for each translatable phrase. For each TranslationConcept, the translation database stores the corresponding Phrases for the pertinent Languages and AbbreviationTypes. (A person must populate the translation database.) Upon request, the service finds the Phrase for the specified TranslationConcept, Language, and AbbreviationType.

Figure 12.8 shows a sample application table that is subject to language–neutral translation. The use of concept IDs works well for a new application. But it would be disruptive for an existing application to change strings to IDs.

The pseudocode in Figure 12.9 illustrates the logic for finding a phrase, given a TranslationConcept, AbbreviationType, and Language.
The basic logic is to first look for an exact match to the target language. Otherwise, if a Dialect is specified, look for the corresponding MajorLanguage. If that fails, then make one more try to look for the AllLanguage record.

Table 12.2 Language–Neutral Translation: Sample Phrases

   translationConceptID   Language                    AbbreviationType   Phrase
   2054                   MajorLanguage = English     long               truck
   2054                   MajorLanguage = French      long               camion
   2054                   MajorLanguage = English     short              trk
   2054                   Dialect = British English   long               lorry

Figure 12.8 Language–neutral translation: Person model.
[Figure: Person (personalName, birthdate, birthPlace_conceptID, familyName, hairColor_conceptID, eyeColor_conceptID, height, weight).]

Figure 12.9 Language–neutral translation: Pseudocode for finding a phrase.

   TranslationConcept::getPhrase (aTargetAbbrevType, aTargetLanguage) RETURNS aPhrase
      /* First, start with all possible phrases for a TranslationConcept */
      aSetOfPhrases := self.ConceptEquivalence.preferredConcept.Phrase
      /* Second, restrict phrases to matching the target abbreviation type */
      aSetOfPhrases := aSetOfPhrases->SELECT (abbreviationType == aTargetAbbrevType)
      /* Third, look for an exact match to the target language */
      card := Cardinality (aSetOfPhrases->SELECT (language == aTargetLanguage))
      IF card == 1 THEN
         RETURN aSetOfPhrases->SELECT (language == aTargetLanguage)
      ELSEIF card > 1 THEN
         RETURN "Error: Ambiguous"
      ENDIF
      /* Otherwise for a dialect look for a major language */
      IF TypeOf (aTargetLanguage) == "Dialect" THEN
         RETURN self.getPhrase (aTargetAbbrevType, aTargetLanguage.MajorLanguage)
      ENDIF
      /* Otherwise for a major language look for a default */
      IF TypeOf (aTargetLanguage) == "MajorLanguage" THEN
         RETURN self.getPhrase (aTargetAbbrevType, aTargetLanguage.AllLanguage)
      ENDIF
      /* Else failure. Could not find a translation. */
      RETURN "Error: No translation found"
   END getPhrase
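The fallback order in the lookup (exact language, then MajorLanguage, then AllLanguage) can be sketched in Python. This is a minimal illustration under stated assumptions, not the book's implementation: the `Language` class, the `parent` link, the `PHRASES` dictionary, and the function name `get_phrase` are all hypothetical, and the data is the sample from Table 12.2 plus a US English dialect with no phrases of its own. Keying the dictionary on concept, language, and abbreviation type enforces the uniqueness constraint of Figure 12.6, so the "Ambiguous" branch of the pseudocode cannot arise here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Language:
    """A Dialect, a MajorLanguage, or the single AllLanguage record."""
    name: str
    kind: str  # "Dialect", "MajorLanguage", or "AllLanguage"
    parent: Optional["Language"] = None  # Dialect -> MajorLanguage -> AllLanguage


# Hypothetical language records for the sketch.
ALL = Language("All", "AllLanguage")
ENGLISH = Language("English", "MajorLanguage", ALL)
FRENCH = Language("French", "MajorLanguage", ALL)
BRITISH = Language("British English", "Dialect", ENGLISH)
US = Language("US English", "Dialect", ENGLISH)

# (translationConceptID, language name, abbreviation type) -> phrase string.
# Rows taken from Table 12.2.
PHRASES = {
    (2054, "English", "long"): "truck",
    (2054, "French", "long"): "camion",
    (2054, "English", "short"): "trk",
    (2054, "British English", "long"): "lorry",
}


def get_phrase(concept_id: int, abbrev_type: str,
               language: Language) -> Optional[str]:
    """Try the target language first, then walk up to its parent records."""
    current: Optional[Language] = language
    while current is not None:
        phrase = PHRASES.get((concept_id, current.name, abbrev_type))
        if phrase is not None:
            return phrase
        current = current.parent  # Dialect -> MajorLanguage -> AllLanguage
    return None  # no translation found


print(get_phrase(2054, "long", BRITISH))   # lorry (exact dialect match)
print(get_phrase(2054, "long", US))        # truck (falls back to English)
print(get_phrase(2054, "short", BRITISH))  # trk   (falls back to English)
print(get_phrase(2054, "short", FRENCH))   # None  (no French short phrase)
```

The loop replaces the recursion of the pseudocode; both visit the same records in the same order.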
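The logical merge that ConceptEquivalence provides for redundant TranslationConcepts can be sketched as a two-step lookup: from a concept to its equivalence group, then to that group's preferred concept. The sketch below is illustrative only. The dictionary names, the function `preferred_concept`, and the concept IDs 3170 and 4001 are invented for the example, and treating a concept with no equivalence record as its own preferred concept is an assumption, not something the model prescribes.

```python
# translationConceptID -> conceptEquivalenceID
# Hypothetical data: 2054 and 3170 were both entered for the idea "truck".
CONCEPT_TO_EQUIVALENCE = {2054: 1, 3170: 1, 4001: 2}

# conceptEquivalenceID -> preferred translationConceptID
PREFERRED_CONCEPT = {1: 2054, 2: 4001}


def preferred_concept(concept_id: int) -> int:
    """Resolve a concept to the preferred member of its equivalence group."""
    equivalence_id = CONCEPT_TO_EQUIVALENCE.get(concept_id)
    if equivalence_id is None:
        # Assumption: a concept without an equivalence record stands for itself.
        return concept_id
    return PREFERRED_CONCEPT[equivalence_id]


print(preferred_concept(3170))  # 2054: the redundant concept resolves to the preferred one
print(preferred_concept(2054))  # 2054: the preferred concept resolves to itself
```

With this resolution step in front of the phrase lookup, application tables can keep storing whichever translationConceptID they were given, and Phrases need to be maintained only for the preferred concept.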