Wednesday, April 13, 2005

COMPUTING: How the wrong choice of a script for Urdu hindered use of computers

The path for the introduction of new technology is seldom smooth or short. For personal computing in Urdu, it was not only tortuous but also very long. The people who were supposed to give direction themselves took a wrong turn. Rather than following the users of the other Arabic script languages, who had adopted a character-based version of the script, they attempted to apply computer technology to a primarily ligature-based script, a task that was not worth the time, effort and money even if it was ultimately achievable. In the process, about two decades were lost, a huge waste of resources.

The problem caused by a ligature-based script is not unique to Pakistan. Many non-Latin languages face a similar problem because of their ligatures. These include the national languages of Bangladesh and Sri Lanka and the Indian languages, particularly those related to Sanskrit. Their problem, however, is mitigated by the fact that the number of ligatures for these languages is only in the hundreds, while it runs into thousands in the case of Urdu. Consequently, the companies producing types for use in their hot-metal machines for commercial printing, like Linotype and Monotype, managed to design types for these languages but not for Urdu. Therefore, until recent years, Urdu was the only major language that depended entirely on calligraphists for composing the text of books, magazines, and even daily newspapers. Publications in Arabic and Farsi still use calligraphists, but only for titles, headlines and special effects.

This paper aims at highlighting the problem that is hindering the use of computers for ligature-based scripts. Unicode, accepted internationally, allows the allocation of a code to every character of every written language of the world. For compatibility and convenience, it is desirable that the script should also be character-based. There may be special needs, such as the use of ligatures like “ffi” and “fi” in English books for aesthetic reasons, but such exceptions do not cause any problem in computing.
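The point about character codes and optional ligatures can be illustrated with a minimal Python sketch. It relies only on the standard `unicodedata` module; the helper name `expand_ligatures` and the sample strings are illustrative assumptions, not part of any Urdu software discussed here. It shows that each letter has its own code, and that a Latin ligature such as “ffi”, which Unicode also encodes as a single compatibility character, can be reduced to plain letters without loss.

```python
import unicodedata

# Each character has its own Unicode code point, whatever the language.
for ch in "\u0627\u0628\u067E":  # alef, beh, peh
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))


def expand_ligatures(text):
    """Replace compatibility ligatures (e.g. U+FB01) with their plain letters."""
    return unicodedata.normalize("NFKC", text)


# U+FB03 is the single-glyph ligature for f + f + i.
print(expand_ligatures("o\uFB03ce"))
```

The text stays searchable and sortable because the stored characters are ordinary letters; the ligature is purely a matter of display.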

It was ironic that all this happened in a country that was one of the pioneers in the adoption of the personal computer. While many developing countries were merely talking and thinking about the use of the personal computer, Pakistan took the lead in the early 1980s by making its import free, both literally and figuratively. Those were the days of very restricted international trade, with no imports allowed unless specifically included in the list of importable items. Then there were heavy customs duties as a further discouragement. However, the import of personal computers was allowed without any restrictions. The customs duty was also waived completely soon afterwards and the exemption has continued since then, even during times of financial crisis. The Government has always treated the personal computer as an educational tool, not business equipment.

Besides personal computing, Pakistan was also far ahead of its neighbors in adopting the Internet. BrainNet, a pioneer organization owned by three Alvi brothers, Shahid, Amjad and Basit, introduced email service in the late 1980s and Internet service in the early 1990s. Other aggressive operators soon followed them. The Internet is now available to three-fourths of the entire population. The connection charges are also coming down.

Naskh versus Nasta’lique. Ironically, the encouraging trend in the use of personal computers did not lead to similar growth in their use for Urdu. One reason, of course, was the widespread use of English as the language of government, business and education, a legacy of British colonial rule. The main cause, however, has been the preference for the Nasta’lique form of the Arabic script for Urdu, which caused problems similar to those that the Chinese language faced, because it too uses thousands of ligatures.

Nasta’lique is similar in basic structure to Naskh, which is used for Arabic as well as other languages that are written in the Arabic script, such as Farsi, Kurdish, Uighur (the language of the Turkic people of the Xinjiang region in China), and Jawi (in Malaysia and Indonesia). Naskh, ironically, is used also for all regional languages of Pakistan. (More about it a little later.) Naskh is taken as a representative of all other character-based type styles because it happens to be the most widely used for various purposes.

Nasta’lique was designed in Iran for Farsi in the early part of the Second Millennium, primarily to be different from Arabic because of historical rivalry. Since Farsi was the court language also of the Mughal rulers of India, Nasta’lique was the official script in the capital Delhi for centuries. It was also adopted for Urdu when the language emerged later as the lingua franca of India. Since calligraphy was the only way in those days to produce books, there was no problem until type was introduced in South Asia in the late 18th century.

There was an opposite trend in the regions that were far away from Delhi and hence remained practically independent of the central government there. In these areas, which were predominantly Muslim, the Arabic script was quite familiar because of the Holy Quran, which everybody learnt to read. Since Naskh was always used for the Holy Quran (even in the areas of the world where Nasta’lique was used for all other purposes), the familiarity of the script made it the obvious choice for the regional languages. Naskh was, therefore, the standard script even during the calligraphy days for Sindhi, Pashto, Balochi, Brahvi and Kashmiri.

Punjabi was the only exception because Farsi was the court language of the Punjab also as late as the mid-19th century even though the Sikh rulers were non-Muslims. The local Farsi tradition and the influence of the Urdu-speaking officials from the capital (Delhi), who accompanied the British rulers into the Punjab after the defeat of the Sikhs, combined to make Nasta’lique the standard script for the Punjabi language also.

For thousands of years, all writing in all languages was done by hand. Using the freedom of creativity, the calligraphists applied their artistic and graphic capabilities and developed completely new scripts. They molded and modified individual letters of a script for aesthetic reasons, and created combinations of characters in the process that later came to be known as ligatures. The problem arose for the character-based scripts only after the invention of movable type. Ligatures, however, are not unique to Nasta’lique. Ligatures such as “ff,” “ffi,” and “fi” were used right from the beginning in the Latin script languages for aesthetic reasons. However, these could be avoided if necessary. For example, the typewriters for Latin script never provided for ligatures due to technical limitations. Since the typewritten text was quite readable without the ligatures, the users learned to live with it. It was in printing that ligatures were considered necessary and had to be provided. Their number, however, was never large. On the other hand, every South Asian language with its own script has always had ligatures ever since mechanical type composing became available, with the number running into hundreds. But that is still a fraction of what is required for Nasta’lique! No wonder a typewriter could never be produced for it.

The introduction of typesetting. Nasta’lique has the same alphabetical characters and the variations in the shapes (initial, middle, final and independent) of various characters. The difference lies in the way the characters are placed on a base line. Naskh follows more or less the pattern of the Latin script, with one character following the other on the same base line. In Nasta’lique, some character shapes are placed vertically or in a slope to the preceding character. While a calligraphist could place these character shapes on a line quite easily, a personal computer could not, at least not in the early days.
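The four shapes mentioned above (initial, middle, final and independent) can be seen in Unicode itself, where one stored character maps to several display glyphs. The following is a minimal sketch using Python's standard `unicodedata` module and the letter beh as an example; the helper name `shape_info` is an assumption made for illustration. It covers only the character-based case, not the sloping placement that Nasta’lique requires.

```python
import unicodedata

BEH = "\u0628"  # ARABIC LETTER BEH, the single character stored in a text

# Presentation-form code points for the four positional shapes of beh.
BEH_SHAPES = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]


def shape_info(glyph):
    """Return (position, base character) read from the glyph's decomposition."""
    tag, base_hex = unicodedata.decomposition(glyph).split()
    return tag.strip("<>"), chr(int(base_hex, 16))


for glyph in BEH_SHAPES:
    position, base = shape_info(glyph)
    print(f"U+{ord(glyph):04X} {position:8} -> U+{ord(base):04X}")
```

All four glyphs resolve to the same base character, so the software only has to pick a shape according to the letter's position in the word, which is a far smaller task than storing thousands of whole-word ligatures.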

The mechanical type composing machines replaced calligraphy during the 19th century for other Arabic script languages, using Naskh. There was pressure even then from the traditionalists to use Nasta’lique for hand or mechanical typesetting, but the technical limitations at that time were much worse than those of the early personal computer. Therefore, Urdu did not fully convert to typesetting.

Several attempts were made to design a type font for Nasta’lique but the results were not acceptable. Therefore, Naskh was the only alternative if type was to be used. Interestingly, Naskh (in a modified form) was adopted in Iran itself, the birthplace of Nasta’lique. Except for Farsi, no language using the Arabic script had adopted Nasta’lique at any time, even before the invention of type. Hence, their switchover from calligraphy to type (and later to computer) was naturally smooth and prompt.

The resistance to the switchover to typesetting for Urdu got support also from the economic factor. A calligraphist was paid much less than a type compositor. Moreover, unlike mechanical composing, there was no investment required by a printer for using calligraphy. (A calligraphist, called katib in Urdu, learnt his trade on his own before seeking work.) As a result, most Urdu publishers reverted to calligraphy and the trend continued for several decades until the 1980s.

However, despite higher costs, some publishers (especially Maktaba-i Jadeed and Urdu Science Board, both of Lahore) did use mechanical typesetting in Naskh for encyclopedias, major dictionaries and other larger reference works and quality books. They saw some overwhelming advantages, such as speedier input, easier corrections, better page design and faster space fitting. Most important was the perfect uniformity in output, which was extremely difficult to maintain when many calligraphists were working on the same book.

During the late 1950s, in a first attempt to use mechanical typesetting for a newspaper, the Lahore-based national Urdu daily, Nawa-i-Waqt, asked Linotype, the well-known manufacturer of typesetting equipment for newspapers, to create a font for it. Unfortunately, the designers of the type style attempted to come close to Nasta’lique and produced something that was not very readable. It would have been much better if Linotype had modified for Urdu the common, and far more readable, Naskh font that was being used on its machines for the Arabic newspapers. The Nawa-i-Waqt readers, two-thirds of whom welcomed the use of type in a survey by the newspaper, complained about the poor readability of the type style. The creation of hot metal molds for another font would have been very expensive. As a result, the owner-editor of the newspaper, Hameed Nizami, had to abandon his pioneering effort. During the 1970s, Mir Khalil-ur-Rehman, owner-publisher of the leading Karachi-based daily, Jang, started using Naskh phototypesetting equipment for some editorial matter to prepare his readers for a change. He would have switched over completely to Naskh in phases if Noori Nasta’lique had not emerged in the meantime. (More about it a little later.)

Early software for Urdu. The personal computers of the early days had very limited processing power, memory and storage but could still handle the character-based Naskh. Therefore, several enthusiastic developers worked on software for Urdu word processing. No attempt was made, however, to modify the Arabic software for Urdu, though this could have been done much more easily and economically by adding the extra characters required.
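How small that modification is can be sketched in a few lines of Python. The sketch assumes, for illustration only, that the "basic" Arabic alphabet is the Unicode range from hamza (U+0621) to yeh (U+064A); the function names and the sample word are likewise illustrative. It lists the letters of an Urdu word that an Arabic-only character set would lack.

```python
import unicodedata


def in_basic_arabic(ch):
    """True if ch lies in the assumed basic Arabic range U+0621..U+064A."""
    return 0x0621 <= ord(ch) <= 0x064A


def extra_letters(text):
    """Return, sorted, the letters in text outside the basic Arabic range."""
    return sorted({ch for ch in text if ch.isalpha() and not in_basic_arabic(ch)})


# "Pakistan" in Urdu spelling: peh, alef, keheh, seen, teh, alef, noon.
word = "\u067E\u0627\u06A9\u0633\u062A\u0627\u0646"
for ch in extra_letters(word):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Only two of the seven letters fall outside the Arabic set here; the rest are shared, which is why extending an existing Arabic font was the economical route.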

There were three main reasons why this modification was not done. The first reason was the lack of awareness. The companies that had created the word processing software for Arabic never realized the potential of the market for Urdu, which is spoken by more people than all speakers of Arabic put together. The Arabic market attracted great interest and investment also because of the oil boom in the region during that period. But the Arabic software engineers never realized that they could tap the huge Urdu market by doing nothing more than adding some extra characters to their font.

The second reason was the fixation on, or rather infatuation with, Nasta’lique, mostly among the old people, who unfortunately were also the decision-makers. Therefore, the software developers tried to design a font as close to Nasta’lique as possible, even if it could not be the real thing due to the severe technical limitations of the personal computer at that time. Since they did not use the Arabic software as the base, they had to do everything from scratch. And, to make it worse, they all chose different paths. That made their programs incompatible with one another, making it impossible to save a file in one program and open it with another. No organization or collective body attempted coordination or standardization.

The third reason was the lack of sufficient investment, both by the government and by the private sector. As English continued to be the official language, at least at the highest levels, there was no compelling urgency to develop software for Urdu. The private sector, on the other hand, was not sure of adequate return when the government, which could be the largest single buyer, was not deeply involved, and large business companies were quite happy with English software whenever they decided to use personal computers in their offices.

Due to these reasons, the software developed for Urdu turned out to be barely adequate (especially when compared with the sophisticated packages for English), very expensive due to low sales and unattractive to users due to lack of compatibility among various packages.

Nasta’lique for phototypesetting. While progress in the use of Urdu on personal computers was severely restricted, a major effort was made to make Nasta’lique available for photocomposing in commercial printing by Ahmad Jameel, an artist and later a leading printer of Karachi. (The following account of his product is taken from his interview, published in the Karachi Urdu monthly Science Digest in 1983.)

In 1979, Ahmad Jameel saw a demonstration of the phototypesetting equipment for the Chinese language at an exhibition in Singapore. The manufacturer was Monotype Corporation, the British firm that had been making mechanical typesetting equipment for about a hundred years for many non-Latin script languages, especially those of South Asia. The equipment was based on a patented technology that allowed perforation of a paper tape on a separate keyboard. The tape was then fed into a typecasting machine that produced every character individually. (Since the handling of individual characters was very difficult during a rush against deadlines, Linotype, another patented technology, was developed to produce column-wide solid lines of type to meet the needs of the newspaper industry.)

Monotype, with its vast experience in Arabic script languages, had maintained its hold on the market after phototypesetting technology began replacing the hot metal equipment in the 1960s, using its typefaces of the earlier days in the new technology. To make Nasta’lique available was, however, still very difficult. The technology developed for the Chinese language was adequate for Nasta’lique also, but there were not enough commercial prospects to justify the needed investment.

The basic requirement for producing a Nasta’lique font was the calligraphy on separate cards of all possible ligatures, which could be combined to produce words required for all types of work in Urdu. Ahmad Jameel offered to do the design work on ligatures on his own and at his own cost, leaving only scanning and subsequent steps to the firm’s engineers.

The task was so demanding and laborious that even the best Urdu calligraphists were either not capable of it or not willing to do it. Consequently, helped by his artistic background, Jameel himself painstakingly did the calligraphy of 18,000 ligatures for his script, which he named “Noori Nasta’lique,” after his calligraphist and artist father, Noor Ahmad. The script was formally unveiled on December 6, 1980. Still, as experience showed, there were words for which no ligatures had been provided in the font. It was quite ironic that in such cases the machine was programmed to switch automatically to the Naskh script, proving the superiority of the character-based script. (The company kept on adding ligatures for new or missed words as the customers pointed out the omissions, especially in newspapers.)
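The fallback described above can be sketched in Python. How the Monotype machine actually performed the lookup is not described in the interview, so everything below is an assumption made for illustration: the function name `compose`, the greedy longest-match strategy, and the tiny ligature table, which uses Latin letters in place of Urdu ones for readability.

```python
def compose(word, ligatures):
    """Greedily cover word with the longest known ligatures.

    Returns ("nastaliq", pieces) if the whole word is covered, otherwise
    ("naskh", list of single characters) as the character-based fallback.
    """
    pieces, i = [], 0
    while i < len(word):
        # Try the longest candidate starting at position i first.
        for j in range(len(word), i, -1):
            if word[i:j] in ligatures:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # No ligature starts here: fall back for the whole word.
            return "naskh", list(word)
    return "nastaliq", pieces


# Hypothetical ligature inventory (Latin stand-ins, not real Urdu data).
LIGATURES = {"ktb", "kt", "a", "b"}
print(compose("ktab", LIGATURES))
print(compose("ktxb", LIGATURES))
```

The first word is fully covered by the table; the second contains a piece the table lacks, so it is set letter by letter. A greedy match can miss a valid segmentation that a fuller search would find, but the sketch is enough to show why the table had to keep growing while the character-based path needed no additions.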

Despite Jameel’s financial contribution, the company’s investment in the development work was so large that a single machine cost Rs 6,000,000, and that too CIF Karachi, before payment of any customs duties. (In comparison, a phototypesetting machine with Naskh for Urdu cost about one-fifth as much, even after payment of all duties and taxes.) Still, there appeared on the scene an enthusiastic buyer for the Monotype machine. Mir Khalil-ur-Rehman, the founder and owner of the Jang Group of newspapers, was very keen to use Noori Nasta’lique for the new Lahore edition of his flagship “Jang” newspaper, started in October 1981. His experience at the Karachi edition had made him search for an alternative to the calligraphists, whom he considered no longer worth the money they were paid, nor able to meet the speed requirements of modern newspaper production.

There were, however, few other takers of the expensive new technology. According to newspaper reports, the Government was persuaded to exempt the first 10 machines from customs duties and taxes and buy some machines for its own printing organizations. The tax-free machines were sold ultimately but there was no further demand.

A cheaper version. The demand for Nasta’lique at an affordable cost persuaded some young computer engineers to break the protection code of the Noori Nasta’lique and make the software available clandestinely on personal computers at a fraction of the price of the original. Monotype, faced with no further demand for its product, marketed a more affordable version but met with limited success.

When personal computers practically took over the input work for typesetting, an Indian affiliate of Monotype developed a software package for Urdu, called “InPage,” which is marketed by Monotype and is used on its typesetting machines for output on film. Monotype, which holds the copyright on the Noori Nasta’lique font, allowed it to be incorporated into InPage. A “dongle,” or hardware protection device, was added to prevent piracy. Its high price, however, tempted the hackers, and a pirated version is now available at less than Rs 50.

InPage provides only the basic features of a word processing program. It pales before the sophisticated software for English, such as Microsoft Word, though its price is quite high (about US$350). While the firm will expect a substantial market to justify further investment in a major revision of the current version, the buyers will want a price that is in keeping with their own purchasing power. A customer in a developing country cannot afford to pay a price that may be justifiable in a country like the US. The dilemma may continue, with only some corporate users (like newspaper and magazine publishers) buying the authorized InPage, while others go for the pirated version.

Monotype, as a commercial organization, is naturally not happy over the situation. It may want to improve and promote the software but cannot make the required investment unless there are good prospects for a reasonable return. The position of other word processing packages for Urdu is about the same or worse. They face the same vicious circle: high price, low sales, small return on investment, low funds for further research and development.

Ahmad Jameel is probably the only relevant person who is not unhappy! He did not make any money to begin with. (In fact, he spent a lot of his own money on the project.) And the lack of substantial sales due to piracy does not offer better prospects for the future. But he is happy that his creation is serving the Urdu language. He said in a recent magazine interview that he was satisfied that his labor of love contributed to the use of Nasta’lique in Urdu publications.

The fall of Naskh. The dominance of Nasta’lique for Urdu would have ended but for a reversal caused by the impromptu action of the head of the government in 1984. Twenty years earlier, President Ayub Khan had called a meeting of experts and sought their opinion on making Naskh the official script. Ayub Khan himself was in its favor for three reasons:
a) As the country was entirely Muslim, every child learnt to read the Holy Qur’an. Since the holy book was always in Naskh, a child would have no difficulty in learning to read Urdu if it was also written in Naskh.
b) Since the regional languages were written in Naskh, the use of the same script would bring the national language closer to them.
c) The use of Naskh would create greater affinity with the Arab countries, which used the same script. (Pakistan International Airlines uses Naskh in its logo, causing a positive impact on the Arabs and other Muslims.)

The reasons, and that too coming from an army man, impressed the experts, as stated by Ishfaque Ahmad, a great writer and intellectual, and a participant in the meeting. The experts had their own reasons in favor of Naskh:
a) It allowed the addition of diacritical marks (zer, zabar, pesh), essential for correct pronunciation by children and the newly literate;
b) It enabled the use of the more efficient and economical typesetting equipment developed for Arabic and other languages;
c) It provided a variety of type styles for graphic designs, like other character-based fonts;
d) It could allow input of Urdu into the computers (which were mainly mainframes and minis at that time).

With the approval of the experts, the decision was taken to adopt Naskh for Urdu in all official uses. It applied also to all textbooks for schools and colleges. The decision, unfortunately, remained confined to the government. Its adoption was voluntary for the private sector. Since calligraphists were quite economical and available in abundance in those days, even the publishers of major newspapers did not purchase the phototypesetting equipment for Naskh. The same applied to publishers of magazines and books. And most calligraphists, who were trained only in Nasta’lique, were not willing to devote time and effort to learn and practice Naskh.

Consequently, while all children had their books only in Naskh, the readers of daily newspapers, magazines and general books saw only Nasta’lique. The Government did not realize the consequences of the dichotomy and allowed it to continue. Ultimately, the rising wages of the calligraphists and the falling prices of the phototypesetting equipment would have forced the private publishers to switch over to Naskh if Noori Nasta’lique had not emerged in the early 1980s – and another army man had not reversed the decision of his predecessor.

Gen. Zia-ul-Haq had great respect for teachers. In 1984, an old teacher complained to him that “the handwriting of children had deteriorated drastically after the adoption of Naskh for their textbooks.” For the teacher there was no alternative to Nasta’lique, and he suggested the reversal of Ayub Khan’s policy. (The teacher failed to realize that all languages written in the Arabic script had different styles for printing and handwriting, just as the Latin script languages do, and that the children and others would evolve their own in due course of time.) With no ideas of his own, Gen. Zia promptly agreed. Without allowing a debate on the subject or consulting experts, he reintroduced Nasta’lique in all official uses, including textbooks.

The need to return to Naskh for use in computers is still pressing. At the same time, it is easier to meet. Now, after years of development, Microsoft Windows XP and Office XP provide full support for Urdu in the Naskh script. As a result, we can use the very advanced features of the software, not only for word processing but also for databases, spreadsheets, etc. Urdu users can enjoy the same facilities that are available for English worldwide.

President Gen. Pervez Musharraf has brought about major reforms since he took over. He may undo the harm caused by the thoughtless reversal of policy and reintroduce Naskh as the official script for all purposes, including education, science and technology. The private sector may also be asked to switch over to Naskh. With the use of computers, it should be only a matter of pressing a few keys.

