Tech News (2019, week 36)

Here is the latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you.

Translations are available: Bahasa Indonesia • ‎Deutsch • ‎Tiếng Việt • ‎français • ‎polski • ‎português do Brasil • ‎suomi • ‎čeština • ‎Ελληνικά • ‎русский • ‎українська • ‎עברית • ‎العربية • ‎فارسی • ‎中文 • ‎日本語

Due to Wikimania and summer vacation, no issue of Tech News was distributed last week.

Recent changes

  • You can use the new termbox interface when you edit Wikidata on a mobile device. It makes it easier to edit labels, descriptions, and aliases on the mobile pages. [1]
  • The new version of MediaWiki was deployed during the last week.
  • The previously announced change of position of the “Wikidata item” link on all wikis has been rolled back due to unexpected cache issues. [2]
  • The limit for rollbacks has been increased from 10 to 100 rollbacks per minute. [3]
  • The advanced version of the edit review pages (Recent Changes, Watchlist, and Related Changes) now includes two new filters, “All contents” and “All discussions”, which limit the view to those namespaces. The “All discussions” filter does not include pseudo talk pages, such as discussions in the Project: or Wikipedia: namespaces, but it does include changes happening on the Project talk: or Wikipedia talk: namespaces. [4]

Changes later this week

  • The new version of MediaWiki will be on test wikis from 3 September. It will be on non-Wikipedia wikis and some Wikipedias from 4 September. It will be on all wikis from 5 September (calendar).
  • When you log in, the software checks your password to see if it follows the Password policy. From this week, it will also complain if your password is one of the most common passwords in the world. If your password is not strong enough, please consider changing it to a stronger one. [5]
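The kind of check described above can be sketched as follows. This is an illustrative example only, not MediaWiki's actual implementation, and the tiny sample list stands in for the real common-passwords data.

```python
# Illustrative sketch: reject passwords that are too short or that appear
# in a (here, tiny) list of the world's most common passwords.
COMMON_PASSWORDS = {"123456", "password", "qwerty", "letmein", "111111"}

def is_strong_enough(password: str, min_length: int = 8) -> bool:
    """Return False if the password is too short or too common."""
    if len(password) < min_length:
        return False
    if password.lower() in COMMON_PASSWORDS:
        return False
    return True

print(is_strong_enough("password"))
print(is_strong_enough("correct horse battery staple"))
```

A real implementation would check against a much larger dataset and enforce the full password policy, but the shape of the check is the same.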


Future changes

Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe to receive it on wiki.

Building a CRM to better support diversity and growth in Wikimedia

Wikimania 2019 Volunteers (Wikimania 2019 Volunteers.jpg) by Rudolf H. Boettcher, CC-BY-SA-4.0.

The Wikimedia movement aims to become a platform that serves open knowledge to everyone. We are focusing on building strong and diverse communities that reflect the world, including those that have been left out by structures of power and privilege. In order to better support the growth of this thriving movement, the Wikimedia Foundation is improving the community infrastructure to provide safe spaces and equitable processes for all participants.

As we build strong and diverse communities and as we break down social, political, and technical barriers, a certain limitation emerges: we cannot expect to know everybody in the Wikimedia movement. The Wikimedia Foundation supports the engagement and development of volunteers in many ways, from calls for feedback to scholarships for events or project grants. We must provide equitable support to everyone, with special consideration to new and emerging contributors and underrepresented communities. For that, we need to establish a platform to increase our breadth and depth of knowledge about Wikimedia communities worldwide.

Meet the upcoming Wikimedia movement CRM (an acronym that we are defining as Community Relationship Manager). As of today it is just a plan for a first iteration between now and June 2020. The main goals of this plan are:

  • Set up a CRM based on CiviCRM software that is accessible to Wikimedia Foundation teams interacting with communities. 
  • Introduce data reflecting the main structures of the movement: affiliates, committees, and Wikimedia projects. Goals include making it easy to contact committees and keeping track of changes in their membership, thus keeping a better record of our collective history. This will pave the road for broader groups to be included and for more sophisticated queries.
  • Whenever possible, offer a registration tool for Wikimedia events, from Wikimania to a small editathon. A next step will be to handle scholarships for these events, offering the committees that approve scholarships better infrastructure for fair judgement.

There are many other possible use cases, and we will explore them as time allows. For example:

  • Offering a form where volunteers can sign up to receive information about projects or programs.
  • Provide organizers of training courses and editathons a process to issue certificates of completion and badges for achievements.
  • Provide the GLAM community and other projects working with partners the possibility to share contacts and relationships with external organizations. 

Check our initial plan and share your feedback here. The Community Relations team is leading the development of this CRM program. A top priority for us is to ensure that our workflows for storing and handling this information follow best-practice standards of security and privacy. We are committed to the transparency of this program. We will keep sharing details about the development of the CRM and its use.

Also, we have just published a job posting for a CRM Specialist (LinkedIn, Twitter), the person who will lead this program. Please help us spread the word about this opportunity, and if you think you may be right for the position, submit your application. 

Upcoming Wikimedia events for September 2019

Panorama of Kew Gardens and the Temperate House, taken from the Pagoda. This is a composite image showing the passage of a year, from spring blossom, through summer greens and autumnal reds, through to winter snow.
Four seasons at Kew Gardens.jpg, by James Morley, CC-BY-SA-3.0.

September brings the end of seasons (Winter/Summer) and yet the movement carries on. We’ve collected some upcoming events from across the movement to share. Did we miss any? Add your events to the calendar, and leave a comment with suggestions!


Wikisource for Social Justice: Story of Gujarati Wikimedians Helping the Visually Impaired

Gujarati Wikimedians have been working on the Audio Book project on Wikisource to help visually impaired people. An interview with the lead contributor, Mr Modern Bhatt, by Abhinav Srivastava, with inputs from Sushant Savla.

Gujarati Wikisource logo – image by Dsvyas, CC BY-SA 3.0 Unported

Q.1) It is always said Indians love Wikisource, but, Audio Books? That isn’t a routine. How did the idea come up?

I am a regular visitor to a blind school in Bhavnagar which also happens to be my hometown. I help students with English and Mathematics and quite often on request by students, I used to narrate stories to them. It was then that the school director, who is also blind himself, proposed an idea of having pre-recorded books. 

The audio recording then started, and there has been no looking back.

Q.2) Is there a specific thematic area where you work upon say a specific Gujarati author? 

If I have to name just one, then it has to be Jhaverchand Meghani. From my childhood until today, I have been fond of him and admired his contribution to Gujarati literature. He wrote on the history of the Saurashtra region of Gujarat, where I am from. He travelled from one village to another discovering facts and evidence.

Q.3) The Open Knowledge Movement is for a better society; however, its end-merit remains incidental. Your initiative directly helps the visually impaired. What motivates you?

Obviously, those students from the blind school. However, it is also encouraging to know that people like you and me, busy with the daily routine of earning our bread and butter, are also beneficiaries. Audio books save time.

Q.4) Did peers from your community also join you in the initiative? Tell us how they see it. What kind of conversations happen around Audio Books?

I am humbled by the support I receive from my fellow Gujarati Wikimedians like User:sushant_savla and others. We have an active WhatsApp group where we regularly debate and discuss the Audio Books project. To share a more precise and recent update, we are sampling voices for women authors and have also selected a few, to name one, Bharti Chavda.

Q.5) Unheard and unfound, there are a lot of challenges, efforts, and struggles that go into the work of passionate Wikimedians. Tell us something in brief.

I was working on a book, ‘Saurashtra ma Rasdar’, which has 28 chapters and roughly 350 pages. While working on that book, I suffered a sore throat and had to consult a doctor. The doctor gave me a few medicines and I recovered my voice.

That’s all. Otherwise, as they say in Gujarati, I have been in, ‘Majama’.


Q.6) In the back of your mind, you would have done the maths on the number of books you wish to complete. There would be a rational number of books that are practically possible, but there would also be a dream number. Tell us about that dream number and more.

I would like to answer this differently. The Audio Books project is still very new, and we are learning and gaining experience every day. I will not quote any number, but the aim is to record as many books as possible. I retire from my job in July 2019 and will devote much more time to the project, to make the maximum possible.

Q.7) India has Wikisource active in so many languages. Any message for them on Audio Books?

Gujarati Wikimedians have the highest regard for each and every Indian language; they show the diverse Indian culture. We are always there to assist to the best of our potential. A learning to share would be the struggle in finding volunteers; that is an important area that needs to be contemplated.

Q.8) The Gujarati language has a very close connection with the Kutchi language. Kutchi does not have a Wikimedia project and remains in incubation. Do you believe something like Audio Books could provide a stimulus to its growth?

Well, that remains to be seen, but yes, there is a possibility. I can say there is a lot of material in the Kutchi language to be worked on at Wikisource. We would need to find a group of committed volunteers to take it up. Also, I would like to mention that the Blind People Association has shown support for hosting activities in the Kutch area of Gujarat.

Q.9) Tell us something more: do you also edit other Wikimedia projects? Briefly share your experience.

At the moment, I devote all my time and energy to Wikisource. However, someday I may well. All Wikimedia projects are public goods for welfare.

Q.10) Tell us something about your personal life. Where are you from? What do you do outside Wikimedia?

I stay in Bhavnagar, a city in Gujarat, and I am a banker with the Bank of India. India is a developing country and there are still many unbanked people. My professional life deals with developing saving habits and promoting financial inclusion for a better India.

Originally published by Wikimedia India on 25 June, 2019.

Working with Structured Data on Commons: A Status Report

American University SOC students helping the National Archive scan files and photos.
Editathon at national archive.-American University COMM535.JPG, by Xiaweiyang, CC-BY-SA-3.0.

The beginnings of Structured Data on Commons have been available for a little over half a year now, so let’s take a look at how editors can already work with it, and what more is coming soon. (Disclaimer: though the author is a Wikimedia chapter employee, this post is written in a volunteer capacity only.)

What’s already available

You can, of course, edit the structured data (captions and statements) directly on the file pages. Like any other changes, these edits will show up in the page history, in recent changes, on your watchlist, etc., so other editors can see, inspect, patrol, improve or undo them as usual. This is a great way to get started with Structured Data and get a grasp on how it works.

The Upload Wizard supports structured data as well, and you can set captions on each file before uploading it (and, like with the description, categories, etc., you can copy one file’s captions into remaining files, if you want to use the same caption for a whole batch of uploads), as well as edit each file’s statements.

Another way to add Structured Data is offered by the ISA tool, which is focused on improving the metadata of pictures uploaded as part of “Wiki Loves …” campaigns. It allows participants to add captions in different languages, as well as “depicts” statements, to photos that are part of the campaign (as selected by the campaign coordinator via a category). The coordinator can optionally limit a campaign to only captions or statements if they don’t want to overwhelm their participants or they think that only one of those aspects is necessary.

The Wikipedia Android app also allows you to edit the captions of images embedded in Wikipedia articles. (The iOS app doesn’t seem to have any such feature.)

You can also search the structured data in the regular wiki search, using special search keywords. The full documentation is at mw:Help:Extension:WikibaseCirrusSearch, but the most important keywords are hascaption, incaption and haswbstatement: hascaption:en searches for files that have an English caption, incaption:"search text" searches for “search text” in a file’s captions (and not in its description, categories, etc.), and haswbstatement:P180 searches for files that have a matching statement. All of these can be combined with other search terms as usual – for example, “adoptado hascaption:es -hascaption:fr haswbstatement:P180=Q146” searches for files that depict cats and where the (non-structured) description contains the word «adoptado» (“adopted” in Spanish) which have a caption in Spanish but not in French.
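As a rough illustration of combining these keywords programmatically, the sketch below assembles parameters for the regular MediaWiki search API (action=query with list=search). The helper name is my own invention, and sending the request to the Commons API endpoint is left out.

```python
# Sketch: build search-API parameters that combine the structured-data
# keywords described above. Only the parameter dict is constructed here;
# no network request is made.

def build_search_params(*terms: str, limit: int = 10) -> dict:
    """Parameters for the MediaWiki search API (action=query, list=search)."""
    return {
        "action": "query",
        "list": "search",
        "srsearch": " ".join(terms),
        "srnamespace": "6",   # the File: namespace on Commons
        "srlimit": str(limit),
        "format": "json",
    }

# Files depicting a cat (Q146) with a Spanish caption but no French one,
# whose non-structured description contains the word "adoptado":
params = build_search_params(
    "adoptado", "hascaption:es", "-hascaption:fr", "haswbstatement:P180=Q146"
)
print(params["srsearch"])
```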

There is also a way to edit the statements of multiple files at once: the user script Add to Commons / Descriptive Claims (AC/DC), written by yours truly, lets you add the same collection of statements (including qualifiers) to a whole set of files. You can use this, for example, to add a suitable “depicts” statement to all the files in a category. (But make sure that all the files actually depict the category subject and are not merely related to it! This wouldn’t work at all for Category:Käthe Kollwitz, for example, because it combines media depicting her with media by her. Sometimes suitable subcategories like Category:Portraits of Käthe Kollwitz exist.)

And finally, if you’re a technical expert you can always use the MediaWiki and Wikibase APIs directly to make any edits you want – for example, User:Multichill did this during the Wikimedia Hackathon 2019 in T223746.
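For instance, a direct API edit adding a “depicts” (P180) statement to a file's MediaInfo entity could be assembled roughly like this. This is a hedged sketch that only builds the request parameters for the wbcreateclaim module; a real edit would additionally need an authenticated session and a valid CSRF token, both omitted here.

```python
import json

# Sketch: assemble wbcreateclaim parameters for adding a "depicts" (P180)
# statement to a Commons file's MediaInfo entity (an "M" id). The media id,
# item id, and token below are placeholders for illustration.

def depicts_claim_params(media_id: str, item_id: str, csrf_token: str) -> dict:
    numeric_id = int(item_id.lstrip("Q"))
    return {
        "action": "wbcreateclaim",
        "entity": media_id,            # e.g. "M1234" for a Commons file
        "property": "P180",            # depicts
        "snaktype": "value",
        "value": json.dumps({"entity-type": "item", "numeric-id": numeric_id}),
        "token": csrf_token,
        "format": "json",
    }

params = depicts_claim_params("M1234", "Q146", "EXAMPLE_TOKEN")
print(params["value"])
```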

What’s coming soon

A full-featured SPARQL query service for Structured Data on Commons is in the works (T141602); this basically blows the haswbstatement search keyword mentioned earlier out of the water, letting you search not just for simple “has statement” matches but providing a powerful way to query the whole data graph. For example, this will make it possible to search for files that were taken anywhere within a certain city (without having to mention that city on each file – connections from districts etc. to the surrounding city are already on Wikidata), or files depicting animals within a certain family or order. It will also allow users to query the qualifiers of statements, which is not possible in the regular search either. Regular search will remain the best way to search within the file captions (or traditional descriptions), but fortunately the two can be combined using MWAPI.
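To give a flavour of what such queries might look like, here is a sketch that builds a SPARQL query string for files depicting anything that is an instance of (a subclass of) a given class. The exact prefixes and federation details of the future endpoint are assumptions on my part; only the query string is constructed.

```python
# Sketch: a query the upcoming SPARQL service could answer, e.g. files
# depicting any animal within a given family, via a property path.

def depicts_within_class_query(class_qid: str, limit: int = 10) -> str:
    """SPARQL for files whose 'depicts' subject falls under the given class."""
    return f"""SELECT ?file WHERE {{
  ?file wdt:P180 ?subject .                    # depicts
  ?subject wdt:P31/wdt:P279* wd:{class_qid} .  # instance of a subclass of the class
}}
LIMIT {limit}"""

# Using the cat item (Q146) from the search example earlier in this post:
print(depicts_within_class_query("Q146"))
```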

Lua support is also underway; this will make it possible to embed the structured data in the wikitext, usually via templates. For example, {{Location}} could be updated to get the coordinates from the structured data (specifically the property coordinates of the point of view) if they are not specified as a template argument, similar to how on many Wikipedias, {{official website}} gets the official website from Wikidata if it’s not specified as a template argument. Other templates could also automatically categorize images based on their structured data, similar to how {{Wikidata infobox}} already adds some parent categories to category pages based on the information in Wikidata. This will be up for discussion and implementation by the community, of course.

We can also expect to see support for Structured Data on Commons in more tools. QuickStatements, the Swiss Army knife for editing Wikidata, will hopefully gain support for editing captions and statements on Commons soon (T181062 – in fact there is some very rudimentary support already, but it’s so fragile that I don’t want to give any guidance on it). This will allow for more fine-grained editing than the AC/DC user script mentioned above, though I hope that AC/DC will remain useful as a more user-friendly tool for a common use case. Support for the Pywikibot library (T223820) and the Pattypan upload tool (T181057) is also planned. And tools should learn to work better together: PagePile support in VisualFileChange or Cat-a-lot and AC/DC would allow you to select a set of files using the former tools and then add statements to all of them using the latter, by exchanging the selection of files via the PagePile tool.

Tech News (2019, week 34)

Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you.

Translations are available: Bahasa Indonesia • ‎Deutsch • ‎English • ‎Tiếng Việt • ‎español • ‎français • ‎polski • ‎português do Brasil • ‎suomi • ‎svenska • ‎čeština • ‎русский • ‎українська • ‎עברית • ‎العربية • ‎中文 • ‎日本語

Tech News

  • There will be no Tech News issue next week. The next issue of Tech News will be sent out on 2 September 2019.


Recent changes

  • Some abuse filters stopped working because of a code change. Only variables for the current action will work. Variables defined inside a branch may not work outside of that branch. You can read more to see how to fix the filters.
  • Only six accounts can be created from one IP address per day. Between 12 August and 15 August this was two accounts per day. This was because of a security issue. It is now six accounts per day again. [1]

Changes later this week

  • Only a limited number of accounts can be created from one IP address. An IP address can be whitelisted so that it can create as many accounts as needed. This is useful at events where many new persons learn to edit. IP addresses that are whitelisted for this reason will also not show CAPTCHAs when you create accounts. This will happen on Wednesday. [2]
  • The new version of MediaWiki will be on test wikis from 20 August. It will be on non-Wikipedia wikis and some Wikipedias from 21 August. It will be on all wikis from 22 August (calendar).


  • You can join the technical advice meeting on IRC. During the meeting, volunteer developers can ask for advice. The meeting will be on 21 August at 15:00 (UTC). See how to join.

Future changes

  • There is an RFC about creating a new global user group with the right to edit abuse filters. This will be used to fix broken filters and make sure all filters will still work when software changes happen. You can read more and comment.
  • Special:Contributions/newbies will no longer work, for performance reasons. It showed edits by new accounts. You can see this in the recent changes feed instead. [3]

Tech news prepared by Tech News writers. More information on how to contribute is available at Tech/News#contribute

Translate • Get help • Give feedback • Subscribe or unsubscribe for on wiki distribution.

Structured Data on Commons Part Three – Multilingual File Captions

Wikimedia Commons

Wikimedia Commons holds over fifty million freely licensed media files. These millions of images, sounds, videos, documents, three-dimensional files and more contain a vast amount of information related to the contents of the file and the context of the world around them. As Commons has collected files over the years, the volunteers who curate and maintain the site have developed a system to contain and present this information to the world, using MediaWiki, wikitext, and templates.

A description template is the first and primary way information about a file is shown to users. These templates can be a powerful tool for displaying information about files; descriptions provide meaningful context and information about the work presented. Descriptions can be as long as the user would like, using wikitext markup and links for others to find out more. Description templates can also hold translations by adding language fields. However, the Structured Data team saw some areas where a feature like captions could improve on description templates.

A description template that contains a lot of well-organized information, but might not be serving the purpose that a caption can. Template credit Wikimedia Commons, text may be CC-BY-SA 3.0 where applicable.
A caption for the same file based on facts in the description template.

Multilingual captions help share the burden of descriptions by providing a space to describe a file in a way that is standard across all files, easy to translate, and easy to use. Captions do not support wikitext, so no knowledge of how links work is needed in this space; links can still be provided in the more expansive file description. Captions are added during the upload process using the UploadWizard, or they can be added directly on any file page on Commons. The translation feature for captions is a simple interface that requires only a few steps to create and share a caption translation.

Adding other languages to a caption.

The “multilingual” in “multilingual captions” highlights a primary focus of Structured Data features: opening up access to Commons to as many languages as possible beyond its present capabilities. This is enormously beneficial to the Wikimedia movement and the Wikimedia Foundation’s mission of sharing knowledge with the world. In addition to captions, future planned features will support adding “statements” from Wikidata to files, effectively describing them in an organized way that can be accessed by programs and bots that present media. These statements can be multilingual, as Wikidata supports translations, which will make statements searchable in any language that has a translation provided.

Next: Part Four – Depicts Statements

Previously: Part Two – Federated Wikibase and Multi-Content Revisions

Half a Million Articles Translated With Content Translation


Content Translation has achieved a new milestone, having supported the creation of 500,000 Wikipedia articles. The Language team has been working during the last year to make the tool more solid, and has plans to expand the use of translation to help more communities grow.

Wikipedia users can learn about many topics. However, the exact number of topics they can access varies greatly depending on the language they speak. While English-speaking users can access more than 5 million articles, Bengali speakers have access to 75 thousand articles.

Translating articles into new languages is a practice that can help content to propagate more fluently across languages, and reduce this language gap. To facilitate this process, we here at the Wikimedia Foundation developed a content translation tool that helps Wikipedia editors to easily translate articles. Content Translation simplifies translating Wikipedia articles into different languages by automating many of the boring steps of the manual translation process.

In early August, Content Translation reached a new milestone: more than half a million articles were created since the tool was released four years ago, making this a good time to reflect on the impact of the tool and discuss future plans.

A more reliable tool

During the past year, the Language team worked on a new version of the tool. Based on user research and feedback, the plan was to create a more solid version of Content Translation to increase the tool's adoption and use.

For the new version, we replaced the default editing surface provided by the browser with Visual Editor, which supports rich wiki content much more reliably. This required rewriting most of the translation tools, and we wanted to take this opportunity to review them and provide better guidance for newcomers.

As the new version became more complete, it was gradually exposed more prominently during the year, and finally replaced the previous version completely without major regressions. During the year, more than 149,000 translations were created, a 23% increase compared to the previous year.

We started conversations with different communities to identify the main blockers before the tool could be provided by default and exposed to more users.

Better collaboration between humans and machines

In addition to the number of articles created, we focused on the quality of the content. The new version improved the guidance provided to newcomers. In particular, a new system was created to encourage users to review and edit the initial machine translation, and approaches based on Artificial Intelligence were explored to improve some automatic steps.

Content Translation provides machine translation as initial content for editors to review and improve. The machine translation is provided as a starting point, and translators are highly encouraged to rewrite the content, in order to eliminate errors and make the translation sound more natural. 

The new version incorporates new quality control mechanisms for machine translation. Now the tool encourages translators to review the initial automatic translations on a paragraph basis, keeps translations published with unmodified content in a tracking category for editors to review, and prevents publishing translations that exceed the defined limits. The limits become stricter for users with previously deleted translations, users ignoring the warnings, and cases where several paragraphs contain unmodified content. In this way, the limits adapt to reduce potential recurrent misuse of the tool.
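A toy model of such an adaptive limit might look like the following. The thresholds and penalty values are invented for illustration; this is not the actual Content Translation code.

```python
# Illustrative sketch of an adaptive publish limit: the allowed share of
# unmodified machine translation shrinks for users with previously deleted
# translations or ignored warnings. All numbers here are made up.

def may_publish(unmodified_ratio: float,
                prior_deleted_translations: int,
                ignored_warnings: int,
                base_limit: float = 0.85) -> bool:
    """Allow publishing only if the unmodified share stays under the limit."""
    limit = base_limit
    limit -= 0.10 * min(prior_deleted_translations, 3)  # stricter after deletions
    limit -= 0.05 * min(ignored_warnings, 3)            # stricter after warnings
    return unmodified_ratio < limit

print(may_publish(0.50, 0, 0))  # careful translator, mostly rewritten
print(may_publish(0.80, 2, 1))  # mostly unmodified MT, repeat offender
```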

This system can be customized to address the particular needs of each community, and proved to be useful to help the Indonesian Wikipedia editors to reduce the creation of low quality translations.

In general, our measurements suggest that translations are less likely to be deleted than articles started from scratch. The survival rate for translations, even those created by newcomers, seems quite good. A recent study shows that a significant percentage of the translations created with the tool survive community review. Although the survival rate is better for experienced users, it is still very good for newcomers (users who created their account during the last six months). For example, only 7.5% of translations created by newcomers last June were deleted after a month.

In addition, Artificial Intelligence is becoming more present in the tool to make the initial translations better.

We believe that automation with adequate quality-control mechanisms enables translators to create higher-quality translations more easily.

Future plans

Translation has already helped many communities to create new content. However, there are still communities with the potential to grow through translation that have not been using the tool as much.

Content Translation’s Boost initiative is aimed at expanding the use of translation to help more communities grow. By enabling new and more visible ways to contribute by using translation, we expect communities to attract new editors, and expand the knowledge available in their languages.

We identified potential for expanding its use to more contexts that can benefit from translation:

  • Translation can be used by more wikis. The adoption of Content Translation varies significantly from wiki to wiki, and there are wikis with potential to benefit from using translation more.
  • Translation can be used in more ways. Currently, Content Translation focuses on creating new articles on desktop. Supporting new kinds of contribution, such as expanding existing articles with new sections or translating on mobile, would enable more opportunities to contribute.

During the next months we will focus on wikis with the potential to grow through translation. As a representative set of those wikis, we have initially selected Malayalam, Bengali, Tagalog, Javanese, and Mongolian. We’ll be contacting these communities to gauge interest in the project and learn about their particular needs so we can support them better. We expect these and similar communities to benefit as a result.

Our specific plans will be heavily influenced by research in the selected communities and their feedback. Please provide any feedback about this initiative on the discussion page. We are interested in hearing your ideas on how to help communities grow by using translation.

Tech News (2019, week 33)

Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. 

Translations are available: Bahasa Indonesia • ‎Deutsch • ‎English • ‎Tiếng Việt • ‎español • ‎français • ‎polski • ‎português do Brasil • ‎suomi • ‎svenska • ‎čeština • ‎русский • ‎українська • ‎עברית • ‎العربية • ‎中文 • ‎日本語

Recent changes

  • Editors using the mobile website on Wikipedia can opt in to new advanced features via their settings page. This will give access to more interface links, special pages, and tools. Feedback on the discussion page is appreciated. [1]
  • Due to the absence of volunteer maintenance of the Cologne Blue skin, the link to activate it will be hidden. The skin will still work, but editors using it are encouraged to switch to another skin. [2]

Changes later this week

  • Due to Wikimania, there is no deployment this week. [3]


  •   You can join the technical advice meeting on IRC. During the meeting, volunteer developers can ask for advice. The meeting will be on 13 August at 15:00 (UTC). See how to join.

Future changes

  • The “Wikidata item” link will be moved from the “Tools” section to the “In other projects” section on all Wikimedia projects, starting on 21 August. Full announcement. Phabricator task.

Tech news prepared by Tech News writers. More information on how to contribute is available at Tech/News#contribute

Translate • Get help • Give feedback • Subscribe or unsubscribe for on wiki distribution.

By digitizing venerable translations, they’re bringing the world’s literary history to Punjabi speakers

The Municipal Library, Patiala
The Municipal Library Patiala.jpg, by Wikilover90, CC-BY-SA-4.0.

The Punjabi-language Wikisource is the fastest-growing Wikimedia project in the world. Rupika Sharma, a volunteer Wikimedia editor and community member, writes about one of the initiatives that has helped make this a reality.

Imagine growing up in a world where the greatest literary works in history never existed.

For many of the world’s language speakers, this can be their functional reality. Titles like these have either never been translated, or were translated decades ago and now hide in ancient paper-bound texts on dusty library shelves.

As an example of this problem, let’s take a look at the Punjabi language. Separated as part of the 1947 partition of British India, the language is today spoken by 120 million people in regions of Pakistan and India. I’m one of them. I grew up in northwest India and can still remember hearing about Chambe Diyan Kaliyan, a short story collection by Leo Tolstoy that was adapted into the Punjabi by Abhai Singh. That particular book is frequently cited in the history of Punjabi literature as one of the first collections of short stories to be published in the language.

You’ll note, though, that I didn’t say I can remember reading it—I’ve never been able to track down one of the published books to read it for myself, nor have I been able to find anything but a bunch of pop-culture songs with similar titles when I search for it online in Punjabi. All of which is to say that when I was growing up, reading and learning from Tolstoy’s story was functionally impossible for Punjabi speakers.

Thankfully, times are changing. While there are still many barriers to surmount, the advent of the internet has made publishing and distributing translations far easier. The Wikimedia community has an entire project devoted to this sort of thing: Wikisource.

Wikipedia 18 celebration in Patiala, 15 January 2018. (Image by Satdeep Gill, CC BY-SA 4.0.)

Bringing the lost literature of long-forgotten times into the modern era, Wikisource is a free e-library that provides freely licensed and public domain books at no cost, in different formats, and able to be used for any purpose. It is one of thirteen collaborative knowledge projects operated by the Wikimedia Foundation, the largest of which is Wikipedia, and it is available in nearly seventy languages.

The Punjabi-language Wikisource was and is small compared to other language Wikisources, and to grow this resource, I formed a partnership with a government library in the Indian city of Patiala to digitize public domain books. By making rare literature books accessible in languages that have little to no presence online, Wikisource serves the common people, allowing them to freely browse these resources.

As the Wikimedian in Residence at the library, I helped their staff scan a selection of important books. The collaboration brought forty-two public domain Punjabi-language works online—including a reprint of Chambe Diyan Kaliyan, the Tolstoy short story collection. But just making the scanned images available online isn’t enough; they are not easy to read and often rank low in search engines. Wikisource plays a crucial middleman role: it hosts the images and pairs them with searchable text versions, created and vetted by volunteers. Volunteers are helped in this process by Jay Prakash’s IndicOCR, a new tool that makes it easy to transcribe texts in any Indic language on Wikisource. (It replaced an older Linux-based tool that could not be used on many devices.) In addition, Wikisource makes everything available in different file formats so that readers can download whatever works best on their device, whether it’s a computer, tablet, phone, or otherwise.
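Because the searchable text layer described above is ordinary wiki content, it can also be retrieved programmatically through the standard MediaWiki Action API that all Wikimedia projects expose. As a minimal sketch (not part of the original article — the page title here is an illustrative assumption), a query for the latest wikitext of a Punjabi Wikisource page might be built like this:

```python
# Sketch: build a MediaWiki Action API URL that fetches the latest
# transcribed wikitext of a page on the Punjabi-language Wikisource.
# The page title below is a placeholder example, not from the article.
from urllib.parse import urlencode

API_ENDPOINT = "https://pa.wikisource.org/w/api.php"

def build_revision_query(title: str) -> str:
    """Return an Action API URL requesting the current wikitext of `title`."""
    params = {
        "action": "query",        # standard query module
        "titles": title,          # page whose text we want
        "prop": "revisions",      # ask for revision data
        "rvprop": "content",      # include the revision text itself
        "rvslots": "main",        # main content slot
        "format": "json",
        "formatversion": "2",
    }
    return API_ENDPOINT + "?" + urlencode(params)

url = build_revision_query("ਮੁੱਖ ਸਫ਼ਾ")  # "Main Page" in Punjabi
print(url)
```

Fetching that URL with any HTTP client returns JSON containing the proofread wikitext, which is what makes the scanned books searchable and reusable downstream.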

Finally, Wikisource allows anyone to contribute, and so I helped organize an online contest, held from December 2018 to January 2019. Prize offerings and in-person trainings brought around three dozen new volunteers to the project, including twenty-four who made more than fifty edits. Kuljit Singh Khuddi, a new volunteer who joined Punjabi Wikisource during the contest, says, “I am proud to be able to contribute to my mother tongue on Wikisource. Such contests help make my language known worldwide.”

The results were stark—the contest made the Punjabi Wikisource the fastest-growing Wikimedia project in the entire world in both content and editors. As of October of last year, the Punjabi Wikisource contained a bit over 1,200 pages. By January of this year, it had over 6,770 pages belonging to 200 different books. Moreover, over 6,000 of these pages had been proofread by volunteers.

The growth of the Punjabi Wikisource through the contest and other volunteer work is just a beginning. There are a number of opportunities for supporting the project with technical contributions and GLAM partnerships with different government organizations and institutions.

Moreover, they’re just one of several expanding Wikisources in the region. The Wikisources for the Indic languages of Marathi, Kannada, and Assamese each more than doubled in size in the last year, and with every edit, they’re bringing the sum of all knowledge into their own mother tongues.

Rupika Sharma, Wikimedia community member

Originally published by Rupika Sharma on Wikimedia News, 11 July 2019.