Grants:APG/Proposals/2017-2018 round 2/Wikimedia Indonesia/Progress report form
Purpose of the report
This form is for organizations receiving Annual Plan Grants to report on their progress after completing the first 6 months of their grants. The time period covered in this form will be the first 6 months of each grant (e.g. 1 January - 30 June of the current year). This form includes four sections, addressing grant metrics, program stories, financial information, and compliance. Please contact APG/FDC staff if you have questions about this form, or concerns submitting it by the deadline. After submitting the form, organizations will also meet with APG staff to discuss their progress.
Metrics and results overview – all programs
Metrics | Goals | Achieved outcome | Status | Explanation
---|---|---|---|---
1. number of total participants | 1443 | 1919 | | We reached our full-year target in one semester. In the first semester we held many edit-a-thons and training sessions as our community grew, helped by our effort in building two more community spaces in two cities. We also gained many new partners from the community.
2. number of newly registered users | 1150 | 1560 | | See the above explanation.
3. number of content pages created or improved, across all Wikimedia projects | 31300 | 13554 | | On track. In the first semester, the Wikidata project was still building the tools for importing the data; we expect the outcome to increase before June 2019.
4. Active collaborations | 44 | 42 | | We signed MoUs with several universities and GLAM partners and collaborated with various local community partners, including the government, the private sector, and informal communities.
5. Volunteer hours | 10702 | 5045.5 | |
Background
In 2018, Wikimedia Indonesia has two big grants (Wikimedia Indonesia and Ford Foundation) and various small grants and in-kind donations (e.g. European Union, Goethe-Institut Indonesien). We are trying to build capacity, reduce the gender gap in the Indonesian art world, and support women artists in art creation, curatorial work, research, and travel via Cipta Media Ekspresi, an open-call grant by Wikimedia Indonesia and Ford Foundation. We had more than 1,400 applicants and 41 grantees, and distributed more than USD 230,000 in grants. We also held competitions, such as the Ganesha Project with Goethe-Institut Indonesien to create and improve social science content in the Indonesian Wikipedia, and the EUforia Project, a joint project with the European Union to improve EU-related articles in the Indonesian Wikipedia.
In 2018, Wikimedia Indonesia's APG-funded work has 4 umbrella programs: Education, GLAM, Content Creation, and Community Engagement. In Education, we train new editors, build the capacity of our volunteer trainers, and encourage educational institutions to become more involved in Wikimedia projects. In GLAM, we collaborate with various partners to provide free local content, from old magazines to photo collections. In Content Creation, we focus on improving Wikidata content and increasing the number of photos and Wikipedia articles about Indonesian cultures, and we collaborate with a local university to build a Javanese OCR for Wikisource. In Community Engagement, we allocate funds for Wikimedia project volunteers and plan to hold the first Indonesian WikiConference, named WikiNusantara.
Telling your program stories – all programs
Education
Since 2016, we have rebuilt and maintained a regular education program within Indonesia. We have brought in a lot of local content, including Indonesian arts, culture, women artists, and various Indonesia-related topics. As we received an enormous number of requests to hold education programs from various partners, communities, educational institutions, and other organizations, we held our first Training of Trainers program for our volunteers. We built their capacity in planning an event, making a good presentation, being a good presenter, and fixing minor technical problems during events. We hope they can be our “front guard”: we need their help to collaborate and to increase the visibility of the Wikimedia movement and projects by training and educating new partners and new volunteers. They are precious assets to us because we are nothing without them.
WikiLatih (WikiTraining)
On 25 and 26 August 2018, we invited 20 volunteer trainers from various cities, including Padang, Bandung, Yogyakarta, and Cirebon, to a two-day workshop. The workshop aimed to give them advanced training on how to conduct a well-prepared WikiLatih. They are also our movement ambassadors, so we want to build their capacity. We hope they understand the Wikimedia projects and movement well enough to explain how they work, know the Wikimedia projects' policies and guidelines, can solve the usual troubleshooting issues during an event, and understand Creative Commons licenses. The event also helped the trainers bond, and the result was significant: we gained many opportunities with new partners through the participants. One of them is Wikipedia Goes to Campus, a collaboration with a participant's lecturer at Andalas University, Padang, West Sumatra. The lecturer, Pramono, invited the Padang Wiki Community to teach how to contribute to Wikipedia. Five local Wikipedians (Hardi, Denas, Adhmi, Sonia, and Riska), including participants of this project, taught 3 classes in the last semester, which yielded 121 new articles in the Minangkabau and Indonesian Wikipedias and 95 new images on Commons. Seeing this result, Pramono would like to continue the collaboration.
Narrative
WikiLatih is our education and outreach project, which started in 2016. This project connects us to potential new partners, encourages many new volunteers, and strengthens the local community. We hope that through this project, Indonesian people understand how important sharing knowledge is and how our movement works. We believe that outreach programs for our movement should be held continuously.
WikiLatih brought many new partner institutions to Wikimedia Indonesia, such as Syiah Kuala University (Banda Aceh), Imam Bonjol Islamic State University (Padang), and International Animal Rescue Foundation (Bogor). We thank Creative Commons Indonesia and the members of Wikimedia Indonesia, because many of those institutions were first introduced by them. Last but not least, we keep maintaining our relationships with longtime partner institutions such as Indorelawan, the Embassy of Sweden, the Museum of Modern and Contemporary Art in Nusantara (MACAN), and Lontar Foundation.
The Embassy of Sweden invited us again to collaborate on WikiGap in Padang, West Sumatra. We were also invited by Indorelawan to participate in National Volunteer Month 2019. Museum MACAN, which has an Open Source History program, brought in art teachers and art critics to write in Wikipedia – they believe that Wikipedia is the best place to write all knowledge about art. We hope that we can maintain a good relationship with them while spreading awareness of the importance of Wikipedia for society. We also plan to reach more partners in eastern Indonesia, as we do not have many partnerships in that area.
Lesson learned
In recent years, we have received many requests to hold WikiTraining. Since not all requests can be fulfilled, we tried to solve the problem with a Training of Trainers program, which increases volunteers' capacity and involvement in offline events.
Wikipédia Menyang Sekolah (Wikipedia Goes to School)
This project has opened a gate for further collaborations with educational institutions, not only for similar projects but also for other possible Wikimedia Indonesia projects. By collaborating with the educational institutions, especially with people in the field of Javanese language and literature, we also find interesting topics and questions from the students that we can bring into discussions with the Javanese Wikipedia community. The most notable one is their feedback showing differences in understanding of Javanese spelling, which needs to be discussed further between the community and the educational institutions.
In this project, we also opened a simple internship to help organize the project. The internship lasts from November 2018 to January 2019 and so far has shown good outcomes. We selected two people who met the qualifications as interns. The interns have helped a lot with the preparation and execution of WMS, especially in preparing references for students writing articles and in documenting the project itself. They managed to find references on local culture that benefit article creation for this project. Apart from that, the interns are ready to be involved not only in WMS but also in other projects, so the internship has delivered the desired results.
Narrative
Wikipédia Menyang Sekolah (WMS) is our pilot project on education that focuses on the Javanese Wikipedia. This is the first project involving the Javanese Wikipedia community in integrating Wikipedia editing activities with school and university curricula. We managed to find three educational institutions that are willing to collaborate with us. The collaboration will be done with the teachers and the students from the field of Javanese language and literature in each educational institution.
By implementing this project, we expect to change the belief that learning the Javanese language is boring. Reaching out to educational institutions to raise awareness about the movement is a challenge that we are taking on. By collaborating with these educational institutions, we expect to be able to present a new and different form of Javanese language pedagogy. Furthermore, the students’ capability to think critically is expected to improve gradually.
Through the introduction of Wikipedia and training to edit Wikipedia in a classroom for 10 meetings at most, we expect that the students’ confidence and writing skills in Javanese will be stimulated by using advanced technology, thus erasing the above-mentioned belief among the younger generation, starting with these students. Additionally, the students’ contributions will enrich the content of the Javanese Wikipedia, which we also expect to be about local cultures. This project is also expected to grow the local Wikipedia community by bringing in more active contributors.
This project lasts from July 2018 to July 2019 and includes a preparation stage, an execution stage, and an evaluation stage. It is divided into two batches, I and II. Batch I runs from July 2018 to January 2019, and batch II will start in the first week of February and last until June at most. The preparation stage is mostly done from July to November and covers preparation of references, outreach, coordination with the educational institutions, budget planning, and volunteer training. The execution stage of batch I started in the last days of November 2018 and ends in mid-January 2019, while the execution stage of batch II is planned to start in February 2019 and end in May 2019. The execution stage includes volunteer training, monthly meetings, and meetings with educational institutions, 30 times in total. The evaluation stage with educational institutions is conducted every three months at most, or at the end of the project. The evaluation stage for batch I will be done in mid-January 2019, after the university’s examinations have ended.
Lesson learned
So far, in batch I, we have done the preparation stage and the implementation in one of the three educational institutions that we aim to collaborate with. The full evaluation stage will be done after the university exams in mid-January 2019. The results of batch I can already be seen, even though a few more days are needed to complete them because the assessment is not due yet. However, there are lessons that we can learn and evaluate from batch I. In terms of preparation, there should have been more coordination between the educational institution and the Project Coordinator regarding attendance, assessment, and technical support, such as internet connection and computers. In terms of execution, there should have been better time and reference management. The quantity and quality of articles submitted by the students do not show significant improvement yet. This is most likely due to the low attendance rate per class per meeting and the short class period. We expect that by the due date of the assessment, there will be a significant rise in both the quality and the quantity of the articles.
Batch I has not yet shown the desired outcomes. We are aware that important steps need to be taken so that batch II, which will begin in February 2019, can produce the desired and more successful outcomes. The first important step is to have a meeting with the WMS team, consisting of three volunteers. In that meeting, we will discuss the best time management, the most suitable training methods for the classroom, and the references. The second important step is to have better coordination methods with the educational institutions so that there will not be any miscommunication during the program. The last significant step is to have follow-up coordination with each educational institution.
GLAM
Digitizing manuscripts is important for any gallery, library, archive, and museum (GLAM). The process helps them in cataloging so that they are aware of the collections that they have, the collections that are most accessible to visitors, and the collections that need special treatment. Wikimedia Indonesia, as an institution with a mission to free knowledge, pays attention to these matters because GLAM collections can be free sources of information for everybody. Therefore, efforts should be made to keep these valuable resources from possible damage.
One of the efforts made by Wikimedia Indonesia was providing support for the digitization team in the city of Padang, West Sumatra. The team, which consisted of one lecturer and several people experienced in manuscript conservation, submitted a proposal to Wikimedia Indonesia for manuscript preservation in two locations that were centers of Minangkabau knowledge: (1) Kutub Chanah in Maninjau, and (2) Jorong Lurah Bukik in Payakumbuh. Kutub Chanah is where Hamka and his father wrote their famous historical writings. Jorong Lurah Bukik is where A. Damhoeri wrote important works, some of which are still used as references for learning the Minangkabau language and culture. The collections of these two locations are valuable for present-day society to reflect on. We also believe that the digitization can benefit researchers with an interest in the works of these famous figures. Our previous digitization of the correspondence of Ki Hadjar Dewantara and its Indonesian translations proved useful for visitors at Dewantara Kirti Griya Museum in Yogyakarta.
“I want to thank you for your kind efforts in digitizing the letters. It is very helpful for us (the Museum) to exhibit the transcription of the letters at the local museum exhibition in Yogyakarta in October (2018).”
(Nyi Sri Muryani, Museum Dewantara Kirti Griya)
Narrative
GLAM 2018 is a subproject of Wikimedia Indonesia APG 2018 focusing on digitizing texts, such as magazines, books, and letters that have become public domain. This year, GLAM 2018 collaborates with several local archives and government agencies to help digitize and archive their collections. There are five agencies that have been and are being approached for cooperation. These agencies are (1) Minangkabau Cultural Information and Documentation Center (Padang Panjang City, West Sumatra; ongoing), (2) Tamansiswa Dewantara Kirti Griya Museum (Yogyakarta City, DI Yogyakarta; ongoing), (3) Ajip Rosidi Library (Bandung City, West Java; ongoing), (4) Yayasan Sastra Lestari (Surakarta City, Central Java; ongoing), and (5) Geology Agency (Bandung City, West Java; proposing cooperation). Agencies (2) and (4) have been working with Wikimedia Indonesia since 2017, while (1), (3), and (5) are first-time collaborators.
Minangkabau Cultural Information and Documentation Center
The collaboration with the Minangkabau Cultural Information and Documentation Center ("PDIKM") focuses on digitizing magazines, newspapers, and books that are considered urgent to digitize because no digital copies exist. Immediate action is therefore necessary to minimize damage to the collection. In addition, most of the collections are already in the public domain. The magazines and newspapers were published in the Dutch East Indies, so the authors and editors of the works are deemed to have passed away. At the least, if the rights to individual works in the publications belong to the publisher, the works are already in the public domain because more than 50 years have passed since their publication (1920). The collaboration with PDIKM has been underway since August 2018, and the digitization results can be accessed on Wikimedia Commons. The number of pages digitized from PDIKM is 1,500.
Tamansiswa Dewantara Kirti Griya Museum
This year, the collaboration with Tamansiswa Dewantara Kirti Griya Museum ("Museum") continues the digitization of Ki Hadjar Dewantara's correspondence that was started in the wmid:Digitalisasi Konten 2017 subproject (see Dokumentasi Aktivitas § 28 November). At this stage, Wikimedia Indonesia has completed digitizing most of the correspondence and is reviewing it before it is uploaded to Wikimedia Commons at the end of January 2019. In addition, in this term Wikimedia Indonesia will digitize several Javanese magazines published in the 1930s and bilingual books of children's songs from the Museum’s collection. Digitization of these collections is planned to be completed by April 2019 so that they can be translated into Indonesian to improve their readability for readers who cannot understand Dutch or Javanese.
Ajip Rosidi Library
The collaboration with Ajip Rosidi Library ("Library") is Wikimedia Indonesia's first collaboration with the institution. It was initiated by the Library's request, made through a Sundanese Wikipedia volunteer, Ilham Nurwansah, to digitize its collections. The collaboration is important because most of the collections there are manuscripts that are already fragile, so special treatment is necessary. At the time this report was written, more than 90 pages of the manuscripts had been digitized.
Yayasan Sastra Lestari
The collaboration with Sastra Lestari Foundation ("Yastri") has been underway since 2017. This year, Wikimedia Indonesia plans to continue digitizing a number of old magazines. In 2018, Wikimedia Indonesia completed the digitization of Kajawen Magazine, published from 1927 to 1941. The issues from 1932 onward will be published on Wikimedia Commons later under Category:Majalah Kajawen. Due to some technical problems, such as poor visibility in the scanning results, uploading cannot be done until the problems are resolved.
Badan Geologi
Wikimedia Indonesia plans to file a request for the Geological Agency ("Agency") to release a number of its collections to Wikimedia Commons. The collections are in the form of photographs and geology-themed magazines. This effort is important for improving the quality of geology-themed articles in the Indonesian Wikipedia in terms of references and photographs. With the collaboration, it is hoped that geology-themed articles can be improved into good-quality articles, and that there will be opportunities for geology-themed writing competitions on Wikipedia in the future.
Lesson learned
We learned two important lessons from the first semester of GLAM 2018. First, a digitizing guideline is necessary for operators to follow. The guideline can take the form of stages for digitizing manuscripts, such as how to choose a text (based on priority), how to place it under a scanner and keep it from being folded or shifted, and how to list the manuscripts that have been digitized so that the catalogue and metadata of the manuscripts can be made. This guideline will be produced in the second semester of this subproject and will be available in digital form.
Second, GLAM 2018 really requires volunteers who can dedicate their time to assist activities across five locations. The same problem occurred in wmid:Digitalisasi Konten 2017, but it could be resolved because activities were in only two locations. At the moment, this problem is being addressed by finding volunteers with ample spare time for the second semester. At the end of that period, there will be a number of experienced volunteers to continue digitizing activities at the designated locations.
Nevertheless, this subproject is going as planned based on the proposed timeline.
Content Creation
In Content Creation, we focus on Indonesia-related content, with an emphasis on culture. We improve Wikidata with Indonesian content, build the Javanese OCR, and hold photography and writing competitions.
Wikidata
Wikidata is beautiful, and would be even more beautiful if more data could be added to it. This, however, is not an easy task: a combination of technological and societal approaches is a must. During the first six months, our project "Improving the Indonesian content of Wikidata" has made an impact on us. It was an eye-opening moment when we realized that Indonesia, the fourth most populous country in the world, is still underrepresented in Wikidata. This has sparked our interest in adding more data about Indonesia-related entities to Wikidata, and in promoting Wikidata to Indonesians.
In collaboration with the Wikidata community, Wikimedia Indonesia, and the Faculty of Computer Science, Universitas Indonesia, we have made a small improvement to Wikidata. Cultural heritage entities, such as Tlasri Inscription, Duck Race or Pacu Itiak, and Laughing Chicken Contest, have now been recorded in Wikidata, enriching its knowledge diversity. In addition, various Indonesian dishes such as Bloranese Chicken Skewers, Gorontalonese Poki-Poki Chili Sauce, and Jambinese Fat Rice have been added to Wikidata, making Wikidata yummier. In line with this effort, we have also introduced Wikidata to students of the Faculty of Computer Science, Universitas Indonesia.
Who knows, someday, perhaps later in their software projects in academia or professional life, those students might build applications (e.g., Indonesian culinary apps) that rely on Wikidata as the data source? We hope that our little Wikidata improvement can go a long way, to the future of those students, and all people around the world.
Narrative
Indonesia is a country rich in cultural heritage. An archipelagic country of more than 17000 islands and over 300 ethnic groups, Indonesia contributes a wide array of cultural heritage entities ranging from temples and inscriptions to dances and languages. Documenting such entities is not an easy task, and capturing them in a knowledge base (KB) is even more challenging.
Our project aims to improve the content of knowledge about Indonesia-related entities in the Wikidata KB. To have a more concrete impact, we focus our efforts on three topics: cultural heritage, education, and government. For the Indonesian cultural heritage topic, we have managed to create around 1900 new entities using a semi-automatic approach. We have also enriched the knowledge of existing Indonesian cultural heritage entities in Wikidata by adding statements about entity types, location (regency/city and province) and country (= Indonesia). In total, the number of edits for this topic has reached more than 13000.
For the education and government topics, we have designed a pipeline to import data from open data portals in Indonesia into Wikidata. Open data portals such as data.go.id, data.jakarta.go.id, and data.bandung.go.id provide a vast amount of knowledge about topics such as education and government, that is still buried in the tabular, CSV format. According to the 5-star data rating scheme coined by Tim Berners-Lee, the inventor of the (Semantic) Web, data in such open data portals already meets the criteria of a 3-star rating. The next steps for a 5-star rating are to publish data using the structured, Web-friendly RDF data format and to link data to other data (e.g., Wikidata) to provide context. We have initialized an effort and developed a prototype to convert CSV data from Indonesian open data portals to RDF format. The next step that is to be realized within the next 6 months is to import the converted RDF data into Wikidata.
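As a rough illustration of the CSV-to-RDF step described above, the following minimal Python sketch turns each CSV row into N-Triples lines. It is not the project's actual prototype: the `http://example.org/` namespace, the function names, and the choice of emitting one plain-literal triple per non-empty cell are all assumptions made for illustration.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace; a real pipeline would use its own ontology IRIs.
BASE = "http://example.org/"


def escape_literal(value):
    """Escape characters that are not allowed raw inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(csv_text, id_column, dataset="dataset"):
    """Convert CSV text into a list of N-Triples lines.

    Each row becomes one subject IRI built from `id_column`; every other
    non-empty cell becomes a triple with the column name as predicate.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    triples = []
    for row in reader:
        row_id = quote(row[id_column].strip(), safe="")
        subject = f"<{BASE}{dataset}/{row_id}>"
        for column, value in row.items():
            # Skip the identifier itself and empty or missing cells.
            if column == id_column or value is None or value.strip() == "":
                continue
            predicate = f"<{BASE}property/{quote(column.strip(), safe='')}>"
            literal = escape_literal(value.strip())
            triples.append(f'{subject} {predicate} "{literal}" .')
    return triples
```

Skipping empty cells keeps the output free of empty literals, which is one simple way to feed only clean data to a later import stage. Linking values to existing Wikidata items, the remaining step toward a 5-star rating, is outside the scope of this sketch.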
Last but not least, we are proud to have held a workshop to promote Wikidata and share our project progress with students of Universitas Indonesia. The workshop, called Wikidata for Artificial Intelligence (WD4AI), was held on Nov 3, 2018. The talks in the workshop were Introduction to Wikidata by Raisha Abdillah, and Ontologies on Wikidata by Adila Krisnadhi, Ph.D. Around 140 participants attended the workshop; each of them created a new Wikipedia/Wikidata account and improved at least 4 Wikidata statements. In total, over 600 Wikidata statements about Indonesia have been improved.
Lesson learned
We have learned several lessons from the ongoing project. First, the CSV-to-RDF conversion module still needs to be polished. Since the pipeline for importing from CSV into Wikidata relies on this conversion module, we need to complete the module as soon as possible. Rigorous testing of the conversion of CSV datasets from the three open data portals (i.e., data.go.id, data.bandung.go.id, and data.jakarta.go.id) is necessary to ensure that only clean RDF data is fed to the next stage in the pipeline, which imports the RDF data into Wikidata. Second, we are currently writing publication drafts on the proposed approaches. We hope to have submitted one by Jan 2019 and another by April 2019. We also plan to give a talk about the project at the WikiNusantara 2019 conference in April 2019, which is held by Wikimedia Indonesia. Third, we are still finishing an ontology for describing the converted CSV datasets, and are developing a semi-automatic alignment tech
Wiki Cinta Budaya (Wiki Loves Culture)
In the Wiki Loves Earth 2018 photo competition, Wikimedia Indonesia carried the theme of Indonesian culture. We tried to find partners to hold this competition. We sent a proposal to the Directorate General of Culture (DGC), Ministry of Education, and we finally met to discuss the competition further. Their response turned out to be positive. They were willing to help disseminate information about the competition to their networks. In addition, they were willing to provide cultural trip prizes to three winners.
Working with government institutions was not as easy as imagined. The DGC had a lot of work to do, so communication between us took quite a long time, for example when we had to determine the time and location of the cultural trips and photo exhibition. Besides receiving cameras, the winners were also sent to Toraja (South Sulawesi) to attend the annual Lovely December cultural festival on December 27-30, 2018. The photographs from these cultural trips will also be uploaded to Wikimedia Commons. The DGC and Wikimedia Indonesia will also organize a photo exhibition in February 2019 in Jakarta.
Following the collaboration on the Wiki Loves Earth competition, Wikimedia Indonesia was invited by the DGC to attend the Pre-Cultural Congress on November 4-6, 2018. At that pre-congress, representatives from Wikimedia Indonesia helped complete the recommendations for the formulation of cultural strategies by discussing them with other representatives.
Narrative
Wikimedia Indonesia organized an online photo competition on Indonesian culture (dance, music, traditional games, foods, and drinks) that lasted for 60 days (September 1 – October 31, 2018) on Wikimedia Commons. Initially, the competition was planned to last 30 days (September 1 – September 30, 2018). However, the committee extended the competition period because capturing photos related to culture turned out to be more difficult than expected. To introduce the competition, we held outreach sessions in 3 cities:
- Palu (Central Indonesia): August 20, 2018
- Aceh (West Indonesia): September 4, 2018
- Sorong (East Indonesia): September 20, 2018
The judging phase had several stages, covering selection by the committee and by the jury:
- November 1-14, 2018: Photo selection by the committee (online and offline)
- November 15-26, 2018: Photo selection by the jury phase 1 (online)
- November 29, 2018: Photo selection by the jury phase 2 (offline)
The participants in Wiki Loves Culture were geographically diverse. They came from 26 of Indonesia's 34 provinces.
This project was organized because of our desire to add freely licensed photos of Indonesian cultures so that they could be used by anyone. Until now, many photos of Indonesian culture on Wikimedia Commons originated from the Royal Netherlands Institute of Southeast Asian and Caribbean Studies (KITLV) and the Tropenmuseum (a museum in the Netherlands) - thanks to Wikimedia Nederland for those collaborations. Therefore, we also wanted Indonesians to participate and contribute by capturing photos of Indonesian culture under a free Creative Commons license.
We noticed that many Indonesian citizens tended to upload their photos to social media, such as Facebook, Instagram, or Twitter, while Wikimedia Commons, a platform that provides digital photos under a free license, was not yet known by many people. Through this competition, we hoped many people would learn more about Wikimedia Commons and would be willing to upload their photos under a free license so that other people could use them more easily. Furthermore, with this project's focus on capturing moments related to culture, we also expected that there would be more openly licensed photos of Indonesian culture. So far, we still have difficulty finding photographs of Indonesian culture; where they exist, we have to pay to use them.
The competition managed to attract many people’s attention. Many of them joined as a participant and uploaded their photos on Wikimedia Commons during the competition, thus the numbers of photos about Indonesian cultures increased. This can be viewed by checking the photos based on the categories on Wikimedia Commons.
Unfortunately, after the competition ended, participants no longer contributed to Wikimedia Commons. This concerns us. We realized that we need to find another way to make people interested in contributing to Wikimedia Commons voluntarily, without any prizes. One step we may take is introducing Wikimedia Commons to beginner photographers, with an explanation of open licenses and the procedure for uploading photos to Wikimedia Commons. One of the judges of the Wiki Loves Culture competition offered to meet a representative of Antara Photo Gallery (a gallery and photography training venue) to discuss this activity.
Lesson learned
There are some lessons that we can learn from Wiki Loves Culture. First, we need to install the central notice early. The competition was publicized through social media, posters, outreach sessions in three cities, and the central notice on the Indonesian Wikipedia, local-language Wikipedias, Wikidata, and Wikimedia Commons. The channel that reaches many people most quickly is the central notice. However, the committee was late in requesting the installation of the central notice on Meta. It took about two weeks from the request until the central notice was installed, so it appeared in the middle of the competition period. From this case, the committee learned that for the next competition, the central notice request must be made two to three weeks prior to the start of the competition.
Second, a different approach is needed to explain Wikimedia Commons to photographers. The committee held outreach sessions in Aceh (West Indonesia), Palu (Central Indonesia), and Sorong (East Indonesia). Some participants were photographers. The explanation of Wikimedia Commons was rather difficult for them to understand, especially regarding the Creative Commons license (CC BY-SA). Most photographers still thought about copyright and economic rights when they uploaded photos to Wikimedia Commons. The jury's input on this matter was to emphasize the digital footprint created by uploading to Wikimedia Commons, which makes the name of the uploader known to people on the internet. One of the jurors, Gunawan Kartapranata, said that his photo on Wikimedia Commons was used by several organizations, and his name eventually became known by many people.
Third, we need to create a registration form. To take part in the Wiki Loves Culture competition, participants could directly upload photos and then send the URL by email. The absence of a registration form made it more difficult for us to contact the winners because the username and the email address used were different. For the next photo competition, a registration form is needed so that the committee can contact participants by phone.
Fourth, the committee needs to make additional points for the photo description. Besides giving an explanation of the photo, the description can be used as a material for writing articles on Wikipedia.
Javanese OCR
Two significant issues need to be highlighted from our work on Optical Character Recognition (OCR) for Javanese characters.
The first concerns the provision of data. Initially, we planned to use raw data in the form of scanned manuscripts available on Wikimedia Commons, mainly so that the researchers could concentrate on exploring different algorithms at each step of the OCR process. However, the scanned manuscript images have low resolution, and the original scans from the museum were also made at a resolution far below the standard for OCR. The research team therefore decided to produce raw data by scanning different manuscripts. Providing standard data for the recognition process turned out to be non-trivial, since the scanned manuscripts contained noise and had to be rescanned. The process becomes more laborious because the required data sets comprise raw data (scanned manuscripts), training data, test data, and evaluation data, and each set must be different.
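The requirement that the training, test, and evaluation sets differ can be illustrated with a small sketch. It is not the project's actual tooling: the function name `split_dataset`, the 70/15/15 proportions, and the file names are all assumptions made for illustration.

```python
import random


def split_dataset(items, fractions=(0.7, 0.15, 0.15), seed=0):
    """Shuffle items and cut them into three disjoint lists.

    The proportions are hypothetical; the report does not state them.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n_train = round(len(shuffled) * fractions[0])
    n_test = round(len(shuffled) * fractions[1])
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    evaluation = shuffled[n_train + n_test:]
    return train, test, evaluation


# Hypothetical file names standing in for annotated character images.
samples = [f"glyph_{i:03d}.png" for i in range(100)]
train, test, evaluation = split_dataset(samples)
```

Because each sample lands in exactly one slice of the shuffled list, no annotated character can appear in more than one set.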
The second issue relates to the segmentation process, which encompasses separating the foreground (the character pixels) from the background (the paper color or noise), horizontal segmentation (the lines on a page), and vertical segmentation (each character in a line). Javanese script is classified as an abugida, in which each character unit represents a syllable and may carry diacritics or other character components. The segmentation process therefore has to include these components, which may be located above, below, to the left or right of, or directly attached to a base character. We first applied the Connected Component technique to segment the text images vertically and horizontally. This technique separated diacritics from their base characters into separate indexes, making it hard to identify which diacritic belongs to which base character. We therefore switched to the Projection Profile (PP) technique, which handles this problem well. One disadvantage of PP is that it cannot split overlapping lines. To cope with this, we turned to a statistical approach for detecting outliers in a data set. This simple method works well and reduces the number of merged lines that PP leaves unsegmented.
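The idea of projection-profile segmentation followed by an outlier check can be sketched as below. This is a minimal illustration, not the TRAWACA code: the input is assumed to be a binary image given as a list of rows (1 = foreground), and the outlier rule (height above mean plus `k` standard deviations) is one hypothetical choice, since the report does not say which statistic was used.

```python
from statistics import mean, stdev


def horizontal_profile(binary):
    """Count foreground pixels in each row (used to split a page into lines)."""
    return [sum(row) for row in binary]


def vertical_profile(binary):
    """Count foreground pixels in each column (used to split a line into characters)."""
    return [sum(col) for col in zip(*binary)]


def split_runs(profile, threshold=0):
    """Return (start, end) pairs for consecutive profile entries above threshold."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def flag_merged_lines(runs, k=2.0):
    """Flag runs that are unusually tall, which suggests merged text lines.

    The rule mean + k * stdev is an assumption made for this sketch.
    """
    heights = [end - start for start, end in runs]
    if len(heights) < 3:
        return []
    limit = mean(heights) + k * stdev(heights)
    return [run for run, h in zip(runs, heights) if h > limit]
```

A flagged run could then be re-examined, for example by splitting it at the row with the lowest pixel count inside the run. That follow-up step is also an assumption rather than something described in this report.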
Narrative
In this digital Renaissance era, many primary sources (books, manuscripts, articles) from the 1500s to the 2010s can be accessed through digital libraries or the Internet. This has been made possible by massive projects to digitize valuable and historical books. The main goal of these projects is to preserve books containing knowledge, science, outstanding thought, culture, and wisdom and to pass them on to the next generation. Such projects do not stop once the manuscripts have been scanned but continue to the OCR process, which recognizes the characters in the manuscripts and so converts document images into accessible and searchable text. Unfortunately, such projects have mostly covered European or Central and East Asian manuscripts. For this reason, two lecturers from the Faculty of Information Technology, Duta Wacana Christian University, collaborated with Wikimedia Indonesia to form a project called TRAWACA, which aims to develop software capable of recognizing Javanese characters through an OCR process. The long-term goal is to pass on the local knowledge and wisdom recorded in the manuscripts to the millennial generation, in the hope that Javanese literary works, knowledge, and wisdom will not disappear as the manuscripts degrade.
Excluding scanning and data collection, the OCR process comprises several stages: noise removal, correction of image orientation, grayscaling, feature extraction, character classification (also known as character recognition), and evaluation. The first two stages are skipped in this project to save time. As a consequence, the scanned documents used as input must be free of noise, in good condition, and at the standard resolution for OCR. As of 31 December 2018, grayscaling was done using a function available in Python, the programming language used in this project. Foreground segmentation, horizontal segmentation (splitting a page into lines), and vertical segmentation (isolating the individual characters in a line) were implemented with the Projection Profile (PP) technique. A statistical method for detecting outliers was then applied to refine the PP output. Before PP, we had implemented a technique called Connected Component (CC), but it yielded unsatisfactory results. Feature extraction used two methods: the 9-zone feature and the ratio of the character area. At the time of writing, we are preparing the training data for the classification process.
Two stages of work remain: character classification and evaluation. Based on our experience with the segmentation process, character classification is unlikely to rely on a single technique or method. We plan to experiment with two different machine learning methods: a nearest centroid classifier and either a variant of the Bayes model or a Convolutional Neural Network (CNN). We expect that one of these methods will give a satisfactory recognition rate.
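The two feature extraction methods named above can be sketched as follows. The exact definitions used in the project are not given in this report, so this sketch makes two assumptions: the 9-zone feature is taken to be the foreground density in each cell of a 3 x 3 grid laid over the character image, and the ratio of the character area is taken to be the share of foreground pixels in the character's bounding box. The function names are hypothetical.

```python
def zone_features(glyph, grid=3):
    """Foreground density per cell of a grid x grid partition (9 zones if grid=3).

    `glyph` is a binary image as a list of rows, with 1 marking foreground.
    """
    rows, cols = len(glyph), len(glyph[0])
    features = []
    for zr in range(grid):
        r0, r1 = zr * rows // grid, (zr + 1) * rows // grid
        for zc in range(grid):
            c0, c1 = zc * cols // grid, (zc + 1) * cols // grid
            cells = [glyph[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            features.append(sum(cells) / len(cells) if cells else 0.0)
    return features


def area_ratio(glyph):
    """Share of foreground pixels in the glyph's bounding box (assumed definition)."""
    total = len(glyph) * len(glyph[0])
    return sum(map(sum, glyph)) / total


def feature_vector(glyph):
    """Concatenate both features into one 10-value vector for a classifier."""
    return zone_features(glyph) + [area_ratio(glyph)]
```

Using densities rather than raw pixel counts keeps the features comparable between characters that were segmented at different sizes.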
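As an illustration of the first planned method, a nearest centroid classifier can be written in a few lines. This is a generic sketch of the technique, not the project's implementation: the feature vectors and the syllable labels "ha" and "na" are made-up toy data, and Euclidean distance is an assumed choice of metric.

```python
from math import dist


def fit_centroids(samples, labels):
    """Average the feature vectors of each class to obtain one centroid per label."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, x):
    """Return the label whose centroid is closest to x (Euclidean distance)."""
    return min(centroids, key=lambda y: dist(centroids[y], x))


# Toy two-dimensional feature vectors with illustrative labels.
samples = [[0, 0], [0, 2], [10, 10], [10, 12]]
labels = ["ha", "ha", "na", "na"]
centroids = fit_centroids(samples, labels)
```

With these toy values, `predict(centroids, [1, 1])` returns `"ha"` because the point lies closer to the centroid `[0.0, 1.0]` than to `[10.0, 11.0]`. The method needs no iterative training, which makes it a cheap baseline to compare against a Bayes model or a CNN.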
Lesson learned
So far, we have learned two invaluable lessons. One concerns the OCR workflow, and the other concerns the creativity needed to find a refining method that compensates for the disadvantages of a previously applied method. The OCR stages explained earlier are sequential, meaning that the output of one process is the input of the next. A late implementation of one stage therefore delays the start of the remaining stages. In this semester, the data annotation process, which depends on the feature extraction process, was unable to deliver the targeted amount of annotated data on time. We believe two main factors caused this delay. First, we failed to anticipate that data annotation is costly and time-consuming, so we did not propose a budget for it in our proposal. Second, we lack experienced and passionate annotators with both IT and Javanese knowledge. The assistants available for annotation are IT students with little knowledge of Javanese characters. Besides, data annotation is a repetitive, routine task that is boring for IT students, who prefer coding to labeling data manually. We therefore need a strategy for proceeding with the OCR process even while the preceding task is unfinished.
The other lesson is that in implementing each OCR stage, we could not rely entirely on standardized methods with predefined functions. Most of these methods were designed and implemented for recognizing characters in the writing systems of European languages, so they present different problems when applied to Javanese characters. As the segmentation process showed, it becomes necessary to find creative refinements that compensate for the disadvantages of a previously applied method. One drawback is that experimenting with several methods for a single phase of OCR takes time.
Community Engagement
In Community Engagement, we support and strengthen the Indonesian Wikimedia community through grants and a conference. The grants provide financial support for various Wikimedia projects, including Wikipedia, Wikimedia Commons, Wiktionary, and Incubator projects for new languages, across a wide range of topics. The conference will be held in April 2019 in Yogyakarta. The committee members come mainly from the Indonesian, Javanese, Sundanese, and Minangkabau projects.
Grants
The grant has attracted unexpected interest from small Wikimedia communities in Indonesia. The Javanese, Sundanese, Minangkabau, and Gorontalo Wikipedia communities, each with active editors numbering in the 20s or fewer per month, are small compared with the Indonesian Wikipedia community, which has 500 or more active editors per month.
The small communities carried out a variety of activities this semester, from camping to providing references for the community. The WikiCamp by the Minangkabau community was the first camping event for new Wikimedia contributors in Indonesia, held in order to create a new project, the Minangkabau Wiktionary. Attended by 15 students from high schools across West Sumatera, the event was held at Carolina Beach in Padang City for two nights. During their stay by the beach, the participants added new entries to the Wiktionary so that it can soon be released from the Wikimedia Incubator. The two-night event generated more than 1,500 entries of Minangkabau words for the Minangkabau Wiktionary. Another activity was the provision of references by the Gorontalo and Javanese communities. The two communities purchased bilingual dictionaries that help their members, especially new contributors, write in their Wikipedias in the standard language.
The grant also interested a group of students in starting a Wikipedia in the Osing language, a language native to Banyuwangi Regency in East Java. The activity was initiated by a Wikipedian in Yogyakarta Province who invited five Osing people studying in the province to take part. They used the grant to hold several meetups, purchase dictionaries, and travel to Banyuwangi to introduce the Osing Wikipedia to more Osing people in their homeland, in the hope that local people would also take part in building the Wikipedia. There are now more than 100 articles in the Osing language in the Incubator.
Narrative
The WMID Grant is a program that provides financial support for Wikimedia communities within the Republic of Indonesia. Opened on August 17th, 2018, the program provides small amounts of funding of up to IDR25,000,000 for a group or an individual. It is the first program of its kind by Wikimedia Indonesia and aims to grow local volunteers' interest in improving local content on Wikimedia projects. The landing page for the grant is on Meta, where any Indonesian Wikimedian can propose a grant and report on their activities. The program is expected to boost activity on the Wikimedia projects supported by WMID, especially the Indonesian, Javanese, Sundanese, Minangkabau, and Gorontalo Wikipedias. Each of those wikis has at least three people maintaining it. Communities were encouraged to apply for funding. They may use the grant for activities such as meetups, editathons, and photowalks that may help the community grow in its ability to organize funding and events independently.
Given this expectation, we measure our success by the number of submitted proposals. We set a target of 5 proposals a year. Surprisingly, by the end of this semester, 18 proposals had been submitted through Meta. Eight of them were accepted and have already received financial support, and their proposers are responsible for managing the funding independently. The Sundanese community submitted one proposal while the Javanese, Minangkabau, and Gorontalo communities submitted two proposals. Most of their proposals were accepted in consideration of their interest in improving local content and growing activity in their language Wikipedias. Two more accepted proposals were handled financially by Wikimedia Indonesia. These two are distinguished from the others because they were conceived as part of collaborations between Wikimedia Indonesia and other parties. In total, there are 20 submitted proposals and 10 accepted proposals so far. The requested funds varied from IDR1,000,000 to IDR25,000,000. The accepted proposals have already used up the budget for the grant program.
By type of activity, the largest group of accepted proposals, 4 in total, covers thematic photo-taking activities held in West Sumatera (theme: islets), South Sulawesi (theme: undersea creatures), Surakarta (theme: local foods), and North Kalimantan (theme: public buildings). All of the photos are uploaded to Commons. Two proposals are about creating new projects on the Wikimedia Incubator: the Minangkabau Wiktionary and the Osing Wikipedia. Both activities involve students in adding content. For the Minangkabau Wiktionary, the local community trained 15 high school students to add content during a two-night camp at Carolina Beach. The Osing Wikipedia involves 5 university students of Banyuwangi origin studying in Yogyakarta, who hold regular meetups to add content to the Osing Wikipedia. Another type of activity is the provision of references, that is, purchasing books to help communities edit Wikipedia. The Javanese and Gorontalo communities both purchased dictionaries that help them edit Wikipedia in the standard language. The Gorontalo community also purchased books on Gorontalo culture.
Lesson learned
At the moment, there are two pending proposals that we plan to fund using surplus money from another program's budget. The high number of proposals tells us that we should increase the grant budget in the future and channel all proposals to a single landing page, either on Meta or on the Wikimedia Indonesia website.
Revenues received during this six-month period
Please use the exchange rate in your APG proposal.
Table 2 Please report all spending in the currency of your grant unless US$ is requested.
- Please also include any in-kind contributions or resources that you have received in this revenues table. This might include donated office space, services, prizes, food, etc. If you are to provide a monetary equivalent (e.g. $500 for food from Organization X for service Y), please include it in this table. Otherwise, please highlight the contribution, as well as the name of the partner, in the notes section.
- Note: Q1 is per 31 December 2018.
Details report in here

Revenue source | Currency | Anticipated | Q1 | Q2 | Q3 | Q4 | Cumulative | Anticipated ($US)* | Cumulative ($US)* | Explanation of variances from plan
---|---|---|---|---|---|---|---|---|---|---
WMF APG | IDR | 1,882,230,394 | 1,882,230,394 | - | - | - | 1,882,230,394 | 131,756 | 131,756 |
WMF ESEAP Conference Grants | IDR | 30,339,826 | 30,339,826 | - | - | - | 30,339,826 | 2,124 | 2,124 | ESEAP Grant unused
In-kind Donation | IDR | 85,000,000 | 105,935,000 | - | - | - | 105,935,000 | 5,950 | 7,415 | The in-kind donations were mainly for the WikiLatih project, so our WikiLatih expenses were lower while the metrics increased.
* Provide estimates in US Dollars
Spending during this six-month period
Table 3 Please report all spending in the currency of your grant unless US$ is requested.
- (The "budgeted" amount is the total planned for the year as submitted in your proposal form or your revised plan, and the "cumulative" column refers to the total spent to date this year. The "percentage spent to date" is the ratio of the cumulative amount spent over the budgeted amount.)
- Note: Q1 is per 31 December 2018.
Details report in here

Expense | Currency | Budgeted | Q1 | Q2 | Q3 | Q4 | Cumulative | Budgeted ($US)* | Cumulative ($US)* | Percentage spent to date | Explanation of variances from plan
---|---|---|---|---|---|---|---|---|---|---|---
Administration | IDR | 575,200,000 | 381,552,646 | | | | 381,552,646 | 40,264 | 26,709 | 66.33% |
Salary | IDR | 1,142,750,000 | 516,000,000 | | | | 516,000,000 | 79,993 | 36,120 | 45.15% |
Education | IDR | 377,230,000 | 233,442,423 | | | | 233,442,423 | 26,406 | 16,341 | 61.88% |
GLAM | IDR | 280,200,000 | 108,692,242 | | | | 108,692,242 | 19,614 | 7,608 | 38.79% |
Content Creation | IDR | 343,480,000 | 81,346,039 | | | | 81,346,039 | 24,044 | 5,694 | 23.68% | We plan to spend the travel grants for conferences (around 40% of the budget) in the second semester to disseminate the results of the Wikidata and Javanese OCR projects.
Community Engagement | IDR | 415,000,000 | 83,340,673 | | | | 83,340,673 | 29,050 | 5,834 | 20.08% | The conference, which accounts for around 71% of the total Community Engagement budget, will be held in April 2019.
Supporting | IDR | 114,490,000 | 61,203,113 | | | | 61,203,113 | 8,014 | 4,284 | 53.46% |
TOTAL | IDR | 3,248,350,000 | 1,465,577,136 | | | | 1,465,577,136 | 227,385 | 102,590 | 45.12% |
* Provide estimates in US Dollars
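The derived columns in Table 3 can be rechecked with a short calculation. In this sketch, the "percentage spent to date" follows the definition given above the table (cumulative over budgeted). The conversion rate of 0.00007 US$ per IDR is not stated in this report; it is inferred from the table's own figures and should be treated as an assumption, as should the function names.

```python
def percent_spent(cumulative_idr, budgeted_idr):
    """Cumulative spending as a percentage of the budget, to two decimals."""
    return round(100 * cumulative_idr / budgeted_idr, 2)


def to_usd(amount_idr):
    """Convert IDR to whole US$ at the inferred rate of 7 US$ per 100,000 IDR."""
    return round(amount_idr * 7 / 100_000)


# Figures taken from the Education and TOTAL rows.
education_pct = percent_spent(233_442_423, 377_230_000)
total_pct = percent_spent(1_465_577_136, 3_248_350_000)
total_usd = to_usd(1_465_577_136)
```

With these inputs, the Education row gives 61.88% and the TOTAL row gives 45.12% and US$ 102,590, matching the table.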
Compliance
Is your organization compliant with the terms outlined in the grant agreement?
As required in the grant agreement, please report any deviations from your grant proposal here. Note that, among other things, any changes must be consistent with our WMF mission, must be for charitable purposes as defined in the grant agreement, and must otherwise comply with the grant agreement.
Are you in compliance with all applicable laws and regulations as outlined in the grant agreement? Please answer "Yes" or "No".
- Yes
Are you in compliance with provisions of the United States Internal Revenue Code (“Code”), and with relevant tax laws and regulations restricting the use of the Grant funds as outlined in the grant agreement? Please answer "Yes" or "No".
- Yes
Signature
- Once complete, please sign below with the usual four tildes.