Inconvenient Exposure: Managing controversial content in digital collections

From family history to wrongful arrests to genocide denial, our community collections are reaching more people in more places, and not everyone is happy about it. So, how do you handle online pushback about your digital collections? Is it censorship or good policy to remove a newspaper article from the collection because someone’s checkered past is affecting their present? What happens when a collection sheds new light on a controversy?

This session discusses a wide array of examples of individual and community response to controversial content online. ODW Projects Coordinator Jess Posgate talks about how organizations are managing everything from personal information removal requests to hacked servers as new or buried narratives emerge through digitization. The session hopes to instigate conversation around planning digitization of controversial – or potentially controversial – material with respect and honesty, audience experience with in-house policies around personal information, and idea sharing for sustainable and comprehensive community representation online.

Presenting to audiences at the 2022 Ontario Library Association Super Conference and the Atlantic Provinces Library Association conference, Jess Posgate walks through scenarios that might be familiar to some and provides tips on creating organizational policy to safeguard our community members when local history goes global.

Pilot: Handwritten Character Recognition (HCR)

As part of our digitization post-production services, ODW has been achieving excellent results processing handwritten materials with Google’s Optical Character Recognition software. For a pilot project, we processed approximately 1120 duplex pages of pre-1910 handwritten parish registers (births, marriages, deaths, mainly baptisms) digitized from public-use microfilm. Despite the poor quality of the images (scratched film and high-contrast photography), the page images were split, deskewed, cropped and run through the OCR software with very rewarding results.
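
To make the split and crop steps concrete, here is a minimal Python sketch. It is not ODW's actual pipeline: it assumes each frame holds two facing pages, represents the scan as a plain list of grayscale pixel rows (0 is black, 255 is white), and uses hypothetical function names and an arbitrary ink threshold of 128. Deskewing and the OCR call itself are left out.

```python
# Illustrative sketch only: function names and the ink threshold are assumptions.
# A "page" is a list of rows; each row is a list of grayscale values (0-255).

def split_spread(image):
    """Split a two-page frame down the middle into (left, right) pages."""
    mid = len(image[0]) // 2
    left = [row[:mid] for row in image]
    right = [row[mid:] for row in image]
    return left, right


def crop_to_ink(image, threshold=128):
    """Crop away blank margins, keeping the bounding box of dark pixels."""
    rows = [y for y, row in enumerate(image) if any(p < threshold for p in row)]
    cols = [x for x in range(len(image[0]))
            if any(row[x] < threshold for row in image)]
    if not rows or not cols:
        return image  # nothing darker than the threshold; leave the page as-is
    return [row[cols[0]:cols[-1] + 1] for row in image[rows[0]:rows[-1] + 1]]


# A tiny 4x8 "frame": white background with one dark mark on each page.
frame = [
    [255, 255, 255, 255, 255, 255, 255, 255],
    [255,   0, 255, 255, 255, 255,   0, 255],
    [255,   0,   0, 255, 255, 255,   0, 255],
    [255, 255, 255, 255, 255, 255, 255, 255],
]
left, right = split_spread(frame)
left_cropped = crop_to_ink(left)
right_cropped = crop_to_ink(right)
```

In practice an imaging library would do this work on real scans; the point here is only the order of operations, with each page isolated and trimmed before it reaches the recognition step.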

Applying this to our ongoing work with the Federated Women’s Institutes of Ontario (FWIO), we processed a recent batch of scrapbooks from the Grace Patterson Branch to provide full text search of the entire contents, whether handwritten or typed. For all-in-one projects we will continue to apply the HCR software.

Moving forward, we intend to experiment with Microsoft’s Azure HCR support, which may be surpassing Google’s project. It is definitely worth comparing some pages! The development of HCR is burgeoning at companies like Google and Microsoft, so we can expect progressively better results over time.

Quick Access: Ontario Government Documents

The Ontario Government Documents Portal (https://govdocs.ourontario.ca) provides easy access and full-text search of over 120,000 government documents from the collections of the Ontario Legislative Library and the Ministry of the Environment, Conservation and Parks.

You can search and find:

  • Over 90,000 documents (monographs, serials, newsletters, press releases and more) from the eArchive collections of the Ontario Legislative Library, from 1867 to now, as downloadable PDFs
  • 8,000 documents from the Ministry of the Environment, Conservation and Parks, dating from the 1960s to now, as downloadable PDFs
  • Links to the item if also found on the Internet Archive, with additional formats available, including DAISY for accessibility
  • The collections of the OCUL Government Information Community, the Ministry of the Environment and the Legislative Assembly of Ontario, digitized and hosted by Internet Archive
  • Ontario Sessional Papers and the Official Report of Debates (Hansard), digitized and hosted by the Internet Archive

Since 2008, the ODW Ontario Government Documents portal has supported widespread access to content created by government agencies. This project is a collaboration between OurDigitalWorld and the Ontario Legislative Library.

What’s new with VITA 6.3

In April 2022, the VITA Digital Collections Toolkit was upgraded to version 6.3. This release includes a balance of public engagement features and back-end management options. Inspired by feedback from the user community on both sides of the collections, VITA 6.3 focuses on: increasing linked discovery (like indexing non-Newspaper volumes); better search options (like search within Publication and on/off filters for results sets); expanding and scoping VITA collection audiences (with OAI-PMH integrations and IP restricted sites, respectively); and some fun stuff like interactive jigsaw puzzles and enhanced pan-zoom viewing. We hope you will explore the collections to see some of the changes!

Improved & Engaging Public Site options

  • Contribute Audio/Video/Document files to eligible accounts
  • Search within a Publication (e.g. Home & Country Newsletters) allows your results to stay focused on a single volume or newspaper publication
  • Jigsaw puzzles for a different way to interact with historical images
  • Optional indexing for non-newspaper volumes like church or cemetery records
  • Results filters for instant scoping and backing up through results sets
  • Browser-activated audio/video player
  • IIIF viewer for pan-zoom-rotate view of all Full, Detail and Reverse images (e.g. this postcard)
Jigsaw Puzzles

New Audience Options: Integrations and Scoped sites

  • OAI-PMH feature for extending discovery in other spaces like the Digital Public Library of America (DPLA)
  • IP limited sites for collections with access restrictions (talk to us about this option)
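
For readers unfamiliar with OAI-PMH, a harvester such as DPLA pulls records by sending simple HTTP requests to a repository's base URL. The Python sketch below only builds such a request URL; the base URL is a placeholder and the helper name is hypothetical, not the address or code of any VITA site. `ListRecords` and the `oai_dc` metadata prefix come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: replace with the base URL your own site provides.
BASE_URL = "https://example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional: limit the harvest to one set
    return f"{base_url}?{urlencode(params)}"


url = list_records_url(BASE_URL)
```

A harvester would fetch this URL and follow any resumption tokens in the XML response to page through the full collection.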

For a full description of all VITA 6.3 upgrades for the public and VITA users, see our latest VITA Partner newsletter.

Broken tiles: A retro-conversion project

Over time, certain file formats become obsolete. When ODW implemented the first pan-zoom viewer in the VITA Toolkit in the 2010s, it was based on uploading large files made up of hundreds of little tiles, all zipped into a folder. The once-free tool is called Zoomify. Over the years, we encouraged our users to “Zoomify” their full images and any pages of multipage items so that those items could be zoomed into and rotated for a dynamic user experience. This was particularly useful for scrapbooks, where pasted items were sometimes in different orientations within a single page. Detailed items like the Welland Canal Records also benefited from this “Zoomification”. However, these folders of tiles were quite “heavy” (that is, they required more storage), and some eventually became corrupt.

Zoomify Tool “tiling” an image

Luckily, as technology has advanced and streamlined, the standard is now to use JPEG-2000 (JP2) files that automatically trigger the open-source IIIF (International Image Interoperability Framework) viewer in VITA. So, any user uploading full images, details, or pages can upload the considerably lighter and mobile-friendly JP2 file, and it displays with all the pan, zoom and rotate options people expect for viewing this kind of material online. The trick was that we needed to go through our system and replace the old Zoomify folders with JP2 files. We were able to do this systematically for the most part, but some stubborn items required manual intervention and conversion. We were lucky to have Christine Anderson, a Mohawk College Library Technician student, who was willing and able to take on the task. Here’s Christine’s take on the project:

In my time at ODW, I have worked on (and completed) the Dezoomify project, which primarily involved using the VITA Toolkit to access and replace collection images, along with other software for the conversion process. ODW provided me with a list that identified records with broken Zoomify files and I got started on the clean-up work!

My primary task was to open and convert the broken Zoomify files and then replace them with JP2 files. This was done for Full images, some Details and Reverse images, as well as for many book and scrapbook pages. Using a RecordID list that was organized by Agency, I could identify all of the records with images that needed to be replaced and re-loaded. 

This work was accomplished by:

  • Using the Dezoomify tool which works by copying and pasting the item’s public URL into the tool
  • “Dezoomify” merges the tiles that make up a Zoomify file and that merged image can then be saved as a JPG
  • I used Irfanview software to convert JPG files to JP2 files, and I assigned their original file names so that the agencies could trace the display files back to their master copies
  • In the data management side of the VITA toolkit, I then activated a task-specific button to replace the broken Zoomify files with newer (and unbroken) JP2 image files
  • When certain Zoomify files were identified as too corrupt and this simpler workflow did not work, a workaround was created:
    • In some cases, I could open the PDF file associated with pages and save them as JP2 – although these tend to be quite large, so we adjusted the quality during the conversion process to reduce the storage overhead
    • In other cases, where there was no PDF, I would open an alternate JPG file for Full and Detail images and simply use the standard “Replace” button for the Full or Details file
  • The new files then automatically populate along with their records and now remain either public or non-public according to their original setting.
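
The tile-merging step in the list above can be pictured with a short Python sketch. This is not the Dezoomify tool's code: it assumes 256-pixel square tiles and zoom tiers made by repeatedly halving the image, and the function names are hypothetical. It computes where each full-resolution tile would be pasted and roughly how many tile files one image produces.

```python
import math

TILE = 256  # assumed tile edge in pixels


def tile_layout(width, height, tile=TILE):
    """Return (col, row, x, y) paste positions for the full-resolution tier."""
    cols = math.ceil(width / tile)
    rows = math.ceil(height / tile)
    return [(c, r, c * tile, r * tile) for r in range(rows) for c in range(cols)]


def total_tiles(width, height, tile=TILE):
    """Count tiles across all zoom tiers, halving until one tile holds the image."""
    count = 0
    while True:
        count += math.ceil(width / tile) * math.ceil(height / tile)
        if width <= tile and height <= tile:
            return count
        width, height = math.ceil(width / 2), math.ceil(height / 2)


layout = tile_layout(600, 300)
```

Under these assumptions, `total_tiles(4000, 3000)` gives 257 tile files for a single page image, which helps explain why the tiled folders were so much heavier to store and manage than one JP2 file.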

The JP2 files open in a IIIF viewer and provide excellent pan-zoom capabilities, as the slideshow below illustrates.

The Dezoomify project concentrated mostly on file creation and replacement (for example: digital collections from libraries’ local history/genealogy departments), and to an extent included working on the metadata for the files submitted. The project consisted of repetitive tasks that could not be automated and had to be carried out manually. This was important database work that will ensure the integrity and currency of the files uploaded to the clients’ digital collections and sites going forward.

There will always be advancements in technology standards and these inevitably require adjustment and retroconversion activities. With Christine’s work complete, the ODW team was able to purge a considerable overhead of corrupt and cumbersome Zoomify folders from the database. The positive outcomes of this work are a reduction in the affected agencies’ storage and in the cumulative burden of these obsolete files on the servers, plus Christine gained new technological skills that she can carry forward in her career as a Library Technician. It’s a win-win!

Building Multicultural History Timelines With VITA

Guest blog post from Victoria Scioli, placement student from University of Toronto Mississauga History Program

Over the past semester, I had the opportunity to work on the multicultural timelines with OurDigitalWorld. I became acquainted with the OurOntario.ca database and used the VITA Toolkit to implement the work. I really enjoyed having access to so much primary source material and learning how to search for appropriate sources to create the timelines.

Being a history student, much of my time is spent looking at primary sources to use in research papers, but I had never used them in a creative format like a timeline before. I spent a lot of time looking at the primary sources related to the histories of Japanese-Canadians, the Black community, and women in Ontario and trying to come up with a storyline that would best present significant information to the reader. I learned how to keep my descriptions concise and pick primary sources that could best provide insight into these different aspects of Ontario’s history.

When I was younger, I remember learning about the Japanese Internment but never any details about what the camps looked like, nor images and testimonies of men and women who suffered the effects of this kind of discrimination. While working on my timeline, I was able to look at primary sources that pictured what the conditions were like for men at internment camps like the one in Schreiber, Ontario. I also learned about significant survivors like author Yon Shimizu and Japanese Canadian politician Bev Oda. Both of these people contributed to creating awareness of the ways Japanese Canadians were discriminated against during the war and how that generation was forced to restart their lives in Canada. I learned a great deal from the research that went into these timelines, and I hope that it inspires viewers to learn more on each of the subjects.

One issue I ran into with the timeline feature was finding a proper way to begin and end my timeline that would provide the reader with context to the history before jumping into the actual items being featured. This was resolved by adding umbrella panels for my start and end dates. These allowed me to provide a short “Introduction” and “Conclusion” for the timeline. They introduce the viewer to any important information or advisory before getting into the timeline content and provide a short summary.

It was such a pleasure to work on such an interesting project! I found that the timelines are a great way to engage with primary source materials from different institutions in order to illustrate and explore significant events in Ontario’s history.

Explore the timelines:

Japanese Internment in Ontario and its Impact

The Progression of Settler Women’s Roles in Ontario From 1800s to 1960s

A Timeline of Black History and Significant Figures in Ontario

Government Information Days 2021

December 14th and December 15th, 2021, from 12:00-3:00 PM EST.

We are very excited to announce that Dr. Debby Wilson Danard will open this year’s Government Information Day(s) Conference, with a keynote on INDIGIPEDIA.CA, the Indigenous Digital Encyclopedia. 

This year’s conference will include:

  • Dr. Jill Stuart on data ownership in space exploration
  • Nicole Bonnell from the Northwest Territories Legislative Assembly on working with official languages (11!)
  • Jacob Turner on regulating artificial intelligence

Session topics include discovery services for Canadian census and geospatial data, preservation, and implementing more inclusive practices in programming and subject headings. There will also be announcements and updates on projects, including the newly-formed National Shared Print Program, North. 

Full programme details to follow on www.governmentinformationday.ca.

The conference is free, but registration is required. Registration will open on December 1st, 2021. A reminder with a link to registration will be sent to the list. 

On behalf of the Planning Committee,

Sandra Craig, Ravit H. David, Loren Fantin, Simone O’Byrne

Digitizing the Angelo Principe Italian-Canadian Newspaper Collection

Adapted from The ‘Angelo Principe’ Italian Canadian Newspaper Collection by Dr. Matteo Brera

Masthead of La Vittoria (The Victory) Italian-Canadian newspaper

In 2014, researcher and scholar Dr. Angelo Principe donated his extensive newspaper and book collection to the Clara Thomas Archives and Special Collections of York University Libraries. The ‘Angelo Principe Collection’ includes materials entrusted to him for preservation by Italian Canadian activists from the first half of the twentieth century like Attilio Bortolotti and Benny Bottos, as well as the surviving documents belonging to Augusto Bersani, transnational political activist, facilitator and secret agent for the Royal Canadian Mounted Police (RCMP).

Six years later, a key part of the collection was digitized in a collaboration between Michael Moir, Head of the Clara Thomas Archives and Special Collections, and OurDigitalWorld, resulting in a unique online collection of rare interwar Italian-language newspapers published in North America. These include Il Bollettino Italo-Canadese, Il Cittadino Canadese, Il Giornale Italo-Canadese, Il Lavoratore, L’Araldo del Canada, L’Italia, L’Italia Nuova, L’Italo Canadese, L’Operaio Italo-Canadese, La Vittoria, La Voce degli Italo-Canadesi, and La Voce Operaia. The newspapers were processed using OurDigitalWorld’s multilingual Optical Character Recognition (OCR) software and are full text searchable in both English and Italian.

The significance of this donation cannot be overstressed. Thanks to Michael Moir’s vision in working with OurDigitalWorld, and to Dr. Matteo Brera for his work adding rich contextual and descriptive metadata to the collection items, Dr. Principe’s legacy for the study of the construction of the Italian Canadian identity and transcultural exchanges between the Old and the New World is manifest in this online collection, providing an invaluable research tool to be used and enjoyed by scholars and the community.

Explore the collection at https://vitacollections.ca/yul-italiancanadiannewspapers/search

This research and digitization project was conceptualized and directed by Dr. Matteo Brera (mbrera@yorku.ca) and was made possible by generous funding from the Zorzi Family Italian-Canadian Archival Fund, established in 2017 and dedicated to encouraging the study of Italian-Canadian archival materials. The project was also sponsored by York University’s Faculty of Liberal Arts and Professional Studies.  

ODW Quarterly Newsletter September 2021

311 Steamers CITY OF TOLEDO and CITY OF ALPENA II
courtesy of the Alpena Public Library Great Lakes Maritime Collection

This autumn, the ODW Quarterly Newsletter knows no borders!

Learn more about our US partner collections, from Chicago-area newspapers to Great Lakes history. These extensive sites are one-stop research goldmines.

The Fall conference season will see us speaking at the Creative Commons Global Summit with “GLAM Project for Access to Community Newspapers” and at the Access2021 Conference with “Revisiting the Paper Files: OCR from paper versus microfilm”.

As well, we’ll be virtually exhibiting at the Michigan and Illinois Library Association conferences in October.

Stay tuned for the Government Information Day(s) 2021 in December that OurDigitalWorld is co-hosting with the Ontario Ministry of the Environment, Conservation and Parks.

Have your say and learn more about the National Heritage Digitization Strategy community calls for feedback on their Strategic Plan.