OurDigitalWorld Newsletter September 2023

On the importance of our community newspapers

Over the years, ODW has been motivated to help grassroots organizations preserve and share their community history and local newspapers. Why? Because these are the stories of our neighbours, near and far; because they help us understand our own and each other’s communities and bind us as a society; because they build a valuable web of information about our past and our present. We are therefore disheartened at the announcement of severe cuts to the print editions of community newspapers across Ontario, some of which have been running for nearly 200 years (Small-town community papers take big hit after Metroland files for bankruptcy), and we feel for the staff and journalists behind those publications who are affected by the changes. This is a tremendous blow, isolating individuals without access to the internet and further distancing people from their community through homogenized “news” reported from centralized, mainly urban, sources.

Reflecting on these recent developments, ODW wants to thank the organizations who steward their local newspaper collections and provide access to the stories, personalities and issues that reflect the fabric of our communities. The Ontario LAM Newspapers Working Group is also convening to discuss this announcement. Please get in touch about their work, and/or to share any news or concerns you may have about your community newspaper by writing to: CommunityNewspapersON@gmail.com

ODW Projects

New life for old database records

13,000 index records have been meticulously cleaned up and migrated to be included in the collections from Waterloo Public Library. These index records highlight contents from magazines, newspapers, city directories and vertical files in the physical Local History collection, allowing users to identify which items to access and review. The initial labour involved in creating this depth of indexing is substantial, so making the records available online honours that work and provides enhanced access for the public. Explore the collection

Hyper-local news has historical clout

The South Marysburgh Mirror is an independent newsletter that began publishing in 1990 in the southernmost part of Loyalist-settled Prince Edward County, Ontario. The Mirror covers annual events like the still-running agricultural Milford Fall Fair and the New Year’s Levee, and includes monthly updates from the local Women’s Institute, churches, and transcriptions of local rural diaries. These hyper-local stories fill many gaps in the larger publications in the area like the Picton Gazette and Belleville Intelligencer, offering decades of insight into a tightly knit, rural community. Explore the collection

Historical Society achieves long-term goal

The Lake Scugog Historical Society, a volunteer-run, not-for-profit organization, achieved a significant milestone by finishing its newspaper digitization project. The result is more than 150 years of coverage from local newspaper collections that were preserved on paper, microfilm, microfiche and born-digital PDF. The collection includes early dailies like the Ontario Observer and Port Perry Star and the more recent Scugog Citizen weekly and Focus on Scugog monthly magazine. Thanks to the Society’s perseverance, this adds breadth to the online and historical coverage available for the central Durham, Ontario region. (Photo from The History of Port Perry’s Newspapers). Explore the collection

New VITA Digital Collections

Archived Oxford: A view into Oxford County’s past

A new project from Oxford County Archives, described as starting with “260 individual postcards [that] feature images of local streets, scenery, and buildings in Oxford County, along with some promotional material from historical businesses. We will be expanding to include … just over 3,000 photographs … which again, include locations and buildings throughout Oxford County, events, people, businesses, etc.” Explore the collection

Great Lakes Vessels database sails again

The Wisconsin Maritime Museum has migrated the Gerald C. Metzler Great Lakes Vessel Database into ODW’s unique Vessels metadata templates. These historical records detail vessel dimensions, classification, builder and shipmaster names, as well as changes over time. Much like other vital statistics, the vessel records capture the places of a ship’s birth, travel, and, if sunk, final location – a real boon for wreck divers. This collection can be searched independently but will also add more than 13,000 detailed vessel records to the fast-growing Great Lakes History search site. Explore the collection

Coming Soon

Oshawa Newspapers

Oshawa Public Library is adding later years of the Oshawa Times and Daily Times-Gazette to their already extensive newspaper collection. The newspapers will be scanned from microfilm and made full text searchable, enhancing access to the Oshawa collection as well as the Durham Newspapers regional site. Explore the collection

Dryden Observer

Covering life in Northern Ontario for more than a century, the Dryden Observer includes “significant moments in Dryden’s development such as the building of the Dryden Public High School, advertisements from local businesses, and records of life from citizens, from obituaries to personal letters to the editor.” Dryden Museum

Join us at the 2024 Ontario Library Association annual Super Conference in Toronto, Ontario, January 24-27, 2024. Visit our booth to meet the team, catch up on our latest projects, learn more about us and find out how we can help with your digitization projects.

Header photo credit: Rosseau Historical Society “Highway 632, Bridge over Shadow River – 2 – RV0041”

Improving access to heritage newspaper content: Replacing microfilm with original paper scans

From guest blogger Walter Lewis, Great Lakes historian and software developer for OurDigitalWorld.

Some years ago the Center for Archival Collections at Bowling Green State University organized the microfilming of many of the early issues of both the Marine Record and the Marine Review up to the end of 1902. In 2010, we added just short of 17,000 pages from that microfilm to the Maritime History of the Great Lakes website, covering the years 1883-1902. Thanks to issues shared by the Dossin Museum in Detroit, along with Ron Beaupre and Greg Rudnick, I have been able not only to extend the coverage of the Marine Review to its end in 1935, but also to replace all but 2,500 pages of the microfilm with images from the originals. The result is just over 55,000 pages of marine journalism published in Cleveland, Ohio. The journals had deep roots in Great Lakes shipping, although from World War I onward there was an increasing emphasis on global developments.

One question I have been asked is “why go to the time and effort to re-shoot the issues from the originals?” A couple of examples may explain why.

Almost all microfilm is photographed in black and white, with an emphasis on high-contrast exposures that improve the ability to read the text on standard microfilm readers. The company that digitized the BGSU microfilm emphasized this contrast in the files they produced for us. For pages from the era of woodcut engravings this is less of a concern, although the additional generation of negative/positive print before digitization can still introduce focus issues. The challenge in many films comes from shadows in gutters in instances when the paper wasn’t disbound before filming (true here). Content in those columns may come up very dark, and after digitization, black on black. In part this is because many digitization projects, especially ones done ten or more years ago, were struggling to reduce file size and assumed that bitonal images (where each pixel in the image is either black or white) would be acceptable. In some instances they are. But with the increasing use of photographs in the 1890s, the choice of the degree of greyness at which a given point on the page is converted to either black or white makes for some very unhappy images. The Marine Review prided itself on its illustrations. Reshooting these, not just in greyscale but in colour, restored a significant amount of detail. This was especially true when some earlier owner of the issue had marked it up with a blue or other coloured pencil.
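To make the bitonal problem concrete, here is a minimal sketch of thresholding, the step that turns greyscale values into pure black or white. The pixel values and the threshold of 128 are illustrative assumptions, not the settings used on the BGSU microfilm.

```python
# Hypothetical 8-bit greyscale values (0 = black, 255 = white) for one row of
# pixels inside a gutter shadow: darker values are ink, lighter ones are
# shadowed paper. These numbers are made up for illustration.
row = [70, 35, 72, 30, 68, 33, 71]


def to_bitonal(pixels, threshold=128):
    """Map each grey value to pure black (0) or pure white (255)."""
    return [255 if value >= threshold else 0 for value in pixels]


# With a mid-range threshold, shadowed paper and ink both fall below the
# cut-off, so the whole row becomes black on black.
bitonal = to_bitonal(row)
print(bitonal)

# A lower threshold separates ink from paper in this row, but the same
# setting applied to a well-lit part of the page could erase faint ink.
recovered = to_bitonal(row, threshold=50)
print(recovered)
```

The greyscale row still distinguishes ink from paper; the bitonal version only does so if the threshold happens to suit that part of the page, which is why a single page-wide setting can lose text in the gutter.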

From Microfilm

The conversion to bitonal files also has a significant impact on the quality of the Optical Character Recognition (OCR) of the files. This is a computer process that converts the images of the text to text that can be searched in our indexes. When, for example, letters have parts that print more faintly, or where there is bleed-through from the ink on the other side of the page, the results are far from satisfactory.

From the Microfilm

Iuststpruentthereisalittle flurryin Wuhingtoubetween the navy department and the Marine Hospital service. navy ‘departlnent has recently yent 050.000 establishing a coding nation at Dry Tor- tuga: an In
equt wha considers. u the island. the most im- rortan ‘ 1ss::erntheChesa eand Central America. A ew bp g was en rised to receive a notifiatiqn from the ta-usury department to stop war at Dr! TOTIIIEII 5! t\P”‘ 1. 55 Surgeon General W needed the place to are for yellow lever and
bubonic plague patients. The ma thinks that the-e are sevwal other adjacent s a avail: . ‘lfllfvgaa and will de- elinetosurrenderDry ortugasnnlesslpecl yofildtdmdoiohr lhepréktthirnlell  .

From reshoot of original

A FLURRY OVER DRY TORTUGAS. Just at present there is a little flurry in Washington between the navy
department and the Marine Hospital service. The navy department_has
recently spent about $50,000 in establishing a coaling station at Dry Tor-
tugas and in equipping what it considers, upon the island, the most im-
portant strategic base between the Chesapeake and Central America. A
few days ago Secretary Long was surprised to receive a notification from
the treasury department to stop work at Dry Tortugas by April 1, as
Surgeon General Wyman needed the place to care for yellow fever and
bubonic plague patients. The navy department thinks that there are
several other adjacent spots available for hospital purposes and will de-
cline to surrender Dry Tortugas unless specifically ordered to do so by
the president himself.
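One rough way to put a number on the difference between the two transcriptions is a character error rate: the edit distance between an OCR result and a reference text, divided by the length of the reference. The sketch below is an illustration only. It treats the reshoot text as the reference (an assumption, since that text is itself OCR output) and uses one of the milder phrases from the sample above.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def character_error_rate(ocr_text, reference):
    """Edit distance as a fraction of the reference length."""
    return edit_distance(ocr_text, reference) / len(reference)


# Phrase as read from the microfilm scan and from the reshoot of the original.
microfilm = "needed the place to are for yellow lever"
reshoot = "needed the place to care for yellow fever"

print(edit_distance(microfilm, reshoot))
print(round(character_error_rate(microfilm, reshoot), 3))
```

Even a low error rate matters for searching: here two wrong characters are enough to hide both "care" and "fever" from a keyword index.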

There are still minor gaps in the files where pages are missing from issues, and a significant number of early issues are still missing, but the results are worth the effort. Now if we could only locate some issues of the Record from before 1883.

To read the full article, see Walter Lewis’ Maritime History of the Great Lakes website: http://stories.maritimehistoryofthegreatlakes.ca/digitizing-the-marine-record-and-marine-review/

ODW Quarterly June 2023

Our recent Quarterly newsletter is now available!

Find out what we’ve been up to and explore these highlights:

  • Digitizing Newspapers with ODW: Addressing Copyright & Public Access
  • Interactive 3D Shipwrecks
  • Celebrating National Indigenous History Month
  • Introducing the Great Lakes History search site
  • VITA Toolkit: New Features & Functionality
  • Welcome Brantford Digital Archives
  • Looking forward: Oshawa and Dryden Newspapers

Read more…

Expanding opportunities: Text Data Mining with Newspapers

In late February 2023, the Leddy Library’s Academic Data Centre at the University of Windsor hosted a workshop series called RDM & TDM in JupyterHub with Newspapers. TDM stands for Text Data Mining, and one increasingly common approach to TDM highlighted in the workshop is the use of Optical Character Recognition (OCR) text from newspapers for text processing. The imagery for several of the newspaper titles used for the workshop was improved to raise the OCR accuracy levels to better serve TDM technologies.

First observed with the Feb. 4, 1892 edition of the Comber Herald, initial tests with Topaz suggested a 20% improvement in OCR accuracy. This past year has seen the entire collection reprocessed, which allowed the Herald to be included in the corpus for a TDM workshop held at the Leddy Library.
Another sample from 2021, this time the Sept. 22, 1971 edition of the Essex Times. Like the Herald, the Times was completely processed this past year with Topaz.

The series was funded with a grant from Compute Ontario and showcased OurDigitalWorld’s extensive history with newspaper digitization, as well as its long-standing partnership with the University of Windsor. The growing interest in TDM among Ontario libraries was further confirmed by a Colloquium on TDM in Libraries event held at the University of Toronto in early May 2023. The use of newspapers for TDM was a major theme for the colloquium and a common strategy was identified where newspaper collections become substantial data assets for text processing.

ODW collections not only formed the basis of the Windsor workshop series but also underpinned a subsequent data challenge, launched in March, which featured a collaboration with Hackforge and a kick-off event at the LaSalle public library in partnership with the Essex County Library System. The Compute Ontario grant that supported these activities also provided funding for two graduate students, Akram Vasighizaker and Sumaiya Deen Muhammad, to carry out original research, and the results are publicly available on the workshop GitHub site.

TDM is an exciting new direction for newspaper digitization and represents a convergence between recent advances in artificial intelligence (AI) and machine learning and what is frequently the most extensive record of a community’s past, the local newspaper. Combining TDM with digitized newspapers makes it easier to gain unique insights into the past and to identify trends and patterns, and it is hoped that ODW can continue to help libraries contribute to this promising area of research.
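As a small illustration of the kind of text processing TDM involves, the sketch below counts how often a term appears in OCR text from different years. The sample sentences, the years used as keys and the function names are all hypothetical; they are not drawn from the workshop materials or from any ODW collection.

```python
import re
from collections import Counter

# Hypothetical OCR text keyed by issue year. A real corpus would hold the
# full text of every digitized issue rather than one invented sentence.
issues = {
    1892: "The harvest was good and the fair drew crowds.",
    1971: "Council debated the new highway; the fair returned.",
}


def tokenize(text):
    """Lowercase the text and keep runs of letters as words."""
    return re.findall(r"[a-z]+", text.lower())


def term_counts_by_year(corpus, term):
    """Count occurrences of a term in each year's text."""
    return {year: Counter(tokenize(text))[term] for year, text in corpus.items()}


print(term_counts_by_year(issues, "fair"))
print(term_counts_by_year(issues, "harvest"))
```

Counts like these depend directly on OCR accuracy: a misread word is simply not counted, which is one reason improved imagery matters for this kind of analysis.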

What’s new with VITA 6.4

VITA Digital Collections Toolkit was upgraded in September 2022, making it easier for users to provide better attribution and search results. This version upgrade means users can automatically assign copyright labels, process text items with OCR and hit highlighting, and share improved display for linked index records and more…

Exciting new changes include:

  • digital files uploaded as category “page” can automatically generate OCR and apply hit highlighting to search results – great for newspaper issues, documents, even headstone photos!
  • copyright holder statements can be automatically applied to serial publications 95 years old or younger (here’s how)
  • index records with links to digital pages will now display the linked page image in the details panel instead of the sidebar
  • personal information and cookies policy statements are now available for both VITA users and the public
  • apply “section” fields for non-newspaper pages e.g. Chapter headings
  • updated “help” for on-screen support (and correlating MAP updates)

Want to stay up to date with VITA Toolkit news? Use the subscription form on the home page of the VITA Help site.

Digitizing the Angelo Principe Italian-Canadian Newspaper Collection

Adapted from The ‘Angelo Principe’ Italian Canadian Newspaper Collection by Dr. Matteo Brera

Masthead of La Vittoria (The Victory) Italian-Canadian newspaper

In 2014, researcher and scholar Dr. Angelo Principe donated his extensive newspaper and book collection to the Clara Thomas Archives and Special Collections of York University Libraries. The ‘Angelo Principe Collection’ includes materials entrusted to him for preservation by Italian Canadian activists from the first half of the twentieth century like Attilio Bortolotti and Benny Bottos, as well as the surviving documents belonging to Augusto Bersani, transnational political activist, facilitator and secret agent for the Royal Canadian Mounted Police (RCMP).

Six years later, a key part of the collection was digitized in a collaboration between Michael Moir, Head of the Clara Thomas Archives and Special Collections, and OurDigitalWorld, resulting in a unique online collection of rare interwar Italian-language newspapers published in North America. These include Il Bollettino Italo-Canadese, Il Cittadino Canadese, Il Giornale Italo-Canadese, Il Lavoratore, L’Araldo del Canada, L’Italia, L’Italia Nuova, L’Italo Canadese, L’Operaio Italo-Canadese, La Vittoria, La Voce degli Italo-Canadesi, and La Voce Operaia. The newspapers were processed using OurDigitalWorld’s multilingual Optical Character Recognition (OCR) software and are full text searchable in both English and Italian.

The significance of this donation cannot be overstressed. Thanks to Michael Moir’s vision in working with OurDigitalWorld, and to Dr. Matteo Brera for his work adding rich contextual and descriptive metadata to the collection items, Dr. Principe’s legacy for the study of the construction of the Italian Canadian identity and transcultural exchanges between the Old and the New World is manifest in this online collection, providing an invaluable research tool to be used and enjoyed by scholars and the community.

Explore the collection at https://vitacollections.ca/yul-italiancanadiannewspapers/search

This research and digitization project was conceptualized and directed by Dr. Matteo Brera (mbrera@yorku.ca) and was made possible by generous funding from the Zorzi Family Italian-Canadian Archival Fund, established in 2017 and dedicated to encouraging the study of Italian-Canadian archival materials. The project was also sponsored by York University’s Faculty of Liberal Arts and Professional Studies.  

Daily British Whig 1902-1926 now online

OurDigitalWorld is excited to announce that the Daily British Whig from 1902-1926 is online. The Frontenac Heritage Foundation undertook the project to digitize this significant set of community news, covering the first of the World Wars, and make the papers available as part of the larger Kingston newspaper collection hosted by the Kingston Frontenac Public Library.

With the addition of these almost 90,000 pages, the online Kingston newspaper collection has doubled and now spans more than 100 years, from 1810-1926. The Digital Kingston VITA Toolkit site at http://vitacollections.ca/digital-kingston/search allows users to search by keyword and facet results to sort or narrow them by date, publication, and more.

Daily British Whig October 9, 1909

OurDigitalWorld worked with Library and Archives Canada via the Canadian Research Knowledge Network to access and digitize the microfilm copies, and with University of Windsor to achieve high quality positional OCR processing. The newspapers are uploaded into the VITA Digital Toolkit for search and display with full text search and hit highlighted results. Frontenac Heritage Foundation member John Grenville used the new primary materials to research local architect Ernest Beckwith, designer of the Orpheum Theatre in Kingston, and his searches returned very specific results.

ODW, Kingston Frontenac Public Library and the Frontenac Heritage Foundation encourage genealogists, students, and other researchers to use and explore this important set of newspapers. To read the full press release and for contact information regarding the project, click here.

Featured Image courtesy of Maritime History of the Great Lakes Digital collection