Machine Learning and AI Implementation in Research Institutions

Designation Number:

CAN/DGSI 128

Standard Type:

National Standard of Canada - Domestic

Standard Development Activity:

New Standard

ICS code(s):

03.100.02; 35.020; 35.240.01

Status:

Proceeding to development

SDO Comment Period Start Date:

2024-05-22

SDO Comment Period End Date:

2024-06-12

Posted On:

2024-05-14

Scope:

This proposed standard aims to specify minimum requirements for the implementation of machine learning (ML) and generative artificial intelligence (GAI) systems within the internal systems of libraries, archives, and research institutions at Canadian universities. This includes systems such as library catalogue search, catalogue management, internal evaluation tools, research aides, and other systems that handle data and information for the purpose of supporting, storing, and/or conducting research.

Project need:

Despite the extensive international and national discussions about AI ethics and policy that have taken place over the past year, the higher education and research sector has been largely left out of those discussions. Any exploration of policy for higher education has focused primarily on academic integrity issues. Discussions of research policy have focused on the potential risks of developing powerful large language models and frontier AI models rather than exploring the risks of using existing or future AI and ML tools within the internal systems of research institutions.

However, implementation of automated and other ML systems in research libraries and other repositories of research data and information poses a significant risk to the reliability and security of information systems, while also offering a significant opportunity to enhance research capabilities for a changing research landscape. It is also important to note that the major library services software companies, and the companies that control the majority of copyrighted materials licensed to Canadian universities, are multinationals based in the USA, which makes the Canadian research system heavily reliant on private US companies.

This standard will ensure information reliability and security within Canada’s leading research institutions to support a future of research excellence and sustainability; combat disinformation, data poisoning, and other threats to the consistency and reliability of information that could be inserted into information systems through imperfect foundation models or through ML and AI malfunctions; promote an environment of experimentation and innovation in methods and systems within research institutions, which will build a community of practice with organically developed norms that can inform future, more robust standards for emerging risk areas; promote experimentation and the development of internal tools that will support the protection of the Canadian market from the threat of external multinationals dominating the AI policy space for higher education through a lack of other options; decrease reliance on external monopolies of copyright licensing from the USA and help to promote open research, open-source information sharing, and open access agreements that benefit Canadians and research in general; and support the Canadian government’s stated goals of promoting research security for sensitive IP. Although this standard refers to systems that are not directly connected to sensitive IP development data governance, this system-wide focus will support the resilience, reliability, and autonomy of Canadian research systems, which will in turn support research security in more sensitive domains.

This is a national standard which will also set a precedent and foundation for international standards, as there has yet to be any international work on this pressing issue. The international and interconnected nature of research networks means that expanding from national to international standardization will be imperative. There is no existing standard that deals directly with this issue. There are many contexts in which existing data governance regimes, copyright licenses, privacy legislation and other existing standards or policies might be relevant, and this standard will bring those together where necessary. This includes sensitive information and data such as collections pertaining to vulnerable communities and/or First Nations communities, building on existing efforts in universities and research libraries to better address issues of data sovereignty and ethical stewardship over First Nations and other community materials. This standard will draw from and support the existing requirement that all Canadian librarians must have American Library Association (ALA) certification but will not propose a new certification process.

This initial standard would serve as a base to establish guidance at a time when many norms and practices, as well as innovation, are emerging. It may later expand into other standards based on emerging needs, but it is unlikely to be replaced, as it is intended to be a baseline guidance standard. This standard will be periodically maintained to keep pace with a changing AI and ML landscape. This standard will be available in both of Canada’s official languages.

Note: The information provided above was obtained by the Standards Council of Canada (SCC) and is provided as part of a centralized, transparent notification system for new standards development. The system allows SCC-accredited Standards Development Organizations (SDOs), and members of the public, to be informed of new work in Canadian standards development, and allows SCC-accredited SDOs to identify and resolve potential duplication of standards and effort.

Individual SDOs are responsible for the content and accuracy of the information presented here. The text is presented in the language in which it was provided to SCC.